
This document explains how to use the Amorphic data platform to search, analyze, and share data. The process involves exploring the data catalog, applying filters, and managing data governance with row-level and column-level security.

Start with the data catalog. It lets you search data that has accumulated over time within Amorphic, as well as data from your databases and other systems. When searching for a specific term, use the available filters to narrow the results and find what you need.

For instance, the domain filter organizes datasets according to the medallion architecture.

The domain filter categorizes data into bronze, silver, and gold layers. If you are interested in raw data, apply the bronze filter, which displays matching results such as the Texas crash records.
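The effect of combining a search term with the domain filter can be sketched in plain Python. This is an illustration only, not Amorphic's API: the `CatalogEntry` type, the `search_catalog` function, and every dataset name other than the Texas crash records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    domain: str  # medallion layer: "bronze", "silver", or "gold"
    description: str


# Hypothetical catalog contents, for illustration only.
CATALOG = [
    CatalogEntry("texas_crash_records", "bronze", "Raw crash records as ingested"),
    CatalogEntry("texas_crash_cleaned", "silver", "Validated and deduplicated crash records"),
    CatalogEntry("crash_summary_by_county", "gold", "Aggregates that feed dashboards"),
]


def search_catalog(entries, term, domain=None):
    """Return entries whose name or description contains `term`,
    optionally limited to a single domain (medallion layer)."""
    term = term.lower()
    return [
        e for e in entries
        if (term in e.name.lower() or term in e.description.lower())
        and (domain is None or e.domain == domain)
    ]


# Search for "crash" and keep only raw (bronze) datasets.
raw_hits = search_catalog(CATALOG, "crash", domain="bronze")
print([e.name for e in raw_hits])
```

Leaving `domain` unset returns matches from all three layers, which mirrors searching without the filter applied.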

When you select a dataset, a description appears that summarizes its content.

If the dataset looks relevant, open its detailed view to examine the schema, column names, and any columns that are auto-tagged as personally identifiable information (PII).
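To give a feel for what auto-tagging produces, here is a minimal sketch that labels columns by name. The walkthrough does not describe how Amorphic detects PII, so the name-matching heuristic, the `PII_PATTERNS` list, and the column names below are all assumptions made for illustration.

```python
import re

# Assumed name fragments that suggest PII; not Amorphic's detection logic.
PII_PATTERNS = ["name", "ssn", "email", "phone", "license", "address", "birth"]
PII_REGEX = re.compile("|".join(PII_PATTERNS), re.IGNORECASE)


def tag_columns(columns):
    """Map each column name to "PII" or "non-PII" based on its name alone."""
    return {
        column: "PII" if PII_REGEX.search(column) else "non-PII"
        for column in columns
    }


# Hypothetical schema for a crash dataset.
schema = ["crash_id", "county", "driver_name", "driver_email", "speed_limit"]
tags = tag_columns(schema)
print(tags)
```

A name-based heuristic like this misses PII stored in generically named columns, so real systems typically also inspect the values.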

To collect data, use the data sources within Amorphic. The platform supports ingestion from nearly any source, so you can import many types of data.

Once data is collected in Amorphic, it is stored within a dataset at an S3 path.

You can view the files associated with this dataset directly, or download each file as needed.

The profile section provides metrics on data performance, quality, and the range of values within the dataset.
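The kind of per-column summary a profile contains can be sketched as follows. The metric names, the `profile_column` helper, and the sample rows are hypothetical; the actual metrics Amorphic reports may differ.

```python
def profile_column(values):
    """Summarize one column: row count, nulls, distinct values, and value range."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical sample rows; column names are illustrative.
rows = [
    {"crash_id": 1, "speed_limit": 55},
    {"crash_id": 2, "speed_limit": None},
    {"crash_id": 3, "speed_limit": 70},
    {"crash_id": 4, "speed_limit": 55},
]

# Build one summary per column, keyed by column name.
profile = {
    column: profile_column([row[column] for row in rows])
    for column in rows[0]
}
print(profile["speed_limit"])
```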

This profile also captures data lineage, illustrating the flow of data within your data pipeline.

Lineage is shown for each column, tracing how data moves from the bronze layer to silver and then to gold, which in turn powers dashboards and reports.
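Column-level lineage can be modeled as a directed graph and walked from a source column to everything it feeds. The sketch below assumes a simple dictionary representation; the node names and the `downstream` function are hypothetical and do not reflect how Amorphic stores lineage.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed as its value.
LINEAGE = {
    "bronze.crashes.crash_date": ["silver.crashes.crash_date"],
    "silver.crashes.crash_date": ["gold.crash_summary.crash_month"],
    "gold.crash_summary.crash_month": ["dashboard.monthly_crashes"],
}


def downstream(graph, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


source = "bronze.crashes.crash_date"
path = downstream(LINEAGE, source)
print(" -> ".join([source] + path))
```

The `seen` set keeps the traversal safe if the graph contains a column that is reachable by more than one route.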

Data quality checks are enabled for the dataset to maintain control over data in the data lake.

The system displays the number of successful records and highlights columns that have failed the quality checks.

Numerous data quality checks are available to ensure the integrity of your datasets.
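As a rough illustration of what such a report summarizes, the sketch below runs two simple rules over a few rows and returns the number of passing records along with the columns that failed. The rule names, thresholds, and sample data are assumptions; they are not Amorphic's built-in checks.

```python
def not_null(value):
    """Pass when the value is present."""
    return value is not None


def in_range(low, high):
    """Build a check that passes when a non-null value lies in [low, high]."""
    def check(value):
        return value is not None and low <= value <= high
    return check


# Hypothetical rules: column name -> check function.
RULES = {
    "crash_id": not_null,
    "speed_limit": in_range(5, 85),
}

rows = [
    {"crash_id": 1, "speed_limit": 55},
    {"crash_id": 2, "speed_limit": 120},    # speed_limit out of range
    {"crash_id": None, "speed_limit": 30},  # missing crash_id
    {"crash_id": 4, "speed_limit": 70},
]


def run_checks(rows, rules):
    """Count rows that pass every rule and tally failures per column."""
    passed = 0
    failures = {column: 0 for column in rules}
    for row in rows:
        row_ok = True
        for column, check in rules.items():
            if not check(row.get(column)):
                failures[column] += 1
                row_ok = False
        if row_ok:
            passed += 1
    failed_columns = [column for column, n in failures.items() if n > 0]
    return passed, failures, failed_columns


passed, failures, failed_columns = run_checks(rows, RULES)
print(f"{passed} of {len(rows)} records passed; failing columns: {failed_columns}")
```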

To share the dataset with other users, click the share option.

You can share the dataset with individuals or entire departments as needed.

Row-level and column-level security let you share only specific parts of the dataset without exposing any identifiers.

For instance, to share only PII data, include the PII columns for users who have the PII classification tag.

Simultaneously, you can share non-PII data by excluding PII columns for users with the non-PII classification tag.
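The two sharing cases above amount to projecting columns by classification tag, optionally combined with a row filter. The following sketch shows the idea in plain Python. The `shared_view` function, the tag names as strings, and the sample rows are hypothetical; in Amorphic this is configured through the share option rather than written as code.

```python
# Hypothetical column classification, standing in for the platform's tags.
COLUMN_TAGS = {
    "crash_id": "non-PII",
    "county": "non-PII",
    "driver_name": "PII",
    "driver_license": "PII",
}

# Made-up sample rows.
rows = [
    {"crash_id": 1, "county": "Travis", "driver_name": "A. Doe", "driver_license": "X1"},
    {"crash_id": 2, "county": "Harris", "driver_name": "B. Roe", "driver_license": "X2"},
]


def shared_view(rows, column_tags, user_tag, row_filter=None):
    """Return rows that pass `row_filter`, restricted to columns tagged `user_tag`."""
    allowed = [column for column, tag in column_tags.items() if tag == user_tag]
    return [
        {column: row[column] for column in allowed}
        for row in rows
        if row_filter is None or row_filter(row)
    ]


# Column-level security: users with the non-PII tag never see PII columns.
non_pii_view = shared_view(rows, COLUMN_TAGS, "non-PII")

# Row-level plus column-level: PII columns, but only for one county.
pii_travis = shared_view(
    rows, COLUMN_TAGS, "PII", row_filter=lambda row: row["county"] == "Travis"
)

print(non_pii_view)
print(pii_travis)
```

Note that the row filter is evaluated against the full row before columns are removed, so a filter can depend on a column the recipient is not allowed to see.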

Together, row-level and column-level security keep governance straightforward for end users on Amorphic and AWS.

Thank you for using the Amorphic data platform.
