![[transformation.drawio.svg]]

# Data layers in Enterprise Data Warehouse

BigQuery, Snowflake, or Databricks

| | **RAW_PRD** <br>**(*_staging table)** | **STAGING** | **INTERMEDIATE** | **MART** | **SEMANTIC LAYER** |
| --- | --- | --- | --- | --- | --- |
| **Environment** | Snowflake | Snowflake | Snowflake | Snowflake | Cube Cloud |
| **Description** | Represents raw data from your data sources | Applies incremental logic to the source tables | Cleans and prepares data for transformation. <br>Only joins between staging models are allowed | Converts data into fact, dimension & bridge tables | Views with applied business logic & business metrics, ready for consumption by the BI tool |
| **Technical Rules** | 1:1 replication of operational source data, done by the [Ingestion tool](https://asadventure.atlassian.net/wiki/spaces/Lakehouse/pages/4463231001) | - **Create a surrogate key when…** the raw data has a clear natural key (e.g. customer_id, order_id) and you want a consistent surrogate key early in the process<br>- Incremental logic (dbt snapshot or dbt incremental)<br>- Store static reference data (CSV, Excel) as dbt seeds<br>- Column-level data type transformations, e.g. string → date, unix_timestamp → datetime<br>- Column renames, e.g. id_customer → customer_id<br>- Rename columns & tables according to the [naming conventions](https://asadventure.atlassian.net/wiki/spaces/Lakehouse/pages/4475158529)<br>(see the staging sketch below) | - **Create a surrogate key when…** it needs to be derived from multiple sources (e.g. a composite key across multiple tables)<br>- UNION, e.g. customers_be & customers_nl<br>- JOINS, e.g. key_mapping<br>- Use of CTEs<br>- Pivot tables<br>(see the intermediate sketch below) | Check for uniqueness & not null on the surrogate_key (see the test sketch below) | - Pre-aggregated views ready for consumption by the BI tool<br>- Data is in star schema format<br>- Security: row-level + user-based security is added in the semantic layer |
| **dbt Materialization strategy** | NOT APPLICABLE | **Dimension data:** dbt snapshot, with strategy <br>a) **unique key + timestamp** (e.g. ingestion_export_date, airbyte_extracted_at) <br>or <br>b) **unique key + check** (using columns if no timestamp is available)<br>(see the snapshot sketch below)<br><br>**Transactional data:** type 1 history is sufficient, as full history is kept in RAW_PRD, so no extra incremental logic is needed | IF < 20 M rows: full refresh<br>ELSE: incremental + merge<br>(see the incremental sketch below) | IF < 20 M rows: full refresh<br>ELSE: incremental + merge | Pre-aggregated views served by the semantic layer |
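
To illustrate the staging rules, a minimal dbt staging model might look like the sketch below. The source and column names (`raw_prd`, `customers`, `id_customer`, `unix_created_at`) are hypothetical; the surrogate key uses the `generate_surrogate_key` macro from the dbt_utils package.

```sql
-- models/staging/stg_customers.sql (hypothetical example)
with source as (

    -- 1:1 read from the raw layer
    select * from {{ source('raw_prd', 'customers') }}

),

renamed as (

    select
        -- consistent surrogate key, created early on the clear natural key
        {{ dbt_utils.generate_surrogate_key(['id_customer']) }} as customer_key,

        -- column rename according to the naming conventions
        id_customer as customer_id,

        -- column-level data type transformations
        cast(birth_date as date) as birth_date,
        to_timestamp(unix_created_at) as created_at

    from source

)

select * from renamed
```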
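In the intermediate layer, a model combining the two staging sources from the UNION example could look like this sketch. The model and column names are assumptions; the composite surrogate key across sources is again built with dbt_utils.

```sql
-- models/intermediate/int_customers_unioned.sql (hypothetical example)
with customers_be as (

    select * from {{ ref('stg_customers_be') }}

),

customers_nl as (

    select * from {{ ref('stg_customers_nl') }}

),

unioned as (

    -- UNION of the country-specific staging models
    select 'BE' as country_code, * from customers_be
    union all
    select 'NL' as country_code, * from customers_nl

)

select
    -- surrogate key derived from multiple sources (composite key)
    {{ dbt_utils.generate_surrogate_key(['country_code', 'customer_id']) }} as customer_key,
    *
from unioned
```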
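The uniqueness and not-null checks on the surrogate key in the mart layer are typically expressed as dbt generic tests in a schema file. A sketch, assuming a hypothetical `fct_orders` model:

```yaml
# models/marts/_marts.yml (hypothetical example)
version: 2

models:
  - name: fct_orders
    columns:
      - name: order_key
        description: Surrogate key of the fact table
        tests:
          - unique
          - not_null
```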
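For dimension data in staging, the unique key + timestamp snapshot strategy maps to a dbt snapshot block like the one below. The snapshot name, target schema, and source are assumptions; `airbyte_extracted_at` comes from the example in the table.

```sql
-- snapshots/customers_snapshot.sql (hypothetical example)
{% snapshot customers_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='timestamp',
        updated_at='airbyte_extracted_at'
    )
}}

select * from {{ source('raw_prd', 'customers') }}

{% endsnapshot %}
```

For variant b), when no reliable timestamp is available, the same block would use `strategy='check'` with a `check_cols` list of the columns to compare.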
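For models above the 20 M row threshold, incremental + merge corresponds to a dbt config like this sketch (model and column names are assumptions):

```sql
-- models/intermediate/int_order_lines.sql (hypothetical example)
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='order_line_key'
    )
}}

select
    order_line_key,
    order_id,
    updated_at
from {{ ref('stg_order_lines') }}

{% if is_incremental() %}
-- on incremental runs, only pick up rows changed since the last run
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```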