Streamline Data Transformations with dbt for CRM Migrations

Discover how dbt simplifies data orchestration in CRM migration projects. Learn about templating, YAML configurations, and robust testing to ensure clean and reliable data transitions.
Kyle Tuft, Data Engineer, DI Squared
April 2, 2025

Migrating CRM systems with dbt

Migrating CRM systems such as Salesforce and HubSpot involves handling significant volumes of complex data. Ensuring that this data is transformed, integrated, and made usable in the new system is no small feat. dbt (data build tool) provides a robust framework that streamlines these data transformations, making your migration project more efficient and less error-prone. This article delves into how dbt coordinates data flow, manages metadata through YAML configurations, and incorporates rigorous testing to secure a clean handoff between systems.

CRM migration projects often demand the movement of data from legacy systems to modern solutions, all while ensuring data integrity and usability. This task is complicated by the sheer volume and variety of information—from customer records to sales data—requiring a systematic approach to transformation. With dbt, you can orchestrate the journey of raw data through a series of refined stages until it’s fully compatible with the target CRM environment. 


What you'll learn in this article

  • How dbt coordinates data flow from diverse sources, ensuring that every piece of data is correctly handled. 
  • The process of transforming, filtering, and testing data to prepare it for a seamless transition to your new CRM.
  • An in-depth, under-the-hood look at how dbt models are structured and how they function.
  • Practical examples of how dbt’s features translate to tangible benefits in real-world CRM migrations.

dbt overview 

dbt is a data transformation tool, written primarily in Python, that orchestrates how your SQL runs. It allows you to chain together SQL statements, referred to as models, in a sequence that starts with raw data and ends with refined data products, often called data marts. This sequence forms a directed acyclic graph (DAG), which dbt uses to execute models in the proper order. Additionally, dbt uses Jinja2 (or just ‘Jinja’) as a templating language for dynamic SQL generation and YAML files for managing metadata and configuration. Jinja templates allow you to reuse code snippets and, together with YAML, help you avoid unnecessary repetition.
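
To make the idea of a DAG-driven run order concrete, here is a minimal Python sketch. It is not dbt's actual scheduler; it only uses the standard library's graphlib module to order a handful of hypothetical nodes (crm_raw.contacts, stg_contacts, and the other names are made up for illustration) so that each one runs after everything it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical node names. Each key maps to the set of nodes it depends on.
dependencies = {
    "stg_contacts": {"crm_raw.contacts"},
    "stg_accounts": {"crm_raw.accounts"},
    "mart_customers": {"stg_contacts", "stg_accounts"},
}


def execution_order(deps):
    """Return one valid order in which every node comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The two staging nodes do not depend on each other, so either may come first, but the mart always comes last. A circular dependency would raise graphlib.CycleError, which is why the graph has to be acyclic.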
 
Ultimately, adopting dbt means more reliable, efficient, and maintainable data transformations that save time and reduce errors during critical CRM migrations.  


dbt in CRM migration projects 

If you’ve ever had a stray comma or other punctuation mark ruin your day, you’re aware of how important clean data is! In many CRM migration scenarios, such as moving from a legacy CRM to a modern solution, common challenges include data inconsistencies, duplicates, and varying formats. dbt can be used to coordinate the flow of data from various source systems to marts or tables that can be easily consumed by your new CRM environment. 


Consider a real-world scenario: a company needs to merge customer data from multiple tables that store addresses, phone numbers, and purchase histories. The legacy system may have these fields spread across different formats and with inconsistent naming conventions. dbt allows you to apply transformations that consolidate and clean the data, ensuring that only standardized, accurate information flows into the new system. By incorporating tests that check for null values, duplicate entries, and referential integrity, dbt helps ensure that the migrated data meets high quality standards and is ready for immediate use in the new CRM environment. 
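
As a rough illustration of that consolidation step, the sketch below runs the kind of SELECT a dbt model might contain against an in-memory SQLite database. Everything here is an assumption made for the example: the two legacy tables, their inconsistent column names, and the sample rows are invented, and SQLite stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two hypothetical legacy tables with inconsistent naming conventions.
    CREATE TABLE legacy_web_contacts (EmailAddress TEXT, FullName TEXT);
    CREATE TABLE legacy_store_contacts (email_addr TEXT, contact_nm TEXT);
    INSERT INTO legacy_web_contacts VALUES
        ('Ana@Example.com ', ' Ana Ruiz'),
        (NULL, 'No Email');
    INSERT INTO legacy_store_contacts VALUES
        ('ana@example.com', 'Ana Ruiz'),
        ('bo@example.com', 'Bo Chen');
    """
)

# Standardize names and formats, drop rows without an email, and let UNION
# remove rows that become identical once they are cleaned.
CONSOLIDATE_SQL = """
SELECT LOWER(TRIM(EmailAddress)) AS email, TRIM(FullName) AS full_name
FROM legacy_web_contacts
WHERE EmailAddress IS NOT NULL
UNION
SELECT LOWER(TRIM(email_addr)), TRIM(contact_nm)
FROM legacy_store_contacts
WHERE email_addr IS NOT NULL
ORDER BY 1
"""

customers = conn.execute(CONSOLIDATE_SQL).fetchall()
print(customers)
```

The two spellings of Ana's record collapse into one row only because both columns match after cleaning. Records that differ in other fields would need an explicit rule for which version to keep.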
 

Quickstart guide: dbt for CRM migration


Here's how to get started with dbt quickly and effectively. To install dbt, run:

pip install dbt-core 

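If you use a specific data warehouse such as Snowflake, BigQuery, or Redshift, you also need the matching adapter package (for example, dbt-snowflake, dbt-bigquery, or dbt-redshift). Check the dbt documentation for the adapter and version that fit your platform.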

Once dbt is installed on your machine, you simply navigate to your desired project directory such as:

cd dbt\dbt_crm
 

Once there, you can run the command:

dbt init

...to initialize a new project. dbt prompts you for a project name, the warehouse adapter to use, and your connection details, which it saves to a profile, and it creates your initial project structure for you. Alternatively, you can set up your own structure following dbt guidelines.

Under the hood

Now that we’ve installed it, let’s see how dbt sets up a project. In our dbt_crm directory, you’ll see the structure of a typical dbt project. Most of the core logic resides within the models subdirectory. A typical dbt project is structured into layers. For example, many projects might contain three main types of models:

  1. source models: the raw data as it exists in your legacy CRM system.
  2. staging models: used for intermediate transformations.
  3. mart models: the final, refined data ready for the new CRM.

 

This layered approach not only ensures clarity in data processing but also makes it easier to debug and maintain your transformation logic.
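
The three layers can be sketched outside dbt with plain SQL views. The Python snippet below is only an analogy under stated assumptions: the table name raw_accounts, the view names, and the sample rows are hypothetical, and SQLite views stand in for dbt models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Source layer: raw data as it arrives from the legacy CRM.
    CREATE TABLE raw_accounts (account_id TEXT, account_name TEXT, status TEXT);
    INSERT INTO raw_accounts VALUES
        ('1', ' Acme Corp ', 'ACTIVE'),
        ('2', 'Globex', 'closed'),
        ('3', NULL, 'active');

    -- Staging layer: light cleanup only (trimming, consistent casing).
    CREATE VIEW stg_accounts AS
    SELECT account_id,
           TRIM(account_name) AS account_name,
           LOWER(status) AS status
    FROM raw_accounts;

    -- Mart layer: the shape the new CRM will load.
    CREATE VIEW mart_active_accounts AS
    SELECT account_id, account_name
    FROM stg_accounts
    WHERE status = 'active' AND account_name IS NOT NULL;
    """
)

mart_rows = conn.execute(
    "SELECT account_id, account_name FROM mart_active_accounts ORDER BY account_id"
).fetchall()
print(mart_rows)
```

Because the mart reads only from the staging view, a cleanup rule changed in staging reaches every downstream consumer without touching the mart's own logic.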

A typical model might look like this:

{{ config(materialized='table') }}
-- No trailing semicolon: dbt wraps this SELECT in its own statement
SELECT
    field1,
    field2,
    ...
FROM {{ source('crm_raw', 'some_table') }}
WHERE field1 IS NOT NULL

You’ll notice:

  • A Jinja config block at the top
  • A Jinja reference to a source in the FROM clause
  • Otherwise, standard SQL code in the SELECT, FROM, and WHERE clauses
  • Comments can be added anywhere you like

Aside from Jinja snippets (e.g., source('crm_raw', 'some_table')), it looks much like normal SQL. These snippets are essential for building dbt’s dependency graph. When you run a dbt command, dbt resolves these references, understands the execution order, and runs each model in the proper sequence.
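
A simplified way to picture that resolution step is the Python sketch below. It is an illustration, not how dbt works internally: dbt renders models with Jinja, whereas this example uses a regular expression, and the schema name legacy_crm is an assumption made for the example.

```python
import re

MODEL_SQL = """
{{ config(materialized='table') }}

SELECT field1, field2
FROM {{ source('crm_raw', 'some_table') }}
WHERE field1 IS NOT NULL
"""

# Matches source('name', 'table') and captures both arguments.
SOURCE_CALL = re.compile(r"source\(\s*'([^']+)'\s*,\s*'([^']+)'\s*\)")


def find_sources(sql):
    """Return (source_name, table_name) pairs referenced via source(...)."""
    return SOURCE_CALL.findall(sql)


def render_sources(sql, schema_by_source):
    """Replace each {{ source(...) }} call with a schema-qualified table name."""
    pattern = re.compile(r"\{\{\s*" + SOURCE_CALL.pattern + r"\s*\}\}")
    return pattern.sub(
        lambda m: f"{schema_by_source[m.group(1)]}.{m.group(2)}", sql
    )


# "legacy_crm" is a hypothetical schema that holds the raw tables.
rendered = render_sources(MODEL_SQL, {"crm_raw": "legacy_crm"})
print(find_sources(MODEL_SQL))
print(rendered)
```

The list of references found in each model is what lets a tool build the dependency graph before any SQL runs. The rendered text is what would eventually be sent to the warehouse; this sketch leaves the config block unrendered.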

Schema (config) files

Understanding metadata and its configuration is essential for maintaining clarity in your data transformation process. In dbt projects, metadata is managed through YAML (.yml) configuration files. A key strength of the dbt paradigm is that this metadata is not merely descriptive but functional: the same files that document your models also define the tests dbt runs against them, so documentation and data quality rules stay in one place and your project stays clean and reproducible.

There are two main types: model and source configs.  

Model files document every model and column, describing their roles, associated tests, and any specific data quality rules. Here’s an example:

version: 2 
  
models: 
   - name: installation_appointments 
     description: "Appointments data used for scheduling and tracking installs." 
     columns: 
       - name: crm_legacy_id 
         description: "Primary key field from the legacy CRM system." 
         tests: 
           - not_null 
           - unique 

This snippet shows the documentation and tests for the installation_appointments model. Note the tests: if any of them fail (e.g., if crm_legacy_id is not unique), dbt reports the failure, and a dbt build run skips the models downstream of the failing test, preventing bad data from flowing downstream.
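
Conceptually, each of these tests is a query that looks for offending rows, and the test passes when the query returns nothing. The Python sketch below mimics that idea against an in-memory SQLite table. The sample rows are invented, and the queries are hand-written approximations, not the SQL dbt generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE installation_appointments (crm_legacy_id TEXT)")
conn.executemany(
    "INSERT INTO installation_appointments VALUES (?)",
    [("A-1",), ("A-2",), ("A-2",), (None,)],
)

# Each check is a query that returns the offending rows; zero rows means "pass".
CHECKS = {
    "not_null": """
        SELECT crm_legacy_id FROM installation_appointments
        WHERE crm_legacy_id IS NULL
    """,
    "unique": """
        SELECT crm_legacy_id FROM installation_appointments
        WHERE crm_legacy_id IS NOT NULL
        GROUP BY crm_legacy_id
        HAVING COUNT(*) > 1
    """,
}


def run_checks(connection, checks):
    """Return {check_name: failing_rows} for every check that found rows."""
    failures = {}
    for name, sql in checks.items():
        rows = connection.execute(sql).fetchall()
        if rows:
            failures[name] = rows
    return failures


failures = run_checks(conn, CHECKS)
print(failures)
```

Here both checks fail: one row has no crm_legacy_id, and the value A-2 appears twice. In a migration, these are the records you would want to resolve before loading the new CRM.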

Sources are documented in a similar structure, albeit with a slightly different hierarchy. Here we define the source schema and the underlying tables, noting each column’s name and data type:

version: 2 
  
sources: 
   - name: crm_raw 
     tables: 
       - name: some_table 
         columns: 
           - name: field1 
             data_type: varchar 
           - name: field2 
             data_type: timestamp 
   
   

Next steps

By now, you should have a solid understanding of how dbt can simplify and streamline your CRM migration projects. Beyond foundational transformations, dbt offers advanced features like incremental models, snapshotting for tracking changes over time, and extensive testing frameworks that make managing complex data pipelines both scalable and reliable. These capabilities not only enhance the efficiency of your migration but also build confidence in the integrity of your data.

To learn more, check out the official dbt Docs site or explore a dbt example repo. When you’re ready to install and dive deeper, refer to dbt’s installation guide for the most up-to-date instructions.

Need a helping hand?

If you’re ready to take your CRM migration project to the next level, consider diving deeper into the official dbt documentation or experimenting with a sample project. At DI Squared, we have guided organizations of all sizes, from Fortune 500 companies to small and midsize enterprises, through successful CRM system migrations while improving data quality along the way.

Book your 1:1 consultation and learn how we can help optimize your CRM data pipeline using dbt.