I have my dbt project configured and running on dbt-postgres. Is there an easy way to port it to dbt-databricks without having to rewrite the models?
Context: I am building a dbt demo on Postgres, Databricks, and Azure SQL for my organization, to see how well dbt fits our environments.
What I've already tried: I have pipelines running successfully on dbt-postgres.

2 Answers


  1. dbt handles the DDL/DML boilerplate for you, and for select statements Postgres and Spark SQL/Databricks aren't that far off, but you will have to make some tweaks to the models to get things running, and you'll want to take advantage of some of the Databricks-specific configs, like partitioning.

    One way to ease this transition is to use dbt's cross-database macros, which standardize the syntax of functions such as datediff that vary between dialects.
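
    A minimal sketch of a model using one of those macros (`dbt.datediff` ships with dbt-core 1.2+; the table and column names here are made up):

    ```sql
    -- Instead of Postgres-specific date arithmetic, let dbt render the
    -- correct datediff syntax for whichever adapter is active.
    select
        order_id,
        {{ dbt.datediff("ordered_at", "shipped_at", "day") }} as days_to_ship
    from {{ ref("orders") }}
    ```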

    Another option, if your models don’t contain much jinja, would be to use sqlglot to transpile between dialects.
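
    A rough sketch of that approach (assumes `pip install sqlglot`; the query is illustrative, and this only works on plain SQL, not on models containing jinja):

    ```python
    import sqlglot

    # A Postgres-flavored query using the :: cast shorthand.
    pg_sql = "SELECT id, created_at::date AS created_date FROM orders"

    # Transpile it to the Databricks dialect; sqlglot rewrites any
    # dialect-specific syntax it recognizes.
    databricks_sql = sqlglot.transpile(pg_sql, read="postgres", write="databricks")[0]
    print(databricks_sql)
    ```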

    If you have the same data in postgres and databricks, you can use datafold’s data-diff to compare the results of the modeled data between datastores, to confirm you ported everything correctly.

  2. FYI, Databricks is adding support for lateral column aliases (available in Databricks Runtime 12.2 and above), which will ease a lot of these problems:
    https://docs.databricks.com/sql/language-manual/sql-ref-name-resolution.html
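
    For reference, a lateral column alias lets a later expression in the same select list reuse an earlier alias, e.g. (illustrative column names):

    ```sql
    -- Valid on Databricks Runtime 12.2+: "revenue" is defined and then
    -- referenced again within the same SELECT list.
    select
        price * quantity as revenue,
        round(revenue * 0.1, 2) as estimated_tax
    from orders
    ```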
