Snowflake quickstart
In this quickstart guide, you'll learn how to use dbt Cloud with Snowflake. It will show you how to:
- Create a new Snowflake worksheet.
- Load sample data into your Snowflake account.
- Connect dbt Cloud to Snowflake.
- Take a sample query and turn it into a model in your dbt project. A model in dbt is a select statement.
- Add tests to your models.
- Document your models.
- Schedule a job to run.
Snowflake also provides a quickstart for you to learn how to use dbt Cloud. It makes use of a different public dataset (Knoema Economy Data Atlas) than what's shown in this guide. For more information, refer to Accelerating Data Teams with dbt Cloud & Snowflake in the Snowflake docs.
:::tip Videos for you You can check out dbt Fundamentals for free if you're interested in course learning with videos.
You can also watch the YouTube video on dbt and Snowflake. :::
Prerequisites¶
- You have a dbt Cloud account.
- You have a trial Snowflake account. During trial account creation, make sure to choose the Enterprise Snowflake edition so you have
ACCOUNTADMINaccess. For a full implementation, you should consider organizational questions when choosing a cloud provider. For more information, see Introduction to Cloud Platforms in the Snowflake docs. For the purposes of this setup, all cloud providers and regions will work so choose whichever you’d like.
Create a new Snowflake worksheet¶
- Log in to your trial Snowflake account.
- In the Snowflake UI, click + Worksheet in the upper right corner to create a new worksheet.
Load data¶
The data used here is stored as CSV files in a public S3 bucket and the following steps will guide you through how to prepare your Snowflake account for that data and upload it.
-
Create a new virtual warehouse, two new databases (one for raw data, the other for future dbt development), and two new schemas (one for
jaffle_shopdata, the other forstripedata).To do this, run these SQL commands by typing them into the Editor of your new Snowflake worksheet and clicking Run in the upper right corner of the UI:
-
In the
rawdatabase andjaffle_shopandstripeschemas, create three tables and load relevant data into them:-
First, delete all contents (empty) in the Editor of the Snowflake worksheet. Then, run this SQL command to create the
customertable: -
Delete all contents in the Editor, then run this command to load data into the
customertable:- Delete all contents in the Editor (empty), then run this command to create thecopy into raw.jaffle_shop.customers (id, first_name, last_name) from 's3://dbt-tutorial-public/jaffle_shop_customers.csv' file_format = ( type = 'CSV' field_delimiter = ',' skip_header = 1 );orderstable: -
Delete all contents in the Editor, then run this command to load data into the
orderstable: - Delete all contents in the Editor (empty), then run this command to create the
paymenttable: - Delete all contents in the Editor, then run this command to load data into the
paymenttable: - Verify that the data is loaded by running these SQL queries. Confirm that you can see output for each one.
-
Connect dbt Cloud to Snowflake¶
There are two ways to connect dbt Cloud to Snowflake. The first option is Partner Connect, which provides a streamlined setup to create your dbt Cloud account from within your new Snowflake trial account. The second option is to create your dbt Cloud account separately and build the Snowflake connection yourself (connect manually). If you want to get started quickly, dbt Labs recommends using Partner Connect. If you want to customize your setup from the very beginning and gain familiarity with the dbt Cloud setup flow, dbt Labs recommends connecting manually.
Using Partner Connect allows you to create a complete dbt account with your Snowflake connection, a managed repository, environments, and credentials.
-
In the Snowflake UI, click on the home icon in the upper left corner. Click on your user, and then select Partner Connect. Find the dbt tile by scrolling or by searching for dbt in the search bar. Click the tile to connect to dbt.
If you’re using the classic version of the Snowflake UI, you can click the Partner Connect button in the top bar of your account. From there, click on the dbt tile to open up the connect box.
-
In the Connect to dbt popup, find the Optional Grant option and select the RAW and ANALYTICS databases. This will grant access for your new dbt user role to each database. Then, click Connect.
-
Click Activate when a popup appears:
- After the new tab loads, you will see a form. If you already created a dbt Cloud account, you will be asked to provide an account name. If you haven't created account, you will be asked to provide an account name and password.
-
After you have filled out the form and clicked Complete Registration, you will be logged into dbt Cloud automatically.
-
From your Account Settings in dbt Cloud (using the gear menu in the upper right corner), choose the "Partner Connect Trial" project and select snowflake in the overview table. Select edit and update the fields Database and Warehouse to be
analyticsandtransforming, respectively.
- Create a new project in dbt Cloud. From Account settings (using the gear menu in the top right corner), click + New Project.
- Enter a project name and click Continue.
-
For the warehouse, click Snowflake then Next to set up your connection.
-
Enter your Settings for Snowflake with:
-
Account — Find your account by using the Snowflake trial account URL and removing
snowflakecomputing.com. The order of your account information will vary by Snowflake version. For example, Snowflake's Classic console URL might look like:oq65696.west-us-2.azure.snowflakecomputing.com. The AppUI or Snowsight URL might look more like:snowflakecomputing.com/west-us-2.azure/oq65696. In both examples, your account will be:oq65696.west-us-2.azure. For more information, see Account Identifiers in the Snowflake docs. -
Role — Leave blank for now. You can update this to a default Snowflake role later.
- Database —
analytics. This tells dbt to create new models in the analytics database. - Warehouse —
transforming. This tells dbt to use the transforming warehouse that was created earlier.
-
-
Enter your Development Credentials for Snowflake with:
- Username — The username you created for Snowflake. The username is not your email address and is usually your first and last name together in one word.
- Password — The password you set when creating your Snowflake account.
- Schema — You’ll notice that the schema name has been auto created for you. By convention, this is
dbt_<first-initial><last-name>. This is the schema connected directly to your development environment, and it's where your models will be built when running dbt within the Cloud IDE. - Target name — Leave as the default.
- Threads — Leave as 4. This is the number of simultaneous connects that dbt Cloud will make to build models concurrently.
-
Click Test Connection. This verifies that dbt Cloud can access your Snowflake account.
- If the connection test succeeds, click Next. If it fails, you may need to check your Snowflake settings and credentials.
Set up a dbt Cloud managed repository¶
If you used Partner Connect, you can skip to initializing your dbt project as the Partner Connect provides you with a managed repository. Otherwise, you will need to create your repository connection.
Initialize your dbt project and start developing¶
Now that you have a repository configured, you can initialize your project and start development in dbt Cloud:
- Click Develop from the upper left. It might take a few minutes for your project to spin up for the first time as it establishes your git connection, clones your repo, and tests the connection to the warehouse.
- Above the file tree to the left, click Initialize your project. This builds out your folder structure with example models.
- Make your initial commit by clicking Commit. Use the commit message
initial commit. This creates the first commit to your managed repo and allows you to open a branch where you can add new dbt code. - You can now directly query data from your warehouse and execute
dbt run. You can try this out now:- Click + Create new file, add this query to the new file, and click Save as to save the new file:
- In the command line bar at the bottom, enter
dbt runand click Enter. You should see adbt run succeededmessage.
Build your first model¶
- Click Develop from the upper left of dbt Cloud. You need to create a new branch since the main branch is set to read-only mode.
- Click Create branch. You can name it
add-customers-model. - Click the ... next to the Models directory, then select Create file.
- Name the file
models/customers.sql, then click Create. -
Copy the following query into the file and click Save.
with customers as ( select id as customer_id, first_name, last_name from raw.jaffle_shop.customers ), orders as ( select id as order_id, user_id as customer_id, order_date, status from raw.jaffle_shop.orders ), customer_orders as ( select customer_id, min(order_date) as first_order_date, max(order_date) as most_recent_order_date, count(order_id) as number_of_orders from orders group by 1 ), final as ( select customers.customer_id, customers.first_name, customers.last_name, customer_orders.first_order_date, customer_orders.most_recent_order_date, coalesce(customer_orders.number_of_orders, 0) as number_of_orders from customers left join customer_orders using (customer_id) ) select * from final -
Enter
dbt runin the command prompt at the bottom of the screen. You should get a successful run and see the three models.
Later, you can connect your business intelligence (BI) tools to these views and tables so they only read cleaned up data rather than raw data in your BI tool.