Snowflake
Loading data from Snowflake
You can add a Snowflake source to connect directly between your Snowflake instance and your DCN. This will enable you to easily map data such as identifiers, associated identifiers and traits, from your existing data warehouse into your DCN from a view or a table using a predefined tabular data schema.
Granting Your DCN Access to Your Snowflake Account
First, before connecting your Snowflake account with your DCN, you will be required to execute the script below in order to pre-configure the necessary user, role, database, warehouse, schema during the configuration of the new source.
Make sure to select the All Queries option when executing the script in Snowflake
We recommend that you define a unique and secure for the user_password field
Organizing Your Data
In order for your view to be successfully ingested into your DCN, it must follow an established schema which is outlined within our Tabular Data Schema section.
Once the columns in the table or view follow the proper schema, you will be able to carry your first party data into your node from Snowflake once the connection is complete.
Adding a Snowflake Source via DCN UI
Once you have executed the script and ensured that the schema is properly formatted, you can add a new Snowflake source to your DCN in the Integrations section.
You can add Snowflake as a source via the following path Integrations > Sources > Add Source and proceed to the Snowflake source form.
There are fields which have a default value set and only need to be modified if required, the default values are the ones that were used in the script to set up your Snowflake account.
Username
The username of the read-only snowflake account. Defaults to OPTABLE_USER
.
Database
The database assigned to the read-only snowflake account. Defaults to OPTABLE_DATABASE
.
Schema
The schema for the read-only snowflake account. Defaults to OPTABLE_SCHEMA
Warehouse
The warehouse used to power the ingestion operations. Defaults to OPTABLE_WAREHOUSE
.
Role
The role of the read-only snowflake account. Defaults to OPTABLE_ROLE
.
You must fill out the Account ID, Password and Table View sections in order to connect with your Snowflake account and point towards the right view within your database.
At the ingestion frequency section, you can schedule the time at which you would like your DCN to ingest new data. The options are Now, Once, Daily and Hourly. Once everything is set, you can click on save and the source will be added to your DCN. Once done, a test connection will be performed and the ingestion will occur according to the set schedule. You can also manually force an ingestion from the listing by clicking on ingest, and the process will begin right away.
There are several states that your Snowflake Source can have such as Active, Configured, Processing and Error.
Active
Scheduling has been set and has ingested at least once.
Configured
Scheduling has been set and ingestion has not occurred yet.
Processing
Ingestion in progress.
Error
There was an issue with the ingestion. Settings need to be verified.
Adding a Snowflake Source via CLI
You can also add a Source via the CLI command optable-cli source create snowflake <name> <account> <password> <table>
, an example of what a created Snowflake Source would look like:
Refer to Snowflake's documentation for details on how to determine your account ID.
The remaining optional fields are:
If you use the recommended setup scripts for Snowflake, these values will be set as the defaults and you will not need to pass any of the above optional fields.
Additionally, you can pass:
Scheduling Ingestion from Snowflake
Passing no scheduling parameters at all results in the assumption that you want to create a Snowflake Source that ingests only once at the current time. The scheduling parameters are designed to infer as much as possible and it is possible to modify them:
If you only pass --ingestion-frequency and omit the --at option, that means you will ingest on the specified schedule (once, daily, hourly) starting at the current time and repeating at the current time.
Conversely, only passing the --at flag without the --ingestion-frequency flag means the ingestion will run only once at the time/date you specify. For example:
--at 2022-03-10T15:00:00-05:00 will create a snowflake source scheduled to ingest once on March 10, 2022 at 3 p.m.
Combining both flags allows you to specify any combination of the scheduling parameters. For example:
--ingestion-frequency daily --at 2022-03-10T15:00:00-05:00 creates a snowflake source scheduled to ingest every day at 3 p.m. starting on March 10, 2022.
Once complete, a test connection with your data warehouse will be performed while looking for the table or view that you have previously entered. If unsuccessful, you will be prompted an error message if the connection fails or the values entered do not match.
Source Update
You can also update source fields (except the scheduling fields) using optable-cli source update snowflake <id>
. You need to specify any combination of fields you want to update. The options are name, account, password, table, expiry, user, database, schema, warehouse and role. They all represent the same values as in source creation.
Manual Ingestion
If you want to manually refresh the source and bypass the scheduling that has been set, use command optable-cli source ingest <id>
and the source will refresh.
Once all of the parameters have been set, you will be able to see the Snowflake Source within the UI as well as CLI and ingestion will occur at the scheduled time.
Last updated