Databricks

Loading data from your Databricks Delta Lake.

The Databricks source enables you to seamlessly share tables from your Databricks Delta Lake with your DCN via the Delta Sharing protocol.

This lets you map identifiers and traits from tables that follow the tabular data schema, without replicating data or running complex migrations or transformations.

Prerequisites:

  • Have a Databricks account with sufficient permissions to create Delta Shares, add Recipients and add assets to Shares.

  • Create a table in Databricks that follows the tabular data format and contains the data you want to load into your DCN.

Setting up a Delta Share in Databricks:

  1. In Databricks, head to Data and then select Delta Sharing from the accordion menu that appears.

  2. Click Share Data and create a Delta Share specific to your DCN. This share should contain tables that you want to load into your DCN.

  3. Click Manage Assets, add the previously created Delta table that is based on the tabular data format, and click "Add".

  4. Once you've added the assets, you can then click Add Recipients.

  5. In the modal that opens, you can either select a previously created recipient, or create a new one that is specific to your DCN.

  6. Click Create & Add and copy the URL for the credentials file.

  7. Download the credentials file to your local machine, then head to your DCN to create the source.
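Before uploading the credentials file, it can help to confirm that the download is complete and well-formed. The following is a minimal sketch, assuming the file is a bearer-token profile as described in the Delta Sharing protocol (a JSON document with shareCredentialsVersion, endpoint and bearerToken fields). The function names and the sample values are illustrative and are not part of Databricks or your DCN.

```python
import json

# Fields the Delta Sharing profile format defines for bearer-token profiles.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def missing_profile_fields(profile: dict) -> list:
    """Return the required fields that are absent or empty in a parsed profile."""
    return [name for name in REQUIRED_FIELDS if not profile.get(name)]


def check_credentials_file(path: str) -> list:
    """Load a credentials file from disk and report any missing fields."""
    with open(path, encoding="utf-8") as handle:
        return missing_profile_fields(json.load(handle))


if __name__ == "__main__":
    # Hypothetical sample content; a real file comes from the Databricks activation link.
    sample = {
        "shareCredentialsVersion": 1,
        "endpoint": "https://example.com/delta-sharing/",
        "bearerToken": "<token>",
        "expirationTime": "2030-01-01T00:00:00.0Z",
    }
    print(missing_profile_fields(sample))
```

An empty list means all expected fields are present. Treat the file like a password, since the bearer token grants read access to the share.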

Creating a Databricks source:

  1. Head to Integrations -> Sources and click "Create".

  2. Select the Databricks source.

  3. Enter a name for the source. This is how the source will be identified when you use it across the platform.

  4. Enter the name of the Share, Schema and Table that you created in Databricks.

  5. Upload the previously downloaded credentials file.

  6. Click "Create".
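If you want to double-check the Share, Schema and Table names before entering them, one option is to read the table yourself with the open-source delta-sharing Python connector, which addresses a table as <profile-file>#<share>.<schema>.<table>. The sketch below is only an illustration of that check, not how your DCN reads the data. The helper names, and the share, schema and table names in the usage note, are hypothetical, and preview_table assumes the delta-sharing package is installed.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address used by delta-sharing."""
    return f"{profile_path}#{share}.{schema}.{table}"


def preview_table(profile_path: str, share: str, schema: str, table: str, rows: int = 5):
    """Fetch a few rows to confirm the recipient can read the shared table."""
    # Third-party dependency (pip install delta-sharing), imported lazily so
    # that table_url stays usable without it.
    import delta_sharing

    url = table_url(profile_path, share, schema, table)
    return delta_sharing.load_as_pandas(url, limit=rows)
```

For example, preview_table("config.share", "dcn_share", "default", "customers") would try to return the first five rows as a pandas DataFrame. An error at this point usually indicates a misspelled name or a table that has not been added to the share.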

Databricks Delta Sharing credentials are usually granted for two (2) weeks, after which you will have to upload a new credentials file.

Check with your Databricks administrator for the exact credentials lifetime.
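To see when a particular credentials file expires, you can read the optional expirationTime field that the Delta Sharing profile format defines. This is a minimal sketch, assuming the field is a UTC timestamp ending in "Z" (for example 2021-11-12T00:12:29.0Z). The function names are illustrative, and a file without the field is treated as having no recorded expiry.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def parse_expiration(value: str) -> datetime:
    """Parse a UTC timestamp such as '2021-11-12T00:12:29.0Z'."""
    # Assumption: the timestamp is UTC and ends in 'Z'. Fractional seconds are dropped.
    text = value.rstrip("Z").split(".")[0]
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


def days_until_expiry(profile: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return the days left before the credentials expire, or None if no expiry is set."""
    expiration = profile.get("expirationTime")
    if expiration is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (parse_expiration(expiration) - now).total_seconds() / 86400


def days_left_in_file(path: str) -> Optional[float]:
    """Convenience wrapper that reads the credentials file from disk."""
    with open(path, encoding="utf-8") as handle:
        return days_until_expiry(json.load(handle))
```

A negative result means the credentials have already expired and a new file needs to be generated in Databricks and uploaded to the source.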

With these steps completed, your DCN service account should now have access to the Databricks table that you shared.
