Google Cloud Storage

Loading data from your Google Cloud Storage bucket.

To integrate an existing Google Cloud Storage (GCS) bucket with your DCN, add a Google Cloud Storage source. The source lets you map data such as identifiers, associated identifiers, and traits from files in the bucket, provided the files are of a supported type and follow the tabular data format.

Connecting your GCS bucket directly lets your DCN ingest your data without a separate data migration or transformation step.

Prerequisites:

  • Have a Google Cloud Storage account with permission to manage buckets and grant access to principals.

  • Load one or more files that follow the tabular data schema into the bucket.

Steps:

  1. Open the source creation form in your DCN and copy your DCN service account from the form.

  2. In Google Cloud Storage, navigate to the bucket or subdirectory you want to grant your DCN access to.

  3. Click the "Permissions" tab, then click "Grant Access".

  4. In the "Add Principal" field, paste the DCN service account that you copied earlier.

  5. Grant your DCN's service account the "Storage Object Admin" role, listed under "Cloud Storage". This allows your DCN to create and read files in this bucket.

  6. Click "Save" to apply the changes and grant your DCN access to your Google Cloud Storage bucket.

  7. Return to the source creation form and enter the bucket and subdirectory name (for example, gs://mybucket/mydirectory/).

  8. Set the expiry and the ingestion frequency.

  9. Click "Create".

Once these steps are complete, your DCN service account should have access to the Google Cloud Storage bucket or subdirectory you selected.
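
For readers who manage bucket permissions as data rather than through the console, the grant in steps 4 to 6 and the path entered in step 7 can be sketched in Python. This is an illustrative sketch, not part of the DCN product: it works on a plain IAM policy dictionary in the standard {"bindings": [{"role": ..., "members": [...]}]} shape and does not call Google Cloud. The function names and the example service account address are hypothetical; "roles/storage.objectAdmin" is the IAM role ID for "Storage Object Admin".

```python
import copy
from urllib.parse import urlparse

# IAM role ID for the "Storage Object Admin" role named in step 5.
STORAGE_OBJECT_ADMIN = "roles/storage.objectAdmin"


def add_object_admin_binding(policy: dict, service_account_email: str) -> dict:
    """Return a copy of an IAM policy dict that grants the service account
    the Storage Object Admin role. The input policy is left unchanged."""
    member = f"serviceAccount:{service_account_email}"
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the role if there is one.
        if binding.get("role") == STORAGE_OBJECT_ADMIN and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            break
    else:
        bindings.append({"role": STORAGE_OBJECT_ADMIN, "members": [member]})
    return updated


def split_gcs_uri(uri: str) -> tuple:
    """Split a gs:// URI into (bucket, subdirectory prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical service account address; copy the real one from the
# source creation form (step 1).
policy = add_object_admin_binding(
    {"bindings": []}, "my-dcn@example-project.iam.gserviceaccount.com"
)
bucket, prefix = split_gcs_uri("gs://mybucket/mydirectory/")
```

Applying the resulting policy to a real bucket still requires the Google Cloud console, CLI, or a client library; the sketch only shows what the binding looks like.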

Notes:

Your DCN automatically checks the bucket for new files at the ingestion frequency you set when creating the source. If it finds multiple files, it triggers multiple ingestions simultaneously.

Cloud storage sources have a default rejection threshold of 100%. This means that even if 99% of the records in a file are errored, your DCN ingests the remaining 1%. If a file is 100% invalid, however, your DCN returns an error for that file and continues attempting to ingest the remaining files in the bucket.
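
The threshold rule can be illustrated with a small Python sketch. It is a model of the behavior described above, not the DCN's actual implementation: the function name and return shape are hypothetical, and two details are assumptions because the text does not specify them, namely that a file is rejected when its error rate is greater than or equal to the threshold, and that an empty file is treated as invalid input.

```python
def ingestion_outcome(total_records: int, errored_records: int,
                      rejection_threshold: float = 1.0) -> tuple:
    """Return (status, records_ingested) for one file.

    rejection_threshold is a fraction; 1.0 corresponds to the 100% default.
    Assumption: the file is rejected when error_rate >= rejection_threshold.
    """
    if total_records <= 0:
        # Assumption: empty-file behavior is not described, so fail loudly.
        raise ValueError("total_records must be positive")
    if not 0 <= errored_records <= total_records:
        raise ValueError("errored_records must be between 0 and total_records")

    error_rate = errored_records / total_records
    if error_rate >= rejection_threshold:
        return ("rejected", 0)
    # Below the threshold, the valid records are still ingested.
    return ("ingested", total_records - errored_records)


# With the default threshold, a file with 99 bad records out of 100
# still contributes its 1 valid record; a fully invalid file is rejected.
mostly_bad = ingestion_outcome(100, 99)
all_bad = ingestion_outcome(100, 100)
```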

If you modify a file in the bucket, your DCN detects the change and attempts to re-ingest the file. Files are deemed "New" or "Updated" based on the "Last Modified" timestamp: a file whose timestamp has changed is ingested again.
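
As a rough illustration of this check, the Python sketch below compares a current bucket listing against the timestamps recorded at the previous check. It is a simplified model rather than the DCN's code: the function name, the dictionary-based listing, and the file names are hypothetical, and it assumes a file that has never been seen counts as "New".

```python
from datetime import datetime, timezone


def classify_files(listing: dict, previously_seen: dict) -> dict:
    """Map each file name that needs ingestion to "New" or "Updated".

    listing:         {file name: current "Last Modified" timestamp}
    previously_seen: {file name: timestamp recorded at the previous check}
    Files whose timestamp is unchanged are left out of the result.
    """
    to_ingest = {}
    for name, last_modified in listing.items():
        if name not in previously_seen:
            to_ingest[name] = "New"
        elif last_modified != previously_seen[name]:
            to_ingest[name] = "Updated"
    return to_ingest


# Hypothetical file names and timestamps.
jan = datetime(2024, 1, 1, tzinfo=timezone.utc)
feb = datetime(2024, 2, 1, tzinfo=timezone.utc)

seen = {"a.csv": jan, "b.csv": jan}
current = {"a.csv": jan, "b.csv": feb, "c.csv": feb}

pending = classify_files(current, seen)
# a.csv is unchanged and skipped; b.csv is "Updated"; c.csv is "New".
```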
