Supported data sources

Currently, Subsalt directly integrates with:

  • MySQL

  • Redshift

  • Snowflake

  • S3 (Parquet)

  • Azure storage accounts (Parquet)

  • Databricks

  • Microsoft SQL Server

If your organization's data source is not listed here, don't worry! Subsalt can connect to any database that stores structured, tabular data. Please reach out to the Subsalt team for more information on connecting to your database.

Connecting to Parquet-based file systems

Subsalt automatically treats each folder in the target directory as a potential table and combines the Parquet files inside that folder into a single table. For example, suppose you have an S3 bucket called my-bucket that is structured like this:

  • my-bucket

    • my-database

      • 📁 customers <- "customers" will be a table

        • file_1.parquet <- all files in this folder will be combined to make a single "customers" table

        • file_2.parquet

        • file_3.parquet

      • 📁 product_imgs <- since this folder does not have any tabular data it will be ignored

        • img1.png

        • img2.png

      • 📁 sales

        • file_1.parquet

        • file_2.parquet

        • file_3.parquet

        • file_4.parquet
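The folder-to-table rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Subsalt's implementation: the function name discover_tables is hypothetical, and attributing files nested more than one level deep to their top-level folder is an assumption.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def discover_tables(keys, prefix):
    """Group Parquet object keys by the folder directly under `prefix`.

    Hypothetical helper: illustrates the rule described above and is
    not Subsalt's actual discovery logic.
    """
    # Normalize the prefix so "my-database" and "my-database/" behave the same.
    prefix = prefix.strip("/")
    prefix = prefix + "/" if prefix else ""

    tables = defaultdict(list)
    for key in keys:
        if not key.startswith(prefix):
            continue
        relative = PurePosixPath(key[len(prefix):])
        # Skip files sitting directly in the target directory and
        # anything that is not a Parquet file (for example, images).
        if len(relative.parts) < 2 or relative.suffix != ".parquet":
            continue
        # Assumption: the first folder under the prefix names the table.
        tables[relative.parts[0]].append(key)
    return {name: sorted(files) for name, files in tables.items()}


# Object keys matching the example layout above.
keys = [
    "my-database/customers/file_1.parquet",
    "my-database/customers/file_2.parquet",
    "my-database/customers/file_3.parquet",
    "my-database/product_imgs/img1.png",
    "my-database/product_imgs/img2.png",
    "my-database/sales/file_1.parquet",
    "my-database/sales/file_2.parquet",
    "my-database/sales/file_3.parquet",
    "my-database/sales/file_4.parquet",
]

tables = discover_tables(keys, "my-database/")
for name, files in tables.items():
    print(name, len(files))
```

In this sketch, product_imgs never appears in the result because it contains no Parquet files, which mirrors the "ignored" folder in the example.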

Connecting to an S3 Bucket

When connecting to an S3 bucket, you will provide a URL; for the example above, the URL would look like this:

S3 Bucket URL: s3://my-bucket/my-database/
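If you want to sanity-check a URL before entering it, the bucket name and the target directory can be separated with Python's standard library. This is a minimal sketch; the helper name split_s3_url is hypothetical and is not part of Subsalt.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split an s3:// URL into (bucket, prefix). Hypothetical helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    # The bucket is the host part; the rest is the path inside the bucket.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_url("s3://my-bucket/my-database/")
print(bucket)  # my-bucket
print(prefix)  # my-database/
```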

Connecting to Azure storage

When connecting to an Azure storage account, you will provide a connection string and a path.

For the example above, the configuration would look something like this:

Connection string: BlobEndpoint=https://my-bucket.blob.core.windows.net/;SharedAccessSignature=sv=2022-11-02&ss=b&srt=co&sp=rl&se=2024-10-19T05:27:46Z&st=2024-10-18T21:27:46Z&spr=https&sig=JTrgXgQNvbmPRXumK5rWChznHQRLdaUrTnCjCUmac44%3D

Path: my-database/
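A connection string like the one above is a list of key=value pairs separated by semicolons, and the SharedAccessSignature value is itself a query string. The sketch below, using only Python's standard library, splits the string and reads the token's expiry time (the se field), which can help you spot an expired token before configuring the connection. The helper names are hypothetical, and the sig value is a placeholder rather than a real signature.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value' into a dict. Hypothetical helper."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment:
            continue
        # Split on the first '=' only; the SAS value contains more of them.
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


def sas_expiry(conn_str):
    """Return the SAS token's expiry (the 'se' field) as a UTC datetime."""
    sas = parse_connection_string(conn_str).get("SharedAccessSignature", "")
    fields = parse_qs(sas)
    if "se" not in fields:
        return None
    expiry = datetime.strptime(fields["se"][0], "%Y-%m-%dT%H:%M:%SZ")
    return expiry.replace(tzinfo=timezone.utc)


# Same shape as the example above, with a placeholder signature.
conn_str = (
    "BlobEndpoint=https://my-bucket.blob.core.windows.net/;"
    "SharedAccessSignature=sv=2022-11-02&ss=b&srt=co&sp=rl"
    "&se=2024-10-19T05:27:46Z&st=2024-10-18T21:27:46Z&spr=https&sig=PLACEHOLDER"
)

parts = parse_connection_string(conn_str)
expiry = sas_expiry(conn_str)
print(parts["BlobEndpoint"])
print("expired:", expiry < datetime.now(timezone.utc))
```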
