Data Lakehouse connector
The Databricks data catalog your whole team can trust.
The Dawiso Databricks connector turns your workspace into a searchable data catalog: every Unity Catalog schema, table, view and volume, with table and column lineage to every BI dashboard downstream.
First things first
What is a data connector?
A data connector is the bridge between a tool in your stack and the catalog that gives you a unified view of it. Once a connector is configured, it reaches into the source system on a schedule, reads out the metadata - schemas, tables, dashboards, jobs, ownership, lineage - and represents it inside the catalog. Your actual rows and values stay where they are.
Connectors are the reason a data catalog can answer questions like "which Power BI dashboard depends on this Snowflake table?" or "who owns the orders topic in Kafka?" - automatically, without anyone keeping a spreadsheet up to date.
Three properties separate a good connector from a brittle one: it should be read-only and safe, it should be incremental so a full re-scan isn't required for every refresh, and it should resolve lineage across system boundaries, not just inside one tool.
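The incremental property can be sketched with a simple watermark. This is an illustration only, not Dawiso's implementation: the `ObjectMeta` record, its `last_altered` field, and the `incremental_scan` function are hypothetical names, and the sketch assumes the source exposes a last-modified timestamp per object.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical metadata record for one source object (table, view, ...)."""
    full_name: str
    last_altered: datetime


def incremental_scan(objects, watermark):
    """Return the objects changed after the watermark, plus the new watermark.

    Only metadata is inspected; no rows are read from the source.
    """
    changed = [o for o in objects if o.last_altered > watermark]
    new_watermark = max((o.last_altered for o in changed), default=watermark)
    return changed, new_watermark


# Illustrative data: two objects, one altered since the previous sync.
source = [
    ObjectMeta("sales.gold.revenue", datetime(2024, 5, 1, tzinfo=timezone.utc)),
    ObjectMeta("sales.silver.orders", datetime(2024, 5, 3, tzinfo=timezone.utc)),
]
last_sync = datetime(2024, 5, 2, tzinfo=timezone.utc)
changed, last_sync = incremental_scan(source, last_sync)
print([o.full_name for o in changed])  # ['sales.silver.orders']
```

A second scan with the updated watermark returns nothing until an object changes again, which is what keeps scheduled refreshes cheap.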
About the platform
What is Databricks?
Databricks is the data lakehouse platform that combines warehousing, data engineering and AI on one foundation. Teams at retailers, banks and pharma companies run their analytics, ML pipelines and GenAI workloads on Delta Lake tables governed by Unity Catalog.
Unity Catalog covers what lives inside Databricks. The hard part is what doesn't: the upstream Snowflake table or Kafka topic that fed a Delta table, the Power BI report consuming it downstream, the business glossary the data steward owns. That's where the Dawiso Databricks data catalog joins the picture: read-only, metadata-only, and cross-platform.
Architecture
How Dawiso connects to Databricks
Dawiso needs only a small read-only role on the Databricks side. The Dawiso scanner pulls metadata on a schedule, and everything ends up in your catalog in business-readable form.
Source
Databricks workspace
- Unity Catalog metastore
- Delta tables, views
- Volumes
- Table & column lineage
Dawiso scanner
Read-only metadata
- Schema & object discovery
- Dependency resolution
- SQL flow parsing (optional)
- Sampling on opt-in
Catalog
Dawiso platform
- Searchable metadata
- Lineage & ownership
- Business glossary
- Policy & classifications
Connection details
- Protocol: Databricks SQL + Unity Catalog REST API
- Authentication: Personal Access Token; service principal supported
- Lineage: Unity Catalog system tables (system.access.table_lineage and system.access.column_lineage) provide table- and column-level lineage.
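To make the lineage source concrete, here is a minimal sketch of turning rows from system.access.table_lineage into a table-to-table graph. It is not Dawiso's scanner code. The query text and the column names `source_table_full_name` and `target_table_full_name` are assumptions based on the Databricks system table schema and should be checked against your workspace; `build_lineage_graph` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Assumed query; in practice it would run through a Databricks SQL connection.
LINEAGE_QUERY = (
    "SELECT source_table_full_name, target_table_full_name "
    "FROM system.access.table_lineage"
)


def build_lineage_graph(rows):
    """Build {source table: [target tables]} from lineage rows.

    Rows with a missing source or target are skipped, as are self-references.
    """
    graph = defaultdict(set)
    for row in rows:
        src = row.get("source_table_full_name")
        dst = row.get("target_table_full_name")
        if src and dst and src != dst:
            graph[src].add(dst)
    return {src: sorted(dsts) for src, dsts in graph.items()}


# Illustrative rows shaped like the query result.
rows = [
    {"source_table_full_name": "sales.bronze.orders_raw",
     "target_table_full_name": "sales.silver.orders"},
    {"source_table_full_name": "sales.silver.orders",
     "target_table_full_name": "sales.gold.revenue"},
    {"source_table_full_name": None,
     "target_table_full_name": "sales.gold.revenue"},
]
graph = build_lineage_graph(rows)
print(graph["sales.silver.orders"])  # ['sales.gold.revenue']
```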
Setup
Connect Databricks in 4 steps
- Step 01: Provision a service principal. Create a Databricks service principal with USE_CATALOG and SELECT on every catalog Dawiso should read. Grant access to system.access.* lineage tables.
- Step 02: Generate a PAT. Generate a Personal Access Token for the service principal. Store it in Dawiso's connection vault - it never leaves the platform.
- Step 03: Connect and pick catalogs. Add the workspace URL and PAT in Dawiso. Enter the Unity Catalog catalogs to ingest as a single comma-separated list.
- Step 04: Run ingestion. Start the first ingestion; a scheduled incremental sync then keeps the catalog current. Column-level lineage resolves from Unity Catalog on premium and enterprise tiers.
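The grants from step 01 and the catalog list from step 03 can be sketched as a small helper that prints the SQL for an administrator to review. This is a hedged sketch, not an official setup script: `parse_catalogs`, `grant_statements`, and the principal name `dawiso-sp` are hypothetical, and the three grants on the system catalog are an assumption about what "access to system.access.*" requires. Depending on your setup, Unity Catalog may also require USE SCHEMA before tables can be read, so check the Databricks privilege documentation.

```python
def parse_catalogs(raw):
    """Split a comma-separated catalog list, dropping blanks and duplicates."""
    seen = []
    for name in raw.split(","):
        name = name.strip()
        if name and name not in seen:
            seen.append(name)
    return seen


def grant_statements(catalogs, principal):
    """Return GRANT statements for the given catalogs and principal.

    The per-catalog grants mirror step 01; the system.access grants
    are an assumption and should be verified before use.
    """
    stmts = []
    for catalog in catalogs:
        stmts.append(f"GRANT USE CATALOG ON CATALOG `{catalog}` TO `{principal}`;")
        stmts.append(f"GRANT SELECT ON CATALOG `{catalog}` TO `{principal}`;")
    stmts.append(f"GRANT USE CATALOG ON CATALOG system TO `{principal}`;")
    stmts.append(f"GRANT USE SCHEMA ON SCHEMA system.access TO `{principal}`;")
    stmts.append(f"GRANT SELECT ON SCHEMA system.access TO `{principal}`;")
    return stmts


catalogs = parse_catalogs("main, sales ,, main")
for stmt in grant_statements(catalogs, "dawiso-sp"):
    print(stmt)
```

Generating the statements rather than executing them keeps the sketch read-only and leaves the final decision with whoever administers the metastore.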
Capabilities
What you get with the Databricks connector
- Unity Catalog mirror: Every catalog, schema, Delta table and view is searchable, with column descriptions, types, tags and the job that built it.
- Column-level lineage: Cross-platform lineage from a Power BI visual through Databricks DLT pipelines down to the raw bronze table.
- Unity Catalog tags in context: Object tags assigned in Unity Catalog are read into Dawiso, so masking and row-filter context stays visible alongside the catalog. Read-only, metadata-only.
- PII classification: Classify a column once. Dawiso flags every Databricks column carrying email, IBAN or government IDs across all catalogs.
- Ownership & certification: Mark tables as certified, deprecated or under review. The owner is visible directly in the catalog and on the Databricks side.
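As a rough illustration of the PII classification idea, the sketch below flags columns whose names suggest email, IBAN or government ID content. It matches names only and is not how Dawiso classifies data: the patterns, the `classify_columns` function, and the sample column names are all hypothetical.

```python
import re

# Hypothetical name patterns; a real classifier would use richer signals.
PII_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "iban": re.compile(r"iban", re.IGNORECASE),
    "government_id": re.compile(
        r"(ssn|passport|national[-_]?id|tax[-_]?id)", re.IGNORECASE
    ),
}


def classify_columns(columns):
    """Map each fully qualified column name to the PII labels its name suggests."""
    flagged = {}
    for full_name in columns:
        column = full_name.rsplit(".", 1)[-1]  # keep only the column part
        labels = [
            label for label, pattern in PII_PATTERNS.items()
            if pattern.search(column)
        ]
        if labels:
            flagged[full_name] = labels
    return flagged


columns = [
    "crm.gold.customers.email_address",
    "finance.silver.payments.iban",
    "hr.bronze.staff.passport_no",
    "sales.gold.revenue.amount",
]
flagged = classify_columns(columns)
print(sorted(flagged))
```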
Business value
Why teams turn on the Databricks connector
- 70% fewer 'which table?' pings: Analysts find the certified gold Delta table in Dawiso instead of pinging the data team to ask which silver table maps to revenue.
- 8x faster impact analysis: Before altering a Databricks column, see exactly which jobs, BI reports and ML features depend on it. Hours, not days.
- Audit-ready GDPR & DORA evidence: Sensitive columns are classified once and the policy follows them through Delta tables, joins and downstream BI, with a full audit trail.
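Impact analysis amounts to walking a lineage graph downstream from the object you plan to change. The sketch below shows that idea with a breadth-first traversal over a small hand-written graph; the `downstream` function, the graph contents and the `powerbi:` naming are hypothetical and do not reflect Dawiso's internals.

```python
from collections import deque


def downstream(graph, start):
    """Return every object reachable from start, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen and child != start:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Illustrative lineage: source object -> objects that consume it.
graph = {
    "sales.bronze.orders_raw": ["sales.silver.orders"],
    "sales.silver.orders": ["sales.gold.revenue", "ml.features.order_stats"],
    "sales.gold.revenue": ["powerbi:Revenue dashboard"],
}
impacted = downstream(graph, "sales.silver.orders")
print(impacted)
# ['sales.gold.revenue', 'ml.features.order_stats', 'powerbi:Revenue dashboard']
```

The `seen` set keeps the walk finite even if the lineage contains cycles.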
Ready to catalog your Databricks?
Set up the connector in an afternoon. See your first lineage graph the same day.
Frequently asked questions
What is a data catalog in Databricks?
What is the difference between Databricks catalog and metastore?
What is a data catalog used for?
What permissions does Dawiso need in Databricks?
Does Dawiso copy our Databricks data?
How is column-level lineage built?
Does Dawiso work with workspace-level metastores?
Explore more connectors
Databricks is one of 30+ connectors. Bring your whole stack into the catalog.
- Data Warehouse: Snowflake
- Business Intelligence: Power BI
- Business Intelligence: Tableau
- Data Warehouse: Google BigQuery
- Data Warehouse: Amazon Redshift
- Database: PostgreSQL