ETL / ELT connector

The ADF data catalog your whole team can trust.

The Dawiso Azure Data Factory data catalog turns your factories into a searchable inventory: every pipeline, activity and integration runtime, next to the data they move.

Live connector · Stable connector
Metadata-only · your data never leaves the source
Type: Cloud data integration service
Auth: Microsoft Entra service principal · Reader role
Sync: Scheduled, incremental
Direction: Read-only · metadata

First things first

What is a data connector?

Metadata-only · Read-only access · Incremental sync · Cross-system lineage

A data connector is the bridge between a tool in your stack and the catalog that gives you a unified view of it. Once a connector is configured, it reaches into the source system on a schedule, reads out the metadata - schemas, tables, dashboards, jobs, ownership, lineage - and represents it inside the catalog. Your actual rows and values stay where they are.

Connectors are the reason a data catalog can answer questions like "which Power BI dashboard depends on this Snowflake table?" or "who owns the orders topic in Kafka?" - automatically, without anyone keeping a spreadsheet up to date.

Three properties separate a good connector from a brittle one: it should be read-only and safe, it should be incremental so a full re-scan isn't required for every refresh, and it should resolve lineage across system boundaries, not just inside one tool.

About the platform

What is Azure Data Factory?

Azure Data Factory (ADF) is Microsoft's managed cloud service for orchestrating ETL, ELT and data integration at scale. Pipelines chain activities (copy, transform, control), linked services define connections, and integration runtimes do the actual movement, hybrid or cloud-native. Common pairings: Synapse, Snowflake, ADLS Gen2 and Power BI.

ADF's monitoring view shows you which pipelines ran and which activities failed. What it doesn't show is the downstream warehouse table the pipeline wrote, the Power BI report that consumes it, the data product the business owns, or the governance policy that applies to it. That's where the Dawiso Azure Data Factory data catalog joins the picture: read-only, metadata-only, and cross-platform.

Architecture

How Dawiso connects to ADF

A small read-only role on the ADF side. The Dawiso scanner pulls metadata on a schedule. Everything ends up in your catalog, business-readable.

Source

Azure tenant + subscriptions

  • Data factories
  • Pipelines & activities
  • Linked services & datasets
  • Triggers & run history
REST

Dawiso scanner

Read-only metadata

  • Schema & object discovery
  • Dependency resolution
  • SQL flow parsing (optional)
  • Sampling on opt-in
Internal

Catalog

Dawiso platform

  • Searchable metadata
  • Lineage & ownership
  • Business glossary
  • Policy & classifications

Connection details

Protocol: Azure Data Factory REST API (v2018-06-01)
Authentication: Microsoft Entra service principal · Client ID + Client Secret · Reader role at subscription, resource group or factory scope
Lineage: Activity dependencies and dataset references mapped within each pipeline; linked services connect factories to the Snowflake, ADLS, Synapse and Power BI objects ingested from those systems
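
To make the lineage row above concrete, here is a minimal sketch of how activity dependencies and dataset references could be read out of one pipeline definition. It assumes the pipeline JSON shape returned by the ADF REST API (`dependsOn`, `inputs`, `outputs`, `linkedServiceName`); the function name, the relation labels and the sample pipeline are hypothetical, and this is not a description of Dawiso's internal scanner.

```python
from typing import Any

# Hypothetical sample pipeline, shaped like an ADF pipeline definition.
SAMPLE_PIPELINE: dict[str, Any] = {
    "name": "LoadOrders",
    "properties": {
        "activities": [
            {
                "name": "CopyFromBlob",
                "type": "Copy",
                "dependsOn": [],
                "inputs": [{"referenceName": "BlobOrders", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "SqlOrders", "type": "DatasetReference"}],
            },
            {
                "name": "RunProc",
                "type": "SqlServerStoredProcedure",
                "dependsOn": [
                    {"activity": "CopyFromBlob", "dependencyConditions": ["Succeeded"]}
                ],
                "linkedServiceName": {
                    "referenceName": "AzureSqlLS",
                    "type": "LinkedServiceReference",
                },
            },
        ]
    },
}


def pipeline_edges(pipeline: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (source, relation, target) triples for one pipeline definition."""
    edges: list[tuple[str, str, str]] = []
    for activity in pipeline.get("properties", {}).get("activities", []):
        name = activity["name"]
        # Activity-to-activity ordering inside the pipeline.
        for dep in activity.get("dependsOn", []):
            edges.append((dep["activity"], "precedes", name))
        # Datasets the activity reads from and writes to.
        for ref in activity.get("inputs", []):
            edges.append((ref["referenceName"], "read_by", name))
        for ref in activity.get("outputs", []):
            edges.append((name, "writes", ref["referenceName"]))
        # Activities that call a linked service directly (no dataset in between).
        linked = activity.get("linkedServiceName")
        if linked:
            edges.append((name, "uses_linked_service", linked["referenceName"]))
    return edges
```

Calling `pipeline_edges(SAMPLE_PIPELINE)` yields four triples: one dataset read, one dataset write, one activity ordering edge and one direct linked-service reference.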

Setup

Connect ADF in 4 steps

  1. Register an Entra app

    In the Azure Portal, open Microsoft Entra ID > App registrations > + New registration. Name it Dawiso Integration. Save the Application (client) ID and the Directory (tenant) ID.

  2. Generate a client secret

    Under Certificates & secrets, click + New client secret. Pick an expiration that matches your rotation policy. Copy the secret value immediately; Azure displays it once.

  3. Assign Reader role

    Pick a scope (subscription, resource group or single factory). Open Access control (IAM) > Add role assignment, select Reader, and assign it to the service principal. The narrowest scope that covers the factories you want to catalog is sufficient.

  4. Connect and run ingestion

    Enter the Tenant ID, Subscription ID, Client ID and Client Secret in Dawiso. Optionally restrict ingestion to specific factories with a JSON list. Scheduled incremental sync keeps everything current, including run history for the number of days set in RunsHistoryInDays.
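
If you want to check the service principal before entering it in Dawiso, the sketch below shows the two requests a Reader-role principal can make: a client-credentials token request to Microsoft Entra ID and a factory listing against the Azure Resource Manager endpoint. It uses only the Python standard library. The function names are hypothetical, paging via `nextLink` is not handled, and this illustrates the credentials from the steps above rather than how Dawiso itself calls the API.

```python
import json
import urllib.parse
import urllib.request

ARM = "https://management.azure.com"
API_VERSION = "2018-06-01"  # ADF REST API version named in the connection details


def token_request(tenant_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the OAuth 2.0 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": f"{ARM}/.default",
        }
    ).encode()
    return urllib.request.Request(url, data=body, method="POST")


def list_factories_request(subscription_id: str, access_token: str) -> urllib.request.Request:
    """Build the request that lists data factories in one subscription."""
    url = (
        f"{ARM}/subscriptions/{subscription_id}"
        f"/providers/Microsoft.DataFactory/factories?api-version={API_VERSION}"
    )
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})


def list_factory_names(
    tenant_id: str, subscription_id: str, client_id: str, client_secret: str
) -> list[str]:
    """Send both requests and return the factory names on the first page."""
    with urllib.request.urlopen(token_request(tenant_id, client_id, client_secret)) as resp:
        token = json.load(resp)["access_token"]
    with urllib.request.urlopen(list_factories_request(subscription_id, token)) as resp:
        return [factory["name"] for factory in json.load(resp).get("value", [])]
```

An empty result from `list_factory_names` usually means the Reader role was assigned at a scope that does not include the factories you expected; an HTTP 401 or 403 points at the secret or the role assignment.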

Capabilities

What you get with the ADF connector

  • Pipeline & factory catalog

    Every data factory, pipeline, activity, linked service and dataset is searchable, with owner, tags and the team responsible for the integration.

  • Activity & dataset mapping

    Activity dependencies and dataset references inside each pipeline are mapped, and linked services connect each factory to the warehouse and BI objects ingested from those systems.

  • Schedule & trigger visibility

    Triggers, schedules and SLAs sit next to the pipeline they activate, so "is this table fresh enough for finance close?" has an answer.

  • Run history & freshness

    Pipeline and trigger run history (default 2 days, configurable) surfaces inside the catalog, next to the assets the pipeline built.

  • Ownership & certification

    Pipeline owners and tags from ADF land in the catalog. Promote a pipeline to certified, mark one as deprecated, and the business sees it.

  • Dependency visibility

    Before changing a linked service or pipeline, see which activities, datasets and pipelines reference it inside the factory. Seconds, not days.
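
The run-history window mentioned under Run history & freshness can be pictured as a time filter on the ADF `queryPipelineRuns` endpoint. The sketch below builds that filter for a given number of days, defaulting to 2. The function and parameter names are hypothetical, and how Dawiso maps its RunsHistoryInDays setting onto the API is an assumption here, not documented behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ARM = "https://management.azure.com"
API_VERSION = "2018-06-01"
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def run_query_body(runs_history_in_days: int = 2, now: Optional[datetime] = None) -> dict:
    """Build the JSON body for queryPipelineRuns covering the last N days (UTC)."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=runs_history_in_days)
    return {
        "lastUpdatedAfter": start.strftime(TIME_FORMAT),
        "lastUpdatedBefore": end.strftime(TIME_FORMAT),
    }


def query_pipeline_runs_url(subscription_id: str, resource_group: str, factory: str) -> str:
    """URL to POST the body to; a Reader-role principal can call it."""
    return (
        f"{ARM}/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory}"
        f"/queryPipelineRuns?api-version={API_VERSION}"
    )
```

Trigger runs have a matching `queryTriggerRuns` endpoint that accepts the same time-window fields, so the same body can be reused there.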

Business value

Why teams turn on the ADF connector

  • −70%

    Faster root-cause analysis

    When a Power BI report is stale, find the failing ADF pipeline in one search instead of pinging the data engineer who knows the factory.

  • 10×

    Faster impact analysis

    Before changing a linked service, see which pipelines, activities and datasets reference it. Seconds, not days.

  • Audit-ready

    Integration traceability

    Every factory, pipeline, owner and run is in the catalog alongside the warehouse and BI assets ingested from your stack, so audit answers come from the platform.
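
The impact-analysis question above ("which pipelines, activities and datasets reference this linked service?") comes down to a reverse lookup over factory metadata. Here is a minimal sketch, assuming dataset and pipeline definitions in the JSON shape the ADF REST API returns. The function name and all sample names are hypothetical, and the catalog's own implementation may differ.

```python
from typing import Any

# Hypothetical sample metadata, shaped like ADF dataset and pipeline definitions.
DATASETS: list[dict[str, Any]] = [
    {"name": "SqlOrders", "properties": {"linkedServiceName": {"referenceName": "AzureSqlLS"}}},
    {"name": "BlobOrders", "properties": {"linkedServiceName": {"referenceName": "BlobLS"}}},
]
PIPELINES: list[dict[str, Any]] = [
    {
        "name": "LoadOrders",
        "properties": {
            "activities": [
                {
                    "name": "CopyFromBlob",
                    "type": "Copy",
                    "inputs": [{"referenceName": "BlobOrders"}],
                    "outputs": [{"referenceName": "SqlOrders"}],
                },
                {"name": "Notify", "type": "WebActivity"},
            ]
        },
    }
]


def impacted_by_linked_service(
    linked_service: str,
    datasets: list[dict[str, Any]],
    pipelines: list[dict[str, Any]],
) -> dict[str, list]:
    """Find datasets and (pipeline, activity) pairs that depend on a linked service."""
    # Datasets bound directly to the linked service.
    hit_datasets = {
        d["name"]
        for d in datasets
        if d.get("properties", {}).get("linkedServiceName", {}).get("referenceName")
        == linked_service
    }
    hit_activities = set()
    for pipeline in pipelines:
        for activity in pipeline.get("properties", {}).get("activities", []):
            refs = {
                r["referenceName"]
                for r in activity.get("inputs", []) + activity.get("outputs", [])
            }
            direct = activity.get("linkedServiceName", {}).get("referenceName") == linked_service
            # Impacted if it calls the service directly or touches an impacted dataset.
            if direct or refs & hit_datasets:
                hit_activities.add((pipeline["name"], activity["name"]))
    return {"datasets": sorted(hit_datasets), "activities": sorted(hit_activities)}
```

With the sample data, looking up `AzureSqlLS` flags the `SqlOrders` dataset and the `CopyFromBlob` activity in `LoadOrders`, while `Notify` is left out because it references neither.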

Ready to catalog your ADF?

Set up the connector in an afternoon. See your first lineage graph the same day.

Frequently asked questions

What is the Azure data catalog?
Microsoft's current catalog offering is Microsoft Purview, which succeeds the older Azure Data Catalog. Dawiso complements it across platforms: it has read-only access to Azure Data Factory pipeline metadata and maps each activity to the datasets and linked services it references across your stack.
What is metadata in ADF?
ADF metadata describes pipelines, activities, datasets and linked services. Dawiso reads it with read-only access and maps the activity dependencies and dataset references inside each pipeline, so you can see which activities each one depends on and which datasets it uses.
How do I create a data catalog in Azure?
You can use Microsoft Purview, or connect Azure Data Factory to Dawiso: add a read-only service principal, choose the factories to ingest, and Dawiso builds a searchable catalog of pipelines, datasets and their dependencies in a few steps.
What permissions does Dawiso need in ADF?
A Microsoft Entra service principal with the built-in Reader role assigned at subscription, resource group or single-factory scope. Reader is read-only end to end. Dawiso never modifies, triggers or pauses your pipelines.
Does Dawiso copy our ADF data?
No. Dawiso queries the Azure Data Factory REST API (v2018-06-01) for metadata only: factory, pipeline, activity, linked service, dataset and run definitions. Row-level data your pipelines move stays inside your Azure subscription.
How are pipeline dependencies mapped?
From activity dependencies and dataset references inside each pipeline, and from linked-service connection metadata. Linked services connect ADF pipelines to the Snowflake, ADLS, Synapse and Power BI objects ingested from those systems.
Does the connector support Microsoft Fabric Data Factory?
The connector targets Azure Data Factory v2 via its REST API. Microsoft Fabric is the next-generation successor; Dawiso has a separate Microsoft Fabric connector and a Fabric metadata application. Contact Customer Success for the Fabric migration path.