ETL / ELT connector

The Keboola data catalog your whole team can trust.

The Dawiso Keboola data catalog turns your stack into a searchable inventory: every project, bucket and transformation, with run history and downstream lineage.

Live connector · Stable connector
Keboola
Dawiso
Metadata-only · your data never leaves the source
Type
Data integration platform
Auth
Keboola Storage API token · organization or project scope
Sync
Scheduled, incremental
Direction
Read-only · metadata

First things first

What is a data connector?

Metadata-only · Read-only access · Incremental sync · Cross-system lineage

A data connector is the bridge between a tool in your stack and the catalog that gives you a unified view of it. Once a connector is configured, it reaches into the source system on a schedule, reads out the metadata - schemas, tables, dashboards, jobs, ownership, lineage - and represents it inside the catalog. Your actual rows and values stay where they are.

Connectors are the reason a data catalog can answer questions like "which Power BI dashboard depends on this Snowflake table?" or "who owns the orders topic in Kafka?" - automatically, without anyone keeping a spreadsheet up to date.

Three properties separate a good connector from a brittle one: it should be read-only and safe, it should be incremental so a full re-scan isn't required for every refresh, and it should resolve lineage across system boundaries, not just inside one tool.
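
To make the "incremental" property concrete, here is a minimal sketch of watermark-based change detection. It is an illustration only, not Dawiso's implementation: the record shape, the `modified_at` field and the `select_changed` helper are all hypothetical.

```python
from datetime import datetime, timezone


def select_changed(objects, last_sync):
    """Return only the metadata objects modified after the previous sync."""
    return [o for o in objects if o["modified_at"] > last_sync]


# Hypothetical metadata records; a real source would return these from its API.
objects = [
    {"id": "in.c-sales.orders",
     "modified_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": "in.c-sales.customers",
     "modified_at": datetime(2024, 5, 10, tzinfo=timezone.utc)},
]

# Watermark stored at the end of the previous run.
last_sync = datetime(2024, 5, 5, tzinfo=timezone.utc)

changed = select_changed(objects, last_sync)

# The next run starts from the newest change seen in this one.
new_watermark = max((o["modified_at"] for o in changed), default=last_sync)

print([o["id"] for o in changed])
```

Only the object changed after the watermark is re-read, which is why a refresh does not need a full re-scan.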

About the platform

What is Keboola?

Keboola is a managed data operations platform: ingestion, transformation, orchestration and data apps under one roof, with native Snowflake or BigQuery storage underneath. Founded in 2008 and used by 34,000+ practitioners worldwide, it sits between source systems and the warehouse, doing the work Airflow + dbt + Fivetran usually split between them.

Keboola's own UI catalogues projects, buckets and components inside Keboola. What it doesn't cover is everything downstream: the warehouse table the transformation wrote, the Power BI report consuming it, or the data product the business owns end to end. That's where the Dawiso Keboola data catalog joins the picture: read-only, metadata-only, and cross-platform.

Architecture

How Dawiso connects to Keboola

A read-only token on the Keboola side. The Dawiso scanner pulls metadata on a schedule. Everything ends up in your catalog in business-readable form.

Source

Keboola stack

  • Projects & buckets
  • Tables & columns
  • Transformations & components
  • Jobs & run history
REST · JDBC

Dawiso scanner

Read-only metadata

  • Schema & object discovery
  • Dependency resolution
  • SQL flow parsing (optional)
  • Sampling on opt-in
Internal

Catalog

Dawiso platform

  • Searchable metadata
  • Lineage & ownership
  • Business glossary
  • Policy & classifications

Connection details

Protocol
Keboola Management API + Storage API + (optional) Queue API
Authentication
Storage API token · organization-level (all projects) or project-level · multi-project token JSON supported
Lineage
Bucket-to-transformation-to-table lineage from configuration metadata; cross-platform reach into Snowflake, BigQuery and Power BI via ingested objects
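
As an illustration of lineage derived from configuration metadata, the sketch below turns one transformation configuration into lineage edges. The configuration shape (input and output table mappings under a `storage` key), the sample table names and the `lineage_edges` helper are assumptions made for this example; it does not show how the Dawiso scanner is actually implemented.

```python
def lineage_edges(config_id, config):
    """Derive (source, target) lineage edges from one transformation config.

    Assumes the config lists its input and output table mappings under
    config["storage"]; this shape is an assumption for illustration.
    """
    storage = config.get("storage", {})
    inputs = storage.get("input", {}).get("tables", [])
    outputs = storage.get("output", {}).get("tables", [])

    # Input tables feed the transformation ...
    edges = [(t["source"], config_id) for t in inputs]
    # ... and the transformation feeds its output tables.
    edges += [(config_id, t["destination"]) for t in outputs]
    return edges


# Hypothetical transformation configuration.
sample_config = {
    "storage": {
        "input": {"tables": [
            {"source": "in.c-sales.orders", "destination": "orders"},
        ]},
        "output": {"tables": [
            {"source": "revenue", "destination": "out.c-finance.revenue"},
        ]},
    }
}

edges = lineage_edges("transformation:daily-revenue", sample_config)
for src, dst in edges:
    print(f"{src} -> {dst}")
```

Because the edges come from configuration alone, no row-level data has to be read to build them.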

Setup

Connect Keboola in 4 steps

  1. Pick a token type

    Decide between an organization-level token (all projects, filterable later) or a project-level token (one project per token, with multi-token JSON for several projects). Organization-level tokens come from Account Settings > Access Tokens; project-level tokens come from Project Settings.

  2. Generate the token

    In Keboola, open Account Settings (organization) or Project Settings (project). Click + New token. Name it (e.g. DawisoToken), set validity, save the value; Keboola displays it once.

  3. Connect in Dawiso

    Provide the Management & Storage URL of your Keboola stack (e.g. https://connection.keboola.com), the token, and optionally the Queue API URL to ingest job history.

  4. Run ingestion

    Filter projects via regex (organization tokens) or use the multi-token JSON. Scheduled incremental sync keeps buckets, transformations and job history current.
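
The two project-selection modes in the steps above can be sketched as follows. This is a hedged illustration: the FAQ names `id`, `name` and `token` as the per-project JSON fields, but the exact JSON layout, the regex matching mode (search rather than full match) and both helper functions are assumptions, and the project names and token values are placeholders.

```python
import json
import re


def filter_projects(project_names, pattern):
    """Keep the project names that match the regex (organization-token mode)."""
    rx = re.compile(pattern)
    return [name for name in project_names if rx.search(name)]


def parse_multi_token(raw):
    """Parse a multi-token JSON list and check each entry's required fields."""
    entries = json.loads(raw)
    for entry in entries:
        missing = {"id", "name", "token"} - set(entry)
        if missing:
            raise ValueError(f"entry missing fields: {sorted(missing)}")
    return entries


# Organization-level token: select projects by name pattern.
names = ["Sales - PROD", "Sales - DEV", "Finance - PROD"]
selected = filter_projects(names, r"PROD$")
print(selected)

# Project-level tokens: one JSON entry per project (placeholder values).
raw = """
[
  {"id": 101, "name": "Sales - PROD", "token": "<project-token-1>"},
  {"id": 102, "name": "Finance - PROD", "token": "<project-token-2>"}
]
"""
projects = parse_multi_token(raw)
print([p["name"] for p in projects])
```

Validating the JSON before the first ingestion run surfaces a missing field early, instead of as a failed sync.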

Capabilities

What you get with the Keboola connector

  • Project & bucket catalog

    Every Keboola project, bucket, table and transformation is searchable, with owner, tags and the team responsible for the data product.

  • Transformation lineage

    Lineage from input buckets through SQL, Python or R transformations to output tables, stitched cross-platform with Snowflake, BigQuery and Power BI.

  • Job history & freshness

    Job run history (when the Queue API is connected) and last-built timestamps sit inside the catalog, next to the assets the transformation built.

  • PII classification

    Classify a column once. Dawiso flags every Keboola table carrying email addresses, IBANs or government IDs across all projects and stacks.

  • Ownership & certification

    Owners and tags from Keboola land in the catalog. Promote a table to certified, deprecate one, and the business sees it instantly.

  • Impact analysis

    Before changing a Keboola transformation, see exactly which downstream tables, dashboards and ML features depend on its output. Seconds, not days.
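
Conceptually, impact analysis is a downstream walk over the lineage graph. The sketch below shows one way to do that with a breadth-first traversal; the edge list, the node names and the `downstream` helper are hypothetical and are not taken from Dawiso's internals.

```python
from collections import deque


def downstream(edges, start):
    """Return every node reachable from `start` by following lineage edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in children.get(node, []):
            if nxt not in seen:  # guards against revisiting shared nodes
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical cross-platform lineage edges (source, target).
edges = [
    ("in.c-sales.orders", "transformation:daily-revenue"),
    ("transformation:daily-revenue", "out.c-finance.revenue"),
    ("out.c-finance.revenue", "snowflake:FINANCE.REVENUE"),
    ("snowflake:FINANCE.REVENUE", "powerbi:Revenue dashboard"),
    ("in.c-hr.people", "transformation:headcount"),
]

impacted = downstream(edges, "transformation:daily-revenue")
print(sorted(impacted))
```

The unrelated `transformation:headcount` branch is never visited, so the result lists only the assets a change to the revenue transformation could affect.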

Business value

Why teams turn on the Keboola connector

  • −65%

    Fewer 'which table?' pings

    Analysts find the certified output table in Dawiso instead of pinging the data engineer to ask which staging bucket maps to revenue.

  • 10×

    Faster impact analysis

    The downstream tables, dashboards and ML features that depend on a transformation's output are one lookup away, so reviewing a change takes seconds, not days.

  • Audit-ready

    Pipeline traceability

    Every project, bucket, transformation and run is in the catalog with the downstream warehouse and BI assets, so audit answers come from the platform.

Ready to catalog your Keboola?

Set up the connector in an afternoon. See your first lineage graph the same day.

Frequently asked questions

Does Keboola have a data catalog?
Keboola includes a built-in Data Catalog for sharing data inside Keboola. Dawiso extends cataloging across your whole stack: it reads Keboola metadata read-only and links its components and tables to the warehouses, BI and pipelines downstream with end-to-end lineage.
What is a data catalog used for?
A data catalog makes every dataset and pipeline discoverable, documented and trustworthy. Dawiso turns Keboola metadata into one searchable catalog with ownership, classification and lineage for the whole business.
What permissions does Dawiso need in Keboola?
A Keboola Storage API token. An organization-level token grants access to all projects (filterable in Dawiso). A project-level token is scoped to one project; you can list several via the multi-token JSON. Tokens can be made read-only by setting Access level: restricted.
Does Dawiso copy our Keboola data?
No. Dawiso queries the Keboola Management API, Storage API and optionally the Queue API for metadata only: project, bucket, table, transformation, component and job definitions. Row-level data your transformations build stays in Keboola's storage.
Can Dawiso scan multiple Keboola projects?
Yes. With an organization-level token, pick projects via regex during data-source setup. With project-level tokens, list them in JSON: id, name, token per project. Both modes coexist on one connection.
Which Keboola stacks are supported?
All public Keboola stacks (e.g. connection.keboola.com, connection.eu-central-1.keboola.com). The connection form accepts any Management & Storage URL listed in Keboola's official documentation. Private deployments ingest via Dawiso Integration Runtime (DIR).
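
Before entering a stack URL and token in the connection form, you may want to check the pair yourself. The sketch below builds a request for the Keboola Storage API token verification endpoint without sending it. The `build_verify_request` helper and the placeholder token are illustrative, and whether Dawiso performs the same check internally is an assumption.

```python
import urllib.request


def build_verify_request(stack_url, token):
    """Build (but do not send) a Storage API token verification request."""
    url = stack_url.rstrip("/") + "/v2/storage/tokens/verify"
    return urllib.request.Request(
        url,
        headers={"X-StorageApi-Token": token},
        method="GET",
    )


# Placeholder token; substitute the value Keboola displayed at creation.
req = build_verify_request("https://connection.keboola.com", "<your-token>")
print(req.get_method(), req.full_url)

# To actually send it (requires network access and a valid token):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

A successful response confirms that the token is valid for that stack; a failure at this point is usually a wrong stack URL or an expired token.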