Quality Engine

Dashboards you'd stake your quarter on.

Every pipeline ships with a built-in quality engine that scores every column on five dimensions. If a number on your dashboard moves, you can trace it back to the exact row, the exact test, the exact timestamp — and know the moment a number goes stale. No mystery numbers in the board deck.

Data Quality

92% overall · 43 sources assessed

Completeness: 94%
Uniqueness: 95%
Validity: 98%
Freshness: 12%
Referential: 89%

A traffic light you can read in a meeting.

Every score rolls up into one of three colours. No dashboards full of knobs. No "what does 73 mean?"

≥ 80: Good

Ship it. The numbers on this column can be trusted for decisions.

≥ 50 and < 80: Warning

Drill in. Something's off: investigate before the next board pack.

< 50: Critical

Stop the line. Don't publish dashboards against this column until it's fixed.
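
As a rough illustration, the three bands above can be written as a small lookup. This is a sketch, not the engine's code: the function name `traffic_light` is hypothetical, and it assumes scores run from 0 to 100 with the band edges exactly as listed.

```python
def traffic_light(score: float) -> str:
    """Map a 0-100 quality score to one of the three colours.

    Hypothetical helper: the band edges (80 and 50) come from the page,
    the function itself is only an illustration.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "Good"
    if score >= 50:
        return "Warning"
    return "Critical"


if __name__ == "__main__":
    for score in (92, 73, 12):
        print(score, traffic_light(score))
```

A score of exactly 80 lands in Good and exactly 50 lands in Warning, which matches the "≥" signs on the cards.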

How the score is computed

Each column is graded on five dimensions, then rolled up into one number.

  • Completeness. Of all the cells that should have a value, how many do?
  • Uniqueness. Primary keys that are genuinely unique. No silent duplicates.
  • Validity. Audit pass rate across the automatically generated test suite.
  • Freshness. Age of the latest timestamp vs. your sync cadence.
  • Referential Integrity. Foreign keys that actually resolve to parent rows.
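
To make the dimensions concrete, here is a minimal sketch of two of them plus a roll-up. Completeness and uniqueness follow the definitions above. The roll-up is an assumption: the page does not state the formula, so this sketch uses an unweighted mean of whichever dimensions apply to a column. All function names are hypothetical.

```python
from statistics import mean
from typing import Any, Dict, Optional, Sequence


def completeness(values: Sequence[Any]) -> float:
    """Percentage of cells that are neither None nor empty/whitespace."""
    if not values:
        return 0.0
    filled = sum(1 for v in values if v is not None and str(v).strip() != "")
    return 100.0 * filled / len(values)


def uniqueness(values: Sequence[Any]) -> float:
    """Percentage of non-null values that are distinct."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return 100.0 * len(set(present)) / len(present)


def roll_up(dimensions: Dict[str, Optional[float]]) -> float:
    """Combine dimension scores into one number.

    ASSUMPTION: an unweighted mean over the dimensions that apply
    (None means "not applicable to this column"). The real engine may
    weight dimensions differently.
    """
    applicable = [s for s in dimensions.values() if s is not None]
    if not applicable:
        raise ValueError("no applicable dimensions to roll up")
    return mean(applicable)


if __name__ == "__main__":
    manager_ids = [1, 1, 2, None, 3, 2, 1]
    print(round(completeness(manager_ids), 1))
    print(round(uniqueness(manager_ids), 1))
    print(roll_up({"completeness": 80.0, "validity": 60.0, "freshness": None}))
```

Treating "not applicable" as None rather than 0 keeps a column without a foreign key from being penalised on referential integrity.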

Every column. Every test. One screen.

When something breaks, you know where. Expand any source to see the exact column, the exact failing test, and the score that moved the needle.

Data Quality

Column-level quality assessment across your data sources.

92% overall · 43 sources assessed

Completeness: 94% (non-null, non-empty values)
Uniqueness: 95% (distinct values in key columns)
Validity: 98% (data quality test pass rate)
Freshness: 12% (timestamp recency)
Referential: 89% (FK relationship validity)

Sources (43 tables)

Source                             Score   Rows   Columns   Completeness   Validity
stg_xero__contacts                 62%     52     10        64%            100%
stg_salesforce__case_history       67%     193    6         71%            100%
stg_business_central__customers    74%     5      16        75%            100%
stg_business_central__vendors      76%     7      14        77%            100%
stg_business_central__employees    80%     7      13        80%            60%

Columns in stg_business_central__employees (13 total)

Column          Score   Complete   Tests
birth_date      50%     100%       valid_ts
email           0%      0%         -
employee_id     100%    100%       unique, not_null
department_id   100%    100%       fk_departments
manager_id      72%     86%        fk_employees

Close the loop

Every red light comes with a way to fix it.

Detecting a gap is the easy half. The quality engine ships with three remediation paths so the same person who sees a red badge can close it — no tickets, no engineering backlog, no waiting for the next sprint. Fix, re-run, watch the score recover.

Detected stg_bc__employees
email 0%
✗ not_null ✗ valid_email

Upload a spreadsheet

Patch historical rows with an ad-hoc CSV or Excel export. Merge on a primary key, no engineer required.

Connect a database

Link a legacy MySQL, Postgres, or SQL Server database as a side source. We join the missing columns straight into staging.

Override the mapping

When two sources disagree, pick which one the engine should trust. One dropdown, no schema migration.
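
For a feel of what the spreadsheet path does, here is a minimal sketch of patching rows from a CSV by primary key. It is an illustration only: `patch_from_csv` is a hypothetical name, and the policy of filling only empty cells (never overwriting existing values) is an assumption, not documented engine behaviour.

```python
import csv
import io
from typing import Any, Dict, List


def patch_from_csv(rows: List[Dict[str, Any]], csv_text: str, key: str) -> List[Dict[str, Any]]:
    """Return copies of rows with empty cells filled from the CSV.

    Rows are matched on `key`. ASSUMPTION: existing non-empty values win,
    so the upload only fills gaps.
    """
    patches = {r[key]: r for r in csv.DictReader(io.StringIO(csv_text))}
    patched = []
    for row in rows:
        merged = dict(row)  # copy, so the input rows are left untouched
        patch = patches.get(str(row[key]), {})
        for col, val in patch.items():
            # Skip the key itself, overflow fields (col is None) and blank cells.
            if col is None or col == key or val in (None, ""):
                continue
            if merged.get(col) in (None, ""):
                merged[col] = val
        patched.append(merged)
    return patched


if __name__ == "__main__":
    staging = [
        {"employee_id": 1, "email": None},
        {"employee_id": 2, "email": "b@example.com"},
    ]
    upload = "employee_id,email\n1,a@example.com\n2,other@example.com\n"
    for row in patch_from_csv(staging, upload, key="employee_id"):
        print(row)
```

Keys are compared as strings because CSV cells arrive as text; a real merge would also need to handle duplicate keys in the upload.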

Rescored stg_bc__employees
email 94%
✓ not_null ✓ valid_email
One-click re-assessment · No ops tickets · Re-run locally or on schedule

Scoring runs wherever your data lives.

Whether your warehouse is the integrated data lake, Snowflake, Databricks, BigQuery, or Microsoft Fabric, the quality engine targets it natively.

Integrated: Integrated SQL Data Lake
Bring your own: Snowflake, Databricks, BigQuery, Microsoft Fabric

Zero YAML

Tests that write themselves.

You map a connector. We generate the tests. Every primary key gets a not-null and a uniqueness check. Every required field gets a not-null. Every timestamp gets a sanity bound. Every foreign key gets a join audit. No YAML, no hand-rolled SQL, no test-writing Jira tickets.
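
The generation rules above can be sketched in a few lines. This is an assumption-laden illustration, not the product's generator: the `Column` dataclass and `generate_tests` function are hypothetical, and the test names (`not_null`, `unique`, `valid_ts`, `fk_<parent>`) simply mirror the badges shown on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Column:
    """Hypothetical schema description for one column."""
    name: str
    dtype: str
    primary_key: bool = False
    required: bool = False
    references: Optional[str] = None  # parent table name, if a foreign key


def generate_tests(column: Column) -> List[str]:
    """Derive test names from column metadata, following the rules above."""
    tests: List[str] = []
    if column.primary_key:
        tests += ["not_null", "unique"]   # every primary key
    elif column.required:
        tests.append("not_null")          # every required field
    if column.dtype in ("timestamp", "date"):
        tests.append("valid_ts")          # sanity bound on timestamps
    if column.references:
        tests.append(f"fk_{column.references}")  # join audit on foreign keys
    return tests


if __name__ == "__main__":
    schema = [
        Column("employee_id", "bigint", primary_key=True),
        Column("birth_date", "date"),
        Column("manager_id", "bigint", references="employees"),
    ]
    for col in schema:
        print(col.name, generate_tests(col))
```

Because the tests are a pure function of the schema, re-running the generator after adding a source regenerates the suite without anyone editing configuration.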

Auto-generated from schema: stg_business_central__employees

Column        Type          Complete   Tests
email         varchar       0%         ✗ not_null  ✓ unique  ✗ valid_email
employee_id   bigint        100%       ✓ not_null  ✓ unique  ✓ primary_key
manager_id    bigint (fk)   86%        ✓ not_null  ✗ fk_employees (1 orphan row)

Generated from your schema. Zero YAML. Not-null, uniqueness, validity, freshness and referential integrity — regenerated on every source you add.

The test suite scales with your data model — thousands of checks per tenant, generated and re-generated automatically every time you add a source.

Living metric

Fresh or it doesn't count.

Every other quality tool grades your data on what's in the warehouse. We also grade when it got there. A "green" revenue number from Tuesday's batch is worse than a missing one: the missing number tells you nothing, while the stale one tells you a lie.

For every staging table, the engine stores the sync cadence, computes the gap between the latest timestamp and now, and fails the freshness test the moment the gap exceeds the SLA. Failed freshness flips the whole source to Warning or Critical — regardless of how clean the rows are.
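
The freshness check described above boils down to one comparison. The sketch below is illustrative: `freshness_status` is a hypothetical name, and the page does not say where Warning ends and Critical begins, so the `critical_factor` multiplier (Critical once the gap is more than twice the SLA) is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_status(
    latest: datetime,
    sla: timedelta,
    now: Optional[datetime] = None,
    critical_factor: float = 2.0,  # ASSUMPTION: not specified by the page
) -> str:
    """Compare the gap between the latest timestamp and now against the SLA."""
    now = now or datetime.now(timezone.utc)
    gap = now - latest
    if gap <= sla:
        return "Fresh"      # the test only fails once the gap exceeds the SLA
    if gap <= sla * critical_factor:
        return "Warning"
    return "Critical"


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    hourly = timedelta(hours=1)
    for age in (timedelta(minutes=43), timedelta(minutes=90), timedelta(days=3)):
        print(age, freshness_status(now - age, hourly, now=now))
```

Passing `now` explicitly keeps the check deterministic in tests; in production it would default to the current UTC time.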

Per-source SLAs · Auto-tuned from cadence · Alerts before the board sees it

Freshness monitor (real-time)

Source                      Last synced   Status
stg_salesforce__accounts    just now      Fresh
stg_xero__invoices          2 min ago     Fresh
stg_business_central__gl    43 min ago    Fresh
stg_hubspot__deals          47 min ago    Warning
stg_quickbooks__payments    3 days ago    Stale

12% of sources breached SLA
Quality engine flagged at 09:42

Ready to stop second-guessing your dashboards?

Book a demo and we'll score your own data in under a day.