Your data testing pyramid is upside down. Here's how to flip it.

Borrow the pyramid from software testing. Unit tests at the base (most numerous, fastest, cheapest). Integration tests in the middle. System tests at the top (fewest, slowest, most expensive).

Data pipelines need the same structure.

Unit tests at the base, close to the source: individual transformations, single-table logic. Fast. Many. Catch issues early.
Integration tests in the middle: cross-table joins, pipeline segments. Moderate speed, fewer of them.
System tests at the top: end-to-end flows, business metrics. Slow. Critical few.
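
What a base-layer test can look like, as a minimal sketch in plain Python. The `normalize_order` transformation and its column names are hypothetical, made up for illustration; the point is only that one transformation gets checked on one hand-built row, with no warehouse involved.

```python
from decimal import Decimal


def normalize_order(row: dict) -> dict:
    """Hypothetical single-row transformation: clean the email, convert cents to a decimal amount."""
    return {
        "order_id": row["order_id"],
        "email": row["email"].strip().lower(),
        "amount": Decimal(row["amount_cents"]) / Decimal(100),
    }


def test_normalize_order():
    # One hand-built input row, one expected output: runs in milliseconds.
    out = normalize_order(
        {"order_id": 1, "email": "  Ana@Example.COM ", "amount_cents": 1999}
    )
    assert out["email"] == "ana@example.com"
    assert out["amount"] == Decimal("19.99")


test_normalize_order()
```

Cheap to write, cheap to run. That is why this layer can hold the most tests.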

Ideal split: 70% unit, 20% integration, 10% system.
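
To see where you stand against that split, a rough sketch: tag each test with its layer and count. The layer labels and the `layer_split` and `is_inverted` helpers are assumptions for illustration, not part of any tool.

```python
from collections import Counter

# Target shares from the split above.
TARGET = {"unit": 0.70, "integration": 0.20, "system": 0.10}


def layer_split(layers):
    """Return each layer's share of the total test count."""
    counts = Counter(layers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests to measure")
    return {layer: counts.get(layer, 0) / total for layer in TARGET}


def is_inverted(split):
    """A pyramid is upside down when system tests outnumber unit tests."""
    return split["system"] > split["unit"]


# Example: a suite that is mostly dashboard-level checks.
suite = ["system"] * 8 + ["integration"] + ["unit"]
split = layer_split(suite)
print(split, "inverted:", is_inverted(split))
```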

Most teams? Inverted. Only system-level tests. Dashboard checks. Late detection. Slow feedback. By the time you know something’s broken, it’s already affected users.

The fix: move tests upstream. Unit tests in dbt. Source validation before transformation. Staging-layer tests on the transformation logic. Output tests only for critical business metrics.
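
Source validation before transformation could be sketched like this in plain Python. The `validate_source` function and the `order_id` / `amount_cents` columns are hypothetical; in a dbt project the same checks would more likely be expressed as not-null and uniqueness tests on the source.

```python
def validate_source(rows, required=("order_id", "amount_cents")):
    """Return a list of problems found in a raw batch; an empty list means it may proceed."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Required columns must be present and non-null.
        for col in required:
            if row.get(col) is None:
                problems.append(f"row {i}: missing {col}")
        # The key must be unique within the batch.
        order_id = row.get("order_id")
        if order_id is not None:
            if order_id in seen_ids:
                problems.append(f"row {i}: duplicate order_id {order_id}")
            seen_ids.add(order_id)
    return problems


batch = [
    {"order_id": 1, "amount_cents": 100},
    {"order_id": 1, "amount_cents": None},
]
issues = validate_source(batch)
if issues:
    # Stop here, before any transformation runs on bad input.
    print("source validation failed:", issues)
```

The bad batch gets rejected at the door, not discovered on a dashboard.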

It’s not about test count. It’s about who finds the bug first: your tests or your stakeholders.

What percentage of your tests are unit tests versus end-to-end tests?