When building incremental models in dbt, it’s easy to manage new or updated data — but handling deleted rows flagged via an is_deleted
column? That’s trickier. You’re not alone if you’ve said:
“I want to upgrade our dbt run to be incremental, but our source data has an
is_deleted
column to indicate that a row was actually deleted. For our current run, we just filter those out, but if I want to make it incremental, I’m not finding any good way to delete rows from the target table that have been flagged. Can anyone point me in the right direction?”
Here’s the answer: You can manage deletions using dbt’s merge
strategy within incremental models by detecting and removing rows flagged as is_deleted
from the target table.
Let’s break it all down.
Understanding the Challenge with Deleted Rows in Incremental Models
Incremental models are designed to append or update records without replacing the entire table. But deletions present a unique challenge. If a row in your source is flagged as is_deleted = true
, it won’t automatically be removed from your target table — unless you explicitly code for it.
Use Case: When You Have an is_deleted
Column
Your source table might look like this:
id | name | updated_at | is_deleted |
---|---|---|---|
1 | John | 2024-04-01 | false |
2 | Jane | 2024-04-02 | true |
In a full-refresh model, you’d just filter out the deleted records with a where is_deleted = false
. But with incremental models, that deleted record (Jane) would remain in the target unless removed.
How to Use DBT’s Merge Strategy to Delete Flagged Rows
The solution lies in dbt’s incremental_strategy='merge'
. This allows you to perform updates and deletions based on a condition.
Here’s how you can structure your dbt model:
{{ config(
materialized='incremental',
unique_key='id',
incremental_strategy='merge'
) }}
with source as (
select *
from {{ source('your_source_schema', 'your_table') }}
),
filtered as (
select *
from source
where updated_at > (select coalesce(max(updated_at), '1900-01-01') from {{ this }})
)
select
id,
name,
updated_at,
is_deleted
from filtered
Then add a post-hook or customize your merge logic to delete rows from your target where is_deleted = true
.
Using delete_condition
in dbt for Merge Strategy
If you’re on a warehouse that supports MERGE (like Snowflake, BigQuery, or Redshift), dbt allows a delete_condition
:
{{ config(
materialized='incremental',
unique_key='id',
incremental_strategy='merge',
delete_condition="source.is_deleted = true"
) }}
This tells dbt: “If a row in the source is flagged is_deleted = true
, then remove it from the target table.”
Key Considerations
- Your
unique_key
must be reliable, as it dictates how dbt matches source and target rows. - Ensure the
updated_at
field is properly indexed and up-to-date to avoid skipped deletions. - For Snowflake, the merge will include
WHEN MATCHED AND is_deleted = true THEN DELETE
.
Testing Deleted Rows Are Removed
Use dbt tests to validate the deletions:
models:
- name: your_incremental_model
tests:
- dbt_expectations.expect_column_values_to_be_in_set:
column: is_deleted
values: [false]
This ensures no true
flags remain in your target model.
Alternative Approach: Soft Deletes via an Active Flag
If your platform doesn’t support deletes in MERGE
, consider maintaining an is_active
flag:
case
when is_deleted = true then false
else true
end as is_active
Then downstream models can filter only where is_active = true
.
Conclusion
Handling is_deleted
flags in dbt incremental models is absolutely possible — and powerful — with the right configuration. By leveraging dbt’s merge
strategy and the delete_condition
, you ensure your target tables reflect reality, even when rows are removed from the source.
Say goodbye to stale deleted records, and hello to cleaner, more accurate data!
FAQs
Can dbt delete rows from target tables in incremental models?
Yes, by using the merge
strategy and specifying a delete_condition
.
What if my warehouse doesn’t support MERGE
?
Use soft deletes with an is_active
flag or build post-hooks to remove flagged rows.
How do I prevent deleted rows from appearing in downstream models?
Filter on is_active = true
or is_deleted = false
in your final models.
Is the delete_condition
supported in all warehouses?
No, it depends. Snowflake and BigQuery support it; check dbt’s docs for your adapter.
Can I test that deleted rows are truly gone?
Yes, use dbt tests
to ensure is_deleted = true
rows no longer exist.
Focus Keywords: dbt incremental delete flagged rows
Slug: dbt-incremental-delete-flagged-rows-is-deleted
Meta Description: Learn how to delete flagged rows in dbt incremental models using the is_deleted column to maintain accurate target tables.
Alt text image: dbt incremental delete flagged rows is_deleted