Table Of Contents

2024-11-22 - History layer - joins for late arriving records

Release

Status: Available

Type: DataOps

Date: 2024-11-22

Problem

We have a customer that runs a relational database for their core application, but updates to that database can arrive in any order. This means we receive records timestamped before or after their place in the logical sequence of events, e.g. a customer subscription might be created before the customer exists. This is only an issue when we join multiple concepts together using SCD loads.

Solution

The solution was to join tiles on the primary key, then sequence the timestamps so that we always get a match to the related tables, even when the timestamps don't line up exactly.
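As a rough illustration of the idea (not the actual template logic), the sketch below joins on the primary key and then picks a version by timestamp. The record shape, the `match_version` helper, and the fallback rule (use the earliest known version when nothing is effective yet) are all assumptions made for this example.

```python
from datetime import datetime
from typing import Optional

# Hypothetical record shape: (primary_key, effective_from, payload)
Version = tuple[str, datetime, dict]


def match_version(versions: list[Version], key: str, as_at: datetime) -> Optional[Version]:
    """Return the version of `key` to join to for an event at `as_at`."""
    # Step 1: match on the primary key only, then order by timestamp.
    candidates = sorted((v for v in versions if v[0] == key), key=lambda v: v[1])
    if not candidates:
        return None  # no record with this key at all
    # Step 2: usual SCD lookup, the latest version effective at or before the event.
    earlier = [v for v in candidates if v[1] <= as_at]
    if earlier:
        return earlier[-1]
    # Step 3 (assumed fallback): the related record arrived late,
    # so use its earliest known version instead of dropping the match.
    return candidates[0]


customers = [
    ("C1", datetime(2024, 11, 2), {"name": "Acme"}),
    ("C1", datetime(2024, 11, 10), {"name": "Acme Ltd"}),
]
subscriptions = [
    ("S1", "C1", datetime(2024, 11, 1)),   # created before the customer exists
    ("S2", "C1", datetime(2024, 11, 5)),
    ("S3", "C1", datetime(2024, 11, 12)),
]

joined = [
    (sub_id, match_version(customers, cust_key, created_at))
    for sub_id, cust_key, created_at in subscriptions
]
```

Here subscription `S1` still matches customer `C1` even though it is timestamped a day before the customer's first version, which a strict effective-date join would miss.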

Leverage the Magic

This was an update to our Jinja join template, which now wraps table joins in the additional logic required to return a consistent record.

ADI

Tricky, but it works!

Last Refreshed

Doc Refreshed: 2024-11-29