Point In Time Join

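The point-in-time join is an important concept in computing features. The core idea is that feature values change over time, so each training example must be built from the feature values as they were at the time of that example, not as they are today. This point-in-time correctness keeps information from the future out of the training set.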

For example, consider an e-commerce app. You might be interested in a feature that describes the total amount of time a user spent on the app in the past week. In some weeks the user may have had little to no activity, while in others they were far more active. For each training set record, we need to travel back in time to find what the user's features were at that instant.
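As a rough illustration of why the timestamp matters, the sketch below computes the "time spent in the past week" feature for the same user at two different instants. The session log, the `minutes_in_past_week` helper, and the choice of a 7-day window that includes the `as_of` instant are all assumptions made for this example; none of it is Glacius's API.

```python
from datetime import datetime, timedelta

# Hypothetical session log: (user_id, session_start, minutes_spent).
SESSIONS = [
    ("u1", datetime(2024, 1, 1, 9, 0), 30),
    ("u1", datetime(2024, 1, 5, 18, 0), 45),
    ("u1", datetime(2024, 1, 12, 8, 0), 10),
    ("u2", datetime(2024, 1, 6, 12, 0), 20),
]


def minutes_in_past_week(sessions, user_id, as_of):
    """Total minutes the user spent in the 7 days up to and including as_of."""
    window_start = as_of - timedelta(days=7)
    return sum(
        minutes
        for uid, started_at, minutes in sessions
        # Only sessions inside the window count; anything after as_of is
        # "the future" from the point of view of this training record.
        if uid == user_id and window_start < started_at <= as_of
    )


# The same user has different feature values at different instants.
print(minutes_in_past_week(SESSIONS, "u1", datetime(2024, 1, 6)))   # 75
print(minutes_in_past_week(SESSIONS, "u1", datetime(2024, 1, 13)))  # 10
```

A training record dated January 6 should therefore see 75 minutes, while one dated January 13 should see 10, even though both describe the same user.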

In Glacius, we perform the point-in-time join against a label datasource. Its rows are the events we're interested in (ad clicks, users churned, items bought, etc.).

To build the offline features, we then perform a point-in-time join of the feature bundle's datasource against the label datasource: for each row of the label datasource, we compute the value of each feature as of that row's timestamp. For online materialization, we only care about the current feature values, so we compute the join as of the current timestamp.
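The two modes can be sketched in plain Python. This is a minimal illustration of the technique, not how Glacius implements it: the row layouts, the function names (`build_index`, `value_as_of`, `point_in_time_join`, `online_features`), and the sample data are all hypothetical. It assumes each feature row records a value that became valid at its timestamp, and that a label row should see the latest value at or before its own timestamp.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime


def build_index(feature_rows):
    """Group (entity, timestamp, value) rows by entity, sorted by timestamp."""
    by_entity = defaultdict(list)
    for entity, ts, value in feature_rows:
        by_entity[entity].append((ts, value))
    index = {}
    for entity, rows in by_entity.items():
        rows.sort(key=lambda r: r[0])
        index[entity] = ([ts for ts, _ in rows], [v for _, v in rows])
    return index


def value_as_of(index, entity, as_of):
    """Latest feature value with timestamp <= as_of, or None if there is none."""
    if entity not in index:
        return None
    timestamps, values = index[entity]
    pos = bisect_right(timestamps, as_of)  # count of rows at or before as_of
    return values[pos - 1] if pos else None


def point_in_time_join(label_rows, index):
    """Offline: attach to each label row the feature value as of its timestamp."""
    return [
        (entity, ts, label, value_as_of(index, entity, ts))
        for entity, ts, label in label_rows
    ]


def online_features(index, now):
    """Online: the same lookup, evaluated at the current timestamp."""
    return {entity: value_as_of(index, entity, now) for entity in index}


# Hypothetical feature datasource: (user_id, computed_at, minutes_past_week).
FEATURES = [
    ("u1", datetime(2024, 1, 1), 30),
    ("u1", datetime(2024, 1, 8), 75),
    ("u1", datetime(2024, 1, 15), 10),
    ("u2", datetime(2024, 1, 8), 20),
]
# Hypothetical label datasource: (user_id, event_time, churned).
LABELS = [
    ("u1", datetime(2024, 1, 10), 1),
    ("u1", datetime(2024, 1, 20), 0),
    ("u2", datetime(2024, 1, 3), 0),
]

INDEX = build_index(FEATURES)
TRAINING_ROWS = point_in_time_join(LABELS, INDEX)
CURRENT = online_features(INDEX, datetime(2024, 2, 1))
```

Here the label row for `u1` on January 10 picks up the value 75 rather than the later value 10, and the `u2` row on January 3 gets `None` because no feature value existed yet. The online lookup is the same operation with a single "now" timestamp in place of per-row label timestamps.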