Defining And Registering Features
1. Instantiate a Client
To start, let's instantiate a client. We'll use the "development" namespace since we're just experimenting. We'll use this client later to register features and trigger jobs. Pass in the API key you generated earlier.
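If you'd rather not paste the key directly into your script, one option is to read it from an environment variable first. This is a minimal sketch under stated assumptions: the variable name GLACIUS_API_KEY and the helper load_api_key are hypothetical and not something Glacius defines.

```python
import os

# Hypothetical variable name chosen for this sketch; Glacius does not define it.
API_KEY_ENV_VAR = "GLACIUS_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, failing loudly if it is missing."""
    # Allow a plain dict to be passed in so the helper is easy to test.
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before creating the client")
    return key
```

The returned string can then be passed as api_key when constructing the client.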
from glacius import SnowflakeSource, Client
client = Client(api_key="***", namespace="development")
2. Define A Data Source
Let's define a data source. We'll use Snowflake as an example. This table contains item interaction data for our hypothetical e-commerce app. We specify timestamp_col so that Glacius can perform point-in-time joins.
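To see why the timestamp column matters, here is a small sketch of what a point-in-time lookup does in general: for a given moment, it returns the latest value recorded at or before that moment and never a later one. This illustrates the general technique on toy data and is not Glacius's implementation; the function name and data are made up for the example.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_lookup(feature_rows, as_of):
    """Return the latest value observed at or before `as_of`, or None.

    feature_rows: list of (timestamp, value) tuples sorted by timestamp.
    """
    timestamps = [ts for ts, _ in feature_rows]
    # Index of the first row strictly after `as_of`.
    idx = bisect_right(timestamps, as_of)
    return feature_rows[idx - 1][1] if idx else None


# Toy data: a user's running click count at three moments.
clicks = [
    (datetime(2024, 1, 1), 2),
    (datetime(2024, 1, 3), 5),
    (datetime(2024, 1, 6), 9),
]

# A label observed on Jan 4 may only see the Jan 3 value, never the Jan 6 one.
value_for_label = point_in_time_lookup(clicks, datetime(2024, 1, 4))
```

Without a timestamp on each row there would be no way to rule out the Jan 6 value, which is how future information leaks into training data.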
from glacius import SnowflakeSource
item_engagement_data_source = SnowflakeSource(
    name="global_item_engagement_data",
    description="item engagement data",
    timestamp_col="timestamp",
    table="global_item_engagement_data",
    database="gradiently",
    schema="public",
)
3. Defining Your Feature Bundle
Features are organized into feature bundles. A feature bundle is a logical grouping of features that share an entity and a data source. For example, we could have one feature bundle for user features, another for item features, and another for user-item features.
Let's define our bundle and add some aggregation features. We'll also need to specify the entity this bundle is attached to. Glacius also supports composite entities, but in this example we have a single entity with a single join key.
The code below defines one feature, total items clicked, for each of the time windows (1, 3, 5, and 7 days) and each of these categories:
- electronics_accessories
- fashion_apparel
- home_garden
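Since the feature is defined once per category and once per window, the bundle ends up with 3 x 4 = 12 features. The following sketch just enumerates the names that pattern produces, using the same naming scheme as the bundle definition; it does not call Glacius at all.

```python
categories = ["electronics_accessories", "fashion_apparel", "home_garden"]
time_windows = [1, 3, 5, 7]

# One name per (category, window) pair, in the order the features are added.
feature_names = [
    f"total_items_clicked_{category}_{t}d"
    for category in categories
    for t in time_windows
]

for name in feature_names:
    print(name)
```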
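from datetime import timedelta
# Feature, Aggregation, AggregationType, when, and col are assumed to be
# importable from the top-level glacius package; adjust if your version differs.
from glacius import (
    Aggregation, AggregationType, Entity, Feature, FeatureBundle, Int32, col, when,
)
# The entity identifies what the features describe; user_id is the join key.
user_entity = Entity(keys=["user_id"])
user_bundle = FeatureBundle(
    name="user_feature_bundle",
    description="user features on item engagement data",
    source=item_engagement_data_source,
    entity=user_entity,
)
categories = ["electronics_accessories", "fashion_apparel", "home_garden"]
time_windows = [1, 3, 5, 7]
# One feature per (category, window) pair: 3 categories x 4 windows = 12 features.
for category in categories:
    user_bundle.add_features([
        Feature(
            name=f"total_items_clicked_{category}_{t}d",
            description=f"total items clicked over {t} days",
            # Count a click only when the row belongs to this category.
            expr=when(col("product_category") == category).then(col("item_click")).otherwise(0),
            dtype=Int32,
            agg=Aggregation(method=AggregationType.SUM, window=timedelta(days=t)),
        ) for t in time_windows
    ])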
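To make the expr and agg settings above concrete, here is a plain-Python sketch of what a conditional sum over a trailing window computes for one user. It is an illustration only, not how Glacius evaluates features. The function name, the toy events, and the window boundary (start excluded, end included) are assumptions of this sketch.

```python
from datetime import datetime, timedelta


def windowed_conditional_sum(events, category, as_of, window):
    """Sum item_click for one category over events in (as_of - window, as_of]."""
    start = as_of - window
    return sum(
        # Mirrors when(...).then(item_click).otherwise(0): other categories add 0.
        e["item_click"] if e["product_category"] == category else 0
        for e in events
        if start < e["timestamp"] <= as_of
    )


# Toy events for a single user.
events = [
    {"timestamp": datetime(2024, 1, 1), "product_category": "home_garden", "item_click": 1},
    {"timestamp": datetime(2024, 1, 5), "product_category": "home_garden", "item_click": 1},
    {"timestamp": datetime(2024, 1, 6), "product_category": "fashion_apparel", "item_click": 1},
    {"timestamp": datetime(2024, 1, 7), "product_category": "home_garden", "item_click": 1},
]

as_of = datetime(2024, 1, 7)
total_3d = windowed_conditional_sum(events, "home_garden", as_of, timedelta(days=3))
total_7d = windowed_conditional_sum(events, "home_garden", as_of, timedelta(days=7))
```

With this data the 3-day window picks up only the two most recent home_garden clicks, while the 7-day window also includes the one from Jan 1.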
4. Register Your Feature Bundle
Finally, let's register our bundle. You can now view it from the UI!
response = client.register(
    feature_bundles=[user_bundle],
    # Adjacent string literals are joined, keeping the space between the two halves.
    commit_msg="Added user feature bundle containing click features "
               "for electronics, fashion, and home garden",
)