Defining and Registering Features

1. Instantiate a Client

To start, instantiate a client. We specify the namespace "development" because we are only experimenting. We will use this client later to register features and trigger jobs. Pass in the API key you generated earlier.

main.py
from glacius import SnowflakeSource, Client
 
client = Client(api_key="***", namespace="development")
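Rather than hard-coding the key in main.py, one option is to read it from an environment variable. The following is a minimal sketch in plain Python; the variable name GLACIUS_API_KEY and the helper load_api_key are hypothetical names chosen for this example, not something Glacius defines.

```python
import os


def load_api_key(env_var: str = "GLACIUS_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or empty."""
    # GLACIUS_API_KEY is a hypothetical name chosen for this example.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

The returned value can then be passed to the client as Client(api_key=load_api_key(), namespace="development").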
2. Define a Data Source

Next, define a data source. We use Snowflake as an example. This table contains item interaction data for our hypothetical e-commerce app. We specify timestamp_col because it allows Glacius to perform point-in-time joins.

main.py
from glacius import SnowflakeSource
 
item_engagement_data_source = SnowflakeSource(
    name = "global_item_engagement_data",
    description = "item engagement data",
    timestamp_col = "timestamp",
    table = "global_item_engagement_data",
    database = "gradiently",
    schema = "public"
)
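To make the role of timestamp_col concrete: in a point-in-time join, each example only sees source rows whose timestamp is not later than the example's own timestamp, so no future information leaks into its features. The sketch below illustrates that rule in plain Python on made-up rows. It is not how Glacius implements the join; the function name, the sample data, and the choice to treat the boundary as inclusive are all assumptions made for illustration.

```python
from datetime import datetime


def rows_visible_at(rows, user_id, as_of):
    """Return the rows for user_id whose timestamp is not later than as_of."""
    return [r for r in rows if r["user_id"] == user_id and r["timestamp"] <= as_of]


# Made-up engagement rows, keyed by the same column names used in this guide.
rows = [
    {"user_id": "u1", "timestamp": datetime(2024, 1, 1), "item_click": 1},
    {"user_id": "u1", "timestamp": datetime(2024, 1, 3), "item_click": 1},
    {"user_id": "u1", "timestamp": datetime(2024, 1, 9), "item_click": 1},
    {"user_id": "u2", "timestamp": datetime(2024, 1, 2), "item_click": 1},
]

# An example observed on Jan 5 may only use u1's clicks from Jan 1 and Jan 3;
# the Jan 9 click lies in the future relative to the example and is excluded.
visible = rows_visible_at(rows, "u1", datetime(2024, 1, 5))
clicks_as_of_label = sum(r["item_click"] for r in visible)
```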
3. Define Your Feature Bundle

Features are organized into feature bundles. A feature bundle is a logical grouping of features that share an entity and a data source. For example, we could have one feature bundle for user features, another for item features, and another for user-item features.

Let's define our bundle and add some aggregation features. We also need to specify the entity the bundle is attached to. Glacius supports composite entities as well, but this example uses a single entity with a single join key.

The code below defines one feature, total items clicked, for each of the time windows [1, 3, 5, 7] days and for each of these categories:

  • electronics_accessories
  • fashion_apparel
  • home_garden
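Three categories and four time windows give twelve features in total. As a quick check, the names can be enumerated in plain Python, assuming the naming pattern total_items_clicked_{category}_{t}d used in the bundle definition:

```python
categories = ["electronics_accessories", "fashion_apparel", "home_garden"]
time_windows = [1, 3, 5, 7]

# One name per (category, window) pair, in the same order as the nested loops.
feature_names = [
    f"total_items_clicked_{category}_{t}d"
    for category in categories
    for t in time_windows
]

print(len(feature_names))  # 12
print(feature_names[0])    # total_items_clicked_electronics_accessories_1d
```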
main.py
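from datetime import timedelta

# Feature, Aggregation, AggregationType, when, and col are used below. They are
# assumed to be importable from the top-level glacius package like the other
# names; adjust the import if your version exposes them elsewhere.
from glacius import (
    Aggregation, AggregationType, Entity, Feature, FeatureBundle, Int32, col, when,
)

# A simple entity with a single join key.
user_entity = Entity(keys=["user_id"])

user_bundle = FeatureBundle(
    name="user_feature_bundle",
    description="user features on item engagement data",
    source=item_engagement_data_source,
    entity=user_entity,
)

categories = ["electronics_accessories", "fashion_apparel", "home_garden"]
time_windows = [1, 3, 5, 7]

# One feature per (category, window) pair.
for category in categories:
    user_bundle.add_features([
        Feature(
            name=f"total_items_clicked_{category}_{t}d",
            description=f"total items clicked over {t} days",
            # Count item_click only for rows in this category; other rows add 0.
            expr=when(col("product_category") == category).then(col("item_click")).otherwise(0),
            dtype=Int32,
            agg=Aggregation(method=AggregationType.SUM, window=timedelta(days=t)),
        )
        for t in time_windows
    ])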
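To see what a single feature computes, the following plain-Python sketch mimics the expression and aggregation for one user: item_click counts only when product_category matches, and the resulting values are summed over the trailing window. This page does not state how Glacius treats window boundaries, so the sketch assumes rows in the half-open interval (as_of - window, as_of]. The function name and the sample rows are made up for illustration.

```python
from datetime import datetime, timedelta


def total_items_clicked(rows, category, as_of, window):
    """Sum item_click for rows in `category` within (as_of - window, as_of]."""
    total = 0
    for row in rows:
        in_window = as_of - window < row["timestamp"] <= as_of
        # Mirrors when(...).then(item_click).otherwise(0): other categories add 0.
        value = row["item_click"] if row["product_category"] == category else 0
        if in_window:
            total += value
    return total


# Made-up rows for a single user.
rows = [
    {"timestamp": datetime(2024, 1, 1), "product_category": "home_garden", "item_click": 1},
    {"timestamp": datetime(2024, 1, 5), "product_category": "home_garden", "item_click": 1},
    {"timestamp": datetime(2024, 1, 6), "product_category": "fashion_apparel", "item_click": 1},
    {"timestamp": datetime(2024, 1, 7), "product_category": "home_garden", "item_click": 1},
]

as_of = datetime(2024, 1, 7)
three_day = total_items_clicked(rows, "home_garden", as_of, timedelta(days=3))
seven_day = total_items_clicked(rows, "home_garden", as_of, timedelta(days=7))
```

With these rows, the 3-day home_garden total covers the Jan 5 and Jan 7 clicks, while the 7-day total also picks up the Jan 1 click.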
 
4. Register Your Feature Bundle

Finally, register the bundle. Once it is registered, you can view it in the UI.

main.py
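# Adjacent string literals are joined, so the commit message contains no stray
# whitespace from the line continuation.
response = client.register(
    feature_bundles=[user_bundle],
    commit_msg=(
        "Added user feature bundle containing click features "
        "for electronics, fashion, and home garden"
    ),
)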