Quick Start Tutorial: Feature Management¶
Learning Objectives¶
In this tutorial you will learn:
- How to view the lineage of a feature
- How to manage the readiness of a feature
- How to be informed of the readiness of a feature list
- How to manage the status of a feature list
- How FeatureByte deployment guardrails work
- How to check the feature job status
- How to manage versioning
- How to disable deployment
Set up the prerequisites¶
Learning Objectives
In this section you will:
- start your local featurebyte server
- import libraries
- learn about catalogs
- activate a pre-built catalog
# library imports
import pandas as pd
import numpy as np
import random
# load the featurebyte SDK
import featurebyte as fb
# start the local server, then wait for it to be healthy before proceeding
fb.playground()
19:59:21 | INFO | Using configuration file at: C:\Users\colin\.featurebyte\config.yaml
19:59:21 | INFO | Active profile: local (http://127.0.0.1:8088)
19:59:21 | INFO | SDK version: 0.2.2
19:59:21 | INFO | Active catalog: default
19:59:21 | INFO | 1 feature lists, 4 features deployed
19:59:21 | INFO | (1/4) Starting featurebyte services
19:59:23 | INFO | (2/4) Creating local spark feature store
19:59:23 | INFO | (3/4) Import datasets
19:59:24 | INFO | Dataset grocery already exists, skipping import
19:59:24 | INFO | Dataset healthcare already exists, skipping import
19:59:24 | INFO | Dataset creditcard already exists, skipping import
19:59:24 | INFO | (4/4) Playground environment started successfully. Ready to go! 🚀
Create a pre-built catalog for this tutorial, with the data, metadata, and features already set up¶
Note that creating a pre-built catalog is not a step you would perform in a real project. The function is specific to this quick-start tutorial: it skips many of the preparatory steps and takes you straight to the point where you can materialize features.
In a real project you would first do the data modeling: declaring the tables, the entities, and their associated metadata. This is not a frequent task, but it forms the basis for best-practice feature engineering.
# get the functions to create a pre-built catalog
from prebuilt_catalogs import *
# create a new catalog for this tutorial
catalog = create_tutorial_catalog(PrebuiltCatalog.QuickStartFeatureManagement)
Cleaning up existing tutorial catalogs
19:59:25 | INFO | Catalog activated: quick start feature management 20230512:1926
Cleaning catalog: quick start feature management 20230512:1926
1 deployments
Done! |████████████████████████████████████████| 100% in 15.1s (0.07%/s)
19:59:46 | INFO | Catalog activated: default
19:59:46 | INFO | Catalog activated: quick start feature management 20230512:1959
Building a quick start catalog for feature management named [quick start feature management 20230512:1959]
Creating new catalog
Catalog created
Registering the source tables
Registering the entities
Tagging the entities to columns in the data tables
Populating the feature store with example features
Saving Feature(s) |████████████████████████████████████████| 1/1 [100%] in 0.2s
Loading Feature(s) |████████████████████████████████████████| 1/1 [100%] in 0.2s
Saving Feature(s) |████████████████████████████████████████| 5/5 [100%] in 1.1s
Loading Feature(s) |████████████████████████████████████████| 5/5 [100%] in 0.9s
Saving Feature(s) |████████████████████████████████████████| 4/4 [100%] in 0.9s
Loading Feature(s) |████████████████████████████████████████| 4/4 [100%] in 0.8s
Saving Feature(s) |████████████████████████████████████████| 3/3 [100%] in 2.3s
Loading Feature(s) |████████████████████████████████████████| 3/3 [100%] in 0.6s
Setting feature readiness
Deploying feature list
Loading Feature(s) |████████████████████████████████████████| 4/4 [100%] in 0.7s
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 1:03.4 (0.02%/s)
Manage Feature Readiness¶
Learning Objectives
In this section you will learn:
- how to change the readiness of a feature
- the meaning of each readiness value
Feature readiness¶
To help differentiate features that are in the prototype stage and features that are ready for production, a feature version can have one of four readiness levels:
- PRODUCTION_READY: ready for deployment in production environments.
- PUBLIC_DRAFT: shared for feedback purposes.
- DRAFT: in the prototype stage.
- DEPRECATED: not advised for use in either training or prediction.
# list the features in the catalog - note the readiness of each
catalog.list_features()
| | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4cb74023ec6436d152 | InvoiceUniqueProductGroups | OBJECT | DEPRECATED | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:21.240 |
1 | 645e2a4cb74023ec6436d154 | InvoiceUniqueProductGroupCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:13.384 |
2 | 645e2a4bb74023ec6436d150 | InvoiceDiscountAmount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.810 |
3 | 645e2a49b74023ec6436d14e | InvoiceItemCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:10.590 |
4 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.351 |
5 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.738 |
6 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.530 |
7 | 645e2a45b74023ec6436d144 | CustomerInventory_28d | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.056 |
8 | 645e2a44b74023ec6436d142 | StateMeanLongitude | FLOAT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.801 |
9 | 645e2a43b74023ec6436d140 | StateMeanLatitude | FLOAT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.265 |
10 | 645e2a43b74023ec6436d13e | StateAvgInvoiceAmount_28d | FLOAT | DRAFT | False | [GROCERYCUSTOMER, GROCERYINVOICE] | [GROCERYINVOICE] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:03.615 |
11 | 645e2a41b74023ec6436d13c | StateInventory_28d | OBJECT | DRAFT | False | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [INVOICEITEMS] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:02.265 |
12 | 645e2a40b74023ec6436d138 | StatePopulation | FLOAT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:00.759 |
13 | 645e2a3eb74023ec6436d134 | unused experimental feature | DATE | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:59.265 |
14 | 645e2a3eb74023ec6436d132 | StateName | VARCHAR | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:58.712 |
Example: Set features to production ready¶
# change the state features to be production ready
for feature_name in catalog.list_features().name:
    feature = catalog.get_feature(feature_name)
    # does the feature name contain "State"? (the match is case-sensitive)
    if "State" in feature.name:
        feature.update_readiness("PRODUCTION_READY")
# list the features in the catalog - note the readiness of each
catalog.list_features()
| | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4cb74023ec6436d152 | InvoiceUniqueProductGroups | OBJECT | DEPRECATED | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:21.240 |
1 | 645e2a4cb74023ec6436d154 | InvoiceUniqueProductGroupCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:13.384 |
2 | 645e2a4bb74023ec6436d150 | InvoiceDiscountAmount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.810 |
3 | 645e2a49b74023ec6436d14e | InvoiceItemCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:10.590 |
4 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.351 |
5 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.738 |
6 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.530 |
7 | 645e2a45b74023ec6436d144 | CustomerInventory_28d | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.056 |
8 | 645e2a44b74023ec6436d142 | StateMeanLongitude | FLOAT | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.801 |
9 | 645e2a43b74023ec6436d140 | StateMeanLatitude | FLOAT | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.265 |
10 | 645e2a43b74023ec6436d13e | StateAvgInvoiceAmount_28d | FLOAT | PRODUCTION_READY | False | [GROCERYCUSTOMER, GROCERYINVOICE] | [GROCERYINVOICE] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:03.615 |
11 | 645e2a41b74023ec6436d13c | StateInventory_28d | OBJECT | PRODUCTION_READY | False | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [INVOICEITEMS] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:02.265 |
12 | 645e2a40b74023ec6436d138 | StatePopulation | FLOAT | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:00.759 |
13 | 645e2a3eb74023ec6436d134 | unused experimental feature | DATE | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:59.265 |
14 | 645e2a3eb74023ec6436d132 | StateName | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:58.712 |
List Unsaved Features¶
Features that have not been saved do not persist once you close your notebook. It is helpful to check that you have saved every feature you wish to keep.
# create a feature without saving it
grocery_items_table = catalog.get_table("INVOICEITEMS")
grocery_items_view = grocery_items_table.get_view()
invoice_unique_product_ids = grocery_items_view.groupby("GroceryInvoiceGuid", category="GroceryProductGuid").aggregate(
    None,
    method=fb.AggFunc.COUNT,
    feature_name="InvoiceUniqueProductIds"
)
invoice_unique_product_count = invoice_unique_product_ids.cd.unique_count()
# list unsaved features
fb.list_unsaved_features()
# note that the second feature, invoice_unique_product_count, has not been named yet (its name is None), so it cannot be saved
| | object_id | variable_name | name | catalog | active_catalog |
---|---|---|---|---|---|
0 | 645e2abfb74023ec6436d167 | invoice_unique_product_ids | InvoiceUniqueProductIds | quick start feature management 20230512:1959 | True |
1 | 645e2abfb74023ec6436d169 | invoice_unique_product_count | None | quick start feature management 20230512:1959 | True |
Manage Feature List Status¶
Learning Objectives
In this section you will learn:
- how to change the status of a feature list
- the meaning of each status value
- how to deploy a feature list
Feature list status¶
Feature lists can be assigned one of five status levels to differentiate between experimental feature lists and those suitable for deployment or already deployed.
- DEPLOYED: Assigned to feature lists with at least one deployed version.
- TEMPLATE: For feature lists that serve as reference templates or safe starting points.
- PUBLIC_DRAFT: For feature lists shared for feedback purposes.
- DRAFT: For feature lists in the prototype stage.
- DEPRECATED: For outdated or unnecessary feature lists.
# list the feature lists in the catalog - note the status of each
# Note the readiness fraction which represents the proportion of features that are production ready
# Note the online fraction which represents the proportion of features that are being used in production
catalog.list_feature_lists()
| | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
3 | 645e2a3fb74023ec6436d136 | very short feature list | 1 | DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER] | [grocerycustomer] | 2023-05-12 11:59:59.883 |
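The two fractions in the listing above can be illustrated with a small, self-contained sketch. It does not call the FeatureByte SDK: `FeatureRow`, `readiness_frac()` and `online_frac()` are hypothetical helpers, and the feature names are placeholders. The sketch only assumes what the comments above state, namely that each fraction is a simple proportion over the features in the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureRow:
    """Hypothetical stand-in for one feature in a feature list."""
    name: str
    readiness: str
    online_enabled: bool


def readiness_frac(features: List[FeatureRow]) -> float:
    """Proportion of features whose readiness is PRODUCTION_READY."""
    if not features:
        return 0.0
    ready = sum(1 for f in features if f.readiness == "PRODUCTION_READY")
    return ready / len(features)


def online_frac(features: List[FeatureRow]) -> float:
    """Proportion of features that are online enabled."""
    if not features:
        return 0.0
    return sum(1 for f in features if f.online_enabled) / len(features)


# Placeholder lists: three draft features, and five production-ready
# features that have not been deployed yet.
draft_list = [FeatureRow(f"draft_{i}", "DRAFT", False) for i in range(3)]
ready_list = [FeatureRow(f"ready_{i}", "PRODUCTION_READY", False) for i in range(5)]

print(readiness_frac(draft_list), online_frac(draft_list))
print(readiness_frac(ready_list), online_frac(ready_list))
```

A list that is fully production ready but not yet deployed therefore shows a readiness fraction of 1.0 and an online fraction of 0.0, which matches the pattern of the StateFeatureList row above.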
Example: Make a feature list public¶
Once a feature list has been reviewed and is ready to share with other users, change its status to PUBLIC_DRAFT.
# get the state feature list
state_feature_list = catalog.get_feature_list("StateFeatureList")
# update the status to public draft
state_feature_list.update_status("PUBLIC_DRAFT")
Loading Feature(s) |████████████████████████████████████████| 5/5 [100%] in 1.2s
# list the feature lists in the catalog - note the status of each
catalog.list_feature_lists()
| | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | PUBLIC_DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
3 | 645e2a3fb74023ec6436d136 | very short feature list | 1 | DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER] | [grocerycustomer] | 2023-05-12 11:59:59.883 |
Example: Deploy a feature list¶
Deploying a feature list changes its status to DEPLOYED.
# deploy the state feature list
deployment = state_feature_list.deploy(make_production_ready=True)
deployment.enable()
Loading Feature(s) |████████████████████████████████████████| 5/5 [100%] in 1.2s
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 48.4s (0.02%/s)
# list the feature lists in the catalog - note the status of each
catalog.list_feature_lists()
| | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
3 | 645e2a3fb74023ec6436d136 | very short feature list | 1 | DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER] | [grocerycustomer] | 2023-05-12 11:59:59.883 |
Versioning¶
Learning Objectives
In this section you will learn:
- about feature versions and feature list versions
- how to change the table cleaning operations for a feature
- how to change the feature job settings for a feature
- how to manage the default versions for a feature or a feature list
- how to create new versions of features and feature lists
Concept: Feature version¶
A Feature Version enables the reuse of a Feature with varying feature job settings or distinct cleaning operations.
If the availability or freshness of the data source changes, new versions of the feature can be generated with a new feature job setting. If the data quality of the data sources changes, new versions of the feature can be created with new cleaning operations that address the new quality issues.
To ensure the seamless inference of Machine Learning tasks that depend on the feature, old versions of the feature can still be served without any disruption.
Example: Get table cleaning operations for a feature¶
# get the InvoiceDiscountAmount feature
invoice_discount_amount = catalog.get_feature("InvoiceDiscountAmount")
# list the feature versions
display(invoice_discount_amount.list_versions())
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4bb74023ec6436d150 | InvoiceDiscountAmount | V230512 | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.799 |
# no cleaning operations have been applied to this feature yet
invoice_discount_amount.info()['table_cleaning_operation']
{'this': [], 'default': []}
Example: Updating table cleaning operations for a feature¶
# update the data cleaning operations in the InvoiceDiscountAmount feature
new_version = invoice_discount_amount.create_new_version(
    table_cleaning_operations=[
        fb.TableCleaningOperation(
            table_name="INVOICEITEMS",
            column_cleaning_operations=[
                fb.ColumnCleaningOperation(
                    column_name="Discount",
                    cleaning_operations=[
                        fb.MissingValueImputation(imputed_value=0.0),
                        fb.ValueBeyondEndpointImputation(type="less_than", end_point=0, imputed_value=None)
                    ],
                )
            ],
        )
    ]
)
# list the feature versions
feature_versions = invoice_discount_amount.list_versions()
# sort by created_at ascending
feature_versions.sort_values(by="created_at", ascending=True, inplace=True)
# display the versions of the InvoiceDiscountAmount feature - note the new version that has been created
display(feature_versions)
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 645e2a4bb74023ec6436d150 | InvoiceDiscountAmount | V230512 | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.799 |
0 | 645e2b001640038c49aaeeaa | InvoiceDiscountAmount | V230512_1 | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:03:13.305 |
# Check new cleaning info
new_version.info()['table_cleaning_operation']['this']
[{'table_name': 'INVOICEITEMS', 'column_cleaning_operations': [{'column_name': 'Discount', 'cleaning_operations': [{'imputed_value': 0.0, 'type': 'missing'}, {'imputed_value': None, 'type': 'less_than', 'end_point': 0}]}]}]
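To make the effect of these two cleaning operations concrete, here is a plain-Python sketch that mimics them on a list of values. It does not use the FeatureByte SDK: `impute_missing()` and `impute_less_than()` are hypothetical helpers, and the sketch assumes that the operations are applied in the order in which they are listed.

```python
import math
from typing import List, Optional


def impute_missing(values: List[Optional[float]], imputed_value: float) -> List[Optional[float]]:
    """Replace missing values (None or NaN) with imputed_value."""
    def is_missing(v: Optional[float]) -> bool:
        return v is None or (isinstance(v, float) and math.isnan(v))
    return [imputed_value if is_missing(v) else v for v in values]


def impute_less_than(
    values: List[Optional[float]], end_point: float, imputed_value: Optional[float]
) -> List[Optional[float]]:
    """Replace values strictly below end_point with imputed_value."""
    return [imputed_value if v is not None and v < end_point else v for v in values]


discounts = [1.5, None, -2.0, 0.0]

# Assumed order: missing values first, then values below the end point.
step1 = impute_missing(discounts, 0.0)
step2 = impute_less_than(step1, 0, None)
print(step2)  # [1.5, 0.0, None, 0.0]

# Reversing the order gives a different result, because the negative
# value becomes missing first and is then imputed as 0.0.
reversed_order = impute_missing(impute_less_than(discounts, 0, None), 0.0)
print(reversed_order)  # [1.5, 0.0, 0.0, 0.0]
```

Under the assumed ordering, a negative discount ends up as a missing value rather than as 0.0, so the later `sum` aggregation treats it differently from a genuinely absent discount.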
# Check feature definition file
new_version.definition
# Generated by SDK version: 0.2.2
from bson import ObjectId
from featurebyte import ColumnCleaningOperation
from featurebyte import ItemTable
from featurebyte import MissingValueImputation
from featurebyte import ValueBeyondEndpointImputation
# item_table name: "INVOICEITEMS", event_table name: "GROCERYINVOICE"
item_table = ItemTable.get_by_id(ObjectId("645e2a37b74023ec6436d12c"))
item_view = item_table.get_view(
    event_suffix=None,
    view_mode="manual",
    drop_column_names=[],
    column_cleaning_operations=[
        ColumnCleaningOperation(
            column_name="Discount",
            cleaning_operations=[
                MissingValueImputation(imputed_value=0.0),
                ValueBeyondEndpointImputation(
                    type="less_than", end_point=0, imputed_value=None
                ),
            ],
        )
    ],
    event_drop_column_names=["record_available_at"],
    event_column_cleaning_operations=[],
    event_join_column_names=[
        "Timestamp",
        "GroceryInvoiceGuid",
        "GroceryCustomerGuid",
        "tz_offset",
    ],
)
feat = item_view.groupby(
    by_keys=["GroceryInvoiceGuid"], category=None
).aggregate(
    value_column="Discount",
    method="sum",
    feature_name="InvoiceDiscountAmount",
    skip_fill_na=True,
)
output = feat
Concept: Default feature version¶
The default version of a feature streamlines the process of reusing features by providing the most appropriate version. Additionally, it simplifies the creation of new versions of feature lists.
By default, the version of the feature with the highest level of readiness is selected as the default version, unless the user overrides this selection. When multiple versions share the highest level of readiness, the most recent of them is automatically chosen as the default.
When a feature is accessed without specifying a version ID but only by its name, the default version is automatically retrieved.
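The selection rule described above can be sketched in a few lines of plain Python. This is an illustration only, not the SDK's implementation: `FeatureVersion` and `pick_default()` are hypothetical, and the ranking of the readiness levels (DEPRECATED lowest, PRODUCTION_READY highest) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Assumed ordering of readiness levels, from lowest to highest.
READINESS_RANK = {
    "DEPRECATED": 0,
    "DRAFT": 1,
    "PUBLIC_DRAFT": 2,
    "PRODUCTION_READY": 3,
}


@dataclass(frozen=True)
class FeatureVersion:
    """Hypothetical record describing one version of a feature."""
    version: str
    readiness: str
    created_at: datetime


def pick_default(versions: List[FeatureVersion]) -> FeatureVersion:
    """Highest readiness wins; ties go to the most recently created version."""
    return max(versions, key=lambda v: (READINESS_RANK[v.readiness], v.created_at))


older = datetime(2023, 5, 12, 12, 0, 6)
newer = datetime(2023, 5, 12, 12, 3, 18)

# Two drafts: the most recent version becomes the default.
both_draft = [
    FeatureVersion("V230512", "DRAFT", older),
    FeatureVersion("V230512_1", "DRAFT", newer),
]
print(pick_default(both_draft).version)  # V230512_1

# A production-ready version outranks a newer draft.
mixed = [
    FeatureVersion("V230512", "PRODUCTION_READY", older),
    FeatureVersion("V230512_1", "DRAFT", newer),
]
print(pick_default(mixed).version)  # V230512
```

Sorting on a `(rank, created_at)` tuple keeps the tie-break explicit: the creation time only matters when the readiness ranks are equal.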
Example: The version we just created should be the default as no other version has a higher readiness and it is the latest version.¶
new_version.is_default
True
Example: Get the feature job settings for a feature¶
Note that changing feature job settings will only affect time-aware features e.g. features created using aggregate_over. It will not affect features based upon simple aggregation.
# get the CustomerInventory_28d feature
customer_inventory_28d_feature = catalog.get_feature("CustomerInventory_28d")
# list the feature versions
display(customer_inventory_28d_feature.list_versions())
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a45b74023ec6436d144 | CustomerInventory_28d | V230512 | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.046 |
# Get the feature job settings for CustomerInventory_28d
customer_inventory_28d_feature.info()['table_feature_job_setting']['this']
[{'table_name': 'GROCERYINVOICE', 'feature_job_setting': {'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}}]
Note that the table named here is the event table associated with the item table, because the timestamp originates from the event table.
# show the feature job settings for the grocery invoice table
grocery_invoice_table = catalog.get_table("GROCERYINVOICE")
grocery_invoice_table.default_feature_job_setting
FeatureJobSetting(blind_spot='145', frequency='60m', time_modulo_frequency='90s')
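The outputs above show the same frequency written as both '60m' and '3600s'. The sketch below converts such duration strings to seconds and shows one way the three settings could interact. It is a hedged illustration, not SDK code: `to_seconds()`, `latest_job_time()` and `data_cutoff()` are hypothetical helpers, and the sketch assumes that jobs run once per `frequency`, offset from the frequency boundary by `time_modulo_frequency`, and that each job ignores data newer than the job time minus `blind_spot`.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(duration: str) -> int:
    """Convert a duration string such as '90s' or '60m' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def latest_job_time(now: int, frequency: str, time_modulo_frequency: str) -> int:
    """Most recent scheduled job time (epoch seconds) at or before `now`."""
    freq = to_seconds(frequency)
    offset = to_seconds(time_modulo_frequency)
    return ((now - offset) // freq) * freq + offset


def data_cutoff(job_time: int, blind_spot: str) -> int:
    """Latest timestamp assumed to be included by a job run at job_time."""
    return job_time - to_seconds(blind_spot)


print(to_seconds("60m"))  # 3600
job_time = latest_job_time(7300, frequency="60m", time_modulo_frequency="90s")
print(job_time)  # 7290
print(data_cutoff(job_time, blind_spot="160s"))  # 7130
```

Under these assumptions, a longer blind spot makes a feature more conservative: each job waits for data to settle by excluding the most recent records.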
Example: Change the feature job settings for a feature¶
Note that changing feature job settings will only affect time-aware features e.g. features created using aggregate_over. It will not affect features based upon simple aggregation.
# create a new version of CustomerInventory_28d with a more conservative feature job setting (a longer blind spot)
new_version = customer_inventory_28d_feature.create_new_version(
    table_feature_job_settings=[
        fb.TableFeatureJobSetting(
            table_name="GROCERYINVOICE",
            feature_job_setting=fb.FeatureJobSetting(
                blind_spot="160s",
                frequency="60m",
                time_modulo_frequency="90s",
            )
        ),
    ]
)
# list the feature versions
feature_versions = customer_inventory_28d_feature.list_versions()
# sort by created_at ascending
feature_versions.sort_values(by="created_at", ascending=True, inplace=True)
# note that the new version is a draft, and that the old version remains production ready
display(feature_versions)
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 645e2a45b74023ec6436d144 | CustomerInventory_28d | V230512 | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.046 |
0 | 645e2b061640038c49aaeeb2 | CustomerInventory_28d | V230512_1 | OBJECT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:03:18.593 |
# Get the feature job settings for the new version
new_version.info()['table_feature_job_setting']['this']
[{'table_name': 'GROCERYINVOICE', 'feature_job_setting': {'blind_spot': '160s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}}]
# Check feature definition file
new_version.definition
# Generated by SDK version: 0.2.2
from bson import ObjectId
from featurebyte import DimensionTable
from featurebyte import FeatureJobSetting
from featurebyte import ItemTable
# dimension_table name: "GROCERYPRODUCT"
dimension_table = DimensionTable.get_by_id(ObjectId("645e2a3ab74023ec6436d12d"))
dimension_view = dimension_table.get_view(
    view_mode="manual", drop_column_names=[], column_cleaning_operations=[]
)
# item_table name: "INVOICEITEMS", event_table name: "GROCERYINVOICE"
item_table = ItemTable.get_by_id(ObjectId("645e2a37b74023ec6436d12c"))
item_view = item_table.get_view(
    event_suffix=None,
    view_mode="manual",
    drop_column_names=[],
    column_cleaning_operations=[],
    event_drop_column_names=["record_available_at"],
    event_column_cleaning_operations=[],
    event_join_column_names=[
        "Timestamp",
        "GroceryInvoiceGuid",
        "GroceryCustomerGuid",
        "tz_offset",
    ],
)
joined_view = item_view.join(dimension_view, on=None, how="left", rsuffix="")
grouped = joined_view.groupby(
    by_keys=["GroceryCustomerGuid"], category="ProductGroup"
).aggregate_over(
    value_column=None,
    method="count",
    windows=["28d"],
    feature_names=["CustomerInventory_28d"],
    feature_job_setting=FeatureJobSetting(
        blind_spot="160s", frequency="3600s", time_modulo_frequency="90s"
    ),
    skip_fill_na=True,
)
feat = grouped["CustomerInventory_28d"]
output = feat
Example: Change default feature version mode¶
The new version of CustomerInventory_28d is not the default as this version is a draft and the prior version is production ready.
new_version.is_default
False
The default version can be set explicitly only when the default version mode is MANUAL.
# guardrail if default version mode is not MANUAL
# the new version cannot be set as the default
try:
    new_version.as_default_version()
except Exception as ex:
    print(ex)
Cannot set default feature ID when default version mode is not MANUAL.
# downgrade current feature readiness to public draft first
customer_inventory_28d_feature.update_readiness("PUBLIC_DRAFT")
# upgrade new version readiness to public draft, new version becomes default
new_version.update_readiness("PUBLIC_DRAFT")
print(new_version.is_default, customer_inventory_28d_feature.is_default)
# change mode to manual and set the original version as default
customer_inventory_28d_feature.update_default_version_mode("MANUAL")
customer_inventory_28d_feature.as_default_version()
print(customer_inventory_28d_feature.is_default)
# set the new version as the default
new_version.as_default_version()
print(new_version.is_default)
True
# upgrade new version readiness to production ready
new_version.update_readiness("PRODUCTION_READY", ignore_guardrails=True)
Example: Cannot have more than one production ready version of a feature¶
# change the readiness of the original version of CustomerInventory_28d to production ready
try:
    customer_inventory_28d_feature.update_readiness("PRODUCTION_READY")
except Exception as ex:
    print("Error changing the readiness of the new version to production ready")
    print(ex)
Error changing the readiness of the new version to production ready
Found another feature version that is already PRODUCTION_READY. Please deprecate the feature "CustomerInventory_28d" with ID 645e2a45b74023ec6436d144 first before promoting the promoted version as there can only be one feature version that is production ready at any point in time. We are unable to promote the feature with ID 645e2b061640038c49aaeeb2 right now.
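The guardrail shown in this error message can be modeled with a short, self-contained sketch. It is not the SDK's implementation: `ProductionReadyConflict` and `promote_to_production_ready()` are hypothetical names, and the dictionary of version names to readiness values is an illustrative stand-in for the catalog.

```python
from typing import Dict


class ProductionReadyConflict(Exception):
    """Raised when a second version would become PRODUCTION_READY."""


def promote_to_production_ready(readiness_by_version: Dict[str, str], version: str) -> None:
    """Promote one version, refusing if another one is already PRODUCTION_READY."""
    for other, readiness in readiness_by_version.items():
        if other != version and readiness == "PRODUCTION_READY":
            raise ProductionReadyConflict(
                f"version {other} is already PRODUCTION_READY; deprecate it first"
            )
    readiness_by_version[version] = "PRODUCTION_READY"


versions = {"V230512": "PUBLIC_DRAFT", "V230512_1": "PRODUCTION_READY"}

# The promotion is refused while another version is production ready.
try:
    promote_to_production_ready(versions, "V230512")
except ProductionReadyConflict as ex:
    print(ex)

# Once the conflicting version is deprecated, the promotion goes through.
versions["V230512_1"] = "DEPRECATED"
promote_to_production_ready(versions, "V230512")
print(versions)
```

The check runs before any state is changed, so a refused promotion leaves every readiness value untouched.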
Example: Create version of a Feature List¶
The Feature List Version allows the use of the latest version of each feature. Upon creation of a new feature list version, the latest default versions of features are employed, unless particular feature versions are specified.
# the current default of the feature list has all feature versions production ready
customer_feature_list = catalog.get_feature_list("CustomerFeatureList")
customer_feature_list.list_features()
Loading Feature(s) |████████████████████████████████████████| 4/4 [100%] in 0.8s
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | V230512 | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.343 |
1 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | V230512 | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.727 |
2 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | V230512 | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.520 |
3 | 645e2a45b74023ec6436d144 | CustomerInventory_28d | V230512 | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.046 |
# create new version with the new default of CustomerInventory_28d
new_feature_list_version = customer_feature_list.create_new_version()
Loading Feature(s) |████████████████████████████████████████| 4/4 [100%] in 0.8s
new_feature_list_version.list_features()
| | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2b061640038c49aaeeb2 | CustomerInventory_28d | V230512_1 | OBJECT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:03:18.593 |
1 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | V230512 | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.343 |
2 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | V230512 | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.727 |
3 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | V230512 | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.520 |
# check readiness
new_feature_list_version.info()['production_ready_fraction']
{'this': 0.75, 'default': 1.0}
Default version of a Feature List¶
The default version of a feature list is the version with the highest fraction of production ready features.
The new version is not the default because its production_ready_fraction (0.75) is lower than that of the prior version (1.0): it contains the new DRAFT version of CustomerInventory_28d.
# the new version is not the default as its production_ready_fraction is lower than that of the current default
new_feature_list_version.is_default
False
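The selection rule described above can be sketched in plain Python. This is only an illustration: `pick_default_version` is a hypothetical helper, not part of the featurebyte SDK, the dictionary keys are made-up labels, and the tie-breaking behavior (keep the version listed first) is an assumption, since this tutorial does not describe how ties are resolved.

```python
# Hypothetical helper: not part of the featurebyte SDK.
def pick_default_version(fractions_by_version):
    """Return the version label with the highest production-ready fraction.

    Ties keep the version listed first (an assumption of this sketch).
    """
    return max(fractions_by_version, key=fractions_by_version.get)


# Fractions taken from the info() output above: 'default' is the existing
# version and 'this' is the newly created one. The keys are illustrative labels.
fractions = {"existing version": 1.0, "new version": 0.75}

default_version = pick_default_version(fractions)
print(default_version)
```

With these values the existing version stays the default, which is consistent with `is_default` returning False for the new version.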
Deleting Drafts¶
While prototyping, you may create and experiment with many features and feature lists. To avoid an explosion in the number of features, regularly clean up unused features and feature lists. Note that you cannot delete features or feature lists that have ever been deployed.
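As a rough aid for such cleanups, the sketch below shortlists features that are still in DRAFT and not online enabled. `deletion_candidates` is a hypothetical helper, not part of the featurebyte SDK, and the rows are a hand-copied subset of the `catalog.list_features()` output used in this tutorial. The filter cannot tell whether a feature was deployed at some point in the past, so treat the result as a list to review rather than a list that is safe to delete.

```python
# Hypothetical helper: not part of the featurebyte SDK.
def deletion_candidates(feature_rows):
    """Return names of features that are DRAFT and not online enabled."""
    return [
        row["name"]
        for row in feature_rows
        if row["readiness"] == "DRAFT" and not row["online_enabled"]
    ]


# A few rows in the shape of catalog.list_features(), values from this tutorial.
rows = [
    {"name": "InvoiceItemCount", "readiness": "DRAFT", "online_enabled": False},
    {"name": "CustomerSpend_14d", "readiness": "PRODUCTION_READY", "online_enabled": True},
    {"name": "unused experimental feature", "readiness": "DRAFT", "online_enabled": False},
]

candidates = deletion_candidates(rows)
print(candidates)
```

Assuming `catalog.list_features()` returns a pandas DataFrame, as the tables in this tutorial suggest, the rows could be obtained with `catalog.list_features().to_dict("records")`.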
Example: Deleting a feature¶
# list all of the features in this catalog
display(catalog.list_features())
# note the feature called "unused experimental feature" - this is a feature that is not being used in production, and was rejected as unhelpful
id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4cb74023ec6436d152 | InvoiceUniqueProductGroups | OBJECT | DEPRECATED | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:21.240 |
1 | 645e2a4cb74023ec6436d154 | InvoiceUniqueProductGroupCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:13.384 |
2 | 645e2b001640038c49aaeeaa | InvoiceDiscountAmount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.810 |
3 | 645e2a49b74023ec6436d14e | InvoiceItemCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:10.590 |
4 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.351 |
5 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.738 |
6 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.530 |
7 | 645e2b061640038c49aaeeb2 | CustomerInventory_28d | OBJECT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.056 |
8 | 645e2a44b74023ec6436d142 | StateMeanLongitude | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.801 |
9 | 645e2a43b74023ec6436d140 | StateMeanLatitude | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.265 |
10 | 645e2a43b74023ec6436d13e | StateAvgInvoiceAmount_28d | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER, GROCERYINVOICE] | [GROCERYINVOICE] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:03.615 |
11 | 645e2a41b74023ec6436d13c | StateInventory_28d | OBJECT | PRODUCTION_READY | True | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [INVOICEITEMS] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:02.265 |
12 | 645e2a40b74023ec6436d138 | StatePopulation | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:00.759 |
13 | 645e2a3eb74023ec6436d134 | unused experimental feature | DATE | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:59.265 |
14 | 645e2a3eb74023ec6436d132 | StateName | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:58.712 |
# get the feature
feature_to_delete = catalog.get_feature("unused experimental feature")
# delete the feature
feature_to_delete.delete()
# list all the features and note that the deleted feature no longer appears
display(catalog.list_features())
id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4cb74023ec6436d152 | InvoiceUniqueProductGroups | OBJECT | DEPRECATED | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:21.240 |
1 | 645e2a4cb74023ec6436d154 | InvoiceUniqueProductGroupCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:13.384 |
2 | 645e2b001640038c49aaeeaa | InvoiceDiscountAmount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.810 |
3 | 645e2a49b74023ec6436d14e | InvoiceItemCount | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:10.590 |
4 | 645e2a48b74023ec6436d14c | CustomerYearOfBirth | INT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:09.351 |
5 | 645e2a48b74023ec6436d148 | CustomerSpend_14d | FLOAT | PRODUCTION_READY | True | [GROCERYINVOICE] | [GROCERYINVOICE] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:08.738 |
6 | 645e2a46b74023ec6436d146 | CustomerInventory_24w | OBJECT | PRODUCTION_READY | True | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:07.530 |
7 | 645e2b061640038c49aaeeb2 | CustomerInventory_28d | OBJECT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 12:00:06.056 |
8 | 645e2a44b74023ec6436d142 | StateMeanLongitude | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.801 |
9 | 645e2a43b74023ec6436d140 | StateMeanLatitude | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:04.265 |
10 | 645e2a43b74023ec6436d13e | StateAvgInvoiceAmount_28d | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER, GROCERYINVOICE] | [GROCERYINVOICE] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:03.615 |
11 | 645e2a41b74023ec6436d13c | StateInventory_28d | OBJECT | PRODUCTION_READY | True | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [INVOICEITEMS] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:02.265 |
12 | 645e2a40b74023ec6436d138 | StatePopulation | FLOAT | PRODUCTION_READY | True | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [frenchstate] | [frenchstate] | 2023-05-12 12:00:00.759 |
13 | 645e2a3eb74023ec6436d132 | StateName | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [grocerycustomer] | [grocerycustomer] | 2023-05-12 11:59:58.712 |
Example: Deleting a feature list¶
# list all of the feature lists in the catalog
display(catalog.list_feature_lists())
# note the feature list called "very short feature list" - this is a feature list that is not being used in production, and was rejected as unhelpful
id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
3 | 645e2a3fb74023ec6436d136 | very short feature list | 1 | DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER] | [grocerycustomer] | 2023-05-12 11:59:59.883 |
# get the feature list
feature_list_to_delete = catalog.get_feature_list("very short feature list")
# delete the feature list
feature_list_to_delete.delete()
Loading Feature(s) |████████████████████████████████████████| 1/1 [100%] in 0.2s
# list all the feature lists and note that the deleted feature list no longer appears
display(catalog.list_feature_lists())
id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
Deployment Guardrails¶
FeatureByte has deployment guardrails that prevent you from deploying a feature list containing features that are not production ready.
Learning Objectives
In this section you will learn:
- how to check the readiness of a feature list
- how to deploy a feature list
Example: Check readiness of a feature list¶
The feature list readiness metric tells users how ready a feature list is for deployment. It is the fraction of features in the list that are production ready, reported as a value between 0.0 and 1.0.
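To make the metric concrete, here is a minimal sketch of how such a fraction can be computed from a list of readiness values. `production_ready_fraction` is a hypothetical helper written for this illustration, not the SDK's implementation, and returning 0.0 for an empty list is an assumption.

```python
# Hypothetical helper: not part of the featurebyte SDK.
def production_ready_fraction(readiness_values):
    """Return the share of features whose readiness is PRODUCTION_READY."""
    values = list(readiness_values)
    if not values:
        return 0.0  # assumption: an empty list counts as not ready
    ready = sum(1 for value in values if value == "PRODUCTION_READY")
    return ready / len(values)


# Readiness values copied from the list_features() outputs in this tutorial.
invoice_readiness = ["DEPRECATED", "DRAFT", "DRAFT"]
new_customer_version_readiness = [
    "DRAFT",
    "PRODUCTION_READY",
    "PRODUCTION_READY",
    "PRODUCTION_READY",
]

print(production_ready_fraction(invoice_readiness))               # 0.0
print(production_ready_fraction(new_customer_version_readiness))  # 0.75
```

These results agree with the 'this' values that `info()['production_ready_fraction']` reports for the two feature lists in this tutorial.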
# get the invoice feature list
invoice_feature_list = catalog.get_feature_list("InvoiceFeatureList")
# check whether the feature list is ready to be deployed
invoice_feature_list.info()['production_ready_fraction']
Loading Feature(s) |████████████████████████████████████████| 3/3 [100%] in 0.6s
{'this': 0.0, 'default': 0.0}
invoice_feature_list.list_features()
id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a4cb74023ec6436d152 | InvoiceUniqueProductGroups | V230512 | OBJECT | DEPRECATED | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:21.226 |
1 | 645e2a4bb74023ec6436d150 | InvoiceDiscountAmount | V230512 | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:11.799 |
2 | 645e2a49b74023ec6436d14e | InvoiceItemCount | V230512 | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS] | [INVOICEITEMS] | [groceryinvoice] | [groceryinvoice] | 2023-05-12 12:00:10.580 |
Example: Deploy a feature list when production readiness is not 100%¶
# try to deploy the invoice feature list; the guardrail is expected to reject it
try:
    bad_deployment = invoice_feature_list.deploy()
    bad_deployment.enable()
except Exception as ex:
    print("Error deploying the invoice feature list")
    print(ex)
Loading Feature(s) |████████████████████████████████████████| 3/3 [100%] in 0.7s
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Working... |⚠︎ | 0% in 6.1s (0.00%/s)
Error deploying the invoice feature list
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/worker/task_executor.py", line 145, in execute_io_task
    return run_async(execute_task(self.request.id, **payload))
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/worker/task_executor.py", line 70, in run_async
    return future.result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/worker/task_executor.py", line 120, in execute_task
    return_val = await executor.execute()
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/worker/task_executor.py", line 97, in execute
    await self.task.execute()
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/worker/task/deployment_create_update.py", line 42, in execute
    await self.app_container.deploy_service.update_deployment(
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/service/deploy.py", line 409, in update_deployment
    await self.update_feature_list(
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/service/deploy.py", line 269, in update_feature_list
    await self._validate_deployed_operation(document, deployed)
  File "/opt/venv/lib/python3.8/site-packages/featurebyte/service/deploy.py", line 191, in _validate_deployed_operation
    raise DocumentUpdateError(
featurebyte.exception.DocumentUpdateError: Only FeatureList object of all production ready features can be deployed.
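One way to avoid triggering this error is to check readiness before calling `deploy()`. The sketch below lists the features that would block deployment. `not_production_ready` is a hypothetical helper, not part of the featurebyte SDK, and the rows are copied by hand from the `invoice_feature_list.list_features()` output above.

```python
# Hypothetical helper: not part of the featurebyte SDK.
def not_production_ready(feature_rows):
    """Return (name, readiness) pairs for features that are not PRODUCTION_READY."""
    return [
        (row["name"], row["readiness"])
        for row in feature_rows
        if row["readiness"] != "PRODUCTION_READY"
    ]


# Rows for InvoiceFeatureList, copied from the list_features() output above.
rows = [
    {"name": "InvoiceUniqueProductGroups", "readiness": "DEPRECATED"},
    {"name": "InvoiceDiscountAmount", "readiness": "DRAFT"},
    {"name": "InvoiceItemCount", "readiness": "DRAFT"},
]

blockers = not_production_ready(rows)
if blockers:
    print("Not deploying; these features are not production ready:")
    for name, readiness in blockers:
        print(f"  {name}: {readiness}")
else:
    print("All features are production ready")
```

Assuming `list_features()` returns a pandas DataFrame, as the tables in this tutorial suggest, the rows could be obtained with `invoice_feature_list.list_features().to_dict("records")`, and `deploy()` would be called only when the list of blockers is empty.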
# show the feature lists - note that the invoice feature list has not been deployed
catalog.list_feature_lists()
id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | DEPLOYED | True | 1.0 | 1.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
Example: Disabling a deployment¶
# helper function to disable deployment for a specific feature list
def disable_deployment(feature_list_name):
    # list deployments
    deployments = catalog.list_deployments()
    # just the ones matching this feature list name
    deployments = deployments.loc[deployments.feature_list_name == feature_list_name]
    # disable each matching deployment
    for deployment_id in deployments.id:
        deployment = catalog.get_deployment_by_id(deployment_id)
        deployment.disable()
# disable the deployments
disable_deployment("CustomerFeatureList")
disable_deployment("StateFeatureList")
Done! |████████████████████████████████████████| 100% in 18.2s (0.06%/s) Done! |████████████████████████████████████████| 100% in 15.1s (0.07%/s)
# show the feature lists status
catalog.list_feature_lists()
id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | created_at | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 645e2a53b74023ec6436d164 | InvoiceFeatureList | 3 | DRAFT | False | 0.0 | 0.0 | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [groceryinvoice] | 2023-05-12 12:00:22.110 |
1 | 645e2a50b74023ec6436d15c | CustomerFeatureList | 4 | PUBLIC_DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [grocerycustomer] | 2023-05-12 12:00:18.285 |
2 | 645e2a4db74023ec6436d156 | StateFeatureList | 5 | PUBLIC_DRAFT | False | 1.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [frenchstate] | 2023-05-12 12:00:15.559 |
Next Steps¶
Now that you've completed the quick-start feature management tutorial, you can put your knowledge into practice or learn more:
1) Put your knowledge into practice by creating features in the "credit card dataset feature engineering playground" or "healthcare dataset feature engineering playground" catalogs
2) Learn more about feature engineering via the "Deep Dive Feature Engineering" tutorial
3) Learn about data modeling via the "Deep Dive Data Modeling" tutorial