Compute historical feature values¶
Historical feature values are needed to train and test machine learning models.
Let's take the feature list we just created and compute its feature values for the rows of a given observation table.
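To make the idea of a historical feature value concrete, here is a minimal sketch in plain Python of a point-in-time window aggregation. It is not how FeatureByte computes features internally. The invoice records, the customer ID `c1`, the helper `count_and_avg_in_window`, and the choice of a half-open window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical invoice events: (customer_id, timestamp, amount).
# These values are invented for illustration only.
invoices = [
    ("c1", datetime(2023, 3, 1, 9, 0), 10.0),
    ("c1", datetime(2023, 3, 10, 9, 0), 20.0),
    ("c1", datetime(2023, 3, 20, 9, 0), 30.0),
]


def count_and_avg_in_window(events, customer_id, point_in_time, window):
    """Aggregate a customer's events in [point_in_time - window, point_in_time).

    Events at or after point_in_time are ignored, so no future
    information leaks into the feature value.
    """
    start = point_in_time - window
    amounts = [
        amount
        for cid, ts, amount in events
        if cid == customer_id and start <= ts < point_in_time
    ]
    count = len(amounts)
    avg = sum(amounts) / count if count else None
    return count, avg


# Two observations for the same customer at different points in time.
early = count_and_avg_in_window(invoices, "c1", datetime(2023, 3, 12), timedelta(days=14))
late = count_and_avg_in_window(invoices, "c1", datetime(2023, 3, 25), timedelta(days=14))
print(early)
print(late)
```

The same customer gets different feature values at different points in time, which is why each row of an observation table carries its own timestamp alongside the entity keys.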
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
15:35:03 | INFO | SDK version: 1.0.2.dev46
15:35:03 | INFO | No catalog activated.
15:35:03 | INFO | Using profile: tutorial
15:35:03 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
15:35:03 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
15:35:03 | INFO | SDK version: 1.0.2.dev46
15:35:03 | INFO | No catalog activated.
15:35:03 | INFO | Catalog activated: Grocery Dataset Tutorial
List feature lists in Catalog¶
In [2]:
catalog.list_feature_lists()
Out[2]:
 | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | primary_entity | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 662b59023bbb17418af6e696 | Customer x ProductGroup Simple FeatureList | 9 | DRAFT | False | 0.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [customer, productgroup] | [customer, productgroup] | 2024-04-26T07:34:53.267000 |
Get Feature List from Catalog¶
In [3]:
simple_feature_list = catalog.get_feature_list("Customer x ProductGroup Simple FeatureList")
Loading Feature(s) |████████████████████████████████████████| 9/9 [100%] in 0.6s
Get an observation table¶
In [4]:
# List observation tables
catalog.list_observation_tables()
Out[4]:
 | id | name | type | shape | feature_store_name | created_at |
---|---|---|---|---|---|---|
0 | 662b57ec0afca1c7b079f598 | Preview Table with 10 items | view | [10, 2] | playground | 2024-04-26T07:29:55.010000 |
1 | 662b57d8daab72d046ad966e | In_Store_Customer_x_ProductGroup_Spending_next... | observation_table | [1000, 4] | playground | 2024-04-26T07:29:38.704000 |
2 | 662b57c572b2fff854399a7a | In_Store_Customer_x_ProductGroup_2023_1K | uploaded_file | [1000, 3] | playground | 2024-04-26T07:29:18.860000 |
In [5]:
# Get observation table: 'In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K'
training_observations = catalog.get_observation_table(
    "In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K"
)
Compute historical features¶
In [6]:
# Create historical feature table
table_name = (
    "Simple Training Simple Training for In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K"
)
training_data_table = simple_feature_list.compute_historical_feature_table(
    training_observations,
    historical_feature_table_name=table_name,
)
Done! |████████████████████████████████████████| 100% in 38.0s (0.03%/s)
In [7]:
display(training_data_table.to_pandas())
Downloading table |████████████████████████████████████████| 1000/1000 [100%] in
 | GROCERYCUSTOMERGUID | POINT_IN_TIME | PRODUCTGROUP | CUSTOMER_x_PRODUCTGROUP_Sum_of_TotalCost_next_2_weeks | CUSTOMER_Age_band | CUSTOMER_Latest_invoice_Amount | CUSTOMER_Count_of_invoice_14d | CUSTOMER_Avg_of_invoice_Amount_14d | CUSTOMER_Std_of_invoice_Amount_14d | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d | CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w | CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d | CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 699efd7f-aba2-4515-9335-2c8040a94f9f | 2023-12-11 08:51:22 | Fromages | 14.18 | 80-84 | 13.13 | 4 | 13.860000 | 3.720329 | 0.179880 | 0.683171 | 6.00 | 166.499167 |
1 | 125dfe7d-eac0-4eab-94d8-1cd008e1641c | 2023-05-16 09:00:11 | Laits | 1.85 | 30-34 | 5.82 | 1 | 5.820000 | 0.000000 | -1.000000 | 0.645410 | 0.00 | 2653.102500 |
2 | 326b6ccb-0891-49fe-acbf-31d06c6d9e67 | 2023-03-20 13:34:55 | Céréales | 0.00 | 35-39 | 24.79 | 1 | 24.790000 | 0.000000 | 1.414202 | 0.624311 | 0.00 | 532.296944 |
3 | e42fa5f3-7737-4c6a-9ef4-856f113e60bd | 2023-12-18 19:04:45 | Fromages | 9.00 | 25-29 | 4.76 | 4 | 12.860000 | 7.439772 | -0.637723 | 0.649094 | 11.36 | 241.682222 |
4 | dde029d7-ceca-4e44-aad0-38e22ba11b74 | 2023-09-08 15:00:07 | Pains | 3.49 | 40-44 | 22.71 | 6 | 10.605000 | 8.144472 | 0.930581 | 0.740797 | 2.50 | 50.218611 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
995 | e883912f-82c4-4ca8-bfa9-0bdeb46dd4c5 | 2023-06-28 20:16:15 | Céréales | 0.00 | 70-74 | 3.00 | 1 | 3.000000 | 0.000000 | NaN | 0.724302 | 0.00 | 1994.139722 |
996 | cc96d96e-5d02-48dd-b742-d2a0ef633c43 | 2023-03-07 10:00:46 | Laits | 2.00 | 55-59 | 30.21 | 0 | NaN | NaN | NaN | 0.655556 | 0.00 | 984.909444 |
997 | 1b82b9eb-cc54-4cc4-a7e3-9a7417faa8a5 | 2023-11-16 20:44:02 | Laits | 2.69 | 40-44 | 4.00 | 3 | 8.143333 | 2.931033 | -1.413608 | 0.541741 | 0.00 | 1970.083056 |
998 | c0ca0bda-e7f5-4748-9b14-0e7ba9a07a47 | 2023-04-06 14:58:43 | Laits | 2.32 | 65-69 | 17.20 | 10 | 20.122000 | 13.270356 | -0.079666 | 0.808877 | 4.64 | 241.992500 |
999 | a0588833-ba78-41a4-b36a-d36bcd68e27e | 2023-09-16 13:40:19 | Fromages | 0.00 | 20-24 | 5.50 | 6 | 6.561667 | 2.902714 | -0.493589 | 0.699924 | 0.00 | 1578.428611 |
1000 rows × 13 columns
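A common next step is to split the downloaded data frame into model inputs and a target and to check for missing values (note the NaN entries in rows such as 996 above). The sketch below assumes pandas is installed. To stay self-contained, it rebuilds rows 0, 1 and 996 with four of the columns from the output above instead of calling `training_data_table.to_pandas()`. Treating `CUSTOMER_x_PRODUCTGROUP_Sum_of_TotalCost_next_2_weeks` as the target is an inference from its name, and the variable names `df`, `X`, `y` and `missing_per_column` are illustrative.

```python
import pandas as pd

# Assumed target column, inferred from the "next_2_weeks" suffix.
TARGET = "CUSTOMER_x_PRODUCTGROUP_Sum_of_TotalCost_next_2_weeks"

# Rows 0, 1 and 996 of the displayed table, restricted to a few columns.
# In the notebook this frame would come from training_data_table.to_pandas().
df = pd.DataFrame(
    {
        "PRODUCTGROUP": ["Fromages", "Laits", "Laits"],
        TARGET: [14.18, 1.85, 2.00],
        "CUSTOMER_Count_of_invoice_14d": [4, 1, 0],
        "CUSTOMER_Avg_of_invoice_Amount_14d": [13.86, 5.82, float("nan")],
    },
    index=[0, 1, 996],
)

# Separate the target from the candidate model inputs.
y = df[TARGET]
X = df.drop(columns=[TARGET])

# Count missing values per column; an average over zero invoices is undefined.
missing_per_column = X.isna().sum()
print(missing_per_column)
```

How to treat the missing values (imputation, or a model that accepts NaN natively) depends on the learning algorithm and is left open here.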
In [8]:
# List historical feature tables in the catalog
catalog.list_historical_feature_tables()
Out[8]:
 | id | name | feature_store_name | observation_table_name | shape | created_at |
---|---|---|---|---|---|---|
0 | 662b592bab51a900e2d8e938 | Simple Training Simple Training for In_Store_C... | playground | In_Store_Customer_x_ProductGroup_Spending_next... | [1000, 13] | 2024-04-26T07:35:40.835000 |
Concepts in this tutorial¶
SDK reference for¶
- Historical feature table
- FeatureList.compute_historical_feature_table()
- FeatureList.compute_historical_features() to compute a data frame directly