Ideated Features
Prerequisites
This page uses the client and wait_for_task helpers defined in API Overview.
After running an ideation pipeline, you can retrieve the individual features it generated, including the Python SDK code to reproduce or modify each one. This is particularly useful for:
- Inspecting how features are constructed
- Reproducing features outside the pipeline
- Modifying features (e.g., changing windows, filters, or aggregations)
- Feeding feature definitions to AI tools for analysis or improvement
List Suggested Features
Retrieve all features generated by a feature ideation:
# Get the feature ideation ID from the pipeline
response = client.get(f"/pipeline/{pipeline_id}/feature_ideation")
feature_ideation_id = response.json().get("feature_ideation_id")

# List suggested features (paginated)
response = client.get(
    f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
    params={"page": 1, "page_size": 50},
)
features = response.json()["data"]

# Print the first five features with their relevance scores
for f in features[:5]:
    print(f"{f['feature_name']} (relevance: {f.get('relevance_score', 'N/A')})")
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `page` | integer | No | Page number (default: 1) |
| `page_size` | integer | No | Items per page (default: 10) |
| `sort_by` | string | No | Field to sort by |
| `sort_dir` | string | No | Sort direction: "asc" or "desc" |
| `search` | string | No | Search by feature name |
| `readiness` | string | No | Filter by readiness: "NEW", "DRAFT", "PRODUCTION_READY" |
| `signal_type` | string | No | Filter by signal type (e.g., "aggregate", "transform") |
| `dtype` | string | No | Filter by data type |
| `suggested` | boolean | No | `true` for ideated features only, `false` for existing catalog features |
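The filters above can be combined in a single request. The following is a minimal sketch, assuming the `client` helper from API Overview accepts a `params` dictionary as in the earlier example. `build_feature_query` is a hypothetical helper name, not part of the API; it drops unset filters so that only explicit values are sent.

```python
from typing import Any, Dict, Optional


def build_feature_query(
    page: int = 1,
    page_size: int = 10,
    sort_by: Optional[str] = None,
    sort_dir: Optional[str] = None,
    search: Optional[str] = None,
    readiness: Optional[str] = None,
    signal_type: Optional[str] = None,
    dtype: Optional[str] = None,
    suggested: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build query parameters for the suggested_features endpoint (hypothetical helper)."""
    if sort_dir is not None and sort_dir not in ("asc", "desc"):
        raise ValueError('sort_dir must be "asc" or "desc"')
    candidates = {
        "page": page,
        "page_size": page_size,
        "sort_by": sort_by,
        "sort_dir": sort_dir,
        "search": search,
        "readiness": readiness,
        "signal_type": signal_type,
        "dtype": dtype,
        # Assumption: the API expects lowercase "true"/"false" in the query string.
        "suggested": None if suggested is None else str(suggested).lower(),
    }
    # Send only the filters that were explicitly set.
    return {key: value for key, value in candidates.items() if value is not None}


params = build_feature_query(
    page_size=50,
    signal_type="aggregate",
    sort_by="relevance_score",
    sort_dir="desc",
)
# response = client.get(
#     f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
#     params=params,
# )
```

Validating `sort_dir` locally is a convenience only; the server remains the authority on which values it accepts.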
Returns a paginated response. Each item in data contains:
| Field | Type | Description |
|---|---|---|
| `id` | string | Feature ID |
| `feature_name` | string | Feature name |
| `feature_description` | string | Human-readable description of what the feature computes |
| `code` | string | Python SDK code to reproduce this feature |
| `relevance_score` | float | Semantic relevance score (higher = more relevant to the use case) |
| `relevance_explanation` | string | Why this feature is relevant |
| `predictive_power_score` | float | Predictive power score from EDA |
| `primary_entity` | array | Entity names this feature is computed for |
| `primary_table` | array | Source table names used |
| `signal_type` | string | Feature engineering type (e.g., "aggregate", "transform", "lookup") |
| `inputs_description` | string | Description of input columns used |
| `filter_description` | string | Description of any filters applied |
| `readiness` | string | "NEW" for ideated features not yet saved to catalog |
| `complexity` | float | Feature engineering complexity score |
| `dtype` | string | Output data type (e.g., "FLOAT", "INT") |
| `feature_type` | string | "numerical" or "categorical" |
| `is_naive_prediction` | boolean | Whether this is a naive prediction baseline feature |
| `suggested` | boolean | `true` if generated by ideation |
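Because the response is paginated, collecting every suggested feature takes one request per page. The sketch below is one possible approach, not an official client method: `iter_suggested_features` is a hypothetical helper, and it assumes that a page with fewer than `page_size` items (or no items) is the last one, since the pagination metadata of the response is not described on this page.

```python
from typing import Any, Dict, Iterator


def iter_suggested_features(
    client: Any,
    feature_ideation_id: str,
    page_size: int = 50,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield every suggested feature, requesting one page at a time."""
    url = f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features"
    page = 1
    while True:
        response = client.get(
            url,
            params={"page": page, "page_size": page_size, **filters},
        )
        items = response.json()["data"]
        yield from items
        # Assumption: a short or empty page means there is nothing left to fetch.
        if len(items) < page_size:
            break
        page += 1


# Usage, with the client helper from API Overview:
# all_features = list(iter_suggested_features(client, feature_ideation_id))
# numeric = [f for f in all_features if f.get("feature_type") == "numerical"]
```

A generator keeps memory use flat for large ideations and lets the caller stop early, for example after the first feature that matches a condition.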
Get Feature SDK Code
The code field of each item contains the Python SDK code that reproduces the feature.
Example output for a ratio feature that compares the item-store's 28-day average on the forecast weekday to a 182-day naive baseline:
"""
SDK code to create ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday_
To_Naive_ITEM_STORE_Avg_of_sales_amounts_182cD
Feature description:
Ratio of the item-store's average sales amount on the same weekday (28 calendar days)
to the overall average sales amount (182 calendar days).
"""
import featurebyte as fb

# Activate catalog
catalog = fb.Catalog.activate("M5 Sales Amount Forecasting")

# Get view from SALES_AMOUNT time series table
sales_amount_view = catalog.get_view("SALES_AMOUNT")

# Extract day_of_week from the forecast point
context = fb.Context.get_by_id("69d5cf6d7791713392c73226")
forecast_day_of_week = context.get_forecast_point_feature().dt.day_of_week
forecast_day_of_week.name = "FORECAST_Day_Of_Week"

# Extract weekday from source table
sales_amount_view["Weekday"] = sales_amount_view["date"].dt.day_of_week

# Group by item_store, segmented by weekday
sales_amount_view_by_item_store = sales_amount_view.groupby(["item_store_id"])
sales_amount_view_by_item_store_by_weekday = sales_amount_view.groupby(
    ["item_store_id"], category="Weekday"
)

# Avg sales amount per weekday over 28 calendar days (dictionary feature)
avg_by_weekday_28cd = sales_amount_view_by_item_store_by_weekday.aggregate_over(
    "sales_amount",
    method="avg",
    feature_names=["ITEM_STORE_Avg_of_sales_amounts_by_Weekday_28cD"],
    windows=[fb.CalendarWindow(unit="DAY", size=28)],
)["ITEM_STORE_Avg_of_sales_amounts_by_Weekday_28cD"]

# Extract value matching the forecast weekday
avg_same_weekday_28cd = avg_by_weekday_28cd.cd.get_value(forecast_day_of_week)
avg_same_weekday_28cd.name = (
    "ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday"
)

# Avg sales amount over 182 calendar days (naive baseline)
avg_182cd = sales_amount_view_by_item_store.aggregate_over(
    "sales_amount",
    method="avg",
    feature_names=["ITEM_STORE_Avg_of_sales_amounts_182cD"],
    windows=[fb.CalendarWindow(unit="DAY", size=182)],
)["ITEM_STORE_Avg_of_sales_amounts_182cD"]

# Ratio: weekday-specific average / overall average
ratio = avg_same_weekday_28cd / avg_182cd
ratio.name = "ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday_To_Naive_ITEM_STORE_Avg_of_sales_amounts_182cD"

# Save and describe
ratio.save()
ratio.update_feature_type("numeric")
ratio.update_description(
    "Ratio of the item-store's average sales amount on the same weekday (28 calendar days) "
    "to the overall average sales amount (182 calendar days)."
)
This code can be executed directly in a Python environment with the FeatureByte SDK to recreate the feature, or modified to create variants (e.g., different windows, aggregation methods, or entity groupings).
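To edit the generated code in an IDE or hand it to another tool, it can help to write each feature's `code` field to its own file. The sketch below is one way to do that with the standard library only; `safe_filename` and `export_feature_code` are hypothetical helper names, and the file naming scheme is an arbitrary choice rather than a FeatureByte convention.

```python
import re
from pathlib import Path
from typing import Any, Dict, Iterable, List


def safe_filename(feature_name: str) -> str:
    """Turn a feature name into a filesystem-friendly .py file name."""
    stem = re.sub(r"[^0-9A-Za-z_]+", "_", feature_name).strip("_")
    return (stem or "feature") + ".py"


def export_feature_code(
    features: Iterable[Dict[str, Any]], out_dir: str
) -> List[Path]:
    """Write the `code` field of each feature to out_dir and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for feature in features:
        code = feature.get("code")
        if not code:
            continue  # Skip items that carry no SDK code.
        path = target / safe_filename(feature["feature_name"])
        path.write_text(code, encoding="utf-8")
        written.append(path)
    return written


# Usage, with `features` from the list endpoint:
# paths = export_feature_code(features, "ideated_features")
```

Two feature names that differ only in punctuation would map to the same file name and overwrite each other, so a real export may want to append the feature `id` to the name.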
Get Full Feature Metadata and Lineage
For a deeper understanding of how a feature is constructed, use the metadata endpoint:
suggested_feature_id = features[0]["id"]
response = client.get(
    f"/catalog/feature_ideation/suggested_feature_metadata/{suggested_feature_id}",
)
metadata = response.json()
Response fields:
| Field | Type | Description |
|---|---|---|
| `sdk_code` | string | Formatted Python SDK code to reproduce the feature |
| `lineage` | object | Complete construction graph (see below) |
Feature Lineage
The lineage object shows the step-by-step construction of the feature as a directed graph:
| Field | Type | Description |
|---|---|---|
| `lineage.nodes` | array | Ordered list of operations in the feature construction |
| `lineage.edges` | array | Data flow connections between nodes |
| `lineage.inputs` | array | Input columns from source tables |
Each node in lineage.nodes contains:
| Field | Type | Description |
|---|---|---|
| `node_id` | string | Unique node identifier |
| `is_input` | boolean | Whether this node represents a source column |
| `node_type` | string | Operation type (e.g., "groupby", "aggregate", "filter", "lookup") |
| `title` | string | Human-readable operation title |
| `description` | string | What this operation does |
| `code` | string | Python code for this specific step |
| `output_type` | string | Output type of this operation |
| `output_name` | string | Name of the output variable |
This is useful for understanding complex features that chain multiple operations (e.g., a ratio of two aggregations with a filter applied).
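A quick way to gauge how complex a feature is before reading every step is to summarize its lineage. The sketch below relies only on the node fields listed above (`is_input`, `node_type`, `title`, `output_name`); `summarize_lineage` is a hypothetical helper name, and the shape of the summary it returns is an illustrative choice.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_lineage(lineage: Dict[str, Any]) -> Dict[str, Any]:
    """Split lineage nodes into inputs and operations and count operation types."""
    nodes: List[Dict[str, Any]] = lineage.get("nodes", [])
    inputs = [node for node in nodes if node.get("is_input")]
    steps = [node for node in nodes if not node.get("is_input")]
    return {
        # Names of the source columns the feature starts from.
        "input_names": [node.get("output_name") for node in inputs],
        # Operation titles, in the order the API returned them.
        "step_titles": [node.get("title") for node in steps],
        # How many times each operation type occurs, e.g. {"aggregate": 2}.
        "operation_counts": dict(Counter(node.get("node_type") for node in steps)),
    }


# Usage, with `metadata` from the metadata endpoint:
# summary = summarize_lineage(metadata.get("lineage", {}))
# print(summary["operation_counts"])
```

The summary deliberately ignores `lineage.edges`, whose fields are not described here; it treats the node list as already ordered, as the table above states.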
Preview Feature Values
Preview computed values for a feature on a sample observation set:
response = client.get(
    f"/catalog/feature_ideation/feature_preview/{suggested_feature_id}",
)
preview = response.json()
Example: Inspect and Modify a Feature
# 1. Find top features by relevance
response = client.get(
    f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
    params={"page_size": 10, "sort_by": "relevance_score", "sort_dir": "desc"},
)
top_features = response.json()["data"]

# 2. Get the SDK code for the best feature
best = top_features[0]
print(f"Feature: {best['feature_name']}")
print(f"Relevance: {best['relevance_score']}")
print(f"Description: {best['feature_description']}")
print(f"\nCode:\n{best['code']}")

# 3. Get the full lineage to understand construction steps
response = client.get(
    f"/catalog/feature_ideation/suggested_feature_metadata/{best['id']}",
)
metadata = response.json()
lineage = metadata.get("lineage", {})

# Print each operation, skipping nodes that only represent source columns
for node in lineage.get("nodes", []):
    if not node.get("is_input"):
        print(f"\nStep: {node['title']}")
        print(f" {node['description']}")
        print(f" Code: {node['code']}")
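When the goal is to hand a feature definition to an AI tool or a reviewer, the same fields can be gathered into a single text block instead of being printed piecemeal. This is a sketch under the assumption that the list item and the metadata response have the fields documented on this page; `format_feature_digest` is a hypothetical helper name and the layout of the digest is arbitrary.

```python
from typing import Any, Dict


def format_feature_digest(feature: Dict[str, Any], metadata: Dict[str, Any]) -> str:
    """Combine name, description, construction steps, and SDK code into one string."""
    lines = [
        f"Feature: {feature['feature_name']}",
        f"Description: {feature.get('feature_description', '')}",
        "Steps:",
    ]
    step_number = 0
    for node in metadata.get("lineage", {}).get("nodes", []):
        if node.get("is_input"):
            continue  # Source columns are not construction steps.
        step_number += 1
        lines.append(
            f"{step_number}. {node.get('title', '')}: {node.get('description', '')}"
        )
    lines.append("Code:")
    # Prefer the formatted code from the metadata endpoint, fall back to the list item.
    lines.append(metadata.get("sdk_code") or feature.get("code", ""))
    return "\n".join(lines)


# Usage, with `best` and `metadata` from the example above:
# print(format_feature_digest(best, metadata))
```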