Ideated Features

Prerequisites

This page uses the client and wait_for_task helpers defined in API Overview.

After running an ideation pipeline, you can retrieve the individual features it generated — including the Python SDK code to reproduce or modify each feature. This is particularly useful for:

  • Inspecting how features are constructed
  • Reproducing features outside the pipeline
  • Modifying features (e.g., changing windows, filters, or aggregations)
  • Feeding feature definitions to AI tools for analysis or improvement

List Suggested Features

Retrieve all features generated by a feature ideation:

# Get the feature ideation ID from the pipeline
response = client.get(f"/pipeline/{pipeline_id}/feature_ideation")
feature_ideation_id = response.json().get("feature_ideation_id")

# List suggested features (paginated)
response = client.get(
    f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
    params={"page": 1, "page_size": 50},
)
features = response.json()["data"]

for f in features[:5]:
    print(f"{f['feature_name']} (relevance: {f.get('relevance_score', 'N/A')})")

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| page | integer | No | Page number (default: 1) |
| page_size | integer | No | Items per page (default: 10) |
| sort_by | string | No | Field to sort by |
| sort_dir | string | No | Sort direction: "asc" or "desc" |
| search | string | No | Search by feature name |
| readiness | string | No | Filter by readiness: "NEW", "DRAFT", "PRODUCTION_READY" |
| signal_type | string | No | Filter by signal type (e.g., "aggregate", "transform") |
| dtype | string | No | Filter by data type |
| suggested | boolean | No | true for ideated features only, false for existing catalog features |
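
Because the endpoint is paginated, collecting every feature means requesting pages until one comes back short. The following is a minimal sketch, assuming `client.get` accepts a `params` dict and that each response carries its items under `data`, as in the snippet above. The helper name `iter_suggested_features` is hypothetical, and the stop condition (a page with fewer than `page_size` items) is an assumption, since the other pagination fields of the response are not documented on this page.

```python
from typing import Any, Dict, Iterator


def iter_suggested_features(
    client: Any,
    feature_ideation_id: str,
    page_size: int = 50,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield every suggested feature, requesting one page at a time.

    `filters` are passed through as query parameters, for example
    sort_by="relevance_score", sort_dir="desc", or signal_type="aggregate".
    """
    page = 1
    while True:
        response = client.get(
            f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
            params={"page": page, "page_size": page_size, **filters},
        )
        items = response.json().get("data", [])
        yield from items
        # Assumption: a short (or empty) page means there are no more pages.
        if len(items) < page_size:
            break
        page += 1
```

With the `client` helper from API Overview, `list(iter_suggested_features(client, feature_ideation_id, signal_type="aggregate"))` would gather all matching features into one list.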

Returns a paginated response. Each item in data contains:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Feature ID |
| feature_name | string | Feature name |
| feature_description | string | Human-readable description of what the feature computes |
| code | string | Python SDK code to reproduce this feature |
| relevance_score | float | Semantic relevance score (higher = more relevant to the use case) |
| relevance_explanation | string | Why this feature is relevant |
| predictive_power_score | float | Predictive power score from EDA |
| primary_entity | array | Entity names this feature is computed for |
| primary_table | array | Source table names used |
| signal_type | string | Feature engineering type (e.g., "aggregate", "transform", "lookup") |
| inputs_description | string | Description of input columns used |
| filter_description | string | Description of any filters applied |
| readiness | string | "NEW" for ideated features not yet saved to catalog |
| complexity | float | Feature engineering complexity score |
| dtype | string | Output data type (e.g., "FLOAT", "INT") |
| feature_type | string | "numerical" or "categorical" |
| is_naive_prediction | boolean | Whether this is a naive prediction baseline feature |
| suggested | boolean | true if generated by ideation |
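
These fields are enough to triage features locally without further API calls. The sketch below groups items by `signal_type` and keeps the one with the highest `predictive_power_score` in each group. The helper name `best_per_signal_type` is hypothetical, and ranking a missing or null score below every real score is an assumption.

```python
from typing import Any, Dict, Iterable


def _score(feature: Dict[str, Any]) -> float:
    """Return the predictive power score, ranking missing values last."""
    value = feature.get("predictive_power_score")
    # Assumption: a missing or null score ranks below any real score.
    return float("-inf") if value is None else value


def best_per_signal_type(
    features: Iterable[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Return the highest-scoring feature for each signal_type."""
    best: Dict[str, Dict[str, Any]] = {}
    for feature in features:
        signal_type = feature.get("signal_type") or "unknown"
        if signal_type not in best or _score(feature) > _score(best[signal_type]):
            best[signal_type] = feature
    return best
```

Iterating over `best_per_signal_type(features).items()` then gives one candidate per feature engineering type to inspect first.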

Get Feature SDK Code

The code field contains the Python SDK code to reproduce the feature:

# Get a specific feature's code
feature = features[0]
print(feature["code"])

Example output: a ratio feature that compares the 28-day average for the forecast weekday with the 182-day overall average (the naive baseline):

"""
SDK code to create ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday_
To_Naive_ITEM_STORE_Avg_of_sales_amounts_182cD

Feature description:
Ratio of the item-store's average sales amount on the same weekday (28 calendar days)
to the overall average sales amount (182 calendar days).
"""

import featurebyte as fb

# Activate catalog
catalog = fb.Catalog.activate("M5 Sales Amount Forecasting")

# Get view from SALES_AMOUNT time series table
sales_amount_view = catalog.get_view("SALES_AMOUNT")

# Extract day_of_week from the forecast point
context = fb.Context.get_by_id("69d5cf6d7791713392c73226")
forecast_day_of_week = context.get_forecast_point_feature().dt.day_of_week
forecast_day_of_week.name = "FORECAST_Day_Of_Week"

# Extract weekday from source table
sales_amount_view["Weekday"] = sales_amount_view["date"].dt.day_of_week

# Group by item_store, segmented by weekday
sales_amount_view_by_item_store = sales_amount_view.groupby(["item_store_id"])
sales_amount_view_by_item_store_by_weekday = sales_amount_view.groupby(
    ["item_store_id"], category="Weekday"
)

# Avg sales amount per weekday over 28 calendar days (dictionary feature)
avg_by_weekday_28cd = sales_amount_view_by_item_store_by_weekday.aggregate_over(
    "sales_amount",
    method="avg",
    feature_names=["ITEM_STORE_Avg_of_sales_amounts_by_Weekday_28cD"],
    windows=[fb.CalendarWindow(unit="DAY", size=28)],
)["ITEM_STORE_Avg_of_sales_amounts_by_Weekday_28cD"]

# Extract value matching the forecast weekday
avg_same_weekday_28cd = avg_by_weekday_28cd.cd.get_value(forecast_day_of_week)
avg_same_weekday_28cd.name = (
    "ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday"
)

# Avg sales amount over 182 calendar days (naive baseline)
avg_182cd = sales_amount_view_by_item_store.aggregate_over(
    "sales_amount",
    method="avg",
    feature_names=["ITEM_STORE_Avg_of_sales_amounts_182cD"],
    windows=[fb.CalendarWindow(unit="DAY", size=182)],
)["ITEM_STORE_Avg_of_sales_amounts_182cD"]

# Ratio: weekday-specific average / overall average
ratio = avg_same_weekday_28cd / avg_182cd
ratio.name = "ITEM_STORE_Avg_of_sales_amounts_28cD_same_Forecast_Weekday_To_Naive_ITEM_STORE_Avg_of_sales_amounts_182cD"

# Save and describe
ratio.save()
ratio.update_feature_type("numeric")
ratio.update_description(
    "Ratio of the item-store's average sales amount on the same weekday (28 calendar days) "
    "to the overall average sales amount (182 calendar days)."
)

This code can be executed directly in a Python environment with the FeatureByte SDK to recreate the feature, or modified to create variants (e.g., different windows, aggregation methods, or entity groupings).
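
One lightweight way to produce a variant is to edit the code string before running it. The sketch below swaps a day-based window size and the matching window token in feature names. `with_window_size` is a hypothetical helper, and it assumes the generated code follows the exact patterns in the example output above: `fb.CalendarWindow(unit="DAY", size=N)` and feature names that encode the window as `_NcD`. Free-text descriptions in the code (for example "28 calendar days") and local variable names are left unchanged, so review the result before executing it.

```python
import re


def with_window_size(code: str, old_size: int, new_size: int) -> str:
    """Return `code` with one day-based calendar window size swapped for another."""
    # Rewrite the window definition, e.g. CalendarWindow(unit="DAY", size=28).
    code = re.sub(
        rf'(CalendarWindow\(unit="DAY",\s*size=){old_size}\)',
        rf"\g<1>{new_size})",
        code,
    )
    # Rewrite the window token embedded in feature names, e.g. "_28cD".
    # Assumption: names encode the window as "_<size>cD".
    return re.sub(rf"_{old_size}cD(?![A-Za-z0-9])", f"_{new_size}cD", code)
```

For instance, `with_window_size(feature["code"], 28, 56)` would yield code for a 56-day variant of the weekday average while leaving the 182-day baseline untouched.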

Get Full Feature Metadata and Lineage

For a deeper understanding of how a feature is constructed, use the metadata endpoint:

suggested_feature_id = features[0]["id"]

response = client.get(
    f"/catalog/feature_ideation/suggested_feature_metadata/{suggested_feature_id}",
)
metadata = response.json()

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| sdk_code | string | Formatted Python SDK code to reproduce the feature |
| lineage | object | Complete construction graph (see below) |

Feature Lineage

The lineage object shows the step-by-step construction of the feature as a directed graph:

| Field | Type | Description |
| --- | --- | --- |
| lineage.nodes | array | Ordered list of operations in the feature construction |
| lineage.edges | array | Data flow connections between nodes |
| lineage.inputs | array | Input columns from source tables |

Each node in lineage.nodes contains:

| Field | Type | Description |
| --- | --- | --- |
| node_id | string | Unique node identifier |
| is_input | boolean | Whether this node represents a source column |
| node_type | string | Operation type (e.g., "groupby", "aggregate", "filter", "lookup") |
| title | string | Human-readable operation title |
| description | string | What this operation does |
| code | string | Python code for this specific step |
| output_type | string | Output type of this operation |
| output_name | string | Name of the output variable |

This is useful for understanding complex features that chain multiple operations (e.g., a ratio of two aggregations with a filter applied).
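
For a chained feature, a compact summary of the lineage can be easier to read than the raw graph. The sketch below uses only the node fields listed above. `summarize_lineage` is a hypothetical helper, and it assumes that the order of `lineage.nodes` is the order in which the steps run and that input nodes carry the column name in `output_name`.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_lineage(lineage: Dict[str, Any]) -> Dict[str, Any]:
    """Split lineage nodes into inputs and operations, and count operation types."""
    nodes: List[Dict[str, Any]] = lineage.get("nodes", [])
    input_nodes = [node for node in nodes if node.get("is_input")]
    steps = [node for node in nodes if not node.get("is_input")]
    return {
        # Assumption: input nodes expose the source column via output_name.
        "input_names": [node.get("output_name") for node in input_nodes],
        "step_titles": [node.get("title") for node in steps],
        "operation_counts": Counter(node.get("node_type") for node in steps),
        # Per-step code joined in node order (assumed to be execution order).
        "script": "\n".join(node["code"] for node in steps if node.get("code")),
    }
```

Calling `summarize_lineage(metadata.get("lineage", {}))` on the response above returns a plain dict, so the operation counts and the joined per-step code can be printed or logged directly.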

Preview Feature Values

Preview computed values for a feature on a sample observation set:

response = client.get(
    f"/catalog/feature_ideation/feature_preview/{suggested_feature_id}",
)
preview = response.json()

Example: Inspect and Modify a Feature

# 1. Find top features by relevance
response = client.get(
    f"/catalog/feature_ideation/{feature_ideation_id}/suggested_features",
    params={"page_size": 10, "sort_by": "relevance_score", "sort_dir": "desc"},
)
top_features = response.json()["data"]

# 2. Get the SDK code for the best feature
best = top_features[0]
print(f"Feature: {best['feature_name']}")
print(f"Relevance: {best['relevance_score']}")
print(f"Description: {best['feature_description']}")
print(f"\nCode:\n{best['code']}")

# 3. Get the full lineage to understand construction steps
response = client.get(
    f"/catalog/feature_ideation/suggested_feature_metadata/{best['id']}",
)
metadata = response.json()

lineage = metadata.get("lineage", {})
for node in lineage.get("nodes", []):
    if not node.get("is_input"):
        print(f"\nStep: {node['title']}")
        print(f"  {node['description']}")
        print(f"  Code: {node['code']}")