Feature Refinement

Get the Best Model from a Pipeline

Retrieve the model trained during ideation:

import featurebyte as fb

client = fb.Configurations().get_client()

# Find pipelines for your use case
response = client.get("/pipeline", params={"use_case_id": use_case_id})
pipeline = response.json()["data"][0]  # most recent
pipeline_id = pipeline["_id"]

# Get model IDs from the model-train step
response = client.get(f"/pipeline/{pipeline_id}")
pipeline_data = response.json()

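ml_model_id = None
for group in pipeline_data["groups"]:
    for step in group["steps"]:
        if step["step_type"] == "model-train" and step.get("ml_model_ids"):
            ml_model_id = step["ml_model_ids"][0]
            break
    if ml_model_id is not None:
        # The inner break only exits the step loop; stop scanning groups too,
        # so a later group cannot overwrite the first match.
        break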
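The lookup above can also be written as a small helper that returns as soon as it finds a match and yields `None` when no model-train step has a model attached. This is a minimal sketch: `find_first_model_id` is a hypothetical name, the payload shape (`groups`, `steps`, `step_type`, `ml_model_ids`) is taken from the snippet above rather than from a verified schema, and the example IDs are made up.

```python
from typing import Any, Optional


def find_first_model_id(pipeline_data: dict[str, Any]) -> Optional[str]:
    """Return the first model ID found on a model-train step, or None."""
    for group in pipeline_data.get("groups", []):
        for step in group.get("steps", []):
            if step.get("step_type") == "model-train" and step.get("ml_model_ids"):
                return step["ml_model_ids"][0]
    return None


# Illustrative payload with made-up IDs, mirroring the structure used above.
example_pipeline = {
    "groups": [
        {"steps": [{"step_type": "feature-ideation"}]},
        {"steps": [{"step_type": "model-train", "ml_model_ids": ["model-a", "model-b"]}]},
        {"steps": [{"step_type": "model-train", "ml_model_ids": ["model-c"]}]},
    ]
}

example_model_id = find_first_model_id(example_pipeline)
if example_model_id is None:
    raise ValueError("No trained model found in this pipeline")
print(example_model_id)
```

Returning from inside the nested loops avoids the need for a second `break`, and the explicit `None` check surfaces a pipeline that has not finished training before the ID is passed to another endpoint.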

Create a Feature List from Key Importance

Extract the top features by importance from the ideation model:

response = client.post(
    "/feature_list_from_model",
    json={
        "mode": "Feature key importance based",
        "ml_model_id": ml_model_id,
        "top_n": 200,
        "importance_threshold_percentage": 0.90,
    },
)

task_id = response.json()["id"]
task = wait_for_task(client, task_id)
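`wait_for_task` is used above but not defined on this page. The following is one way such a polling helper might look, not the documented implementation. It assumes, without confirmation from this page, that a task can be read at `GET /task/{task_id}`, that the body carries a `status` field with terminal values `"SUCCESS"` and `"FAILURE"`, and that the client returns `requests`-style responses; `poll_interval` and `timeout` are hypothetical parameters.

```python
import time
from typing import Any


def wait_for_task(
    client: Any,
    task_id: str,
    poll_interval: float = 2.0,
    timeout: float = 600.0,
) -> dict:
    """Poll a task until it reaches a terminal state and return its JSON body.

    Assumptions: the task is readable at GET /task/{task_id}, and its
    "status" field ends up as "SUCCESS" or "FAILURE".
    """
    deadline = time.monotonic() + timeout
    while True:
        response = client.get(f"/task/{task_id}")
        response.raise_for_status()  # assumes a requests-style response object
        task = response.json()
        status = task.get("status")
        if status == "SUCCESS":
            return task
        if status == "FAILURE":
            raise RuntimeError(f"Task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still {status!r} after {timeout} seconds")
        time.sleep(poll_interval)
```

Raising on failure and on timeout keeps a failed or stuck task from being mistaken for a finished one when later code reads the task payload.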

Parameters:

Parameter	Type	Required	Description
`mode`	string	Yes	Selection mode: `"Feature key importance based"` or `"Feature importance based"`
`ml_model_id`	string	Yes	ID of the trained model to extract features from
`top_n`	integer	No	Maximum number of feature keys to select (default: 200, max: 500)
`importance_threshold_percentage`	float	No	Cumulative importance threshold between 0 and 1 (default: 0.90)
`feature_list_name`	string	No	Custom name for the generated feature list

The endpoint selects features until either top_n is reached or the cumulative importance exceeds the threshold, whichever comes first. The "Feature key importance based" mode unbundles dictionary features and extracts the most important keys, creating a single feature for each selected key. The "Feature importance based" mode keeps dictionary features as-is and selects at the individual feature level.
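To make the stopping rule concrete, here is a small client-side sketch of how `top_n` and `importance_threshold_percentage` interact. It illustrates the rule described above and is not the service's actual implementation. `select_by_importance` and the example keys are hypothetical, and the sketch assumes that items are ranked by descending importance, that importances are normalized by their sum, and that the item which crosses the threshold is itself included.

```python
def select_by_importance(
    importances: dict[str, float],
    top_n: int = 200,
    threshold: float = 0.90,
) -> list[str]:
    """Pick items by descending importance until top_n or the cumulative threshold is hit."""
    total = sum(importances.values())
    if total <= 0:
        return []
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

    selected: list[str] = []
    cumulative = 0.0
    for name, value in ranked:
        # Stop as soon as either limit has been reached, whichever comes first.
        if len(selected) >= top_n or cumulative >= threshold:
            break
        selected.append(name)
        cumulative += value / total
    return selected


# Made-up importances that sum to 100.
example = {"key_a": 50.0, "key_b": 30.0, "key_c": 15.0, "key_d": 5.0}

by_threshold = select_by_importance(example, top_n=200, threshold=0.90)
by_top_n = select_by_importance(example, top_n=2, threshold=0.90)
print(by_threshold)
print(by_top_n)
```

With these numbers the cumulative shares are 0.50, 0.80, and 0.95, so the threshold of 0.90 stops selection after three keys, while `top_n=2` stops it after two.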

Inspect the Refined Feature List

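feature_list_from_model_id = task.get("payload", {}).get("output_document_id")
if feature_list_from_model_id is None:
    # Without this check the next request would target ".../None"
    raise RuntimeError("Task payload has no output_document_id")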

response = client.get(f"/feature_list_from_model/{feature_list_from_model_id}")
result = response.json()

feature_list_id = result["feature_list_id"]
feature_keys_count = result["feature_keys_created_count"]

# Get full feature list details
response = client.get(f"/feature_list/{feature_list_id}")
feature_list = response.json()
print(f"Features: {len(feature_list['feature_ids'])}")

Feature list from model response fields (GET /feature_list_from_model/{id}):

Field	Type	Description
`id`	string	Feature list from model ID
`mode`	string	Selection mode used
`ml_model_id`	string	Source model ID
`top_n`	integer	Maximum features requested
`importance_threshold_percentage`	float	Cumulative importance threshold used
`feature_keys_created_count`	integer	Number of feature keys selected
`features_selected_count`	integer	Number of individual features created
`feature_list_id`	string	ID of the generated feature list

Feature list response fields (GET /feature_list/{id}):

Field	Type	Description
`id`	string	Feature list ID
`name`	string	Feature list name
`feature_ids`	array	List of feature IDs in this list

Adding custom features

To augment a feature list with additional features (e.g., SDK-created features), use the SDK's FeatureList API. See the SDK Reference for details.