Feature Selection

Run Feature Selection

response = client.post(
    "/feature_selection",
    json={
        "feature_ideation_id": feature_ideation_id,
        "target_feature_count": 100,
        "mode": "GENAI_BASED",
    },
)
task_id = response.json()["id"]
task = wait_for_task(client, task_id)

feature_selection_id = task.get("payload", {}).get("output_document_id")
print(f"Feature selection: {feature_selection_id}")

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `feature_ideation_id` | string | Yes | ID of the feature ideation to select from |
| `feature_selection_name` | string | No | Custom name for the selection |
| `mode` | string | No | Selection mode: `"GENAI_BASED"` (default) or `"RULE_BASED"` |
| `target_feature_count` | integer | No | Target number of features to select (default: 50, max: 500) |
| `use_relevance_score` | boolean | No | Use semantic relevance scores in selection (default: `true`) |
| `use_predictive_power_score` | boolean | No | Use predictive power scores in selection (default: `true`) |
| `remove_redundant_features` | boolean | No | Remove highly correlated features (default: `true`) |
| `remove_dictionary_and_vector` | boolean | No | Exclude dictionary and vector features (default: `true`) |
| `remove_low_added_value_features` | boolean | No | Remove features with low marginal value (default: `true`) |
| `keep_always_observation_features` | boolean | No | Always keep features from the observation table (default: `true`) |
| `rule` | object | No | Rule-based selection parameters (only when `mode` is `"RULE_BASED"`) |
| `rule.top_n_overall` | integer | No | Maximum features overall (default: 100) |
| `rule.top_m_per_theme` | integer | No | Maximum features per theme (default: 5) |
| `rule.logic_operator` | string | No | `"OR"` (default) or `"AND"`: how to combine `top_n_overall` and `top_m_per_theme` |
| `observation_table_id` | string | No | Observation table for SHAP evaluation |
| `feature_ids` | array | No | Restrict selection to specific feature IDs |
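
To illustrate the `rule` object, the sketch below builds a request body for `"RULE_BASED"` mode from the parameters listed above. The helper name `build_rule_based_payload` and its client-side checks (including the requirement that both limits be positive) are assumptions for illustration. They are not part of the API, and the server's own validation may differ.

```python
def build_rule_based_payload(
    feature_ideation_id,
    top_n_overall=100,
    top_m_per_theme=5,
    logic_operator="OR",
):
    """Hypothetical helper: build a JSON body for a RULE_BASED selection."""
    # The parameter table lists "OR" and "AND" as the only logic operators.
    if logic_operator not in ("OR", "AND"):
        raise ValueError(
            f"logic_operator must be 'OR' or 'AND', got {logic_operator!r}"
        )
    # Assumption: non-positive limits are never meaningful.
    if top_n_overall < 1 or top_m_per_theme < 1:
        raise ValueError("top_n_overall and top_m_per_theme must be positive")
    return {
        "feature_ideation_id": feature_ideation_id,
        "mode": "RULE_BASED",
        "rule": {
            "top_n_overall": top_n_overall,
            "top_m_per_theme": top_m_per_theme,
            "logic_operator": logic_operator,
        },
    }


# Placeholder ID; substitute the ID of a real feature ideation.
payload = build_rule_based_payload(
    "<feature_ideation_id>", top_m_per_theme=3, logic_operator="AND"
)
```

The resulting dictionary can be passed as `json=payload` to the same `client.post("/feature_selection", ...)` call used for the GenAI-based mode.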

Get Selection Results

response = client.get(f"/feature_selection/{feature_selection_id}")
selection = response.json()

print(f"Candidates: {selection['nb_candidates']}")
print(f"Selected: {selection['nb_selected']}")

Response fields:

| Field | Type | Description |
| --- | --- | --- |
| `id` | string | Feature selection ID |
| `nb_candidates` | integer | Number of candidate features evaluated |
| `nb_selected` | integer | Number of features selected |
| `feature_ids` | array | IDs of selected features |
| `data` | array | Per-feature results with `feature_name`, `selection_rank`, `selection_rationale` |
| `signal_range` | string | Signal range description |
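
The per-feature entries in `data` can be listed in rank order. The sketch below is one way to do that. It assumes each entry is a dictionary with the three keys named above and that `selection_rank` is a number where lower means better; neither detail is confirmed here. The helper name `ranked_features` and the sample values are made up for illustration.

```python
def ranked_features(selection):
    """Return (rank, name, rationale) tuples ordered by selection_rank."""
    rows = selection.get("data") or []
    # Assumption: a rank may be missing; such entries are sorted last.
    ordered = sorted(
        rows,
        key=lambda row: (
            row.get("selection_rank") is None,
            row.get("selection_rank") or 0,
        ),
    )
    return [
        (
            row.get("selection_rank"),
            row.get("feature_name"),
            row.get("selection_rationale"),
        )
        for row in ordered
    ]


# Made-up sample shaped like the documented response fields.
sample_selection = {
    "data": [
        {"feature_name": "feature_b", "selection_rank": 2,
         "selection_rationale": "example rationale"},
        {"feature_name": "feature_a", "selection_rank": 1,
         "selection_rationale": "example rationale"},
    ]
}

for rank, name, rationale in ranked_features(sample_selection):
    print(f"{rank}: {name} ({rationale})")
```

In practice, the `selection` dictionary returned by `response.json()` would be passed in place of `sample_selection`.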

Create Feature List from Selection

Create a feature list containing the selected features:

response = client.post(f"/feature_selection/{feature_selection_id}/feature_list")
task_id = response.json()["id"]
task = wait_for_task(client, task_id)

feature_list_id = task.get("payload", {}).get("output_document_id")
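
Both task-based calls on this page read the result ID with `task.get("payload", {}).get("output_document_id")`, which evaluates to `None` without any error when the field is absent. The sketch below raises an exception instead, so a missing ID surfaces immediately. The helper name `get_output_document_id` is hypothetical, and it assumes `wait_for_task` returns a plain dictionary, as the snippets above suggest.

```python
def get_output_document_id(task):
    """Return payload.output_document_id from a finished task, or raise."""
    # `or {}` also covers a payload that is present but set to None.
    document_id = (task.get("payload") or {}).get("output_document_id")
    if not document_id:
        raise ValueError("task has no payload.output_document_id")
    return document_id


# Made-up task dictionary shaped like the one used above.
example_task = {"payload": {"output_document_id": "<feature_list_id>"}}
feature_list_id = get_output_document_id(example_task)
print(f"Feature list: {feature_list_id}")
```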