API Overview¶
All operations documented here can also be performed through the FeatureByte User Interface. The API is useful for automation, scripting, and CI/CD integration.
The REST API provides access to operations that are not available through the Python SDK. While the SDK covers catalog setup, table registration, entity creation, and other foundational operations, the API is required for:
- Pipeline orchestration — creating and running feature ideation pipelines
- Forecast automation — generating observation tables for forecast use cases
- Model training and evaluation — training models, resolving templates, leaderboards, evaluation plots
- Batch predictions — generating and downloading prediction tables
- Deployment — creating deployments, generating deployment SQL
- Source table analysis — AI-powered table type detection and summaries
- Semantic detection — automatic column semantic type detection
Quick start
For an end-to-end walkthrough, start with the API tutorials:
- Credit Default Tutorial — binary classification with 7 tables, covering the full workflow from setup to deployment
- Store Sales Forecast Tutorial — time series forecasting with forecast automation
Getting an API Client¶
The client object handles authentication and provides get, post, patch, and delete methods. All request and response bodies use JSON.
Asynchronous Tasks¶
Many API operations are asynchronous. They return a task ID immediately, and you poll for completion:
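import time

def wait_for_task(client, task_id, poll_interval=30):
    """Poll a task until it reaches a terminal state. Returns the full task response."""
    # "REVOKED" is also terminal; without it, a revoked task would be polled forever.
    terminal_states = ("SUCCESS", "FAILURE", "REVOKED")
    while True:
        task = client.get(f"/task/{task_id}").json()
        if task["status"] in terminal_states:
            return task
        time.sleep(poll_interval)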
This helper is used throughout the examples in this documentation.
Task Response¶
All async endpoints return a Task object. When polling GET /task/{task_id}, the response contains:
| Field | Type | Description |
|---|---|---|
| id | string | Task ID |
| status | string | "PENDING", "STARTED", "SUCCESS", "FAILURE", "REVOKED" |
| payload | object | Task parameters. On success, payload.output_document_id contains the created resource ID |
| traceback | string | Error details if status is "FAILURE" |
| start_time | datetime | When the task started |
| date_done | datetime | When the task completed |
| progress | object | Current progress metrics |
| child_task_ids | array | IDs of child tasks spawned by this task |
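Once a task has finished, the fields above are enough to either pick up the created resource or surface the error. The following is a minimal sketch, not part of the API itself: `task_output_id` and `TaskFailedError` are hypothetical names, and the function only assumes the task dictionary has the `status`, `payload`, and `traceback` fields described in the table.

```python
class TaskFailedError(RuntimeError):
    """Hypothetical error type for tasks that did not end in SUCCESS."""


def task_output_id(task):
    """Return payload.output_document_id for a successful task, else raise."""
    status = task["status"]
    if status == "SUCCESS":
        return task["payload"]["output_document_id"]
    if status == "FAILURE":
        # The traceback field carries the error details for failed tasks.
        raise TaskFailedError(task.get("traceback") or "task failed without a traceback")
    # Covers REVOKED as well as tasks that are still PENDING or STARTED.
    raise TaskFailedError(f"task did not succeed (status: {status})")
```

A typical call would pass in the dictionary returned by polling GET /task/{task_id}.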
Paginated List Response¶
All list endpoints (GET /catalog/*, GET /pipeline, etc.) return a paginated response:
| Field | Type | Description |
|---|---|---|
| data | array | Items for the current page |
| page | integer | Current page number (1-based) |
| page_size | integer | Number of items per page |
| total | integer | Total number of matching items |
Pass page and page_size as query parameters to control pagination. Example:
response = client.get("/catalog/ml_model", params={"page": 1, "page_size": 20})
result = response.json()
items = result["data"]
total = result["total"]
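To collect every item rather than a single page, the page and total fields can drive a simple loop. This is a sketch under the assumption that the client's get method accepts a params argument as in the example above; `iter_all_items` is a hypothetical helper name, not part of the API.

```python
def iter_all_items(client, path, page_size=100):
    """Yield every item from a paginated list endpoint, one page at a time."""
    page = 1  # pages are 1-based
    while True:
        result = client.get(path, params={"page": page, "page_size": page_size}).json()
        items = result["data"]
        yield from items
        # Stop once the pages fetched so far cover the reported total,
        # or if a page comes back empty.
        if not items or page * page_size >= result["total"]:
            break
        page += 1
```

For example, `list(iter_all_items(client, "/catalog/ml_model"))` would gather all items from that endpoint into one list.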
Displaying Plots¶
Several API endpoints return interactive Bokeh plots as self-contained HTML — all CSS, JavaScript, and data are embedded directly with no external dependencies. There are two response shapes depending on the endpoint:
Single plot (evaluation plots, forecast comparison): the response contains a content field with the HTML string.
Plot list (table EDA, feature EDA): the response is an array of plot objects, each with a nested plots array.
plots = response.json()
html_contents = [
    p["content"]
    for plot in plots
    for p in plot.get("plots", [])
    if "content" in p
]
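When the same code has to handle both response shapes, it can help to normalize them into one list of HTML strings. The helper below is a sketch with a hypothetical name (`extract_plot_html`); it assumes a single-plot response is a JSON object with a content field and a plot-list response is a JSON array, as described above.

```python
def extract_plot_html(body):
    """Return a list of HTML strings from either plot response shape."""
    if isinstance(body, dict):
        # Single plot: the HTML is in the top-level "content" field.
        return [body["content"]] if "content" in body else []
    # Plot list: each entry carries a nested "plots" array.
    return [
        p["content"]
        for plot in body
        for p in plot.get("plots", [])
        if "content" in p
    ]
```

Calling `extract_plot_html(response.json())` then gives a uniform list regardless of which endpoint produced the response.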
Display in a Jupyter Notebook¶
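One way to render the HTML string inline is through IPython's display machinery. This is a minimal sketch, assuming IPython is installed in the notebook environment; `show_plot` is a hypothetical helper name. Depending on the notebook front end, embedded scripts may behave differently, so treat this as a starting point.

```python
def show_plot(html_content):
    """Render a self-contained HTML plot inline in a Jupyter notebook."""
    # Imported lazily so this code can still be loaded outside a notebook.
    from IPython.display import HTML, display

    display(HTML(html_content))
```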
Save to an HTML File¶
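Because the plot is a plain HTML string, writing it to disk needs only the standard library. The sketch below uses a hypothetical helper name (`save_plot_html`) and a default file name chosen purely for illustration.

```python
from pathlib import Path


def save_plot_html(html_content, path="plot.html"):
    """Write the plot's HTML string to a file and return the file path."""
    out = Path(path)
    # UTF-8 keeps any non-ASCII labels or tooltips intact.
    out.write_text(html_content, encoding="utf-8")
    return out
```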
Open the file in a browser to interact with the plot — Bokeh plots support pan, zoom, reset, and hover tooltips out of the box. Some plots also include interactive widgets (e.g., dynamic rebinning) that update in the browser without a server round-trip.
Embed in a Web Application¶
Since the response is self-contained HTML, you can embed it directly in an iframe:
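A possible approach, sketched here in Python for a server-rendered page, is to place the HTML in the iframe's srcdoc attribute. The function name and the default dimensions are assumptions for illustration; the important detail is that the HTML must be attribute-escaped before it is inserted.

```python
import html


def iframe_for_plot(html_content, width="100%", height=600):
    """Build an <iframe> tag that embeds the plot HTML via the srcdoc attribute."""
    # Escape &, <, > and quotes so the document is a valid attribute value.
    escaped = html.escape(html_content, quote=True)
    return (
        f'<iframe srcdoc="{escaped}" width="{width}" height="{height}" '
        f'style="border: none;"></iframe>'
    )
```

The returned string can be placed in a page template; the browser unescapes srcdoc and renders the plot in its own isolated document.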
Tutorials¶
For end-to-end walkthroughs using these APIs, see the API tutorials:
- Credit Default Tutorial — binary classification with 7 tables, full pipeline
- Store Sales Forecast Tutorial — time series forecasting with forecast automation
Next Steps¶
- Source Data Exploration — analyze source tables and generate AI summaries
- Table EDA — run EDA and column analysis on registered tables
- Observation Table Automation — forecast automation and filtered observation tables
- Semantic Detection — automatic column semantic type detection
- Development Dataset — create development plans and datasets for ideation
- Entity Selection — configure which entities to use for feature generation
- Feature Ideation Pipelines — automated feature engineering
- Ideation Configuration — configure pipeline steps (training, entity selection, filters, EDA, etc.)
- Ideated Features — retrieve feature definitions, SDK code, and lineage
- Feature EDA — analyze feature distributions and target relationships
- Feature Selection — SHAP-based feature selection with custom parameters
- Feature Refinement — extract and refine feature lists from trained models
- Model Training — train models with the API
- Batch Predictions — generate and download prediction tables
- Evaluation — leaderboards and evaluation plots
- Deployment — create deployments and generate deployment SQL
- API Reference — complete endpoint reference