featurebyte.Feature.create_new_version

create_new_version(
table_feature_job_settings: Optional[List[TableFeatureJobSetting]]=None,
table_cleaning_operations: Optional[List[TableCleaningOperation]]=None
) -> Feature

Description

Creates a new version of the feature from this Feature object. The new version replaces the current feature's feature job settings (if provided) and table cleaning operations (if provided).

Parameters

  • table_feature_job_settings: Optional[List[TableFeatureJobSetting]]
    List of table feature job settings to apply to the feature. Each item in the list represents a specific feature job setting for a table, which is created using the TableFeatureJobSetting constructor. This constructor takes the table name and the desired feature job setting as input. The setting should only be applied to tables that originally contained the timestamp column used in the GroupBy.aggregate_over operation for the feature. If the operation was performed on an item table, use the name of the related event table, as the event timestamp is sourced from there.

  • table_cleaning_operations: Optional[List[TableCleaningOperation]]
    List of table cleaning operations to apply to the feature. Each item in the list represents the cleaning operations for a specific table, which is created using the TableCleaningOperation constructor. This constructor takes the table name and the cleaning operations for that table as input. The cleaning operations for each table are represented as a list, where each item defines the cleaning operations for a specific column. The association between a column and its cleaning operations is established using the ColumnCleaningOperation constructor.

Returns

  • Feature
    New feature version created from the provided feature job settings and table cleaning operations.

Raises

  • RecordCreationException
    Raised when the new version cannot be saved, for example because the resulting feature is identical to the current one. This can happen when the provided feature job settings and table cleaning operations are irrelevant to the current feature.

Examples

Check the feature job setting of this feature first:

>>> import featurebyte as fb
>>> feature = catalog.get_feature("InvoiceAmountAvg_60days")
>>> feature.info()["table_feature_job_setting"]
{'this': [{'table_name': 'GROCERYINVOICE',
   'feature_job_setting': {'blind_spot': '0s',
    'period': '3600s',
    'offset': '90s',
    'execution_buffer': '0s'}}],
 'default': [{'table_name': 'GROCERYINVOICE',
   'feature_job_setting': {'blind_spot': '0s',
    'period': '3600s',
    'offset': '90s',
    'execution_buffer': '0s'}}]}

Create a new version of the feature with a different feature job setting:

>>> new_feature = feature.create_new_version(
...   table_feature_job_settings=[
...     fb.TableFeatureJobSetting(
...       table_name="GROCERYINVOICE",
...       feature_job_setting=fb.FeatureJobSetting(
...         blind_spot="60s",
...         period="3600s",
...         offset="90s",
...       )
...     )
...   ]
... )

>>> new_feature.info()["table_feature_job_setting"]
{'this': [{'table_name': 'GROCERYINVOICE',
   'feature_job_setting': {'blind_spot': '60s',
    'period': '3600s',
    'offset': '90s',
    'execution_buffer': '0s'}}],
 'default': [{'table_name': 'GROCERYINVOICE',
   'feature_job_setting': {'blind_spot': '0s',
    'period': '3600s',
    'offset': '90s',
    'execution_buffer': '0s'}}]}
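
In the output above, 'this' holds the settings of the new version and 'default' still holds the original ones. A minimal sketch for listing the differences, assuming only the dictionary layout shown in the output; `changed_job_settings` is a hypothetical helper, not part of featurebyte:

```python
from typing import Any, Dict, List, Tuple


def changed_job_settings(
    setting_info: Dict[str, List[Dict[str, Any]]]
) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Per table, map each differing field to a (default, this) pair."""
    defaults = {
        entry["table_name"]: entry["feature_job_setting"]
        for entry in setting_info.get("default", [])
    }
    changes = {}
    for entry in setting_info.get("this", []):
        table = entry["table_name"]
        baseline = defaults.get(table, {})
        diff = {
            key: (baseline.get(key), value)
            for key, value in entry["feature_job_setting"].items()
            if baseline.get(key) != value
        }
        if diff:
            changes[table] = diff
    return changes


# Dictionary copied from the new_feature.info() output shown above.
setting_info = {
    "this": [{"table_name": "GROCERYINVOICE",
              "feature_job_setting": {"blind_spot": "60s", "period": "3600s",
                                      "offset": "90s", "execution_buffer": "0s"}}],
    "default": [{"table_name": "GROCERYINVOICE",
                 "feature_job_setting": {"blind_spot": "0s", "period": "3600s",
                                         "offset": "90s", "execution_buffer": "0s"}}],
}

print(changed_job_settings(setting_info))
# {'GROCERYINVOICE': {'blind_spot': ('0s', '60s')}}
```

With a live feature, the same helper could be called as `changed_job_settings(new_feature.info()["table_feature_job_setting"])`.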

Check the table cleaning operations of this feature first:

>>> feature = catalog.get_feature("InvoiceAmountAvg_60days")
>>> feature.info()["table_cleaning_operation"]
{'this': [], 'default': []}

Create a new version of the feature with different table cleaning operations:

>>> new_feature = feature.create_new_version(
...   table_cleaning_operations=[
...     fb.TableCleaningOperation(
...       table_name="GROCERYINVOICE",
...       column_cleaning_operations=[
...         fb.ColumnCleaningOperation(
...           column_name="Amount",
...           cleaning_operations=[fb.MissingValueImputation(imputed_value=0.0)],
...         )
...       ],
...     )
...   ]
... )

>>> new_feature.info()["table_cleaning_operation"]
{'this': [{'table_name': 'GROCERYINVOICE',
   'column_cleaning_operations': [{'column_name': 'Amount',
     'cleaning_operations': [{'imputed_value': 0.0, 'type': 'missing'}]}]}],
 'default': []}
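
The nested output above can be hard to scan when several tables and columns are involved. A minimal sketch that flattens it into (table, column, operation type) rows, assuming only the dictionary layout shown in the output; `list_cleaning_operations` is a hypothetical helper, not part of featurebyte:

```python
from typing import Any, Dict, List, Tuple


def list_cleaning_operations(
    cleaning_info: Dict[str, List[Dict[str, Any]]], key: str = "this"
) -> List[Tuple[str, str, str]]:
    """Flatten one side ('this' or 'default') into (table, column, type) rows."""
    rows = []
    for table in cleaning_info.get(key, []):
        for column in table["column_cleaning_operations"]:
            for operation in column["cleaning_operations"]:
                rows.append(
                    (table["table_name"], column["column_name"], operation["type"])
                )
    return rows


# Dictionary copied from the new_feature.info() output shown above.
cleaning_info = {
    "this": [{"table_name": "GROCERYINVOICE",
              "column_cleaning_operations": [
                  {"column_name": "Amount",
                   "cleaning_operations": [{"imputed_value": 0.0,
                                            "type": "missing"}]}]}],
    "default": [],
}

print(list_cleaning_operations(cleaning_info))
# [('GROCERYINVOICE', 'Amount', 'missing')]
print(list_cleaning_operations(cleaning_info, key="default"))
# []
```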

Check the tables used by this feature first:

>>> feature = catalog.get_feature("InvoiceAmountAvg_60days")
>>> feature.info()["tables"]
[{'name': 'GROCERYINVOICE', 'status': 'PUBLIC_DRAFT', 'catalog_name': 'grocery'}]

Create a new version of the feature with irrelevant table cleaning operations (for example, the specified table name or column name is not used by the feature):

>>> feature.create_new_version(
...   table_cleaning_operations=[
...     fb.TableCleaningOperation(
...       table_name="GROCERYPRODUCT",
...       column_cleaning_operations=[
...         fb.ColumnCleaningOperation(
...           column_name="GroceryProductGuid",
...           cleaning_operations=[fb.MissingValueImputation(imputed_value=0)],
...         )
...       ],
...     )
...   ]
... )
Traceback (most recent call last):
...
featurebyte.exception.RecordCreationException:
Table cleaning operation(s) does not result a new feature version.
This is because the new feature version is the same as the source feature.
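
One way to reduce the chance of hitting this exception is to compare the table names in the cleaning operations against the tables listed by `feature.info()["tables"]` before calling `create_new_version`. The sketch below is an illustration under stated assumptions: `split_by_used_tables` is a hypothetical helper, it assumes each cleaning operation object exposes a `table_name` attribute (the examples use stand-in objects so that the block runs without a featurebyte session), and it checks table names only, not column names, so passing the check does not guarantee that a new version will be created.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Tuple


def split_by_used_tables(
    table_cleaning_operations: Iterable[Any],
    tables_info: List[Dict[str, Any]],
) -> Tuple[List[Any], List[Any]]:
    """Split operations into (relevant, irrelevant) by table name only."""
    used_tables = {table["name"] for table in tables_info}
    relevant, irrelevant = [], []
    for operation in table_cleaning_operations:
        # Assumption: the object has a `table_name` attribute.
        if operation.table_name in used_tables:
            relevant.append(operation)
        else:
            irrelevant.append(operation)
    return relevant, irrelevant


# List copied from the feature.info()["tables"] output shown above.
tables_info = [
    {"name": "GROCERYINVOICE", "status": "PUBLIC_DRAFT", "catalog_name": "grocery"}
]

# Stand-ins for cleaning operation objects; only `table_name` is used here.
operations = [
    SimpleNamespace(table_name="GROCERYINVOICE"),
    SimpleNamespace(table_name="GROCERYPRODUCT"),
]

relevant, irrelevant = split_by_used_tables(operations, tables_info)
print([op.table_name for op in relevant])    # ['GROCERYINVOICE']
print([op.table_name for op in irrelevant])  # ['GROCERYPRODUCT']
```

If `relevant` comes back empty, the call would most likely fail as in the example above, so it can be skipped.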

See Also