forml.pipeline.wrap

Decorators for creating operators and actors by wrapping generic (non-ForML) implementations.

Instead of creating ForML actors and/or operators by fully implementing their relevant base classes, they can (in special cases) be conveniently defined using the wrappers provided within this module.

Module Attributes

forml.pipeline.wrap.AUTO = [AutoSklearnTransformer, AutoSklearnClassifier, AutoSklearnRegressor]

The default list of auto-wrapper implementations to be used by the wrap.importer context manager.

Functions

forml.pipeline.wrap.importer(*wrappers: wrap.Auto) → Iterable[None][source]

Context manager capturing all direct imports and wrapping their matching entities using the explicit or default list of auto-wrappers.

The signature of the wrapped object is compatible with the original entity.

Parameters:
*wrappers: wrap.Auto

Sequence of auto-wrapper implementations to be matched and potentially (if compatible) applied to the discovered wrapping candidates. If no explicit value is provided, the default wrap.AUTO list of auto-wrapper implementations is used.

Returns:

Context manager under which the direct imports become subject to auto-wrapping.

Examples

All three possible import syntax alternatives are supported, although only the first one is recommended:

with wrap.importer():
    # 1. auto-wrap just the explicit members (recommended):
    from sklearn.ensemble import GradientBoostingClassifier

    # 2. auto-wrap all members discovered in ensemble.*
    #    (not recommended - unnecessarily heavy)
    from sklearn import ensemble

    # 3. similar but without the namespace
    #    (even less recommended - heavy and dirty)
    from sklearn.ensemble import *
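
Conceptually, capturing direct imports can be sketched by temporarily replacing builtins.__import__. The following is an illustrative plain-Python mechanism only, not ForML's actual implementation (the capturing_importer name and the callback parameter are assumptions for this sketch):

```python
import builtins
import contextlib

@contextlib.contextmanager
def capturing_importer(callback):
    """Illustrative sketch of import capturing (NOT ForML's actual
    implementation): temporarily replace builtins.__import__ so that every
    module imported inside the context can be post-processed."""
    original = builtins.__import__

    def hook(name, *args, **kwargs):
        module = original(name, *args, **kwargs)
        callback(module)  # here ForML would wrap matching entities in-place
        return module

    builtins.__import__ = hook
    try:
        yield
    finally:
        builtins.__import__ = original  # always restore the original importer

seen = []
with capturing_importer(lambda module: seen.append(module.__name__)):
    import json  # captured and reported by the hook

print(seen)  # 'json' appears among the captured module names
```

The real wrap.importer additionally consults the auto-wrappers' match() methods so that only compatible entities get wrapped.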

Example use-case importing the sklearn.ensemble.RandomForestClassifier classifier wrapped as a ForML operator that can be directly used within a pipeline composition expression:

>>> from forml import flow
>>> from forml.pipeline import wrap
>>>
>>> with wrap.importer():
...     from sklearn.ensemble import RandomForestClassifier
...
>>> RFC = RandomForestClassifier(n_estimators=30, max_depth=10)
>>> isinstance(RFC, flow.Operator)
True
>>> PIPELINE = preprocessing.Prepare() >> RFC

Classes

class forml.pipeline.wrap.Actor(*args, **kwargs)[source]

Bases: object

Central class providing decorators/wrappers for creating ForML Actors in a number of convenient ways without having to fully implement the flow.Actor base class from scratch.

Decorator Methods

apply(origin)[source]

Decorator for turning a given plain function into a stateless Actor.

Parameters:
origin

Decorated function.

The function must have one of the following signatures:

def foo(*features: flow.Features) -> flow.Result:
def foo(features: flow.Features) -> flow.Result:
def foo(*features: flow.Features, opt1, optN=None) -> flow.Result:
def foo(features: flow.Features, *, opt1, optN=None) -> flow.Result:
def foo(*features: flow.Features, opt1, **kwargs) -> flow.Result:
def foo(features: flow.Features, /, *, opt1, **kwargs) -> flow.Result:

Attention

The optional arguments opt1, optN, and **kwargs must all be keyword-only arguments.

Returns:

A stateless Actor class with the given apply logic.

Examples

Simple stateless imputation actor using the provided value to fill the NaNs:

@wrap.Actor.apply
def StaticImpute(
    df: pandas.DataFrame,
    *,
    column: str,
    value: float,
) -> pandas.DataFrame:
    df[column] = df[column].fillna(value)
    return df

train(origin)[source]

Decorator for turning a given plain function into a follow-up apply function decorator.

Stateful actors need to have distinct implementations for their train vs apply modes. This wrapping facility achieves that by decorating two companion functions each implementing the relevant mode.

Parameters:
origin

Decorated train function.

The decorated train function must have one of the following signatures:

def foo(state: typing.Optional[State], features: flow.Features, labels: flow.Labels) -> State:
def foo(state: typing.Optional[State], features: flow.Features, labels: flow.Labels, opt1, optN=None) -> State:
def foo(state: typing.Optional[State], features: flow.Features, labels: flow.Labels, /, opt1, **kwargs) -> State:

The function will receive the previous state as the first parameter and is expected to provide the new state instance as its return value.

Returns:

Follow-up decorator to be used for wrapping the companion apply function which eventually returns a stateful Actor class with the given train-apply logic.

The decorated apply function must have one of the following signatures:

def foo(state: State, features: flow.Features) -> flow.Result:
def foo(state: State, features: flow.Features, opt1, optN=None) -> flow.Result:
def foo(state: State, features: flow.Features, /, opt1, **kwargs) -> flow.Result:

The function will receive the current state as the first parameter and is expected to provide the apply-mode transformation result.

Examples

Simple stateful imputation actor using the trained mean value to fill the NaNs:

@wrap.Actor.train  # starting with wrapping the train-mode function
def MeanImpute(
    state: typing.Optional[float],  # receiving the previous state (not used)
    features: pandas.DataFrame,
    labels: pandas.Series,
    *,
    column: str,
) -> float:
    return features[column].mean()  # returning the new state

@MeanImpute.apply  # continue with the follow-up apply-mode function decorator
def MeanImpute(
    state: float,  # receiving current state
    features: pandas.DataFrame,
    *,
    column: str
) -> pandas.DataFrame:
    features[column] = features[column].fillna(state)
    return features  # apply-mode result
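
The train/apply state hand-over shown above can be simulated in plain Python. The following is an illustrative sketch of the contract only (no forml or pandas dependency; the driver lines stand in for the runtime, and None stands in for a missing value):

```python
# Illustrative sketch of the train/apply state hand-over (plain Python):
from statistics import mean

def mean_impute_train(state, features, labels, *, column):
    # return the new state: the mean of the non-missing column values
    return mean(v for v in features[column] if v is not None)

def mean_impute_apply(state, features, *, column):
    # fill missing values using the trained state
    features[column] = [state if v is None else v for v in features[column]]
    return features

# hypothetical runtime driver: train first, then apply with the learned state
state = mean_impute_train(None, {'x': [1.0, 3.0]}, labels=None, column='x')
result = mean_impute_apply(state, {'x': [1.0, 3.0, None]}, column='x')
print(result['x'])  # the missing value is replaced by the mean 2.0
```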

type(origin=None, /, *, apply=None, train=None, get_params=None, set_params=None)[source]

Wrapper for turning an external user class into a valid Actor.

This can be used either as a parameterless decorator or with an explicit mapping of Actor methods to the decorated user class implementation.

Parameters:
origin=None

Decorated class.

apply=None

Target method name or decorator function implementing the actor apply logic.

train=None

Target method name or decorator function implementing the actor train logic.

get_params=None

Target method name or decorator function implementing the actor get_params logic.

set_params=None

Target method name or decorator function implementing the actor set_params logic.

Returns:

Actor class.

Examples

>>> RfcActor = wrap.Actor.type(
...     sklearn.ensemble.RandomForestClassifier,
...     train='fit',
...     apply=lambda c, *a, **kw: c.predict_proba(*a, **kw).transpose()[-1],
... )
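
The method-name mapping idea behind this wrapper can be sketched in plain Python without forml or sklearn (the Scaler class and the make_actor helper are hypothetical, for illustration only):

```python
class Scaler:
    """Hypothetical external class with non-standard method names."""

    def fit(self, values):
        self.factor = max(values)

    def transform(self, values):
        return [v / self.factor for v in values]


def make_actor(cls, *, train, apply):
    """Hypothetical analogue of wrap.Actor.type: map method names (or plain
    callables) onto the standard train/apply interface."""
    def resolve(target):
        # a string names an existing method; a callable is used directly
        return getattr(cls, target) if isinstance(target, str) else target

    return type(cls.__name__ + 'Actor', (cls,), {
        'train': resolve(train),
        'apply': resolve(apply),
    })


ScalerActor = make_actor(Scaler, train='fit', apply='transform')
actor = ScalerActor()
actor.train([1.0, 2.0, 4.0])  # delegates to Scaler.fit
print(actor.apply([2.0]))  # delegates to Scaler.transform -> [0.5]
```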

class forml.pipeline.wrap.Auto[source]

Bases: Generic[Entity], ABC

Generic base class for auto-wrapper implementations.

If supplied to the wrap.importer() context manager when capturing the imports, each discovered entity within the imported namespace is checked against the auto-wrapper using its match() method and if compatible it gets wrapped in-place using its apply() method.

Each auto-wrapper needs to implement the following methods:

match(entity)[source]

Check whether this wrapper is capable of wrapping the given entity into a ForML operator.

Parameters:
entity

Wrapping candidate subject.

Returns:

True if this wrapper is capable of wrapping the entity.

apply(entity)[source]

Actual wrapping implementation.

Parameters:
entity

Wrapping subject.

Returns:

ForML operator-type-like callable compatible with the signature of the wrapped entity.
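
The match()/apply() protocol can be illustrated with a standalone class. This is a plain-Python stand-in only, not an actual wrap.Auto subclass (the LoggingAutoWrapper name and the call-recording behavior are assumptions for this sketch):

```python
class LoggingAutoWrapper:
    """Hypothetical stand-in mimicking the wrap.Auto protocol: matches any
    callable and wraps it in a call-recording proxy."""

    def __init__(self):
        self.calls = []

    def match(self, entity):
        # True if this wrapper is capable of wrapping the given entity
        return callable(entity)

    def apply(self, entity):
        # return a callable compatible with the wrapped entity's signature
        def wrapped(*args, **kwargs):
            self.calls.append(entity.__name__)
            return entity(*args, **kwargs)
        return wrapped


wrapper = LoggingAutoWrapper()
assert wrapper.match(len) and not wrapper.match(42)

wrapped_len = wrapper.apply(len)
print(wrapped_len('abc'))  # behaves like the original entity -> 3
print(wrapper.calls)  # ['len']
```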

class forml.pipeline.wrap.AutoSklearnTransformer(apply: str | Callable[[...], Any] = 'transform')[source]

Bases: AutoClass[type[TransformerMixin]]

Auto-wrapper for turning Scikit-learn transformers into ForML operators.

Instances can be used with wrap.importer to auto-wrap Scikit-learn transformers upon importing.

Hint

Supports not just the official Scikit-learn transformers but any sklearn.base.TransformerMixin subclasses including 3rd party implementations.

Parameters:
apply: str | Callable[[...], Any] = 'transform'

Customizable mapping for the apply-mode target endpoint. Defaults to the 'transform' literal.

class forml.pipeline.wrap.AutoSklearnClassifier(apply: str | collections.abc.Callable[..., Any] = predict_proba[-1])[source]

Bases: AutoClass[type[ClassifierMixin]]

Auto-wrapper for turning Scikit-learn classifiers into ForML operators.

Instances can be used with wrap.importer to auto-wrap Scikit-learn classifiers upon importing.

Hint

Supports not just the official Scikit-learn classifiers but any sklearn.base.ClassifierMixin subclasses including 3rd party implementations.

Parameters:
apply: str | collections.abc.Callable[..., Any] = predict_proba[-1]

Customizable mapping for the apply-mode target endpoint. Defaults to a callback invoking .predict_proba and returning the last of its produced columns (conveniently the positive-class probability in the case of binary classification; multiclass setups need a custom mapping).
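
To illustrate the default column extraction, here is a plain-Python sketch using a stand-in probability matrix rather than an actual classifier (mimicking the predict_proba(...).transpose()[-1] callback shown earlier, without numpy):

```python
# Illustrative sketch: extracting the last column of a probability matrix.
proba = [  # rows = samples, columns = [P(class 0), P(class 1)]
    [0.9, 0.1],
    [0.3, 0.7],
]
# transpose the matrix and take the last row == last original column
last_column = list(zip(*proba))[-1]
print(last_column)  # the positive-class probabilities: (0.1, 0.7)
```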

class forml.pipeline.wrap.AutoSklearnRegressor(apply: str | Callable[[...], Any] = 'predict')[source]

Bases: AutoClass[type[RegressorMixin]]

Auto-wrapper for turning Scikit-learn regressors into ForML operators.

Instances can be used with wrap.importer to auto-wrap Scikit-learn regressors upon importing.

Hint

Supports not just the official Scikit-learn regressors but any sklearn.base.RegressorMixin subclasses including 3rd party implementations.

Parameters:
apply: str | Callable[[...], Any] = 'predict'

Customizable mapping for the apply-mode target endpoint. Defaults to the 'predict' literal.

class forml.pipeline.wrap.Operator(*args, **kwargs)[source]

Bases: Operator

Special operator created by decorating particular actors.

This represents a convenient way of implementing ForML Operators without requiring to fully implement the flow.Operator base class from scratch.

Attention

Instances are expected to be created via the decorator methods.

This approach is applicable only to the special case of simple operators implemented by at most one actor per each of the coherent apply/train/label segments corresponding to the relevant primitive decorators (apply(), train(), label()) supplying the particular actors.

In addition to the primitive decorators, there is the combined mapper() decorator filling both the train/apply segments at once.

Hint

The decorators can be chained together as well as applied in a split fashion onto separate actors for different builders:

@wrap.Operator.train
@wrap.Operator.apply  # can be chained if same actor is also to be used in another mode
@wrap.Actor.apply
def MyOperator(df, *, myarg=None):
    ... # stateless actor implementation used for train/apply segments

@MyOperator.label  # decorated operator can itself be used as decorator in split fashion
@wrap.Actor.apply
def MyOperator(df, *, myarg=None):
    ... # stateless actor implementation used for the label segment

Decorator Methods

Actor definitions for individual builders can be provided using the following decorator methods.

train(actor)

Train segment actor decorator.

When used as a decorator, this method creates an operator engaging the wrapped actor in the train-mode. If stateful, the actor also gets normally trained first. Note that it does not get applied to the apply-mode features unless also decorated with the apply() decorator (this is rarely desired - see the mapper() decorator for a more typical use case)!

Parameters:
actor

Decorated actor.

Returns:

An Operator class using the given actor.

Examples

Usage with a wrapped stateless actor:

@wrap.Operator.train
@wrap.Actor.apply
def TrainOnlyDropColumn(
    df: pandas.DataFrame, *, column: str
) -> pandas.DataFrame:
    return df.drop(columns=column)

PIPELINE = AnotherOperator() >> TrainOnlyDropColumn(column='foo')

apply(actor)

Apply segment actor decorator.

When used as a decorator, this method creates an operator engaging the wrapped actor in the apply-mode. If stateful, the actor also gets normally trained in train-mode (but does not get applied to the train-mode features unless also decorated with the train() decorator!).

Parameters:
actor

Decorated actor.

Returns:

An Operator class using the given actor.

Examples

Usage with a wrapped stateful actor:

@wrap.Actor.train
def ApplyOnlyFillnaMean(
    state: typing.Optional[float],
    df: pandas.DataFrame,
    labels: pandas.Series,
    *,
    column: str,
) -> float:
    return df[column].mean()

@wrap.Operator.apply
@ApplyOnlyFillnaMean.apply
def ApplyOnlyFillnaMean(
    state: float,
    df: pandas.DataFrame,
    *,
    column: str
) -> pandas.DataFrame:
    df[column] = df[column].fillna(state)
    return df

PIPELINE = (
    AnotherOperator()
    >> TrainOnlyDropColumn(column='foo')
    >> ApplyOnlyFillnaMean(column='bar')
)

label(actor)

Label segment actor decorator.

When used as a decorator, this method creates an operator engaging the wrapped actor in the train-mode as the label transformer. If stateful, the actor also gets normally trained first. The actor gets engaged prior to any other stateful actors potentially added to the same operator (using the train() or apply() decorators).

Parameters:
actor

Decorated actor.

Returns:

An Operator class using the given actor.

Examples

Usage with a wrapped stateless actor:

@wrap.Operator.label
@wrap.Actor.apply
def LabelOnlyFillZero(labels: pandas.Series) -> pandas.Series:
    return labels.fillna(0)

PIPELINE = (
    AnotherOperator()
    >> LabelOnlyFillZero()
    >> TrainOnlyDropColumn(column='foo')
    >> ApplyOnlyFillnaMean(column='bar')
)

Alternatively, it could as well be just added to the existing ApplyOnlyFillnaMean:

@ApplyOnlyFillnaMean.label
@wrap.Actor.apply
def ApplyFillnaMeanLabelFillZero(labels: pandas.Series) -> pandas.Series:
    return labels.fillna(0)

mapper(actor)

Combined train-apply decorator.

Decorator combining the effect of both the train() and apply() decorators, effectively engaging the actor in transforming the features in both the train-mode and the apply-mode.

Parameters:
actor

Decorated actor.

Returns:

An Operator class using the given actor.
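
The combined train/apply semantics can be sketched in plain Python (the make_mapper helper is hypothetical, not a forml API; it merely illustrates that the same transformation is engaged on both data paths):

```python
# Illustrative sketch (no forml dependency): a mapper engages the same
# transformation in both the train-mode and the apply-mode feature paths.
def make_mapper(transform):
    """Hypothetical mini-operator applying `transform` to both data paths."""
    def operator(train_features, apply_features):
        return transform(train_features), transform(apply_features)
    return operator

drop_negatives = make_mapper(lambda xs: [x for x in xs if x >= 0])
trained, applied = drop_negatives([1, -2, 3], [-4, 5])
print(trained, applied)  # both paths transformed identically
```

In the actual wrap API, the same effect is achieved by decorating a wrapped actor with @wrap.Operator.mapper instead of applying both @wrap.Operator.train and @wrap.Operator.apply.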