Output Sink

The Sink is a runtime platform component that processes the output produced by executing one of the life cycle actions of a particular project pipeline.

It is a (much simpler) logical counterpart to the feed concept.

Individual sink providers are mostly relevant to the batch mode. The concept also applies in the serving mode, but there the component is embedded in the engine, which handles the output transparently.

Architecture

From a high-level perspective, the Sink mirrors the feed design with the data flow inverted. It relies on a particular Writer implementation acting as an adapter between the pipeline output and the external media layer.

When launching the pipeline, the ForML runner expands the Sink into a closing task within the assembled workflow, making it a native part of the final DAG to be executed.
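The idea of the Sink as a closing task can be pictured with a toy DAG. This is only an illustrative sketch using the standard library `graphlib` module, not the ForML flow machinery; the task names `extract`, `predict`, and `sink` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks; each one receives the previous task's result.
def extract(_):
    return [(1, 2.0), (2, 4.0)]

def predict(rows):
    return [(key, value * 10) for key, value in rows]

captured = []

def sink(rows):
    # Closing task: commits the output to an external medium (here a plain list).
    captured.extend(rows)
    return rows

# DAG as node -> set of predecessors; the sink depends on the last pipeline task.
dag = {"extract": set(), "predict": {"extract"}, "sink": {"predict"}}
tasks = {"extract": extract, "predict": predict, "sink": sink}

# Passing a single result along works here only because the toy DAG is a linear chain.
result = None
order = list(TopologicalSorter(dag).static_order())
for name in order:
    result = tasks[name](result)
```

Because the sink is just another node, the executor needs no special handling for it: it runs last simply because it depends on the final pipeline task.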

The core Sink API looks as follows:

forml.io.Consumer

alias of Callable[[layout.RowMajor], layout.Outcome]

class forml.io.Sink(**writerkw)

Abstract base class for pipeline output sink providers.

It integrates the concept of a Writer, supplied either through the consumer() method or by overriding the inner Writer class.

classmethod consumer(schema: dsl.Source.Schema | None, **kwargs: Any) → io.Consumer

Consumer factory method.

A Consumer is a generic callable interface most typically represented using the forml.io.Sink.Writer implementation whose task is to commit the pipeline output using an external media layer.

Unless overridden, the method returns an instance of cls.Writer (which might be easier to extend than overriding this method).

Note

For compatibility with the serving mode, the callable Consumer is (counterintuitively) expected to provide a return value.

Parameters:
schema: dsl.Source.Schema | None

Result schema.

**kwargs: Any

Optional writer keyword arguments.

Returns:

Consumer instance.
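The consumer contract described above can be sketched without importing ForML. In this minimal example, `RowMajor` and `Outcome` are local stand-ins for the `layout` types (the `schema` and `data` fields of `Outcome` are an assumption), the schema is simplified to a list of column names, and the `target` keyword argument is hypothetical.

```python
from typing import Any, Callable, NamedTuple, Optional, Sequence

Row = Sequence[Any]
RowMajor = Sequence[Row]  # stand-in for layout.RowMajor


class Outcome(NamedTuple):
    """Stand-in for layout.Outcome: the committed data plus its schema."""

    schema: Optional[Sequence[str]]
    data: RowMajor


Consumer = Callable[[RowMajor], Outcome]  # mirrors the documented alias


def consumer(schema: Optional[Sequence[str]], **kwargs: Any) -> Consumer:
    """Hypothetical factory returning a closure that appends rows to a list."""
    target: list = kwargs["target"]

    def commit(data: RowMajor) -> Outcome:
        target.extend(tuple(row) for row in data)
        # The return value matters: the serving mode is expected to use it.
        return Outcome(schema, data)

    return commit


store: list = []
write = consumer(["id", "score"], target=store)
outcome = write([(1, 0.25), (2, 0.75)])
```

A plain closure is enough to satisfy the callable interface; a Writer subclass becomes worthwhile once formatting and writing need to be separated.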

class forml.io.Sink.Writer(schema: dsl.Source.Schema | None, **kwargs: Any)

Generic writer base class matching the Sink consumer interface.

It is a low-level output component responsible for committing the actual pipeline output to the supported external media layer using its specific data representation (the write() method).

classmethod format(schema: dsl.Source.Schema, data: layout.RowMajor) → layout.Native

Convert the output data into the required media-native layout.Native format.

Parameters:
schema: dsl.Source.Schema

Data schema.

data: layout.RowMajor

Output data.

Returns:

Data formatted into the media-native layout.Native format.

abstract classmethod write(data: layout.Native, **kwargs: Any) → None

Perform the write operation with the given media-native data.

Parameters:
data: layout.Native

Output data in the media-native format.

**kwargs: Any

Optional writer keyword arguments (as given to the constructor).
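The split between format() and write() can be illustrated with a self-contained sketch. The `Writer` base below is a local stand-in that mirrors the documented interface rather than the ForML class itself; its call flow (format first, then write with the constructor keyword arguments) is an assumption, and `CsvWriter` with its `stream` argument is hypothetical.

```python
import abc
import csv
import io
from typing import Any, Optional, Sequence


class Writer(abc.ABC):
    """Stand-in mirroring the documented Sink.Writer interface (not the ForML class)."""

    def __init__(self, schema: Optional[Sequence[str]], **kwargs: Any):
        self._schema = schema
        self._kwargs = kwargs

    def __call__(self, data):
        # Assumed call flow: format first, then write with the constructor kwargs.
        self.write(self.format(self._schema, data), **self._kwargs)
        return self._schema, data  # a consumer is expected to return a value

    @classmethod
    def format(cls, schema, data):
        return data  # default: pass the rows through unchanged

    @classmethod
    @abc.abstractmethod
    def write(cls, data, **kwargs: Any) -> None:
        """Commit the media-native data."""


class CsvWriter(Writer):
    """Hypothetical writer whose media-native format is CSV text."""

    @classmethod
    def format(cls, schema, data) -> str:
        buffer = io.StringIO()
        out = csv.writer(buffer, lineterminator="\n")
        if schema:
            out.writerow(schema)  # header row taken from the column names
        out.writerows(data)
        return buffer.getvalue()

    @classmethod
    def write(cls, data: str, **kwargs: Any) -> None:
        kwargs["stream"].write(data)


stream = io.StringIO()
writer = CsvWriter(["id", "score"], stream=stream)
result = writer([(1, 0.5), (2, 0.9)])
```

Keeping the conversion in format() and the side effect in write() means a new target medium usually needs only one of the two methods replaced.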

Sink Providers

Sink providers can be configured within the runtime platform setup using the [SINK.*] sections.

The available implementations are:

Null

Null sink with no real write action.

Stdout

Sink implementation committing the pipeline result to the standard output of the execution process.
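Selecting one of these providers in the platform configuration might look like the fragment below. It is a sketch only: the section names `print` and `discard` are arbitrary, and the `default` and `provider` keys together with the lowercase provider references are assumptions based on the general platform configuration conventions, so verify them against the platform setup documentation.

```toml
# Hypothetical platform configuration fragment.
[SINK]
default = "print"        # assumed key naming the sink section used by default

[SINK.print]
provider = "stdout"      # assumed reference to the Stdout sink

[SINK.discard]
provider = "null"        # assumed reference to the Null sink
```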