ForML applications represent a high-level concept built on top of ForML projects as deliverables for the serving layer.

The term application here does not carry its common meaning of a general computer program covering a wide range of functions. ForML applications focus solely on ML inference (providing predictions in response to the presented data points), which represents the apply stage of the production life cycle.

While the purpose of projects is to implement a portable solution to the given ML problem, applications aim to expose it (by means of gateway providers) in a domain-specific form suitable for integration with the actual decision-making process.

The ForML platform persists published applications in a dedicated application inventory, from which the serving engine picks them at runtime.

Project-Application Relationship

As shown in the diagram below, relationships between projects and applications can have any cardinality. A project might not be associated with any application (i.e. not exposed for serving, e.g. Project B). Conversely, an application can span multiple (compatible) projects when its model selection strategy involves more than one of them (e.g. Application Y), and a single project can be used by several different applications (e.g. Project A).

flowchart LR
    subgraph registry ["Registry"]
        subgraph prja ["Project A"]
            gena1[("Generation 1")]
            gena2[("Generation 2")]
        end
        subgraph prjb ["Project B"]
            genb1[("Generation 1")]
        end
        subgraph prjc ["Project C"]
            genc1[("Generation 1")]
        end
    end
    subgraph inventory ["Inventory"]
        app1(["Application X"]) --- gena1 & gena2
        app2(["Application Y"]) --- gena2 & genc1
    end
    subgraph gw ["Gateway"]
        eng["Engine"] --- app1 & app2
    end

It makes sense to manage an application (descriptor) in the scope of some particular project if they form a 1:1 relationship (perhaps the most typical scenario). More complex applications might need to be maintained separately though.

Request Dispatching

Applications play a key role in the serving process, taking control of the following steps:

sequenceDiagram
    Engine ->> Application: receive(Request)
    Application --) Engine: Entry, Scope
    Engine ->> Application: select(Scope, Stats)
    Application --) Engine: Model
    Engine ->> Model: predict(Entry)
    Model --) Engine: Outcome
    Engine ->> Application: respond(Outcome, Scope)
    Application --) Engine: Response
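
The sequence above can be sketched in plain Python. This is only an illustrative stand-in, not ForML's engine: `EchoApplication`, `serve`, the dict-based registry, and the JSON payload layout are all hypothetical, and the "model" is just a callable.

```python
import json
from typing import Any, Callable

# Hypothetical stand-ins: none of these names are part of the ForML API.
Model = Callable[[list[float]], list[float]]


class EchoApplication:
    """Toy application implementing the three application-side steps."""

    def receive(self, request: bytes) -> tuple[list[float], dict[str, Any]]:
        payload = json.loads(request)
        # Entry = data points for prediction, Scope = custom context.
        return payload["values"], {"client": payload.get("client", "unknown")}

    def select(self, registry: dict[str, Model], scope: dict[str, Any], stats: dict[str, int]) -> Model:
        # Trivial strategy: always pick the entry registered as "latest".
        return registry["latest"]

    def respond(self, outcome: list[float], scope: dict[str, Any]) -> bytes:
        return json.dumps({"client": scope["client"], "predictions": outcome}).encode()


def serve(app: EchoApplication, registry: dict[str, Model], stats: dict[str, int], request: bytes) -> bytes:
    """Engine-side loop mirroring the sequence diagram."""
    entry, scope = app.receive(request)         # receive(Request) -> Entry, Scope
    model = app.select(registry, scope, stats)  # select(Scope, Stats) -> Model
    outcome = model(entry)                      # predict(Entry) -> Outcome
    stats["requests"] = stats.get("requests", 0) + 1
    return app.respond(outcome, scope)          # respond(Outcome, Scope) -> Response


registry = {"latest": lambda xs: [2 * x for x in xs]}
stats: dict[str, int] = {}
response = serve(EchoApplication(), registry, stats, b'{"values": [1, 2], "client": "demo"}')
```

The engine owns the loop and the prediction itself, while every step that requires domain knowledge is delegated to the application.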

Data Interpretation

Applications define how to understand the received query and how to turn it into a model-prediction request, as well as how to present the predicted outcomes as the domain-specific response.

This is implemented within the following steps:

  1. Formally receiving the query by:

    1. Decoding its content according to the implemented payload semantics. Applications might choose to support a number of different encodings.

    2. Optionally compiling the query into prediction-relevant data points. This might involve certain domain mapping (e.g. a recommender application receiving click-stream events would at this point turn them into a set of product features to be passed down for scoring).

    3. Optionally assembling custom metadata to constitute an application context to be carried through the serving layers for reference.

  2. Producing a response based on:

    1. Composing the domain-specific result message out of the prediction outcomes (generated by the engine using the selected model). This might again involve particular domain mapping (e.g. turning probabilities into a selection of products, etc.).

    2. Encoding the response payload into a client-accepted representation.
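
The receiving and responding steps listed above might look roughly as follows for the recommender scenario. This is a self-contained sketch under stated assumptions: the `CATALOG` mapping, the `UnsupportedEncoding` error, and the two supported media types are hypothetical and do not use the ForML `layout` API.

```python
import json
from typing import Any

# Hypothetical catalog mapping product ids to feature vectors (illustration only).
CATALOG = {"p1": [0.1, 1.0], "p2": [0.7, 0.2], "p3": [0.4, 0.4]}


class UnsupportedEncoding(Exception):
    """Stand-in for an unsupported-encoding error."""


def receive(payload: bytes, content_type: str) -> tuple[list[list[float]], dict[str, Any]]:
    # Step 1.1: decode according to the declared encoding.
    if content_type == "application/json":
        events = json.loads(payload)
    elif content_type == "text/csv":
        events = [{"product": p} for p in payload.decode().split(",") if p]
    else:
        raise UnsupportedEncoding(content_type)
    # Step 1.2: domain mapping from click events to known product features.
    products = [e["product"] for e in events if e["product"] in CATALOG]
    entry = [CATALOG[p] for p in products]
    # Step 1.3: context carried along for the response step.
    return entry, {"products": products}


def respond(outcome: list[float], accept: list[str], context: dict[str, Any]) -> tuple[bytes, str]:
    # Step 2.1: turn scores into a ranked product selection (highest score first).
    ranked = [p for _, p in sorted(zip(outcome, context["products"]), reverse=True)]
    # Step 2.2: encode using the first accepted encoding that is supported.
    for encoding in accept:
        if encoding == "application/json":
            return json.dumps(ranked).encode(), encoding
        if encoding == "text/csv":
            return ",".join(ranked).encode(), encoding
    raise UnsupportedEncoding(", ".join(accept))


entry, context = receive(b"p2,p9,p1", "text/csv")
body, encoding = respond([0.3, 0.9], ["application/xml", "application/json"], context)
```

Here the unknown product `p9` is dropped during the domain mapping, and the response falls back to JSON because the first accepted media type is not supported.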

Model Selection

Another powerful way an application exerts control over the serving process is a dynamic selection of the specific model generation to be used for serving each particular request.

Applications can base the selection logic on the following available facts:

  • the actual content of the model registry (any existing model generation to choose from)

  • custom metadata stored in the application context (e.g. as part of the query receiving)

  • various serving metrics provided by the system (e.g. the number of requests already served by this application and by which model, including the actual performance tracking results of each model)

The model-selection mechanism allows implementations of complex serving strategies including A/B testing, multi-armed bandits, cold-start/fallback models, etc. It is due to this dynamic ability to select a particular model/generation on the fly that the project-application relationship can potentially have higher than just the ordinary 1:1 cardinality.
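
As an illustration of such a strategy, the following sketch implements an epsilon-greedy bandit with a cold-start rule over per-generation serving statistics. The `Arm` record, the `project/release/generation` labels, and the shape of the statistics are assumptions made for this example; they do not reflect the actual `runtime.Stats` interface.

```python
import random
from dataclasses import dataclass


@dataclass
class Arm:
    """Hypothetical per-generation serving statistics."""

    generation: str
    requests: int = 0
    successes: int = 0

    @property
    def rate(self) -> float:
        return self.successes / self.requests if self.requests else 0.0


def select(arms: list[Arm], epsilon: float, rng: random.Random) -> Arm:
    """Epsilon-greedy choice among candidate model generations."""
    untried = [arm for arm in arms if arm.requests == 0]
    if untried:
        return untried[0]  # cold start: try every generation at least once
    if rng.random() < epsilon:
        return rng.choice(arms)  # explore a random generation
    return max(arms, key=lambda arm: arm.rate)  # exploit the best performer


arms = [
    Arm("A/1.0/1", requests=100, successes=40),
    Arm("A/1.0/2", requests=100, successes=55),
]
best = select(arms, epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the selection is purely greedy; a small positive value keeps routing a fraction of the traffic to the other generations so that their statistics stay current.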


Similarly to the principal project components, applications are delivered in the form of a Python module (a single file with the .py suffix) providing an implementation of the application.Descriptor:

class forml.application.Descriptor

Application descriptor abstract base class.

The serving layer uses application descriptors to control the query processing.

Active descriptors are deployed through the asset.Inventory used by the serving engine.

abstract property name : str

Name of the application represented by this descriptor.


Application name is expected to be globally unique. This name will be used to register the application when publishing and to target it when serving.


Returns: Application name.

abstract receive(request: layout.Request) → layout.Request.Decoded

Receive the raw payload and turn it into a structure suitable for predicting.

This involves at least payload decoding plus potentially also any further data compilation necessary for prediction. Additionally, it might also produce custom metadata representing an application context to be passed down the chain all the way to select() and respond().

request: layout.Request

Native request format.


Returns: Decoded entry (adjusted for prediction) with optional custom (serializable!) context.

Raises: layout.Encoding.Unsupported if the received encoding is not supported.

abstract select(registry: asset.Directory, context: Any, stats: runtime.Stats) → asset.Instance

Select the model instance to be used for serving the request.

This can implement an arbitrary model-selection strategy with the use of the provided information.

registry: asset.Directory

Model registry to select the model from.

context: Any

Optional metadata carried over from receive().

stats: runtime.Stats

Application specific serving metrics.


Returns: Model instance.

abstract respond(outcome: layout.Outcome, encoding: Sequence[layout.Encoding], context: Any) → layout.Response

Turn the application result into a native response to be passed back to the requester.

This involves assembling the resulting structure and encoding it into a native format.

outcome: layout.Outcome

Result to be returned.

encoding: Sequence[layout.Encoding]

Accepted encoding media types.

context: Any

Optional metadata carried over from receive().


Returns: Encoded native response.

Raises: layout.Encoding.Unsupported if none of the accepted encodings is supported.


Unlike projects, which upon releasing produce a ForML package containing all of their runtime dependencies, application descriptors are published as-is without any implicit dependency management. Any such dependencies would need to be satisfied explicitly by the general runtime environment (given the application scope, the dependencies are expected to be rather lightweight though).

The descriptor needs to be registered within the delivering module via a call to the application.setup() function:

forml.application.setup(descriptor: application.Descriptor) → None

Interface for registering application descriptor instances.

This function is expected to be called exactly once from within the application descriptor file.

The true implementation of this function is only provided when imported within the application loader context (outside the context this is effectively no-op).

descriptor: application.Descriptor

Application descriptor instance.

 from forml import application

 APP = application.Generic('forml-example-titanic')
 application.setup(APP)

Generic Application

In addition to the abstract application.Descriptor, ForML also provides, for convenience, a generic out-of-the-box implementation suitable for most typical scenarios.

It implements data interpretation simply using the available layout.get_decoder and layout.get_encoder codecs, and for model selection it introduces the concept of pluggable application.Selector strategies.

class forml.application.Generic(name: str, selector: application.Selector | None = None)

Bases: Descriptor

Generic application descriptor for basic serving scenarios.

It simply runs the directly decoded (using the available decoders) request payload through the model/generation selected using the provided application.Selector and returns the directly encoded (using the available encoders) outcomes as the response.

name: str

The (unique) name for this application registration/lookup.

selector: application.Selector | None = None

Implementation of a particular model-selection strategy (defaults to the application.Latest selector, which expects the project name to match the application name).


>>> APP = application.Generic('forml-example-titanic')

Generic applications are configured with particular model selection strategies provided as implementations of the following application.Selector base class:

class forml.application.Selector

Abstract base class for the model selection strategy to be used by the application.Generic descriptors.

abstract select(registry: asset.Directory, context: Any, stats: runtime.Stats) → asset.Instance

Select the model instance to be used for serving the request.

See also

This serves the same purpose as the application.Descriptor.select() method, only extracted as a separate object.

registry: asset.Directory

Model registry to select the model from.

context: Any

Optional metadata carried over from the application.Descriptor.receive.

stats: runtime.Stats

Application specific serving metrics.


Returns: Model instance.


The following implementations of model selection strategies are available for configuring a generic application.

class forml.application.Explicit(project: str | asset.Project.Key, release: str | asset.Release.Key, generation: str | int | asset.Generation.Key)

Bases: Selector

Model selection strategy always choosing an explicit model generation.

project: str | asset.Project.Key

Project reference of the selected model.

release: str | asset.Release.Key

Project release reference of the selected model.

generation: str | int | asset.Generation.Key

Project generation reference of the selected model.

class forml.application.Latest(project: str | asset.Project.Key, release: str | asset.Release.Key | None = None)

Bases: Selector

Model selection strategy choosing an instance of the most recent model release/generation.


Currently, the instance is cached indefinitely and so updates to the registry are not dynamically reflected.

project: str | asset.Project.Key

Project reference to choose the most recent generation from.

release: str | asset.Release.Key | None = None

Optional release to choose the most recent generation from.
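
The behaviour described for this strategy, including the indefinite caching, can be sketched as follows. The nested-dict registry layout and the tuple-based release keys are assumptions made for the example; the real selector works against asset.Directory.

```python
from typing import Optional

# Hypothetical registry layout: project -> release -> list of generation numbers.
Registry = dict[str, dict[tuple[int, ...], list[int]]]
Selected = tuple[str, tuple[int, ...], int]


class Latest:
    """Toy counterpart of a latest-generation selection strategy."""

    def __init__(self, project: str, release: Optional[tuple[int, ...]] = None):
        self._project = project
        self._release = release
        self._cached: Optional[Selected] = None

    def select(self, registry: Registry) -> Selected:
        if self._cached is None:  # resolved once, then reused indefinitely
            releases = registry[self._project]
            release = self._release if self._release is not None else max(releases)
            self._cached = (self._project, release, max(releases[release]))
        return self._cached


registry: Registry = {"titanic": {(0, 1): [1, 2], (0, 2): [1]}}
selector = Latest("titanic")
first = selector.select(registry)
registry["titanic"][(0, 2)].append(2)  # a newer generation gets published
second = selector.select(registry)  # still the originally cached instance
```

In this sketch, picking up the newly published generation would require creating a fresh selector, which mirrors the caching caveat noted above.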


Applications get deployed by publishing into an application inventory used by the particular serving engine. Unlike the project artifacts, applications are not versioned and are only held in a flat namespace that relies on the uniqueness of each application name. (Re)publishing an application with an existing name overwrites the original instance.

$ forml application put
$ forml application list


The name of the module containing the application descriptor is (from the publishing perspective) functionally meaningless. The only relevant identifier is the application name.
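
The publishing semantics described above (flat namespace, overwrite on republish, module name irrelevant) can be summarized with a minimal in-memory model. The `Inventory` class below is a hypothetical illustration and not the asset.Inventory provider interface.

```python
class Inventory:
    """Hypothetical in-memory inventory keyed only by application name."""

    def __init__(self) -> None:
        self._apps: dict[str, object] = {}

    def put(self, name: str, descriptor: object) -> None:
        # Publishing under an existing name silently replaces the old instance.
        self._apps[name] = descriptor

    def get(self, name: str) -> object:
        return self._apps[name]

    def names(self) -> list[str]:
        return sorted(self._apps)


inventory = Inventory()
inventory.put("forml-example-titanic", "descriptor loaded from titanic.py")
# Republishing from a differently named module still targets the same entry.
inventory.put("forml-example-titanic", "descriptor loaded from renamed_module.py")
```

After both calls the inventory holds a single entry, and looking it up by the application name returns the most recently published descriptor.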