Life Cycle Management¶
Machine learning projects are handled using a typical set of actions applied in a specific order. This pattern is what we call a life cycle. ForML supports two distinct life cycles depending on the project stage: the development life cycle and the production life cycle.
Caution
Do not confuse the life cycles with the execution mechanism. ForML projects can be operated in a number of different ways, each of which is still subject to a particular life cycle.
Iteration Accomplishment¶
Each life cycle culminates in producing (a new instance of) its particular runtime artifacts. This concludes the given iteration, after which the process can start over or transition between the two life cycles in either direction.
Generation Advancement¶
Whenever the given pipeline is trained (incrementally or from scratch) and/or tuned, a new generation of its models is produced.
This typically happens to refresh the models using new data while keeping the same pipeline implementation. Updating the models of the same release makes it possible (if the given models support it) to carry the state over from one generation to the next by incrementally training only on the data obtained since the previous training.
Generations get transparently persisted in the model registry as the model generation assets.
Release Roll-out¶
The milestone of the development life cycle is the roll-out of a new release. It is essentially a new version of the project code implementation published for deployment.
Upon releasing, the ForML package is produced and persisted in the model registry.
Caution
Given the different implementations, it is not possible to carry over states between generations of different releases.
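The relationship between releases and generations can be pictured with a small in-memory model. This is only an illustrative sketch: `ToyRegistry`, its methods, and the `rows_seen` state are hypothetical names invented here and are not part of the ForML API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegistry:
    """In-memory stand-in for a model registry (hypothetical, not the ForML API)."""

    # release version -> list of generation states, oldest first
    releases: dict[str, list[dict]] = field(default_factory=dict)

    def publish(self, release: str) -> None:
        """Release roll-out: register a new release with no generations yet."""
        if release in self.releases:
            raise ValueError(f"release {release} already exists")
        self.releases[release] = []

    def train(self, release: str, new_rows: int) -> int:
        """Generation advancement: store a new state and return its number."""
        generations = self.releases[release]
        # State is carried over only from the previous generation of the SAME release.
        previous = generations[-1] if generations else {"rows_seen": 0}
        generations.append({"rows_seen": previous["rows_seen"] + new_rows})
        return len(generations)


registry = ToyRegistry()
registry.publish("0.1")
registry.train("0.1", new_rows=100)  # generation 1 of release 0.1
registry.train("0.1", new_rows=20)   # generation 2: builds on generation 1
registry.publish("0.2")
registry.train("0.2", new_rows=120)  # new release: no state to inherit
```

In this toy model the second generation of release 0.1 only processes the 20 new rows, whereas the first generation of release 0.2 has to see all the data again because nothing is inherited across releases.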
Life Cycle Actions¶
A simplified logical flow of the individual steps and transitions between the two life cycles is illustrated by the following diagram:
flowchart TB
    subgraph production [Production Life Cycle]
        train[Train / Tune] -- Generation Advancement --> apply([Apply / Serve])
        apply --> applyeval(Evaluate)
        applyeval -- Metrics --> renew{Renew?} -- No --> apply
        renew -- Yes --> how{How?} -- Refresh --> train
    end
    subgraph development [Development Life Cycle]
        how -- Reimplement --> implement(Explore / Implement)
        implement --> traineval(Test + Evaluate)
        traineval --> ready{Ready?} -- No --> implement
        ready -- Yes --> release[(Release)] -- Release Roll-out --> train
    end
    init((Init)) --> implement
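The same flow can be read as a small state machine. The sketch below transcribes the diagram into a transition table; the state and decision names are shortened labels chosen for this illustration, not ForML identifiers.

```python
# Transition table transcribed from the flowchart. A decision of None marks
# an unconditional transition; the labels are made up for this sketch.
TRANSITIONS = {
    ("init", None): "implement",
    ("implement", None): "test_evaluate",
    ("test_evaluate", "not ready"): "implement",
    ("test_evaluate", "ready"): "release",
    ("release", None): "train",                 # release roll-out
    ("train", None): "apply",                   # generation advancement
    ("apply", None): "evaluate",
    ("evaluate", "keep"): "apply",              # Renew? -> No
    ("evaluate", "refresh"): "train",           # Renew? -> Yes, How? -> Refresh
    ("evaluate", "reimplement"): "implement",   # Renew? -> Yes, How? -> Reimplement
}

DEVELOPMENT = {"implement", "test_evaluate", "release"}


def step(state, decision=None):
    """Return the next step for the given state and (optional) decision."""
    try:
        return TRANSITIONS[(state, decision)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {decision!r}") from None


def life_cycle(state):
    """Name the life cycle a given step belongs to."""
    if state == "init":
        return "none"
    return "development" if state in DEVELOPMENT else "production"


# Walk one possible path: develop, release, serve, then refresh the models.
path = ["init"]
for decision in [None, None, "ready", None, None, None, "refresh"]:
    path.append(step(path[-1], decision))
```

Here `path` ends back at `train`, which corresponds to advancing to a new generation of the same release.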
Development Life Cycle¶
As the name suggests, this life cycle is exercised during project development, within the scope of the project source-code working copy. It is typically managed using the forml project <action> CLI as shown below, or using the runtime.Virtual launcher when working in interactive mode.
The expected behavior of the particular action depends on the correct project setup.
Hint
Any model generations produced within the development life cycle are stored using the Volatile registry, which is not persistent across multiple Python sessions.
The development life cycle actions are:
Test¶
Run the unit tests defined using the Unit Testing framework.
Example:
$ forml project test
Evaluate¶
Perform the train-test evaluation based on the evaluation.py component and report the metrics.
Example:
$ forml project eval
Train¶
Run the project pipeline in the standard train-mode. Even though this will produce a true generation of the defined models, it won’t get persisted across invocations, making this mode useful merely for smoke-testing the training process (or displaying the task graph on the Graphviz runner).
Example:
$ forml project train
Release¶
Build and publish the release package into the configured model registry. This effectively constitutes the release roll-out and the process can transition from here into the production life cycle.
Warning
Each model registry provider accepts only unique, monotonically increasing releases for any given project. Executing this action twice against the same registry without incrementing the project version is therefore an error.
Example:
$ forml project release
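The uniqueness and ordering rule from the warning above can be illustrated with a simple pre-release check. This is a sketch under the assumption of dotted numeric versions such as `0.2.1`; the helper names are hypothetical, and how an actual registry provider compares versions is not specified here.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '0.2.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def can_release(published: list[str], candidate: str) -> bool:
    """True if the candidate is strictly newer than every published release."""
    return all(parse(candidate) > parse(existing) for existing in published)


published = ["0.1.0", "0.2.0"]
repeat_allowed = can_release(published, "0.2.0")   # same version again
bumped_allowed = can_release(published, "0.3.0")   # incremented version
```

Comparing integer tuples rather than raw strings keeps `0.10.0` ordered after `0.9.0`.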
Production Life Cycle¶
After rolling out the new release package into a registry, it becomes available for the production life cycle. In contrast to development, the production life cycle no longer needs the project source-code working copy, as it operates solely on the published release package (plus potentially the previously persisted model generations).
The production life cycle is either managed in batch mode using the CLI or embedded within a serving engine.
The stages of the production life cycle are:
Train¶
Run the project pipeline in the train-mode to produce the new generation and persist it in the model registry.
Example:
$ forml model train forml-tutorial-titanic
Tune¶
Run hyper-parameter tuning of the selected pipeline and produce the new generation (not implemented yet).
Example:
$ forml model tune forml-tutorial-titanic
Todo
Tuning support is still pending.
Apply¶
Run the previously trained project pipeline in the apply-mode using an existing model generation (explicit version or by default the latest) loaded from the model registry.
Example:
$ forml model apply forml-tutorial-titanic
See also
In addition to this command-line-based batch mechanism, the serving engine together with the application concept is another way of performing the apply action of the production life cycle.
Evaluate¶
Perform the production performance evaluation based on the evaluation.py component and report the metrics.
Example:
$ forml model eval forml-tutorial-titanic
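For batch operation, the implemented production actions could be chained from a small driver script. The following is a minimal sketch that only assembles the command lines shown in the examples above; the helper names and the dry-run default are assumptions of this illustration, and the tune action is left out because it is not implemented yet.

```python
import subprocess

PROJECT = "forml-tutorial-titanic"  # project name used in the examples above


def production_commands(project: str) -> list[list[str]]:
    """Build the train -> apply -> eval command lines for the given project."""
    return [["forml", "model", action, project] for action in ("train", "apply", "eval")]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failure
            subprocess.run(command, check=True)


run_all(production_commands(PROJECT))  # dry run: only prints the commands
```

Passing `dry_run=False` would actually invoke the CLI, which requires ForML to be installed and a release of the project to be available in the configured registry.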