FAQs

What data format is used in the pipeline between the actors?

ForML does not prescribe one. It is only responsible for wiring the actors into the desired graph and is agnostic about the payload exchanged between them. It is the responsibility of the project implementer to combine actors that understand each other's inputs and outputs.

For convenience, the Pipeline Library shipped with ForML contains certain actor and operator implementations that expect the data to be Pandas dataframes. This is, however, merely a practical choice made by that library (and a debatable one that might get the library removed from the ForML framework in the long term), while the ForML core is truly independent of the data formats being passed through.
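
The format independence described above can be illustrated with a minimal sketch in plain Python. It deliberately does not use ForML's API: the `Actor` alias, the `wire` helper, and the four example functions are all hypothetical names introduced only for this illustration. The point is that the wiring code never inspects the payload, so the same wiring serves actors that agree on lists just as well as actors that agree on dictionaries.

```python
from functools import reduce
from typing import Any, Callable, Sequence

# Hypothetical stand-in for an actor: any callable that takes a payload
# and returns a payload. This is NOT ForML's actual actor interface.
Actor = Callable[[Any], Any]


def wire(actors: Sequence[Actor]) -> Actor:
    """Chain the actors in order without ever inspecting the payload."""

    def pipeline(payload: Any) -> Any:
        return reduce(lambda data, actor: actor(data), actors, payload)

    return pipeline


# Two actors that agree on plain lists of numbers.
def drop_missing(rows: list) -> list:
    return [r for r in rows if r is not None]


def scale(rows: list) -> list:
    top = max(rows)
    return [r / top for r in rows]


# Two actors that agree on dictionaries of columns instead.
def select_age(table: dict) -> dict:
    return {"age": table["age"]}


def mean_age(table: dict) -> float:
    return sum(table["age"]) / len(table["age"])


# The same wiring helper handles both payload formats.
list_flow = wire([drop_missing, scale])
dict_flow = wire([select_age, mean_age])

print(list_flow([2, None, 4]))
print(dict_flow({"age": [30, 40], "name": ["a", "b"]}))
```

In this sketch, wiring together actors from the two groups (for example `drop_missing` followed by `mean_age`) is accepted by `wire` and fails only once data flows through it, which mirrors why choosing mutually compatible actors is left to the project implementer.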

Can a Feed engage multiple reader types so that I can mix for example file-based data sources with data in a DB?

No. It might sound appealing to have a DSL interpreter that can pull raw data from any possible reader type and natively implement the ETL operations on top of it. However, dedicated ETL platforms already do exactly that (such as Trino, which ForML can already integrate with), so supporting the same feature at the feed level would stretch the project goals unnecessarily far.