Block Executors

RZT aiOS blocks, whether pre-built or custom, can be executed either individually or as part of a pipeline. Two other concerns when running a block are the execution environment and the mechanism by which data moves from one block to another. This document covers the execution environments in detail; for the data transfer methods, see the document on data transport between blocks. The tutorial starts with running a block in the simplest environment, a single-threaded process, and works up to more complex environments such as Spark and Horovod.

A block or a pipeline can run in any of the following execution environments:

ThreadExecutor
SubprocessExecutor (specialization of ThreadExecutor)
ProcessExecutor (specialization of ThreadExecutor)
ContainerExecutor
BlockPickleExecutor
PipelineEngineExecutor (specialization of BlockPickleExecutor)
SparkExecutor (specialization of ContainerExecutor)
HorovodExecutor (specialization of ContainerExecutor)
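The specialization relationships listed above can be sketched as a small Python class hierarchy. The classes below are empty, illustrative stand-ins that only mirror the names and parent/child relationships in the list; they are not the real aiOS executor classes, and the `lineage` helper is a hypothetical name introduced for this sketch.

```python
# Illustrative stand-ins only: these empty classes mirror the hierarchy
# described in the list and are NOT the real aiOS executor classes.
class ThreadExecutor: ...
class SubprocessExecutor(ThreadExecutor): ...
class ProcessExecutor(ThreadExecutor): ...

class ContainerExecutor: ...
class SparkExecutor(ContainerExecutor): ...
class HorovodExecutor(ContainerExecutor): ...

class BlockPickleExecutor: ...
class PipelineEngineExecutor(BlockPickleExecutor): ...


def lineage(cls):
    """Return the class names from cls up to its root (hypothetical helper)."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


# Print each leaf executor together with the executor it specializes.
for leaf in (SubprocessExecutor, ProcessExecutor, SparkExecutor,
             HorovodExecutor, PipelineEngineExecutor):
    print(" -> ".join(lineage(leaf)))
```

Each printed line reads from the specialized executor to its base, which gives a rough textual substitute for a class diagram.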

TODO: Add a class diagram showing the inheritance hierarchy.

By default, when no container is specified, a block runs as a subprocess forked from the Jupyter kernel process. This is ideal for quickly trying out small prototype code. Example:

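import pandas as pd
import razor.flow as rf


# Note: project_space_path is not imported in this snippet; it is assumed to be
# made available by the aiOS SDK or notebook environment.
class CsvReader:
    filename: str
    output: rf.SeriesOutput[pd.DataFrame]

    def run(self):
        # Resolve the file name relative to the project space.
        file_path = project_space_path(self.filename)
        # Read the CSV lazily, 10 rows at a time.
        chunks = pd.read_csv(file_path, chunksize=10, nrows=None, delimiter=None)
        for df in chunks:
            # Emit each chunk on the series output.
            self.output.put(df)


# No container is specified, so the block runs in the default environment.
CsvReader(filename="titanic/train.csv").execute()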
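The isolation that subprocess execution provides can be illustrated with the standard library alone. The sketch below does not use razor.flow and does not reproduce how aiOS launches a block; in particular, it starts a fresh Python interpreter rather than forking the kernel. It only shows that code run in a child process gets its own process ID, separate from the parent (for example, a Jupyter kernel). The function name `run_in_child` is a hypothetical helper introduced here.

```python
import os
import subprocess
import sys


def run_in_child(code: str) -> str:
    """Run a Python snippet in a child interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter as the parent
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout.strip()


parent_pid = os.getpid()
# The child reports its own PID, which differs from the parent's.
child_pid = int(run_in_child("import os; print(os.getpid())"))

print(f"parent PID: {parent_pid}")
print(f"child PID:  {child_pid}")
```

Because the child is a separate process, a crash or a large memory allocation in the child does not take the parent down with it, which is one reason subprocess execution suits quick prototyping from a notebook.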
