Pipelines
A pipeline is an ordered collection of stages that together perform a complete analysis. Pipelines are usually assembled by a PipelineBuilder, which decides which stages to include based on validated user input. Snippy-NG pipelines are linear: any branching happens while building the pipeline, not while executing it. For example:
class ExamplePipelineBuilder(PipelineBuilder):
input_file: Path = Field(..., description="Input file")
prefix: str = Field(default="example", description="Output prefix")
def build(self) -> SnippyPipeline:
stages = []
first = ExampleStage(
input_file=self.input_file,
prefix=self.prefix,
)
stages.append(first)
second = AnotherStage(
source=first.output.result,
prefix=self.prefix,
)
stages.append(second)
return SnippyPipeline(stages=stages)
In practice, a pipeline builder usually does three things:
- validates workflow-level inputs
- wires stage outputs into downstream stage inputs
- chooses which files should be kept after cleanup
Pipelines should contain workflow composition logic, not low-level implementation details. If a unit of work is independently meaningful, it should usually live in a stage and then be composed into one or more pipelines.