bripipetools Application Packages
Overview
Application-level packages are those exposed to the user through wrapper scripts and the command line. They are used to perform common, high-level tasks related to pipeline operations and data. Packages are listed roughly in order of dependency hierarchy (i.e., packages listed first depend on subsequently listed packages).
Note
Intended for developers!
The documentation below is effectively a dump of all high-level packages, modules, classes, and methods that are used to run bripipetools. Most users shouldn’t need this amount of detail, but it provides a starting point for those looking to understand or modify the code.
Package details
dbification package
Manages the collection and annotation of data (e.g., generated by the
Genomics Core or produced through bioinformatics processing) for import
into GenLIMS. Modules are designed to handle the set of data
associated with a particular “step” (e.g., a flowcell sequencing run or
bioinformatics processing of a batch of samples). The dbification.control
module inspects an input path and deploys the appropriate importer
class.
control submodule
Parse arguments and select the appropriate importer class.
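As a rough illustration of this kind of path-based dispatch, the sketch below chooses an importer from the shape of the input path. The class names `FlowcellRunImporter` and `WorkflowBatchImporter`, the function `select_importer`, and the matching rule are all assumptions made for illustration; they are not taken from the bripipetools code.

```python
class FlowcellRunImporter:
    """Hypothetical stand-in for an importer of sequencing run data."""

    def __init__(self, path):
        self.path = path


class WorkflowBatchImporter:
    """Hypothetical stand-in for an importer of processing batch data."""

    def __init__(self, path):
        self.path = path


def select_importer(path):
    """Return an importer instance suited to the input path.

    Assumed rule (illustrative only): a path ending in '.txt' is treated
    as a workflow batch file; anything else is treated as a flowcell run
    folder.
    """
    if path.endswith(".txt"):
        return WorkflowBatchImporter(path)
    return FlowcellRunImporter(path)
```

Keeping the decision in one function means each importer class only needs to know how to handle its own kind of input.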
flowcellrun module
Class for importing data from a sequencing run into GenLIMS and the Research DB as new objects.
workflowbatch module
Class for importing data from a processing batch into databases as new objects. Supports both the research database (“genomics…”) and GenLIMS collections.
postprocessing package
Covers a range of operations performed on outputs and other files
produced through bioinformatics processing of a batch of samples. For
example, the postprocessing.stitching module parses data from
individual files of similar type and combines data into a single table
for all samples in a project. By extension, postprocessing.compiling
will take these stitched tables of different types and combine them
into a new, large table for the project. On the other hand, the
postprocessing.cleanup module deals with fixing the way files
are named and organized on disk.
stitching module
Combine parsed data from a set of batch processing output files and write to a single CSV file.
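The stitching step can be pictured with a minimal sketch like the one below, which assumes the per-sample files have already been parsed into dictionaries. The function name `stitch_tables`, the `sample_id` column, and the `NA` placeholder are hypothetical choices, not the module's actual interface.

```python
import csv


def stitch_tables(sample_data, out_path):
    """Write one row per sample and one column per field to a CSV file.

    sample_data maps a sample ID to a dict of already-parsed
    field -> value pairs (the parsing step is not shown here).
    """
    # Take the union of field names so that samples with missing
    # fields still fit in the same table.
    fields = sorted({name for row in sample_data.values() for name in row})
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["sample_id"] + fields, restval="NA"
        )
        writer.writeheader()
        for sample_id in sorted(sample_data):
            writer.writerow({"sample_id": sample_id, **sample_data[sample_id]})
    return out_path
```

Fields missing for a given sample are filled with the placeholder rather than dropped, so every row has the same columns.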
compiling module
Compile combined/stitched ‘summary’ outputs of different types from batch processing and write to a single CSV file.
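One way to think about compiling is as a join of several stitched tables on a shared sample identifier. The sketch below assumes each stitched CSV has a `sample_id` column; that column name, the function `compile_tables`, and the `NA` fill value are illustrative assumptions rather than the module's real behavior.

```python
import csv


def compile_tables(table_paths, out_path, key="sample_id"):
    """Merge several stitched CSV tables into one table keyed on `key`."""
    merged = {}      # key value -> combined row across all tables
    fields = [key]   # output column order
    for path in table_paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)
            # Keep columns in first-seen order, skipping repeats.
            for name in reader.fieldnames or []:
                if name not in fields:
                    fields.append(name)
            for row in reader:
                merged.setdefault(row[key], {}).update(row)
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, restval="NA")
        writer.writeheader()
        for key_value in sorted(merged):
            writer.writerow(merged[key_value])
    return out_path
```

A sample that is absent from one of the input tables still appears in the output, with the placeholder in the columns that table would have supplied.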
cleanup module
Clean up and organize outputs from a processing workflow batch.
monitoring package
Contains tools for monitoring the status of pipeline steps. Classes and methods here are designed to inspect files on the server and report on various indicators of state (e.g., file existence, access, completion, and size).
workflowbatches module
Monitor the outputs of a workflow processing batch.
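To make the idea of state indicators concrete, here is a small sketch that checks a list of expected output files using only the standard library. The function names `file_status` and `batch_status` and the shape of the returned dictionaries are hypothetical; the actual monitoring classes may report state differently.

```python
import os


def file_status(path):
    """Report basic state indicators for a single file."""
    exists = os.path.isfile(path)
    return {
        "path": path,
        "exists": exists,
        "readable": exists and os.access(path, os.R_OK),
        "size": os.path.getsize(path) if exists else 0,
    }


def batch_status(paths):
    """Summarize the state of a list of expected output files."""
    statuses = [file_status(p) for p in paths]
    return {
        "expected": len(statuses),
        "found": sum(s["exists"] for s in statuses),
        # Files that exist but have no content may indicate a failed step.
        "empty": [s["path"] for s in statuses if s["exists"] and s["size"] == 0],
        "missing": [s["path"] for s in statuses if not s["exists"]],
    }
```

Listing empty and missing files separately helps distinguish a step that never ran from one that ran but produced no output.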
submission package
Prepares data for batch submission through Globus Galaxy, typically
starting from unaligned samples (libraries) from a flowcell run. The
submission.batchcreate and submission.batchparameterize
modules handle most of the work. The former takes a list of sample
paths (or folders containing sample paths) and a workflow template
file and controls the preparation of a batch submit file as well as
target folders for batch outputs. The latter sets individual
parameter values (mostly input and output file paths) for each sample,
which are then used by the BatchCreator class to create and write
the overall submission instructions. The submission.flowcellsubmit
module provides a wrapper around batchcreate, allowing a user to
select workflows and generate batch submissions for multiple unaligned
projects from a flowcell run.
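The per-sample parameterization described above might look roughly like the following sketch, which derives input and output paths for each sample from its folder name. The function `parameterize_samples`, the parameter key names, and the output naming scheme are assumptions for illustration only; they do not reflect the real batch submit file format or the BatchCreator interface.

```python
import os


def parameterize_samples(sample_paths, target_dir, output_types):
    """Build one dict of input and output paths per sample."""
    params = []
    for sample_path in sample_paths:
        # Use the last path component as the sample name, ignoring
        # any trailing separator.
        sample_name = os.path.basename(os.path.normpath(sample_path))
        sample_params = {"sample_name": sample_name, "input_path": sample_path}
        for output_type in output_types:
            # One output file per sample per output type, grouped into
            # a subfolder named after the output type.
            sample_params[output_type + "_out"] = os.path.join(
                target_dir, output_type, sample_name + "_" + output_type + ".txt"
            )
        params.append(sample_params)
    return params
```

A batch creator could then iterate over these dictionaries to create the target folders and write one line of submission instructions per sample.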