
H. López-Fernández; O. Graña-Castro; A. Nogueira-Rodríguez; M. Reboiro-Jato; D. Glez-Peña (2021) Compi: a Framework for Portable and Reproducible Pipelines. PeerJ Computer Science. Volume 7: e593. ISSN: 2376-5992
Compi is an extremely simple application framework for portable computational pipelines. A computational pipeline can be seen as a set of processing steps that run one after another (or occasionally in parallel if they are independent).
There are many fields where computational pipelines constitute the main architecture of applications, such as big data analysis or bioinformatics.
Many pipelines combine third-party tools with custom-made processes. Compi is the framework that helps you turn them into a final, portable application.
Compi pipelines are defined in XML, where each task runs as an external program written in any programming language. If your pipeline is merely a combination of existing tools, you do not have to program at all: define the steps of the pipeline and its parameters, and that's it!
Building Compi apps is easy! Thanks to Docker, pipelines can be packaged in a Docker image along with their dependencies, which makes them truly portable. You only have to add the dependencies your pipeline needs to the Dockerfile we provide. You can also run your pipeline locally without Docker. Once your pipeline is ready, you can share it in our Compi hub!
If you define your pipeline with Compi, a command-line interface is provided for your users to run it. Compi is thus an application framework that takes care of user interaction, multithreaded pipeline execution and logging, which saves you time and lets you focus on the things that are specific to your application.
Compi pipelines run independent tasks in parallel, but you don't have to worry about managing parallel execution: Compi does it for you! You can also restart your pipeline from any step without repeating steps that already completed in previous runs.
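To make the idea concrete, here is a minimal Python sketch of how tasks could be grouped for parallel execution from their declared dependencies, and how steps completed in an earlier run could be skipped. It is an illustration only, not Compi's actual scheduler; the `execution_waves` function and the task names `align`, `count` and `report` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def execution_waves(deps: Dict[str, Set[str]], done: Iterable[str] = ()) -> List[List[str]]:
    """Group tasks into waves; all tasks in one wave have their
    dependencies satisfied and could therefore run in parallel.

    deps maps each task to the set of tasks that must finish before it.
    done lists tasks already completed in a previous run (they are skipped).
    """
    finished = set(done)
    pending = {task for task in deps if task not in finished}
    waves = []
    while pending:
        # A task is ready when every one of its dependencies has finished.
        ready = sorted(t for t in pending if deps[t] <= finished)
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append(ready)
        finished.update(ready)
        pending.difference_update(ready)
    return waves


# Hypothetical pipeline: "align" and "count" are independent, "report" needs both.
deps = {"align": set(), "count": set(), "report": {"align", "count"}}
print(execution_waves(deps))                           # full run
print(execution_waves(deps, done={"align", "count"}))  # restart at "report"
```

The first call yields two waves (the two independent tasks, then `report`); the second yields only `report`, because the earlier steps are treated as already done.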
Pipelines are specified in an XML file. The main purpose of this file is to define which atomic tasks your pipeline has, their dependencies (those tasks that need to be run before each task), and the parameters the user can specify and that will be used inside the tasks.
This file contains:
- `<task>` elements, which define your pipeline steps. Their dependencies are defined with the `after` attribute. Inside the element, you place the code to be run when the task starts.
- `<param>` elements, which declare and describe the parameters of your pipeline. Inside any task you can use a parameter with `${parameter_name}`.

<?xml version="1.0" encoding="UTF-8"?>
<pipeline xmlns="http://www.sing-group.org/compi/pipeline-1.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>1.0</version>
  <params>
    <param name="name" shortName="n">Your name</param>
    <param name="output" shortName="o">Output file</param>
  </params>
  <tasks>
    <task id="greetings" params="name output">
      echo "Hi ${name}" > ${output}
    </task>
    <task id="bye" after="greetings" params="name output"
      interpreter="python3 -c &quot;${task_code}&quot;">
import os
name = os.environ['name']
output = os.environ['output']
f = open(output, "a")
f.write("bye " + name + "\n")
f.close()
    </task>
  </tasks>
</pipeline>
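As a rough illustration of the structure described above, the following Python sketch reads a trimmed copy of the example pipeline with the standard library and lists its parameters, its tasks and their `after` dependencies. It then mimics `${parameter_name}` substitution with `string.Template`. This is only a reading aid: it is not how Compi itself processes the file (the `bye` task above, for instance, reads its parameters from the environment). The values `World` and `/tmp/out.txt` are made up, and treating `after` as a comma- or space-separated list is an assumption.

```python
import xml.etree.ElementTree as ET
from string import Template

# Trimmed copy of the example pipeline (the Python body of "bye" is shortened).
PIPELINE_XML = """<pipeline xmlns="http://www.sing-group.org/compi/pipeline-1.0">
  <version>1.0</version>
  <params>
    <param name="name" shortName="n">Your name</param>
    <param name="output" shortName="o">Output file</param>
  </params>
  <tasks>
    <task id="greetings" params="name output">
      echo "Hi ${name}" > ${output}
    </task>
    <task id="bye" after="greetings" params="name output"
      interpreter="python3 -c &quot;${task_code}&quot;">
import os
    </task>
  </tasks>
</pipeline>"""

# The pipeline elements live in this XML namespace, so lookups must use it.
NS = {"c": "http://www.sing-group.org/compi/pipeline-1.0"}
root = ET.fromstring(PIPELINE_XML)

# Parameter name -> human-readable description.
params = {p.get("name"): p.text.strip() for p in root.findall("c:params/c:param", NS)}

# Task id -> dependencies, declared parameters and code.
# Assumption: several dependencies may be separated by commas or spaces.
tasks = {
    t.get("id"): {
        "after": t.get("after", "").replace(",", " ").split(),
        "params": t.get("params", "").split(),
        "code": t.text.strip(),
    }
    for t in root.findall("c:tasks/c:task", NS)
}

for task_id, info in tasks.items():
    print(task_id, "runs after", info["after"] or "nothing")

# Mimic ${parameter_name} substitution for the shell task (made-up values).
rendered = Template(tasks["greetings"]["code"]).substitute(name="World", output="/tmp/out.txt")
print(rendered)
```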
Creating a pipeline is easy. You need the Compi Development Kit (the `compi-dk` command). The following steps show how to install `compi-dk` on your system, create a pipeline project and build the Docker image.
wget http://static.sing-group.org/compi/downloads/compi-dk-1.5.2-installer.bsx
sudo bash ./compi-dk-1.5.2-installer.bsx
compi-dk new-project -p /tmp/my-pipeline -n my-pipeline
cd /tmp/my-pipeline
Here you can edit your `pipeline.xml` and the `Dockerfile` to include dependencies.
compi-dk build
Now you have a new docker image named "my-pipeline".
docker run my-pipeline
Since you did not provide any pipeline parameters, this brings up the help, which shows all parameters. This is the CLI of your pipeline!
docker run -v /tmp:/data my-pipeline -n 10 -- -p1 param-one -p2 param-two -o /data/output.txt -l one,two,three
cat /tmp/output.txt
Please note that pipeline parameters are passed after `--`. The `-n` option establishes the maximum number of tasks that can run in parallel. There are multiple options to control the execution of your pipeline.
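The `--` convention can be pictured with a short Python sketch that splits an argument list into the options meant for the runner and the parameters meant for the pipeline. This is a simplified illustration of the convention, not Compi's own argument parser; the `split_cli` helper is hypothetical, and the arguments are those of the `docker run` example above.

```python
from typing import List, Tuple


def split_cli(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split arguments at the first "--" into (runner options, pipeline parameters)."""
    if "--" in argv:
        cut = argv.index("--")
        return argv[:cut], argv[cut + 1:]
    # Without a separator, everything is treated as a runner option.
    return argv, []


# Arguments that follow the image name in the docker run example.
argv = ["-n", "10", "--", "-p1", "param-one", "-p2", "param-two",
        "-o", "/data/output.txt", "-l", "one,two,three"]
runner_opts, pipeline_params = split_cli(argv)
print(runner_opts)      # options before the separator
print(pipeline_params)  # parameters after the separator
```

Here `-n 10` ends up among the runner options, while `-p1`, `-p2`, `-o` and `-l` with their values end up among the pipeline parameters.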
Date | Description | Version | Link |
---|---|---|---|
September 2, 2022 | Compi (core and CLI) - Self-extracted installer (Linux 64-bit) | 1.5 | compi-1.5.2-installer.bsx |
September 2, 2022 | Compi (core and CLI) - Portable version (Linux 64-bit) | 1.5 | compi-1.5.2.tar.gz |
April 28, 2021 | Compi (core and CLI) - Self-extracted installer (Linux 64-bit) | 1.4 | compi-1.4.2-installer.bsx |
April 28, 2021 | Compi (core and CLI) - Portable version (Linux 64-bit) | 1.4 | compi-1.4.2.tar.gz |
September 22, 2020 | Compi (core and CLI) - Self-extracted installer (Linux 64-bit) | 1.3 | compi-1.3.7-installer.bsx |
September 22, 2020 | Compi (core and CLI) - Portable version (Linux 64-bit) | 1.3 | compi-1.3.7.tar.gz |
November 28, 2019 | Compi (core and CLI) - Self-extracted installer (Linux 64-bit) | 1.2 | compi-1.2.3-installer.bsx |
November 28, 2019 | Compi (core and CLI) - Portable version (Linux 64-bit) | 1.2 | compi-1.2.3.tar.gz |
Date | Description | Version | Link |
---|---|---|---|
September 2, 2022 | Compi Development Kit - Self-extracted installer (Linux 64-bit) | 1.5 | compi-dk-1.5.2-installer.bsx |
September 2, 2022 | Compi Development Kit - Portable version (Linux 64-bit) | 1.5 | compi-dk-1.5.2.tar.gz |
April 28, 2021 | Compi Development Kit - Self-extracted installer (Linux 64-bit) | 1.4 | compi-dk-1.4.2-installer.bsx |
April 28, 2021 | Compi Development Kit - Portable version (Linux 64-bit) | 1.4 | compi-dk-1.4.2.tar.gz |
September 22, 2020 | Compi Development Kit - Self-extracted installer (Linux 64-bit) | 1.3 | compi-dk-1.3.7-installer.bsx |
September 22, 2020 | Compi Development Kit - Portable version (Linux 64-bit) | 1.3 | compi-dk-1.3.7.tar.gz |
November 28, 2019 | Compi Development Kit - Self-extracted installer (Linux 64-bit) | 1.2 | compi-dk-1.2.3-installer.bsx |
November 28, 2019 | Compi Development Kit - Portable version (Linux 64-bit) | 1.2 | compi-dk-1.2.3.tar.gz |
July 3, 2018 | Compi Development Kit - Self-extracted installer (Linux 64-bit) | 1.1 | compi-dk-1.1-installer.bsx |