.. _custom_interpreters:

Custom interpreters
********************

Interpreters let you write a task's code in **other languages** **directly** in the pipeline XML. By default, task code is interpreted with bash, but you can define a custom ``interpreter`` for any task.

How to define an interpreter for a task
=======================================

Add the ``interpreter`` attribute to the ``<task>`` element whose code will be written in another language. The ``interpreter`` attribute takes a bash script that is run **instead of** the task code. That script is then responsible for running the task code itself, normally by passing it to an interpreter such as python or awk.

Inside this script, all of the task's parameters are available as environment variables, along with:

- ``$task_code``. The task's code. You can send this code to the correct interpreter.
- ``$task_id``. The task's ``id``.
- ``$task_interpreter``. The task's ``interpreter`` itself.
- ``$task_params``. A list of the task's parameters, separated by whitespace.

.. note::
   **Portability**: To keep the pipeline as portable as possible, prefer interpreters that are available on most operating systems.

Examples
========

awk
---

Consider a ``<task>`` definition with the following code:

.. code-block:: xml

   { print $1 }

Here, the interpreter is: ``/usr/bin/awk -e '$task_code' $input_file > $output_file_awk``.

When the task is about to run, the code that actually runs is the one specified in ``interpreter``. It runs the AWK interpreter against the task's code (``-e '$task_code'``) and uses the task's ``$input_file`` and ``$output_file_awk`` parameters to feed the AWK script and to save its output, respectively.

python
------

Consider a ``<task>`` definition with the following code:

.. code-block:: xml

   import os
   # Append the value of the input_file parameter to the output file
   with open(os.environ['output_file_python'], "a") as f:
       f.write(os.environ['input_file'])

Here, the interpreter is: ``/usr/bin/python -c "$task_code"``, which simply uses the Python interpreter to run the task's code (``-c "$task_code"``). The task's code accesses the task's parameters through the process environment (``os.environ``); remember that parameters in Compi are passed to the process as environment variables.

R
-

Consider a ``<task>`` definition whose interpreter is ``docker run --rm r-base:3.6.0 R -e "$task_code"``, which uses the R interpreter (``-e``) from the ``r-base`` Docker image.

.. note::
   **Known issues with R interpreters**:

   - Older versions of R (e.g. 3.2 and 3.4) fail to process multiline code passed with the ``-e`` parameter. It is therefore recommended to use a more recent version (e.g. 3.6.0).
   - Version 3.6.0 of R also fails to process multiline code when lines are indented with tabs instead of spaces. To work around this bug, indent the code with spaces or use the following interpreter, which removes the tabs from ``$task_code`` with ``tr`` before passing it to R: ``R -e "$(echo "$task_code" | tr -d '\t')"``
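
The mechanism described in this page (the ``interpreter`` script runs instead of the task code and receives ``$task_code``, ``$task_id``, ``$task_interpreter``, ``$task_params`` and the task's parameters as environment variables) can be approximated outside Compi with a few lines of Python. This is only a sketch for experimenting with interpreter strings, not Compi's actual implementation: the ``run_with_interpreter`` helper, the ``example-task`` id and the ``data.txt`` value are hypothetical names introduced for illustration.

```python
import os
import shutil
import subprocess
import sys


def run_with_interpreter(interpreter, task_id, task_code, params):
    """Run `interpreter` as a shell script in place of the task code.

    Hypothetical helper: it only mimics the environment described in
    this page and is not part of Compi.
    """
    env = dict(os.environ)
    env.update(params)  # each task parameter becomes an environment variable
    env.update({
        "task_code": task_code,
        "task_id": task_id,
        "task_interpreter": interpreter,
        "task_params": " ".join(params),  # parameter names, space-separated
    })
    # Assumption: fall back to `sh` when bash is not installed.
    shell = shutil.which("bash") or "sh"
    result = subprocess.run(
        [shell, "-c", interpreter],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Task code written in Python; it reads its inputs from the environment.
task_code = 'import os\nprint(os.environ["task_id"] + ": " + os.environ["input_file"])'

# Same shape as the python example: the shell expands $task_code at run time.
interpreter = f'"{sys.executable}" -c "$task_code"'

output = run_with_interpreter(
    interpreter, "example-task", task_code, {"input_file": "data.txt"}
)
print(output, end="")
```

Because the shell expands ``"$task_code"`` inside double quotes, the task code reaches the interpreter as a single argument, so newlines and quotes within it are preserved.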