Handling Input and Outputs


b2luigi provides several helpers that simplify the handling of input and output targets. General usage is covered in the b2luigi.Task documentation; this section only explains some best practices.

Outputs

A task's output is defined in the task's output method in combination with its b2luigi.Task.add_to_output() method. add_to_output creates a luigi target with a unique path that depends on the task's parameters, so the bookkeeping of all output files in a task tree is already taken care of. To access the file name of an output target, use the task's b2luigi.Task.get_output_file_name() method, which always returns the full path of the output file.
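To illustrate why a parameter-dependent path keeps outputs from colliding, here is a minimal standard-library sketch. It is not b2luigi's implementation: the helper `build_output_path`, the `name=value` directory layout, and the example parameters are assumptions made for illustration only.

```python
import os


def build_output_path(result_dir, parameters, file_name):
    """Hypothetical helper: one `name=value` directory per task parameter."""
    # Sort the parameters so the same parameter set always gives the same path.
    parts = [f"{name}={value}" for name, value in sorted(parameters.items())]
    return os.path.join(result_dir, *parts, file_name)


# Two "tasks" that differ in a single parameter get two distinct output paths.
path_signal = build_output_path(
    "results", {"channel": "signal", "cut": 3}, "output.root"
)
path_background = build_output_path(
    "results", {"channel": "background", "cut": 3}, "output.root"
)
```

Because the path is derived from the parameters alone, every task in a tree can use the same file name without overwriting another task's result.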

Attention

Only use the get_output_file_name method to access the output file. This ensures that the correct file is used. If you want to work with temporary files, use the b2luigi.on_temporary_files() decorator. As long as you rely exclusively on the task's own methods, you do not have to change any of your code when working with temporary files.

To find out more about the task and its methods, check the b2luigi.Task documentation. The temporary file decorator is documented in the Temporary File Context Manager section.
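The general idea behind working with temporary files can be sketched with the standard library alone: write to a temporary file first and move it to the final location only once writing has succeeded, so an interrupted task never leaves a half-written output behind. The helper `write_atomically` below is hypothetical and only illustrates this pattern; it does not show how b2luigi.on_temporary_files() is implemented.

```python
import os
import tempfile


def write_atomically(final_path, text):
    """Hypothetical helper: write to a temporary file, then move it into place."""
    directory = os.path.dirname(final_path) or "."
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file next to the target so the final move
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial file so no broken output is left behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "param=1", "output.txt")
write_atomically(demo_path, "done")
```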

Inputs

A task's inputs can be accessed via the b2luigi.Task.get_input_file_names() method. Since a task can have multiple inputs of the same type (e.g. different background processes run through the same selection), get_input_file_names always returns a list.
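The reason for always returning a list can be shown with a small stand-alone sketch. Each upstream task is modelled here as a plain dictionary mapping an output key to a file path; this data model, the helper `collect_input_file_names`, and the example paths are assumptions for illustration, not b2luigi internals.

```python
def collect_input_file_names(upstream_outputs, key):
    """Hypothetical helper: gather every upstream file registered under `key`."""
    return [outputs[key] for outputs in upstream_outputs if key in outputs]


# Two background processes run through the same selection, so both
# register an output under the same key.
upstream_outputs = [
    {"ntuple.root": "results/process=ccbar/ntuple.root"},
    {"ntuple.root": "results/process=uubar/ntuple.root"},
]

file_names = collect_input_file_names(upstream_outputs, "ntuple.root")

# With a single upstream task the result is still a list, so calling
# code does not need a special case.
single = collect_input_file_names(upstream_outputs[:1], "ntuple.root")
```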

Sometimes a task requires multiple input files from several previous tasks, and handling these can get tedious. In this case, b2luigi.Task.get_input_file_names_from_dict() can help. This method expects the task's requires method to return a dictionary of the form:

def requires(self):
    return {
        "input_a": (Task_A(), Task_B()),
        "input_b": (Task_C(), Task_D()),
        ...
    }

The list of all inputs generated by Task_A, Task_B, … can then be accessed via the key input_a.
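The grouping by requirement key can be sketched without b2luigi as follows. Each required task is again stood in for by a plain dictionary of output key to file path; the helper `input_file_names_from_dict`, its arguments, and the paths are hypothetical and only mirror the idea described above.

```python
def input_file_names_from_dict(requirements, requirement_key, output_key):
    """Hypothetical helper: collect `output_key` files from all tasks under one requirement key."""
    return [
        outputs[output_key]
        for outputs in requirements[requirement_key]
        if output_key in outputs
    ]


# Stand-in for the dictionary returned by `requires`: every requirement key
# maps to a tuple of upstream tasks, each represented by its outputs.
requirements = {
    "input_a": (
        {"out.txt": "results/task=A/out.txt"},
        {"out.txt": "results/task=B/out.txt"},
    ),
    "input_b": (
        {"out.txt": "results/task=C/out.txt"},
        {"out.txt": "results/task=D/out.txt"},
    ),
}

files_a = input_file_names_from_dict(requirements, "input_a", "out.txt")
files_b = input_file_names_from_dict(requirements, "input_b", "out.txt")
```

Keeping the groups separate means the files from `input_a` and `input_b` can be processed differently even though all four tasks use the same output key.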