Running PynPoint#

Introduction#

The pipeline can be executed with a Python script, in interactive mode, or with a Jupyter Notebook. The main components of PynPoint are the pipeline and the three types of pipeline modules:

  1. Pypeline – The actual pipeline, which encapsulates a list of pipeline modules.

  2. ReadingModule – Module for importing data and relevant header information from FITS, HDF5, or ASCII files into the database.

  3. WritingModule – Module for exporting results from the database into FITS, HDF5, or ASCII files.

  4. ProcessingModule – Module for processing data with a specific data reduction or analysis recipe.

Initiating the Pypeline#

The pipeline is initiated by creating an instance of Pypeline:

from pynpoint import Pypeline

pipeline = Pypeline(working_place_in='/path/to/working_place',
                    input_place_in='/path/to/input_place',
                    output_place_in='/path/to/output_place')

PynPoint creates an HDF5 database called PynPoint_database.hdf5 in the working_place_in of the pipeline. This is the central data storage in which the processing results from a ProcessingModule are stored. The advantage of the HDF5 format is that reading data is much faster than from FITS files, and it is possible to quickly read subsets from large datasets.
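Once a pipeline has produced results, such a subset can be read with the get_data() method of Pypeline. A minimal sketch, assuming that a dataset with the (illustrative) tag 'science' exists and that the data_range argument selects a slice of the images:

# Read only images 0-99 of the 'science' dataset instead of the full stack
images = pipeline.get_data('science', data_range=(0, 100))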

Restoring data from an existing pipeline database is done by creating an instance of Pypeline with working_place_in pointing to the folder that contains the PynPoint_database.hdf5 file.
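A minimal sketch, assuming that /path/to/existing_working_place already contains a PynPoint_database.hdf5 from a previous run (the paths are illustrative):

# The datasets and attributes from the previous run remain available
# through this new Pypeline instance
pipeline = Pypeline(working_place_in='/path/to/existing_working_place',
                    input_place_in='/path/to/input_place',
                    output_place_in='/path/to/output_place')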

Running pipeline modules#

Input data is read into the central database with a ReadingModule. By default, PynPoint reads data from the input_place_in, but a folder can also be specified manually, which makes it possible to read data into separate database tags (e.g., dark frames, flat fields, and science data).

For example, to read the images from FITS files that are located in the default input place:

from pynpoint import FitsReadingModule

module = FitsReadingModule(name_in='read',
                           input_dir=None,
                           image_tag='science')

pipeline.add_module(module)
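Similarly, data from a manually specified folder can be read into its own database tag. A short sketch in which the folder path, the name_in value, and the 'dark' tag are illustrative:

module = FitsReadingModule(name_in='read_dark',
                           input_dir='/path/to/dark_frames',
                           image_tag='dark')

pipeline.add_module(module)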

The images from the FITS files are stored in the database as a dataset with a unique tag. This tag can be used by other pipeline modules to read the data for further processing.

The parallactic angles can be read from a text or FITS file and are attached as an attribute to a dataset:

from pynpoint import ParangReadingModule

module = ParangReadingModule(name_in='parang',
                             data_tag='science',
                             file_name='parang.dat',
                             input_dir=None)

pipeline.add_module(module)

Finally, we run all pipeline modules:

pipeline.run()

Alternatively, each pipeline module can also be run individually by its name_in value:

pipeline.run_module('read')
pipeline.run_module('parang')

Important

Some pipeline modules require pixel coordinates for certain arguments. Throughout PynPoint, pixel coordinates are zero-indexed, meaning that (x, y) = (0, 0) corresponds to the center of the pixel in the bottom-left corner of the image. This means that there is an offset of -1 in both directions with respect to the pixel coordinates of DS9, for which the center of the bottom-left pixel is (x, y) = (1, 1).
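As an illustration, a position measured in DS9 is converted to PynPoint's convention by subtracting 1 from both coordinates (the numbers are arbitrary):

# Position of a source as reported by DS9 (one-indexed)
x_ds9, y_ds9 = 102.0, 55.0

# The same position in PynPoint's zero-indexed convention
x_pynpoint, y_pynpoint = x_ds9 - 1.0, y_ds9 - 1.0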

HDF5 database#

There are several ways to access the datasets in the HDF5 database that is used by PynPoint:

  • The FitsWritingModule exports a dataset from the database into a FITS file (see the sketch after this list).

  • Several methods of the Pypeline class help to easily retrieve data and attributes from the database. For example:

    • To read a dataset:

      pipeline.get_data('tag_name')
      
    • To read an attribute of a dataset:

      pipeline.get_attribute('tag_name', 'attr_name')
      
  • The h5py Python package can be used to access the HDF5 file directly (see the sketch after this list).

  • There are external tools available such as HDFCompass or HDFView to read, inspect, and visualize data and attributes. HDFCompass is easy to use and has a basic plotting functionality. In HDFCompass, the static PynPoint attributes can be opened with the Reopen as HDF5 Attributes option.
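As an illustration of the first and third options, below is a minimal sketch that exports a dataset to a FITS file and then opens the database directly with h5py. The 'science' tag, file names, and paths are illustrative, and the FitsWritingModule arguments are assumed to follow the same pattern as the reading modules above:

import h5py

from pynpoint import FitsWritingModule

# Export the 'science' dataset to a FITS file in the default output folder
module = FitsWritingModule(name_in='write',
                           file_name='science.fits',
                           data_tag='science')

pipeline.add_module(module)
pipeline.run_module('write')

# Open the central database directly and read a subset of the images
with h5py.File('/path/to/working_place/PynPoint_database.hdf5', 'r') as hdf_file:
    print(list(hdf_file.keys()))  # print the available dataset tags
    subset = hdf_file['science'][0:10]  # read only the first 10 images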

Dataset attributes#

Apart from using get_attribute(), it is also possible to print and return all attributes of a dataset with the list_attributes() method of Pypeline:

attr_dict = pipeline.list_attributes('tag_name')

The method returns a dictionary that contains both the static and non-static attributes.
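For example, a short sketch that lists the attributes and then reads a single non-static attribute. The 'science' tag is illustrative, and it is assumed here that the parallactic angles are stored under the 'PARANG' attribute and that get_attribute() accepts a static argument to select non-static attributes:

attr_dict = pipeline.list_attributes('science')
print(attr_dict)

# Read the parallactic angles that were attached by ParangReadingModule
parang = pipeline.get_attribute('science', 'PARANG', static=False)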