Running PynPoint#
Introduction#
The pipeline can be executed with a Python script, in interactive mode, or with a Jupyter Notebook. The main components of PynPoint are the pipeline and the three types of pipeline modules:
- Pypeline – The actual pipeline, which encapsulates a list of pipeline modules.
- ReadingModule – Module for importing data and relevant header information from FITS, HDF5, or ASCII files into the database.
- WritingModule – Module for exporting results from the database into FITS, HDF5, or ASCII files.
- ProcessingModule – Module for processing data with a specific data reduction or analysis recipe.
Initiating the Pypeline#
The pipeline is initiated by creating an instance of Pypeline:
pipeline = Pypeline(working_place_in='/path/to/working_place',
                    input_place_in='/path/to/input_place',
                    output_place_in='/path/to/output_place')
PynPoint creates an HDF5 database called PynPoint_database.hdf5 in the working_place_in of the pipeline. This is the central data storage in which the processing results from a ProcessingModule are stored. The advantage of the HDF5 format is that reading data is much faster than from FITS files, and it is also possible to quickly read subsets of large datasets.
Restoring data from an already existing pipeline database can be done by creating an instance of Pypeline with working_place_in pointing to the path of the PynPoint_database.hdf5 file.
Running pipeline modules#
Input data are read into the central database with a ReadingModule. By default, PynPoint reads data from the input_place_in, but a different folder can be set manually in order to read data into separate database tags (e.g., dark frames, flat fields, and science data).
For example, to read the images from FITS files that are located in the default input place:
module = FitsReadingModule(name_in='read',
                           input_dir=None,
                           image_tag='science')

pipeline.add_module(module)
The images from the FITS files are stored in the database as a dataset with a unique tag. This tag can be used by other pipeline modules to read the data for further processing.
The parallactic angles can be read from a text or FITS file and are attached as an attribute to a dataset:
module = ParangReadingModule(name_in='parang',
                             data_tag='science',
                             file_name='parang.dat',
                             input_dir=None)

pipeline.add_module(module)
Finally, we run all pipeline modules:
pipeline.run()
Alternatively, it is also possible to run each pipeline module individually by its name_in value:
pipeline.run_module('read')
pipeline.run_module('parang')
Important
Some pipeline modules require pixel coordinates for certain arguments. Throughout PynPoint, pixel coordinates are zero-indexed, meaning that (x, y) = (0, 0) corresponds to the center of the pixel in the bottom-left corner of the image. This means that there is an offset of -1 in both directions with respect to the pixel coordinates of DS9, for which the center of the bottom-left pixel is (x, y) = (1, 1).
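The offset between the two conventions can be sketched with a hypothetical helper function (not part of PynPoint) that converts a DS9 pixel coordinate to the zero-indexed convention described above:

```python
def ds9_to_pynpoint(x_ds9, y_ds9):
    # DS9 places the center of the bottom-left pixel at (1, 1),
    # while PynPoint places it at (0, 0), so the conversion is a
    # shift of -1 in both directions.
    return x_ds9 - 1, y_ds9 - 1

# The center of the bottom-left pixel in DS9 coordinates:
print(ds9_to_pynpoint(1, 1))  # (0, 0)
```

The same shift applies to any coordinate pair, for example a star position measured in DS9 at (x, y) = (512, 512) corresponds to (511, 511) in PynPoint.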
HDF5 database#
There are several ways to access the datasets in the HDF5 database that is used by PynPoint:
- The FitsWritingModule exports a dataset from the database into a FITS file.
- Several methods of the Pypeline class help to easily retrieve data and attributes from the database. For example, to read a dataset:

pipeline.get_data('tag_name')

To read an attribute of a dataset:

pipeline.get_attribute('tag_name', 'attr_name')

- The h5py Python package can be used to access the HDF5 file directly.
- There are external tools available, such as HDFCompass or HDFView, to read, inspect, and visualize data and attributes. HDFCompass is easy to use and has basic plotting functionality. In HDFCompass, the static PynPoint attributes can be opened with the Reopen as HDF5 Attributes option.
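As a minimal sketch of the h5py approach, the following writes a small HDF5 file with the same layout that PynPoint uses (datasets keyed by tag, with attributes stored on the dataset) and reads a subset back. The tag name 'science' and the PIXSCALE attribute are illustrative assumptions, and the file is created in a temporary folder rather than an actual working place:

```python
import os
import tempfile

import h5py
import numpy as np

# Create a stand-in database file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), 'PynPoint_database.hdf5')

with h5py.File(path, 'w') as hdf:
    # A dataset of 10 images of 32x32 pixels, stored under a tag
    dset = hdf.create_dataset('science', data=np.zeros((10, 32, 32)))
    # A static attribute attached to the dataset (assumed name)
    dset.attrs['PIXSCALE'] = 0.027

with h5py.File(path, 'r') as hdf:
    # Reading a subset of a large dataset is cheap with HDF5:
    # only the first 5 frames are loaded into memory here
    images = hdf['science'][0:5]
    pixscale = hdf['science'].attrs['PIXSCALE']

print(images.shape)  # (5, 32, 32)
```

Slicing the dataset object (rather than reading it in full) is what makes the quick subset access mentioned above possible.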
Dataset attributes#
Apart from using get_attribute(), it is also possible to print and return all attributes of a dataset with the list_attributes() method of Pypeline:
attr_dict = pipeline.list_attributes('tag_name')
The method returns a dictionary that contains both the static and non-static attributes.