Baseband Helpers¶

Helpers assist with reading and writing all file formats. Currently, they only include the sequentialfile module for reading a sequence of files as a single one.

Sequential File¶

The sequentialfile module is for reading from and writing to a sequence of files as if they were a single, contiguous one. Like with file formats, there is a master sequentialfile.open function to open sequences either for reading or writing. It returns sequential file objects that have read, write, seek, tell, and close methods that work identically to their single file object counterparts. They additionally have memmap methods to read or write to files through numpy.memmap.

As an example of how to use open, we write the data from the sample VDIF file baseband/data/sample.vdif into a sequence of two files - as the sample file has two framesets - and then read the files back in. We first load the required data:

>>> from baseband import vdif
>>> from baseband.data import SAMPLE_VDIF
>>> import numpy as np
>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> d = fh.read()

We now open a sequential file object for writing:

>>> from baseband.helpers import sequentialfile as sf
>>> filenames = ["seqvdif_{0}".format(i) for i in range(2)]
>>> file_size = fh.fh_raw.seek(0, 2) // 2
>>> fwr = sf.open(filenames, mode='wb', file_size=file_size)

The first argument passed to open must be a time-ordered sequence of filenames in a list, tuple, or other subscriptable object that returns IndexError when the index is out of bounds. The read mode is ‘wb’, though note that writing using numpy.memmap (eg. required for the DADA stream writer) is only possible if mode='w+b'. file_size determines the largest size a file may reach before the next one in the sequence is opened for writing. We set file_size such that each file holds exactly one frameset.

Note

Setting file_size to a larger value than above will lead to the two files having different sizes. By default, file_size=None, meaning it can be arbitrarily large, in which case only one file will be created.

To write the data, we pass fwr to vdif.open:

>>> fw = vdif.open(fwr, 'ws', header0=fh.header0,
...                sample_rate=fh.sample_rate,
...                nthread=fh.sample_shape.nthread)
>>> fw.write(d)
>>> fw.close()    # This implicitly closes fwr.

To read the sequence and confirm their contents are identical to the sample file’s, we may again use open:

>>> frr = sf.open(filenames, mode='rb')
>>> fr = vdif.open(frr, 'rs', sample_rate=fh.sample_rate)
>>> fr.header0.time == fh.header0.time
True
>>> np.all(fr.read() == d)
True
>>> fr.close()

We can also open the second file on its own and confirm it contains the second frameset of the sample file:

>>> fsf = vdif.open(filenames[1], mode='rs', sample_rate=fh.sample_rate)
>>> fh.seek(fh.shape[0] // 2)    # Seek to start of second frameset.
20000
>>> fsf.header0.time == fh.time
True
>>> np.all(fsf.read() == fh.read())
True
>>> fsf.close()
>>> fh.close()  # Close sample file.

While sequentialfile can be used for any format, since file sequences are common for DADA, it is implicitly used if a list of files or filename template is passed to dada.open. See the DADA Usage section for details.

Reference/API¶