Baseband Helpers¶
Helpers assist with reading and writing all file formats. Currently, they only include the sequentialfile module, for reading or writing a sequence of files as if it were a single one.
Sequential File¶
The sequentialfile module is for reading from and writing to a sequence of files as if they were a single, contiguous one. As with file formats, there is a master sequentialfile.open function to open sequences either for reading or writing. It returns sequential file objects that have read, write, seek, tell, and close methods that work identically to their single file object counterparts. They additionally have memmap methods to read or write to files through numpy.memmap.
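To make the idea of "several files as one contiguous one" concrete, here is a minimal sketch of a reader in plain Python. It is only an illustration and not how baseband implements its sequential file objects: the class name ConcatReader and the temporary file names are hypothetical, and the sketch supports only read, seek, and tell.

```python
import os
import tempfile


class ConcatReader:
    """Hypothetical read-only view of several files as one byte stream."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        self.sizes = [os.path.getsize(name) for name in self.filenames]
        self.offset = 0  # Position in the combined stream.

    def tell(self):
        return self.offset

    def seek(self, offset, whence=0):
        total = sum(self.sizes)
        base = {0: 0, 1: self.offset, 2: total}[whence]
        # Clamp to the valid range of the combined stream.
        self.offset = max(0, min(base + offset, total))
        return self.offset

    def read(self, count=-1):
        if count < 0:
            count = sum(self.sizes) - self.offset
        chunks = []
        start = 0  # Combined-stream offset at which the current file begins.
        for name, size in zip(self.filenames, self.sizes):
            if count > 0 and start <= self.offset < start + size:
                with open(name, 'rb') as fh:
                    fh.seek(self.offset - start)
                    data = fh.read(count)
                chunks.append(data)
                self.offset += len(data)
                count -= len(data)
            start += size
        return b''.join(chunks)


# Two small files standing in for a file sequence.
tmpdir = tempfile.mkdtemp()
names = [os.path.join(tmpdir, 'part_{0}'.format(i)) for i in range(2)]
for name, payload in zip(names, (b'abcd', b'efgh')):
    with open(name, 'wb') as fh:
        fh.write(payload)

reader = ConcatReader(names)
reader.seek(2)
crossing = reader.read(4)  # Spans the boundary between the two files.
```

A read that starts in one file and ends in the next returns a single bytes object, which is the behaviour that lets a format reader treat the sequence as one file.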
As an example of how to use open, we write the data from the sample VDIF file baseband/data/sample.vdif into a sequence of two files (as the sample file has two framesets) and then read the files back in. We first load the required data:
>>> from baseband import vdif
>>> from baseband.data import SAMPLE_VDIF
>>> import numpy as np
>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> d = fh.read()
We now open a sequential file object for writing:
>>> from baseband.helpers import sequentialfile as sf
>>> filenames = ["seqvdif_{0}".format(i) for i in range(2)]
>>> file_size = fh.fh_raw.seek(0, 2) // 2
>>> fwr = sf.open(filenames, mode='wb', file_size=file_size)
The first argument passed to open must be a time-ordered sequence of filenames in a list, tuple, or other subscriptable object that raises IndexError when the index is out of bounds. The mode here is 'wb'; note, however, that writing using numpy.memmap (e.g., as required by the DADA stream writer) is only possible if mode='w+b'. file_size sets the largest size in bytes a file may reach before the next one in the sequence is opened for writing. We set file_size such that each file holds exactly one frameset.
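The filename sequence does not have to be a list. As a sketch of the "other subscriptable object" case, the hypothetical class below generates names from a template and signals the end of the sequence with IndexError. The class name NumberedNames and its arguments are assumptions made for illustration; it is not part of baseband and has not been tested against sequentialfile.open.

```python
class NumberedNames:
    """Hypothetical subscriptable sequence of file names."""

    def __init__(self, template, count):
        self.template = template
        self.count = count

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        # Signal the end of the sequence the same way a list would.
        if not 0 <= index < self.count:
            raise IndexError("file index {0} out of range".format(index))
        return self.template.format(index)


names = NumberedNames("seqvdif_{0}", 2)
as_list = list(names)  # Iteration stops at the first IndexError.
```

Here as_list holds the same two names as the list comprehension used above, so either form describes the same sequence.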
Note
Setting file_size to a larger value than above will lead to the two files having different sizes. By default, file_size=None, meaning it can be arbitrarily large, in which case only one file will be created.
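The effect of file_size on the resulting files can be sketched with a small stand-alone writer. This is a simplified illustration of the rollover rule described above, not baseband's writer: the function write_in_chunks, the file names, and the 100-byte payload are all hypothetical.

```python
import os
import tempfile


def write_in_chunks(filenames, data, file_size=None):
    """Hypothetical writer: fill each file up to file_size, then move on.

    Returns the number of bytes written to each file that was created.
    """
    written = []
    index = 0
    while data:
        # With file_size=None a file may grow without limit.
        limit = len(data) if file_size is None else file_size
        part, data = data[:limit], data[limit:]
        with open(filenames[index], 'wb') as fh:
            fh.write(part)
        written.append(len(part))
        index += 1
    return written


tmpdir = tempfile.mkdtemp()
names = [os.path.join(tmpdir, 'chunk_{0}'.format(i)) for i in range(2)]
payload = bytes(100)  # 100 zero bytes standing in for encoded data.

equal = write_in_chunks(names, payload, file_size=50)   # Half the total each.
uneven = write_in_chunks(names, payload, file_size=60)  # Larger limit.
single = write_in_chunks(names, payload)                # No limit.
```

With the limit at half the total, both files are the same size; a larger limit leaves the second file smaller, and no limit puts everything in the first file.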
To write the data, we pass fwr to vdif.open:
>>> fw = vdif.open(fwr, 'ws', header0=fh.header0,
...                sample_rate=fh.sample_rate,
...                nthread=fh.sample_shape.nthread)
>>> fw.write(d)
>>> fw.close() # This implicitly closes fwr.
To read the sequence and confirm that its contents are identical to the sample file's, we again use open:
>>> frr = sf.open(filenames, mode='rb')
>>> fr = vdif.open(frr, 'rs', sample_rate=fh.sample_rate)
>>> fr.header0.time == fh.header0.time
True
>>> np.all(fr.read() == d)
True
>>> fr.close()
We can also open the second file on its own and confirm it contains the second frameset of the sample file:
>>> fsf = vdif.open(filenames[1], mode='rs', sample_rate=fh.sample_rate)
>>> fh.seek(fh.shape[0] // 2) # Seek to start of second frameset.
20000
>>> fsf.header0.time == fh.time
True
>>> np.all(fsf.read() == fh.read())
True
>>> fsf.close()
>>> fh.close() # Close sample file.
While sequentialfile can be used for any format, file sequences are particularly common for DADA, so it is used implicitly when a list of files or a filename template is passed to dada.open. See the DADA Usage section for details.
Reference/API¶
baseband.helpers Package¶
baseband.helpers.sequentialfile Module¶
Functions¶
open(files[, mode, file_size, opener]) | Read or write several files as if they were one contiguous one.
Classes¶
SequentialFileReader(files[, mode, opener]) | Read several files as if they were one contiguous one.
SequentialFileWriter(files[, mode, …]) | Write several files as if they were one contiguous one.