Baseband

Welcome to the Baseband documentation! Baseband is a package for reading and writing VLBI and other radio baseband files, with the aim of simplifying and streamlining data conversion and standardization. It provides:

  • File input/output objects for supported radio baseband formats, enabling selective decoding of data into Numpy arrays, and encoding user-defined arrays into baseband formats. Supported formats are listed under specific file formats.
  • Helper objects for reading from and writing to an ordered sequence of files as if it were a single file.

Overview

Installation

Requirements

Baseband requires:

Installing Baseband

Using pip

To install Baseband with pip, run:

pip3 install baseband

Note

To run without pip potentially updating Numpy and Astropy, include the --no-deps flag.

Obtaining source code

The source code and latest development version of Baseband can be found on its GitHub repo. You can get your own clone using:

git clone git@github.com:mhvk/baseband.git

Of course, it is even better to fork it on GitHub, and then clone your own repository, so that you can more easily contribute!

Running code without installing

As Baseband is pure Python, it can be used without being built or installed, by appending the directory it is located in to the PYTHONPATH environment variable. Alternatively, you can use sys.path within Python to append the path:

import sys
sys.path.append(BASEBAND_PATH)

where BASEBAND_PATH is the directory you downloaded or cloned Baseband into.
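
A slightly more defensive variant of the snippet above is sketched below. The helper names add_to_path and is_importable are hypothetical (they are not part of Baseband). The sketch uses only the standard library, skips the append if the directory is already on the path, and offers a quick check that a package can be found:

```python
import importlib.util
import sys
from pathlib import Path


def add_to_path(directory):
    """Append ``directory`` to sys.path once and return the resolved path.

    Hypothetical helper for illustration; not part of Baseband.
    """
    path = Path(directory).expanduser().resolve()
    if not path.is_dir():
        raise NotADirectoryError(str(path))
    if str(path) not in sys.path:
        sys.path.append(str(path))
    return str(path)


def is_importable(name):
    """Return True if a top-level module or package ``name`` can be found."""
    return importlib.util.find_spec(name) is not None
```

After calling add_to_path(BASEBAND_PATH), is_importable('baseband') should return True if that directory contains the baseband package.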

Installing source code

If you want Baseband to be more broadly available, either to all users on a system, or within, say, a virtual environment, use setup.py in the root directory by calling:

python3 setup.py install

For general information on setup.py, see its documentation. Many of the setup.py options are inherited from Astropy (specifically, from the Astropy-affiliated package manager) and are described further in Astropy’s installation documentation.

Testing the installation

The root directory setup.py can also be used to test if Baseband can successfully be run on your system:

python3 setup.py test

or, inside of Python:

import baseband
baseband.test()

These tests require pytest to be installed. Further documentation can be found in the Astropy running tests documentation.

Building documentation

Note

As with Astropy, building the documentation is unnecessary unless you are writing new documentation or do not have internet access, as Baseband’s documentation is available online at baseband.readthedocs.io.

The Baseband documentation can be built again using setup.py from the root directory:

python3 setup.py build_docs

This requires Sphinx (and its dependencies) to be installed.

Getting Started

For most file formats, one can simply import baseband and use baseband.open to access the file. This gives one a filehandle from which one can read decoded samples:

>>> import baseband
>>> from baseband.data import SAMPLE_DADA
>>> fh = baseband.open(SAMPLE_DADA)
>>> fh.read(3)
array([[ -38.-38.j,  -38.-38.j],
       [ -38.-38.j,  -40. +0.j],
       [-105.+60.j,   85.-15.j]], dtype=complex64)
>>> fh.close()

For other file formats, a bit more information is needed. Below, we cover the basics of inspecting files, reading from and writing to files, and converting from one format to another. We assume that Baseband as well as NumPy and the Astropy units module have been imported:

>>> import baseband
>>> import numpy as np
>>> import astropy.units as u

Inspecting Files

Baseband allows you to quickly determine basic properties of a file, including what format it is, using the baseband.file_info function. For instance, it shows that the sample VDIF file that comes with Baseband is very short (sample files can all be found in the baseband.data module):

>>> import baseband.data
>>> baseband.file_info(baseband.data.SAMPLE_VDIF)
Stream information:
start_time = 2014-06-16T05:56:07.000000000
stop_time = 2014-06-16T05:56:07.001250000
sample_rate = 32.0 MHz
shape = (40000, 8)
format = vdif
bps = 2
complex_data = False

File information:
edv = 3
frame_rate = 1600.0 Hz
samples_per_frame = 20000
sample_shape = (8, 1)

The same function will also tell you when more information is needed. For instance, for Mark 5B files one needs the number of channels used, as well as (roughly) when the data were taken:

>>> baseband.file_info(baseband.data.SAMPLE_MARK5B)
File information:
format = mark5b
frame_rate = 6400.0 Hz
bps = 2
complex_data = False

missing:  nchan: needed to determine sample shape and rate.
          kday, ref_time: needed to infer full times.
>>> from astropy.time import Time
>>> baseband.file_info(baseband.data.SAMPLE_MARK5B, nchan=8, ref_time=Time('2014-01-01'))
Stream information:
start_time = 2014-06-13T05:30:01.000000000
stop_time = 2014-06-13T05:30:01.000625000
sample_rate = 32.0 MHz
shape = (20000, 8)
format = mark5b
bps = 2
complex_data = False

File information:
frame_rate = 6400.0 Hz
samples_per_frame = 5000
sample_shape = (8,)

The information is gleaned from info properties on the various file and stream readers (see below).
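
The numbers reported above are tied together by simple arithmetic: the sample rate is the frame rate times the number of samples per frame, and the time span is the number of complete samples divided by the sample rate. The following sketch checks this for the two outputs shown; the helper names are hypothetical and the values are copied from the listings above:

```python
def sample_rate_hz(frame_rate_hz, samples_per_frame):
    """Complete samples per second, from frames per second."""
    return frame_rate_hz * samples_per_frame


def duration_ns(nsample, rate_hz):
    """Time spanned by ``nsample`` complete samples, in nanoseconds."""
    return nsample * 1_000_000_000 // rate_hz


# Values taken from the file_info output for the VDIF and Mark 5B samples.
vdif_rate = sample_rate_hz(1600, 20000)      # 32 MHz
mark5b_rate = sample_rate_hz(6400, 5000)     # 32 MHz
vdif_ns = duration_ns(40000, vdif_rate)      # stop_time - start_time
mark5b_ns = duration_ns(20000, mark5b_rate)
```

The two durations, 1.25 ms and 0.625 ms, agree with the differences between stop_time and start_time in the listings.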

Note

The one format for which file_info works a bit differently is GSB, as this format requires separate time-stamp and raw data files. Only the timestamp file can be inspected usefully.

Reading Files

Opening Files

As shown at the very start, files can be opened with the general baseband.open function. This will try to determine the file type using file_info, load the corresponding baseband module, and then open the file using that module’s master input/output function.

Generally, if one knows the file type, one might as well work with the corresponding module directly. For instance, to explicitly use the DADA reader to open the sample DADA file included in Baseband, one can use the DADA module’s open function:

>>> from baseband import dada
>>> from baseband.data import SAMPLE_DADA
>>> fh = dada.open(SAMPLE_DADA, 'rs')
>>> fh.read(3)
array([[ -38.-38.j,  -38.-38.j],
       [ -38.-38.j,  -40. +0.j],
       [-105.+60.j,   85.-15.j]], dtype=complex64)
>>> fh.close()

In general, file I/O and data manipulation use the same syntax across all file formats. When opening Mark 4 and Mark 5B files, however, some additional arguments may need to be passed (as was the case above for inspecting a Mark 5B file, and indeed this is a good way to find out what is needed). Notes on such features and quirks of individual formats can be found in the API entries of their open functions, and within the Specific file format documentation.

For the rest of this section, we will stick to VDIF files.

Decoding Data and the Sample File Pointer

By giving the openers an 'rs' flag, which is the default, we open files in “stream reader” mode, where a file is accessed as if it were a stream of samples. For VDIF, open will then return an instance of VDIFStreamReader, which wraps a raw data file with methods to decode the binary data frames and seek to and read data samples. To decode the first 12 samples into an ndarray, we would use the read method:

>>> from baseband import vdif
>>> from baseband.data import SAMPLE_VDIF
>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> d = fh.read(12)
>>> type(d)
<... 'numpy.ndarray'>
>>> d.shape
(12, 8)
>>> d[:, 0].astype(int)    # First thread.
array([-1, -1,  3, -1,  1, -1,  3, -1,  1,  3, -1,  1])

As discussed in detail in the VDIF section, VDIF files are sequences of data frames, each of which consists of a header (which holds information like the time at which the data were taken) and a payload, or block of data. Multiple concurrent time streams can be stored within a single frame; each of these is called a “channel”. Moreover, groups of channels can be stored over multiple frames, each of which is called a “thread”. Our sample file is an “8-thread, single-channel file” (8 concurrent time streams with 1 stream per frame), and in the example above, fh.read decoded the first 12 samples from all 8 threads, mapping thread number to the second axis of the decoded data array. Reading files with multiple threads and channels will produce 3-dimensional arrays.

fh includes shape, size and ndim, which give the shape, total number of elements and dimensionality of the file’s entire dataset if it were decoded into an array. The number of complete samples in the file is given by the first element in shape, where a complete sample is the set of samples from all available threads and channels for one point in time:

>>> fh.shape    # Shape of all data from the file in decoded array form.
(40000, 8)
>>> fh.shape[0] # Number of complete samples.
40000
>>> fh.size
320000
>>> fh.ndim
2

The shape of a single complete sample, including names indicating the meaning of shape dimensions, is retrievable using:

>>> fh.sample_shape
SampleShape(nthread=8)

By default, dimensions of length unity are squeezed, or removed from the sample shape. To retain them, we can pass squeeze=False to open:

>>> fhns = vdif.open(SAMPLE_VDIF, 'rs', squeeze=False)
>>> fhns.sample_shape    # Sample shape now keeps channel dimension.
SampleShape(nthread=8, nchan=1)
>>> fhns.ndim            # fh.shape and fh.ndim also change with squeezing.
3
>>> d2 = fhns.read(12)
>>> d2.shape             # Decoded data has channel dimension.
(12, 8, 1)
>>> fhns.close()
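
Conceptually, squeezing amounts to dropping every unit-length entry from the named sample shape. The sketch below mimics this with a plain dictionary; squeeze_sample_shape is a hypothetical helper for illustration, not Baseband's implementation:

```python
def squeeze_sample_shape(named_shape):
    """Drop dimensions of length 1 from a name -> length mapping.

    Dictionaries keep insertion order, so the remaining dimensions stay
    in their original order.
    """
    return {name: n for name, n in named_shape.items() if n != 1}


full = {"nthread": 8, "nchan": 1}        # sample shape with squeeze=False
squeezed = squeeze_sample_shape(full)    # only nthread survives

# Shape of an array holding 12 complete samples after squeezing.
array_shape = (12,) + tuple(squeezed.values())
```

With these inputs, array_shape is (12, 8), matching the shape returned by fh.read(12) earlier.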

Basic information about the file is obtained either from fh.info or simply from fh itself:

>>> fh.info
Stream information:
start_time = 2014-06-16T05:56:07.000000000
stop_time = 2014-06-16T05:56:07.001250000
sample_rate = 32.0 MHz
shape = (40000, 8)
format = vdif
bps = 2
complex_data = False

File information:
edv = 3
frame_rate = 1600.0 Hz
samples_per_frame = 20000
sample_shape = (8, 1)

>>> fh
<VDIFStreamReader name=... offset=12
    sample_rate=32.0 MHz, samples_per_frame=20000,
    sample_shape=SampleShape(nthread=8),
    bps=2, complex_data=False, edv=3, station=65532,
    start_time=2014-06-16T05:56:07.000000000>

Not coincidentally, the first is identical to what we found above using file_info.

The filehandle itself also shows the offset, the current location of the sample file pointer. Above, it is at 12 since we have read in 12 (complete) samples. If we called fh.read(12) again we would get the next 12 samples. If we instead called fh.read(), it would read from the pointer’s current position to the end of the file. If we wanted all the data in one array, we would move the file pointer back to the start of file, using fh.seek, before reading:

>>> fh.seek(0)      # Seek to sample 0.  Seek returns its offset in counts.
0
>>> d_complete = fh.read()
>>> d_complete.shape
(40000, 8)

We can also move the pointer with respect to the end of file by passing 2 as a second argument:

>>> fh.seek(-100, 2)    # Second arg is 0 (start of file) by default.
39900
>>> d_end = fh.read(100)
>>> np.array_equal(d_complete[-100:], d_end)
True

-100 means 100 samples before the end of file, so d_end is equal to the last 100 entries of d_complete. Baseband only keeps the most recently accessed data frame in memory, making it possible to analyze (normally large) files through selective decoding using seek and read.
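
The meaning of the second argument follows the usual file-pointer convention (0 for start of file, 1 for current position, 2 for end of file). A minimal sketch of that convention, using a hypothetical resolve_seek helper that is not part of Baseband, is:

```python
def resolve_seek(offset, whence, current, nsample):
    """Return the absolute sample position a seek would move to.

    whence: 0 = from start of file, 1 = from current position,
    2 = from end of file (same convention as ordinary file objects).
    """
    if whence == 0:
        return offset
    if whence == 1:
        return current + offset
    if whence == 2:
        return nsample + offset
    raise ValueError("whence must be 0, 1 or 2")
```

For the 40000-sample file, resolve_seek(-100, 2, 0, 40000) gives 39900, the value returned by fh.seek(-100, 2) above.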

Note

As with file pointers in general, fh.seek will not return an error if one seeks beyond the end of file. Attempting to read beyond the end of file, however, will result in an EOFError.

To determine where the pointer is located, we use fh.tell():

>>> fh.tell()
40000
>>> fh.close()

Caution should be used when decoding large blocks of data using fh.read. For typical files, the resulting arrays are far too large to hold in memory.
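
A back-of-the-envelope estimate makes this warning concrete. The sketch below assumes 4 bytes per decoded element (single-precision floats for real-valued data). This element size is an assumption, and complex64 data, as in the DADA example, would need 8 bytes. decoded_nbytes is a hypothetical helper, and the one-hour recording is an invented example:

```python
def decoded_nbytes(nsample, sample_shape, itemsize=4):
    """Approximate memory needed to hold decoded data, in bytes.

    itemsize=4 assumes float32 elements (an assumption for illustration).
    """
    nelement = nsample
    for dim in sample_shape:
        nelement *= dim
    return nelement * itemsize


# The sample VDIF file: 40000 complete samples of 8 threads.
sample_file = decoded_nbytes(40000, (8,))

# A hypothetical one-hour recording at 32 MHz with 8 threads.
one_hour = decoded_nbytes(3600 * 32_000_000, (8,))
```

The sample file needs about 1.3 MB, while the hypothetical one-hour recording would need roughly 3.7 TB, which is why reading in chunks with seek and read is preferable.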

Seeking and Telling in Time With the Sample Pointer

We can use seek and tell with units of time rather than samples. To do this with tell, we can pass an appropriate astropy.units.Unit object to its optional unit parameter:

>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> fh.seek(40000)
40000
>>> fh.tell(unit=u.ms)
<Quantity 1.25 ms>

Passing the string 'time' reports the pointer’s location in absolute time:

>>> fh.tell(unit='time')
<Time object: scale='utc' format='isot' value=2014-06-16T05:56:07.001250000>

We can also pass an absolute astropy.time.Time, or a positive or negative time difference TimeDelta or astropy.units.Quantity to seek. If the offset is a Time object, the second argument to seek is ignored:

>>> from astropy.time.core import TimeDelta
>>> from astropy.time import Time
>>> fh.seek(TimeDelta(-5e-4, format='sec'), 2)  # Seek -0.5 ms from end.
24000
>>> fh.seek(0.25*u.ms, 1)  # Seek 0.25 ms from current position.
32000
>>> # Seek to specific time.
>>> fh.seek(Time('2014-06-16T05:56:07.001125'))
36000
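
The offsets returned above follow from multiplying each time interval by the 32 MHz sample rate. The sketch below reproduces them with integer arithmetic; us_to_samples is a hypothetical helper and the constants are taken from the sample file:

```python
SAMPLE_RATE = 32_000_000   # Hz, from the sample VDIF file
NSAMPLE = 40000            # complete samples in the file


def us_to_samples(microseconds, rate_hz=SAMPLE_RATE):
    """Convert a time interval in microseconds to a number of samples.

    Floor division is exact for the values used here.
    """
    return microseconds * rate_hz // 1_000_000


from_end = NSAMPLE + us_to_samples(-500)       # -0.5 ms from end of file
from_current = from_end + us_to_samples(250)   # +0.25 ms from there
absolute = us_to_samples(1125)                 # 1.125 ms after start_time
```

These evaluate to 24000, 32000 and 36000, the three offsets returned by fh.seek.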

We can retrieve the time of the first sample in the file using start_time, the time immediately after the last sample using stop_time, and the time of the pointer’s current location (equivalent to fh.tell(unit='time')) using time:

>>> fh.start_time
<Time object: scale='utc' format='isot' value=2014-06-16T05:56:07.000000000>
>>> fh.stop_time
<Time object: scale='utc' format='isot' value=2014-06-16T05:56:07.001250000>
>>> fh.time
<Time object: scale='utc' format='isot' value=2014-06-16T05:56:07.001125000>
>>> fh.close()

Extracting Header Information

The first header of the file is stored as the header0 attribute of the stream reader object; it gives direct access to header properties via keyword lookup:

>>> with vdif.open(SAMPLE_VDIF, 'rs') as fh:
...     header0 = fh.header0
>>> header0['frame_length']
629
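
In the VDIF specification, frame_length counts the whole frame, header included, in units of 8 bytes. The sketch below turns the value above into byte counts, assuming the 32-byte header of non-legacy VDIF; the helper names are hypothetical:

```python
HEADER_NBYTES = 32   # eight 32-bit words (non-legacy VDIF header)


def frame_nbytes(frame_length):
    """Total frame size in bytes; frame_length is in units of 8 bytes."""
    return frame_length * 8


def payload_nbytes(frame_length, header_nbytes=HEADER_NBYTES):
    """Payload size in bytes, i.e. the frame size minus the header."""
    return frame_nbytes(frame_length) - header_nbytes
```

For frame_length = 629 this gives a 5032-byte frame with a 5000-byte payload.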

The full list of keywords is available by printing out header0:

>>> header0
<VDIFHeader3 invalid_data: False,
             legacy_mode: False,
             seconds: 14363767,
             _1_30_2: 0,
             ref_epoch: 28,
             frame_nr: 0,
             vdif_version: 1,
             lg2_nchan: 0,
             frame_length: 629,
             complex_data: False,
             bits_per_sample: 1,
             thread_id: 1,
             station_id: 65532,
             edv: 3,
             sampling_unit: True,
             sampling_rate: 16,
             sync_pattern: 0xacabfeed,
             loif_tuning: 859832320,
             _7_28_4: 15,
             dbe_unit: 2,
             if_nr: 0,
             subband: 1,
             sideband: True,
             major_rev: 1,
             minor_rev: 5,
             personality: 131>

A number of derived properties, such as the time (as a Time object), are also available through the header object.

>>> header0.time
<Time object: scale='utc' format='isot' value=2014-06-16T05:56:07.000000000>
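
As an illustration of how such a derived property relates to the raw keywords, the time can be reconstructed from ref_epoch and seconds: in the VDIF specification, ref_epoch counts six-month periods since 2000-01-01 and seconds counts from the start of that period. The sketch below uses only the standard library, ignores leap seconds and sub-second frame offsets, and is not Baseband's implementation; vdif_header_time is a hypothetical name:

```python
from datetime import datetime, timedelta, timezone


def vdif_header_time(ref_epoch, seconds):
    """Approximate UTC time from VDIF ref_epoch and seconds.

    Leap seconds within the epoch and the frame number are ignored
    (a simplification for illustration).
    """
    year = 2000 + ref_epoch // 2
    month = 1 if ref_epoch % 2 == 0 else 7
    epoch = datetime(year, month, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=seconds)


# Values from the header0 listing above.
t0 = vdif_header_time(ref_epoch=28, seconds=14363767)
```

Here t0 is 2014-06-16 05:56:07 UTC, in agreement with header0.time.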

These are listed in the API for each header class. For example, the sample VDIF file’s headers are of class:

>>> type(header0)
<class 'baseband.vdif.header.VDIFHeader3'>

and so its attributes can be found here.

Reading Specific Components of the Data

By default, fh.read() returns complete samples, i.e. with all available threads, polarizations or channels. If we are only interested in decoding a subset of the complete sample, we can select specific components by passing indexing objects to the subset keyword in open. For example, if we only wanted thread 3 of the sample VDIF file:

>>> fh = vdif.open(SAMPLE_VDIF, 'rs', subset=3)
>>> fh.sample_shape
()
>>> d = fh.read(20000)
>>> d.shape
(20000,)
>>> fh.subset
(3,)
>>> fh.close()

Since by default data are squeezed, one obtains a data stream with just a single dimension. If one would like to keep all information, one has to pass squeeze=False and also make subset a list (or slice):

>>> fh = vdif.open(SAMPLE_VDIF, 'rs', subset=[3], squeeze=False)
>>> fh.sample_shape
SampleShape(nthread=1, nchan=1)
>>> d = fh.read(20000)
>>> d.shape
(20000, 1, 1)
>>> fh.close()

Data with multi-dimensional samples can be subset by passing a tuple of indexing objects with the same dimensional ordering as the (possibly squeezed) sample shape; in the case of the sample VDIF with squeeze=False, this is threads, then channels. For example, if we wished to select threads 1 and 3, and channel 0:

>>> fh = vdif.open(SAMPLE_VDIF, 'rs', subset=([1, 3], 0), squeeze=False)
>>> fh.sample_shape
SampleShape(nthread=2)
>>> fh.close()

Generally, subset accepts any object that can be used to index a numpy.ndarray, including advanced indexing (as done above, with subset=([1, 3], 0)). If possible, slices should be used instead of lists of integers, since indexing with them returns a view rather than a copy and thus avoids unnecessary processing and memory allocation. (An exception to this is VDIF threads, where the subset is used to selectively read specific threads, and thus is not used for actual slicing of the data.)
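
The view-versus-copy distinction is general NumPy behaviour and can be checked on any array. The sketch below uses an invented array rather than decoded Baseband data, and selects the same two columns once with a slice and once with a list:

```python
import numpy as np

# Invented stand-in for decoded data: 12 samples of 8 components.
data = np.arange(12 * 8, dtype=np.float32).reshape(12, 8)

by_slice = data[:, 1:4:2]    # columns 1 and 3 via a slice (basic indexing)
by_list = data[:, [1, 3]]    # same columns via a list (advanced indexing)

same_values = np.array_equal(by_slice, by_list)
slice_is_view = np.shares_memory(data, by_slice)   # no data copied
list_is_view = np.shares_memory(data, by_list)     # new array allocated
```

Both selections hold the same values, but only the slice shares memory with the original array.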

Writing to Files and Format Conversion

Writing to a File

To write data to disk, we again use open. Writing data in a particular format requires both the header and data samples. For modifying an existing file, we have both the old header and old data handy.

As a simple example, let’s read in the 8-thread, single-channel sample VDIF file and rewrite it as a single-thread, 8-channel one, which, for example, may be necessary for compatibility with DSPSR:

>>> import baseband.vdif as vdif
>>> from baseband.data import SAMPLE_VDIF
>>> fr = vdif.open(SAMPLE_VDIF, 'rs')
>>> fw = vdif.open('test_vdif.vdif', 'ws',
...                sample_rate=fr.sample_rate,
...                samples_per_frame=fr.samples_per_frame // 8,
...                nthread=1, nchan=fr.sample_shape.nthread,
...                complex_data=fr.complex_data, bps=fr.bps,
...                edv=fr.header0.edv, station=fr.header0.station,
...                time=fr.start_time)

The minimal parameters needed to generate a file are listed under the documentation for each format’s open, though comprehensive lists can be found in the documentation for each format’s stream writer class (e.g., for VDIF, it’s under VDIFStreamWriter). In practice we specify as many relevant header properties as available to obtain a particular file structure. If we possess the exact first header of the file, it can simply be passed to open via the header keyword. In the example above, though, we manually switch the values of nthread and nchan. Because VDIF EDV = 3 requires each frame’s payload to contain 5000 bytes, and nchan is now a factor of 8 larger, we decrease samples_per_frame, the number of complete (i.e. all threads and channels included) samples per frame, by a factor of 8.
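
The factor of 8 follows from how many complete samples fit in a fixed-size payload. A minimal sketch of that arithmetic, with a hypothetical samples_per_frame helper, is:

```python
def samples_per_frame(payload_nbytes, bps, nchan, complex_data=False):
    """Number of complete samples that fit in one frame's payload.

    Each complete sample takes bps bits per channel, doubled for
    complex data (real and imaginary parts).
    """
    bits_per_sample = bps * nchan * (2 if complex_data else 1)
    total_bits = payload_nbytes * 8
    if total_bits % bits_per_sample:
        raise ValueError("payload does not hold a whole number of samples")
    return total_bits // bits_per_sample


original = samples_per_frame(5000, bps=2, nchan=1)    # 8 threads, 1 channel
rewritten = samples_per_frame(5000, bps=2, nchan=8)   # 1 thread, 8 channels
```

This gives 20000 samples per frame for the original layout and 2500 for the rewritten one, i.e. fr.samples_per_frame // 8.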

Encoding samples and writing data to file is done by passing data arrays into fw’s write method. The first dimension of the arrays is sample number, and the remaining dimensions must be as given by fw.sample_shape:

>>> fw.sample_shape
SampleShape(nchan=8)

In this case, the required dimensions are the same as the arrays from fr.read. We can thus write the data to file using:

>>> while fr.tell() < fr.shape[0]:
...     fw.write(fr.read(fr.samples_per_frame))
>>> fr.close()
>>> fw.close()

For our sample file, we could simply have written

fw.write(fr.read())

instead of the loop, but for large files, reading and writing should be done in smaller chunks to minimize memory usage. Baseband stores only the data frame or frame set being read or written to in memory.
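
The loop above relies on the file length being a whole number of chunks. A small generator can make the chunking explicit and handle a shorter final chunk; chunk_sizes is a hypothetical helper, not part of Baseband:

```python
def chunk_sizes(total, chunk):
    """Yield read sizes that add up to ``total``, each at most ``chunk``."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    remaining = total
    while remaining > 0:
        n = min(chunk, remaining)
        yield n
        remaining -= n


even = list(chunk_sizes(40000, 20000))     # file is a whole number of chunks
uneven = list(chunk_sizes(45000, 20000))   # last chunk is shorter
```

With such a helper, the copy could be written as a loop over chunk_sizes(fr.shape[0] - fr.tell(), fr.samples_per_frame), calling fw.write(fr.read(n)) for each n.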

We can check the validity of our new file by re-opening it:

>>> fr = vdif.open(SAMPLE_VDIF, 'rs')
>>> fh = vdif.open('test_vdif.vdif', 'rs')
>>> fh.sample_shape
SampleShape(nchan=8)
>>> np.all(fr.read() == fh.read())
True
>>> fr.close()
>>> fh.close()

Note

One can also use the top-level open function for writing, with the file format passed in via its format argument.

File Format Conversion

It is often preferable to convert data from one file format to another that offers wider compatibility, or better fits the structure of the data. As an example, we convert the sample Mark 4 data to VDIF.

Since we don’t have a VDIF header handy, we pass the relevant Mark 4 header values into vdif.open to create one.

>>> import baseband.mark4 as mark4
>>> from baseband.data import SAMPLE_MARK4
>>> fr = mark4.open(SAMPLE_MARK4, 'rs', ntrack=64, decade=2010)
>>> spf = 640       # fanout * 160 = 640 invalid samples per Mark 4 frame
>>> fw = vdif.open('m4convert.vdif', 'ws', sample_rate=fr.sample_rate,
...                samples_per_frame=spf, nthread=1,
...                nchan=fr.sample_shape.nchan,
...                complex_data=fr.complex_data, bps=fr.bps,
...                edv=1, time=fr.start_time)

We choose edv = 1 since it’s the simplest VDIF EDV whose header includes a sampling rate. The concept of threads does not exist in Mark 4, so the file effectively has nthread = 1. As discussed in the Mark 4 documentation, the data at the start of each frame are effectively overwritten by the header and are represented by invalid samples in the stream reader. We set samples_per_frame to 640 so that each section of invalid data is captured in a single frame.

We now write the data to file, manually flagging each invalid data frame:

>>> while fr.tell() < fr.shape[0]:
...     d = fr.read(fr.samples_per_frame)
...     fw.write(d[:640], valid=False)
...     fw.write(d[640:])
>>> fr.close()
>>> fw.close()
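
To picture what the two write calls do, each Mark 4 frame maps onto a run of 640-sample VDIF frames, of which only the first is flagged invalid. The sketch below lists those flags. The value of 80000 samples per Mark 4 frame is an assumption made for illustration (check fr.samples_per_frame for the actual value), and vdif_frame_validity is a hypothetical helper:

```python
def vdif_frame_validity(mark4_samples_per_frame, vdif_samples_per_frame=640):
    """Validity flag for each VDIF frame written per Mark 4 frame.

    The first VDIF frame holds the samples overwritten by the Mark 4
    header and is marked invalid; the rest are valid.
    """
    nframe, remainder = divmod(mark4_samples_per_frame,
                               vdif_samples_per_frame)
    if remainder:
        raise ValueError("Mark 4 frame is not a whole number of VDIF frames")
    return [False] + [True] * (nframe - 1)


# 80000 is an assumed Mark 4 frame size, used only for illustration.
flags = vdif_frame_validity(80000)
```

Under that assumption, each Mark 4 frame becomes 125 VDIF frames, one invalid followed by 124 valid.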

Lastly, we check our new file:

>>> fr = mark4.open(SAMPLE_MARK4, 'rs', ntrack=64, decade=2010)
>>> fh = vdif.open('m4convert.vdif', 'rs')
>>> np.all(fr.read() == fh.read())
True
>>> fr.close()
>>> fh.close()

For file format conversion in general, we have to consider how to properly scale our data to make the best use of the dynamic range of the new encoded format. For VLBI formats like VDIF, Mark 4 and Mark 5B, samples of the same size have the same scale, which is why we did not have to rescale our data when writing 2-bits-per-sample Mark 4 data to a 2-bits-per-sample VDIF file. Rescaling is necessary, though, to convert DADA or GSB to VDIF. For examples of rescaling, see the baseband/tests/test_conversion.py file.

Reading or Writing to a Sequence of Files

Data from one continuous observation is often spread over a sequence of files. The sequentialfile module is available for reading in a sequence as if it were one contiguous file. Simple usage examples can be found in the Sequential File section. DADA data is so often stored in a file sequence that reading a time-ordered list of filenames is built into open; for details, see its API entry.

Glossary

channel
A single component of the complete sample, or a stream thereof. Channels typically represent one frequency sub-band, the output from a single antenna, or (for channelized data) one spectral or Fourier channel, i.e. one part of a Fourier spectrum.
complete sample
Set of all component samples - i.e. from all threads, polarizations, channels, etc. - for one point in time. Its dimensions are given by the sample shape.
component
One individual thread and channel, or one polarization and channel, etc. Component samples each occupy one element in decoded data arrays. A component sample is composed of one elementary sample if it is real, and two if it is complex.
data frame
A block of time-sampled data, or payload, accompanied by a header. “Frame” for short.
data frameset
In the VDIF format, the set of all data frames representing the same segment of time. Each data frame consists of sets of channels from different threads.
elementary sample
The smallest subdivision of a complete sample, i.e. the real / imaginary part of one component of a complete sample.
header
Metadata accompanying a data frame.
payload
The data within a data frame.
sample
Data from one point in time. Complete samples contain samples from all components, while elementary samples are one part of one component.
sample rate
Rate of complete samples.
sample shape
The lengths of the dimensions of the complete sample.
squeezing
The removal of any dimensions of length unity from decoded data.
stream
Timeseries of samples; may refer to all of, or a subsection of, the dataset.
subset
A subset of a complete sample, in particular one defined by the user for selective decoding.
thread
A collection of channels from the complete sample, or a stream thereof. For VDIF, each thread is carried by a separate (set of) data frame(s).

Specific file formats

Baseband’s code is subdivided into its supported file formats, and the following sections contain format specifications, usage notes, troubleshooting help and APIs for each.

VDIF

The VLBI Data Interchange Format (VDIF) was introduced in 2009 to standardize VLBI data transfer and storage. Detailed specifications are found in VDIF’s specification document.

File Structure

A VDIF file is composed of data frames. Each has a header of eight 32-bit words (32 bytes; the exception is the “legacy VDIF” format, which is four words, or 16 bytes, long), and a payload that ranges from 32 bytes to ~134 megabytes. Both are little-endian. The first four words of a VDIF header hold the same information in all VDIF files, but the last four words hold optional user-defined data. The layout of these four words is specified by the file’s extended-data version, or EDV. More detailed information on the header can be found in the tutorial for supporting a new VDIF EDV.

A data frame may carry one or multiple channels, and a stream of data frames all carrying the same (set of) channels is known as a thread and denoted by its thread ID. The collection of frames representing the same time segment (and all possible thread IDs) is called a data frameset (or just “frameset”).

Strict time and thread ID ordering of frames in the stream, while considered part of VDIF best practices, is not mandated, and cannot be guaranteed during data transmission over the internet.
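
The notion of a frameset, and the effect of unordered frames, can be illustrated by grouping frame headers by time. The sketch below represents each frame by a (seconds, frame_nr, thread_id) tuple; the data are invented and group_framesets is a hypothetical helper, not how Baseband reads framesets from file:

```python
from collections import defaultdict


def group_framesets(headers, sort=True):
    """Group (seconds, frame_nr, thread_id) tuples into framesets.

    Returns a dict mapping (seconds, frame_nr) to the list of thread IDs
    found for that time segment, optionally sorted by thread ID.
    """
    groups = defaultdict(list)
    for seconds, frame_nr, thread_id in headers:
        groups[(seconds, frame_nr)].append(thread_id)
    return {key: sorted(threads) if sort else threads
            for key, threads in sorted(groups.items())}


# Invented stream of three threads whose frames arrive out of order.
frames = [(0, 0, 1), (0, 0, 0), (0, 1, 0), (0, 0, 2), (0, 1, 2), (0, 1, 1)]
ordered = group_framesets(frames)
as_received = group_framesets(frames, sort=False)
```

Each key identifies one frameset; with sort=False the thread order within a frameset is kept as encountered in the stream.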

Usage Notes

This section covers reading and writing VDIF files with Baseband; general usage can be found under the Getting Started section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Getting Started. The examples below use the small sample file baseband/data/sample.vdif, and the numpy, astropy.units, and baseband.vdif modules:

>>> import numpy as np
>>> from baseband import vdif
>>> import astropy.units as u
>>> from baseband.data import SAMPLE_VDIF

Simple reading and writing of VDIF files can be done entirely using open. Opening in binary mode provides a normal file reader, extended with methods to read a VDIFFrameSet data container, which stores a frame set, as well as a VDIFFrame, which stores a single frame:

>>> fh = vdif.open(SAMPLE_VDIF, 'rb')
>>> fs = fh.read_frameset()
>>> fs.data.shape
(20000, 8, 1)
>>> fr = fh.read_frame()
>>> fr.data.shape
(20000, 1)
>>> fh.close()

(As with other formats, fr.data is a read-only property of the frame.)

Opening in stream mode wraps the low-level routines such that reading and writing is in units of samples. It also provides access to header information:

>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> fh
<VDIFStreamReader name=... offset=0
    sample_rate=32.0 MHz, samples_per_frame=20000,
    sample_shape=SampleShape(nthread=8),
    bps=2, complex_data=False, edv=3, station=65532,
    start_time=2014-06-16T05:56:07.000000000>
>>> d = fh.read(12)
>>> d.shape
(12, 8)
>>> d[:, 0].astype(int)  # first thread
array([-1, -1,  3, -1,  1, -1,  3, -1,  1,  3, -1,  1])
>>> fh.close()

Setting up a file for writing requires quite a bit of header information. Not coincidentally, what is given by the reader above suffices:

>>> from astropy.time import Time
>>> fw = vdif.open('try.vdif', 'ws', sample_rate=32*u.MHz,
...                samples_per_frame=20000, nchan=1, nthread=2,
...                complex_data=False, bps=2, edv=3, station=65532,
...                time=Time('2014-06-16T05:56:07.000000000'))
>>> with vdif.open(SAMPLE_VDIF, 'rs', subset=[1, 3]) as fh:
...    d = fh.read(20000)  # Get some data to write
>>> fw.write(d)
>>> fw.close()
>>> fh = vdif.open('try.vdif', 'rs')
>>> d2 = fh.read(12)
>>> np.all(d[:12] == d2)
True
>>> fh.close()

Here is a simple example to copy a VDIF file. We use the sort=False option to ensure the frames are written exactly in the same order, so the files should be identical:

>>> with vdif.open(SAMPLE_VDIF, 'rb') as fr, vdif.open('try.vdif', 'wb') as fw:
...     while True:
...         try:
...             fw.write_frameset(fr.read_frameset(sort=False))
...         except EOFError:  # No more frames to read.
...             break

For small files, one could just do:

>>> with vdif.open(SAMPLE_VDIF, 'rs') as fr, \
...         vdif.open('try.vdif', 'ws', header0=fr.header0,
...                   sample_rate=fr.sample_rate,
...                   nthread=fr.sample_shape.nthread) as fw:
...     fw.write(fr.read())

This copies everything to memory, though, and some header information is lost.

Troubleshooting

In situations where the VDIF files being handled are corrupted or modified in an unusual way, using open will likely lead to an exception being raised or to unexpected behavior. In such cases, it may still be possible to read in the data. Below, we provide a few solutions and workarounds to do so.

Note

This list is certainly incomplete. If you have an issue (solved or otherwise) you believe should be on this list, please e-mail the contributors.

AssertionError when checking EDV in header verify function

All VDIF header classes (other than VDIFLegacyHeader) check, using their verify function, that the EDV read from file matches the class EDV. If it does not, the following line

assert self.edv is None or self.edv == self['edv']

raises an AssertionError. If this occurs because the VDIF EDV is not yet supported by Baseband, support can be added by implementing a custom header class. If the EDV is supported, but the header deviates from the format found in the VLBI.org EDV registry, the best solution is to create a custom header class, then override the subclass selector in VDIFHeader. Tutorials for doing either can be found here.

EOFError encountered in _get_frame_rate when reading

When the sample rate is not input by the user and cannot be deduced from header information (for EDV = 1 or 3, the sample rate is found in the header), Baseband tries to determine the frame rate using the private method _get_frame_rate in VDIFStreamReader (and then multiplies by the samples per frame to obtain the sample rate). This function raises EOFError if the file contains less than one second of data, or is corrupt. In either case the file can still be opened by explicitly passing the sample rate to open via the sample_rate keyword.
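
One plausible way to infer a frame rate from the frame numbers alone is to watch for the frame counter wrapping back at a second boundary. The sketch below illustrates that idea and why a file shorter than one second leads to an EOFError. It is not Baseband's actual implementation, and the function name and input data are hypothetical:

```python
def frame_rate_from_frame_numbers(frame_numbers):
    """Infer frames per second from a sequence of header frame numbers.

    The counter restarts every second, so the largest number seen before
    it decreases, plus one, is the number of frames per second.
    """
    max_nr = -1
    for nr in frame_numbers:
        if nr < max_nr:          # counter wrapped: a new second started
            return max_nr + 1
        max_nr = nr
    raise EOFError("ran out of frames before the frame counter wrapped")


# Invented two-thread stream with 4 frames per second.
rate = frame_rate_from_frame_numbers([0, 0, 1, 1, 2, 2, 3, 3, 0, 0])
sample_rate = rate * 20000       # frames per second times samples per frame
```

Passing sample_rate explicitly to open avoids the need for any such inference.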

Reference/API

baseband.vdif Package

VLBI Data Interchange Format (VDIF) reader/writer

For the VDIF specification, see http://www.vlbi.org/vdif

Functions
open(name[, mode]) Open VDIF file for reading or writing.
Classes
VDIFFrame(header, payload[, valid, verify]) Representation of a VDIF data frame, consisting of a header and payload.
VDIFFrameSet(frames[, header0]) Representation of a set of VDIF frames, combining different threads.
VDIFHeader(words[, edv, verify]) VDIF Header, supporting different Extended Data Versions.
VDIFPayload(words[, header, nchan, bps, …]) Container for decoding and encoding VDIF payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.vdif.frame.VDIFFrame, baseband.vdif.frame.VDIFFrameSet, baseband.vdif.header.VDIFHeader, baseband.vdif.payload.VDIFPayload

baseband.vdif.header Module

Definitions for VLBI VDIF Headers.

Implements a VDIFHeader class used to store header words, and decode/encode the information therein.

For the VDIF specification, see http://www.vlbi.org/vdif

Classes
VDIFHeader(words[, edv, verify]) VDIF Header, supporting different Extended Data Versions.
VDIFBaseHeader(words[, edv, verify]) Base for non-legacy VDIF headers that use 8 32-bit words.
VDIFSampleRateHeader(words[, edv, verify]) Base for VDIF headers that include the sample rate (EDV= 1, 3, 4).
VDIFLegacyHeader(words[, edv, verify]) Legacy VDIF header that uses only 4 32-bit words.
VDIFHeader0(words[, edv, verify]) VDIF Header for EDV=0.
VDIFHeader1(words[, edv, verify]) VDIF Header for EDV=1.
VDIFHeader2(words[, edv, verify]) VDIF Header for EDV=2.
VDIFHeader3(words[, edv, verify]) VDIF Header for EDV=3.
VDIFMark5BHeader(words[, edv, verify]) Mark 5B over VDIF (EDV=0xab).
Variables
VDIF_HEADER_CLASSES Dict for storing VDIF header class definitions, indexed by their EDV.
Class Inheritance Diagram

Inheritance diagram of baseband.vdif.header.VDIFHeader, baseband.vdif.header.VDIFBaseHeader, baseband.vdif.header.VDIFSampleRateHeader, baseband.vdif.header.VDIFLegacyHeader, baseband.vdif.header.VDIFHeader0, baseband.vdif.header.VDIFHeader1, baseband.vdif.header.VDIFHeader2, baseband.vdif.header.VDIFHeader3, baseband.vdif.header.VDIFMark5BHeader

baseband.vdif.payload Module

Definitions for VLBI VDIF payloads.

Implements a VDIFPayload class used to store payload words, and decode to or encode from a data array.

See the VDIF specification page for payload specifications.

Functions
init_luts() Sets up the look-up tables for levels as a function of input byte.
decode_2bit(words) Decodes data stored using 2 bits per sample.
decode_4bit(words) Decodes data stored using 4 bits per sample.
encode_2bit(values) Encodes values using 2 bits per sample, packing the result into bytes.
encode_4bit(values) Encodes values using 4 bits per sample, packing the result into bytes.
Classes
VDIFPayload(words[, header, nchan, bps, …]) Container for decoding and encoding VDIF payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.vdif.payload.VDIFPayload

baseband.vdif.frame Module

Definitions for VLBI VDIF frames and frame sets.

Implements a VDIFFrame class that can be used to hold a header and a payload, providing access to the values encoded in both. Also, define a VDIFFrameSet class that combines a set of frames from different threads.

For the VDIF specification, see http://www.vlbi.org/vdif

Classes
VDIFFrame(header, payload[, valid, verify]) Representation of a VDIF data frame, consisting of a header and payload.
VDIFFrameSet(frames[, header0]) Representation of a set of VDIF frames, combining different threads.
Class Inheritance Diagram

Inheritance diagram of baseband.vdif.frame.VDIFFrame, baseband.vdif.frame.VDIFFrameSet

baseband.vdif.base Module
Functions
open(name[, mode]) Open VDIF file for reading or writing.
Classes
VDIFFileReader(fh_raw) Simple reader for VDIF files.
VDIFFileWriter(fh_raw) Simple writer for VDIF files.
VDIFStreamBase(fh_raw, header0[, …]) Base for VDIF streams.
VDIFStreamReader(fh_raw[, sample_rate, …]) VLBI VDIF format reader.
VDIFStreamWriter(fh_raw[, header0, …]) VLBI VDIF format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.vdif.base.VDIFFileReader, baseband.vdif.base.VDIFFileWriter, baseband.vdif.base.VDIFStreamBase, baseband.vdif.base.VDIFStreamReader, baseband.vdif.base.VDIFStreamWriter

MARK 5B

The Mark 5B format is the output format of the Mark 5B disk-based VLBI data system. It is described in its design specifications.

File Structure

Each data frame consists of a header consisting of four 32-bit words (16 bytes) followed by a payload of 2500 32-bit words (10000 bytes). The header contains a sync word, frame number, and timestamp (accurate to 1 ms), as well as user-specified data; see Sec. 1 of the design specifications for details. The payload supports \(2^n\) bit-streams, for \(0 \leq n \leq 5\), and the first sample of each stream corresponds precisely to the header time. Elementary samples may be 1 or 2 bits in size, with the latter being stored in two successive bit-streams. The number of channels is equal to the number of bit-streams divided by the number of bits per elementary sample (Baseband currently only supports files where all bit-streams are active). Files begin at a header (unlike for Mark 4), and an integer number of frames fits within 1 second.
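
The frame sizes above fix how many complete samples fit in a frame. The following minimal sketch works through that arithmetic; the function name mark5b_frame_layout is hypothetical (it is not part of Baseband), and the example values (8 channels, 2 bits per sample, 32 MHz) are those of the sample file used in the Usage section.

```python
HEADER_WORDS = 4        # 32-bit words in the header
PAYLOAD_WORDS = 2500    # 32-bit words in the payload
BYTES_PER_WORD = 4


def mark5b_frame_layout(nchan, bps, sample_rate_hz):
    """Return (frame_nbytes, samples_per_frame, frames_per_second)."""
    nbitstream = nchan * bps
    # The payload supports 2**n bit-streams for 0 <= n <= 5.
    if nbitstream not in (1, 2, 4, 8, 16, 32):
        raise ValueError("nchan * bps must be a power of two up to 32")
    frame_nbytes = (HEADER_WORDS + PAYLOAD_WORDS) * BYTES_PER_WORD
    payload_nbits = PAYLOAD_WORDS * BYTES_PER_WORD * 8
    samples_per_frame = payload_nbits // nbitstream
    frames_per_second, remainder = divmod(sample_rate_hz, samples_per_frame)
    if remainder:
        raise ValueError("a non-integer number of frames would fit in 1 s")
    return frame_nbytes, samples_per_frame, frames_per_second


layout = mark5b_frame_layout(nchan=8, bps=2, sample_rate_hz=32_000_000)
print(layout)  # (10016, 5000, 6400)
```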

The Mark 5B system also outputs files with the active bit-stream mask, number of frames per second, and observational metadata (Sec. 1.3 of the design specifications). Baseband does not yet use these files, and instead requires the user to specify, for example, the sample rate.

Usage

This section covers reading and writing Mark 5B files with Baseband; general usage can be found under the Getting Started section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Getting Started. The examples below use the small sample file baseband/data/sample.m5b, and the numpy, astropy.units, astropy.time.Time, and baseband.mark5b modules:

>>> import numpy as np
>>> import astropy.units as u
>>> from astropy.time import Time
>>> from baseband import mark5b
>>> from baseband.data import SAMPLE_MARK5B

Opening a Mark 5B file with open in binary mode provides a normal file reader extended with methods to read a Mark5BFrame. The number of channels, kiloday (thousands of MJD) and number of bits per sample must all be passed when using read_frame:

>>> fb = mark5b.open(SAMPLE_MARK5B, 'rb', kday=56000, nchan=8)
>>> frame = fb.read_frame()
>>> frame.shape
(5000, 8)
>>> fb.close()

Our sample file has 2-bit component samples, which is also the default for read_frame, so it does not need to be passed. Also, we may pass a reference Time object within 500 days of the observation start time to ref_time, rather than kday.
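
To see why a reference time within 500 days is enough, consider a sketch of how a full date could be recovered if only the day within a kiloday is known. This is an illustration, not Baseband's implementation: the helper names and the value 821 are hypothetical, and the sketch assumes the file itself supplies only the MJD modulo 1000.

```python
from datetime import date

MJD_EPOCH = date(1858, 11, 17)   # calendar date of MJD 0


def to_mjd(day):
    """Integer Modified Julian Date of a calendar date."""
    return (day - MJD_EPOCH).days


def resolve_kday(day_in_kiloday, ref_mjd):
    """Pick the kiloday that puts the full MJD closest to ref_mjd."""
    return 1000 * round((ref_mjd - day_in_kiloday) / 1000)


ref_mjd = to_mjd(date(2014, 6, 13))   # date of the sample observation
day_in_kiloday = 821                  # hypothetical three-digit day
kday = resolve_kday(day_in_kiloday, ref_mjd)
print(ref_mjd, kday, kday + day_in_kiloday)  # 56821 56000 56821
```

Any reference date less than 500 days from the true date rounds to the same kiloday, so the reference does not need to be precise.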

Opening as a stream wraps the low-level routines such that reading and writing is in units of samples. It also provides access to header information. Here, we also must provide nchan, sample_rate, and ref_time or kday:

>>> fh = mark5b.open(SAMPLE_MARK5B, 'rs', sample_rate=32*u.MHz, nchan=8,
...                  ref_time=Time('2014-06-13 12:00:00'))
>>> fh
<Mark5BStreamReader name=... offset=0
    sample_rate=32.0 MHz, samples_per_frame=5000,
    sample_shape=SampleShape(nchan=8), bps=2,
    start_time=2014-06-13T05:30:01.000000000>
>>> header0 = fh.header0    # To be used for writing, below.
>>> d = fh.read(10000)
>>> d.shape
(10000, 8)
>>> d[0, :3]    
array([-3.316505, -1.      ,  1.      ], dtype=float32)
>>> fh.close()

When writing to file, we again need to pass in sample_rate and nchan, though time can either be passed explicitly or inferred from the header:

>>> fw = mark5b.open('test.m5b', 'ws', header0=header0,
...                  sample_rate=32*u.MHz, nchan=8)
>>> fw.write(d)
>>> fw.close()
>>> fh = mark5b.open('test.m5b', 'rs', sample_rate=32*u.MHz,
...                  kday=56000, nchan=8)
>>> np.all(fh.read() == d)
True
>>> fh.close()

Reference/API

baseband.mark5b Package

Mark5B VLBI data reader.

Code inspired by Walter Brisken’s mark5access. See https://github.com/demorest/mark5access.

Also, for the Mark5B design, see http://www.haystack.mit.edu/tech/vlbi/mark5/mark5_memos/019.pdf

Functions
open(name[, mode]) Open Mark5B file for reading or writing.
Classes
Mark5BFrame(header, payload[, valid, verify]) Representation of a Mark 5B frame, consisting of a header and payload.
Mark5BHeader(words[, kday, ref_time, verify]) Decoder/encoder of a Mark5B Frame Header.
Mark5BPayload(words[, nchan, bps, complex_data]) Container for decoding and encoding Mark 5B payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.mark5b.frame.Mark5BFrame, baseband.mark5b.header.Mark5BHeader, baseband.mark5b.payload.Mark5BPayload

baseband.mark5b.header Module

Definitions for VLBI Mark5B Headers.

Implements a Mark5BHeader class used to store header words, and decode/encode the information therein.

For the specification, see http://www.haystack.edu/tech/vlbi/mark5/docs/Mark%205B%20users%20manual.pdf

Classes
Mark5BHeader(words[, kday, ref_time, verify]) Decoder/encoder of a Mark5B Frame Header.
Variables
CRC16 CRC polynomial used for Mark 5B Headers, as a check on the time code.
crc16 Cyclic Redundancy Check for a bitstream.
Class Inheritance Diagram

Inheritance diagram of baseband.mark5b.header.Mark5BHeader

baseband.mark5b.payload Module

Definitions for VLBI Mark 5B payloads.

Implements a Mark5BPayload class used to store payload words, and decode to or encode from a data array.

For the specification, see http://www.haystack.edu/tech/vlbi/mark5/docs/Mark%205B%20users%20manual.pdf

Functions
init_luts() Set up the look-up tables for levels as a function of input byte.
decode_2bit(words)
encode_2bit(values) Generic encoder for data stored using two bits.
Classes
Mark5BPayload(words[, nchan, bps, complex_data]) Container for decoding and encoding Mark 5B payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.mark5b.payload.Mark5BPayload

baseband.mark5b.frame Module

Definitions for VLBI Mark 5B frames.

Implements a Mark5BFrame class that can be used to hold a header and a payload, providing access to the values encoded in both.

For the specification, see http://www.haystack.edu/tech/vlbi/mark5/docs/Mark%205B%20users%20manual.pdf

Classes
Mark5BFrame(header, payload[, valid, verify]) Representation of a Mark 5B frame, consisting of a header and payload.
Class Inheritance Diagram

Inheritance diagram of baseband.mark5b.frame.Mark5BFrame

baseband.mark5b.base Module
Functions
open(name[, mode]) Open Mark5B file for reading or writing.
Classes
Mark5BFileReader(fh_raw[, kday, ref_time, …]) Simple reader for Mark 5B files.
Mark5BFileWriter(fh_raw) Simple writer for Mark 5B files.
Mark5BStreamReader(fh_raw[, sample_rate, …]) VLBI Mark 5B format reader.
Mark5BStreamWriter(fh_raw[, header0, …]) VLBI Mark 5B format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.mark5b.base.Mark5BFileReader, baseband.mark5b.base.Mark5BFileWriter, baseband.mark5b.base.Mark5BStreamReader, baseband.mark5b.base.Mark5BStreamWriter

MARK 4

The Mark 4 format is the output format of the MIT Haystack Observatory’s Mark 4 VLBI magnetic tape-based data acquisition system, and one output format of its successor, the Mark 5A hard drive-based system. The format’s specification is in the Mark IIIA/IV/VLBA design specifications.

Baseband currently only supports files that have been parity-stripped and corrected for barrel roll and data modulation.

File Structure

Mark 4 files contain up to 64 concurrent data “tracks”. Tracks are divided into 22500-bit “tape frames”; with the parity bits stripped, each consists of a 160-bit header followed by a 19840-bit payload (20000 bits in total). The header includes a timestamp (accurate to 1.25 ms), track ID, sideband, and fan-out/in factor (see below); the details of these can be found in Sec. 2.1.1 - 2.1.3 of the design specifications. The payload consists of a 1-bit stream. When recording 2-bit elementary samples, the data is split into two tracks, with one carrying the sign bit, and the other the magnitude bit.

The header takes the place of the first 160 bits of payload data, so that the first sample occurs fanout * 160 sample times after the header time. This means that a Mark 4 stream is not contiguous in time. The length of one frame ranges from 1.25 ms to 160 ms in octave steps (which ensures an integer number of frames falls within 1 minute), setting the maximum sample rate per track to 18 megabits/track/s.

Data from a single channel may be distributed to multiple tracks - “fan-out” - or multiple channels fed to one track - “fan-in”. Fan-out is used when sampling at rates higher than 18 megabits/track/s. Baseband currently only supports tracks using fan-out (“longitudinal data format”).

Baseband reconstructs the tracks into channels (reconstituting 2-bit data from two tracks into a single channel if necessary) and combines tape frame headers into a single data frame header.
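
The relations between tracks, fan-out, and channels can be made concrete with a small sketch. The function name mark4_frame_layout is hypothetical and the code is only an illustration of the arithmetic for parity-stripped data; the example values (64 tracks, fan-out 4, 2 bits per sample, 32 MHz) are assumed to match the sample file used in the Usage section.

```python
HEADER_BITS = 160                          # header bits per track
PAYLOAD_BITS = 19840                       # payload bits per track
FRAME_BITS = HEADER_BITS + PAYLOAD_BITS    # bits per track, parity stripped


def mark4_frame_layout(ntrack, fanout, bps):
    """Return (nchan, samples_per_frame, invalid_samples) for one data frame."""
    nchan, remainder = divmod(ntrack, fanout * bps)
    if remainder:
        raise ValueError("ntrack must be a multiple of fanout * bps")
    # Each channel is spread over `fanout` tracks per bit.
    samples_per_frame = FRAME_BITS * fanout
    # The header overwrites the first 160 bits of every track.
    invalid_samples = HEADER_BITS * fanout
    return nchan, samples_per_frame, invalid_samples


nchan, samples_per_frame, invalid_samples = mark4_frame_layout(
    ntrack=64, fanout=4, bps=2)
print(nchan, samples_per_frame, invalid_samples)  # 8 80000 640

# Frame duration in ms at a sample rate of 32 MHz.
frame_duration_ms = samples_per_frame * 1000 / 32_000_000
print(frame_duration_ms)  # 2.5
```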

Usage

This section covers reading and writing Mark 4 files with Baseband; general usage can be found under the Getting Started section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Getting Started. The examples below use the small sample file baseband/data/sample.m4, and the numpy, astropy.units, astropy.time.Time, and baseband.mark4 modules:

>>> import numpy as np
>>> import astropy.units as u
>>> from astropy.time import Time
>>> from baseband import mark4
>>> from baseband.data import SAMPLE_MARK4

Opening a Mark 4 file with open in binary mode provides a normal file reader but extended with methods to read a Mark4Frame. Mark 4 files generally do not start (or end) at a frame boundary, so in binary mode one has to seek the first frame using locate_frame (which will also determine the number of Mark 4 tracks, if not given explicitly). Since Mark 4 files do not store the full time information, one must pass either the decade the data was taken, or an equivalent reference Time object:

>>> fb = mark4.open(SAMPLE_MARK4, 'rb', decade=2010)
>>> fb.locate_frame()  # Locate first frame.
2696
>>> frame = fb.read_frame()
>>> frame.shape
(80000, 8)
>>> fb.close()

Opening in stream mode automatically seeks for the first frame, and wraps the low-level routines such that reading and writing is in units of samples. It also provides access to header information. Here we pass a reference Time object within 4 years of the observation start time to ref_time, rather than a decade:

>>> fh = mark4.open(SAMPLE_MARK4, 'rs', ref_time=Time('2013:100:23:00:00'))
>>> fh
<Mark4StreamReader name=... offset=0
    sample_rate=32.0 MHz, samples_per_frame=80000,
    sample_shape=SampleShape(nchan=8), bps=2,
    start_time=2014-06-16T07:38:12.47500>
>>> d = fh.read(6400)
>>> d.shape
(6400, 8)
>>> d[635:645, 0].astype(int)  # first channel
array([ 0,  0,  0,  0,  0, -1,  1,  3,  1, -1])
>>> fh.close()

As mentioned in the File Structure section, because the header takes the place of the first 160 samples of each track, the first payload sample occurs fanout * 160 sample times after the header time. The stream reader includes these overwritten samples as invalid data (zeros, by default):

>>> np.array_equal(d[:640], np.zeros((640,) + d.shape[1:]))
True

When writing to file, we need to pass in the sample rate in addition to decade. The number of tracks can be inferred from the header:

>>> fw = mark4.open('sample_mark4_segment.m4', 'ws', header0=frame.header,
...                 sample_rate=32*u.MHz, decade=2010)
>>> fw.write(frame.data)
>>> fw.close()
>>> fh = mark4.open('sample_mark4_segment.m4', 'rs',
...                 sample_rate=32.*u.MHz, decade=2010)
>>> np.all(fh.read(80000) == frame.data)
True
>>> fh.close()

Note that above we had to pass in the sample rate even when opening the file for reading; this is because there is only a single frame in the file, and hence the sample rate cannot be inferred automatically.

Reference/API

baseband.mark4 Package

Mark 4 VLBI data reader.

Code inspired by Walter Brisken’s mark5access. See https://github.com/demorest/mark5access.

The format itself is described in detail in http://www.haystack.mit.edu/tech/vlbi/mark5/docs/230.3.pdf

Functions
open(name[, mode]) Open Mark4 file for reading or writing.
Classes
Mark4Frame(header, payload[, valid, verify]) Representation of a Mark 4 frame, consisting of a header and payload.
Mark4Header(words[, ntrack, decade, …]) Decoder/encoder of a Mark 4 Header, containing all streams.
Mark4Payload(words[, header, nchan, bps, fanout]) Container for decoding and encoding Mark 4 payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.mark4.frame.Mark4Frame, baseband.mark4.header.Mark4Header, baseband.mark4.payload.Mark4Payload

baseband.mark4.header Module

Definitions for VLBI Mark 4 Headers.

Implements a Mark4Header class used to store header words, and decode/encode the information therein.

For the specification of tape Mark 4 format, see http://www.haystack.mit.edu/tech/vlbi/mark5/docs/230.3.pdf

A little bit on the disk representation is at http://adsabs.harvard.edu/abs/2003ASPC..306..123W

Functions
stream2words(stream[, track]) Convert a stream of integers to uint32 header words.
words2stream(words) Convert a set of uint32 header words to a stream of integers.
Classes
Mark4TrackHeader(words[, decade, ref_time, …]) Decoder/encoder of a Mark 4 Track Header.
Mark4Header(words[, ntrack, decade, …]) Decoder/encoder of a Mark 4 Header, containing all streams.
Variables
CRC12 CRC polynomial used for Mark 4 Headers.
crc12 Cyclic Redundancy Check for a bitstream.
Class Inheritance Diagram

Inheritance diagram of baseband.mark4.header.Mark4TrackHeader, baseband.mark4.header.Mark4Header

baseband.mark4.payload Module

Definitions for VLBI Mark 4 payloads.

Implements a Mark4Payload class used to store payload words, and decode to or encode from a data array.

For the specification, see http://www.haystack.mit.edu/tech/vlbi/mark5/docs/230.3.pdf

Functions
reorder32(x) Reorder 32-track bits to bring signs & magnitudes together.
reorder64(x) Reorder 64-track bits to bring signs & magnitudes together.
init_luts() Set up the look-up tables for levels as a function of input byte.
decode_8chan_2bit_fanout4(frame) Decode payload for 8 channels using 2 bits, fan-out 4 (64 tracks).
encode_8chan_2bit_fanout4(values) Encode payload for 8 channels using 2 bits, fan-out 4 (64 tracks).
Classes
Mark4Payload(words[, header, nchan, bps, fanout]) Container for decoding and encoding Mark 4 payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.mark4.payload.Mark4Payload

baseband.mark4.frame Module

Definitions for VLBI Mark 4 frames.

Implements a Mark4Frame class that can be used to hold a header and a payload, providing access to the values encoded in both.

For the specification, see http://www.haystack.mit.edu/tech/vlbi/mark5/docs/230.3.pdf

Classes
Mark4Frame(header, payload[, valid, verify]) Representation of a Mark 4 frame, consisting of a header and payload.
Class Inheritance Diagram

Inheritance diagram of baseband.mark4.frame.Mark4Frame

baseband.mark4.base Module
Functions
open(name[, mode]) Open Mark4 file for reading or writing.
Classes
Mark4FileReader(fh_raw[, ntrack, decade, …]) Simple reader for Mark 4 files.
Mark4FileWriter(fh_raw) Simple writer for Mark 4 files.
Mark4StreamReader(fh_raw[, sample_rate, …]) VLBI Mark 4 format reader.
Mark4StreamWriter(fh_raw[, header0, …]) VLBI Mark 4 format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.mark4.base.Mark4FileReader, baseband.mark4.base.Mark4FileWriter, baseband.mark4.base.Mark4StreamReader, baseband.mark4.base.Mark4StreamWriter

DADA

Distributed Acquisition and Data Analysis (DADA) format data files contain a single data frame consisting of an ASCII header of typically 4096 bytes followed by a payload.

Usage

This section covers reading and writing DADA files with Baseband; general usage is covered in the Getting Started section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Getting Started. The examples below use the sample file baseband/data/sample.dada, and the astropy.units and baseband.dada modules:

>>> from baseband import dada
>>> import astropy.units as u
>>> from baseband.data import SAMPLE_DADA

Single files can be opened with open in binary mode. DADA files typically consist of just a single header and payload, and can be read into a single DADAFrame.

>>> fb = dada.open(SAMPLE_DADA, 'rb')
>>> frame = fb.read_frame()
>>> frame.shape
(16000, 2, 1)
>>> frame[:3].squeeze()
array([[ -38.-38.j,  -38.-38.j],
       [ -38.-38.j,  -40. +0.j],
       [-105.+60.j,   85.-15.j]], dtype=complex64)
>>> fb.close()

Since the files can be quite large, the payload is mapped (with numpy.memmap), so that if one accesses part of the data, only the corresponding parts of the encoded payload are loaded into memory (since the sample file is encoded using 8 bits, the above example thus loads 12 bytes into memory).

Opening in stream mode wraps the low-level routines such that reading and writing is in units of samples, and provides access to header information:

>>> fh = dada.open(SAMPLE_DADA, 'rs')
>>> fh
<DADAStreamReader name=... offset=0
    sample_rate=16.0 MHz, samples_per_frame=16000,
    sample_shape=SampleShape(npol=2), bps=8,
    start_time=2013-07-02T01:39:20.000>
>>> d = fh.read(10000)
>>> d.shape
(10000, 2)
>>> d[:3]
array([[ -38.-38.j,  -38.-38.j],
       [ -38.-38.j,  -40. +0.j],
       [-105.+60.j,   85.-15.j]], dtype=complex64)
>>> fh.close()

A file can also be set up for writing as a stream:

>>> from astropy.time import Time
>>> fw = dada.open('{utc_start}.{obs_offset:016d}.000000.dada', 'ws',
...                sample_rate=16*u.MHz, samples_per_frame=5000,
...                npol=2, nchan=1, bps=8, complex_data=True,
...                time=Time('2013-07-02T01:39:20.000'))
>>> fw.write(d)
>>> fw.close()
>>> import os
>>> [f for f in sorted(os.listdir('.')) if f.startswith('2013')]
['2013-07-02-01:39:20.0000000000000000.000000.dada',
 '2013-07-02-01:39:20.0000000000020000.000000.dada']
>>> fr = dada.open('2013-07-02-01:39:20.{obs_offset:016d}.000000.dada', 'rs')
>>> d2 = fr.read()
>>> (d == d2).all()
True
>>> fr.close()

Here, we have used an even smaller size of the payload, to show how one can define multiple files. DADA data are typically stored in sequences of files. If, in place of a single filename, one passes a time-ordered list or tuple of filenames to open, it uses sequentialfile.open to read or write to them as a single contiguous file. If, as above, one passes a template string, open uses DADAFileNameSequencer to create a subscriptable filename generator, which is then passed to sequentialfile.open. (See API links for further details.)
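
The file names produced above can be reproduced with plain string formatting. The sketch below is not DADAFileNameSequencer; the helper names are hypothetical, and it assumes that obs_offset is the byte offset of each frame's payload within the observation, which is consistent with the two file names listed above.

```python
TEMPLATE = '{utc_start}.{obs_offset:016d}.000000.dada'


def frame_nbytes(samples_per_frame, npol, nchan, bps, complex_data):
    """Payload size in bytes of one frame."""
    components = 2 if complex_data else 1
    return samples_per_frame * npol * nchan * components * bps // 8


def filenames(template, utc_start, nframes, nbytes):
    """Fill in the template for each frame, using the payload's byte offset."""
    return [template.format(utc_start=utc_start, obs_offset=i * nbytes)
            for i in range(nframes)]


nbytes = frame_nbytes(samples_per_frame=5000, npol=2, nchan=1, bps=8,
                      complex_data=True)
names = filenames(TEMPLATE, '2013-07-02-01:39:20', nframes=2, nbytes=nbytes)
for name in names:
    print(name)
```

With 5000 samples per frame, 10000 written samples span two frames, hence the two files.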

Reference/API

baseband.dada Package

Distributed Acquisition and Data Analysis (DADA) format reader/writer.

Functions
open(name[, mode]) Open DADA file for reading or writing.
Classes
DADAFrame(header, payload[, valid, verify]) Representation of a DADA file, consisting of a header and payload.
DADAHeader(*args, **kwargs) DADA baseband file format header.
DADAPayload(words[, header, sample_shape, …]) Container for decoding and encoding DADA payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.dada.frame.DADAFrame, baseband.dada.header.DADAHeader, baseband.dada.payload.DADAPayload

baseband.dada.header Module

Definitions for DADA pulsar baseband headers.

Implements a DADAHeader class used to store header definitions in a FITS header, and read & write these from files.

Classes
DADAHeader(*args, **kwargs) DADA baseband file format header.
Class Inheritance Diagram

Inheritance diagram of baseband.dada.header.DADAHeader

baseband.dada.payload Module

Payload for DADA format.

Classes
DADAPayload(words[, header, sample_shape, …]) Container for decoding and encoding DADA payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.dada.payload.DADAPayload

baseband.dada.frame Module
Classes
DADAFrame(header, payload[, valid, verify]) Representation of a DADA file, consisting of a header and payload.
Class Inheritance Diagram

Inheritance diagram of baseband.dada.frame.DADAFrame

baseband.dada.base Module
Functions
open(name[, mode]) Open DADA file for reading or writing.
Classes
DADAFileNameSequencer(template, header) List-like generator of filenames using a template.
DADAFileReader(fh_raw) Simple reader for DADA files.
DADAFileWriter(fh_raw) Simple writer/mapper for DADA files.
DADAStreamBase(fh_raw, header0[, squeeze, …]) Base for DADA streams.
DADAStreamReader(fh_raw[, squeeze, subset, …]) DADA format reader.
DADAStreamWriter(fh_raw, header0[, squeeze]) DADA format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.dada.base.DADAFileNameSequencer, baseband.dada.base.DADAFileReader, baseband.dada.base.DADAFileWriter, baseband.dada.base.DADAStreamBase, baseband.dada.base.DADAStreamReader, baseband.dada.base.DADAStreamWriter

GUPPI

The GUPPI format is the output of the Green Bank Ultimate Pulsar Processing Instrument and any clones operating at other telescopes, such as PUPPI at the Arecibo Observatory. Baseband specifically supports GUPPI data taken in baseband mode, and is based on DSPSR’s implementation. While general format specifications can be found at the SERA Project and on Paul Demorest’s site, some of the header information could be invalid or not applicable, particularly with older files.

Baseband currently only supports 8-bit elementary samples.

File Structure

Each GUPPI file contains multiple (typically 128) frames, with each frame consisting of an ASCII header composed of 80-character entries, followed by a binary payload (or “block”). The header’s length is variable, but always ends with “END” followed by 77 spaces.
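
Because the header length is variable, a reader has to scan 80-character entries until it meets the END entry. The following sketch shows the idea on a made-up header; it is not Baseband's GUPPIHeader, the helper names are hypothetical, and the FITS-like "KEY = value" layout of each entry as well as the example keys and values are assumptions for illustration.

```python
RECORD_LENGTH = 80
END_RECORD = 'END' + ' ' * 77


def make_record(key, value):
    """Build one 80-character entry (a 'KEY = value' layout is assumed)."""
    return '{:<8s}= {}'.format(key, value).ljust(RECORD_LENGTH)


def read_header(text):
    """Collect entries up to the END record; return (entries, header_nbytes)."""
    entries = {}
    for start in range(0, len(text), RECORD_LENGTH):
        record = text[start:start + RECORD_LENGTH]
        if record == END_RECORD:
            return entries, start + RECORD_LENGTH
        key, _, value = record.partition('=')
        entries[key.strip()] = value.strip()
    raise EOFError('no END record found')


# Hypothetical header: two entries, the END record, then the payload start.
header = make_record('NBITS', 8) + make_record('PKTSIZE', 1024) + END_RECORD
entries, nbytes = read_header(header + 'payload bytes would follow here')
print(entries, nbytes)
```

The returned byte count is the offset at which the binary payload starts.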

How samples are stored in the payload depends on whether or not it is channels-first. A channels-first payload stores each channel’s stream in a contiguous data block, while a non-channels-first one groups the components of a complete sample together (like with other formats). In either case, for each channel, polarization samples from the same point in time are stored adjacent to one another. At the end of each channel’s data is a section of overlap samples identical to the first samples in the next payload. Baseband retains these redundant samples when reading individual GUPPI frames, but removes them when reading files as a stream.

Usage

This section covers reading and writing GUPPI files with Baseband; general usage is covered in the Getting Started section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Getting Started. The examples below use the sample PUPPI file baseband/data/sample_puppi.raw, and the astropy.units and baseband.guppi modules:

>>> from baseband import guppi
>>> import astropy.units as u
>>> from baseband.data import SAMPLE_PUPPI

Single files can be opened with open in binary mode, which provides a normal file reader, but extended with methods to read a GUPPIFrame:

>>> fb = guppi.open(SAMPLE_PUPPI, 'rb')
>>> frame = fb.read_frame()
>>> frame.shape
(1024, 2, 4)
>>> frame[:3, 0, 1]    
array([-32.-10.j, -15.-14.j,   9.-13.j], dtype=complex64)
>>> fb.close()

Since the files can be quite large, the payload is mapped (with numpy.memmap), so that if one accesses part of the data, only the corresponding parts of the encoded payload are loaded into memory (since the sample file is encoded using 8 bits, the above example thus loads 6 bytes into memory).

Opening in stream mode wraps the low-level routines such that reading and writing is in units of samples, and provides access to header information:

>>> fh = guppi.open(SAMPLE_PUPPI, 'rs')
>>> fh
<GUPPIStreamReader name=... offset=0
    sample_rate=250.0 Hz, samples_per_frame=960,
    sample_shape=SampleShape(npol=2, nchan=4), bps=8,
    start_time=2018-01-14T14:11:33.000>
>>> d = fh.read()
>>> d.shape
(3840, 2, 4)
>>> d[:3, 0, 1]    
array([-32.-10.j, -15.-14.j,   9.-13.j], dtype=complex64)
>>> fh.close()

Note that fh.samples_per_frame represents the number of samples per frame excluding overlap samples, since the stream reader works on a linearly increasing sequence of samples. Frames themselves have access to the overlap, and fh.header0.samples_per_frame returns the number of samples per frame including overlap.
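
The difference between the two numbers (1024 samples per frame with overlap, 960 without) can be illustrated with a toy model using plain lists. The helper names are hypothetical and the code does not use Baseband; it only shows how dropping each frame's trailing overlap yields a linearly increasing sequence of samples.

```python
def split_with_overlap(samples, samples_per_frame, overlap, nframes):
    """Cut a sample sequence into frames that each carry `overlap` extra samples.

    The extra samples repeat the first samples of the next frame.
    """
    return [samples[i * samples_per_frame:(i + 1) * samples_per_frame + overlap]
            for i in range(nframes)]


def join_without_overlap(frames, samples_per_frame):
    """Rebuild the contiguous stream from the non-overlap part of each frame."""
    stream = []
    for frame in frames:
        stream.extend(frame[:samples_per_frame])
    return stream


samples_per_frame, overlap, nframes = 960, 64, 4
# Integers stand in for the time-ordered samples of one channel.
samples = list(range(nframes * samples_per_frame + overlap))
frames = split_with_overlap(samples, samples_per_frame, overlap, nframes)
stream = join_without_overlap(frames, samples_per_frame)
print(len(frames[0]), len(stream))  # 1024 3840
```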

A file can also be set up for writing as a stream. Overlap must be zero when writing (so we set samples_per_frame to its stream reader value from above):

>>> from astropy.time import Time
>>> files = ['puppi_test.000{i}.raw'.format(i=i) for i in range(2)]
>>> fw = guppi.open(files, 'ws', frames_per_file=2, sample_rate=250*u.Hz,
...                 samples_per_frame=960, pktsize=1024,
...                 time=Time(58132.59135416667, format='mjd'),
...                 npol=2, nchan=4)
>>> fw.write(d)
>>> fw.close()
>>> fr = guppi.open(files, 'rs')
>>> d2 = fr.read()
>>> (d == d2).all()
True
>>> fr.close()

Here we show how we can write to a sequence of files. One may pass a time-ordered list or tuple of filenames to open, which then uses sequentialfile.open to read or write to them as a single contiguous file. Unlike when writing DADA files, which have one frame per file, we must specify the number of frames in one file. Note that typically one does not have to pass PKTSIZE, the UDP data packet size (set by the observing mode), but the sample file has small enough frames that the default of 8192 bytes is too large. Baseband only uses PKTSIZE to double-check the sample offset of the frame, so PKTSIZE must be set to a value such that each payload, excluding overlap samples, contains an integer number of packets.
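
The PKTSIZE constraint is easy to check by hand. The sketch below, with hypothetical helper names, computes the payload size of one frame of the example (960 complex samples, 2 polarizations, 4 channels, 8 bits per component) and tests whether a given packet size divides it evenly.

```python
def payload_nbytes(samples_per_frame, npol, nchan, bps=8, complex_data=True):
    """Payload size in bytes of one frame, excluding overlap samples."""
    components = 2 if complex_data else 1
    return samples_per_frame * npol * nchan * components * bps // 8


def packets_per_payload(nbytes, pktsize):
    """Number of packets in a payload, or None if it is not a whole number."""
    count, remainder = divmod(nbytes, pktsize)
    return count if remainder == 0 else None


nbytes = payload_nbytes(samples_per_frame=960, npol=2, nchan=4)
print(nbytes)                              # 15360
print(packets_per_payload(nbytes, 8192))   # None
print(packets_per_payload(nbytes, 1024))   # 15
```

The default of 8192 bytes does not divide 15360 bytes, while 1024 does, which is why pktsize=1024 is passed above.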

Reference/API

baseband.guppi Package

Green Bank Ultimate Pulsar Processing Instrument (GUPPI) format reader/writer.

Functions
open(name[, mode]) Open GUPPI file for reading or writing.
Classes
GUPPIFrame(header, payload[, valid, verify]) Representation of a GUPPI file, consisting of a header and payload.
GUPPIHeader(*args, **kwargs) GUPPI baseband file format header.
GUPPIPayload(words[, header, sample_shape, …]) Container for decoding and encoding GUPPI payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.guppi.frame.GUPPIFrame, baseband.guppi.header.GUPPIHeader, baseband.guppi.payload.GUPPIPayload

baseband.guppi.header Module

Definitions for GUPPI headers.

Implements a GUPPIHeader class that reads & writes FITS-like headers from file.

Classes
GUPPIHeader(*args, **kwargs) GUPPI baseband file format header.
Class Inheritance Diagram

Inheritance diagram of baseband.guppi.header.GUPPIHeader

baseband.guppi.payload Module

Payload for GUPPI format.

Classes
GUPPIPayload(words[, header, sample_shape, …]) Container for decoding and encoding GUPPI payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.guppi.payload.GUPPIPayload

baseband.guppi.frame Module
Classes
GUPPIFrame(header, payload[, valid, verify]) Representation of a GUPPI file, consisting of a header and payload.
Class Inheritance Diagram

Inheritance diagram of baseband.guppi.frame.GUPPIFrame

baseband.guppi.base Module
Functions
open(name[, mode]) Open GUPPI file for reading or writing.
Classes
GUPPIFileReader(fh_raw) Simple reader for GUPPI files.
GUPPIFileWriter(fh_raw) Simple writer/mapper for GUPPI files.
GUPPIStreamBase(fh_raw, header0[, squeeze, …]) Base for GUPPI streams.
GUPPIStreamReader(fh_raw[, squeeze, subset, …]) GUPPI format reader.
GUPPIStreamWriter(fh_raw, header0[, squeeze]) GUPPI format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.guppi.base.GUPPIFileReader, baseband.guppi.base.GUPPIFileWriter, baseband.guppi.base.GUPPIStreamBase, baseband.guppi.base.GUPPIStreamReader, baseband.guppi.base.GUPPIStreamWriter

GSB

The GMRT software backend (GSB) file format is the standard output of the initial correlator of the Giant Metrewave Radio Telescope (GMRT). The GSB design is described by Roy et al. (2010, Exper. Astron. 28:25-60) with further specifications and operating procedures given on the relevant GMRT/GSB pages.

File Structure

A GSB dataset consists of an ASCII file with a sequence of headers, and one or more accompanying binary data files. Each line in the header and its corresponding data comprise a data frame, though these do not have explicit divisions in the data files.

Baseband currently supports two forms of GSB data: rawdump, for storing real-valued raw voltage timestreams, and phased, for storing complex pre-channelized data from the GMRT in phased array baseband mode.

Data in rawdump format is stored in a binary file representing the voltage stream from one polarization of a single dish. Each such file is accompanied by a header file which contains GPS timestamps, in the form:

YYYY MM DD HH MM SS 0.SSSSSSSSS

In the default rawdump observing setup, samples are recorded at a rate of 33.3333… megasamples per second (Msps). Each sample is 4 bits in size, and two samples are grouped into bytes such that the oldest sample occupies the least significant bits. Each frame consists of 4 megabytes of data, or \(2^{23}\) samples; as such, the timespan of one frame is exactly 0.25165824 s.
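
The frame timespan and the packing of two samples per byte can be checked with exact arithmetic. In this sketch the helper name split_byte is hypothetical, "4 megabytes" is taken to mean \(2^{22}\) bytes, and the 4-bit codes are left as unsigned integers, since how they map onto voltage levels is not covered here.

```python
from fractions import Fraction

SAMPLE_RATE = Fraction(100_000_000, 3)   # 33.3333... Msps, in samples per second
FRAME_NBYTES = 4 * 2**20                 # 4 megabytes, taken as 2**22 bytes
SAMPLES_PER_BYTE = 2                     # two 4-bit samples per byte


def split_byte(byte):
    """Return the (older, newer) 4-bit codes packed in one byte.

    The older sample sits in the least significant bits.
    """
    return byte & 0x0F, byte >> 4


samples_per_frame = FRAME_NBYTES * SAMPLES_PER_BYTE
frame_duration = samples_per_frame / SAMPLE_RATE   # exact, in seconds
print(samples_per_frame == 2**23)   # True
print(float(frame_duration))        # 0.25165824
print(split_byte(0xA3))             # (3, 10)
```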

Data in phased format is normally spread over four binary files and one accompanying header file. The binary files come in two pairs, one for each polarization, with each pair containing the first and second half of the data of each frame.

When recording GSB in phased array voltage beam (i.e., baseband) mode, the “raw”, or pre-channelized, sample rate is either 33.3333… Msps at 8 bits per sample or 66.6666… Msps at 4 bits per sample (in the latter case, sample bit-ordering is the same as for rawdump). Channelization via fast Fourier transform sets the channelized complete sample rate to the raw rate divided by \(2N_\mathrm{F}\), where \(N_\mathrm{F}\) is the number of Fourier channels (either 256 or 512). The timespan of one frame is 0.25165824 s, and one frame is 8 megabytes in size, for either raw sample rate.
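
As a cross-check of the numbers quoted above, the sketch below redoes the arithmetic with exact fractions. It assumes that "8 megabytes" means \(8 \times 2^{20}\) bytes; the variable and function names are illustrative and not part of Baseband.

```python
from fractions import Fraction

MSPS = 10**6
raw_rate_8bit = Fraction(100, 3) * MSPS   # 33.3333... Msps, 8 bits/sample
raw_rate_4bit = Fraction(200, 3) * MSPS   # 66.6666... Msps, 4 bits/sample
frame_nbytes = 8 * 2**20                  # assumed meaning of "8 megabytes"

# Raw samples per frame follow from the frame size and bits per sample.
raw_samples_8bit = frame_nbytes * 8 // 8
raw_samples_4bit = frame_nbytes * 8 // 4

timespan_8bit = raw_samples_8bit / raw_rate_8bit
timespan_4bit = raw_samples_4bit / raw_rate_4bit
print(float(timespan_8bit), float(timespan_4bit))


def channelized_rate(raw_rate, nchan):
    """Complete sample rate after an FFT into ``nchan`` channels."""
    return raw_rate / (2 * nchan)


for nchan in (256, 512):
    rate = channelized_rate(raw_rate_8bit, nchan)
    # Complete samples per frame = rate * timespan.
    print(nchan, float(rate), rate * timespan_8bit)
```

Both raw sample rates give the same frame timespan, and the number of complete samples per frame comes out as an integer for either channel count.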

The phased header’s structure is:

<PC TIME> <GPS TIME> <SEQ NUMBER> <MEM BLOCK>

where <PC TIME> and <GPS TIME> are the less accurate computer-based and exact GPS-based timestamps, respectively, with the same format as the rawdump timestamp; <SEQ NUMBER> is the frame number; and <MEM BLOCK> a redundant modulo-8 shared memory block number.
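
To make the field layout concrete, here is a minimal sketch that splits one phased header line into its four parts. The helper names and the example line are made up for illustration; this is not how Baseband parses headers, and real files should be read with `gsb.open`.

```python
from datetime import datetime


def parse_gsb_time(tokens):
    """Convert 'YYYY MM DD HH MM SS 0.SSS...' tokens to (datetime, fraction)."""
    year, month, day, hour, minute, second = (int(t) for t in tokens[:6])
    return datetime(year, month, day, hour, minute, second), float(tokens[6])


def parse_phased_line(line):
    """Split a phased header line into PC time, GPS time, and counters."""
    tokens = line.split()
    if len(tokens) != 16:
        raise ValueError("expected 16 whitespace-separated fields")
    return {
        "pc_time": parse_gsb_time(tokens[0:7]),
        "gps_time": parse_gsb_time(tokens[7:14]),
        "seq_nr": int(tokens[14]),
        "mem_block": int(tokens[15]),
    }


# Made-up example line, for illustration only.
line = "2013 07 27 02 53 55 0.750840 2013 07 27 02 53 55 0.762598912 2994 2"
header = parse_phased_line(line)
print(header["gps_time"], header["seq_nr"], header["mem_block"])
```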

Usage Notes

This section covers reading and writing GSB files with Baseband; general usage is covered in the Getting Started section. While Baseband features the general baseband.open and baseband.file_info functions, these cannot read GSB binary files without the accompanying timestamp file (at which point it is obvious the files are GSB). baseband.file_info, however, can be used on the timestamp file to determine if it is in rawdump or phased format.

The examples below use the sample files in the baseband/data/gsb/ directory, and the numpy, astropy.units and baseband.gsb modules:

>>> import numpy as np
>>> import astropy.units as u
>>> from baseband import gsb
>>> from baseband.data import (
...     SAMPLE_GSB_RAWDUMP, SAMPLE_GSB_RAWDUMP_HEADER,
...     SAMPLE_GSB_PHASED, SAMPLE_GSB_PHASED_HEADER)

A single timestamp file can be opened with open in text mode:

>>> ft = gsb.open(SAMPLE_GSB_RAWDUMP_HEADER, 'rt')
>>> ft.read_timestamp()
<GSBRawdumpHeader gps: 2015 04 27 18 45 00 0.000000240>
>>> ft.close()

Reading payloads requires the samples per frame or the sample rate. For phased data, the sample rate is:

sample_rate = raw_sample_rate / (2 * nchan)

where the raw sample rate is the pre-channelized one, and nchan the number of Fourier channels. The samples per frame for both rawdump and phased is:

samples_per_frame = timespan_of_frame * sample_rate

Note

Since the number of samples per frame is an integer while both the frame timespan and the sample rate are not, it is better to calculate samples_per_frame separately rather than multiplying timespan_of_frame by sample_rate, in order to avoid rounding issues.

Alternatively, if the size of the frame buffer and the frame rate are known, the former can be used to determine samples_per_frame, and the latter used to determine sample_rate by inverting the above equation.

If samples_per_frame is not given, Baseband assumes it is the equivalent of 4 megabytes of data for rawdump, or 8 megabytes if phased. If sample_rate is not given, it is calculated from samples_per_frame assuming timespan_of_frame = 0.25165824 (see File Structure above).
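
The advice above can be sketched as follows: derive samples_per_frame from integer quantities (the frame buffer size and the sample layout), then obtain the sample rate by inverting the equation. The helper names are hypothetical, and the sketch assumes that bps counts bits per real or imaginary component; it illustrates the arithmetic rather than Baseband's internals.

```python
from fractions import Fraction

TIMESPAN = Fraction(25165824, 10**8)   # 0.25165824 s, held exactly


def samples_per_frame_from_buffer(frame_nbytes, bps, nchan=1,
                                  complex_data=False):
    """Complete samples that fit in a frame buffer of ``frame_nbytes``."""
    bits_per_complete_sample = bps * nchan * (2 if complex_data else 1)
    return frame_nbytes * 8 // bits_per_complete_sample


def sample_rate_from_samples(samples_per_frame, timespan=TIMESPAN):
    """Invert samples_per_frame = timespan_of_frame * sample_rate."""
    return samples_per_frame / timespan


rawdump_spf = samples_per_frame_from_buffer(4 * 2**20, bps=4)
rawdump_rate = sample_rate_from_samples(rawdump_spf)
print(rawdump_spf, float(rawdump_rate))

# The floating-point product is not guaranteed to be an exact integer,
# which is why samples_per_frame is better derived from integers.
print(0.25165824 * (100e6 / 3))
```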

A single payload file can be opened with open in binary mode. For our sample files, we have to take into account that, in order to keep them small, their frames have been reduced to only 4 or 8 kilobytes worth of samples (for the default timespan). So, we define their samples per frame here, and use that to calculate payload_nbytes, the size of one frame in bytes. Since rawdump samples are 4 bits, payload_nbytes is just samples_per_frame / 2:

>>> rawdump_samples_per_frame = 2**13
>>> payload_nbytes = rawdump_samples_per_frame // 2
>>> fb = gsb.open(SAMPLE_GSB_RAWDUMP, 'rb', payload_nbytes=payload_nbytes,
...               nchan=1, bps=4, complex_data=False)
>>> payload = fb.read_payload()
>>> payload[:4]
array([[ 0.],
       [-2.],
       [-2.],
       [ 0.]], dtype=float32)
>>> fb.close()

(payload_nbytes for phased data is the size of one frame divided by the number of binary files.)

Opening in stream mode allows timestamp and binary files to be read in concert to create data frames, and also wraps the low-level routines such that reading and writing is in units of samples, and provides access to header information.

When opening a rawdump file in stream mode, we pass the timestamp file as the first argument, and the binary file to the raw keyword. As per above, we also pass samples_per_frame:

>>> fh_rd = gsb.open(SAMPLE_GSB_RAWDUMP_HEADER, mode='rs',
...                  raw=SAMPLE_GSB_RAWDUMP,
...                  samples_per_frame=rawdump_samples_per_frame)
>>> fh_rd.header0
<GSBRawdumpHeader gps: 2015 04 27 18 45 00 0.000000240>
>>> dr = fh_rd.read()
>>> dr.shape
(81920,)
>>> dr[:3]
array([ 0., -2., -2.], dtype=float32)
>>> fh_rd.close()

To open a phased fileset in stream mode, we package the binary files into a nested tuple with the format:

((L pol stream 1, L pol stream 2), (R pol stream 1, R pol stream 2))

The nested tuple is passed to raw (note that we again have to pass a non-default samples_per_frame):

>>> phased_samples_per_frame = 2**3
>>> fh_ph = gsb.open(SAMPLE_GSB_PHASED_HEADER, mode='rs',
...                  raw=SAMPLE_GSB_PHASED,
...                  samples_per_frame=phased_samples_per_frame)
>>> header0 = fh_ph.header0     # To be used for writing, below.
>>> dp = fh_ph.read()
>>> dp.shape
(80, 2, 512)
>>> dp[0, 0, :3]    
array([30.+12.j, -1. +8.j,  7.+19.j], dtype=complex64)
>>> fh_ph.close()

To set up a file for writing, we need to pass names for both timestamp and raw files, as well as sample_rate, samples_per_frame, and either the first header or a time object. We first calculate sample_rate:

>>> timespan = 0.25165824 * u.s
>>> rawdump_sample_rate = (rawdump_samples_per_frame / timespan).to(u.MHz)
>>> phased_sample_rate = (phased_samples_per_frame / timespan).to(u.MHz)

To write a rawdump file:

>>> from astropy.time import Time
>>> fw_rd = gsb.open('test_rawdump.timestamp',
...                  mode='ws', raw='test_rawdump.dat',
...                  sample_rate=rawdump_sample_rate,
...                  samples_per_frame=rawdump_samples_per_frame,
...                  time=Time('2015-04-27T13:15:00'))
>>> fw_rd.write(dr)
>>> fw_rd.close()
>>> fh_rd = gsb.open('test_rawdump.timestamp', mode='rs',
...                  raw='test_rawdump.dat',
...                  sample_rate=rawdump_sample_rate,
...                  samples_per_frame=rawdump_samples_per_frame)
>>> np.all(dr == fh_rd.read())
True
>>> fh_rd.close()

To write a phased file, we need to pass a nested tuple of filenames or filehandles:

>>> test_phased_bin = (('test_phased_pL1.dat', 'test_phased_pL2.dat'),
...                    ('test_phased_pR1.dat', 'test_phased_pR2.dat'))
>>> fw_ph = gsb.open('test_phased.timestamp',
...                  mode='ws', raw=test_phased_bin,
...                  sample_rate=phased_sample_rate,
...                  samples_per_frame=phased_samples_per_frame,
...                  header0=header0)
>>> fw_ph.write(dp)
>>> fw_ph.close()
>>> fh_ph = gsb.open('test_phased.timestamp', mode='rs',
...                  raw=test_phased_bin,
...                  sample_rate=phased_sample_rate,
...                  samples_per_frame=phased_samples_per_frame)
>>> np.all(dp == fh_ph.read())
True
>>> fh_ph.close()

Baseband does not use the PC time in the phased header, and, when writing, simply uses the same time for both GPS and PC times. Since the PC time can drift from the GPS time by several tens of milliseconds, test_phased.timestamp will not be identical to SAMPLE_GSB_PHASED_HEADER, even though we have written the exact same data to file.

Reference/API

baseband.gsb Package

GMRT Software Backend (GSB) data reader.

See http://gmrt.ncra.tifr.res.in/gmrt_hpage/sub_system/gmrt_gsb/index.htm

Functions
open(name[, mode]) Open GSB file(s) for reading or writing.
Classes
GSBFrame(header, payload[, valid, verify]) Frame encapsulating GSB rawdump or phased data.
GSBHeader(words[, mode, nbytes, utc_offset, …]) GSB Header, based on a line from a timestamp file.
GSBPayload(words[, sample_shape, bps, …]) Container for decoding and encoding GSB payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.gsb.frame.GSBFrame, baseband.gsb.header.GSBHeader, baseband.gsb.payload.GSBPayload

baseband.gsb.header Module

Definitions for GSB Headers, using the timestamp files.

Somewhat out-of-date descriptions exist for phased data, http://gmrt.ncra.tifr.res.in/gmrt_hpage/sub_system/gmrt_gsb/GSB_beam_timestamp_note_v1.pdf, and for rawdump data, http://gmrt.ncra.tifr.res.in/gmrt_hpage/sub_system/gmrt_gsb/GSB_rawdump_data_format_v2.pdf

Classes
TimeGSB(val1, val2, scale, precision, …[, …]) GSB header date-time format YYYY MM DD HH MM SS 0.SSSSSSSSS.
GSBHeader(words[, mode, nbytes, utc_offset, …]) GSB Header, based on a line from a timestamp file.
GSBRawdumpHeader(words[, mode, nbytes, …]) GSB rawdump header.
GSBPhasedHeader(words[, mode, nbytes, …]) GSB phased header.
Class Inheritance Diagram

Inheritance diagram of baseband.gsb.header.TimeGSB, baseband.gsb.header.GSBHeader, baseband.gsb.header.GSBRawdumpHeader, baseband.gsb.header.GSBPhasedHeader

baseband.gsb.payload Module

Definitions for GSB payloads.

Implements a GSBPayload class used to store payload blocks, and decode to or encode from a data array.

See http://gmrt.ncra.tifr.res.in/gmrt_hpage/sub_system/gmrt_gsb/index.htm

Classes
GSBPayload(words[, sample_shape, bps, …]) Container for decoding and encoding GSB payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.gsb.payload.GSBPayload

baseband.gsb.frame Module
Classes
GSBFrame(header, payload[, valid, verify]) Frame encapsulating GSB rawdump or phased data.
Class Inheritance Diagram

Inheritance diagram of baseband.gsb.frame.GSBFrame

baseband.gsb.base Module
Functions
open(name[, mode]) Open GSB file(s) for reading or writing.
Classes
GSBFileReader(fh_raw, payload_nbytes[, …]) Simple reader for GSB data files.
GSBFileWriter(fh_raw) Simple writer for GSB data files.
GSBStreamReader(fh_ts, fh_raw[, …]) GSB format reader.
GSBStreamWriter(fh_ts, fh_raw[, header0, …]) GSB format writer.
Class Inheritance Diagram

Inheritance diagram of baseband.gsb.base.GSBFileReader, baseband.gsb.base.GSBFileWriter, baseband.gsb.base.GSBStreamReader, baseband.gsb.base.GSBStreamWriter

Core framework and utilities

These sections contain APIs and usage notes for the sequential file opener, the API for the set of core utility functions and classes located in vlbi_base, and sample data that come with baseband (mostly used for testing).

Baseband Helpers

Helpers assist with reading and writing all file formats. Currently, they only include the sequentialfile module for reading a sequence of files as a single one.

Sequential File

The sequentialfile module is for reading from and writing to a sequence of files as if they were a single, contiguous one. As with file formats, there is a master sequentialfile.open function to open sequences either for reading or writing. It returns sequential file objects that have read, write, seek, tell, and close methods that work identically to their single file object counterparts. They additionally have memmap methods to read or write to files through numpy.memmap.

As an example of how to use open, we write the data from the sample VDIF file baseband/data/sample.vdif into a sequence of two files - as the sample file has two framesets - and then read the files back in. We first load the required data:

>>> from baseband import vdif
>>> from baseband.data import SAMPLE_VDIF
>>> import numpy as np
>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> d = fh.read()

We now open a sequential file object for writing:

>>> from baseband.helpers import sequentialfile as sf
>>> filenames = ["seqvdif_{0}".format(i) for i in range(2)]
>>> file_size = fh.fh_raw.seek(0, 2) // 2
>>> fwr = sf.open(filenames, mode='wb', file_size=file_size)

The first argument passed to open must be a time-ordered sequence of filenames in a list, tuple, or other subscriptable object that raises IndexError when the index is out of bounds. The mode is ‘wb’, though note that writing using numpy.memmap (e.g., as required for the DADA stream writer) is only possible if mode='w+b'. file_size determines the largest size a file may reach before the next one in the sequence is opened for writing. We set file_size such that each file holds exactly one frameset.
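
A list or tuple of names is the simplest choice, but any subscriptable object with the right IndexError behavior fits the description above. The class below is a hypothetical sketch of such an object; whether a given Baseband version also relies on other methods (such as `__len__`) has not been checked here.

```python
class NumberedFiles:
    """Hypothetical filename sequence: 'prefix_0', 'prefix_1', ..."""

    def __init__(self, prefix, count):
        self.prefix = prefix
        self.count = count

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        # Raising IndexError signals that the sequence is exhausted.
        if not 0 <= index < self.count:
            raise IndexError("file index {} out of range".format(index))
        return "{}_{}".format(self.prefix, index)


names = NumberedFiles("seqvdif", 2)
print(names[0], names[1])
```

Because `__getitem__` raises IndexError past the end, Python's ordinary iteration protocol also works on such an object, so `list(names)` yields both filenames.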

Note

Setting file_size to a larger value than above will lead to the two files having different sizes. By default, file_size=None, meaning it can be arbitrarily large, in which case only one file will be created.
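
The effect of file_size can be pictured with a conceptual sketch that splits a byte string over several files, starting a new file whenever the size limit is reached. The function `write_in_chunks` is hypothetical and far simpler than Baseband's SequentialFileWriter; it only illustrates the rollover behavior.

```python
import os
import tempfile


def write_in_chunks(filenames, data, file_size=None):
    """Write ``data`` across ``filenames``, at most ``file_size`` bytes each.

    Conceptual sketch only; not Baseband's SequentialFileWriter.
    """
    if file_size is None:
        file_size = max(len(data), 1)      # no limit: everything in one file
    written = []
    offset = 0
    file_nr = 0
    while offset < len(data):
        name = filenames[file_nr]          # IndexError if names run out
        with open(name, "wb") as fh:
            fh.write(data[offset:offset + file_size])
        written.append(name)
        offset += file_size
        file_nr += 1
    return written


with tempfile.TemporaryDirectory() as tmp:
    names = [os.path.join(tmp, "part_{}".format(i)) for i in range(3)]
    used = write_in_chunks(names, b"0123456789", file_size=4)
    sizes = [os.path.getsize(name) for name in used]
    print(sizes)  # [4, 4, 2]
```

With a limit that does not divide the data evenly, the last file ends up shorter than the others; without a limit, only the first filename is used.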

To write the data, we pass fwr to vdif.open:

>>> fw = vdif.open(fwr, 'ws', header0=fh.header0,
...                sample_rate=fh.sample_rate,
...                nthread=fh.sample_shape.nthread)
>>> fw.write(d)
>>> fw.close()    # This implicitly closes fwr.

To read the sequence and confirm that its contents are identical to the sample file’s, we may again use open:

>>> frr = sf.open(filenames, mode='rb')
>>> fr = vdif.open(frr, 'rs', sample_rate=fh.sample_rate)
>>> fr.header0.time == fh.header0.time
True
>>> np.all(fr.read() == d)
True
>>> fr.close()

We can also open the second file on its own and confirm it contains the second frameset of the sample file:

>>> fsf = vdif.open(filenames[1], mode='rs', sample_rate=fh.sample_rate)
>>> fh.seek(fh.shape[0] // 2)    # Seek to start of second frameset.
20000
>>> fsf.header0.time == fh.time
True
>>> np.all(fsf.read() == fh.read())
True
>>> fsf.close()
>>> fh.close()  # Close sample file.

While sequentialfile can be used for any format, since file sequences are common for DADA, it is implicitly used if a list of files or filename template is passed to dada.open. See the DADA Usage section for details.

Reference/API

baseband.helpers Package
baseband.helpers.sequentialfile Module
Functions
open(files[, mode, file_size, opener]) Read or write several files as if they were one contiguous one.
Classes
SequentialFileReader(files[, mode, opener]) Read several files as if they were one contiguous one.
SequentialFileWriter(files[, mode, …]) Write several files as if they were one contiguous one.
Class Inheritance Diagram

Inheritance diagram of baseband.helpers.sequentialfile.SequentialFileReader, baseband.helpers.sequentialfile.SequentialFileWriter

VLBI Base

Routines on which the readers and writers for specific VLBI formats are based.

Reference/API

baseband.vlbi_base Package
baseband.vlbi_base.header Module

Base definitions for VLBI Headers, used for VDIF and Mark 5B.

Defines a header class VLBIHeaderBase that can be used to hold the words corresponding to a frame header, providing access to the values encoded in them via a dict-like interface. Definitions for headers are constructed using the HeaderParser class.

Functions
make_parser(word_index, bit_index, bit_length) Construct a function that converts specific bits from a header.
make_setter(word_index, bit_index, bit_length) Construct a function that uses a value to set specific bits in a header.
Classes
HeaderProperty(header_parser, getter[, doc]) Mimic a dictionary, calculating entries from header words.
HeaderPropertyGetter(getter[, doc]) Special property for attaching HeaderProperty.
HeaderParser(*args, **kwargs) Parser & setter for VLBI header keywords.
VLBIHeaderBase(words[, verify]) Base class for all VLBI headers.
Class Inheritance Diagram

Inheritance diagram of baseband.vlbi_base.header.HeaderProperty, baseband.vlbi_base.header.HeaderPropertyGetter, baseband.vlbi_base.header.HeaderParser, baseband.vlbi_base.header.VLBIHeaderBase

baseband.vlbi_base.payload Module

Base definitions for VLBI payloads, used for VDIF and Mark 5B.

Defines a payload class VLBIPayloadBase that can be used to hold the words corresponding to a frame payload, providing access to the values encoded in it as a numpy array.

Classes
VLBIPayloadBase(words[, sample_shape, bps, …]) Container for decoding and encoding VLBI payloads.
Class Inheritance Diagram

Inheritance diagram of baseband.vlbi_base.payload.VLBIPayloadBase

baseband.vlbi_base.frame Module

Base definitions for VLBI frames, used for VDIF and Mark 5B.

Defines a frame class VLBIFrameBase that can be used to hold a header and a payload, providing access to the values encoded in both.

Classes
VLBIFrameBase(header, payload[, valid, verify]) Representation of a VLBI data frame, consisting of a header and payload.
Class Inheritance Diagram

Inheritance diagram of baseband.vlbi_base.frame.VLBIFrameBase

baseband.vlbi_base.base Module
Functions
make_opener(fmt, classes[, doc, append_doc]) Create a baseband file opener.
Classes
VLBIFileBase(fh_raw) VLBI file wrapper, used to add frame methods to a binary data file.
VLBIFileReaderBase(fh_raw) VLBI wrapped file reader base class.
VLBIStreamBase(fh_raw, header0, sample_rate, …) VLBI file wrapper, allowing access as a stream of data.
VLBIStreamReaderBase(fh_raw, header0, …)
VLBIStreamWriterBase(fh_raw, header0, …)
Class Inheritance Diagram

Inheritance diagram of baseband.vlbi_base.base.VLBIFileBase, baseband.vlbi_base.base.VLBIFileReaderBase, baseband.vlbi_base.base.VLBIStreamBase, baseband.vlbi_base.base.VLBIStreamReaderBase, baseband.vlbi_base.base.VLBIStreamWriterBase

baseband.vlbi_base.encoding Module

Encoders and decoders for generic VLBI data formats.

Functions
encode_2bit_base(values) Generic encoder for data stored using two bits.
encode_4bit_base(values) Generic encoder for data stored using four bits.
decode_8bit(words) Generic decoder for data stored using 8 bits.
encode_8bit(values) Encode 8 bit VDIF data.
Variables
OPTIMAL_2BIT_HIGH Optimal high value for a 2-bit digitizer for which the low value is 1.
TWO_BIT_1_SIGMA Optimal level between low and high for the above OPTIMAL_2BIT_HIGH.
FOUR_BIT_1_SIGMA Scaling for four-bit encoding that makes it look like 2 bit.
EIGHT_BIT_1_SIGMA Scaling for eight-bit encoding that makes it look like 2 bit.
decoder_levels Levels for data encoded with different numbers of bits.
baseband.vlbi_base.utils Module
Functions
bcd_decode(value)
bcd_encode(value)
Classes
CRC(polynomial) Cyclic Redundancy Check for a bitstream.
Class Inheritance Diagram

Inheritance diagram of baseband.vlbi_base.utils.CRC

Sample Data Files

baseband.data Package

Sample files with baseband data recorded in different formats.

Variables
SAMPLE_AROCHIME_VDIF VDIF sample from ARO, written by CHIME backend.
SAMPLE_DADA DADA sample from Effelsberg, with header adapted to shortened size.
SAMPLE_DRAO_CORRUPT Corrupted VDIF sample.
SAMPLE_GSB_PHASED GSB phased sample.
SAMPLE_GSB_PHASED_HEADER GSB phased header sample.
SAMPLE_GSB_RAWDUMP GSB rawdump sample.
SAMPLE_GSB_RAWDUMP_HEADER GSB rawdump header sample.
SAMPLE_MARK4 Mark 4 sample.
SAMPLE_MARK4_16TRACK Mark 4 sample.
SAMPLE_MARK4_32TRACK Mark 4 sample.
SAMPLE_MARK4_32TRACK_FANOUT2 Mark 4 sample.
SAMPLE_MARK5B Mark 5B sample.
SAMPLE_MWA_VDIF VDIF sample from MWA.
SAMPLE_PUPPI GUPPI/PUPPI sample, npol=2, nchan=4.
SAMPLE_VDIF VDIF sample.
SAMPLE_VLBI_VDIF VDIF sample.

Developer documentation

The developer documentation features tutorials for supporting new formats or format extensions such as VDIF EDV.

Supporting a New VDIF EDV

Users may encounter VDIF files with unusual headers not currently supported by Baseband. These may either have novel EDV, or they may purport to be a supported EDV but not conform to its formal specification. To handle such situations, Baseband supports implementation of new EDVs and overriding of existing EDVs without the need to modify Baseband’s source code.

The tutorials below assume the following modules have been imported:

>>> import numpy as np
>>> import astropy.units as u
>>> from baseband import vdif, vlbi_base as vlbi

VDIF Headers

Each VDIF frame begins with a 32-byte, or eight 32-bit word, header that is structured as follows:

_images/VDIFHeader.png

Schematic of the standard 32-bit VDIF header, from VDIF specification release 1.1.1 document, Fig. 3. 32-bit words are labelled on the left, while byte and bit numbers above indicate relative addresses within each word. Subscripts indicate field length in bits.

where the abbreviated labels are

  • \(\mathrm{I}_1\) - invalid data
  • \(\mathrm{L}_1\) - if 1, header is VDIF legacy
  • \(\mathrm{V}_3\) - VDIF version number
  • \(\mathrm{log}_2\mathrm{(\#chns)}_5\) - \(\mathrm{log}_2\) of the number of sub-bands in the frame
  • \(\mathrm{C}_1\) - if 1, complex data
  • \(\mathrm{EDV}_8\) - “extended data version” number; see below

Detailed definitions of terms are found on pages 5 to 7 of the VDIF specification document.

Words 4 - 7 hold optional extended user data, using a layout specified by the EDV, in word 4 of the header. EDV formats can be registered on the VDIF website; Baseband aims to support all registered formats (but does not currently support EDV = 4).

Implementing a New EDV

In this tutorial, we follow the implementation of an EDV=4 header. This would be a first and required step to support that format, but it does not suffice: one would also need a new frame class that serves the purpose of EDV=4, which is to independently store the validity of sub-band channels within a single data frame, rather than using the single invalid-data bit. From the EDV=4 specification, we see that we need to add the following to the standard VDIF header:

  • Validity header mask (word 4, bits 16 - 24): integer value between 1 and 64 inclusive indicating the number of validity bits. (This is different than \(\mathrm{log}_2\mathrm{(\#chns)}_5\), since some channels can be unused.)
  • Synchronization pattern (word 5): constant byte sequence 0xACABFEED, for finding the locations of headers in a data stream.
  • Validity mask (words 6 - 7): 64-bit binary mask indicating the validity of sub-bands. Any fraction of 64 sub-bands can be stored in this format, with any unused bands labelled as invalid (0) in the mask. If the number of bands exceeds 64, each bit indicates the validity of a group of sub-bands; see specification for details.

See Sec. 3.1 of the specification for best practices on using the invalid data bit \(\mathrm{I}_1\) in word 0.

In Baseband, a header is parsed using VDIFHeader, which returns a header instance that is a subclass of VDIFHeader corresponding to the header EDV. This can be seen in the header module class inheritance diagram. To support a new EDV, we create a new subclass of VDIFHeader:

>>> class VDIFHeader4(vdif.header.VDIFHeader):
...     _edv = 4
...
...     _header_parser = vlbi.header.HeaderParser(
...         (('invalid_data', (0, 31, 1, False)),
...          ('legacy_mode', (0, 30, 1, False)),
...          ('seconds', (0, 0, 30)),
...          ('_1_30_2', (1, 30, 2, 0x0)),
...          ('ref_epoch', (1, 24, 6)),
...          ('frame_nr', (1, 0, 24, 0x0)),
...          ('vdif_version', (2, 29, 3, 0x1)),
...          ('lg2_nchan', (2, 24, 5)),
...          ('frame_length', (2, 0, 24)),
...          ('complex_data', (3, 31, 1)),
...          ('bits_per_sample', (3, 26, 5)),
...          ('thread_id', (3, 16, 10, 0x0)),
...          ('station_id', (3, 0, 16)),
...          ('edv', (4, 24, 8)),
...          ('validity_mask_length', (4, 16, 8, 0)),
...          ('sync_pattern', (5, 0, 32, 0xACABFEED)),
...          ('validity_mask', (6, 0, 64, 0))))

VDIFHeader has a metaclass that ensures that whenever it is subclassed, the subclass definition is inserted into the VDIF_HEADER_CLASSES dictionary using its EDV value as the dictionary key. Methods in VDIFHeader use this dictionary to determine the type of object to return for a particular EDV. How all this works is further discussed in the documentation of the VDIF header module.
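
The registration pattern described above can be sketched in a few lines of plain Python. The names `HEADER_CLASSES`, `RegisteringMeta`, and `HeaderBase` are stand-ins invented for this illustration; Baseband's actual metaclass may differ in its details.

```python
HEADER_CLASSES = {}   # stands in for VDIF_HEADER_CLASSES


class RegisteringMeta(type):
    """Register every subclass that defines ``_edv`` in HEADER_CLASSES."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        edv = namespace.get("_edv")
        if edv is None:
            return                      # base classes without an EDV
        if edv in HEADER_CLASSES:
            raise ValueError("EDV {} already registered".format(edv))
        HEADER_CLASSES[edv] = cls


class HeaderBase(metaclass=RegisteringMeta):
    @classmethod
    def for_edv(cls, edv):
        """Look up the class registered for ``edv``."""
        return HEADER_CLASSES[edv]


class Header4(HeaderBase):
    _edv = 4


print(HeaderBase.for_edv(4).__name__)   # Header4

try:
    class Header4Again(HeaderBase):     # same EDV: registration is refused
        _edv = 4
except ValueError as exc:
    print("refused:", exc)
```

Defining the class is enough to register it; no explicit call is needed, which is what lets user-defined headers plug in without touching the library source.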

The class must have a private _edv attribute for it to properly be registered in VDIF_HEADER_CLASSES. It must also feature a _header_parser that reads these words to return header properties. For this, we utilize vlbi_base.header.HeaderParser, available in baseband.vlbi_base.header. To initialize a header parser, we pass it a tuple of header properties, where each entry follows the syntax:

('property_name', (word_index, bit_index, bit_length, default))

where

  • property_name: name of the header property; this will be the key;
  • word_index: index into the header words for this key;
  • bit_index: index to the starting bit of the part used;
  • bit_length: number of bits used, normally between 1 and 32, but can be 64 for adding two words together; and
  • default: (optional) default value to use in initialization.

For further details, see the documentation of HeaderParser.
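
To see what a (word_index, bit_index, bit_length) triple means in practice, the sketch below reads and writes such bit fields on a plain list of 32-bit words. The functions `extract_bits` and `insert_bits` are hypothetical illustrations, not Baseband's make_parser or make_setter, and the 64-bit case assumes that the lower-indexed word holds the least significant 32 bits.

```python
def extract_bits(words, word_index, bit_index, bit_length):
    """Read ``bit_length`` bits starting at ``bit_index`` of a header word.

    For ``bit_length == 64`` two consecutive words are combined, assuming
    (for this sketch) that the lower-indexed word holds the low 32 bits.
    """
    if bit_length == 64:
        return words[word_index] | (words[word_index + 1] << 32)
    mask = (1 << bit_length) - 1
    return (words[word_index] >> bit_index) & mask


def insert_bits(words, word_index, bit_index, bit_length, value):
    """Write ``value`` into the same bit field, leaving other bits alone."""
    if bit_length == 64:
        words[word_index] = value & 0xFFFFFFFF
        words[word_index + 1] = value >> 32
        return
    mask = ((1 << bit_length) - 1) << bit_index
    words[word_index] = ((words[word_index] & ~mask)
                         | ((value << bit_index) & mask))


words = [0] * 8
insert_bits(words, 4, 24, 8, 4)               # edv
insert_bits(words, 4, 16, 8, 60)              # validity_mask_length
insert_bits(words, 6, 0, 64, (1 << 59) + 1)   # validity_mask
print(extract_bits(words, 4, 24, 8), extract_bits(words, 4, 16, 8))
print(hex(words[4]))                          # 0x43c0000
```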

Once defined, we can use our new header like any other:

>>> myheader = vdif.header.VDIFHeader.fromvalues(
...     edv=4, seconds=14363767, nchan=1,
...     station=65532, bps=2, complex_data=False,
...     thread_id=3, validity_mask_length=60,
...     validity_mask=(1 << 59) + 1)
>>> myheader
<VDIFHeader4 invalid_data: False,
             legacy_mode: False,
             seconds: 14363767,
             _1_30_2: 0,
             ref_epoch: 0,
             frame_nr: 0,
             vdif_version: 1,
             lg2_nchan: 0,
             frame_length: 0,
             complex_data: False,
             bits_per_sample: 1,
             thread_id: 3,
             station_id: 65532,
             edv: 4,
             validity_mask_length: 60,
             sync_pattern: 0xacabfeed,
             validity_mask: 576460752303423489>
>>> myheader['validity_mask'] == 2**59 + 1
True

There is an easier means of instantiating the header parser. As can be seen in the class inheritance diagram for the header module, many VDIF headers are subclassed from other VDIFHeader subclasses, namely VDIFBaseHeader and VDIFSampleRateHeader. This is because many EDV specifications share common header values, and so their functions and derived properties should be shared as well. Moreover, header parsers can be appended to one another, which saves repetitious coding because the first four words of any VDIF header are the same. Indeed, we can create the same header as above by subclassing VDIFBaseHeader:

>>> class VDIFHeader4Enhanced(vdif.header.VDIFBaseHeader):
...     _edv = 42
...
...     _header_parser = vdif.header.VDIFBaseHeader._header_parser +\
...                      vlbi.header.HeaderParser((
...                             ('validity_mask_length', (4, 16, 8, 0)),
...                             ('sync_pattern', (5, 0, 32, 0xACABFEED)),
...                             ('validity_mask', (6, 0, 64, 0))))
...
...     _properties = vdif.header.VDIFBaseHeader._properties + ('validity',)
...
...     def verify(self):
...         """Basic checks of header integrity."""
...         super(VDIFHeader4Enhanced, self).verify()
...         assert 1 <= self['validity_mask_length'] <= 64
...
...     @property
...     def validity(self):
...         """Validity mask array with proper length.
...
...         If set, writes both ``validity_mask`` and ``validity_mask_length``.
...         """
...         bitmask = np.unpackbits(self['validity_mask'].astype('>u8')
...                                 .view('u1'))[::-1].astype(bool)
...         return bitmask[:self['validity_mask_length']]
...
...     @validity.setter
...     def validity(self, validity):
...         bitmask = np.zeros(64, dtype=bool)
...         bitmask[:len(validity)] = validity
...         self['validity_mask_length'] = len(validity)
...         self['validity_mask'] = np.packbits(bitmask[::-1]).view('>u8')

Here, we set _edv = 42 because VDIFHeader’s metaclass is designed to prevent accidental overwriting of existing entries in VDIF_HEADER_CLASSES. If we had used _edv = 4, we would have gotten an exception:

ValueError: EDV 4 already registered in VDIF_HEADER_CLASSES

We shall see how to override header classes in the next section. Except for the EDV, VDIFHeader4Enhanced’s header structure is identical to VDIFHeader4. It also contains a few extra functions to enhance the header’s usability.

The verify function is an optional function that runs upon header initialization to check its veracity. Ours simply checks that the validity mask length is in the allowed range, but we also call the same function in the superclass (VDIFBaseHeader), which checks that the header is not in 4-word “legacy mode”, that the header’s EDV matches that read from the words, that there are eight words, and that the sync pattern matches 0xACABFEED.

The validity_mask is a bit mask, which is not necessarily the easiest to use directly. Hence, we implement a derived validity property that generates a boolean mask of the right length (note that this is not right for cases where the number of channels in the header exceeds 64). We also define a corresponding setter, and add this to the private _properties attribute, so that we can use validity as a keyword in fromvalues:

>>> myenhancedheader = vdif.header.VDIFHeader.fromvalues(
...     edv=42, seconds=14363767, nchan=1,
...     station=65532, bps=2, complex_data=False,
...     thread_id=3, validity=[True]+[False]*58+[True])
>>> myenhancedheader
<VDIFHeader4Enhanced invalid_data: False,
                     legacy_mode: False,
                     seconds: 14363767,
                     _1_30_2: 0,
                     ref_epoch: 0,
                     frame_nr: 0,
                     vdif_version: 1,
                     lg2_nchan: 0,
                     frame_length: 0,
                     complex_data: False,
                     bits_per_sample: 1,
                     thread_id: 3,
                     station_id: 65532,
                     edv: 42,
                     validity_mask_length: 60,
                     sync_pattern: 0xacabfeed,
                     validity_mask: [576460752303423489]>
>>> assert myenhancedheader['validity_mask'] == 2**59 + 1
>>> assert (myenhancedheader.validity == [True]+[False]*58+[True]).all()
>>> myenhancedheader.validity = [True]*8
>>> myenhancedheader['validity_mask']
array([255], dtype=uint64)
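
The correspondence between the integer mask and the boolean list used above can also be written out in plain Python, which may help when checking values by hand. The two helper functions are illustrative only; they mirror the bit ordering of the validity property (element i of the list corresponds to bit i of the mask).

```python
def mask_to_validity(mask, length):
    """Return ``length`` booleans, element i taken from bit i of ``mask``."""
    return [bool((mask >> i) & 1) for i in range(length)]


def validity_to_mask(validity):
    """Pack booleans into an integer, element i into bit i."""
    if len(validity) > 64:
        raise ValueError("at most 64 validity bits fit in the mask")
    return sum(1 << i for i, valid in enumerate(validity) if valid)


validity = [True] + [False] * 58 + [True]
mask = validity_to_mask(validity)
print(mask == 2**59 + 1)             # True
print(validity_to_mask([True] * 8))  # 255
```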

Note

If you have implemented support for a new EDV that is widely used, we encourage you to make a pull request to Baseband’s GitHub repository, as well as to register it (if it is not already registered) with the VDIF consortium!

Replacing an Existing EDV

Above, we mentioned that VDIFHeader’s metaclass is designed to prevent accidental overwriting of existing entries in VDIF_HEADER_CLASSES, so attempting to assign two header classes to the same EDV results in an exception. There are situations such as the one above, however, where we’d like to replace one header with another.

To get VDIFHeader to use VDIFHeader4Enhanced when edv=4, we can manually insert it in the dictionary:

>>> vdif.header.VDIF_HEADER_CLASSES[4] = VDIFHeader4Enhanced

Of course, we should then be sure that its _edv attribute is correct:

>>> VDIFHeader4Enhanced._edv = 4

VDIFHeader will now return instances of VDIFHeader4Enhanced when reading headers with edv = 4:

>>> myheader = vdif.header.VDIFHeader.fromvalues(
...     edv=4, seconds=14363767, nchan=1,
...     station=65532, bps=2, complex_data=False,
...     thread_id=3, validity=[True]*60)
>>> assert isinstance(myheader, VDIFHeader4Enhanced)

Note

Failing to modify _edv in the class definition will lead to an EDV mismatch when verify is called during header initialization.

This can also be used to override VDIFHeader’s behavior even for EDVs that are supported by Baseband, which may prove useful when reading data with corrupted or mislabelled headers. To illustrate this, we attempt to read in a corrupted VDIF file originally from the Dominion Radio Astrophysical Observatory. This file can be imported from the baseband data directory:

>>> from baseband.data import SAMPLE_DRAO_CORRUPT

Naively opening the file with

>>> fh = vdif.open(SAMPLE_DRAO_CORRUPT, 'rs')  

will lead to an AssertionError. This is because, while the headers of the file use EDV=0, they deviate from that EDV standard by storing additional information: an “eud2” parameter in word 5, which is related to the sample time. Furthermore, the bits_per_sample setting is incorrect (it should be 3 rather than 4; the number is defined such that a one-bit sample has a bits_per_sample code of 0). Finally, though not an error, the thread_id in word 3 defines two parts, link and slot, which reflect the data acquisition computer node that wrote the data to disk.

To accommodate these changes, we design an alternate header. We first pop the EDV = 0 entry from VDIF_HEADER_CLASSES:

>>> vdif.header.VDIF_HEADER_CLASSES.pop(0)
<class 'baseband.vdif.header.VDIFHeader0'>

We then define a replacement class:

>>> class DRAOVDIFHeader(vdif.header.VDIFHeader0):
...     """DRAO VDIF Header
...
...     An extension of EDV=0 which uses the thread_id to store link
...     and slot numbers, and adds a user keyword (illegal in EDV0,
...     but whatever) that identifies data taken at the same time.
...
...     The header also corrects 'bits_per_sample' to be properly bps-1.
...     """
...
...     _header_parser = vdif.header.VDIFHeader0._header_parser + \
...         vlbi.header.HeaderParser((('link', (3, 16, 4)),
...                                   ('slot', (3, 20, 6)),
...                                   ('eud2', (5, 0, 32))))
...
...     def verify(self):
...         pass  # this is a hack, don't bother with verification...
...
...     @classmethod
...     def fromfile(cls, fh, edv=0, verify=False):
...         self = super(DRAOVDIFHeader, cls).fromfile(fh, edv=0,
...                                                    verify=False)
...         # Correct wrong bps
...         self.mutable = True
...         self['bits_per_sample'] = 3
...         return self

We override verify because VDIFHeader0’s verify function checks that word 5 contains no data. We also override the fromfile class method such that the bits_per_sample property is reset to its proper value whenever a header is read from file.
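To see what the extra parser entries describe, here is a minimal sketch of bit-field access on a list of 32-bit words. It assumes each tuple means (word index, position of the least significant bit, number of bits), and that thread_id spans the ten bits starting at bit 16 of word 3; get_field and set_field are hypothetical helpers rather than Baseband functions, and the word values are made up.

```python
def get_field(words, word_index, bit_index, bit_length):
    """Extract `bit_length` bits starting at `bit_index` from one word."""
    mask = (1 << bit_length) - 1
    return (words[word_index] >> bit_index) & mask


def set_field(words, word_index, bit_index, bit_length, value):
    """Overwrite `bit_length` bits starting at `bit_index` in one word."""
    mask = (1 << bit_length) - 1
    if not 0 <= value <= mask:
        raise ValueError("value does not fit in {} bits".format(bit_length))
    words[word_index] &= ~(mask << bit_index)   # clear the field
    words[word_index] |= value << bit_index     # write the new value


words = [0] * 8                          # eight 32-bit words, all made up
set_field(words, 3, 16, 4, 2)            # link: 4 bits at bit 16 of word 3
set_field(words, 3, 20, 6, 5)            # slot: 6 bits at bit 20 of word 3
set_field(words, 5, 0, 32, 667235140)    # eud2: all of word 5

link = get_field(words, 3, 16, 4)
slot = get_field(words, 3, 20, 6)
thread_id = get_field(words, 3, 16, 10)  # link and slot read as one field
```

Under these assumptions, link and slot are two views onto the same ten bits that a standard header reads as a single thread_id, so adding them does not change what is stored in the file.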

We can now read in the corrupt file by manually reading in the header, then the payload, of each frame:

>>> fh = vdif.open(SAMPLE_DRAO_CORRUPT, 'rb')
>>> header0 = DRAOVDIFHeader.fromfile(fh)
>>> header0['eud2'] == 667235140
True
>>> header0['link'] == 2
True
>>> payload0 = vdif.payload.VDIFPayload.fromfile(fh, header0)
>>> payload0.shape == (header0.samples_per_frame, header0.nchan)
True
>>> fh.close()

Reading a frame using VDIFFrame will still fail, since its _header_class is VDIFHeader, and so VDIFHeader.fromfile, rather than the function we defined, is used to read in headers. If we wanted to use VDIFFrame, we would need to set

VDIFFrame._header_class = DRAOVDIFHeader

before using open(), so that headers are read using DRAOVDIFHeader.fromfile.
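The effect of swapping such a class attribute can be sketched with toy classes. Header, PatchedHeader and Frame below are hypothetical and only mimic the delegation pattern described above; they are not Baseband's classes.

```python
import io


class Header:
    def __init__(self, raw):
        self.raw = raw

    @classmethod
    def fromfile(cls, fh):
        return cls(fh.read(4))


class PatchedHeader(Header):
    @classmethod
    def fromfile(cls, fh):
        self = super().fromfile(fh)
        self.patched = True   # stands in for a correction applied on read
        return self


class Frame:
    # The frame delegates header reading to whatever class is stored here.
    _header_class = Header

    def __init__(self, header):
        self.header = header

    @classmethod
    def fromfile(cls, fh):
        return cls(cls._header_class.fromfile(fh))


before = Frame.fromfile(io.BytesIO(b'abcd'))
Frame._header_class = PatchedHeader          # swap the hook
after = Frame.fromfile(io.BytesIO(b'abcd'))
```

Because the attribute lives on the class, the swap affects every frame read afterwards in the same process, which is one reason to treat it as a hack.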

A more elegant solution that is compatible with VDIFStreamReader without hacking VDIFFrame involves modifying the bits-per-sample code within __init__(). Let’s remove our previous custom class, and define a replacement:

>>> vdif.header.VDIF_HEADER_CLASSES.pop(0)
<class '__main__.DRAOVDIFHeader'>
>>> class DRAOVDIFHeaderEnhanced(vdif.header.VDIFHeader0):
...     """DRAO VDIF Header
...
...     An extension of EDV=0 which uses the thread_id to store link and slot
...     numbers, and adds a user keyword (illegal in EDV0, but whatever) that
...     identifies data taken at the same time.
...
...     The header also corrects 'bits_per_sample' to be properly bps-1.
...     """
...     _header_parser = vdif.header.VDIFHeader0._header_parser + \
...         vlbi.header.HeaderParser((('link', (3, 16, 4)),
...                                   ('slot', (3, 20, 6)),
...                                   ('eud2', (5, 0, 32))))
...
...     def __init__(self, words, edv=None, verify=True, **kwargs):
...         super(DRAOVDIFHeaderEnhanced, self).__init__(
...                 words, verify=False, **kwargs)
...         self.mutable = True
...         self['bits_per_sample'] = 3
...
...     def verify(self):
...         pass

We can then use the stream reader without further modification:

>>> fh2 = vdif.open(SAMPLE_DRAO_CORRUPT, 'rs', sample_rate=5**12*u.Hz)
>>> fh2.header0['eud2'] == header0['eud2']
True
>>> np.all(fh2.read(1) == payload0[0])
True
>>> fh2.close()
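Why a correction in __init__() reaches more code paths than one in fromfile() can be shown with a toy model. Everything here is hypothetical: REGISTRY, generic_read and the two subclasses are stand-ins, and the dispatch is a simplification rather than a description of Baseband's internals.

```python
REGISTRY = {}  # hypothetical EDV -> header class mapping


class Header:
    def __init__(self, words):
        self.words = list(words)

    @classmethod
    def fromfile(cls, stored_words):
        return cls(stored_words)


def generic_read(stored_words, edv=0):
    """Stand-in for a reader that only looks up the registered class."""
    return REGISTRY[edv](stored_words)


class FixedInFromfile(Header):
    @classmethod
    def fromfile(cls, stored_words):
        self = super().fromfile(stored_words)
        self.words[0] = 3     # correction applied only on this path
        return self


class FixedInInit(Header):
    def __init__(self, words):
        super().__init__(words)
        self.words[0] = 3     # correction applied on every construction


corrupt = [4]

REGISTRY[0] = FixedInFromfile
via_fromfile_fix = generic_read(corrupt).words[0]

REGISTRY[0] = FixedInInit
via_init_fix = generic_read(corrupt).words[0]
```

In this model the generic reader never calls the subclass's fromfile, so only the class that corrects itself on construction returns the fixed value.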

Reading frames using VDIFFileReader.read_frame will now work as well, but reading frame sets using VDIFFileReader.read_frameset will still fail. This is because the frame and thread numbers that function relies on are meaningless for these headers, and grouping threads together using the link, slot and eud2 values should be manually performed by the user.
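One way to do that grouping by hand is sketched below. The dictionaries stand in for headers that have already been read, and all values are made up; group_by_eud2 is a hypothetical helper, and using eud2 as the grouping key assumes, as described above, that it identifies data taken at the same time.

```python
from collections import defaultdict

# Made-up stand-ins for headers read one frame at a time.
frames = [
    {'eud2': 100, 'link': 1, 'slot': 0},
    {'eud2': 101, 'link': 0, 'slot': 0},
    {'eud2': 100, 'link': 0, 'slot': 1},
    {'eud2': 100, 'link': 0, 'slot': 0},
    {'eud2': 101, 'link': 1, 'slot': 0},
]


def group_by_eud2(frames):
    """Group frames sharing an eud2 value, ordered by (link, slot)."""
    groups = defaultdict(list)
    for frame in frames:
        groups[frame['eud2']].append(frame)
    return {eud2: sorted(members, key=lambda f: (f['link'], f['slot']))
            for eud2, members in sorted(groups.items())}


grouped = group_by_eud2(frames)
order_100 = [(f['link'], f['slot']) for f in grouped[100]]
```

With real data, each dictionary would be replaced by a (header, payload) pair, and the sorted members of a group would play the role of the threads in a frame set.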

Project details


Contributors

Authors and Credits

Baseband Project Contributors

Authors
  • Marten van Kerkwijk (@mhvk)
  • Chenchong Charles Zhu (@cczhu)
Alphabetical list of contributors
  • Rebecca Lin (@00rebe)
  • Nikhil Mahajan (@theXYZT)
  • Robert Main (@ramain)
  • Dana Simard (@danasimard)
  • George Stein (@georgestein)

If you have contributed to Baseband but are not listed above, please send one of the authors an e-mail, or open a pull request for this page.

Full Changelog

1.1.1 (2018-07-24)

Bug Fixes

1.1 (2018-06-06)

New Features
  • Added a new baseband.file_info function, which can be used to inspect data files. [#200]
  • Added a general file opener, baseband.open, which for a set of formats will check whether the file is of that format, and then load it using the corresponding module. [#198]
  • Allow users to pass a verify keyword to file openers reading streams. [#233]
  • Added support for the GUPPI format. [#212]
  • Enabled baseband.dada.open to read streams where the last frame has an incomplete payload. [#228]
API Changes
  • In analogy with Mark 5B, VDIF header time getting and setting now requires a frame rate rather than a sample rate. [#217, #218]
  • DADA and GUPPI now support passing either a start_time or offset (in addition to time) to set the start time in the header. [#240]
Bug Fixes
Other Changes and Additions
  • The baseband.data module with sample data files now has an explicit entry in the documentation. [#198]
  • Increased speed of VLBI stream reading by changing the way header sync patterns are stored, and removing redundant verification steps. VDIF sequential decode is now 5 - 10% faster (depending on the number of threads). [#241]

1.0.1 (2018-06-04)

Bug Fixes
  • Fixed a bug in baseband.dada.open where a squeeze setting was ignored when header keywords were also passed in ‘ws’ mode. [#211]
  • Raise an exception rather than return incorrect times for Mark 5B files in which the fractional seconds are not set. [#216]
Other Changes and Additions
  • Fixed broken links and typos in the documentation. [#211]

1.0.0 (2018-04-09)

  • Initial release.

Licenses

Baseband License

Baseband is licensed under the GNU General Public License v3.0. The full text of the license can be found in LICENSE under Baseband’s root directory.

Reference/API

baseband Package

Radio baseband I/O.

Functions

file_info(name[, format]) Get format and other information from a baseband file.
open(name[, mode, format]) Open a baseband file for reading or writing.
test([package, test_path, args, plugins, …]) Run the tests using py.test.