VDIF

The VLBI Data Interchange Format (VDIF) was introduced in 2009 to standardize VLBI data transfer and storage. Detailed specifications are found in VDIF’s specification document.

File Structure

A VDIF file is composed of data frames. Each has a header of eight 32-bit words (32 bytes; the exception is the “legacy VDIF” format, which is four words, or 16 bytes, long), and a payload that ranges from 32 bytes to ~134 megabytes. Both are little-endian. The first four words of a VDIF header hold the same information in all VDIF files, but the last four words hold optional user-defined data. The layout of these four words is specified by the file’s extended-data version, or EDV. More detailed information on the header can be found in the tutorial for supporting a new VDIF EDV.

A data frame may carry one or multiple channels, and a stream of data frames all carrying the same (set of) channels is known as a thread and denoted by its thread ID. The collection of frames representing the same time segment (and all possible thread IDs) is called a data frameset (or just “frameset”).

Strict time and thread ID ordering of frames in the stream, while considered part of VDIF best practices, is not mandated, and cannot be guaranteed during data transmission over the internet.

Usage Notes

This section covers reading and writing VDIF files with Baseband; general usage can be found under the Using Baseband section. For situations in which one is unsure of a file’s format, Baseband features the general baseband.open and baseband.file_info functions, which are also discussed in Using Baseband. The examples below use the small sample file baseband/data/sample.vdif, and the numpy, astropy.units, and baseband.vdif modules:

>>> import numpy as np
>>> from baseband import vdif
>>> import astropy.units as u
>>> from baseband.data import SAMPLE_VDIF

Simple reading and writing of VDIF files can be done entirely using open. Opening in binary mode provides a normal file reader, but extended with methods to read a VDIFFrameSet data container for storing a frame set as well as VDIFFrame one for storing a single frame:

>>> fh = vdif.open(SAMPLE_VDIF, 'rb')
>>> fs = fh.read_frameset()
>>> fs.data.shape
(20000, 8, 1)
>>> fr = fh.read_frame()
>>> fr.data.shape
(20000, 1)
>>> fh.close()

(As with other formats, fr.data is a read-only property of the frame.)

Opening in stream mode wraps the low-level routines such that reading and writing is in units of samples. It also provides access to header information:

>>> fh = vdif.open(SAMPLE_VDIF, 'rs')
>>> fh
<VDIFStreamReader name=... offset=0
    sample_rate=32.0 MHz, samples_per_frame=20000,
    sample_shape=SampleShape(nthread=8),
    bps=2, complex_data=False, edv=3, station=65532,
    start_time=2014-06-16T05:56:07.000000000>
>>> d = fh.read(12)
>>> d.shape
(12, 8)
>>> d[:, 0].astype(int)  # first thread
array([-1, -1,  3, -1,  1, -1,  3, -1,  1,  3, -1,  1])
>>> fh.close()

To set up a file for writing needs quite a bit of header information. Not coincidentally, what is given by the reader above suffices:

>>> from astropy.time import Time
>>> fw = vdif.open('try.vdif', 'ws', sample_rate=32*u.MHz,
...                samples_per_frame=20000, nchan=1, nthread=2,
...                complex_data=False, bps=2, edv=3, station=65532,
...                time=Time('2014-06-16T05:56:07.000000000'))
>>> with vdif.open(SAMPLE_VDIF, 'rs', subset=[1, 3]) as fh:
...    d = fh.read(20000)  # Get some data to write
>>> fw.write(d)
>>> fw.close()
>>> fh = vdif.open('try.vdif', 'rs')
>>> d2 = fh.read(12)
>>> np.all(d[:12] == d2)
True
>>> fh.close()

Copying Parts of Files

Here is a simple example to copy a VDIF file:

>>> with vdif.open(SAMPLE_VDIF, 'rb') as fr, vdif.open('try.vdif', 'wb') as fw:
...     while True:
...         try:
...             fw.write_frame(fr.read_frame())
...         except EOFError:
...             break

For small files, one could just do:

>>> with vdif.open(SAMPLE_VDIF, 'rs') as fr, \
...         vdif.open('try.vdif', 'ws', header0=fr.header0,
...                   sample_rate=fr.sample_rate,
...                   nthread=fr.sample_shape.nthread) as fw:
...     fw.write(fr.read())

This copies everything to memory, though, and some header information is lost.

Both examples can be adjusted for copying just some channels:

>>> from baseband.data import SAMPLE_AROCHIME_VDIF as SAMPLE_AROCHIME
>>> with vdif.open(SAMPLE_AROCHIME, 'rb') as fr, vdif.open('try.vdif', 'wb') as fw:
...     while True:
...         try:
...             frame = fr.read_frame()
...         except EOFError:
...             break
...         new_header = frame.header.copy()
...         new_header.nchan = 128
...         new_header.samples_per_frame = 1
...         new_data = frame[:, :128]
...         fw.write_frame(new_data, new_header)

>>> with vdif.open(SAMPLE_AROCHIME, 'rs', sample_rate=800/1024/2*u.MHz,
...                subset=(slice(None), slice(0, 128))) as fr:
...     out_header = fr.header0.copy()
...     out_header.nchan = 128
...     out_header.samples_per_frame = 1
...     with vdif.open('try.vdif', 'ws', sample_rate=fr.sample_rate,
...                    header0=out_header, nthread=2) as fw:
...         fw.write(fr.read())

Troubleshooting

In situations where the VDIF files being handled are corrupted or modified in an unusual way, using open will likely lead to an exception being raised or to unexpected behavior. In such cases, it may still be possible to read in the data. Below, we provide a few solutions and workarounds to do so.

Note

This list is certainly incomplete. If you have an issue (solved or otherwise) you believe should be on this list, please e-mail the contributors.

AssertionError when checking EDV in header verify function

All VDIF header classes (other than VDIFLegacyHeader) check, using their verify function, that the EDV read from file matches the class EDV. If they do not, the following line

assert self.edv is None or self.edv == self['edv']

returns an AssertionError. If this occurs because the VDIF EDV is not yet supported by Baseband, support can be added by implementing a custom header class. If the EDV is supported, but the header deviates from the format found in the VLBI.org EDV registry, the best solution is to create a custom header class, then override the subclass selector in VDIFHeader. Tutorials for doing either can be found here.

EOFError encountered in _get_frame_rate when reading

When the sample rate is not input by the user and cannot be deduced from header information (if EDV = 1 or, the sample rate is found in the header), Baseband tries to determine the frame rate using the private method _get_frame_rate in VDIFStreamReader (and then multiply by the samples per frame to obtain the sample rate). This function raises EOFError if the file contains less than one second of data, or is corrupt. In either case the file can be opened still by explicitly passing in the sample rate to open via the sample_rate keyword.

Reference/API

baseband.vdif Package

VLBI Data Interchange Format (VDIF) reader/writer

For the VDIF specification, see https://vlbi.org/vlbi-standards/vdif/

Functions

info(name, **kwargs)

Collect VDIF file information.

open(name[, mode])

Open VDIF file(s) for reading or writing.

Classes

VDIFFrame(header, payload[, valid, verify])

Representation of a VDIF data frame, consisting of a header and payload.

VDIFFrameSet(frames[, header0])

Representation of a set of VDIF frames, combining different threads.

VDIFHeader([words, edv, verify])

VDIF Header, supporting different Extended Data Versions.

VDIFPayload(words[, header, sample_shape, ...])

Container for decoding and encoding VDIF payloads.

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.frame.VDIFFrame, baseband.vdif.frame.VDIFFrameSet, baseband.vdif.header.VDIFHeader, baseband.vdif.payload.VDIFPayload

baseband.vdif.header Module

Definitions for VLBI VDIF Headers.

Implements a VDIFHeader class used to store header words, and decode/encode the information therein.

For the VDIF specification, see https://www.vlbi.org/vdif

Classes

VDIFHeader([words, edv, verify])

VDIF Header, supporting different Extended Data Versions.

VDIFBaseHeader([words, edv, verify])

Base for non-legacy VDIF headers that use 8 32-bit words.

VDIFNoSampleRateHeader([words, edv, verify])

VDIFSampleRateHeader([words, edv, verify])

Base for VDIF headers that include the sample rate (EDV= 1, 3, 4).

VDIFLegacyHeader([words, edv, verify])

Legacy VDIF header that uses only 4 32-bit words.

VDIFHeader0([words, edv, verify])

VDIF Header for EDV=0.

VDIFHeader1([words, edv, verify])

VDIF Header for EDV=1.

VDIFHeader2([words, edv, verify])

VDIF Header for EDV=2.

VDIFHeader3([words, edv, verify])

VDIF Header for EDV=3.

VDIFMark5BHeader([words, edv, verify])

Mark 5B over VDIF (EDV=0xab).

Variables

VDIF_HEADER_CLASSES

Dict for storing VDIF header class definitions, indexed by their EDV.

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.header.VDIFHeader, baseband.vdif.header.VDIFBaseHeader, baseband.vdif.header.VDIFNoSampleRateHeader, baseband.vdif.header.VDIFSampleRateHeader, baseband.vdif.header.VDIFLegacyHeader, baseband.vdif.header.VDIFHeader0, baseband.vdif.header.VDIFHeader1, baseband.vdif.header.VDIFHeader2, baseband.vdif.header.VDIFHeader3, baseband.vdif.header.VDIFMark5BHeader

baseband.vdif.payload Module

Definitions for VLBI VDIF payloads.

Implements a VDIFPayload class used to store payload words, and decode to or encode from a data array.

See the VDIF specification page for payload specifications.

Functions

init_luts()

Sets up the look-up tables for levels as a function of input byte.

decode_1bit(words)

decode_2bit(words)

Decodes data stored using 2 bits per sample.

decode_4bit(words)

Decodes data stored using 4 bits per sample.

encode_1bit(values)

Encodes values using 1 bit per sample, packing the result into bytes.

encode_2bit(values)

Encodes values using 2 bits per sample, packing the result into bytes.

encode_4bit(values)

Encodes values using 4 bits per sample, packing the result into bytes.

Classes

VDIFPayload(words[, header, sample_shape, ...])

Container for decoding and encoding VDIF payloads.

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.payload.VDIFPayload

baseband.vdif.frame Module

Definitions for VLBI VDIF frames and frame sets.

Implements a VDIFFrame class that can be used to hold a header and a payload, providing access to the values encoded in both. Also, define a VDIFFrameSet class that combines a set of frames from different threads.

For the VDIF specification, see https://www.vlbi.org/vdif

Classes

VDIFFrame(header, payload[, valid, verify])

Representation of a VDIF data frame, consisting of a header and payload.

VDIFFrameSet(frames[, header0])

Representation of a set of VDIF frames, combining different threads.

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.frame.VDIFFrame, baseband.vdif.frame.VDIFFrameSet

baseband.vdif.file_info Module

The VDIFFileReaderInfo property.

Includes information about threads and frame sets.

Classes

VDIFFileReaderInfo([parent])

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.file_info.VDIFFileReaderInfo

baseband.vdif.base Module

Functions

open(name[, mode])

Open VDIF file(s) for reading or writing.

info(name, **kwargs)

Collect VDIF file information.

Classes

VDIFFileReader(fh_raw)

Simple reader for VDIF files.

VDIFFileWriter(fh_raw)

Simple writer for VDIF files.

VDIFStreamBase()

Provides sample shape maker and fast time and index getting/setting.

VDIFStreamReader(fh_raw[, sample_rate, ...])

VLBI VDIF format reader.

VDIFStreamWriter(fh_raw[, header0, ...])

VLBI VDIF format writer.

Class Inheritance Diagram

Inheritance diagram of baseband.vdif.base.VDIFFileReader, baseband.vdif.base.VDIFFileWriter, baseband.vdif.base.VDIFStreamBase, baseband.vdif.base.VDIFStreamReader, baseband.vdif.base.VDIFStreamWriter