.. _getting_started: .. include:: ../tutorials/glossary_substitutions.rst ***************************** Getting Started with Baseband ***************************** This quickstart tutorial is meant to help the reader hit the ground running with Baseband. For more detail, including writing to files, see :ref:`Using Baseband `. For installation instructions, please see :ref:`Installing Baseband `. When using Baseband, we typically will also use numpy_, the `astropy.units `_ module, and :class:`~astropy.time.Time` from the `astropy.time `_ module. Let's import all of these:: >>> import baseband >>> import numpy as np >>> import astropy.units as u >>> from astropy.time import Time .. _getting_started_opening: Opening Files ============= For this tutorial, we'll use two sample files:: >>> from baseband.data import SAMPLE_VDIF, SAMPLE_MARK5B The first file is a VDIF one created from EVN_/VLBA_ observations of `Black Widow pulsar PSR B1957+20 `_, while the second is a Mark 5B from EVN_/WSRT_ observations of the same pulsar. .. _EVN: https://www.evlbi.org/ .. _VLBA: https://public.nrao.edu/telescopes/vlba/ .. _WSRT: https://www.astron.nl/radio-observatory/public/public-0 To open the VDIF file:: >>> fh_vdif = baseband.open(SAMPLE_VDIF) Opening the Mark 5B file is slightly more involved, as not all required metadata is stored in the file itself:: >>> fh_m5b = baseband.open(SAMPLE_MARK5B, nchan=8, sample_rate=32*u.MHz, ... ref_time=Time('2014-06-13 12:00:00')) Here, we've manually passed in as keywords the number of |channels|, the :term:`sample rate` (number of samples per channel per second) as an `astropy.units.Quantity`, and a reference time within 500 days of the start of the observation as an `astropy.time.Time`. That last keyword is needed to properly read timestamps from the Mark 5B file. `baseband.open` tries to open files using all available formats, returning whichever is successful. If you know the format of your file, you can pass its name with the ``format`` keyword, or directly use its format opener (for VDIF, it is `baseband.vdif.open`). Also, the `baseband.file_info` function can help determine the format and any missing information needed by `baseband.open` - see :ref:`Inspecting Files `. Do you have a sequence of files you want to read in? You can pass a list of filenames to `baseband.open`, and it will open them up as if they were a single file! See :ref:`Reading or Writing to a Sequence of Files `. .. _getting_started_reading: Reading Files ============= Radio baseband files are generally composed of blocks of binary data, or |payloads|, stored alongside corresponding metadata, or |headers|. Each header and payload combination is known as a :term:`data frame`, and most formats feature files composed of a long series of frames. Baseband file objects are frame-reading wrappers around Python file objects, and have the same interface, including `~baseband.base.base.VLBIStreamReaderBase.seek` for seeking to different parts of the file, `~baseband.base.base.VLBIStreamReaderBase.tell` for reporting the file pointer's current position, and `~baseband.base.base.VLBIStreamReaderBase.read` for reading data. The main difference is that Baseband file objects read and navigate in units of samples. Let's read some samples from the VDIF file:: >>> data = fh_vdif.read(3) >>> data # doctest: +FLOAT_CMP array([[-1. , 1. , 1. , -1. , -1. , -1. , 3.316505, 3.316505], [-1. , 1. , -1. , 1. , 1. , 1. , 3.316505, 3.316505], [ 3.316505, 1. , -1. , -1. , 1. , 3.316505, -3.316505, 3.316505]], dtype=float32) >>> data.shape (3, 8) Baseband decodes binary data into `~numpy.ndarray` objects. Notice we input ``3``, and received an array of shape ``(3, 8)``; this is because there are 8 VDIF |threads|. Threads and channels represent different |components| of the data such as polarizations or frequency sub-bands, and the collection of all components at one point in time is referred to as a :term:`complete sample`. Baseband reads in units of complete samples, and works with sample rates in units of complete samples per second (including with the Mark 5B example above). Like an `~numpy.ndarray`, calling ``fh_vdif.shape`` returns the shape of the entire dataset:: >>> fh_vdif.shape (40000, 8) The first axis represents time, and all additional axes represent the shape of a complete sample. A labelled version of the complete sample shape is given by:: >>> fh_vdif.sample_shape SampleShape(nthread=8) Baseband extracts basic properties and header metadata from opened files. Notably, the start and end times of the file are given by:: >>> fh_vdif.start_time