Low-Level Access to MIRIAD Data

Low-level access to MIRIAD data is awesome, but the narrative documentation for it will be written a bit later.

mirtask API Reference

class mirtask.DataSet(path, mode)
Synopsis :

an opened MIRIAD dataset

Parameters:
  • path (str) – the path of the dataset on disk
  • mode (str) – the mode in which to open the dataset; one of “rw”, “c”, or “a”

Instances of this class allow low-level manipulation of MIRIAD datasets. More specific subclasses, such as UVDataSet or XYDataSet, allow more structured access to the data contained in the dataset.

close()

Close the dataset.

closeHistory()

Close this data set’s history item.

copyItem(dest, itemname)

Copy an item from this dataset to another.

Parameters:
  • dest (DataSet or subclass) – the opened destination dataset
  • itemname (str) – the name of the item
Returns:

self

deleteAll()

Completely delete this data set. After calling this function, this object cannot be used.

deleteItem(name)

Delete an item from this data-set.

flush()

Write any changed items in the data set out to disk.

getArrayItem(itemname)

Read a dataset item as a homogeneous data array.

Parameters: itemname (str) – the name of the item to read
Returns: the item data
Return type: ndarray or str

This function reads an entire item from a MIRIAD dataset into a numpy array. The datatype and size of the array are automatically determined. Array sizes are not limited, so attempting to load a very large item can eat all of your memory.

Items detected as being of textual type are converted to Python strings before being returned.

Items of inhomogeneous types (“mixed” or “nonstandard” in getItemInfo()) are read in as byte arrays.

getItem(itemname, mode)

Return a DataItem object representing the desired item within this dataset. See the documentation of the DataItem constructor for the meaning of the ‘itemname’ and ‘mode’ parameters.

getItemInfo(itemname)

Get information about a dataset item.

Parameters: itemname (str) – the name of the item
Returns: (kind, dtype, nvalues, offset) (see below)

If you’re interested in fetching the value of an item, in most cases you can safely just call getScalarItem() or getArrayItem() directly.

kind is a string describing the item data. Possible values are:

Kind         Meaning
standard     The item is an array of one or more homogeneously-typed data values.
mixed        The item data are heterogeneously typed and not necessarily representable as an array.
nonstandard  The structure of the item data is unknown.
missing      The item is not present in the dataset.

If kind is “missing”, all of the remaining values are None.

dtype is a Numpy datatype for the item values. Possible values are int16, int32, int64, float32, float64, complex64, or str if the item data are textual. If kind is “mixed” or “nonstandard”, dtype is numpy.uint8.

nvalues is the number of data values in the item, treating it as an array. If kind is “mixed” or “nonstandard”, nvalues is the number of data bytes in the item.

offset is the offset into the item data stream at which the actual item data start. The item types are usually recorded in short header records located before this position in the item data stream, but in some cases the record is missing and its effective size can vary due to alignment constraints.

The return values are set up such that “mixed” or “nonstandard” items can be read in as byte arrays, but this will rarely be useful. External knowledge about the item format will be needed to properly decode “mixed” items. Items marked as “nonstandard” do not conform to basic MIRIAD data typing rules and are therefore likely to be evidence of dataset corruption or extra files inside the dataset directory that do not correspond to genuine MIRIAD data.
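
The four-element return value lends itself to a simple dispatch on kind. The sketch below is illustrative only: describe_item_info() is a hypothetical helper, not part of mirtask, and it works on a plain tuple shaped like the documented return value so that it can be tried without a dataset.

```python
def describe_item_info(info):
    """Summarize a (kind, dtype, nvalues, offset) tuple shaped like the
    return value of DataSet.getItemInfo().

    Hypothetical helper for illustration; not part of mirtask.
    """
    kind, dtype, nvalues, offset = info

    if kind == "missing":
        # For missing items the other three values are all None.
        return "item is not present"
    if kind == "standard":
        # Homogeneous data: getScalarItem()/getArrayItem() can fetch it.
        name = getattr(dtype, "__name__", str(dtype))
        return "%d value(s) of type %s starting at byte %d" % (nvalues, name, offset)
    if kind in ("mixed", "nonstandard"):
        # Here nvalues counts raw bytes; decoding needs outside knowledge.
        return "%d raw byte(s); format not self-describing" % nvalues
    raise ValueError("unexpected item kind: %r" % (kind,))
```

With an opened dataset ds, the call would presumably look like describe_item_info(ds.getItemInfo('history')).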

getMode()

Return the access mode of this data-set: read-only or read-write. See the MODE_X fields of this class for possible return values.

getScalarItem(itemname, default=None, missingok=True)

Get the value of a scalar dataset item

Parameters:
  • itemname (str) – the name of the item to fetch
  • default (any) – the value to return if the item is not present in this dataset; defaults to None.
  • missingok (bool) – whether default should be returned if the item is not present in this dataset; defaults to True. If False and the item is not present, ValueError is raised.
Returns:

the value

Return type:

numpy scalar type

This gets the value of a scalar dataset item, possibly returning a default value if the item is not found. The return value is a numpy scalar type appropriate for the item, or a string for textual items. Note that these types propagate, so there is a danger of overflow or underflow if you do some kinds of math with the return value. Furthermore, if you provide default, it will usually be one of the built-in Python numeric types, not a NumPy type, so if code depends on the type of the return value, there may be variations in behavior depending on whether the item was found or not.
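
One way to sidestep the type variation described above is to normalize the result to a built-in type right away. The following is a minimal sketch for numeric items only; get_python_float() is a hypothetical helper, and dataset is assumed to be an opened DataSet (or any object with a compatible getScalarItem() method).

```python
def get_python_float(dataset, itemname, default=0.0):
    """Fetch a numeric scalar item and normalize it to a built-in float.

    `dataset` is assumed to be an opened DataSet, or anything with a
    compatible getScalarItem() method. Hypothetical helper; textual
    items are not handled and would make float() raise.
    """
    value = dataset.getScalarItem(itemname, default)
    # float() accepts NumPy scalars and built-in numbers alike, so the
    # caller sees the same type whether or not the item was found.
    return float(value)
```

The same idea applies with int() for integer-valued items; the trade-off is that the original storage width of the item is no longer visible to the caller.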

hasItem(name)

Return whether this data-set contains an item with the given name.

itemNames()

Generate a list of the names of the data items contained in this data set.

logInvocation(ident, args=None)

Log the date, a task name, and an argument list to this dataset’s history.

Parameters:
  • ident (string) – an identifier that will prefix the history entries
  • args (string iterable or None) – a list of arguments, or None (the default); if the latter, sys.argv[1:] is used.
Returns:

self

This function emulates the MIRIAD library function HISINPUT. It logs the date, some arguments, and an identifier to the dataset’s history file, with the identifier traditionally being the name of a MIRIAD task. This implementation attempts to mimic the behavior of HISINPUT as closely as possible – except for its truncation of very long arguments.

Note that args should not start with an argv[0] entry.
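
As a rough sketch of how the pieces might fit together, the hypothetical helper below strips argv[0] before logging. It assumes that the history item has to be open while logInvocation() is called, which this page does not state explicitly; treat that ordering as an assumption.

```python
import sys


def record_invocation(dataset, ident, argv=None):
    """Append an invocation record to a dataset's history.

    Assumes `dataset` is an opened, writable DataSet and that the
    history item must be open while logInvocation() runs (an assumption,
    not something this reference confirms). Hypothetical helper.
    """
    if argv is None:
        argv = sys.argv
    dataset.openHistory("a")  # append to any existing history
    try:
        # Drop argv[0]: logInvocation() expects the arguments only.
        dataset.logInvocation(ident, list(argv[1:]))
    finally:
        dataset.closeHistory()
```

Passing args=None to logInvocation() directly has the same effect as passing sys.argv[1:], so the explicit slice matters mainly when the argument list comes from somewhere else.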

openHistory(mode='a')

Open the history item of this data set. ‘mode’ may be ‘r’ if the history is being read, ‘w’ for truncation and writing, and ‘a’ for appending. The default is ‘a’.

setArrayItem(itemname, dtype, value)

Set the value of a dataset item to a numpy array

Parameters:
  • itemname (str) – the name of the item
  • dtype (Numpy dtype or str) – the format to use for storing the data
  • value (Numpy ndarray) – the data
Returns:

self

Sets the value of the specified item to an array of data. Due to limitations in the MIRIAD I/O routines, data of type int16 are not allowed. Data divided into 8-bit chunks should be given a dtype of str. The number of items to write is given by the size of value.

setScalarItem(itemname, itemtype, value)

Set the value of a scalar dataset item.

Parameters:
  • itemname (str) – the name of the item to set
  • itemtype (type) – the type of the item value
  • value (any) – the item value
Returns:

self

Sets the value of a scalar dataset item. Because many aspects of MIRIAD rely on the particular storage types of dataset items, the type must be specified explicitly. The value will be cast to the specified type before writing if it is not already an instance of it.

Acceptable types are str, numpy.int32, numpy.int64, numpy.float32, numpy.float64, and numpy.complex64. Due to limitations in the MIRIAD I/O routines, numpy.int8 and numpy.int16, which are acceptable in other contexts, are not allowed here.

writeHistory(text)

Write text into this data set’s history file.

class mirtask.DataItem(dataset, itemname, mode)

An item contained within a Miriad dataset.

getSize()

Return the size of this data item.

read(offset, dtype, count)

Read data from this item into a newly-allocated buffer.

Parameters:
  • offset (int) – the byte offset into the item at which to read
  • dtype (Numpy dtype or str) – the type of data to read
  • count – the number of items to read
Returns:

the data

Return type:

ndarray of data type dtype, or str

Allocates a new buffer and reads data into it.

See also readInto().
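
Because read() takes a byte offset and an item count, a large item can be read in pieces instead of all at once. The sketch below only plans the reads; plan_chunked_reads() is a hypothetical helper, and it assumes that the offset reported by DataSet.getItemInfo() is the right starting point and that the caller knows the size in bytes of one value.

```python
def plan_chunked_reads(data_offset, itemsize, nvalues, chunk):
    """Yield (byte_offset, count) pairs that together cover `nvalues` values.

    data_offset -- byte offset at which the item data start (assumed to
                   be the `offset` value from DataSet.getItemInfo())
    itemsize    -- size of one value in bytes (e.g. 4 for float32)
    nvalues     -- total number of values in the item
    chunk       -- maximum number of values per read
    Hypothetical helper; not part of mirtask.
    """
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    done = 0
    while done < nvalues:
        count = min(chunk, nvalues - done)
        yield data_offset + done * itemsize, count
        done += count
```

Each pair could then be handed to a call such as item.read(byte_offset, dtype, count), keeping peak memory use proportional to chunk rather than to the whole item.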

readInto(offset, buf, count=None)

Read data from this item into a preexisting buffer.

Parameters:
  • offset (int) – the byte offset into the item at which to read
  • buf (ndarray) – the buffer into which the data should be read
  • count – the number of items to read. None, the default, signifies buf.size.
Returns:

buf

Reads data into a preexisting buffer. The data are interpreted as being of whatever format is specified by the data type of buf.

Item data that should be interpreted as strings cannot be read with this function.

See also read().

write(offset, dtype, buf, count=None)

Write data to this item.

Parameters:
  • offset (int) – the byte offset into the item at which to write
  • dtype (Numpy dtype or str) – the kind of data to write
  • buf (ndarray or other iterable) – the data to write
  • count – the number of items to write. None, the default, signifies buf.size.
Returns:

self

Writes data to the item. Before writing, buf is converted to a numpy ndarray if it is not already, then its contents are converted to the format dtype if they are not already in that format. If dtype is str, buf is stringified, then that binary sequence is written to the item.

class mirtask.UVDataSet(path, mode)

baselineShadowed(diameter_meters)

Returns whether the most recently-read UV record comes from antennas that were shadowed, assuming a given antenna diameter.

In order for this function to operate, you must apply a UV selection of the form “auto,or,-auto,or,shadow(1)”. This is a necessary hack to enable the internal UVW recomputation needed for shadow testing. Obviously, the example selection doesn’t filter out any data. If an appropriate “shadow()” selection is not applied, a MiriadError will be raised.

This function depends on an API in the MIRIAD UV I/O library that may not necessarily be exposed. If this is the case, this function will raise a NotImplementedError. You can check in advance whether this function is available by checking the return value of mirtask._miriad_c.probe_uvchkshadow(), True indicating availability.

diameter_meters - the diameter within which an antenna is considered shadowed, measured in meters.

Returns: boolean.

copyLineVars(output)

Copy UV variables to the output dataset that describe the current line in the input set.

copyMarkedVars(output)

Copy variables in this data set to the output data set. Only copies those variables which have changed and are marked as ‘copy’.

flush()

Write out any unbuffered changes to the UV data set.

getBandwidths(maxnread=4096, trustmaxnread=False)

Get the bandwidths of the channels being read.

Parameters:
  • maxnread (int) – size of the data buffer; default 4096
  • trustmaxnread (bool) – whether maxnread is known to be accurate; default False
Returns:

the bandwidths in GHz

Return type:

double ndarray

This function returns an array of channel bandwidths measured in GHz. There’s one element for each channel in the most recently-read UV record. The values may not match what you’d naively determine from the UV variables “sfreq” and “sdf” in various situations. Each bandwidth value is positive, which is not necessarily true for the underlying “sdf” variable.

The MIRIAD subroutine underlying this function requires a preallocated buffer for the output data. This routine can determine the correct size after-the-fact but does not know how large the buffer should be at the time of allocation. The maxnread parameter sets this size. If the value is too small for your data, memory corruption will result! This function attempts to detect this case and will raise an exception if a buffer overrun may have occurred. If trustmaxnread is True, the value of maxnread is assumed to be accurate, and no checking is performed.

getCurrentVisNum()

Get the serial number of the current UV record.

Returns: the serial number
Return type: int

Counting begins at zero.

getJyPerK()

Get the Jy/K calibration of the current record

Returns: the Jy/K value
Return type: float

For a regular UV dataset, this is just equivalent to reading the “jyperk” UV variable. mirtask.uvdat.UVDatDataSet instances require more complicated processing.

Returns zero if the value could not be determined.

getLineInfo()

Get line information about the current UV record.

Returns: line information, described below.
Return type: six-element integer ndarray

The six integers are [linetype, nchan, chan0, width, step, win0].

  • linetype – the kind of data being read. 1 indicates spectral data; 2 indicates wideband data; 3 indicates velocity-space data. (Symbolic constants for these are defined in mirtask.util.)
  • nchan – the number of channels in the record.
  • chan0 – the index of the first channel in the record. (This index is 1-based in the MIRIAD API, but is adjusted to be 0-based in miriad-python).
  • width – the number of input channels that are averaged together.
  • step – the increment between selected input channels.
  • win0 – If reading spectral or wideband data, -1. If resampling in velocity space, returns the index of the first spectral window contributing to the returned data. (This index is 1-based in the MIRIAD API, and the null return value is 0, but is likewise adjusted to be 0-based here.)
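
Positional access to the six integers is easy to get wrong, so it can help to give them names. This is a small sketch using only the standard library; LineInfo and unpack_line_info() are hypothetical names, and the input may be any six-element sequence shaped like the return value of getLineInfo().

```python
from collections import namedtuple

# Field names follow the order documented for getLineInfo().
LineInfo = namedtuple("LineInfo", "linetype nchan chan0 width step win0")


def unpack_line_info(raw):
    """Convert a six-element line-information sequence to a LineInfo.

    Hypothetical helper; `raw` may be an ndarray or any other sequence
    of six integers.
    """
    values = [int(v) for v in raw]
    if len(values) != 6:
        raise ValueError("expected six values, got %d" % len(values))
    return LineInfo(*values)
```

With an opened UV dataset uv, unpack_line_info(uv.getLineInfo()).nchan would then read more clearly than indexing element 1.
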

getLinetype(astext=False)

Get the linetype of the current UV record.

Parameters: astext (bool) – if True, return the linetype as its textual value rather than its integer code; default is False.
Returns: the linetype
Return type: int or str

The linetype values are enumerated in mirtask.util.linetypeName().

getNPol()

Get the number of simultaneous polarizations.

Returns: the number
Return type: int

For a regular UV dataset, this is just equivalent to reading the “npol” UV variable. mirtask.uvdat.UVDatDataSet instances require more complicated processing.

The “npol” quantity is used for on-the-fly Stokes processing of UV data. If a full-Stokes correlator is taking data, the ideal output format is one in which there are four consecutive UV records for each baseline / time combination: one for each simultaneous Stokes parameter. The four records can then easily be combined to perform Stokes conversions (e.g. XX and YY to I) with minimal overhead. In order to be able to do this, the Stokes processing code needs to know whether consecutive records have the desired properties, or not. The UV variable npol records this information.
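
To make the idea of combining consecutive records concrete, here is a deliberately simplified sketch in plain Python. Both functions are hypothetical and ignore flags, weights, and calibration; they only show how records might be grouped npol at a time and how XX and YY visibilities average to Stokes I under the usual I = (XX + YY) / 2 convention.

```python
def group_records(records, npol):
    """Split a flat list of records into consecutive groups of `npol`.

    Hypothetical helper; assumes the records really are stored as
    complete, consecutive groups.
    """
    if npol <= 0 or len(records) % npol != 0:
        raise ValueError("record count is not a multiple of npol")
    return [records[i:i + npol] for i in range(0, len(records), npol)]


def stokes_i(xx, yy):
    """Combine per-channel XX and YY visibilities into Stokes I.

    Uses I = (XX + YY) / 2 and ignores flagging entirely.
    """
    return [(a + b) / 2 for a, b in zip(xx, yy)]
```

For example, stokes_i([1+1j], [3-1j]) averages the two visibilities channel by channel.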

getPol()

Get the polarization code of the current record.

Returns: the polarization code
Return type: int

For a regular UV dataset, this is just equivalent to reading the “pol” UV variable. mirtask.uvdat.UVDatDataSet instances require more complicated processing.

The default polarization is Stokes I. See the constants in mirtask.util.

getScalar(variable, default=None, missingok=True)

Get the value of a scalar UV variable.

Parameters:
  • variable (str) – the name of the variable to fetch
  • default (any) – the value to return if the variable is not defined in this dataset; defaults to None.
  • missingok (bool) – if default should be returned if the variable is not defined in this dataset; defaults to True. If False and the variable is not defined, raises ValueError.
Returns:

the value

Return type:

numpy scalar type

This gets the value of a scalar UV variable, possibly returning a default value if the variable is not found. The return value is a numpy scalar type appropriate for the UV variable. Note that these types propagate, so there is a danger of overflow or underflow if you do some kinds of math with the return value. Furthermore, if you provide default, it will usually be one of the built-in Python numeric types, not a NumPy type, so if code depends on the type of the return value, there may be variations in behavior depending on whether the variable was found or not.

This function actually succeeds for array-valued UV variables as well. In that case, the first array element is returned. The most common use of this function, however, is for variables like nants that have only one value (unless the dataset is semantically invalid).

getSkyFrequencies(maxnread=4096, trustmaxnread=False)

Get the sky frequencies of the channels being read.

Parameters:
  • maxnread (int) – size of the data buffer; default 4096
  • trustmaxnread (bool) – whether maxnread is known to be accurate; default False
Returns:

the sky frequencies in GHz

Return type:

double ndarray

This function returns an array of sky frequencies measured in GHz. There’s one element for each channel in the most recently-read UV record. The values may not match what you’d naively determine from the UV variables “sfreq” and “sdf” in various situations.

The MIRIAD subroutine underlying this function requires a preallocated buffer for the output data. This routine can determine the correct size after-the-fact but does not know how large the buffer should be at the time of allocation. The maxnread parameter sets this size. If the value is too small for your data, memory corruption will result! This function attempts to detect this case and will raise an exception if a buffer overrun may have occurred. If trustmaxnread is True, the value of maxnread is assumed to be accurate, and no checking is performed.

getVarComplex(varname, n=1)

Retrieve the current value or values of a complex-valued UV variable.

getVarDouble(varname, n=1)

Retrieve the current value or values of a double-valued UV variable.

getVarFloat(varname, n=1)

Retrieve the current value or values of a float-valued UV variable.

getVarInt(varname, n=1)

Retrieve the current value or values of an int32-valued UV variable.

getVarShort(varname, n=1)

Retrieve the current value or values of an int16-valued UV variable.

getVarString(varname)

Retrieve the current value of a string-valued UV variable. Maximum length of 512 characters.

getVariance()

Get the variance of the first channel of the current UV record.

Returns: the variance
Return type: double

Keep in mind that if the read-in data comprise multiple windows with different channel bandwidths, the variance needs to be scaled appropriately: variance ~ 1 / bandwidth (equivalently, the rms noise ~ 1 / sqrt (bandwidth)).

Returns zero if the variance could not be determined.

initVarsAsInput(linetype)

Initialize the UV reading functions to copy variables from this file as an input file. Linetype should be one of ‘channel’, ‘wide’, or ‘velocity’. Maps to Miriad’s varinit() call.

initVarsAsOutput(input, linetype)

Initialize this dataset as the output file for the UV reading functions. Linetype should be one of ‘channel’, ‘wide’, or ‘velocity’. Maps to Miriad’s varonit() call.

lowlevelRead(preamble, data, flags, length=None)

Read a visibility record from the file. This function should be avoided in favor of the uvdat routines except for certain low-level manipulations. Length defaults to the length of the flags array.

Returns: the number of items read.

makeVarTracker()

Create a UVVarTracker object, which can be used to track the values of UV variables and when they change.

next()

Skip to the next UV data record. On write, this causes an end-of-record mark to be written.

probeVar(varname)

Get information about a given variable. Returns (type, length, updated) or None if the variable is undefined.

type - The variable type character: a (text), r (“real”/float), i (int), d (double), c (complex)

length - The number of elements in this variable; zero if unknown.

updated - True if the variable was updated on the last UV data read.
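
The type character returned by probeVar() can be used to pick the matching getVar* method. The sketch below is one possible way to do that; read_uv_variable() is a hypothetical helper, it only handles the type characters listed above, and treating an unknown length of zero as a single element is an assumption.

```python
# Type characters documented for probeVar(), mapped to the name of the
# UVDataSet getter for that type.
_GETTERS = {
    "r": "getVarFloat",
    "i": "getVarInt",
    "d": "getVarDouble",
    "c": "getVarComplex",
}


def read_uv_variable(uv, varname):
    """Read a UV variable using the type reported by probeVar().

    `uv` is assumed to be an opened UVDataSet positioned on a record.
    Returns None if the variable is undefined. Hypothetical helper.
    """
    info = uv.probeVar(varname)
    if info is None:
        return None
    vtype, length, _updated = info
    if vtype == "a":
        return uv.getVarString(varname)  # strings take no element count
    if vtype not in _GETTERS:
        raise ValueError("unhandled variable type %r" % (vtype,))
    # A length of zero means "unknown"; fall back to one element
    # (an assumption made for this sketch).
    return getattr(uv, _GETTERS[vtype])(varname, max(length, 1))
```

A call such as read_uv_variable(uv, 'nants') would then not need to know in advance which getter applies.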

rewind()

Rewind to the beginning of the file, allowing the UV data to be reread from the start.

rewriteFlags(flags)

Rewrite the channel flagging data for the current visibility record. ‘flags’ should be a 1D integer ndarray of the same length and dtype returned by a uvread call.

scanUntilChange(varname)

Scan through the UV data until the given variable changes. Reads to the end of the record in which the variable changes. Returns False if end-of-file was reached, True otherwise.

setCorrelationType(type)

Set the correlation type that will be used in this vis file.

setPreambleType(*vars)

Specify up to five variables to put in the preamble block. Should be given a list of variable names; ‘uv’ and ‘uvw’ are a special expansion of ‘coord’ that expand out to their respective UV coordinates. Default list is ‘uvw’, ‘time’, ‘baseline’.
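
When allocating a preamble buffer it helps to know how many elements a given variable list implies. The sketch below is a guess at that bookkeeping rather than a statement of the library's behavior: preamble_length() is a hypothetical helper that assumes ‘uvw’ expands to three values, ‘uv’ to two, and every other variable contributes exactly one.

```python
def preamble_length(*names):
    """Number of preamble elements implied by a list of variable names.

    Assumes 'uvw' expands to three values, 'uv' to two, and every other
    variable contributes one element. Hypothetical helper; not part of
    mirtask.
    """
    if len(names) > 5:
        raise ValueError("at most five preamble variables are allowed")
    sizes = {"uvw": 3, "uv": 2}
    return sum(sizes.get(name, 1) for name in names)
```

Under these assumptions the default list of ‘uvw’, ‘time’, ‘baseline’ corresponds to a five-element preamble.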

trackVar(varname, watch, copy)

Set how the given variable is tracked. If ‘watch’ is true, updated() will return true when this variable changes after a chunk of UV data is read. If ‘copy’ is true, this variable will be copied when copyMarkedVars() is called.

updated()

Return true if any user-specified ‘important variables’ have been updated in the last chunk of data read.

write(preamble, data, flags, length=None)

Write a visibility record consisting of the given preamble, data, flags, and length. Length defaults to the length of the flags array.

writeVarDouble(name, val)

Write a double UV variable. val can either be a single value or an ndarray for array variables.

writeVarFloat(name, val)

Write a float UV variable. val can either be a single value or an ndarray for array variables.

writeVarInt(name, val)

Write an integer UV variable. val can either be a single value or an ndarray for array variables.

writeVarString(name, val)

Write a string UV variable. val will be stringified.

class mirtask.UVVarTracker(owner)

copyTo(output)

Copy the variables tracked by this tracker into the output data set.

track(*vars)

Indicate that the specified variable(s) should be tracked by this tracker. Returns self for convenience.

updated()

Return true if one of the variables tracked by this tracker was updated in the last UV data read.

class mirtask.XYDataSet(path, mode, axes=None)
Synopsis: an opened image dataset

This class provides access to MIRIAD image data. It allows whole image planes to be read in easily using the XYDataSet.readPlane() function.

You shouldn’t create XYDataSet instances directly. Instead, use miriad.ImData.open().

axes = None

An integer ndarray of axis sizes. Stored in “inside-out” format: axes[0] is the most quickly-varying axis, almost always the image column number. axes[1] is the second-most quickly-varying axis, almost always the image row number. axes is set upon creation of the instance and modifications to it after that point have no effect (besides probably causing the methods to crash).

flush()

Write any pending changes to disk.

Returns: self

readPlane(axes=None, buf=None, topIsZero=False)

Read the current plane.

Parameters:
  • axes (int ndarray) – the pixel coordinates of the non-plane axes, or None (the default) to use the current axes
  • buf (masked ndarray of shape (nrow, ncol)) – the buffer into which the data are stored, or None (the default) to allocate a new buffer
  • topIsZero (bool) – whether to invert the image ordering from MIRIAD’s bottom-to-top ordering to top-to-bottom
Returns:

the buffer

Raises :

MiriadError about end-of-file if MIRIAD doesn’t know which plane to read

Reads the current plane into a buffer. If buf is not None, it must be of shape (nrow, ncol) (equivalently, (self.axes[1], self.axes[0])) and be a masked ndarray. Otherwise, a new buffer is allocated.

You must tell MIRIAD which image plane you wish to read before calling this function – otherwise, a MiriadError about end-of-file is raised. You can do this either by giving a non-None value to the axes argument, or by calling setPlane() explicitly. (The former approach is a shorthand for the latter.) Note that the default value of axes (None) doesn’t change which plane should be read, but also doesn’t choose a plane if none has been chosen already. If you want to read the first plane in an image without any setup, the correct call is:

data = ImData ('path').open ('rw').readPlane (axes=[])

MIRIAD’s image coordinate system is “bottom-to-top”, where pixel (0, 0) in a plane is its bottom-left pixel. This can be counterintuitive, but all of MIRIAD’s coordinate routines rely on this system, so you should attempt to get used to it. However, in certain cases it can be useful to read in a plane such that pixel (0, 0) is its top-left pixel. Setting topIsZero to True does this.
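
The relationship between the two row orderings is just a reversal of the row index. The following sketch shows that mapping in plain Python; both functions are hypothetical and assume that topIsZero amounts to reversing the order of the rows and nothing else.

```python
def flip_row_index(row, nrow):
    """Map a row index between bottom-to-top and top-to-bottom order.

    The mapping is its own inverse. Hypothetical helper.
    """
    if not 0 <= row < nrow:
        raise IndexError("row %d out of range for %d rows" % (row, nrow))
    return nrow - 1 - row


def flip_rows(plane):
    """Return the rows of a plane (a sequence of rows) in reversed order."""
    return plane[::-1]
```

For a masked ndarray the equivalent operation would be the slice plane[::-1], which reverses the first (row) axis without copying the data.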

See also readRows() and readRow().

readRow(rownum)

Read a row of data from the current plane

Parameters: rownum (int) – the zero-based row number to read
Returns: a masked ndarray of data

The method setPlane() must be called before the first attempt to read or write image data.

Reads one row of data and flags from the current plane into a buffer. The returned array has a shape of (self.axes[0], ). The buffer is stored in the object instance and is shared between all I/O calls, so be careful with concurrent access.

See also readRows() and readPlane().

readRows(topIsZero=False)

Read all rows of data from the current plane

Parameters: topIsZero (bool) – whether to invert the image ordering from MIRIAD’s bottom-to-top ordering to top-to-bottom
Returns: generator yielding masked ndarray of data

Reads all rows of data and flags from the current plane into a buffer. The returned arrays have a shape of (self.axes[0], ). The buffer is stored in the object instance and is shared between all I/O calls, so be careful with concurrent access.

The method setPlane() must be called before the first attempt to read or write image data.

MIRIAD’s image coordinate system is “bottom-to-top”, where pixel (0, 0) in a plane is its bottom-left pixel. This can be counterintuitive, but all of MIRIAD’s coordinate routines rely on this system, so you should attempt to get used to it. However, in certain cases it can be useful to read in a plane such that pixel (0, 0) is its top-left pixel. Setting topIsZero to True does this.

See also readPlane() and readRow().

setPlane(axes=[])

Set the active plane for reading or writing.

Parameters: axes (int ndarray) – the pixel coordinates of the non-plane axes (default zeros)
Returns: self

A MIRIAD image can have any (reasonable) number of dimensions, but is typically read one “plane” at a time. A plane comprises a subcube of the first two axes of data with the coordinates of the other axes held constant. This routine sets the pixel coordinate values of the other axes.

If an image has n axes, axes should have at most n - 2 elements, because two axes refer to the plane being read. However, axes may have fewer elements, with the pixel values of the outer axes being set to zero. It is valid for axes to be an empty list, specifying that all non-plane coordinates should be set to zero, and this is in fact the default argument.

Note that in MIRIAD, array indices are Fortran style and begin at one; in this function, as in Python in general, array indices begin at zero.
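
The padding and index-base rules above can be spelled out in a few lines. This is a sketch of the bookkeeping only, not of what setPlane() does internally; expand_plane_axes() is a hypothetical helper, and naxis (the total number of image axes) is a name introduced here for illustration.

```python
def expand_plane_axes(naxis, axes=()):
    """Pad `axes` with zeros so that it covers every non-plane axis.

    naxis -- total number of image axes
    axes  -- zero-based pixel coordinates of the non-plane axes
    Returns (zero_based, one_based) coordinate lists; the one-based list
    is what a Fortran-style MIRIAD routine would expect.
    Hypothetical helper; not part of mirtask.
    """
    axes = [int(a) for a in axes]
    nother = max(naxis - 2, 0)  # the first two axes form the plane
    if len(axes) > nother:
        raise ValueError("too many axis coordinates: %d > %d" % (len(axes), nother))
    zero_based = axes + [0] * (nother - len(axes))
    one_based = [a + 1 for a in zero_based]
    return zero_based, one_based
```

For a four-axis image, expand_plane_axes(4, [3]) fills in the missing outermost coordinate with zero and reports both index conventions.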

wcs()

Retrieve a pywcs.WCS object representing the coordinate system of this image.

Return type: pywcs.WCS, list of str
Returns: tuple of (wcs, warnings)
Raises: ImportError if pywcs is not available
Raises: MemoryError if memory for the instance couldn’t be allocated

Note that wcslib, and hence pywcs, use degrees internally, unlike MIRIAD.

This function returns a tuple of a WCS coordinate system object and a list of warnings encountered when setting up the coordinate system. It’s up to the caller to decide what to do about the warnings, including whether and how to present them to the user.
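
A caller might handle the two parts of the return value along the following lines. Both functions are hypothetical and use only the standard library; the unit conversion assumes that the MIRIAD-side angles are expressed in radians, which is the difference in units alluded to above.

```python
import math
import sys


def report_wcs_warnings(warnings, stream=None):
    """Write each warning string on its own line; return how many there were.

    `warnings` is assumed to be the list of str returned alongside the
    WCS object. Hypothetical helper.
    """
    if stream is None:
        stream = sys.stderr
    for message in warnings:
        stream.write("warning: %s\n" % message)
    return len(warnings)


def radians_to_degrees(values):
    """Convert angles in radians (assumed MIRIAD convention) to degrees."""
    return [math.degrees(v) for v in values]
```

Typical use would be something like wcs, warnings = image.wcs() followed by report_wcs_warnings(warnings), leaving the decision about whether warnings are fatal to the calling code.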

At the moment, we do not encourage newly-written Python code to attempt to use the classical MIRIAD APIs for coordinate manipulation.

writePlane(maskeddata, axes=None, topIsZero=False)

Write a plane of data.

Parameters:
  • maskeddata (masked ndarray of shape (nrow, ncol)) – the data buffer
  • axes (int ndarray) – the pixel coordinates of the non-plane axes, or None (the default) to use the current axes
  • topIsZero (bool) – whether to invert the image ordering from MIRIAD’s bottom-to-top ordering to top-to-bottom
Returns:

self

Writes data to the current plane. maskeddata must be of shape (nrow, ncol) (equivalently, (self.axes[1], self.axes[0])) and be a masked ndarray.

The method setPlane() must be called before the first attempt to read or write image data. If axes is not None, setPlane() will be called with axes as an argument before performing the write.

MIRIAD’s image coordinate system is “bottom-to-top”, where pixel (0, 0) in a plane is its bottom-left pixel. This can be counterintuitive, but all of MIRIAD’s coordinate routines rely on this system, so data will likely come in this format. However, if maskeddata is stored in a top-to-bottom system, where pixel (0, 0) is its top-left pixel, setting topIsZero to True will write out the data in the correct order.

See also writeRow().

writeRow(rownum, maskeddata)

Write a row of data to the current plane

Parameters:
  • rownum (int) – the zero-based row number to write
  • maskeddata (numpy maskedarray) – the data to write
Returns:

self

Writes one row of data and flags to the current plane. The argument maskeddata must have a shape of (self.axes[0], ). rownum should be between zero and self.axes[1] - 1.

The method setPlane() must be called before the first attempt to read or write image data.

See also writePlane().

class mirtask.MaskItem(dataset, itemname, mode)

A ‘mask’ item contained within a Miriad dataset.
