Observing Matrices
Introduction
A fundamental fact about CMB observations is that any given map is filtered in some way. For instance, ground-based experiments apply timestream filters to mitigate the contribution of atmospheric noise, and even full-sky experiments such as WMAP and Planck must mask out the galactic plane.
Generically, any map making procedure can be encoded as some function which transforms an idealized (pixelized) map $\mat{\hat m}$ into an observed map $\mat{\tilde m}$. If the function is a linear operation, then it can be encoded as a so-called observing matrix $\mat R$ with
\[ \mat{\tilde m} = \mat R \mat{\hat m}\]
This section describes the tools provided by CMB.jl
for working with obsering matrices. These tools are specifically motivated by the use of observing matrices in generating reobserved pixel-pixel covariance matrices.
A detailed description of composing an observing matrix is given in BICEP/Keck Array VII (as well as the theses of J. Tolan and J. Willmert).
Loading an Observing Matrix
It is assumed that all observing matrices are sparse, and the native format used by this package is of compressed sparse column (CSC) matrices in accordance with Julia's SparseMatrixCSC
type from the SparseArrays
standard library. Observing matrices are read from disk with the read_obsmat
function, and the following formats are supported:
An HDF5 file written by
write_obsmat
. (See the next section.)numpy
sparse matrices saved to an HDF5 file byh5sparse
in either CSC or compressed sparse row (CSR) format.MATLAB save files. Requires explicitly loading
MAT.jl
first.Julia sparse matrices saved to an HDF5 file in the
JLD
format. Requires explicitly loadingJLD.jl
first.Julia sparse matrices saved to an HDF5 file in the
JLD2
format. Requires explicitly loadingJLD2.jl
first.
For repeated computation, the first format should be preferred due to its support for memory mapping the matrix from disk.[1]
The return is a tuple, first the observing matrix $\mat R$ and second a NamedTuple
of metadata with the following fields:
fields
— a description of the Stokes fields for which the observing matrix applies — e.g. the value"QU"
signals that $\mat R$ is the block matrix $[\mat{R_{QQ}} \mat{R_{QU}}; \mat{R_{UQ}} \mat{R_{UU}}]$.pixels_right
— a description of the "right-hand side" map pixelization, i.e. the pixelization of a map-vector $\mat{\hat m}$ for which the product $\mat R \mat{\hat m}$ is defined.pixels_left
— a description of the "left-hand side" map pixelization, i.e. the pixelization of the resultant map-vector $\mat{\tilde m} = \mat R \mat{\hat m}$.
The pixelization descriptions are in general data-format and situation specific. For the HDF5 backend, the pixelization description may be any "native" HDF5.jl
data type (such as numbers, strings, or arrays thereof) or an HDF5 group which is loaded recursively into a Dict{String,Any}
with leaf-nodes which are a native type.
Exporting an Observing Matrix
To support memory mapping the sparse matrix from disk, the write_obsmat
function is provided to ensure proper data alignment and choice of HDF5 options.
Using Memory Mapping
By default with the HDF5 backend, the component arrays of the sparse matrix are memory mapped. This choice is made for one primary reason — memory mapping the arrays allows multiple instances of Julia to share a single copy of the matrix in RAM, with the OS managing this behavior automatically. This is especially useful for computations being run on a cluster where the cluster manager may launch multiple jobs doing similar calculations on the same compute node.
The default memory mapping behavior can be changed by setting the values of CMB.Files.READ_OBSMAT_MMAP
and CMB.Files.READ_OBSMAT_MMAP_FLAGS
— the former can be set to false
to disable memory mapping completely:
julia> CMB.Files.READ_OBSMAT_MMAP[] = false # disables default memory mapping
while the latter dictates the flags which are used during the underlying mmap
call. On Linux systems, the default is
julia> CMB.Files.READ_OBSMAT_MMAP_FLAGS[] = UnixMmap.MMAP_SHARED | UnixMmap.MMAP_POPULATE
so that the arrays are pre-faulted into active RAM. (On other OSs, the arrays can either be manually faulted with a read loop, or the loads can be left to happen on demand when the matrix is used.)
- 1In particular, it is not possible to memory map the on-disk arrays saved by
numpy
and MATLAB because the internal data structures use 0-indexed pointers which are incompatible with Julia's use of 1-indexing.