imgpipe: imaging pipeline
imgpipe runs WSClean on one or more
Measurement Sets, driven by a TOML configuration file. It distributes the
imaging jobs across nodes using the same [worker] pool as calpipe.
WSClean options are written directly in the [wsclean] section of the
config file and map one-to-one to WSClean command-line flags. A few
special values (see below) allow the output directory and some parameters
to be derived automatically from the input MS.
Use cases
Image a list of MS files directly
imgpipe img.toml SW03_T001.MS SW03_T002.MS SW03_T003.MS
Runs one WSClean job per MS.
Image all MS for a set of observations via the data handler
imgpipe img.toml "202312*_NT04"
When [data_handler] is configured in img.toml, imgpipe resolves the
obs_id pattern to MS paths using the data handler and then submits one
WSClean job per MS.
Combine all MS into a single image
imgpipe img.toml "202312*_NT04" --combine
All MSs for the same spectral window and obs_id are passed to WSClean in a single call (multi-MS imaging). WSClean then grids their visibilities together to produce a single image.
Combine across obs_ids too
imgpipe img.toml "202312*_NT04" --combine --combine_obs_ids
Groups all MSs across all obs_ids into one WSClean call per spectral window. Useful for deep integrations.
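The grouping behind these modes can be sketched in Python. This is an illustration only, not imgpipe's actual code: MSEntry and group_ms are hypothetical names, and the sketch assumes each MS has already been tagged with its obs_id and spectral window.

```python
from collections import defaultdict
from typing import NamedTuple


class MSEntry(NamedTuple):
    """Hypothetical record describing one Measurement Set."""
    path: str
    obs_id: str
    sw: str


def group_ms(entries, combine=False, combine_obs_ids=False):
    """Return one list of MS paths per WSClean call."""
    if not combine:
        # Default: one WSClean job per MS.
        return [[e.path] for e in entries]
    groups = defaultdict(list)
    for e in entries:
        # --combine groups by (obs_id, sw); adding --combine_obs_ids
        # drops obs_id from the key, leaving one group per spectral window.
        key = (e.sw,) if combine_obs_ids else (e.obs_id, e.sw)
        groups[key].append(e.path)
    return list(groups.values())
```

In this sketch combine_obs_ids has no effect unless combine is also set, which mirrors how the two flags are used together in the command above.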
Configuration file
[worker]
nodes = 'nancep5' # comma-separated list of hosts, or 'localhost'
max_concurrent = 4 # max simultaneous WSClean jobs per node
env_file = '~/.bashrc' # sourced before each job (to activate software env)
dry_run = false
# Optional: resolve obs_ids -> MS paths via a data handler
[data_handler]
config_file = 'data_handler.toml'
data_level = 'L2'
[wsclean]
name = 'img' # output image name prefix (required)
out-dir = '$ms_basename$/images' # output directory (see path tokens below)
pol = 'I'
size = '1024 1024'
scale = '1arcmin'
weight = 'briggs 0'
data-column = 'CORRECTED_DATA'
niter = 1000
auto-threshold = 3.0
channels-out = 'all' # special value: one output channel per MS channel
WSClean option mapping
Every key in [wsclean] except name and out-dir is passed directly to
WSClean as a command-line flag:
- String or numeric values become -key value
- Boolean true becomes -key (flag with no value)
- The key name sets the -name argument, prefixed with the output directory
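A minimal Python sketch of these mapping rules follows. It is not imgpipe's implementation: the function name wsclean_args is hypothetical, and splitting multi-word values such as '1024 1024' on whitespace and skipping keys set to false are assumptions made for the example.

```python
def wsclean_args(options, out_dir, ms_paths):
    """Build a WSClean argument list from a [wsclean]-style dict."""
    args = ["wsclean"]
    for key, value in options.items():
        if key in ("name", "out-dir"):
            continue  # handled separately below
        if value is True:
            args.append(f"-{key}")  # boolean true: flag with no value
        elif value is False:
            continue  # assumption: false simply omits the flag
        else:
            # Assumption: a value such as '1024 1024' carries several arguments.
            args.append(f"-{key}")
            args.extend(str(value).split())
    # The image name prefix is placed inside the output directory.
    args += ["-name", f"{out_dir.rstrip('/')}/{options['name']}"]
    # WSClean takes the input Measurement Sets after the options.
    args.extend(ms_paths)
    return args
```

For example, wsclean_args({'name': 'img', 'weight': 'briggs 0'}, '/data/out', ['SW03_T001.MS']) yields the arguments wsclean -weight briggs 0 -name /data/out/img SW03_T001.MS.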
Special values for channels-out
| Value | Result |
|---|---|
| 'all' | Set to the total number of channels in the MS |
| … | Set to … |
| Any integer | Passed through as-is |
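As a rough Python sketch, the 'all' and integer cases could be resolved as follows. The function name resolve_channels_out is hypothetical, the channel count is assumed to have been read from the MS beforehand, and any other special value is not covered here.

```python
def resolve_channels_out(value, n_channels):
    """Resolve a channels-out setting; n_channels is the MS channel count."""
    if value == "all":
        return n_channels  # one output channel per MS channel
    if isinstance(value, int) and not isinstance(value, bool):
        return value  # integers pass through unchanged
    # Other special values are outside the scope of this sketch.
    raise ValueError(f"unsupported channels-out value: {value!r}")
```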
Output directory tokens
The out-dir value supports several placeholder tokens that are expanded
at runtime:
| Token | Expands to |
|---|---|
| … | Full path of the input MS |
| $ms_basename$ | Basename of the input MS (directory name without path) |
| $name$ | Value of the name key |
| $obs_id$ | Obs_id (only when using the data handler) |
| $sw$ | Spectral window (only when using the data handler) |
Example:
out-dir = '/data/images/$obs_id$/$sw$/$name$'
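Token expansion of this kind can be sketched in Python with a single regular-expression substitution. The function name expand_tokens and the obs_id value used below are hypothetical, and raising an error on an unknown token is an assumption rather than documented imgpipe behaviour.

```python
import re


def expand_tokens(template, values):
    """Replace each $token$ in template with values[token]."""
    def substitute(match):
        token = match.group(1)
        if token not in values:
            raise KeyError(f"no value for token ${token}$")
        return values[token]
    # A token is a run of word characters enclosed in dollar signs.
    return re.sub(r"\$(\w+)\$", substitute, template)


# Hypothetical values for the example out-dir above.
path = expand_tokens(
    "/data/images/$obs_id$/$sw$/$name$",
    {"obs_id": "20231201_NT04", "sw": "SW03", "name": "img"},
)
print(path)  # /data/images/20231201_NT04/SW03/img
```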
run_on_file_host: co-locate imaging with data
When the data lives on distributed storage (one MS per node), WSClean
runs fastest on the node that holds the file. Enable this in [worker]:
[worker]
run_on_file_host = true
run_on_file_host_pattern = '\/net/(node\d{3})'
imgpipe extracts the hostname from the MS path using the regex pattern
and submits each job to that host.
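The extraction step can be illustrated with Python's re module. This is a sketch under assumptions: the function name file_host and the example paths are hypothetical, and the use of a search (rather than a full match) with the host in capture group 1 is assumed from the example pattern.

```python
import re

# Pattern from the example above; TOML single-quoted strings keep backslashes as-is.
PATTERN = r"\/net/(node\d{3})"


def file_host(ms_path, pattern=PATTERN):
    """Return the host captured by group 1, or None if the path does not match."""
    match = re.search(pattern, ms_path)
    return match.group(1) if match else None


print(file_host("/net/node042/data/SW03_T001.MS"))  # node042
```

A path that does not match the pattern returns None here; what imgpipe does in that case is not specified in this section.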
Reference
imgpipe
Imaging pipeline
Usage
imgpipe [OPTIONS] CONFIG_FILE MS_INS_OR_OBS_IDS...
Options
- --version
Show the version and exit.
- -c, --combine
Combine all MS to produce one image (for each SW/OBS_ID when applicable).
- -o, --combine_obs_ids
Also combine across OBS_IDS (when applicable).
- -n, --nodes_mpi <nodes_mpi>
Nodes to use for distributed imaging.
Arguments
- CONFIG_FILE
Required argument
- MS_INS_OR_OBS_IDS
Required argument(s)