# imgpipe — imaging pipeline

`imgpipe` runs [WSClean](https://wsclean.readthedocs.io) on one or more Measurement Sets, driven by a TOML configuration file. It distributes the imaging jobs across nodes using the same `[worker]` pool as `calpipe`.

WSClean options are written directly in the `[wsclean]` section of the config file and map one-to-one to WSClean command-line flags. A few special values (see below) allow the output directory and some parameters to be derived automatically from the input MS.

## Use cases

### Image a list of MS files directly

```
imgpipe img.toml SW03_T001.MS SW03_T002.MS SW03_T003.MS
```

Runs one WSClean job per MS.

### Image all MS for a set of observations via the data handler

```
imgpipe img.toml "202312*_NT04"
```

When `[data_handler]` is configured in `img.toml`, `imgpipe` resolves the obs_id pattern to MS paths using the data handler and then submits one WSClean job per MS.

### Combine all MS into a single image

```
imgpipe img.toml "202312*_NT04" --combine
```

All MSs for the same spectral window and obs_id are passed to WSClean in a single call (multi-MS imaging). WSClean averages them jointly during gridding.

### Combine across obs_ids too

```
imgpipe img.toml "202312*_NT04" --combine --combine_obs_ids
```

Groups all MSs across all obs_ids and spectral windows into one WSClean call per spectral window. Useful for deep integrations.

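The grouping behaviour of the use cases above can be pictured with a small Python sketch. This is an illustration only, not `imgpipe`'s actual code: the function `group_ms`, the `(obs_id, sw, ms_path)` tuple layout, and the example obs_ids are assumptions made for this example. It also assumes that `--combine_obs_ids` has no effect without `--combine`.

```python
from collections import defaultdict


def group_ms(entries, combine=False, combine_obs_ids=False):
    """Return one list of MS paths per WSClean call (hypothetical helper).

    entries: iterable of (obs_id, sw, ms_path) tuples.
    """
    if not combine:
        # Default: one WSClean job per MS.
        return [[ms_path] for _, _, ms_path in entries]

    groups = defaultdict(list)
    for obs_id, sw, ms_path in entries:
        # With combine_obs_ids the obs_id is dropped from the key,
        # so only the spectral window separates the groups.
        key = (sw,) if combine_obs_ids else (obs_id, sw)
        groups[key].append(ms_path)
    return list(groups.values())


entries = [
    ("20231201_NT04", "SW03", "a.MS"),
    ("20231201_NT04", "SW03", "b.MS"),
    ("20231202_NT04", "SW03", "c.MS"),
    ("20231202_NT04", "SW04", "d.MS"),
]
per_ms = group_ms(entries)
per_obs = group_ms(entries, combine=True)
per_sw = group_ms(entries, combine=True, combine_obs_ids=True)
```

Here `per_ms` holds four single-MS jobs, `per_obs` holds three jobs (one per obs_id and spectral window), and `per_sw` holds two jobs (one per spectral window).
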
## Configuration file

```toml
[worker]
nodes = 'nancep5'          # comma-separated list of hosts, or 'localhost'
max_concurrent = 4         # max simultaneous WSClean jobs per node
env_file = '~/.bashrc'     # sourced before each job (to activate software env)
dry_run = false

# Optional: resolve obs_ids -> MS paths via a data handler
[data_handler]
config_file = 'data_handler.toml'
data_level = 'L2'

[wsclean]
name = 'img'                        # output image name prefix (required)
out-dir = '$ms_basename$/images'    # output directory (see path tokens below)
pol = 'I'
size = '1024 1024'
scale = '1arcmin'
weight = 'briggs 0'
data-column = 'CORRECTED_DATA'
niter = 1000
auto-threshold = 3.0
channels-out = 'all'                # special value: one output channel per MS channel
```

### WSClean option mapping

Every key in `[wsclean]` except `name` and `out-dir` is passed directly to WSClean as a command-line flag:

- String or numeric values become `-key value`
- Boolean `true` becomes `-key` (flag with no value)
- The key `name` sets the `-name` argument, prefixed with the output directory

### Special values for `channels-out`

| Value | Result |
|---|---|
| `'all'` | Set to the total number of channels in the MS |
| `'every N'` | Set to `total_channels // N` |
| Any integer | Passed through as-is |

### Output directory tokens

The `out-dir` value supports several placeholder tokens that are expanded at runtime:

| Token | Expands to |
|---|---|
| `$ms_in$` | Full path of the input MS |
| `$ms_basename$` | Basename of the input MS (directory name without path) |
| `$name$` | Value of the `name` key in `[wsclean]` |
| `$obs_id$` | Obs_id (only when using the data handler) |
| `$sw$` | Spectral window (only when using the data handler) |

Example:

```toml
out-dir = '/data/images/$obs_id$/$sw$/$name$'
```

### `run_on_file_host` — co-locate imaging with data

When the data lives on distributed storage (one MS per node), WSClean runs fastest on the node that holds the file.
Enable this in `[worker]`:

```toml
[worker]
run_on_file_host = true
run_on_file_host_pattern = '\/net/(node\d{3})'
```

`imgpipe` extracts the hostname from the MS path using the regex pattern and submits each job to that host.

## Reference

```{eval-rst}
.. click:: nenucal.tools.imgpipe:main
   :prog: imgpipe
   :nested: full
```
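
As a worked illustration of the rules in the "Configuration file" section, the Python sketch below shows one way they could fit together: `channels-out` resolution, `out-dir` token expansion, the mapping from `[wsclean]` keys to command-line arguments, and hostname extraction with `run_on_file_host_pattern`. It is not `imgpipe`'s implementation. All function names, the `/net/node042/...` path, and the channel count of 64 are hypothetical. Splitting multi-word values such as `'1024 1024'` into separate arguments and skipping boolean `false` keys are assumptions that the text above does not confirm.

```python
import re


def resolve_channels_out(value, total_channels):
    """Apply the special values 'all' and 'every N' (hypothetical helper)."""
    if value == "all":
        return total_channels
    if isinstance(value, str) and value.startswith("every "):
        return total_channels // int(value.split()[1])
    return int(value)


def expand_tokens(template, tokens):
    """Replace $token$ placeholders in an out-dir template."""
    for key, val in tokens.items():
        template = template.replace(f"${key}$", str(val))
    return template


def build_args(wsclean_cfg, out_dir, total_channels, ms_paths):
    """Turn a [wsclean] table into an argument list."""
    # 'name' becomes -name, prefixed with the output directory.
    args = ["wsclean", "-name", f"{out_dir}/{wsclean_cfg['name']}"]
    for key, value in wsclean_cfg.items():
        if key in ("name", "out-dir"):
            continue
        if key == "channels-out":
            value = resolve_channels_out(value, total_channels)
        if value is True:
            args.append(f"-{key}")          # flag with no value
        elif value is False:
            continue                        # assumption: false means omit
        else:
            # Assumption: '1024 1024' is split into two arguments.
            args += [f"-{key}", *str(value).split()]
    return args + list(ms_paths)


def file_host(ms_path, pattern):
    """Return the first regex group found in the MS path, or None."""
    match = re.search(pattern, ms_path)
    return match.group(1) if match else None


cfg = {"name": "img", "out-dir": "$ms_basename$/images", "pol": "I",
       "size": "1024 1024", "niter": 1000, "channels-out": "every 4",
       "multiscale": True}
ms = "/net/node042/data/SW03_T001.MS"       # hypothetical path
out_dir = expand_tokens(cfg["out-dir"], {"ms_basename": "SW03_T001.MS"})
args = build_args(cfg, out_dir, total_channels=64, ms_paths=[ms])
host = file_host(ms, r"\/net/(node\d{3})")
```

With these inputs, `out_dir` is `SW03_T001.MS/images`, `channels-out` resolves to 16, and `host` is `node042`.
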