calpipe — calibration pipeline¶
calpipe orchestrates NenuFAR calibration. It is a wrapper around DP3 (DPPP) and several supporting tools (nenupy, losoto, AOFlagger) driven by a single TOML configuration file.
Basic usage¶
calpipe config.toml SB*.MS
Multiple MS paths can be given; calpipe dispatches each one as an
independent job to the worker pool.
How it works¶
A calpipe config lists the ordered pipeline steps in steps and has one
TOML section per step that overrides the defaults for that task:
steps = ['restore_flags', 'build_sky_model', 'ddecal', 'subtract', 'apply_cal']
[ddecal]
cal.parmdb = 'instrument.h5'
cal.sol_int = 20
cal.uv_min = 20
[subtract]
type = 'subtract'
col_out = 'CORRECTED_DATA'
[apply_cal]
col_in = 'CORRECTED_DATA'
cal.parmdb = 'instrument.h5'
Every parameter has a default (see Global configuration and Task reference), so a config only needs to list the values you want to change.
A step name is also its task type. To run the same task type under a
different name, set type explicitly:
steps = ['subtract_ateam', 'subtract_cyga']
[subtract_ateam]
type = 'subtract'
directions = '!Main'
[subtract_cyga]
type = 'subtract'
directions = 'CygA'
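The name-to-type rule can be sketched in a few lines of Python. This is an illustration rather than calpipe's actual code: `resolve_task_types` is a hypothetical helper, and `config` stands for the dictionary a TOML parser would return for a config like the one above.

```python
def resolve_task_types(config: dict) -> dict:
    """Map each step name to its task type (hypothetical helper)."""
    resolved = {}
    for step in config["steps"]:
        section = config.get(step, {})               # a step may have no section
        resolved[step] = section.get("type", step)   # default: the name is the type
    return resolved


config = {
    "steps": ["restore_flags", "subtract_ateam", "subtract_cyga"],
    "subtract_ateam": {"type": "subtract", "directions": "!Main"},
    "subtract_cyga": {"type": "subtract", "directions": "CygA"},
}
print(resolve_task_types(config))
```

Both `subtract_ateam` and `subtract_cyga` resolve to the `subtract` task, while `restore_flags` falls back to its own name.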
The sky model¶
Calibration works by comparing the observed visibilities against a model of
the sky, and solving for the instrumental gains that reconcile the two. The
ddecal, predict, subtract, and peel tasks all need such a model.
A sky model comes in two flavours:
intrinsic — the true flux of the sources on the sky;
apparent — the intrinsic model seen through the primary beam, i.e. what the instrument actually measures. DP3 calibrates against the apparent model.
The build_sky_model task turns an intrinsic model into the apparent one by
applying the NenuFAR beam, and writes the result inside each MS (a .skymodel
plus a DP3 .sourcedb) under the name given by app_sky_model_name. The later
tasks then reference that name — so build_sky_model normally runs once, before
ddecal.
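The relation between the two flavours is a multiplication by the beam gain in the direction of each source. The sketch below only illustrates that relation: it uses a toy Gaussian beam with an arbitrary 20 degree FWHM, which is an assumption made for the example and not the NenuFAR beam model that build_sky_model applies.

```python
import math


def toy_beam_gain(offset_deg: float, fwhm_deg: float = 20.0) -> float:
    """Toy Gaussian power beam, normalised to 1 at the pointing centre."""
    sigma = fwhm_deg / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * (offset_deg / sigma) ** 2)


def apparent_flux(intrinsic_jy: float, offset_deg: float, fwhm_deg: float = 20.0) -> float:
    """Apparent flux = intrinsic flux attenuated by the beam at that offset."""
    return intrinsic_jy * toy_beam_gain(offset_deg, fwhm_deg)


print(round(apparent_flux(10.0, 0.0), 3))   # 10.0 (on axis, no attenuation)
print(round(apparent_flux(10.0, 10.0), 3))  # 5.0  (half power at FWHM / 2)
```

A real beam also depends on frequency, time and pointing, which is why the apparent model is computed and stored per MS.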
Where the intrinsic model comes from is set by int_sky_model in the
sky_model section:
a catalog (lcs165 or specfind), queried around the field within catalog_radius, keeping sources brighter than min_flux;
a .skymodel file you provide.
A-team sources (the handful of very bright off-axis sources: CasA, CygA,
TauA, VirA) are added on top when add_ateam is true, using the intrinsic model
int_ateam_sky_model ('lowres' for the built-in one, or a file) and filtered
by ateam_min_elevation. Including them lets the solver account for — and
later subtract — their contamination.
If you already have an apparent model, set app_sky_model_file instead and
build_sky_model simply copies it (optionally one file per frequency), skipping
the catalog/beam step entirely.
A typical catalog-based configuration:
steps = ['build_sky_model', 'ddecal', 'subtract', 'apply_cal']
[sky_model]
int_sky_model = 'lcs165' # or a path to a .skymodel file
[build_sky_model]
catalog_radius = 20 # deg around the field centre
min_flux = 0.5 # Jy
add_ateam = true
ateam_always_keep = ['CasA', 'CygA']
ateam_min_elevation = 10
To reuse a ready-made apparent model instead of building one:
steps = ['ddecal', 'subtract', 'apply_cal'] # no build_sky_model step
[sky_model]
app_sky_model_file = '/path/to/apparent.skymodel'
See sky_model (global options) and
build_sky_model (task options) for the full parameter
tables.
With a data-handler¶
If [data_handler] is configured, obs_ids or glob patterns can be used
instead of explicit MS paths:
[data_handler]
config_file = 'data_handler.toml' # path to the data-handler config
data_level = 'L2_BP' # data level to calibrate
calpipe config.toml "202312*_NT04"
calpipe config.toml "20231208_NT04:SW03" # restrict to SW03
See Global configuration for the [data_handler] section.
Available tasks¶
| Task | Description |
|---|---|
| build_sky_model | Build apparent sky model from catalog or intrinsic model |
| ddecal | Direction-dependent calibration with DP3 DDE-Cal |
| apply_cal | Apply gain solutions with DP3 ApplyCal |
| subtract | Subtract patches with DP3 |
| predict | Predict model visibilities with DP3 |
| restore_flags | Save / restore the FLAG column checkpoint |
| flagger | Pre/post-calibration flagging (AOFlagger, SSINS, bad baselines, …) |
| peel | Advanced iterative peeling of bright off-axis sources |
| multims_smooth_sol | Smooth gain solutions across multiple MSs |
Detailed parameter tables for each task are in Task reference.
Typical workflows¶
Calibration-only (no sky model build)¶
steps = ['restore_flags', 'ddecal', 'subtract', 'apply_cal']
[ddecal]
cal.parmdb = 'instrument.h5'
cal.uv_min = 20
cal.sol_int = 30
[subtract]
type = 'subtract'
col_out = 'CORRECTED_DATA'
[apply_cal]
col_in = 'CORRECTED_DATA'
cal.parmdb = 'instrument.h5'
With sky model build (first run)¶
steps = ['build_sky_model', 'ddecal', 'subtract', 'apply_cal']
[build_sky_model]
min_flux_path = 15
add_ateam = true
ateam_always_keep = ['CasA', 'CygA']
ateam_min_elevation = 10
Dry-run on a cluster¶
[worker]
nodes = 'node[101-110]'
max_concurrent = 4
dry_run = true
env_file = '/home/user/.bashrc'
Global configuration¶
Global sections apply to the whole pipeline run, not to individual tasks.
They can be omitted — defaults are loaded from default_settings.toml.
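Overlaying a user config on the defaults amounts to a recursive dictionary merge. The following is a minimal sketch under assumptions: `merge_config` is a hypothetical helper, and the default values shown are placeholders, not the real contents of default_settings.toml. It relies on the fact that a TOML dotted key such as `cal.sol_int = 20` parses to a nested table.

```python
import copy


def merge_config(defaults: dict, user: dict) -> dict:
    """Return the defaults overlaid with user values, recursing into sub-tables."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder defaults, for illustration only.
defaults = {"ddecal": {"cal": {"parmdb": "placeholder.h5", "sol_int": 1}}}
# What "[ddecal]" with "cal.sol_int = 20" looks like once parsed.
user = {"ddecal": {"cal": {"sol_int": 20}}}

print(merge_config(defaults, user))
```

Only `sol_int` changes; `parmdb` keeps its default because the merge descends into the `cal` sub-table instead of replacing it wholesale.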
worker¶
Controls the distributed execution engine.
| Parameter | Default | Description |
|---|---|---|
| | | Comma-separated node list or range expression ( |
| | | Maximum parallel jobs per node |
| | | Shell file sourced before each job (e.g. |
| | | Print commands without executing them |
| | | Enable verbose worker logging |
| | | Route each job to the node that holds the MS file |
| | | Regex with a capture group that extracts the hostname from the MS path (requires |
| | | DP3 thread count per job (0 = DP3 default) |
Example¶
[worker]
nodes = 'node[101-110]'
max_concurrent = 4
env_file = '/home/user/.bashrc'
run_on_file_host = true
run_on_file_host_pattern = '/net/([^/]+)/'
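To see what the pattern in this example captures, it can be tried directly with Python's `re` module. This is a sketch: `host_for_ms` is a hypothetical helper, the MS paths are made up, and the use of `re.search` (match anywhere in the path) is an assumption about how the pattern is applied.

```python
import re


def host_for_ms(ms_path: str, pattern: str):
    """Return the hostname captured by the first group, or None if no match."""
    match = re.search(pattern, ms_path)
    return match.group(1) if match else None


pattern = "/net/([^/]+)/"
print(host_for_ms("/net/node103/data/SB150.MS", pattern))  # node103
print(host_for_ms("/data/SB150.MS", pattern))              # None
```

What calpipe does when the pattern does not match is not covered here; the `None` return is only this sketch's way of signalling it.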
sky_model¶
Defines the sky model files shared across tasks.
| Parameter | Default | Description |
|---|---|---|
| | | Intrinsic sky model: a filename, or one of |
| | | Intrinsic A-team sky model: a filename, or |
| | | Base name (without extension) of the apparent sky model written inside each MS |
| | | If set, skip the catalog fetch and copy this file as the apparent sky model. Supports |
Using a pre-built apparent sky model¶
[sky_model]
app_sky_model_file = '/path/to/apparent.skymodel'
Per-frequency sky model (dict form)¶
[sky_model.app_sky_model_file]
# keys are frequency in MHz; the first key >= the MS centre frequency is used
150 = '/models/app_150mhz.skymodel'
185 = '/models/app_185mhz.skymodel'
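The selection rule stated in the comment can be written out as follows. This is a sketch with assumptions: `pick_sky_model` is a hypothetical helper, "first key" is read as first in ascending frequency order, and returning `None` when the MS lies above every key is a choice made here because that case is not described. Note that TOML keys arrive as strings, so they are compared numerically.

```python
def pick_sky_model(models: dict, centre_freq_mhz: float):
    """Return the file of the first key >= the MS centre frequency, else None."""
    for key in sorted(models, key=float):
        if float(key) >= centre_freq_mhz:
            return models[key]
    return None  # assumption: behaviour above the highest key is not documented


models = {
    "150": "/models/app_150mhz.skymodel",
    "185": "/models/app_185mhz.skymodel",
}
print(pick_sky_model(models, 140.0))  # /models/app_150mhz.skymodel
print(pick_sky_model(models, 160.0))  # /models/app_185mhz.skymodel
```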
data_handler¶
Enables obs_id–based input resolution via a data_handler.toml
file. When this section is present and config_file is non-empty, the
positional arguments to calpipe are treated as obs_id patterns rather than
MS paths.
| Parameter | Default | Description |
|---|---|---|
| | | Path to the data-handler TOML config; leave empty to use explicit MS paths |
| | | Data level to resolve (must be defined in |
Example¶
[data_handler]
config_file = 'data_handler.toml'
data_level = 'L2_BP'
Then invoke:
calpipe calibration.toml "202312*_NT04"
calpipe calibration.toml "20231208_NT04:SW03,SW04"
The obs_id pattern may include a spectral window filter after a colon
(obs_id_pattern:SW_pattern).
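The selector syntax can be illustrated with a small parser. This is a sketch, not the data-handler's implementation: `split_selector` and `select` are hypothetical helpers, shell-style glob matching via `fnmatch` is an assumption, and the comma-separated spectral window list is inferred from the second command above.

```python
from fnmatch import fnmatchcase


def split_selector(selector: str):
    """Split 'obs_id_pattern:SW_pattern' into (obs pattern, list of SW patterns)."""
    obs_pattern, _, sw_part = selector.partition(":")
    sw_patterns = [sw for sw in sw_part.split(",") if sw] or ["*"]
    return obs_pattern, sw_patterns


def select(selector: str, available):
    """Keep the (obs_id, sw) pairs matched by the selector."""
    obs_pattern, sw_patterns = split_selector(selector)
    return [
        (obs, sw)
        for obs, sw in available
        if fnmatchcase(obs, obs_pattern)
        and any(fnmatchcase(sw, pattern) for pattern in sw_patterns)
    ]


available = [
    ("20231208_NT04", "SW03"),
    ("20231208_NT04", "SW04"),
    ("20231209_NT04", "SW03"),
    ("20240101_NT04", "SW03"),
]
print(select("202312*_NT04", available))
print(select("20231208_NT04:SW03,SW04", available))
```

Without a colon, every spectral window of the matching observations is kept; with one, only the listed windows remain.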
Task reference¶
Each task corresponds to a TOML section in the config file. All parameters have defaults; only override what you need.
A step uses its own name as the task type unless you set type explicitly:
[my_subtract]
type = 'subtract' # task type is 'subtract', not 'my_subtract'
col_out = 'CORRECTED_DATA'
build_sky_model¶
Builds an apparent sky model from an intrinsic catalog or file and writes it
into each MS as <MS>/sky_model/<app_sky_model_name>.skymodel, together with a
DP3 .sourcedb of the same name. If add_ateam is true, A-team patches are appended.
| Parameter | Default | Description |
|---|---|---|
| | | Search radius in degrees around the MS phase centre |
| | | Minimum apparent point-source flux (Jy) |
| | | Minimum apparent patch flux (Jy) |
| | | Append A-team patches (CasA, CygA, TauA, VirA) |
| | | Keep these A-team patches even if below |
| | | Never add these A-team sources |
| | | Skip A-team patches below this elevation (degrees) |
Example¶
[build_sky_model]
min_flux_path = 20
add_ateam = true
ateam_always_keep = ['CasA']
ateam_min_elevation = 15
ddecal¶
Runs DP3 DDECal to compute direction-dependent gain solutions. Optionally averages the data before calibrating, smooths solutions afterwards, and produces diagnostic plots.
Data and directions¶
| Parameter | Default | Description |
|---|---|---|
| | | Input visibility column |
| | | Patch names to calibrate on, or |
| | | Time averaging factor before calibration |
| | | Frequency averaging factor before calibration |
Solver¶
| Parameter | Default | Description |
|---|---|---|
| | | Output H5Parm filename (relative to each MS) |
| | | Solution interval in time slots |
| | | Calibration mode: |
| | | Minimum baseline length in wavelengths |
| | | Spectral smoothness constraint kernel size in Hz (0 = disabled) |
| | | DP3 solver algorithm (see DP3 docs) |
| | | Sub-solutions per interval per direction, e.g. |
| | | Additional DP3 DDECal parameters passed verbatim, e.g. |
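Two of the solver parameters use units that are easy to misread: the solution interval is counted in time slots and the minimum baseline length in wavelengths. The conversions below are a worked example only; the 60 MHz observing frequency and the 4 s integration time are arbitrary values chosen for illustration, not properties of any particular dataset.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wavelengths_to_metres(uv_lambda: float, freq_hz: float) -> float:
    """Baseline length in metres for a length given in wavelengths."""
    return uv_lambda * C_M_PER_S / freq_hz


def slots_to_seconds(sol_int_slots: int, integration_s: float) -> float:
    """Duration of a solution interval for a given integration time."""
    return sol_int_slots * integration_s


print(round(wavelengths_to_metres(20, 60e6), 1))  # 99.9
print(slots_to_seconds(30, 4.0))                  # 120.0
```

So a minimum baseline of 20 wavelengths excludes baselines shorter than about 100 m at 60 MHz, and the cut moves to shorter physical lengths at higher frequencies.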
Solution smoothing¶
| Parameter | Default | Description |
|---|---|---|
| | | Smooth solutions in time and frequency after calibration |
| | | Gaussian FWHM in minutes (non-Main directions) |
| | | Gaussian FWHM in MHz (non-Main directions) |
| | | Gaussian FWHM in minutes for the |
| | | Gaussian FWHM in MHz for the |
| | | Clip solutions more than this many sigma above the median |
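The smoothing widths are given as a Gaussian FWHM, while a Gaussian kernel is usually parameterised by its standard deviation; the two are related by FWHM = 2·sqrt(2·ln 2)·sigma. The sketch below shows that conversion and one possible reading of the clipping rule. Both helpers are hypothetical, and the choice of the population standard deviation as the "sigma" estimate is an assumption; the estimator calpipe uses is not stated here.

```python
import math
import statistics

FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # about 0.4247


def fwhm_to_sigma_samples(fwhm: float, sample_spacing: float) -> float:
    """Gaussian sigma in samples, for a FWHM in the same unit as the spacing."""
    return fwhm * FWHM_TO_SIGMA / sample_spacing


def clip_above_median(values, nsigma: float):
    """Mark values lying more than nsigma standard deviations above the median."""
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    return [value > median + nsigma * spread for value in values]


# A 10 minute FWHM on solutions spaced 2 minutes apart.
print(round(fwhm_to_sigma_samples(10.0, 2.0), 3))            # 2.123
print(clip_above_median([1.0, 1.1, 0.9, 1.0, 5.0], 2.0))
```

Only the last amplitude in the example is marked, since it is the only one far above the median.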
Diagnostic plots¶
| Parameter | Default | Description |
|---|---|---|
| | | Write amplitude/phase plots to |
Example¶
[ddecal]
col_in = 'DATA'
directions = 'all'
cal.parmdb = 'instrument.h5'
cal.sol_int = 30
cal.uv_min = 20
cal.smoothnessconstraint = 2e6
cal.extra.maxiter = 50
smooth_sol.time_min = 10
apply_cal¶
Applies gain solutions from an H5Parm using DP3 ApplyCal.
| Parameter | Default | Description |
|---|---|---|
| | | Input column |
| | | Output column |
| | | Which direction’s solutions to apply (for multi-direction H5Parm) |
| | | H5Parm file to read |
| | | Solution type to apply: |
Example¶
[apply_cal]
col_in = 'CORRECTED_DATA'
col_out = 'CORRECTED_DATA'
direction = 'Main'
cal.parmdb = 'instrument.h5'
subtract¶
Subtracts model visibilities for selected sky-model directions using DP3.
| Parameter | Default | Description |
|---|---|---|
| | | Input column |
| | | Output column |
| | | Directions to subtract. Use |
| | | H5Parm with gain solutions for each direction |
| | | Solution type: |
Example¶
[subtract_ateam]
type = 'subtract'
col_in = 'DATA'
col_out = 'CORRECTED_DATA'
directions = '!Main'
cal.parmdb = 'instrument.h5'
predict¶
Predicts model visibilities into a data column using DP3.
| Parameter | Default | Description |
|---|---|---|
| | | Output column |
| | | Directions to predict |
| | | H5Parm for gain correction during prediction (empty = no correction) |
| | | Solution type |
restore_flags¶
Checkpoints the FLAG column. On the first call the current flags are saved to
flag_name; on subsequent calls (when the file already exists) the saved
flags are restored. This ensures that every calibration run starts from the
same clean flag state.
| Parameter | Default | Description |
|---|---|---|
| | | Path (relative to each MS) for the flag checkpoint file |
Example¶
[restore_flags]
flag_name = 'flags_before_cal.h5'
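The save-once, restore-afterwards behaviour can be sketched as below. This is a stand-in for illustration only: `checkpoint_flags` is a hypothetical helper, a plain list plays the role of the FLAG column, and JSON replaces whatever on-disk format the real task uses.

```python
import json
import os
import tempfile


def checkpoint_flags(current_flags, checkpoint_path):
    """Save flags on the first call; return the saved flags on later calls."""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            return json.load(handle)          # restore: later changes are dropped
    with open(checkpoint_path, "w") as handle:
        json.dump(current_flags, handle)      # first call: record the baseline
    return current_flags


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flags_before_cal.json")
    first = checkpoint_flags([False, True, False], path)   # saved
    second = checkpoint_flags([True, True, True], path)    # restored

print(first, second)  # [False, True, False] [False, True, False]
```

The second call ignores the flags accumulated since the checkpoint, which is what makes repeated calibration runs start from the same state.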
flagger¶
Runs one or more flagging algorithms in sequence. Each sub-flagger can be enabled or disabled independently.
AOFlagger¶
| Parameter | Default | Description |
|---|---|---|
| | | Run AOFlagger |
| | | AOFlagger strategy file (name or path) |
| | | Column to flag |
SSINS¶
| Parameter | Default | Description |
|---|---|---|
| | | Run the SSINS flagger |
| | | SSINS settings file (note: parameter name has a typo inherited from the original code) |
Bad baselines / stations¶
| Parameter | Default | Description |
|---|---|---|
| | | Detect and flag outlier baselines/stations via AOQuality |
| | | Flag stations whose amplitude deviates more than this many sigma |
| | | Flag baselines whose amplitude deviates more than this many sigma |
Manual baseline flagging¶
| Parameter | Default | Description |
|---|---|---|
| | | Flag specific baselines/stations |
| | | DP3 baseline string, e.g. |
| | | Path to a file mapping obs_ids to baseline strings, in text or JSON format (see below) |
baselines_from_file accepts two formats, detected automatically by extension:
Text (.txt or any other extension): one <obs_id> <baselines> pair per line, where <baselines> is a DP3-formatted string such as MR003&&*;MR017&&*. Lines starting with # are ignored.
JSON (.json): the output of nenudata bad-stations or aostats find-bad-stations. Keys are obs_ids; values are lists of bare antenna names. The &&* suffix is appended automatically. Keys starting with _ (e.g. _meta) are ignored.
# JSON format — points at the file managed by nenudata bad-stations
baselinesflag.baselines_from_file = 'bad_stations.json'
See nenudata — bad stations for the full management workflow.
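The two file formats described above can be summarised by a small parser. This is a sketch under assumptions, not calpipe's loader: `parse_baselines_file` is a hypothetical helper that takes the file content as a string, blank lines are skipped, and antenna names from the JSON form are joined with `;` by analogy with the example string MR003&&*;MR017&&*.

```python
import json


def parse_baselines_file(path: str, text: str) -> dict:
    """Return {obs_id: DP3 baseline string}; the format is chosen by extension."""
    if path.endswith(".json"):
        data = json.loads(text)
        return {
            obs_id: ";".join(f"{name}&&*" for name in names)
            for obs_id, names in data.items()
            if not obs_id.startswith("_")        # skip keys such as _meta
        }
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):     # skip blanks and comments
            continue
        obs_id, baselines = line.split(None, 1)  # split on the first whitespace
        result[obs_id] = baselines.strip()
    return result


text_form = "# flagged by hand\n20231208_NT04 MR003&&*;MR017&&*\n"
json_form = '{"_meta": {"version": 1}, "20231208_NT04": ["MR003", "MR017"]}'

print(parse_baselines_file("bad_baselines.txt", text_form))
print(parse_baselines_file("bad_stations.json", json_form))
```

Both inputs yield the same mapping, which shows that the JSON form is a more compact way of writing the same DP3 baseline selection.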
Frequency flagging¶
| Parameter | Default | Description |
|---|---|---|
| | | Flag a fixed frequency range |
| | | |
Scan flagging¶
| Parameter | Default | Description |
|---|---|---|
| | | Flag time scans with anomalously high residuals |
| | | Sigma threshold above which a scan is flagged |
Example¶
[flagger]
do_aoflagger = true
aoflagger.strategy = 'nenufar_1s1c'
aoflagger.data_col = 'CORRECTED_DATA'
do_badbaselines = true
badbaselines.nsigma_stations = 5
badbaselines.nsigma_baselines = 8
do_baselinesflag = true
baselinesflag.baselines_from_file = 'bad_baselines.txt'
peel¶
Iterative peeling of bright off-axis sources. The MS is first copied with a postfix, then each source is phase-shifted, calibrated, and subtracted in turn.
| Parameter | Default | Description |
|---|---|---|
| | | Suffix appended to the MS name for the peeling copy |
| | | H5Parm with the initial DD calibration |
| | | Solution type of the initial calibration |
| | | Scales the solution interval by |
| | | Minimum solution interval |
| | | Maximum solution interval |
| | | Calibration mode for peeling |
| | | Minimum baseline length in wavelengths |
| | | Additional DP3 DDECal parameters |
| | | Phase-shift to each source before calibrating |
| | | Time averaging after phase shift |
| | | Frequency averaging after phase shift |
| | | Smooth solutions after each peel iteration |
| | | FWHM in minutes for solution smoothing |
| | | FWHM in MHz for solution smoothing |
multims_smooth_sol¶
Smooths gain solutions across multiple MSs jointly (e.g. across spectral windows or time chunks from the same night), then writes the result to a new H5Parm.
| Parameter | Default | Description |
|---|---|---|
| | | Input H5Parm (relative to each MS) |
| | | Output H5Parm (relative to each MS) |
| | | Directory for diagnostic plots |
Example¶
steps = ['ddecal', 'multims_smooth_sol', 'apply_cal']
[ddecal]
cal.parmdb = 'instrument_init.h5'
do_smooth_sol = false # skip per-MS smoothing; smooth jointly instead
[multims_smooth_sol]
parmdb_in = 'instrument_init.h5'
parmdb_out = 'instrument_smooth.h5'
[apply_cal]
cal.parmdb = 'instrument_smooth.h5'