flagtool — RFI flagging and flag management

flagtool provides RFI flagging and flag management utilities for NenuFAR Measurement Sets.

Flag management commands (backup, restore, reset, copy) let you safely checkpoint and roll back the FLAG column before and after any flagging step.

Three complementary flagging algorithms are available:

  • ssins — SSINS (Sky-Subtracted Incoherent Noise Spectra): broadband RFI detection in spectra-spectra space, run at multiple time averaging scales.

  • delay_flagger — delay-space flagging: flags time slots and baselines whose delay power spectrum (Stokes I and V) exceeds a sigma threshold outside the foreground wedge.

  • vis_flagger — configurable pipeline of visibility-domain filters: time/baseline/frequency thresholds, delay outliers, and narrow-band frequency masks.


backup / restore / reset / copy — flag management

These four commands manage the FLAG column of a Measurement Set. They are intended to be used as safety checkpoints: back up the current FLAG state before running a flagging algorithm, and restore it if the result is unsatisfactory.

The backup format is an HDF5 file that stores the full FLAG array. It is independent of the MS format and can be kept indefinitely alongside the data.

Note that ssins, delay_flagger, and vis_flagger all accept --backup and --restore options that automate this workflow within a single call.
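The manual checkpoint workflow described above could be scripted roughly as follows. This is a minimal sketch: the helper name checkpoint_workflow is hypothetical, and it only assembles the command lines documented in this section without running them.

```python
import shlex


def checkpoint_workflow(ms, backup_file, flagger="ssins"):
    """Return the backup -> flag -> restore command lines as argument lists.

    Nothing is executed here; pass each list to subprocess.run() to run it.
    """
    return [
        ["flagtool", "backup", ms, backup_file],   # checkpoint the FLAG column
        ["flagtool", flagger, ms],                 # run the flagging algorithm
        ["flagtool", "restore", ms, backup_file],  # roll back if unsatisfactory
    ]


for argv in checkpoint_workflow("obs.MS", "flags_original.h5"):
    print(shlex.join(argv))
```

The restore step is only needed when the flagging result is rejected. The sketch suits ssins and delay_flagger; vis_flagger additionally takes a CONFIG argument.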

backup

Save the current FLAG column of an MS to an external file.

flagtool backup obs.MS flags_original.h5

restore

Write a previously saved flag backup back into an MS.

flagtool restore obs.MS flags_original.h5

reset

Clear all flags in one or more MSs (sets every FLAG entry to False). Useful for starting a flagging run from a clean state.

flagtool reset obs.MS
flagtool reset SW01.MS SW02.MS SW03.MS

copy

Copy the FLAG column from one MS to another. Both MSs must have the same shape (same number of rows, channels, and correlations).

flagtool copy source.MS destination.MS

A typical use case is propagating flags from a calibrator MS to a corresponding target MS that was observed in the same time window.
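The shape requirement can be illustrated with plain nested lists standing in for the FLAG array (rows × channels × correlations). This is a sketch of the check only; flag_shape and copy_flags are hypothetical helpers, not part of flagtool.

```python
def flag_shape(flags):
    """Return (rows, channels, correlations) of a nested-list flag array."""
    return (len(flags), len(flags[0]), len(flags[0][0]))


def copy_flags(src, dst):
    """Copy src into dst in place, refusing mismatched shapes."""
    if flag_shape(src) != flag_shape(dst):
        raise ValueError(
            f"shape mismatch: {flag_shape(src)} vs {flag_shape(dst)}"
        )
    for row_src, row_dst in zip(src, dst):
        for chan_src, chan_dst in zip(row_src, row_dst):
            chan_dst[:] = chan_src  # overwrite the correlations of this channel


source = [[[True, False], [False, False]]]        # 1 row, 2 channels, 2 correlations
destination = [[[False, False], [False, False]]]
copy_flags(source, destination)
```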


ssins — SSINS RFI flagger

SSINS (Sky-Subtracted Incoherent Noise Spectra) detects broadband and narrowband RFI in the time–frequency plane. The algorithm operates on the spectra-spectra representation: adjacent time samples are differenced, which suppresses the slowly varying sky signal and leaves a noise+RFI residual. In the SSINS method the amplitudes of this residual are then averaged incoherently over baselines. See Wilensky et al. (2019) for the method.
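The differencing step can be sketched in a few lines of plain Python. This follows the general SSINS recipe (difference adjacent times, take amplitudes, average over baselines) and is not the code used by flagtool; the function name and the nested-list layout vis[time][baseline][channel] are assumptions made for illustration.

```python
def incoherent_noise_spectrum(vis):
    """vis[t][b][f] (complex) -> ins[t][f], with one fewer time sample.

    Differencing adjacent time samples removes the slowly varying sky;
    averaging the amplitudes over baselines gives the incoherent spectrum.
    """
    n_times, n_bl, n_freq = len(vis), len(vis[0]), len(vis[0][0])
    ins = []
    for t in range(n_times - 1):
        row = []
        for f in range(n_freq):
            amps = [abs(vis[t + 1][b][f] - vis[t][b][f]) for b in range(n_bl)]
            row.append(sum(amps) / n_bl)
        ins.append(row)
    return ins


sky = 10 + 0j
# 4 time samples, 2 baselines, 3 channels of constant sky signal
vis = [[[sky] * 3 for _ in range(2)] for _ in range(4)]
for b in range(2):
    vis[2][b][1] += 5  # RFI burst at time 2, channel 1, on every baseline
ins = incoherent_noise_spectrum(vis)
```

The constant sky cancels everywhere, while the burst shows up in the two differences that touch time sample 2.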

The flagger runs at multiple time averaging scales (by default 1, 4, and 8 samples) to catch both short bursts and persistent RFI. At each scale, iterative sigma-clipping is applied in the frequency direction. After the core SSINS step, three secondary threshold flaggers (time/frequency, baseline, and snapshot) extend the flags to whole time slots or frequency channels, whole baselines, and whole snapshots wherever the flagged fraction exceeds a configurable threshold.
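As an illustration of the two ingredients named above, the sketch below averages consecutive samples and applies a sequence of sigma-clipping passes. It is not the SSINS implementation: the median/MAD noise estimate is an assumption chosen for robustness on small samples, and both function names are hypothetical.

```python
import statistics


def time_average(samples, factor):
    """Average consecutive groups of `factor` samples (drops a ragged tail)."""
    n = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor for i in range(n)]


def iterative_sigma_clip(values, nsigmas=(8, 5, 5)):
    """Return a boolean mask; each pass re-estimates the noise on unflagged data."""
    mask = [False] * len(values)
    for nsigma in nsigmas:
        good = [v for v, m in zip(values, mask) if not m]
        centre = statistics.median(good)
        # Robust sigma from the median absolute deviation (assumed estimator).
        sigma = 1.4826 * statistics.median([abs(v - centre) for v in good])
        if sigma == 0:
            break
        mask = [m or abs(v - centre) > nsigma * sigma
                for v, m in zip(values, mask)]
    return mask


spectrum = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 9.0, 1.0, 0.9]
mask = iterative_sigma_clip(spectrum)  # only the 9.0 sample is flagged
```

Running the same clip on time_average(samples, 4) or time_average(samples, 8) lowers the noise per sample, which is why longer averaging scales help with faint, persistent RFI.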

Weights, if present in the MS, are applied to the data before flagging.

Usage

flagtool ssins obs.MS

Loads baselines between 20 m and 4000 m from the DATA column, runs SSINS with the built-in configuration, and writes the result back to the MS. Add --plot_dir flag_plots/ to write diagnostic PNGs (ssins_flag_<ms>.png for the spectra-spectra before/after, baseline_flag_<ms>.png for the per-baseline flag fraction).

Common options: --dry_run previews without writing (pair with --plot_dir for tuning); --backup checkpoints the FLAG column first and --restore rolls it back and exits; --data_col selects the input column; --umin/--umax restrict the baseline range; and --config points at a custom SSINS configuration file (below).

SSINS configuration file

The built-in defaults are in nenucal/templates/ssins_config.toml. Copy and edit to customise:

# Time averaging factors at which to run SSINS
n_time_avg = [1, 4, 8]

# Iterative sigma-clipping sequence for each averaging factor
nsigmas.1 = [8, 5, 5]
nsigmas.4 = [8, 5, 5]
nsigmas.8 = [8, 5, 5]

# Flag a frequency channel entirely when this fraction of time slots are flagged
percentage_freq_full_flag = 0.4
# Flag a time slot entirely when this fraction of frequency channels are flagged
percentage_time_full_flag = 0.4

# Secondary: flag full time/frequency when this fraction of baselines are flagged
time_freq_threshold = 0.6
# Secondary: flag a full baseline when this fraction of its time/freq is flagged
baseline_threshold = 0.75
# Secondary: flag a snapshot (all baselines, one time slot) when this fraction is flagged
snapshot_threshold = 0.5

Increasing nsigmas values makes the flagger more conservative (fewer flags). Decreasing them makes it more aggressive.
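To make the role of percentage_freq_full_flag concrete, here is a minimal sketch of extending flags to a whole frequency channel. It assumes a strict "greater than" comparison and a flags[time][channel] layout; neither is confirmed by the configuration file, and the function name is hypothetical.

```python
def extend_full_channels(flags, threshold=0.4):
    """Return a copy of flags[t][f] in which every channel flagged in more
    than `threshold` of the time slots is flagged at all times."""
    n_times, n_freq = len(flags), len(flags[0])
    out = [row[:] for row in flags]
    for f in range(n_freq):
        fraction = sum(flags[t][f] for t in range(n_times)) / n_times
        if fraction > threshold:
            for t in range(n_times):
                out[t][f] = True
    return out


flags = [
    [True,  True,  False],
    [False, True,  False],
    [False, True,  False],
    [False, False, False],
    [False, False, False],
]
extended = extend_full_channels(flags)  # channel 1 (3/5 flagged) becomes fully flagged
```

The same pattern, applied along the other axis, describes percentage_time_full_flag.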

Reference

flagtool ssins

Run the SSINS flagger algorithm on MS_IN

Usage

flagtool ssins [OPTIONS] MS_IN

Options

--umin <umin>

Minimum baseline length (in metres)

Default: 20

--umax <umax>

Maximum baseline length (in metres)

Default: 4000

--config <config>

SSINS configuration file

--data_col <data_col>

DATA column to run SSINS on

--dry_run

Do not apply flags

--backup

Backup flags before SSINS

--restore

Restore previously backed-up flags

--plot_dir <plot_dir>

Plot directory

--backup_file <backup_file>

Backup filename

Arguments

MS_IN

Required argument


delay_flagger — delay-space RFI flagging

delay_flagger flags baselines and time slots whose delay power spectrum contains excess power outside the foreground wedge. This delay-space approach was first used for NenuFAR in arXiv:2503.05576.

The algorithm:

  1. Loads baselines in the range [--umin, --umax] and averages --n_time_avg consecutive time samples to boost sensitivity.

  2. Computes the delay power spectrum for Stokes I, Stokes V, and the time-differential (consecutive-sample difference) using a Blackman-Harris window. The differential spectrum provides a reference noise floor.

  3. Computes the ratio of the Stokes I (or V) delay power to the differential reference, restricted to delays outside the horizon wedge and below 4 µs.

  4. Bins this ratio into --n_times time bins and sigma-clips at --n_sigma_i (Stokes I) and --n_sigma_v (Stokes V). Outlier baseline × time bins are flagged.

  5. Flags are propagated back to the original time resolution by mapping each time bin to the corresponding original samples.
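Steps 2 and 3 can be sketched for a single baseline and time sample as below. This is an illustration under stated assumptions, not the flagtool code: it uses a plain DFT, the standard four-term Blackman-Harris coefficients, and a horizon delay of baseline length divided by the speed of light. The function names and the absence of any normalisation are choices made for this sketch.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def blackman_harris(n):
    """Four-term Blackman-Harris window of length n (symmetric form)."""
    a = (0.35875, 0.48829, 0.14128, 0.01168)
    return [a[0] - a[1] * math.cos(2 * math.pi * k / (n - 1))
            + a[2] * math.cos(4 * math.pi * k / (n - 1))
            - a[3] * math.cos(6 * math.pi * k / (n - 1)) for k in range(n)]


def delay_power_spectrum(vis, chan_width_hz):
    """Windowed DFT along frequency; returns (delays in s, power)."""
    n = len(vis)
    tapered = [v * w for v, w in zip(vis, blackman_harris(n))]
    delays, power = [], []
    for m in range(-n // 2, n // 2):
        ft = sum(x * cmath.exp(-2j * math.pi * m * k / n)
                 for k, x in enumerate(tapered))
        delays.append(m / (n * chan_width_hz))
        power.append(abs(ft) ** 2)
    return delays, power


def clean_delay_mask(delays, baseline_m, delay_max_s=4e-6):
    """True for delays outside the horizon wedge and below delay_max_s."""
    horizon = baseline_m / C
    return [horizon < abs(d) < delay_max_s for d in delays]


# A pure tone at a delay of 1.5625 us over 64 channels of 100 kHz
n_chan, chan_width, tau = 64, 100e3, 1.5625e-6
vis = [cmath.exp(2j * math.pi * k * chan_width * tau) for k in range(n_chan)]
delays, power = delay_power_spectrum(vis, chan_width)
mask = clean_delay_mask(delays, baseline_m=400.0)
peak = max(range(n_chan), key=power.__getitem__)
```

For the 400 m baseline the tone sits just outside the horizon delay, so its peak falls in the region that the outlier test would examine.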

Stokes V flags catch circularly polarised RFI that is invisible in Stokes I. Running the test on both Stokes parameters provides better completeness.

The baseline range [50, 400] m is a deliberate choice: very short baselines can be dominated by mutual coupling, and very long baselines push the wedge further in delay, leaving less clean delay space for the outlier test.
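The effect of baseline length on the usable delay range is easy to quantify. The short sketch below assumes the horizon delay is simply the baseline length divided by the speed of light (no wedge buffer), and compares it with the 4 µs upper limit mentioned in the algorithm description.

```python
C = 299_792_458.0  # speed of light in m/s


def horizon_delay_us(baseline_m):
    """Geometric delay of a source on the horizon, in microseconds."""
    return baseline_m / C * 1e6


for b in (50, 400, 4000):
    print(f"{b:5d} m -> {horizon_delay_us(b):.2f} us")
# 50 m -> 0.17 us, 400 m -> 1.33 us, 4000 m -> 13.34 us
```

At 400 m roughly a third of the 4 µs range is already inside the wedge, and at 4000 m the horizon delay lies well beyond 4 µs, leaving no clean delays for the test.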

Usage

flagtool delay_flagger obs.MS

With no options this averages 50 time samples, bins into 20 time bins, and sigma-clips at 6 for both Stokes I and V.

A typical tuning run lowers the thresholds and writes diagnostic plots so you can inspect what is being flagged before committing:

flagtool delay_flagger obs.MS --n_sigma_i 4 --n_sigma_v 4 --plot_dir delay_plots/

This writes a per-baseline flag map to delay_plots/; add --plot_ps_all_baselines to also dump each baseline’s delay power spectrum. Use --backup (and --restore) to checkpoint the FLAG column first, and --dry_run to preview without writing flags. See the options reference below for the full list.

Reference

flagtool delay_flagger

Run the delay flagger algorithm on MS_IN

Usage

flagtool delay_flagger [OPTIONS] MS_IN

Options

--data_col <data_col>

DATA column to run delay flagger on

--umin <umin>

Minimum baseline length (in metres)

Default: 50

--umax <umax>

Maximum baseline length (in metres)

Default: 400

--n_time_avg <n_time_avg>

Number of time slots to average

--n_times <n_times>

Number of time bins

--n_sigma_i <n_sigma_i>

Stokes I n sigma threshold

--n_sigma_v <n_sigma_v>

Stokes V n sigma threshold

--backup

Backup flags before delay flagger

--restore

Restore previously backed-up flags

--plot_dir <plot_dir>

Plot directory

--plot_ps_all_baselines

Plot the delay power spectrum of every baseline

--backup_file <backup_file>

Backup filename

--dry_run

Do not apply flags

Arguments

MS_IN

Required argument


vis_flagger — configurable visibility flagging pipeline

vis_flagger runs an ordered sequence of flagging filters on a Measurement Set. Which filters are applied, and in what order, is controlled by a TOML configuration file. A single run can combine coarse global thresholds (removing heavily flagged time slots or baselines) with fine-grained spectral and delay masks.

Settings not specified in the user config are filled in from the built-in defaults (nenucal/templates/default_vis_flagger_config.toml).
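The fallback to defaults can be pictured as a dictionary merge. The sketch below assumes that nested tables such as [flag_delay_time_outliers] are merged key by key; whether flagtool merges at that depth or only at the top level is not stated here, and merge_config is a hypothetical helper.

```python
def merge_config(defaults, user):
    """Return defaults overridden by user values, merging nested tables."""
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


defaults = {
    "umin": 50,
    "umax": 500,
    "flag_delay_time_outliers": {"n_sigma": 5, "window_fct": "hann"},
}
user = {"umin": 30, "flag_delay_time_outliers": {"n_sigma": 4}}
config = merge_config(defaults, user)
```

In this example umax and window_fct keep their default values, while umin and n_sigma take the user's.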

These visibility-domain filters were first used for NenuFAR in arXiv:2503.05576.

Available filters

Filter name                   What it does
----------------------------  ---------------------------------------------------------------------
flag_time_threshold           Flag entire time slots where the flagged fraction exceeds a threshold
flag_baseline_threshold       Flag entire baselines where the flagged fraction exceeds a threshold
flag_freqs_band               Flag specific frequency bands using delay-domain and sigma criteria
flag_delay_time_outliers      Flag time slots with outlier delay power outside the wedge
flag_delay_baseline_outliers  Flag baselines with outlier delay power
flag_time_freq_outliers       Flag outliers in the time–frequency plane (Stokes I or V)

The filters key in the config controls the active set and their order.
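How an ordered filters list could drive a pipeline is sketched below. The two filter functions are simplified stand-ins that act on a flags[time][baseline] grid, and REGISTRY and run_filters are hypothetical names; only the filter names and the shape of the config mirror this documentation.

```python
def flag_time_threshold(flags, threshold):
    """Flag whole time slots whose flagged fraction exceeds `threshold`."""
    return [[True] * len(row) if sum(row) / len(row) > threshold else row[:]
            for row in flags]


def flag_baseline_threshold(flags, threshold):
    """Flag whole baselines whose flagged fraction exceeds `threshold`."""
    out = [row[:] for row in flags]
    for b in range(len(flags[0])):
        if sum(row[b] for row in flags) / len(flags) > threshold:
            for row in out:
                row[b] = True
    return out


REGISTRY = {
    "flag_time_threshold": flag_time_threshold,
    "flag_baseline_threshold": flag_baseline_threshold,
}


def run_filters(flags, config):
    """Apply each filter named in config['filters'], in the listed order."""
    for name in config["filters"]:
        flags = REGISTRY[name](flags, config[name])
    return flags


config = {
    "filters": ["flag_time_threshold", "flag_baseline_threshold"],
    "flag_time_threshold": 0.35,
    "flag_baseline_threshold": 0.8,
}
flags = [[True, True, False], [False, False, False], [False, False, False]]
result = run_filters(flags, config)  # the first time slot becomes fully flagged
```

Because each filter sees the flags left by the previous one, changing the order in filters can change the final result.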

Usage

flagtool vis_flagger obs.MS vis_flagger.toml

Runs the filters listed in the config (below) in order. Add --plot_dir flag_plots/ to have each active filter write diagnostic PNGs. As with the other flaggers, --dry_run previews without writing, and --backup / --restore checkpoint and roll back the FLAG column.

Configuration file

Start from the built-in template and enable only the filters you need:

# Baseline selection for all filters
umin = 50
umax = 500
data_col = 'DATA'
n_time_avg = 50

# Ordered list of filters to apply
filters = ['flag_time_threshold', 'flag_baseline_threshold',
           'flag_delay_time_outliers', 'flag_delay_baseline_outliers']

# Flag entire time slots if more than 35 % of their visibilities are flagged
flag_time_threshold = 0.35

# Flag entire baselines if more than 80 % of their visibilities are flagged
flag_baseline_threshold = 0.8

# Empty dict — no narrow-band frequency masks
flag_freqs_band = {}

[flag_delay_time_outliers]
n_sigma = 5          # sigma threshold for outlier detection
window_fct = 'hann'  # window function for delay transform
wedge_factor = 1.2   # multiplier on the horizon delay to define the wedge exclusion zone
delay_max = 4        # upper delay limit in µs for the outlier test
hpass_filter = true  # apply a high-pass filter before delay transform

[flag_delay_baseline_outliers]
n_sigma = 5
window_fct = 'hann'
wedge_factor = 1.2
delay_max = 4
max_threshold = 3    # flag baselines whose peak delay power ratio exceeds this
min_threshold = 1.8  # flag baselines whose median delay power ratio exceeds this
hpass_filter = true

[flag_time_freq_outliers]
n_sigma = 5
hpass_filter = true
hpass_n_chan = 4     # number of channels for the high-pass filter
stokes = 'I'         # Stokes parameter to use ('I' or 'V')

Adding a narrow-band frequency mask

filters = ['flag_time_threshold', 'flag_freqs_band', 'flag_delay_time_outliers']

[flag_freqs_band.fm_band]
freqs_filter = [87.5e6, 108e6]   # flag this frequency range
freqs_match  = [80e6, 120e6]     # reference band for the delay model
n_sigma = 5
stokes = 'I'
hpass_filter = true

Multiple named bands can be added as separate [flag_freqs_band.<name>] sections; each is applied independently, at the position where flag_freqs_band appears in filters.
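The two frequency ranges of a band section amount to channel selections. The sketch below shows one way to turn freqs_filter and freqs_match into channel indices; treating the band edges as inclusive is an assumption, and channels_in_band is a hypothetical helper.

```python
def channels_in_band(freqs_hz, band_hz):
    """Return the indices of channels whose frequency lies within band_hz."""
    lo, hi = band_hz
    return [i for i, f in enumerate(freqs_hz) if lo <= f <= hi]


freqs = [85e6, 87.5e6, 100e6, 108e6, 110e6]       # channel centre frequencies in Hz
to_flag = channels_in_band(freqs, [87.5e6, 108e6])   # freqs_filter
reference = channels_in_band(freqs, [80e6, 120e6])   # freqs_match
```

Here to_flag selects the three channels inside the band, while reference covers all five.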

Reference

flagtool vis_flagger

Run the vis_flagger algorithm on MS_IN with configuration CONFIG

Usage

flagtool vis_flagger [OPTIONS] MS_IN CONFIG

Options

--backup

Backup flags before vis flagger

--restore

Restore previously backed-up flags

--plot_dir <plot_dir>

Plot directory

--backup_file <backup_file>

Backup filename

--dry_run

Do not apply flags

Arguments

MS_IN

Required argument

CONFIG

Required argument