# data_handler.toml reference

`data_handler.toml` is the configuration file read by all `nenudata` commands. It describes the spectral layout, data-level paths, cluster nodes, DP3 parsets, and the registry of observations.

## TOML ordering note

In TOML, every key written after a `[section]` header belongs to that table. **Top-level scalar keys** (e.g. `bp_cal_path`, `cal_obs_id_patterns`) must therefore appear **before the first section header** (`[...]`) in the file. Place them at the very top: a key written after `[obs_ids]` or `[spectral_windows]` is parsed as an entry of that table, not as a top-level key.

---

## Required sections

### `[spectral_windows]`

Maps a spectral window (SW) name to the inclusive range of sub-band indices it covers.

```toml
[spectral_windows]
SW01 = [192, 250]
SW02 = [251, 312]
SW03 = [313, 371]
```

A **composite** SW can also be defined as a list of existing SW names. It is treated as the concatenation of the sub-bands of those SWs and is available wherever an SW argument is accepted.

```toml
[spectral_windows]
SW01   = [192, 250]
SW02   = [251, 312]
SW_ALL = ["SW01", "SW02"]  # composite
```

---

### `[data_level_path]`

Maps a data-level label (an arbitrary string, e.g. `L1`, `L2`, `L2_BP`) to a path template. The following placeholders are expanded at runtime:

| Placeholder   | Replaced with                                   |
|---------------|-------------------------------------------------|
| `%YEAR%`      | 4-digit year from the obs_id date prefix        |
| `%MONTH%`     | 2-digit month from the obs_id date prefix       |
| `%OBS_ID%`    | The obs_id (N1 short name for N2 observations)  |
| `%N1_OBS_ID%` | The full NenuFAR obs_id (long name for N2)      |
| `%NODE%`      | The node responsible for the SW / time chunk    |

If neither `%OBS_ID%` nor `%N1_OBS_ID%` appears in the template, the obs_id is appended to the path automatically.

```toml
[data_level_path]
L1    = "/databf/nenufar-nri/LT01/%YEAR%/%MONTH%/%N1_OBS_ID%/L1/"
L2    = "/net/%NODE%/data/users/lofareor/nenufar/obs/L2/"
L2_BP = "/databf/nenufar-nri/LT01/%YEAR%/%MONTH%/%N1_OBS_ID%/L2_BP/"
```

---

### `[obs_ids]`

Registry of **N1 observations** (short, SW-parallel).
Each key is an obs_id; the value is a list of nodes, **one entry per spectral window, in the order the windows appear in `[spectral_windows]`**. The same node handles both reading L1 and writing L2 for that SW.

```toml
[obs_ids]
"20231208_CASA" = ["node101", "node102", "node103"]
#                   ^ SW01     ^ SW02     ^ SW03
"20231210_NT04" = ["node104", "node105", "node106"]
```

This section is **populated automatically** by `nenudata update_data_handler`.

---

## Optional — observation registry

### `[n2_obs_ids.<obs_id>]`

Registry of **N2 observations** (long, time-parallel). Each subsection key is the short obs_id used everywhere in the pipeline; the body provides:

| Key          | Type         | Description                                                                   |
|--------------|--------------|-------------------------------------------------------------------------------|
| `n1_obs_ids` | list[string] | Full NenuFAR obs_id(s) used to locate the raw L1 data on disk                 |
| `nodes`      | string       | Comma-separated node list; time chunks are distributed round-robin over them |

```toml
[n2_obs_ids.20231208_NT04]
n1_obs_ids = ["20231208_213000_20231209_070000_NT04_COSMIC_DAWN"]
nodes      = "node101,node102,node103"
```

Node range notation (e.g. `node[101-103]`) is also accepted by `nenudata` wherever a node string appears.

---

## Optional — calibration

The keys `bad_stations_file`, `bp_cal_path`, and `cal_obs_id_patterns` are **top-level keys** and must appear before any `[section]` header in the file. `[obs_calibration_map]` is a regular table and can be placed anywhere after them.

### `bad_stations_file`

Path to the JSON file managed by `nenudata bad-stations`. It stores per-obs_id lists of bad antennas and is created on first write.

```toml
bad_stations_file = "bad_stations.json"
```

Used by the `nenudata bad-stations` subcommands and by the `flagger` calpipe task when `baselinesflag.baselines_from_file` points to a `.json` file. See [bad stations](bad_stations.md) for the full workflow.

---

### `bp_cal_path`

Path template to the bandpass calibration HDF5 file. `%OBS_ID%` is replaced with the calibrator obs_id at runtime.
```toml
bp_cal_path = "/databf/.../bp_cal/%OBS_ID%/bp_gains_%OBS_ID%.h5"
```

Used by `l1_to_l2` when the DP3 parset contains the `%BP_CAL_FILE%` marker.

---

### `cal_obs_id_patterns`

List of glob patterns used to identify calibrator observations in `[obs_ids]`. Defaults to `["*CYGA*", "*CASA*"]` if omitted. `update_data_handler` excludes matching obs_ids from `[obs_calibration_map]`.

```toml
cal_obs_id_patterns = ["*CYGA*", "*CASA*", "*3C*"]
```

---

### `[obs_calibration_map]`

Maps each science obs_id to the calibrator obs_id whose BP solutions are used during L1→L2 processing. Populated automatically by `nenudata update_data_handler`; can also be edited by hand.

```toml
[obs_calibration_map]
"20231208_NT04" = "20231210_CASA"
"20231212_NT04" = "20231210_CASA"
```

---

## Optional — DP3 pipeline

### `[l1_to_l2_config.<level>]`

One subsection per L2 data level. The only required key is `dppp_config`, which points to the DP3 parset file used when running `nenudata l1_to_l2` for that level.

```toml
[l1_to_l2_config.L2_BP]
dppp_config = "/home/users/lofareor/parsets/dppp_l1_to_l2_bp.parset"

[l1_to_l2_config.L2_12C40S]
dppp_config = "/home/users/lofareor/parsets/dppp_l1_to_l2_cyga.parset"
```

If the parset contains the `%BP_CAL_FILE%` marker, `l1_to_l2` automatically injects `applybandpass.parmdb=` with the bandpass file of the correct calibrator.

---

## Optional — LST binning

### `[lst_binning.<level>]`

Defines the LST grid for `nenudata l2_to_l3_lst`. One subsection per L3 output level.

| Key         | Type  | Description                                 |
|-------------|-------|---------------------------------------------|
| `start`     | float | Start of the LST range (hours)              |
| `end`       | float | End of the LST range (hours)                |
| `width`     | float | Width of each LST bin (hours)               |
| `longitude` | float | Geographic longitude of the array (radians) |

```toml
[lst_binning.L3]
start     = 6.0
end       = 10.0
width     = 0.1
longitude = 0.037
```

Output MS filenames include an LST tag: `SW03_T063.MS` for bin centre 6.3 h.

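
As a rough illustration of how these keys define a grid, the Python sketch below builds the bin edges and formats a filename tag. It assumes contiguous bins of equal width starting at `start`, and it infers the tag format (`T` followed by ten times the LST in hours, zero-padded to three digits) from the single `T063` example above. `lst_bin_edges` and `lst_tag` are hypothetical helper names, not part of `nenudata`.

```python
def lst_bin_edges(start: float, end: float, width: float) -> list:
    """Return the edges (hours) of contiguous LST bins covering [start, end]."""
    if width <= 0 or end <= start:
        raise ValueError("need width > 0 and end > start")
    # Round to a whole number of bins to absorb floating-point noise.
    n_bins = round((end - start) / width)
    return [start + k * width for k in range(n_bins + 1)]


def lst_tag(centre_hours: float) -> str:
    """Format an LST value as a filename tag (assumed format, see lead-in)."""
    return f"T{round(centre_hours * 10):03d}"


# Values taken from the [lst_binning.L3] example.
edges = lst_bin_edges(start=6.0, end=10.0, width=0.1)
print(len(edges) - 1)                # number of bins in the grid
print(f"SW03_{lst_tag(6.3)}.MS")     # filename for the 6.3 h example
```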

---

## Optional — remote data transfer

### `[remote_hosts.<name>]`

One subsection per remote site. Used by `nenudata transfer` to pull L1 data from an archive host.

| Key                 | Type   | Description                                                                                                                           |
|---------------------|--------|---------------------------------------------------------------------------------------------------------------------------------------|
| `host`              | string | SSH hostname                                                                                                                          |
| `level`             | string | Data level present on the remote (e.g. `"L1"`)                                                                                        |
| `data_path`         | string | Path template; supports `%YEAR%`, `%MONTH%`, `%OBS_ID%`, `%LEVEL%`, `%NODE%`                                                          |
| `nodes`             | string | Comma-separated list or range pattern of remote nodes                                                                                 |
| `password_file`     | string | Path to an SSH password / key file (optional)                                                                                         |
| `data_handler_path` | string | Absolute path to the data-handler config on the remote host (optional); when set, `push_l2` rsyncs `n2_obs_ids_push.json` alongside it |

```toml
[remote_hosts.dawn]
host              = "dawn"
level             = "L2_BP"
data_path         = "/net/%NODE%/data/obs/%LEVEL%/%OBS_ID%/"
nodes             = "node1[01-15]"
data_handler_path = "/home/user/nenuflow/data_handler.toml"  # optional
```

---

## Optional — N2 grouping

### `[n2_group]`

Groups multiple N2 obs_ids under a single virtual obs_id for `l2_to_l3_lst`. Each key is the virtual group name; the value is the list of N2 obs_ids it contains. Every listed obs_id must already be defined in `[n2_obs_ids]`.

```toml
[n2_group]
"2023_NT04_ALL" = ["20231208_NT04", "20231210_NT04", "20231212_NT04"]
```

When `nenudata l2_to_l3_lst` receives `2023_NT04_ALL` as the obs_id, it processes all constituent N2 obs_ids together.

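
The group lookup described above can be sketched in a few lines of Python. This is a minimal illustration working on an already-parsed config dictionary, not the `nenudata` implementation: `resolve_n2_obs_ids` is a hypothetical name, and raising `KeyError` for an undefined member is only one assumed way of enforcing the rule that every listed obs_id must be defined in `[n2_obs_ids]`.

```python
def resolve_n2_obs_ids(config: dict, obs_id: str) -> list:
    """Expand a virtual group name into its N2 obs_ids.

    A name that is not a group is returned unchanged as a one-element list.
    """
    groups = config.get("n2_group", {})
    known = config.get("n2_obs_ids", {})
    if obs_id not in groups:
        return [obs_id]
    members = list(groups[obs_id])
    # Every member of a group needs its own [n2_obs_ids.*] subsection.
    missing = [m for m in members if m not in known]
    if missing:
        raise KeyError(f"group {obs_id!r} lists undefined obs_ids: {missing}")
    return members


# Hand-written stand-in for the parsed TOML (only the relevant tables).
config = {
    "n2_obs_ids": {
        "20231208_NT04": {"nodes": "node101,node102,node103"},
        "20231210_NT04": {"nodes": "node101,node102,node103"},
    },
    "n2_group": {"2023_NT04_ALL": ["20231208_NT04", "20231210_NT04"]},
}
print(resolve_n2_obs_ids(config, "2023_NT04_ALL"))
print(resolve_n2_obs_ids(config, "20231208_NT04"))
```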

---

## Minimal example

```toml
[spectral_windows]
SW03 = [313, 371]
SW04 = [372, 430]

[data_level_path]
L1    = "/databf/nenufar-nri/LT01/%YEAR%/%MONTH%/%N1_OBS_ID%/L1/"
L2_BP = "/net/%NODE%/data/users/lofareor/nenufar/obs/L2_BP/"

[obs_ids]
"20231208_CASA" = ["node101", "node102"]
"20231210_NT04" = ["node103", "node104"]
```

## Full example

```toml
# top-level scalars must appear before any [section] header
bp_cal_path = "/databf/.../bp_gains/%OBS_ID%/bp_cal_%OBS_ID%.h5"
cal_obs_id_patterns = ["*CYGA*", "*CASA*"]

[spectral_windows]
SW01   = [192, 250]
SW02   = [251, 312]
SW03   = [313, 371]
SW_LOW = ["SW01", "SW02"]  # composite

[data_level_path]
L1    = "/databf/nenufar-nri/LT01/%YEAR%/%MONTH%/%N1_OBS_ID%/L1/"
L2_BP = "/net/%NODE%/data/users/lofareor/nenufar/obs/L2_BP/"

[obs_ids]
"20231208_CASA" = ["node101", "node102", "node103"]
"20231210_NT04" = ["node101", "node102", "node103"]

[n2_obs_ids.20231208_NT04_LONG]
n1_obs_ids = ["20231208_213000_20231209_070000_NT04_COSMIC_DAWN"]
nodes      = "node101,node102,node103"

[obs_calibration_map]
"20231210_NT04" = "20231208_CASA"

[l1_to_l2_config.L2_BP]
dppp_config = "/home/users/lofareor/parsets/dppp_l1_to_l2_bp.parset"

[lst_binning.L3]
start     = 6.0
end       = 10.0
width     = 0.1
longitude = 0.037

[remote_hosts.databf]
host      = "databfnfr"
level     = "L1"
data_path = "/data/nenufar-nri/LT01/%YEAR%/%MONTH%/%OBS_ID%/L1/"
nodes     = "node101,node102"

[n2_group]
"2023_NT04_ALL" = ["20231208_NT04_LONG"]
```
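
To make the placeholder rules of `[data_level_path]` concrete, here is a small Python sketch of the expansion applied to the templates used in the examples. It is an illustration under stated assumptions rather than the code `nenudata` runs: `expand_data_path` is a hypothetical helper, the year and month are read from the first six characters of the obs_id, `%N1_OBS_ID%` falls back to the obs_id when no long name is given, and the automatically appended obs_id is joined with a single `/`.

```python
from typing import Optional


def expand_data_path(template: str, obs_id: str, node: str = "",
                     n1_obs_id: Optional[str] = None) -> str:
    """Expand the %...% placeholders of a data-level path template."""
    # Assumption: obs_ids start with a YYYYMMDD date prefix.
    year, month = obs_id[:4], obs_id[4:6]
    has_obs_id = "%OBS_ID%" in template or "%N1_OBS_ID%" in template
    path = (
        template
        .replace("%YEAR%", year)
        .replace("%MONTH%", month)
        .replace("%N1_OBS_ID%", n1_obs_id or obs_id)
        .replace("%OBS_ID%", obs_id)
        .replace("%NODE%", node)
    )
    if not has_obs_id:
        # No obs_id placeholder in the template: append the obs_id.
        path = path.rstrip("/") + "/" + obs_id
    return path


l1 = "/databf/nenufar-nri/LT01/%YEAR%/%MONTH%/%N1_OBS_ID%/L1/"
l2 = "/net/%NODE%/data/users/lofareor/nenufar/obs/L2/"
print(expand_data_path(
    l1, "20231208_NT04",
    n1_obs_id="20231208_213000_20231209_070000_NT04_COSMIC_DAWN"))
print(expand_data_path(l2, "20231208_NT04", node="node101"))
```

The second call shows the fallback rule: the `L2` template contains no obs_id placeholder, so the obs_id ends up as the last path component.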