# Data transfer — retrieve, retrieve_l2, push_l2

These commands move data between the NenuFAR site and the processing cluster using rsync over SSH. Transfer jobs are distributed across nodes using the same worker pool as the calibration pipeline.

Remote hosts must be declared in the `[remote_hosts]` section of the data-handler config:

```toml
[remote_hosts.dawn]
host = 'dawn.obspm.fr'
level = 'L1'
data_path = '/data/nenuraw/%YEAR%/%MONTH%/%OBS_ID%'
nodes = 'node[101-110]'
```

## retrieve — pull L1 raw data from a remote site

```bash
nenudata retrieve dawn "202312*_NT04"
```

Rsyncs the L1 sub-band MSs (`SB{sb}.MS`) for each obs_id and spectral window from the remote host into the local `L1` data level directory.

```bash
nenudata retrieve dawn "202312*_NT04" --sws SW03 --dry_run
```

Use `--sws` to restrict the transfer to a given spectral window, and `--dry_run` to print the rsync commands without running them.

## retrieve_l2 — pull processed L2 data from a remote site

```bash
nenudata retrieve_l2 dawn "202312*_NT04" L2
```

Retrieves already-processed L2 MSs by querying the remote host for their paths (`nenudata get_ms` on the remote) and then rsyncing each file. MS files are distributed across local nodes in round-robin order.

## push_l2 — push processed L2 data to a remote site

```bash
nenudata push_l2 dawn "202312*_NT04" L2_BP
```

Rsyncs local L2 MSs to the remote site. Remote directories are created via SSH before the transfer. MS files are cycled across remote nodes in round-robin order.

```bash
# Push a single SW
nenudata push_l2 dawn "202312*_NT04" L2_BP --sws SW03

# Dry-run: print rsync commands without executing
nenudata push_l2 dawn "202312*_NT04" L2_BP --dry_run
```

### N2 obs_ids JSON

After the transfer completes, `push_l2` writes an `n2_obs_ids_push.json` file next to the local data-handler config. This JSON records the N2 obs_ids that were pushed and the remote node pool, so the remote data handler can be updated without manual editing.
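
The node pool and the round-robin placement mentioned above can be illustrated with a short Python sketch. It is not the `nenudata` implementation: the helper names `expand_nodes` and `assign_round_robin` are hypothetical, and the bracket-range syntax (`node[101-110]`, `node1[01-15]`) is assumed to mean an inclusive, zero-padded numeric range.

```python
import re
from itertools import cycle


def expand_nodes(pattern):
    """Expand 'node[101-110]' into ['node101', ..., 'node110'] (assumed syntax)."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]  # plain host name, nothing to expand
    prefix, start, stop, suffix = match.groups()
    width = len(start)  # keep zero padding, e.g. [01-15] -> 01, 02, ...
    return [
        f"{prefix}{i:0{width}d}{suffix}"
        for i in range(int(start), int(stop) + 1)
    ]


def assign_round_robin(ms_files, nodes):
    """Pair each MS file with the next node in the pool, wrapping around."""
    return list(zip(ms_files, cycle(nodes)))


if __name__ == "__main__":
    nodes = expand_nodes("node1[01-03]")
    ms_files = [f"SB{sb}.MS" for sb in range(5)]  # illustrative file names
    for ms, node in assign_round_robin(ms_files, nodes):
        print(f"{ms} -> {node}")
```

With three nodes and five files, the fourth file wraps back to the first node, so a node pool smaller than the number of MSs still receives an even share.
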
If `data_handler_path` is set in the `[remote_hosts.*]` section, the JSON is also rsynced automatically to the directory containing the remote data handler:

```toml
[remote_hosts.dawn]
host = "dawn"
data_path = "/net/%NODE%/data/obs/%LEVEL%/%OBS_ID%/"
nodes = "node1[01-15]"
data_handler_path = "/home/user/nenuflow/data_handler.toml"  # optional
```

After the push, `push_l2` prints the exact `import-n2` command to run on the remote server.

---

## import-n2 — register pushed N2 obs_ids in the remote data handler

Run this command **on the remote server** after receiving an `n2_obs_ids_push.json` from `push_l2`:

```bash
nenudata import-n2 /path/to/n2_obs_ids_push.json -c data_handler.toml
```

Reads the JSON and appends any new `[n2_obs_ids.*]` sections to the data handler. Entries that are already registered are silently skipped, making the command safe to re-run:

```
Imported 2: 20231208_NT04, 20231210_NT04
Skipped 1 already registered: 20231205_NT04
```

The resulting data handler can immediately be used by `nenudata l1_to_l2`, `nenudata quality-collect`, etc. on the remote server.

### Options

| Option | Default | Description |
|---|---|---|
| `--config` / `-c` | `data_handler.toml` | Data-handler config file to update |

---

## Reference

```{eval-rst}
.. click:: nenucal.tools.nenudata:retrieve
   :prog: nenudata retrieve
   :nested: full
```

```{eval-rst}
.. click:: nenucal.tools.nenudata:retrieve_l2
   :prog: nenudata retrieve_l2
   :nested: full
```

```{eval-rst}
.. click:: nenucal.tools.nenudata:push_l2
   :prog: nenudata push_l2
   :nested: full
```

```{eval-rst}
.. click:: nenucal.tools.nenudata:import_n2
   :prog: nenudata import-n2
   :nested: full
```

## See also

- [pipeline](pipeline.md) — process data after retrieval