Data transfer — retrieve, retrieve_l2, push_l2¶
These commands move data between the NenuFAR site and the processing cluster using rsync over SSH. Transfer jobs are distributed across nodes using the same worker pool as the calibration pipeline.
Remote hosts must be declared in the [remote_hosts] section of the
data-handler config:
[remote_hosts.dawn]
host = 'dawn.obspm.fr'
level = 'L1'
data_path = '/data/nenuraw/%YEAR%/%MONTH%/%OBS_ID%'
nodes = 'node[101-110]'
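To make the placeholders in this config concrete, here is a minimal Python sketch of how a bracketed node range and a data_path template could be expanded. The helper names expand_nodes and expand_data_path are hypothetical, and the sketch assumes that %YEAR% and %MONTH% are taken from the YYYYMMDD prefix of the obs_id. The actual nenudata implementation may differ.

```python
import re


def expand_nodes(spec: str) -> list[str]:
    """Expand a bracket range such as 'node[101-110]' into node names."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", spec)
    if match is None:
        return [spec]  # no bracket range: treat as a single node name
    prefix, start, stop, suffix = match.groups()
    width = len(start)  # preserve zero padding, e.g. '01' stays two digits
    return [
        f"{prefix}{i:0{width}d}{suffix}"
        for i in range(int(start), int(stop) + 1)
    ]


def expand_data_path(template: str, obs_id: str) -> str:
    """Fill %YEAR%, %MONTH% and %OBS_ID% in a data_path template."""
    # Assumption: the obs_id begins with a YYYYMMDD date.
    values = {
        "%YEAR%": obs_id[:4],
        "%MONTH%": obs_id[4:6],
        "%OBS_ID%": obs_id,
    }
    for key, value in values.items():
        template = template.replace(key, value)
    return template


nodes = expand_nodes("node[101-110]")
path = expand_data_path("/data/nenuraw/%YEAR%/%MONTH%/%OBS_ID%", "20231208_NT04")
print(nodes[0], nodes[-1], len(nodes))  # node101 node110 10
print(path)  # /data/nenuraw/2023/12/20231208_NT04
```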
retrieve — pull L1 raw data from a remote site¶
nenudata retrieve dawn "202312*_NT04"
Rsyncs the L1 sub-band MSs (SB{sb}.MS) for each obs_id and spectral window
from the remote host into the local L1 data level directory.
nenudata retrieve dawn "202312*_NT04" --sws SW03 --dry_run
Use --dry_run to print the rsync commands without running them.
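The dry-run behaviour can be pictured with the following Python sketch: commands are built first, then either printed or executed. The function names, the rsync flags (-av) and the example paths are illustrative assumptions, not the options or directory layout that nenudata actually uses.

```python
import shlex
import subprocess


def rsync_command(host: str, remote_path: str, local_dir: str) -> list[str]:
    """Build one rsync-over-SSH command as an argument list."""
    # -a (archive) and -v (verbose) are standard rsync flags, chosen here
    # only for illustration.
    return ["rsync", "-av", f"{host}:{remote_path}", local_dir]


def transfer(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command in dry-run mode, otherwise execute it."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical remote and local paths for a single sub-band MS.
commands = [
    rsync_command(
        "dawn.obspm.fr",
        "/data/nenuraw/2023/12/20231208_NT04/SW03/SB200.MS",
        "/data/L1/20231208_NT04/SW03/",
    )
]
transfer(commands, dry_run=True)
```

Keeping command construction separate from execution means the dry run prints exactly what a real run would execute.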
retrieve_l2 — pull processed L2 data from a remote site¶
nenudata retrieve_l2 dawn "202312*_NT04" L2
Retrieves already-processed L2 MSs in two steps: it first runs nenudata get_ms
on the remote host to obtain their paths, then rsyncs each file.
MS files are distributed across local nodes in round-robin order.
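Round-robin distribution can be sketched in a few lines of Python. The function name assign_round_robin and the file and node names are hypothetical. The sketch only illustrates the assignment order, not how nenudata schedules the transfers.

```python
from itertools import cycle


def assign_round_robin(ms_files: list[str], nodes: list[str]) -> list[tuple[str, str]]:
    """Pair each MS file with a node, cycling through the nodes in order."""
    # zip stops when ms_files is exhausted; cycle repeats nodes indefinitely.
    return list(zip(ms_files, cycle(nodes)))


ms_files = [f"SB{sb}.MS" for sb in range(200, 205)]
nodes = ["node101", "node102"]

for ms, node in assign_round_robin(ms_files, nodes):
    print(f"{ms} -> {node}")
# SB200.MS -> node101
# SB201.MS -> node102
# SB202.MS -> node101
# SB203.MS -> node102
# SB204.MS -> node101
```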
push_l2 — push processed L2 data to a remote site¶
nenudata push_l2 dawn "202312*_NT04" L2_BP
Rsyncs local L2 MSs to the remote site. Remote directories are created via SSH before the transfer. MS files are cycled across remote nodes in round-robin order.
# Push a single SW
nenudata push_l2 dawn "202312*_NT04" L2_BP --sws SW03
# Dry-run: print rsync commands without executing
nenudata push_l2 dawn "202312*_NT04" L2_BP --dry_run
N2 obs_ids JSON¶
After the transfer completes, push_l2 writes an n2_obs_ids_push.json file
next to the local data-handler config. This JSON records the N2 obs_ids that
were pushed and the remote node pool, so the remote data handler can be updated
without manual editing.
If data_handler_path is set in the [remote_hosts.*] section, the JSON is
also rsynced automatically to the directory containing the remote data handler:
[remote_hosts.dawn]
host = "dawn"
data_path = "/net/%NODE%/data/obs/%LEVEL%/%OBS_ID%/"
nodes = "node1[01-15]"
data_handler_path = "/home/user/nenuflow/data_handler.toml" # optional
After the push, push_l2 prints the exact import-n2 command to run on the
remote server.
import-n2 — register pushed N2 obs_ids in the remote data handler¶
Run this command on the remote server after receiving an n2_obs_ids_push.json
from push_l2:
nenudata import-n2 /path/to/n2_obs_ids_push.json -c data_handler.toml
Reads the JSON and appends any new [n2_obs_ids.*] sections to the data
handler. Entries that are already registered are silently skipped, making the
command safe to re-run:
Imported 2: 20231208_NT04, 20231210_NT04
Skipped 1 already registered: 20231205_NT04
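The skip-if-registered behaviour can be illustrated with a small Python sketch. Plain dictionaries stand in for the data handler and the JSON file, and both the function name import_obs_ids and the entry layout are hypothetical: the real JSON schema is not described here.

```python
def import_obs_ids(registered: dict, pushed: dict) -> tuple[list[str], list[str]]:
    """Add new obs_ids to `registered`; leave existing entries untouched."""
    imported, skipped = [], []
    for obs_id, entry in pushed.items():
        if obs_id in registered:
            skipped.append(obs_id)  # already present: do not overwrite
        else:
            registered[obs_id] = entry
            imported.append(obs_id)
    return imported, skipped


# Hypothetical contents: one obs_id already registered, three pushed.
registered = {"20231205_NT04": {"nodes": "node1[01-15]"}}
pushed = {
    "20231205_NT04": {"nodes": "node1[01-15]"},
    "20231208_NT04": {"nodes": "node1[01-15]"},
    "20231210_NT04": {"nodes": "node1[01-15]"},
}

imported, skipped = import_obs_ids(registered, pushed)
print(f"Imported {len(imported)}: {', '.join(imported)}")
print(f"Skipped {len(skipped)} already registered: {', '.join(skipped)}")

# Running the import a second time changes nothing.
imported_again, skipped_again = import_obs_ids(registered, pushed)
```

Because existing entries are never overwritten, a second run imports nothing and reports every obs_id as skipped.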
The resulting data handler can immediately be used by nenudata l1_to_l2,
nenudata quality-collect, etc. on the remote server.
Options¶
| Option | Default | Description |
|---|---|---|
| -c, --config | | Data-handler config file to update |
Reference¶
nenudata retrieve¶
Retrieve L1 data corresponding to given OBS_IDS and SWS
Usage
nenudata retrieve [OPTIONS] REMOTE_HOST OBS_IDS
Options
- -c, --config <config>¶
Data handler configuration file
- --dry_run¶
Print the rsync commands without running them
- --run_on_host <run_on_host>¶
Run rsync on the specified host
- -s, --sws <sws>¶
Spectral windows
Arguments
- REMOTE_HOST¶
Required argument
- OBS_IDS¶
Required argument
nenudata retrieve_l2¶
Retrieve L2 data corresponding to given OBS_IDS and SWS
Usage
nenudata retrieve_l2 [OPTIONS] REMOTE_HOST_NAME OBS_IDS L2_LEVEL
Options
- -c, --config <config>¶
Data handler configuration file
- --dry_run¶
Print the rsync commands without running them
- --run_on_host <run_on_host>¶
Run rsync on the specified host
- -s, --sws <sws>¶
Spectral windows
Arguments
- REMOTE_HOST_NAME¶
Required argument
- OBS_IDS¶
Required argument
- L2_LEVEL¶
Required argument
nenudata push_l2¶
Push the L2 data corresponding to the given OBS_IDS and SWS from the local host to a remote host
Usage
nenudata push_l2 [OPTIONS] REMOTE_HOST_NAME OBS_IDS L2_LEVEL
Options
- -c, --config <config>¶
Data handler configuration file
- --dry_run¶
Print the rsync commands without running them
- -s, --sws <sws>¶
Spectral windows
Arguments
- REMOTE_HOST_NAME¶
Required argument
- OBS_IDS¶
Required argument
- L2_LEVEL¶
Required argument
nenudata import-n2¶
Register N2 obs_ids from a JSON file generated by push_l2 into the data handler.
Usage
nenudata import-n2 [OPTIONS] N2_JSON
Options
- -c, --config <config>¶
Data handler configuration file
Arguments
- N2_JSON¶
Required argument
See also¶
pipeline — process data after retrieval