Data transfer — retrieve, retrieve_l2, push_l2

These commands move data between the NenuFAR site and the processing cluster using rsync over SSH. Transfer jobs are distributed across nodes using the same worker pool as the calibration pipeline.

Remote hosts must be declared in the [remote_hosts] section of the data-handler config:

[remote_hosts.dawn]
host = 'dawn.obspm.fr'
level = 'L1'
data_path = '/data/nenuraw/%YEAR%/%MONTH%/%OBS_ID%'
nodes = 'node[101-110]'
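
As a rough illustration of how the placeholders in data_path could be filled in for one observation, here is a minimal Python sketch. It is not the tool's actual code: the function name expand_data_path is hypothetical, and it assumes that an obs_id starts with a YYYYMMDD date (as in 20231208_NT04) and that %MONTH% is zero-padded.

```python
from datetime import datetime


def expand_data_path(template: str, obs_id: str) -> str:
    """Fill %YEAR%, %MONTH% and %OBS_ID% in a data_path template.

    Assumption: the first 8 characters of obs_id are a YYYYMMDD date.
    """
    date = datetime.strptime(obs_id[:8], "%Y%m%d")
    values = {
        "%YEAR%": f"{date.year:04d}",
        "%MONTH%": f"{date.month:02d}",  # zero-padding is an assumption
        "%OBS_ID%": obs_id,
    }
    path = template
    for placeholder, value in values.items():
        path = path.replace(placeholder, value)
    return path


print(expand_data_path("/data/nenuraw/%YEAR%/%MONTH%/%OBS_ID%", "20231208_NT04"))
# prints /data/nenuraw/2023/12/20231208_NT04
```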

retrieve — pull L1 raw data from a remote site

nenudata retrieve dawn "202312*_NT04"

Rsyncs the L1 sub-band Measurement Sets (MSs, named SB{sb}.MS) for each obs_id and spectral window from the remote host into the local L1 data-level directory.

nenudata retrieve dawn "202312*_NT04" --sws SW03 --dry_run

Use --dry_run to print the rsync commands without running them.
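
The dry-run behaviour can be pictured with the following Python sketch. It is only an illustration under stated assumptions: build_rsync_command and transfer are hypothetical helpers, the rsync flags and the paths are examples, and the real command line built by nenudata may differ.

```python
import shlex
import subprocess


def build_rsync_command(host, remote_path, local_dir):
    """Return an rsync-over-SSH command as an argument list."""
    return ["rsync", "-a", "-e", "ssh", f"{host}:{remote_path}", local_dir]


def transfer(commands, dry_run=False):
    """Print every command; execute it only when dry_run is False."""
    printed = []
    for cmd in commands:
        line = shlex.join(cmd)
        print(line)
        printed.append(line)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return printed


# Hypothetical paths, for illustration only.
cmd = build_rsync_command("dawn.obspm.fr", "/data/example/SB200.MS", "/tmp/example/")
transfer([cmd], dry_run=True)
# prints rsync -a -e ssh dawn.obspm.fr:/data/example/SB200.MS /tmp/example/
```

Keeping the command as an argument list and only joining it for display avoids quoting problems when a path contains spaces.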

retrieve_l2 — pull processed L2 data from a remote site

nenudata retrieve_l2 dawn "202312*_NT04" L2

Retrieves already-processed L2 MSs by querying the remote host for their paths (nenudata get_ms on the remote) and then rsyncing each file. MS files are distributed across local nodes in round-robin order.
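
The round-robin placement described above can be sketched in Python as follows. This is a simplified model, not the tool's implementation: expand_nodes and assign_round_robin are hypothetical names, and the way a nodes specification such as 'node[101-110]' is expanded is an assumption based on the config examples.

```python
import re
from itertools import cycle


def expand_nodes(spec):
    """Expand 'node[101-110]' into ['node101', ..., 'node110'].

    Zero padding of the range start is preserved, so 'node1[01-15]'
    gives 'node101' ... 'node115'. A spec without brackets is
    returned unchanged as a one-element list.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", spec)
    if match is None:
        return [spec]
    prefix, start, stop, suffix = match.groups()
    width = len(start)
    return [
        f"{prefix}{i:0{width}d}{suffix}"
        for i in range(int(start), int(stop) + 1)
    ]


def assign_round_robin(ms_files, nodes):
    """Pair each MS with a node, wrapping around when nodes run out."""
    return list(zip(ms_files, cycle(nodes)))


nodes = expand_nodes("node[101-103]")
for ms, node in assign_round_robin(["SB1.MS", "SB2.MS", "SB3.MS", "SB4.MS"], nodes):
    print(ms, "->", node)
```

With three nodes and four files, the fourth file wraps back to the first node.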

push_l2 — push processed L2 data to a remote site

nenudata push_l2 dawn "202312*_NT04" L2_BP

Rsyncs local L2 MSs to the remote site. Remote directories are created via SSH before the transfer. MS files are cycled across remote nodes in round-robin order.

# Push a single SW
nenudata push_l2 dawn "202312*_NT04" L2_BP --sws SW03

# Dry-run: print rsync commands without executing
nenudata push_l2 dawn "202312*_NT04" L2_BP --dry_run

N2 obs_ids JSON

After the transfer completes, push_l2 writes an n2_obs_ids_push.json file next to the local data-handler config. This JSON records the N2 obs_ids that were pushed and the remote node pool, so the remote data handler can be updated without manual editing.

If data_handler_path is set in the [remote_hosts.*] section, the JSON is also rsynced automatically to the directory containing the remote data handler:

[remote_hosts.dawn]
host              = "dawn"
data_path         = "/net/%NODE%/data/obs/%LEVEL%/%OBS_ID%/"
nodes             = "node1[01-15]"
data_handler_path = "/home/user/nenuflow/data_handler.toml"  # optional

After the push, push_l2 prints the exact import-n2 command to run on the remote server.


import-n2 — register pushed N2 obs_ids in the remote data handler

Run this command on the remote server after receiving an n2_obs_ids_push.json from push_l2:

nenudata import-n2 /path/to/n2_obs_ids_push.json -c data_handler.toml

Reads the JSON and appends any new [n2_obs_ids.*] sections to the data handler. Entries that are already registered are silently skipped, making the command safe to re-run:

Imported 2: 20231208_NT04, 20231210_NT04
Skipped 1 already registered: 20231205_NT04

The resulting data handler can immediately be used by nenudata l1_to_l2, nenudata quality-collect, etc. on the remote server.
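
The skip-if-already-registered behaviour is what makes import-n2 safe to re-run. A minimal Python sketch of that idea is shown here; the function import_n2 and the obs_id-to-settings dictionary layout are hypothetical and do not reflect the real JSON or TOML schema.

```python
def import_n2(registered, pushed):
    """Add entries from `pushed` that are missing in `registered`.

    Both arguments map obs_id -> settings (a hypothetical layout).
    `registered` is updated in place. Returns (imported, skipped),
    two lists of obs_ids.
    """
    imported, skipped = [], []
    for obs_id, settings in pushed.items():
        if obs_id in registered:
            skipped.append(obs_id)  # already known: leave it untouched
        else:
            registered[obs_id] = settings
            imported.append(obs_id)
    return imported, skipped


registered = {"20231205_NT04": {}}
pushed = {"20231205_NT04": {}, "20231208_NT04": {}, "20231210_NT04": {}}

imported, skipped = import_n2(registered, pushed)
print(f"Imported {len(imported)}: {', '.join(imported)}")
print(f"Skipped {len(skipped)} already registered: {', '.join(skipped)}")

# Running the same import again changes nothing.
assert import_n2(registered, pushed) == ([], list(pushed))
```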

Options

Option          Default             Description
--config / -c   data_handler.toml   Data-handler config file to update


Reference

nenudata retrieve

Retrieve L1 data corresponding to given OBS_IDS and SWS

Usage

nenudata retrieve [OPTIONS] REMOTE_HOST OBS_IDS

Options

-c, --config <config>

Data handler configuration file

--dry_run

Dry run: print the rsync commands without executing them

--run_on_host <run_on_host>

Run rsync on specified host

-s, --sws <sws>

Spectral windows

Arguments

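REMOTE_HOST

Required. Name of a host declared under [remote_hosts] in the data-handler config, for example dawn.

OBS_IDS

Required. Observation ID or quoted wildcard pattern, for example "202312*_NT04".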

nenudata retrieve_l2

Retrieve L2 data corresponding to given OBS_IDS and SWS

Usage

nenudata retrieve_l2 [OPTIONS] REMOTE_HOST_NAME OBS_IDS L2_LEVEL

Options

-c, --config <config>

Data handler configuration file

--dry_run

Dry run: do not execute the transfer

--run_on_host <run_on_host>

Run rsync on specified host

-s, --sws <sws>

Spectral windows

Arguments

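REMOTE_HOST_NAME

Required. Name of a host declared under [remote_hosts] in the data-handler config, for example dawn.

OBS_IDS

Required. Observation ID or quoted wildcard pattern, for example "202312*_NT04".

L2_LEVEL

Required. L2 data level to retrieve, for example L2.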

nenudata push_l2

Push local L2 data corresponding to given OBS_IDS and SWS to a remote host

Usage

nenudata push_l2 [OPTIONS] REMOTE_HOST_NAME OBS_IDS L2_LEVEL

Options

-c, --config <config>

Data handler configuration file

--dry_run

Dry run: print the rsync commands without executing them

-s, --sws <sws>

Spectral windows

--only-n2

Restrict OBS_IDS resolution to N2 obs_ids only

Arguments

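REMOTE_HOST_NAME

Required. Name of a host declared under [remote_hosts] in the data-handler config, for example dawn.

OBS_IDS

Required. Observation ID or quoted wildcard pattern, for example "202312*_NT04".

L2_LEVEL

Required. L2 data level to push, for example L2_BP.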

nenudata import-n2

Register N2 obs_ids from a JSON file generated by push_l2 into the data handler.

Usage

nenudata import-n2 [OPTIONS] N2_JSON

Options

-c, --config <config>

Data handler configuration file

Arguments

N2_JSON

Required. Path to the n2_obs_ids_push.json file written by push_l2.

See also

  • pipeline — process data after retrieval