Configuration Architecture¶
This document describes the current configuration system in pclean.
Overview¶
pclean uses a single-source-of-truth configuration built on
pydantic v2 BaseModel classes.
The top-level PcleanConfig groups all parameters into nine logical
sub-models and provides:
Built-in defaults matching CASA tclean behaviour.
YAML file I/O for reproducible imaging runs.
Layered composition (base + overlay merging).
Backward-compatible flat-kwargs interface for the pclean() function.
to_casa_*() bridge methods that translate user-facing parameters into the CASA-native dicts consumed by the C++ synthesis tools.
Sub-model Hierarchy¶
PcleanConfig
├── selection: SelectionConfig # vis, field, spw, timerange, ...
├── image: ImageConfig # imagename, imsize, cell, specmode, nchan, ...
├── grid: GridConfig # gridder, facets, wprojplanes, pblimit, ...
├── weight: WeightConfig # weighting, robust, noise, uvtaper, ...
├── deconvolution: DeconvolutionConfig # deconvolver, scales, masking params, ...
├── iteration: IterationConfig # niter, gain, threshold, nmajor, ...
├── normalization: NormConfig # pblimit, normtype, psfcutoff
├── misc: MiscConfig # restart, savemodel, calcres, calcpsf
└── cluster: ClusterConfig # parallel, nworkers, type, ...
    └── slurm: SlurmConfig # queue, account, walltime, ...
All fields carry typed defaults. Constructing PcleanConfig() with no
arguments produces a valid configuration equivalent to tclean defaults.
Four Ways to Build a Config¶
1. Direct Python Construction¶
from pclean.config import PcleanConfig, ImageConfig, IterationConfig

config = PcleanConfig(
    image=ImageConfig(imagename='test', imsize=[512, 512], cell='0.5arcsec'),
    iteration=IterationConfig(niter=500, threshold='1mJy'),
)
2. YAML File¶
config = PcleanConfig.from_yaml('my_config.yaml')
A YAML file mirrors the sub-model hierarchy:
selection:
  vis: my_data.ms
  field: '0'
image:
  imagename: output
  imsize: [1024, 1024]
  cell: 0.5arcsec
  specmode: cube
  nchan: 128
iteration:
  niter: 1000
cluster:
  parallel: true
  nworkers: 8
Only fields that differ from the defaults need to be specified.
3. Layered Composition (Merge)¶
base = PcleanConfig.from_yaml('defaults.yaml')
overlay = PcleanConfig.from_yaml('my_overrides.yaml')
config = PcleanConfig.merge(base, overlay)
Later configs win. Merging is deep — nested sub-model fields are merged recursively rather than replaced wholesale.
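The deep-merge rule can be illustrated on plain dictionaries. The sketch below is not pclean's implementation: `deep_merge` is a hypothetical helper and the field values are made up. It only shows how an overlay replaces individual nested fields while the remaining base fields survive.

```python
from copy import deepcopy


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a new dict in which overlay values win, merging nested dicts."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: recurse instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the overlay replace the base value.
            merged[key] = deepcopy(value)
    return merged


# Illustrative values only.
base = {"image": {"imsize": [512, 512], "cell": "1arcsec"}, "iteration": {"niter": 0}}
overlay = {"image": {"cell": "0.5arcsec"}, "iteration": {"niter": 1000}}
result = deep_merge(base, overlay)
```

Here `result["image"]` keeps `imsize` from the base and takes `cell` from the overlay, and neither input dictionary is modified.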
4. Flat Keyword Arguments (Backward Compat)¶
The pclean() function still accepts the 80+ flat keywords familiar
from CASA tclean. Internally it calls:
config = PcleanConfig.from_flat_kwargs(vis='my.ms', imagename='test', ...)
which routes each keyword into the correct sub-model. When a --pconfig
YAML file is also provided, the flat kwargs override the file values.
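As a rough picture of what routing a flat keyword into a sub-model could look like, the following sketch groups flat keywords by section using a lookup table. `ROUTES` and `route_flat_kwargs` are hypothetical names, only a handful of keywords are listed, and the real `from_flat_kwargs()` may work quite differently.

```python
# Hypothetical routing table: flat keyword -> sub-model name.
# The assignments follow the sub-model hierarchy shown in this document.
ROUTES = {
    "vis": "selection",
    "field": "selection",
    "imagename": "image",
    "imsize": "image",
    "cell": "image",
    "niter": "iteration",
    "threshold": "iteration",
}


def route_flat_kwargs(**kwargs) -> dict:
    """Group flat keywords into nested {sub_model: {field: value}} form."""
    nested: dict = {}
    for key, value in kwargs.items():
        try:
            section = ROUTES[key]
        except KeyError:
            # Reject unknown keywords the way a normal function call would.
            raise TypeError(f"unknown keyword: {key!r}") from None
        nested.setdefault(section, {})[key] = value
    return nested


nested = route_flat_kwargs(vis="my.ms", imagename="test", niter=500)
```

The nested result has the same shape as the YAML layout, so it could be merged with a file-based config before validation.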
Merge Order¶
When multiple sources are provided, they are merged in the following priority order (highest priority first):
1. Explicit keyword arguments / CLI flags
2. --pconfig YAML file
3. --preset (later --preset flags override earlier ones)
4. Built-in pydantic defaults
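The priority order can be pictured as applying layers from lowest to highest priority. The sketch below flattens every field to a dotted path so that a plain `dict.update` is enough; this is a simplification for illustration, not how pclean stores its config, and all values are made up.

```python
# Each layer maps dotted field paths to values (illustrative values only).
defaults = {"iteration.niter": 0, "cluster.nworkers": 1, "image.specmode": "mfs"}
preset_a = {"iteration.niter": 5000, "image.specmode": "cube"}  # first --preset
preset_b = {"iteration.niter": 20000}                           # later --preset
pconfig = {"cluster.nworkers": 8}                               # --pconfig YAML file
cli = {"iteration.niter": 100}                                  # kwargs / CLI flags

# Apply layers from lowest to highest priority; later updates win.
resolved: dict = {}
for layer in (defaults, preset_a, preset_b, pconfig, cli):
    resolved.update(layer)
```

After the loop, `iteration.niter` comes from the CLI layer, `cluster.nworkers` from the YAML file, and `image.specmode` from the first preset, because no higher layer touched it.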
Presets¶
Named presets live under src/pclean/configs/presets/ as YAML files
(e.g. vlass.yaml) and are bundled inside the wheel via
tool.setuptools.package-data. They can be loaded via:
from pclean.config import load_preset
config = load_preset('vlass')
or from the CLI:
python -m pclean --preset vlass --selection.vis my.ms --image.imagename out
Presets set only the fields relevant to the observing programme; everything else falls through to the built-in defaults.
CLI Integration¶
The python -m pclean CLI supports three config-related flags:
| Flag | Purpose |
|---|---|
| | Load a YAML config as the base |
| | Load a named preset as the base |
| | Write the resolved config to YAML and exit |
Dot-notation overrides on the command line (e.g.
--cluster.nworkers 16) are merged on top.
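A minimal sketch of how dot-notation flags could be turned into a nested override, assuming the flags arrive as alternating flag/value pairs. `parse_dot_overrides` is a hypothetical helper, not part of the pclean CLI, and it leaves values as strings rather than coercing types.

```python
def parse_dot_overrides(argv: list[str]) -> dict:
    """Turn ['--cluster.nworkers', '16'] into {'cluster': {'nworkers': '16'}}."""
    if len(argv) % 2:
        raise ValueError("expected alternating flag/value pairs")
    nested: dict = {}
    for flag, value in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("--") or "." not in flag:
            raise ValueError(f"expected --section.field, got {flag!r}")
        *parents, leaf = flag[2:].split(".")
        node = nested
        for name in parents:
            # Walk (or create) one nesting level per dotted component.
            node = node.setdefault(name, {})
        node[leaf] = value  # left as str; type coercion is out of scope here
    return nested


overrides = parse_dot_overrides(
    ["--cluster.nworkers", "16", "--cluster.slurm.queue", "debug"]
)
```

The resulting nested dictionary has the same shape as a YAML overlay, so it could be merged on top of the file-based layers.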
CASA Bridge Methods¶
CASA’s C++ synthesis tools (synthesisimager, synthesisdeconvolver,
synthesisnormalizer, iterbotsink) expect parameter dicts with
CASA-internal field names and conventions that differ from the
user-facing API. PcleanConfig provides bridge methods that perform
these translations:
| Method | Produces | Notable Translations |
|---|---|---|
| | | |
| | | Ensures |
| | | Injects |
| | | |
| | | Injects |
| | | |
| | | |
| | | Passes |
| | Full dict of all above | Serializable payload for continuum-parallel workers |
Bridge methods live on PcleanConfig (not on sub-models) because many
CASA translations cross sub-model boundaries — for example, imagename
appears in impars, gridpars, normpars, and iterpars.
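The placement argument can be sketched with stand-in dataclasses. Everything below is hypothetical: the class names, method names, default values, and output keys are chosen for illustration and do not reproduce pclean's actual bridge methods or the dicts CASA expects.

```python
from dataclasses import dataclass, field


@dataclass
class ImageCfg:                 # stand-in for ImageConfig
    imagename: str = ""
    imsize: list = field(default_factory=lambda: [100, 100])


@dataclass
class GridCfg:                  # stand-in for GridConfig
    gridder: str = "standard"


@dataclass
class Cfg:                      # stand-in for PcleanConfig
    image: ImageCfg = field(default_factory=ImageCfg)
    grid: GridCfg = field(default_factory=GridCfg)

    # The methods sit on the top-level class because some output dicts
    # (gridpars here) need values from more than one sub-model.
    def to_impars(self) -> dict:
        return {"imagename": self.image.imagename, "imsize": list(self.image.imsize)}

    def to_gridpars(self) -> dict:
        # imagename comes from `image`, gridder from `grid`.
        return {"imagename": self.image.imagename, "gridder": self.grid.gridder}


cfg = Cfg(image=ImageCfg(imagename="test"))
impars, gridpars = cfg.to_impars(), cfg.to_gridpars()
```

Had `to_gridpars` lived on the grid sub-model, it would have needed a reference to its sibling `image` model to obtain `imagename`.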
Convenience Properties¶
PcleanConfig exposes frequently accessed values as properties to avoid
repetitive sub-model traversal:
| Property | Returns |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
| | Always |
| | Number of measurement sets in |
Data Flow Through Engines¶
pclean(vis=..., **kw)                      # user entry point
    │
    ▼
PcleanConfig.from_flat_kwargs()            # build config
    │
    ├─ parallel=False ──► SerialImager(config)
    │       └── calls to_casa_*() once in __init__
    │       └── passes dicts to C++ tools
    │
    ├─ parallel=True, cube ──► ParallelCubeImager(config)
    │       │   └── partition_cube(config) → list[PcleanConfig]
    │       │   └── submit config.model_dump() to Dask workers
    │       ▼
    │   Workers: PcleanConfig.model_validate(dict) → SerialImager(config)
    │
    └─ parallel=True, mfs ──► ParallelContinuumImager(config)
            │   └── partition_continuum(config) → list[dict] (CASA bundles)
            │   └── submit bundles to Dask actors
            ▼
        Workers: receive CASA bundle dicts, use directly with synthesisimager
        Coordinator: uses config.to_casa_normpars/decpars/iterpars for normalizer/deconvolver/iterbot
Why Two Serialization Strategies?¶
Cube workers run a full SerialImager (imaging + deconvolution), so they need the complete hierarchical config to call all to_casa_*() methods. The config is serialized via model_dump() and reconstructed on the worker with model_validate().
Continuum workers only run synthesisimager (gridding). They receive pre-translated CASA bundle dicts from PcleanConfig.to_casa_bundle(). This avoids a round-trip through pydantic on the worker side and naturally handles the fact that synthesisutils.contdatapartition() returns CASA-native selpars that do not cleanly map back to user-facing config fields.
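The contrast between the two strategies can be sketched with the standard library. Dataclasses stand in for the pydantic models, `asdict` stands in for `model_dump()`, JSON stands in for whatever transport the scheduler uses, and `to_bundle` is a hypothetical analogue of `to_casa_bundle()` with made-up keys.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class IterCfg:                      # stand-in for IterationConfig
    niter: int = 0


@dataclass
class Cfg:                          # stand-in for PcleanConfig
    imagename: str = ""
    iteration: IterCfg = field(default_factory=IterCfg)

    def to_bundle(self) -> dict:    # hypothetical analogue of to_casa_bundle()
        return {"impars": {"imagename": self.imagename},
                "iterpars": {"niter": self.iteration.niter}}


cfg = Cfg(imagename="test", iteration=IterCfg(niter=500))

# Strategy 1 (cube): ship the hierarchical config, rebuild it on the worker.
wire = json.dumps(asdict(cfg))      # analogue of model_dump()
raw = json.loads(wire)
rebuilt = Cfg(imagename=raw["imagename"], iteration=IterCfg(**raw["iteration"]))

# Strategy 2 (continuum): ship the already-translated dict and use it as-is.
bundle = json.loads(json.dumps(cfg.to_bundle()))
```

In the first strategy the worker ends up with a full config object and can derive any translated dict itself; in the second it only ever sees plain dictionaries.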
Partitioning¶
| Function | Input | Output | Strategy |
|---|---|---|---|
| | | | Uses |
| | | | Calls |
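For intuition only, here is one generic way to divide a cube's channels into contiguous blocks, one per worker. This is an assumption made for illustration and is not necessarily the strategy the partitioning functions use; `split_channels` is a hypothetical helper.

```python
def split_channels(nchan: int, nparts: int) -> list[tuple[int, int]]:
    """Return (start, count) blocks covering channels 0..nchan-1 as evenly as possible."""
    if nchan < 1 or nparts < 1:
        raise ValueError("nchan and nparts must be positive")
    nparts = min(nparts, nchan)          # never create empty blocks
    base, extra = divmod(nchan, nparts)
    blocks, start = [], 0
    for i in range(nparts):
        size = base + (1 if i < extra else 0)   # first `extra` blocks get one more
        blocks.append((start, size))
        start += size
    return blocks


blocks = split_channels(nchan=128, nparts=8)
```

Each `(start, count)` pair could then seed a per-worker copy of the config with its own channel range.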
File Inventory¶
| File | Role |
|---|---|
| | Sub-model definitions, |
| | |
| | CLI; |
| | Accepts |
| | Dask worker functions; cube tasks accept config dicts, continuum tasks accept CASA bundles |
| | Cube engine; accepts |
| | Continuum engine; accepts |
| | |
| | Auto-generated reference YAML with all built-in default values |
| | VLASS continuum imaging preset |
| | Deprecated legacy |
Defaults Reference¶
The canonical defaults are the pydantic Field defaults on each
sub-model class in config.py. The file
src/pclean/configs/defaults.yaml is a machine-generated snapshot and
should be regenerated after changing any default:
pixi run -e dev gen-defaults