Parallelization Modes

Cube Mode (specmode='cube')

        flowchart LR
    subgraph Coord["Coordinator"]
        direction TB
        A[pclean] --> B[Partition<br/>channels]
        B --> C[Submit]
        G[Gather] --> H[Concat<br/>subcubes]
        H --> I[Final cube]
    end

    subgraph Workers["Dask Workers (embarrassingly parallel)"]
        direction TB

        subgraph W0["Worker 0 · ch 0-23"]
            direction LR
            W0a[setup] --> W0b[PSF] --> W0c[PB] --> W0d[Major 1] --> W0e[Converge?] --> W0f[Mask] --> W0g[Minor] --> W0h[Major 2] --> W0i[Done]
        end

        subgraph W1["Worker 1 · ch 24-47"]
            direction LR
            W1a[setup] --> W1b[PSF → PB → Major → Minor → Done]
        end

        subgraph W2["Worker 2 · ch 48-70"]
            direction LR
            W2a[setup] --> W2b[PSF → PB → Major → Minor → Done]
        end

        subgraph W3["Worker 3 · ch 71-93"]
            direction LR
            W3a[setup] --> W3b[PSF → PB → Major → Minor → Done]
        end

        subgraph W4["Worker 4 · ch 94-116"]
            direction LR
            W4a[setup] --> W4b[PSF → PB → Major → Minor → Done]
        end
    end

    C --> W0a
    C --> W1a
    C --> W2a
    C --> W3a
    C --> W4a
    W0i --> G
    W1b --> G
    W2b --> G
    W3b --> G
    W4b --> G

    style Coord fill:#e1f5fe
    style Workers fill:#c8e6c9
    style W0 fill:#a5d6a7
    style W1 fill:#a5d6a7
    style W2 fill:#a5d6a7
    style W3 fill:#a5d6a7
    style W4 fill:#a5d6a7
    

Continuum Mode (specmode='mfs')

        flowchart LR
    subgraph Init["Setup"]
        direction TB
        A[pclean] --> B[Partition<br/>rows] --> C[Create<br/>actors]
    end

    subgraph PSF["PSF (parallel)"]
        direction TB
        P0[Worker 0] ~~~ P1[Worker 1] ~~~ P2[Worker N]
    end

    subgraph PB["PB (parallel)"]
        direction TB
        PB0[Worker 0] ~~~ PB1[Worker 1] ~~~ PB2[Worker N]
    end

    subgraph Maj1["Major Cycle (parallel)"]
        direction TB
        M0[Worker 0<br/>grid] ~~~ M1[Worker 1<br/>grid] ~~~ M2[Worker N<br/>grid]
    end

    subgraph Loop["Iteration Loop (coordinator)"]
        direction TB
        POST[Gather +<br/>Normalize] --> MASK[setupMask]
        MASK --> CONV{Converged?}
        CONV -->|No| MINOR[Minor cycle<br/>serial deconv]
        MINOR --> PRE[Scatter<br/>model]
        PRE --> MAJ2
        CONV -->|Yes| RESTORE[Restore +<br/>PBcor]
    end

    subgraph MAJ2["Next Major (parallel)"]
        direction TB
        M20[Worker 0] ~~~ M21[Worker 1] ~~~ M22[Worker N]
    end

    C --> PSF
    PSF --> NORM1[Normalize<br/>PSF]
    NORM1 --> PB
    PB --> NORM2[Normalize<br/>PB]
    NORM2 --> Maj1
    Maj1 --> POST
    MAJ2 --> POST

    style Init fill:#e1f5fe
    style PSF fill:#c8e6c9
    style PB fill:#c8e6c9
    style Maj1 fill:#c8e6c9
    style MAJ2 fill:#c8e6c9
    style Loop fill:#fff9c4
    style MINOR fill:#ffecb3
    

Key Differences

Aspect

Cube

Continuum (MFS)

What’s parallel

Entire pipeline per subcube

Only gridding/degridding (major cycle)

Minor cycle

Parallel (per subcube)

Serial on coordinator

Communication

None (embarrassingly parallel)

Gather/scatter each major cycle

Partition axis

Frequency channels

Visibility rows

Final assembly

imageconcat of subcubes

Normalizer gathers partial images

Known Limitations

weighting='briggsbwtaper' in Parallel Cube Mode

The briggsbwtaper weighting scheme (CAS-13021) requires the fractional bandwidth of the full cube:

fracBW = 2 * (maxFreq - minFreq) / (maxFreq + minFreq)

In parallel cube mode each Dask worker images an independent sub-cube (often a single channel), so the C++ auto-computation of fracBW from the sub-cube’s spectral axis would produce 0.0 and fail.

Fix

The fracBW parameter needs to be exposed through the casatools Python binding (synthesisimager.setweighting(fracbw=...)), then pclean can pre-computes it from the full cube start/width/nchan before dispatching to workers. Each worker receives the correct full-bandwidth fracBW scalar alongside its independent per-channel Briggs density grid.

Requirements:

  • casatools must be rebuilt from the patched XML and C++ sources

  • start and width must be specified as frequency quantities (e.g. "100GHz") so the pre-computation can resolve them. If they are not parseable, fracBW falls back to 0.0 (auto-compute), which will still fail for single-channel sub-cubes.

Fallback workaround

Use weighting='briggs' (with perchanweightdensity=True, the default), which computes per-channel Briggs weights independently — this is compatible with per-channel parallelization but does not offer the improved imaging fidelity of off-axis sources for wide-bandwidth cubes.

pclean(
    ...
    weighting="briggs",   # not "briggsbwtaper"
    robust=0.5,
    perchanweightdensity=True,
    parallel=True,
    cube_chunksize=1,
)

CASA tclean reference

tclean itself also restricts briggsbwtaper:

  • Requires perchanweightdensity=True

  • Requires specmode='cube' (not 'mfs' or 'cont')

  • Requires npixels=0

See task_tclean.py lines 218–236 in casa6.

Cube Gridding Must Stay Enabled for nchan=1 Subcubes

In upstream CASA the C++ default is doingCubeGridding_p = True. For specmode='cube' the C++ guard condition is:

if ((itsMaxShape[3] > 1 || mode.contains("cube")) && doingCubeGridding_p)

This means cube-mode images always take the CubeMajorCycleAlgorithm path — even with nchan=1 — because mode.contains("cube") is true. The CubeMajorCycleAlgorithm runs a different gridding code path than the non-cube runMajorCycle, and also handles all PSF/residual normalization internally (gatherpsfweight, dividepsfbyweight, etc.).

pclean must NOT call setcubegridding(False) for single-channel subcubes. Disabling cube gridding switches to the non-cube runMajorCycle path, producing fundamentally different gridded visibilities and causing residual flux errors of 3×–21× compared to the result from the same data imaged with cube gridding enabled. This was verified empirically (2026-03-11) by comparing serial pclean (nchan=1, cube gridding disabled) against tclean (nchan=1, default cube gridding enabled).

This applies regardless of whether the MS is backed by standard CTDS or ADIOS2 storage managers. Cube gridding is always left at the C++ default (enabled).

The only place pclean calls setcubegridding(False) is in partition._resolve_frequency_grid(), which creates a throwaway tiny-image synthesisimager purely to resolve the spectral coordinate grid — not for science imaging.