YAML Configuration System

qvartools experiment scripts support a three-tier configuration system:

Built-in defaults — sensible values for all parameters
YAML config files — reproducible experiment configurations
CLI overrides — quick parameter adjustments

CLI arguments take precedence over YAML values, which take precedence over built-in defaults.
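
The precedence rule can be illustrated with a small sketch. This is not the qvartools implementation: `BUILTIN_DEFAULTS` and `resolve_config` are hypothetical names, and the sketch assumes that unset CLI options arrive as `None`.

```python
from collections import ChainMap

# Hypothetical built-in defaults; the values mirror the defaults listed in this document.
BUILTIN_DEFAULTS = {"molecule": "h2", "device": "auto", "teacher_weight": 0.5}


def resolve_config(yaml_values, cli_values):
    """Merge the three tiers. ChainMap searches its maps left to right,
    so the CLI tier wins over YAML, which wins over the defaults."""
    # Assumption: an unset CLI option is None and must not mask a lower tier.
    cli_set = {k: v for k, v in cli_values.items() if v is not None}
    return dict(ChainMap(cli_set, yaml_values, BUILTIN_DEFAULTS))


resolved = resolve_config(
    yaml_values={"molecule": "lih", "teacher_weight": 0.6},
    cli_values={"molecule": None, "teacher_weight": 0.7},
)
print(resolved)
```

Here `molecule` comes from the YAML tier, `teacher_weight` from the CLI tier, and `device` falls back to the built-in default.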

Using Config Files

Each pipeline group has a matching YAML config in experiments/pipelines/configs/. Pass it to a pipeline script with --config:

python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py \
    --config experiments/pipelines/configs/02_nf_dci.yaml

CLI Overrides

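Any parameter can be overridden on the command line. In the example below, the molecule is given as a positional argument and the flags are the hyphenated forms of the YAML keys (max_epochs becomes --max-epochs):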

# Use YAML config but override the molecule, max epochs, and teacher weight
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py lih \
    --config experiments/pipelines/configs/02_nf_dci.yaml \
    --max-epochs 200 \
    --teacher-weight 0.6
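
The command above uses hyphenated flags, while the YAML keys use underscores. A minimal sketch of how such a CLI could be parsed, assuming the scripts use the standard-library `argparse` (an assumption; the source does not say which parser is used). `build_parser` is a hypothetical helper, and only the flags shown in this document are declared.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the options shown in this document."""
    parser = argparse.ArgumentParser(description="Pipeline CLI sketch")
    # Every default is None so that "not given" can be told apart from a real value.
    parser.add_argument("molecule", nargs="?", default=None)
    parser.add_argument("--config", default=None)
    parser.add_argument("--max-epochs", type=int, default=None)
    parser.add_argument("--teacher-weight", type=float, default=None)
    return parser


args = build_parser().parse_args(
    ["lih",
     "--config", "experiments/pipelines/configs/02_nf_dci.yaml",
     "--max-epochs", "200",
     "--teacher-weight", "0.6"]
)

# argparse stores --max-epochs as args.max_epochs, which matches the YAML key style.
overrides = {k: v for k, v in vars(args).items() if k != "config" and v is not None}
print(overrides)
```

Keeping only the options that were actually supplied makes it straightforward to layer them over the YAML values.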

Available Config Files

File	Pipeline Group
`01_dci.yaml`	Direct-CI (HF+S+D) — no NF training
`02_nf_dci.yaml`	NF-trained + Direct-CI merged basis
`03_nf_dci_pt2.yaml`	NF + DCI + PT2 perturbative expansion
`04_nf_only.yaml`	NF-only basis (ablation, no DCI merge)
`05_hf_only.yaml`	HF-only reference state (baseline)
`06_iterative_nqs.yaml`	Iterative NQS sampling + diag
`07_iterative_nqs_dci.yaml`	NF+DCI merge then iterative NQS
`08_iterative_nqs_dci_pt2.yaml`	NF+DCI+PT2 then iterative NQS

Config File Structure

A typical YAML config file looks like this:

# ---- Molecule -----------------------------------------------
molecule: h2                  # Molecule identifier

# ---- Training loss weights ----------------------------------
teacher_weight: 0.5           # Teacher KL-divergence weight
physics_weight: 0.4           # Physics-informed energy weight
entropy_weight: 0.1           # Entropy regularization weight

# ---- Training parameters ------------------------------------
max_epochs: 400               # Maximum training epochs
min_epochs: 100               # Minimum before early stopping
samples_per_batch: 2000       # Samples per training batch

# ---- SKQD parameters ----------------------------------------
max_krylov_dim: 15            # Maximum Krylov dimension
shots_per_krylov: 100000      # Shots per Krylov vector

# ---- Hardware -----------------------------------------------
device: auto                  # auto, cpu, or cuda

All keys are flat (no nested sections). Keys use underscores and match the PipelineConfig field names where applicable.
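
The flat-key convention can be checked with a short sketch. `SketchConfig` is a hypothetical stand-in with a handful of fields, not the real PipelineConfig, and `check_flat_config` is an illustrative helper. The input is assumed to be a dict that has already been loaded from YAML.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchConfig:
    """Hypothetical stand-in for PipelineConfig with only a few fields."""
    molecule: str = "h2"
    teacher_weight: float = 0.5
    physics_weight: float = 0.4
    entropy_weight: float = 0.1
    device: str = "auto"


def check_flat_config(raw):
    """Reject nested sections, then split keys into those that match
    a SketchConfig field and those that do not."""
    nested = [k for k, v in raw.items() if isinstance(v, dict)]
    if nested:
        raise ValueError(f"nested sections are not allowed: {nested}")
    names = {f.name for f in fields(SketchConfig)}
    known = {k: v for k, v in raw.items() if k in names}
    unknown = sorted(set(raw) - names)
    return known, unknown


known, unknown = check_flat_config(
    {"molecule": "lih", "teacher_weight": 0.6, "max_krylov_dim": 15}
)
config = SketchConfig(**known)
print(config, unknown)
```

Keys without a matching field are returned rather than rejected, since the text says keys match field names only "where applicable".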

Parameter Reference

Common Parameters

Parameter	Default	Description
`molecule`	`h2`	Molecule identifier (h2, lih, beh2, h2o, nh3, ch4, n2, c2h4)
`device`	`auto`	PyTorch device: `auto` (detect GPU), `cpu`, `cuda`
`verbose`	`true`	Print detailed progress

Training Parameters

Parameter	Default	Description
`teacher_weight`	`0.5`	Weight for teacher KL-divergence loss
`physics_weight`	`0.4`	Weight for variational energy loss
`entropy_weight`	`0.1`	Weight for entropy regularization
`max_epochs`	auto-scaled	Maximum training epochs
`min_epochs`	auto-scaled	Minimum epochs before early stopping
`samples_per_batch`	auto-scaled	Samples drawn per training batch

SKQD Parameters

Parameter	Default	Description
`max_krylov_dim`	auto-scaled	Maximum Krylov subspace dimension
`shots_per_krylov`	auto-scaled	Shot budget per Krylov vector

SQD Parameters

Parameter	Default	Description
`sqd_num_batches`	auto-scaled	Number of SQD sample batches
`sqd_self_consistent_iters`	`5`	Self-consistent iteration count
`sqd_noise_rate`	auto-scaled	Bit-flip noise rate for shot simulation

Auto-Scaling

When a parameter is not specified in the config file or on the CLI, qvartools scales it automatically with the Hilbert-space size, measured as the number of valid configurations (determined by the molecule’s orbital and electron counts). Auto-scaling selects values for training epochs, sample counts, network sizes, and SKQD/SQD parameters.

Explicit values, whether from a config file or the CLI, always override auto-scaled defaults.
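
A rough sketch of the idea follows. It assumes the number of valid configurations is the count of determinants with fixed alpha and beta electron numbers, C(n_orbitals, n_alpha) × C(n_orbitals, n_beta). The thresholds and returned values in `auto_scale` are invented for illustration and are not the values qvartools uses. All function names are hypothetical.

```python
from math import comb


def num_configurations(n_orbitals, n_alpha, n_beta):
    """Count determinants with fixed numbers of alpha and beta electrons
    (assumed definition of 'valid configurations')."""
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


def auto_scale(n_configs):
    """Hypothetical scaling table: larger spaces get more epochs and samples."""
    if n_configs <= 1_000:
        return {"max_epochs": 200, "samples_per_batch": 1000}
    if n_configs <= 100_000:
        return {"max_epochs": 400, "samples_per_batch": 2000}
    return {"max_epochs": 800, "samples_per_batch": 5000}


def resolve(explicit, n_configs):
    """Start from auto-scaled values, then let explicit values override them."""
    params = auto_scale(n_configs)
    params.update(explicit)
    return params


# Example: 6 spatial orbitals with 2 alpha and 2 beta electrons.
n = num_configurations(6, 2, 2)
print(n, resolve({"max_epochs": 50}, n))
```

In this sketch an explicit `max_epochs` replaces the scaled value, while `samples_per_batch` keeps its auto-scaled default.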