Pipeline

The pipeline module provides the top-level orchestrator that ties all stages together.

Pipeline Config

class qvartools.pipeline_config.PipelineConfig(use_particle_conserving_flow=True, nqs_type='dense', nf_hidden_dims=<factory>, nqs_hidden_dims=<factory>, samples_per_batch=2000, num_batches=1, max_epochs=400, min_epochs=100, convergence_threshold=0.2, teacher_weight=1.0, physics_weight=0.0, entropy_weight=0.0, flow_lr=0.0005, nqs_lr=0.001, max_accumulated_basis=4096, use_diversity_selection=True, max_diverse_configs=2048, rank_2_fraction=0.5, use_residual_expansion=True, residual_iterations=8, residual_configs_per_iter=150, residual_threshold=1e-06, use_perturbative_selection=True, subspace_mode='classical_krylov', sqd_num_batches=5, sqd_batch_size=0, sqd_self_consistent_iters=3, sqd_spin_penalty=0.0, sqd_noise_rate=0.0, sqd_use_spin_symmetry=True, max_krylov_dim=15, time_step=0.1, shots_per_krylov=100000, skqd_regularization=1e-08, skip_skqd=False, auto_time_step=True, quantum_num_trotter_steps=1, quantum_total_evolution_time=3.14159, quantum_shots=100000, quantum_cudaq_target='nvidia', quantum_cudaq_option='fp64', use_local_energy=True, use_ci_seeding=False, use_davidson=True, davidson_threshold=500, skip_nf_training=False, device='cpu', max_connections_per_config=0, diagonal_only_warmup_epochs=0, stochastic_connections_fraction=1.0)

Bases: object

Hyperparameters for the flow-guided Krylov / SQD pipeline.

Supports three subspace diagonalization modes:

"classical_krylov": Classical exact time evolution (no Trotter error).
"skqd": Real SKQD via quantum circuit Trotterized evolution (CUDA-Q).
"sqd": IBM SQD sampling-based batch diagonalization.

When skip_nf_training is True, the pipeline operates in Direct-CI mode: it generates HF + singles + doubles without NF training, then proceeds directly to subspace diagonalization.
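The mode strings above, together with the "skqd_quantum" alias documented for the subspace_mode parameter, could be validated before a run along these lines. This is only a minimal sketch: normalize_subspace_mode and the two lookup tables are hypothetical helpers written for illustration, not part of qvartools.

```python
# Hypothetical helper for illustration; not part of qvartools.
CANONICAL_MODES = ("classical_krylov", "skqd", "sqd")
MODE_ALIASES = {"skqd_quantum": "skqd"}  # alias documented for subspace_mode


def normalize_subspace_mode(mode: str) -> str:
    """Map a documented mode string (or alias) to its canonical name."""
    canonical = MODE_ALIASES.get(mode, mode)
    if canonical not in CANONICAL_MODES:
        allowed = CANONICAL_MODES + tuple(MODE_ALIASES)
        raise ValueError(f"unknown subspace_mode {mode!r}; expected one of {allowed}")
    return canonical
```

Resolving the alias once, up front, keeps any later dispatch on the mode string down to three cases.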

Parameters:

use_particle_conserving_flow (bool) – Use a particle-conserving flow architecture (default True).
nf_hidden_dims (list of int) – Hidden-layer dimensions for the normalizing flow.
nqs_hidden_dims (list of int) – Hidden-layer dimensions for the NQS.
samples_per_batch (int) – Training samples drawn per batch.
num_batches (int) – Number of batches per training epoch.
max_epochs (int) – Maximum training epochs.
min_epochs (int) – Minimum training epochs before convergence check.
convergence_threshold (float) – Relative energy change for convergence.
teacher_weight (float) – Weight of the teacher (KL) loss term.
physics_weight (float) – Weight of the physics (energy) loss term.
entropy_weight (float) – Weight of the entropy regularization term.
flow_lr (float) – Learning rate for the normalizing flow.
nqs_lr (float) – Learning rate for the NQS.
max_accumulated_basis (int) – Hard limit on accumulated basis size.
use_diversity_selection (bool) – Apply diversity-aware selection to the basis.
max_diverse_configs (int) – Maximum configurations after diversity selection.
rank_2_fraction (float) – Fraction of the diversity budget for double excitations.
use_residual_expansion (bool) – Enable residual / perturbative basis expansion.
residual_iterations (int) – Number of residual expansion iterations.
residual_configs_per_iter (int) – Configurations added per residual iteration.
residual_threshold (float) – Minimum residual magnitude for inclusion.
use_perturbative_selection (bool) – Use CIPSI-style perturbative selection instead of residual.
subspace_mode (str) – Subspace diag backend: "classical_krylov" (default, exact time evolution), "skqd" (CUDA-Q Trotterized circuits), "skqd_quantum" (alias for "skqd"), or "sqd".
sqd_num_batches (int) – Number of random batches for SQD.
sqd_batch_size (int) – Configurations per SQD batch (0 = auto).
sqd_self_consistent_iters (int) – Self-consistent occupancy iterations for SQD.
sqd_spin_penalty (float) – Spin-penalty coefficient for SQD.
sqd_noise_rate (float) – Noise rate for SQD-Recovery mode (0 = clean).
sqd_use_spin_symmetry (bool) – Enable spin-symmetry enhancement in SQD.
max_krylov_dim (int) – Maximum Krylov dimension for SKQD.
time_step (float) – Time step for SKQD evolution.
shots_per_krylov (int) – Measurement shots per Krylov state.
skqd_regularization (float) – Tikhonov regularization for the SKQD overlap matrix.
skip_skqd (bool) – Skip SKQD and use direct diagonalization.
auto_time_step (bool) – Auto-compute time step from spectral range.
quantum_num_trotter_steps (int) – Trotter steps for quantum SKQD.
quantum_total_evolution_time (float) – Total evolution time for quantum SKQD.
quantum_shots (int) – Shots for quantum circuit measurements.
quantum_cudaq_target (str) – CUDA-Q target backend.
quantum_cudaq_option (str) – CUDA-Q precision option.
use_local_energy (bool) – Use local-energy estimator during training.
use_ci_seeding (bool) – Seed flow training with CI configurations.
use_davidson (bool) – Use Davidson eigensolver for large matrices.
davidson_threshold (int) – Basis size above which Davidson is preferred.
skip_nf_training (bool) – Skip NF training (Direct-CI mode).
device (str) – Torch device string.
max_connections_per_config (int) – Max Hamiltonian connections per config (0 = unlimited).
diagonal_only_warmup_epochs (int) – Epochs using diagonal-only Hamiltonian at start.
stochastic_connections_fraction (float) – Fraction of connections to sample stochastically.
nqs_type (str) – Type of NQS network (default 'dense').

Examples

>>> cfg = PipelineConfig(max_epochs=200, subspace_mode="sqd")
>>> cfg.subspace_mode
'sqd'
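The `<factory>` defaults in the signature are how a dataclass renders default_factory fields, which suggests (the source does not state it) that PipelineConfig is a dataclass. Under that assumption, the sketch below shows why list-valued fields use a factory and how a variant config can be derived without mutating the original. MiniConfig is a hypothetical stand-in, and its hidden-layer sizes are placeholders rather than the real defaults.

```python
from dataclasses import dataclass, field, replace


@dataclass
class MiniConfig:
    """Hypothetical stand-in mirroring a few PipelineConfig fields."""

    # Placeholder dimensions; the real factory defaults are not documented here.
    nf_hidden_dims: list[int] = field(default_factory=lambda: [64, 64])
    max_epochs: int = 400
    subspace_mode: str = "classical_krylov"


base = MiniConfig()

# Derive a variant with two overrides; `base` keeps its own scalar values.
variant = replace(base, max_epochs=200, subspace_mode="sqd")

# A fresh instance gets its own list from the factory, so this append
# does not leak into instances created later.
base.nf_hidden_dims.append(32)
fresh = MiniConfig()
```

Note that dataclasses.replace copies field values by reference, so `variant` and `base` still share the same list object; pass a new list explicitly if the variant needs independent dimensions.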

use_particle_conserving_flow: bool = True

nqs_type: str = 'dense'

nf_hidden_dims: list[int]

nqs_hidden_dims: list[int]

samples_per_batch: int = 2000

num_batches: int = 1

max_epochs: int = 400

min_epochs: int = 100

convergence_threshold: float = 0.2

teacher_weight: float = 1.0

physics_weight: float = 0.0

entropy_weight: float = 0.0

flow_lr: float = 0.0005

nqs_lr: float = 0.001

max_accumulated_basis: int = 4096

use_diversity_selection: bool = True

max_diverse_configs: int = 2048

rank_2_fraction: float = 0.5

use_residual_expansion: bool = True

residual_iterations: int = 8

residual_configs_per_iter: int = 150

residual_threshold: float = 1e-06

use_perturbative_selection: bool = True

subspace_mode: str = 'classical_krylov'

sqd_num_batches: int = 5

sqd_batch_size: int = 0

sqd_self_consistent_iters: int = 3

sqd_spin_penalty: float = 0.0

sqd_noise_rate: float = 0.0

sqd_use_spin_symmetry: bool = True

max_krylov_dim: int = 15

time_step: float = 0.1

shots_per_krylov: int = 100000

skqd_regularization: float = 1e-08

skip_skqd: bool = False

auto_time_step: bool = True

quantum_num_trotter_steps: int = 1

quantum_total_evolution_time: float = 3.14159

quantum_shots: int = 100000

quantum_cudaq_target: str = 'nvidia'

quantum_cudaq_option: str = 'fp64'

use_local_energy: bool = True

use_ci_seeding: bool = False

use_davidson: bool = True

davidson_threshold: int = 500

skip_nf_training: bool = False

device: str = 'cpu'

max_connections_per_config: int = 0

diagonal_only_warmup_epochs: int = 0

stochastic_connections_fraction: float = 1.0

adapt_to_system_size(n_valid_configs, verbose=True)

Return a new config with parameters scaled for the given system size.

Classifies the Hilbert-space size into four tiers and adjusts the basis limits and NQS network dimensions accordingly. For small and medium systems, training hyperparameters (samples, epochs, batches, learning rates) and SKQD parameters (krylov_dim, shots) are preserved at their paper-aligned defaults to ensure accuracy.

The adaptation strategy follows the original Flow-Guided-Krylov branches: only basis capacity, NQS capacity, and (for very large systems) epoch/sample budgets are adjusted.

Parameters:

n_valid_configs (int) – Number of valid (particle-conserving) configurations in the Hilbert space. For spin systems, use 2**num_sites.
verbose (bool) – If True, print adaptation diagnostics.

Returns:

A new config with scaled hyperparameters.

Return type:

PipelineConfig

Examples

>>> cfg = PipelineConfig()
>>> adapted = cfg.adapt_to_system_size(500, verbose=False)
>>> adapted.max_diverse_configs <= 500
True
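The behaviour checked by the example above (basis limits never exceeding the number of valid configurations, with the result returned as a new object) can be sketched as follows. BasisLimits, size_tier, and cap_to_system_size are hypothetical names, and the tier boundaries are illustrative values chosen for this sketch; the source only states that there are four tiers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BasisLimits:
    """Hypothetical stand-in for the basis-related PipelineConfig fields."""

    max_accumulated_basis: int = 4096
    max_diverse_configs: int = 2048


def size_tier(n_valid_configs: int) -> str:
    """Classify the Hilbert-space size; the boundaries here are illustrative."""
    if n_valid_configs <= 1_000:
        return "small"
    if n_valid_configs <= 100_000:
        return "medium"
    if n_valid_configs <= 10_000_000:
        return "large"
    return "very_large"


def cap_to_system_size(cfg: BasisLimits, n_valid_configs: int) -> BasisLimits:
    """Return a new config whose basis limits never exceed the Hilbert space."""
    return replace(
        cfg,
        max_accumulated_basis=min(cfg.max_accumulated_basis, n_valid_configs),
        max_diverse_configs=min(cfg.max_diverse_configs, n_valid_configs),
    )
```

Returning a copy rather than mutating in place matches the documented contract ("Return a new config") and leaves the caller's original settings available for comparison.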

FlowGuidedKrylovPipeline

class qvartools.pipeline.FlowGuidedKrylovPipeline(hamiltonian, config=None, exact_energy=None, auto_adapt=True)

Bases: object

Main orchestrator for the flow-guided Krylov / SQD pipeline.

Supports three subspace diagonalization modes via config.subspace_mode:

"classical_krylov": Classical exact time evolution (no Trotter error).
"skqd": Real SKQD via quantum circuit Trotterized evolution (CUDA-Q).
"sqd": IBM SQD sampling-based batch diagonalization.

When config.skip_nf_training is True, operates in Direct-CI mode: generates HF + singles + doubles deterministically, then proceeds directly to subspace diagonalization.

Parameters:

hamiltonian (MolecularHamiltonian or Hamiltonian) – The system Hamiltonian.
config (PipelineConfig) – Pipeline hyperparameters.
exact_energy (float or None, optional) – Known exact (FCI) energy for error reporting.
auto_adapt (bool, optional) – If True, automatically scale the config to the system size.

Examples

>>> pipeline = FlowGuidedKrylovPipeline(hamiltonian, PipelineConfig())
>>> results = pipeline.run(progress=False)
>>> "final_energy" in results
True

train_flow_nqs(progress=True)

Stage 1: Physics-guided joint training of the flow and NQS.

If config.skip_nf_training is True, generates essential configs (HF + singles + doubles) directly without NF training.

Parameters: progress (bool, optional) – If True (default), log training progress.
Returns: Training history with loss/energy lists, or {"energies": [], "skipped": True} in Direct-CI mode.
Return type: dict
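To make the Direct-CI set concrete, the sketch below enumerates the Hartree-Fock (HF) configuration plus all single and double excitations as occupation tuples. It is a simplified illustration, not the qvartools routine: hf_singles_doubles is a hypothetical name, and spin is ignored, so the orbitals are treated as one flat list with the lowest ones occupied in the HF reference.

```python
from itertools import combinations


def hf_singles_doubles(n_orbitals: int, n_electrons: int) -> list[tuple[int, ...]]:
    """Enumerate HF plus single and double excitations as occupation tuples.

    Simplification: spin is ignored, so orbitals form one flat list.
    """
    occupied = range(n_electrons)             # HF fills the lowest orbitals
    virtual = range(n_electrons, n_orbitals)  # everything above is empty
    hf = tuple(1 if i < n_electrons else 0 for i in range(n_orbitals))

    configs = [hf]
    for rank in (1, 2):  # rank 1 = singles, rank 2 = doubles
        for holes in combinations(occupied, rank):
            for particles in combinations(virtual, rank):
                cfg = list(hf)
                for i in holes:
                    cfg[i] = 0  # remove an electron from an occupied orbital
                for a in particles:
                    cfg[a] = 1  # place it in a virtual orbital
                configs.append(tuple(cfg))
    return configs
```

Every generated tuple keeps the electron count fixed, which is the property that lets such a set seed a particle-conserving basis without any training.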

extract_and_select_basis()

Stage 2: Extract the accumulated basis and apply diversity selection.

In Direct-CI mode, uses essential configs directly.

Returns: Selected basis configurations.
Return type: torch.Tensor
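One way to picture a rank-aware budget such as rank_2_fraction is shown below. This is a loose sketch under stated assumptions, not the selection algorithm used by qvartools: excitation_rank and select_with_rank_budget are hypothetical names, configurations are plain occupation tuples, and the input order is treated as the priority order.

```python
def excitation_rank(config, reference):
    """Number of electrons moved relative to the reference occupation."""
    return sum(c != r for c, r in zip(config, reference)) // 2


def select_with_rank_budget(configs, reference, budget, rank_2_fraction=0.5):
    """Reserve a share of the budget for doubles, then fill with the rest.

    Assumption of this sketch: input order is the priority order.
    """
    n_doubles = int(budget * rank_2_fraction)
    doubles = [c for c in configs if excitation_rank(c, reference) == 2][:n_doubles]
    others = [c for c in configs if excitation_rank(c, reference) != 2]
    # Any budget not used by doubles is handed to the remaining configurations.
    return others[: budget - len(doubles)] + doubles
```

With fewer doubles available than the reserved share, the unused slots fall through to the other configurations, so the total never exceeds the budget.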

run_subspace_diag(progress=True)

Stage 3: Subspace diagonalization via SKQD, SKQD-Quantum, or SQD.

Routes to the appropriate backend based on config.subspace_mode.

Parameters: progress (bool, optional) – If True (default), log diagonalization progress.
Returns: Backend-specific results dictionary. Always populates self.results["combined_energy"].
Return type: dict
Raises: RuntimeError – If no basis is available (call extract_and_select_basis() first).
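For the "classical_krylov" mode, the idea of exact time evolution followed by a regularized projected eigenproblem can be sketched with NumPy alone. This is an illustration under stated assumptions, not the qvartools backend: krylov_ground_energy is a hypothetical name, the Hamiltonian is a small dense matrix, and the Tikhonov shift (in the spirit of skqd_regularization) is applied by adding reg to the diagonal of the overlap matrix.

```python
import numpy as np


def krylov_ground_energy(H, psi0, krylov_dim=5, time_step=0.1, reg=1e-8):
    """Estimate the ground energy in a time-evolved Krylov subspace."""
    # Exact propagator U = exp(-i H dt), built from the eigendecomposition of H.
    evals, evecs = np.linalg.eigh(H)
    U = (evecs * np.exp(-1j * evals * time_step)) @ evecs.conj().T

    # Krylov states psi0, U psi0, U^2 psi0, ... stored as columns of V.
    states = [np.asarray(psi0, dtype=complex)]
    for _ in range(krylov_dim - 1):
        states.append(U @ states[-1])
    V = np.stack(states, axis=1)

    # Projected Hamiltonian and overlap matrix in the non-orthogonal basis.
    H_k = V.conj().T @ H @ V
    S = V.conj().T @ V

    # Tikhonov regularization, then whitening so that X^dagger S_reg X = I.
    s_vals, s_vecs = np.linalg.eigh(S + reg * np.eye(krylov_dim))
    X = s_vecs / np.sqrt(s_vals)
    H_orth = X.conj().T @ H_k @ X
    H_orth = (H_orth + H_orth.conj().T) / 2  # remove rounding asymmetry

    return float(np.linalg.eigvalsh(H_orth)[0])
```

When the Krylov vectors are nearly linearly dependent, the overlap matrix becomes ill-conditioned; the small diagonal shift keeps the whitening step finite at the cost of a slight bias in the energy.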

run_residual_expansion(basis)

Expand the basis via residual or perturbative selection.

This is the legacy Stage 3 from the 4-stage pipeline. When using run() or run_subspace_diag(), this is NOT called – those methods route directly to the appropriate subspace diag backend.

Parameters: basis (torch.Tensor) – Current basis configurations, shape (n_basis, num_sites).
Returns: Expanded basis configurations on config.device.
Return type: torch.Tensor
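A single residual-expansion step can be illustrated on a small dense matrix. The sketch below is a generic residual-based selection, written under assumptions and not taken from the qvartools source: residual_candidates is a hypothetical name, configurations are identified by their row index in H, and the two keyword arguments play the roles of residual_configs_per_iter and residual_threshold.

```python
import numpy as np


def residual_candidates(H, basis_idx, configs_per_iter=150, threshold=1e-6):
    """Pick out-of-basis configurations with the largest residual magnitude."""
    basis_idx = np.asarray(basis_idx)

    # Ground state of H restricted to the current basis.
    evals, evecs = np.linalg.eigh(H[np.ix_(basis_idx, basis_idx)])
    energy, coeffs = evals[0], evecs[:, 0]

    # Residual r = (H - E) psi. Outside the basis psi is zero, so r = H psi there.
    outside = np.setdiff1d(np.arange(H.shape[0]), basis_idx)
    residual = np.abs(H[np.ix_(outside, basis_idx)] @ coeffs)

    # Keep the top candidates, then drop those below the threshold.
    top = np.argsort(-residual)[:configs_per_iter]
    top = top[residual[top] > threshold]
    return energy, outside[top]
```

Iterating this step (re-diagonalize, rank by residual, append) grows the basis only in directions the current eigenvector actually couples to.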

run(progress=True)

Execute the complete pipeline.

Runs training (or Direct-CI), basis extraction, and subspace diagonalization in sequence.

Parameters: progress (bool, optional) – If True (default), log progress for all stages.
Returns: Aggregated results dictionary with keys including "final_energy", "nf_basis_size", "error_mha" (when exact_energy is set), and stage-specific sub-dicts.
Return type: dict
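The stage ordering described for run(), and the documented RuntimeError when diagonalization is attempted without a basis, can be mimicked with a toy class. StagedPipeline is a hypothetical stand-in with placeholder stage bodies; the "training" key and the choice to copy "combined_energy" into "final_energy" are assumptions of this sketch, and the energy value is a dummy number.

```python
class StagedPipeline:
    """Hypothetical stand-in showing the stage order; not the qvartools class."""

    def __init__(self):
        self.basis = None
        self.results = {}

    def train_flow_nqs(self):
        # Stage 1 placeholder: report the documented Direct-CI history shape.
        return {"energies": [], "skipped": True}

    def extract_and_select_basis(self):
        # Stage 2 placeholder: a fixed two-configuration basis.
        self.basis = [(1, 1, 0, 0), (0, 0, 1, 1)]
        return self.basis

    def run_subspace_diag(self):
        # Stage 3 refuses to run without a basis, as documented.
        if self.basis is None:
            raise RuntimeError("no basis; call extract_and_select_basis() first")
        self.results["combined_energy"] = -1.0  # dummy value, not a real energy
        return self.results

    def run(self):
        history = self.train_flow_nqs()
        basis = self.extract_and_select_basis()
        diag = self.run_subspace_diag()
        return {
            "training": history,  # assumed key name
            "nf_basis_size": len(basis),
            "final_energy": diag["combined_energy"],  # assumed mapping
        }
```

Calling the stages individually is useful for inspection, but the guard in stage 3 shows why the order matters: the basis must exist before any backend is invoked.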

Convenience Functions

qvartools.pipeline.run_molecular_benchmark(molecule, config=None, verbose=True)

Load a molecule from the registry and run the full pipeline.

Parameters:

molecule (str) – Molecule name (case-insensitive). Must be a key in MOLECULE_REGISTRY.
config (PipelineConfig or None, optional) – Pipeline hyperparameters. If None, uses defaults.
verbose (bool, optional) – If True (default), print a summary to stdout.

Returns:

Pipeline results dictionary (see FlowGuidedKrylovPipeline.run()).

Return type:

dict

Examples

>>> results = run_molecular_benchmark("H2")
>>> "final_energy" in results
True