Profiling API
Memory Profiling
Memory measurement infrastructure for QPE circuit compilation profiling.
- Provides multi-layer memory measurement:
resource.getrusage(RUSAGE_SELF/RUSAGE_CHILDREN) → process lifetime peak RSS
/proc/self/status VmRSS/VmPeak → current RSS / peak virtual memory size (Linux kernel level)
tracemalloc → Python heap only
This module has zero project dependencies (stdlib only).
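As a rough illustration of the layers listed above, the sketch below takes one reading from each using only the standard library. It is not the module's implementation: the helper name sample_layers and the returned keys are hypothetical, and the unit handling assumes Linux (ru_maxrss in KiB) or macOS (ru_maxrss in bytes).

```python
import resource
import sys
import tracemalloc
from pathlib import Path

MB = 1024 * 1024


def _maxrss_mb(who: int) -> float:
    """Lifetime peak RSS in MB; ru_maxrss is KiB on Linux, bytes on macOS."""
    raw = resource.getrusage(who).ru_maxrss
    return raw / MB if sys.platform == "darwin" else raw / 1024


def _vmrss_mb() -> float:
    """Current RSS from /proc/self/status, or 0.0 where /proc is unavailable."""
    status = Path("/proc/self/status")
    if not status.exists():
        return 0.0
    for line in status.read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1]) / 1024  # the kernel reports kB
    return 0.0


def sample_layers() -> dict[str, float]:
    """Hypothetical helper: one reading from each measurement layer."""
    current, peak = tracemalloc.get_traced_memory()  # (0, 0) when not tracing
    return {
        "maxrss_mb": _maxrss_mb(resource.RUSAGE_SELF),
        "maxrss_children_mb": _maxrss_mb(resource.RUSAGE_CHILDREN),
        "rss_mb": _vmrss_mb(),
        "tracemalloc_current_mb": current / MB,
        "tracemalloc_peak_mb": peak / MB,
    }


tracemalloc.start()
payload = [0] * 1_000_000  # a Python-heap allocation that tracemalloc can see
print(sample_layers())
tracemalloc.stop()
```

The tracemalloc figures cover only allocations made through Python's allocator, so they can be far smaller than the RSS figures when native code allocates memory.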
- class q2m3.profiling.memory.MemorySnapshot(label, rss_mb, vm_peak_mb, maxrss_mb, maxrss_children_mb, tracemalloc_peak_mb, tracemalloc_current_mb, elapsed_s=0.0)
Bases: object
Multi-layer memory snapshot at a single point in time.
- Parameters:
label (str)
rss_mb (float)
vm_peak_mb (float)
maxrss_mb (float)
maxrss_children_mb (float)
tracemalloc_peak_mb (float)
tracemalloc_current_mb (float)
elapsed_s (float)
- class q2m3.profiling.memory.ProfileResult(molecule, n_system_qubits, n_estimation_wires, n_trotter, n_terms, ir_scale, mode='dynamic', phase_a=None, phase_b=None, phase_c=None, timeline_peak_mb=0.0, timeline_samples=<factory>, ir_analysis=<factory>, prob_sum=0.0, error=None)
Bases: object
Complete profiling result for one parameter combination.
- Parameters:
molecule (str)
n_system_qubits (int)
n_estimation_wires (int)
n_trotter (int)
n_terms (int)
ir_scale (int)
mode (str)
phase_a (MemorySnapshot | None)
phase_b (MemorySnapshot | None)
phase_c (MemorySnapshot | None)
timeline_peak_mb (float)
timeline_samples (list)
ir_analysis (list)
prob_sum (float)
error (str | None)
- q2m3.profiling.memory.read_proc_status(pid='self')
Parse /proc/[pid]/status for VmRSS and VmPeak (Linux only).
- Parameters:
pid (int | str) – Process ID to read, or “self” for current process.
- Return type:
dict[str, float]
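The parsing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual source: the names parse_status_text and read_status are hypothetical, and the keys of the real return value may differ. Values in /proc/[pid]/status are reported in kB and are converted to MB here.

```python
from pathlib import Path


def parse_status_text(text: str, keys=("VmRSS", "VmPeak")) -> dict[str, float]:
    """Extract selected fields (reported in kB) and convert them to MB."""
    values: dict[str, float] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys:
            values[name] = int(rest.split()[0]) / 1024
    return values


def read_status(pid="self") -> dict[str, float]:
    """Hypothetical wrapper: returns an empty dict where /proc is unavailable."""
    try:
        return parse_status_text(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        return {}


sample = "Name:\tpython3\nVmPeak:\t  204800 kB\nVmRSS:\t   51200 kB\n"
print(parse_status_text(sample))  # {'VmPeak': 200.0, 'VmRSS': 50.0}
```

Keeping the text parser separate from the file read makes the conversion testable on platforms without /proc.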
- q2m3.profiling.memory.read_smaps_rollup(pid='self')
Parse /proc/[pid]/smaps_rollup to categorize RSS into memory types.
- Returns dict with MB values for:
Rss: total resident
Pss: proportional share
Anonymous: heap + anonymous mmap (C++ allocations, LLVM JIT code)
LazyFree, AnonHugePages, etc.
- Parameters:
pid (int | str)
- Return type:
dict[str, float]
- q2m3.profiling.memory.take_snapshot(label)
Take a comprehensive memory snapshot from all measurement layers.
Captures both RUSAGE_SELF (Python process) and RUSAGE_CHILDREN (catalyst subprocess) to detect the measurement blind spot where MLIR→LLVM compilation happens in a child process.
- Parameters:
label (str)
- Return type:
MemorySnapshot
- class q2m3.profiling.memory.MemoryTimeline(interval_s=0.1)
Bases: object
Background daemon thread sampling /proc/self/status VmRSS every interval_s seconds (100 ms by default).
Also captures /proc/self/smaps_rollup at peak RSS to categorize memory.
- Parameters:
interval_s (float)
- property peak_smaps: dict[str, float]
Memory categorization captured at peak RSS moment.
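The sampling pattern described here (a daemon thread that polls a reading at a fixed interval and remembers the peak) might look roughly like the sketch below. The class TimelineSketch and its attributes are hypothetical, and a fake reader replaces the VmRSS lookup so that the example runs on any platform.

```python
import threading
import time
from typing import Callable


class TimelineSketch:
    """Hypothetical stand-in: polls `read_mb` on a daemon thread, tracks the peak."""

    def __init__(self, read_mb: Callable[[], float], interval_s: float = 0.1):
        self._read_mb = read_mb
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples: list[tuple[float, float]] = []  # (elapsed_s, value_mb)
        self.peak_mb = 0.0

    def _run(self) -> None:
        start = time.monotonic()
        while not self._stop.is_set():
            value = self._read_mb()
            self.samples.append((time.monotonic() - start, value))
            self.peak_mb = max(self.peak_mb, value)
            self._stop.wait(self._interval_s)  # sleeps, but wakes early on stop()

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()


# Fake readings stand in for VmRSS values.
readings = iter([10.0, 42.0, 17.0])
timeline = TimelineSketch(lambda: next(readings, 17.0), interval_s=0.01)
timeline.start()
time.sleep(0.1)
timeline.stop()
print(timeline.peak_mb, len(timeline.samples))
```

A sampler like this can miss spikes shorter than the polling interval, which is one reason a kernel-tracked high-water mark is a useful cross-check.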
- class q2m3.profiling.memory.ParentSideMonitor(pid, interval_s=0.1)
Bases: object
Monitor a child process’s RSS from the parent side via /proc/PID/status.
This provides an independent measurement that validates the child’s self-reported ru_maxrss. Critical for detecting measurement blind spots where getrusage misses certain allocation patterns (mmap, LLVM JIT, etc.).
- Parameters:
pid (int)
interval_s (float)
- property peak_hwm_mb: float
VmHWM: kernel-tracked high-water mark for RSS (most reliable).
Timing Utilities
General-purpose timing utilities for q2m3.
Provides lightweight profiling tools for measuring code section and function execution times. These utilities are independent of QPE-specific profiling and can be used anywhere in the codebase.
- q2m3.profiling.timing.profile_section(name, verbose=True)
Context manager for profiling a code section.
- Parameters:
name (str) – Name of the section being profiled
verbose (bool) – Whether to print timing information
- Yields:
dict – A timing info dictionary that gets updated with elapsed time
Example:
    with profile_section("QPE calculation") as timing:
        result = run_qpe()
    print(f"Elapsed: {timing['elapsed']:.3f}s")
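For readers curious how a context manager can yield a dictionary that is filled in on exit, here is a minimal sketch. The name section_timer is hypothetical and this is not the library's source; only the "elapsed" key is taken from the example above, and the "name" key is an assumption.

```python
import time
from contextlib import contextmanager


@contextmanager
def section_timer(name: str, verbose: bool = True):
    """Hypothetical stand-in for profile_section: yields a dict filled in on exit."""
    timing = {"name": name, "elapsed": None}
    start = time.perf_counter()
    try:
        yield timing
    finally:
        # Runs even if the body raises, so the elapsed time is always recorded.
        timing["elapsed"] = time.perf_counter() - start
        if verbose:
            print(f"[{name}] {timing['elapsed']:.3f}s")


with section_timer("sleep", verbose=False) as timing:
    time.sleep(0.01)
print(f"Elapsed: {timing['elapsed']:.3f}s")
```

Because the dictionary is updated in the finally clause, its "elapsed" entry is only meaningful after the with block has finished.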
- q2m3.profiling.timing.profile_function(func=None, *, verbose=True)
Decorator for profiling a function’s execution time.
- Parameters:
func (Callable) – The function to profile
verbose (bool) – Whether to print timing information
- Returns:
Wrapped function that tracks execution time
- Return type:
Callable
Example:
    @profile_function
    def compute_energy():
        ...

    @profile_function(verbose=False)
    def silent_compute():
        ...
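The signature (func=None, *, verbose=True) is what lets the decorator be used both bare and with arguments. A sketch of that pattern, assuming nothing about the real implementation, is shown below; the name timed and the last_elapsed attribute are hypothetical.

```python
import functools
import time


def timed(func=None, *, verbose=True):
    """Hypothetical stand-in for profile_function, usable bare or with arguments."""
    if func is None:
        # Called as @timed(verbose=...): return a decorator awaiting the function.
        return functools.partial(timed, verbose=verbose)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            wrapper.last_elapsed = time.perf_counter() - start
            if verbose:
                print(f"[{func.__name__}] {wrapper.last_elapsed:.3f}s")

    wrapper.last_elapsed = None  # assumed attribute, for illustration only
    return wrapper


@timed
def compute_energy():
    return -1.137


@timed(verbose=False)
def silent_compute():
    return 42


print(compute_energy(), silent_compute())
```

Making verbose keyword-only avoids ambiguity: a single positional argument is always the function being decorated.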
- class q2m3.profiling.timing.ProfilingStats(name='default')
Bases: object
Accumulator for profiling statistics across multiple runs.
- Parameters:
name (str)
- property total: float
Total elapsed time.
- property count: int
Number of recorded timings.
- property mean: float
Mean elapsed time.
- property min: float
Minimum elapsed time.
- property max: float
Maximum elapsed time.
Maximum elapsed time.
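The documented properties suggest a simple accumulator. The sketch below mirrors them under stated assumptions: the class name StatsSketch, the record method, and the choice to return 0.0 for an empty accumulator are all hypothetical, since the page does not show how timings are added.

```python
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Hypothetical accumulator mirroring the documented properties."""

    name: str = "default"
    timings: list[float] = field(default_factory=list)

    def record(self, elapsed: float) -> None:  # method name is an assumption
        self.timings.append(elapsed)

    @property
    def total(self) -> float:
        return sum(self.timings)

    @property
    def count(self) -> int:
        return len(self.timings)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.timings else 0.0

    @property
    def min(self) -> float:
        return min(self.timings, default=0.0)

    @property
    def max(self) -> float:
        return max(self.timings, default=0.0)


stats = StatsSketch("qpe")
for elapsed in (0.5, 1.5, 1.0):
    stats.record(elapsed)
print(stats.count, stats.total, stats.mean, stats.min, stats.max)  # 3 3.0 1.0 0.5 1.5
```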
Catalyst IR Analysis
Catalyst IR stage analysis utilities.
WARNING: ir_output_dir() uses os.chdir() which modifies global process state. It is NOT thread-safe. Do not call concurrently from multiple threads. Must be used as the outermost context manager wrapping @qjit decoration.
- q2m3.profiling.catalyst_ir.ir_output_dir(path=None)
Manage Catalyst IR workspace directory.
Must wrap @qjit decoration (not just the call), because Catalyst captures os.getcwd() at decoration time to determine the IR output location.
- Parameters:
path (str | None) – If provided, IR files are written here and preserved after exit. If None, a temporary directory is used and auto-cleaned on exit.
- Yields:
The workspace directory path (user-specified or auto-created tempdir).
- Return type:
Generator[str, None, None]
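To make the decoration-time requirement concrete, the sketch below pairs a chdir-based context manager with a stand-in decorator that records os.getcwd() when it is applied. Both workspace_dir and remembers_cwd are hypothetical; the latter only imitates the behaviour attributed to @qjit above and does not use Catalyst.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def workspace_dir(path=None):
    """Hypothetical stand-in for ir_output_dir: chdir in, restore on exit."""
    target = os.path.abspath(path) if path else tempfile.mkdtemp(prefix="ir_")
    os.makedirs(target, exist_ok=True)
    previous = os.getcwd()
    os.chdir(target)  # global process state, hence not thread-safe
    try:
        yield target
    finally:
        os.chdir(previous)
        if path is None:
            shutil.rmtree(target, ignore_errors=True)  # temp dirs are cleaned up


def remembers_cwd(func):
    """Stand-in for a decorator that records os.getcwd() when it is applied."""
    func.workspace = os.getcwd()
    return func


def circuit():
    return "ok"


with workspace_dir() as ws:
    compiled = remembers_cwd(circuit)  # decoration happens inside the context

print(os.path.realpath(compiled.workspace) == os.path.realpath(ws))
```

Had the decorator been applied before entering the context, the recorded directory would be the caller's original working directory, which is why only wrapping the call is not enough.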
- q2m3.profiling.catalyst_ir.analyze_ir_stages(compiled_fn, stages=None)
Export IR text from each compilation stage and measure size/lines.
- Parameters:
compiled_fn (Any) – A Catalyst @qjit compiled function.
stages (list[str] | None) – Override list of stage names. If None, uses COMPILATION_STAGES.
- Returns:
List of (stage_name, size_kb, n_lines). Empty list if Catalyst is unavailable or all stages fail.
- Return type:
list[tuple[str, float, int]]
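The shape of the return value can be illustrated without Catalyst. In the sketch below, measure_stages is a hypothetical helper, the stage names are made up, and size is assumed to mean the UTF-8 byte length divided by 1024; the real function may measure differently.

```python
from typing import Callable


def measure_stages(
    get_ir_text: Callable[[str], str], stages: list[str]
) -> list[tuple[str, float, int]]:
    """Hypothetical helper: size and line count per stage, skipping failures."""
    results = []
    for stage in stages:
        try:
            text = get_ir_text(stage)
        except Exception:
            continue  # a stage that cannot be exported is left out
        size_kb = len(text.encode("utf-8")) / 1024
        results.append((stage, size_kb, len(text.splitlines())))
    return results


# Fake IR text stands in for real compiler output.
fake_ir = {"stage_one": "module {\n}\n", "stage_two": "x" * 2048}
print(measure_stages(fake_ir.__getitem__, ["stage_one", "missing", "stage_two"]))
```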
QPE Profiling
Three-phase QPE compilation profiling workflow.
- Provides profile_* functions for measuring memory usage and timing across:
Phase A: PySCF → PennyLane Hamiltonian construction
Phase B: @qjit QPE circuit compilation (H_dynamic and H_fixed modes)
Phase C: Repeated execution of already-compiled circuit
Catalyst is an optional dependency. profile_qjit_compilation[_fixed] raise ImportError at call time if catalyst is not installed.
- q2m3.profiling.qpe_profiler.profile_hamiltonian_build(mol, n_est, n_trotter)
Profile Phase A: PySCF → PennyLane Hamiltonian construction.
- Returns:
Tuple of (snapshot, ops, coeffs, hf_state, circuit_params)
- Parameters:
mol (MoleculeConfig)
n_est (int)
n_trotter (int)
- Return type:
tuple[MemorySnapshot, list, list[float], ndarray, dict]
- q2m3.profiling.qpe_profiler.profile_qjit_compilation(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)
Profile Phase B: H_dynamic mode @qjit QPE circuit compilation.
This is the critical phase — first call triggers MLIR→LLVM compilation.
IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time to determine the IR workspace location.
- Parameters:
ops (list) – PennyLane operators used to build the runtime Hamiltonian.
coeffs (list[float]) – Hamiltonian coefficients supplied at runtime.
hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.
circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().
ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.
keep_intermediate (bool) – If True, retain all 6 IR stages in memory for analysis. If False, only final stage is kept (tests memory impact).
- Returns:
Tuple of (snapshot, timeline, ir_analysis, compiled_fn)
- Return type:
tuple[MemorySnapshot, MemoryTimeline, list, Any]
- q2m3.profiling.qpe_profiler.profile_qjit_compilation_fixed(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)
Profile Phase B for H_fixed mode: Hamiltonian built OUTSIDE @qjit.
Coefficients are Python floats → Catalyst can constant-fold them into MLIR. The compiled function takes zero arguments.
IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time.
- Parameters:
ops (list) – PennyLane operators used to build the fixed Hamiltonian.
coeffs (list[float]) – Hamiltonian coefficients embedded as Python floats before @qjit.
hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.
circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().
ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.
keep_intermediate (bool) – If True, retain all 6 IR stages for analysis.
- Returns:
Tuple of (snapshot, timeline, ir_analysis, compiled_fn)
- Return type:
tuple[MemorySnapshot, MemoryTimeline, list, Any]
- q2m3.profiling.qpe_profiler.profile_execution(compiled_fn, coeffs, n_calls=5, is_fixed=False)
Profile Phase C: repeated execution of already-compiled circuit.
- Returns:
Tuple of (snapshot, prob_sum_from_last_call)
- Parameters:
compiled_fn (Any)
coeffs (list[float])
n_calls (int)
is_fixed (bool)
- Return type:
tuple[MemorySnapshot, float]
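The role of is_fixed can be pictured with plain Python callables in place of compiled circuits. This is a sketch under assumptions, not the profiler's code: run_repeatedly is a hypothetical name, the fixed-mode function is called with no arguments (as stated for H_fixed mode), and the dynamic-mode function is assumed to receive the coefficients directly.

```python
def run_repeatedly(compiled_fn, coeffs, n_calls=5, is_fixed=False):
    """Hypothetical Phase C loop: returns the probability sum of the last call."""
    prob_sum = 0.0
    for _ in range(n_calls):
        # Assumption: fixed mode takes no arguments, dynamic mode takes coeffs.
        probs = compiled_fn() if is_fixed else compiled_fn(coeffs)
        prob_sum = float(sum(probs))
    return prob_sum


# Plain Python callables stand in for @qjit-compiled circuits.
def fixed_circuit():
    return [0.25, 0.75]


def dynamic_circuit(coeffs):
    return [0.5, 0.5]


print(run_repeatedly(fixed_circuit, [], is_fixed=True))  # 1.0
print(run_repeatedly(dynamic_circuit, [0.1, -0.2]))      # 1.0
```

A probability sum close to 1.0 is presumably what makes the returned value useful as a sanity check on the compiled circuit.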
Profiling Orchestrator
Subprocess orchestration and parameter sweep for QPE compilation profiling.
Provides functions to run profiling passes in isolated subprocesses (avoiding ru_maxrss accumulation) and to sweep across parameter grids.
All progress reporting goes through an injectable on_progress callback. Nothing is written to the console directly (Rich, print, etc.), so callers can plug in any UI layer.
- q2m3.profiling.orchestrator.run_single_profile(mol, n_est, n_trotter, mode='both', ir_dir=None, on_progress=None)
Execute all three profiling phases for one parameter combination.
- Parameters:
mol (MoleculeConfig) – Molecule configuration.
n_est (int) – Number of estimation wires.
n_trotter (int) – Trotter decomposition order.
mode (str) – “dynamic”, “fixed”, or “both”.
ir_dir (str | None) – Directory to preserve IR files. None uses a tempdir.
on_progress (Callable[[str], None] | None) – Optional callback for progress messages.
- Returns:
A single ProfileResult for “dynamic”/“fixed”, or a tuple of (fixed_result, dynamic_result) when mode=“both”.
- Return type:
ProfileResult | tuple[ProfileResult, ProfileResult]
- q2m3.profiling.orchestrator.run_single_profile_in_subprocess(mol_key, n_est, n_trotter, queue, mode='dynamic', ir_dir=None)
Run a single profiling pass inside a subprocess (avoids ru_maxrss accumulation).
Results are placed into queue — this function does not return the result.
- Parameters:
mol_key (str)
n_est (int)
n_trotter (int)
queue (Queue)
mode (str)
ir_dir (str | None)
- Return type:
None
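The reason for subprocess isolation is that ru_maxrss is a lifetime peak: once a process has touched a large amount of memory, every later reading in that process reports at least that peak. The sketch below shows the idea with fresh interpreters. Note the swapped-in technique: to stay self-contained it uses subprocess with JSON over stdout rather than a multiprocessing queue, and the names CHILD and profile_in_child are hypothetical.

```python
import json
import subprocess
import sys

# Code run in each child: touch N MB, then report the child's own peak RSS.
CHILD = """
import json, resource, sys
blob = b"x" * (int(sys.argv[1]) * 1024 * 1024)
raw = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
scale = 1024 * 1024 if sys.platform == "darwin" else 1024
print(json.dumps({"alloc_mb": int(sys.argv[1]), "maxrss_mb": raw / scale}))
"""


def profile_in_child(alloc_mb: int) -> dict:
    """Run one measurement in a fresh interpreter and return its self-report."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD, str(alloc_mb)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(out.stdout)


big = profile_in_child(64)
small = profile_in_child(1)
# The second child's peak is not inflated by the 64 MB the first child touched.
print(big, small)
```

Had both measurements run in one process, the second reading could never fall below the first, which would make per-configuration comparisons meaningless.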
- q2m3.profiling.orchestrator.run_both_modes(mol_key, n_est, n_trotter, ir_dir=None, on_progress=None)
Run fixed and dynamic modes in isolated subprocesses with parent-side monitoring.
- Returns:
Tuple of (result_fixed, result_dynamic, parent_data_fixed, parent_data_dynamic).
- Parameters:
mol_key (str)
n_est (int)
n_trotter (int)
ir_dir (str | None)
on_progress (Callable[[str], None] | None)
- Return type:
tuple[ProfileResult, ProfileResult, dict, dict]
- q2m3.profiling.orchestrator.run_sweep(mol_key='h2', mode='dynamic', grid=None, on_progress=None)
Run parameter sweep with subprocess isolation.
- Parameters:
mol_key (str) – Key into MOLECULES dict.
mode (str) – “dynamic” or “fixed”.
grid (list[tuple[int, int]] | None) – List of (n_est, n_trotter) pairs. Defaults to H2_SWEEP_GRID.
on_progress (Callable[[str], None] | None) – Optional callback for progress messages.
- Returns:
Dict mapping (n_est, n_trotter) to ProfileResult for successful runs. Failed grid points are omitted from the result and logged at warning level.
- Return type:
dict[tuple[int, int], ProfileResult]
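The sweep contract (results keyed by grid point, failures logged and skipped, progress sent to a callback) can be sketched as follows. The names sweep, fake_run, and the logger name are hypothetical, and a fake runner replaces the real subprocess-based profiling pass.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("sweep_sketch")


def sweep(
    run_one: Callable[[int, int], dict],
    grid: list[tuple[int, int]],
    on_progress: Optional[Callable[[str], None]] = None,
) -> dict[tuple[int, int], dict]:
    """Hypothetical sweep loop: keep successes, log and skip failures."""
    notify = on_progress or (lambda message: None)
    results = {}
    for n_est, n_trotter in grid:
        notify(f"running n_est={n_est}, n_trotter={n_trotter}")
        try:
            results[(n_est, n_trotter)] = run_one(n_est, n_trotter)
        except Exception as exc:
            logger.warning("grid point (%d, %d) failed: %s", n_est, n_trotter, exc)
    return results


def fake_run(n_est: int, n_trotter: int) -> dict:
    """Stand-in for a profiling pass; fails on one grid point on purpose."""
    if n_trotter > 2:
        raise MemoryError("simulated failure")
    return {"n_est": n_est, "n_trotter": n_trotter}


messages = []
results = sweep(fake_run, [(2, 1), (2, 4), (3, 2)], on_progress=messages.append)
print(sorted(results))  # [(2, 1), (3, 2)]
print(len(messages))    # 3
```

Passing messages.append as the callback shows how a caller can collect progress text and render it with whatever UI layer it prefers.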