Profiling API

Memory Profiling

Memory measurement infrastructure for QPE circuit compilation profiling.

Provides multi-layer memory measurement:
  • resource.getrusage(RUSAGE_SELF/RUSAGE_CHILDREN) → process lifetime peak RSS

  • /proc/self/status VmRSS/VmPeak → current RSS / peak virtual memory size (Linux kernel level)

  • tracemalloc → Python heap only

This module has zero project dependencies (stdlib only).
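To illustrate how the layers differ, the sketch below samples two of them using only the standard library. The helper name sample_layers and its dictionary keys are hypothetical and not part of q2m3; the sketch assumes a Unix platform where the resource module is available.

```python
import resource
import sys
import tracemalloc


def sample_layers():
    """Sample process-level peak RSS and Python-heap usage, both in MB."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    current_bytes, peak_bytes = tracemalloc.get_traced_memory()
    return {
        "maxrss_mb": usage.ru_maxrss / divisor,
        "tracemalloc_current_mb": current_bytes / (1024 * 1024),
        "tracemalloc_peak_mb": peak_bytes / (1024 * 1024),
    }


tracemalloc.start()
payload = bytearray(8 * 1024 * 1024)  # 8 MiB allocation that tracemalloc tracks
layers = sample_layers()
tracemalloc.stop()
print(layers)
```

The process-level figure includes the interpreter and any native allocations, so it is normally far larger than the tracemalloc figure, which only covers memory requested through Python's allocators.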

class q2m3.profiling.memory.MemorySnapshot(label, rss_mb, vm_peak_mb, maxrss_mb, maxrss_children_mb, tracemalloc_peak_mb, tracemalloc_current_mb, elapsed_s=0.0)

Bases: object

Multi-layer memory snapshot at a single point in time.

Parameters:
  • label (str)

  • rss_mb (float)

  • vm_peak_mb (float)

  • maxrss_mb (float)

  • maxrss_children_mb (float)

  • tracemalloc_peak_mb (float)

  • tracemalloc_current_mb (float)

  • elapsed_s (float)

class q2m3.profiling.memory.ProfileResult(molecule, n_system_qubits, n_estimation_wires, n_trotter, n_terms, ir_scale, mode='dynamic', phase_a=None, phase_b=None, phase_c=None, timeline_peak_mb=0.0, timeline_samples=<factory>, ir_analysis=<factory>, prob_sum=0.0, error=None)

Bases: object

Complete profiling result for one parameter combination.

Parameters:
  • molecule (str)

  • n_system_qubits (int)

  • n_estimation_wires (int)

  • n_trotter (int)

  • n_terms (int)

  • ir_scale (int)

  • mode (str)

  • phase_a (MemorySnapshot | None)

  • phase_b (MemorySnapshot | None)

  • phase_c (MemorySnapshot | None)

  • timeline_peak_mb (float)

  • timeline_samples (list)

  • ir_analysis (list)

  • prob_sum (float)

  • error (str | None)

q2m3.profiling.memory.read_proc_status(pid='self')

Parse /proc/[pid]/status for VmRSS and VmPeak (Linux only).

Parameters:

pid (int | str) – Process ID to read, or “self” for current process.

Return type:

dict[str, float]
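A minimal sketch of this kind of parsing is shown here. The function name parse_status_mb and the sample text are illustrative assumptions; the real read_proc_status may handle fields and errors differently. It relies only on the documented "Name: value kB" layout of /proc/[pid]/status.

```python
def parse_status_mb(text, fields=("VmRSS", "VmPeak")):
    """Extract selected 'Name: <value> kB' fields and convert them to MB."""
    result = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            parts = rest.split()
            # Values look like "51200 kB"; skip anything that does not match.
            if len(parts) == 2 and parts[1] == "kB":
                result[name] = int(parts[0]) / 1024
    return result


# Hypothetical excerpt of a status file, used instead of reading /proc directly.
sample = "Name:\tpython3\nVmPeak:\t  204800 kB\nVmRSS:\t   51200 kB\nThreads:\t4\n"
parsed = parse_status_mb(sample)
print(parsed)
```

Parsing text rather than opening the file inside the helper keeps the sketch testable on non-Linux systems; on Linux the text would come from reading /proc/self/status.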

q2m3.profiling.memory.read_smaps_rollup(pid='self')

Parse /proc/[pid]/smaps_rollup to categorize RSS into memory types.

Returns dict with MB values for:
  • Rss: total resident

  • Pss: proportional share

  • Anonymous: heap + anonymous mmap (C++ allocations, LLVM JIT code)

  • LazyFree, AnonHugePages, etc.

Parameters:

pid (int | str)

Return type:

dict[str, float]

q2m3.profiling.memory.take_snapshot(label)

Take a comprehensive memory snapshot from all measurement layers.

Captures both RUSAGE_SELF (Python process) and RUSAGE_CHILDREN (catalyst subprocess) to detect the measurement blind spot where MLIR→LLVM compilation happens in a child process.

Parameters:

label (str)

Return type:

MemorySnapshot

class q2m3.profiling.memory.MemoryTimeline(interval_s=0.1)

Bases: object

Background daemon thread sampling /proc/self/status VmRSS every interval_s seconds (default 0.1 s, i.e. 100 ms).

Also captures /proc/self/smaps_rollup at peak RSS to categorize memory.

Parameters:

interval_s (float)

property peak_smaps: dict[str, float]

Memory categorization captured at peak RSS moment.
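The background-sampling pattern can be sketched as follows. The class name RssSampler, its read_mb callback, and the samples and peak_mb attributes are hypothetical stand-ins; the actual MemoryTimeline reads /proc/self/status itself and may expose a different interface. A fake reader is used so the sketch runs on any platform.

```python
import itertools
import threading
import time


class RssSampler:
    """Poll a memory reader from a daemon thread until the block exits."""

    def __init__(self, read_mb, interval_s=0.1):
        self._read_mb = read_mb
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples = []

    def _run(self):
        while True:
            self.samples.append(self._read_mb())
            # wait() returns True as soon as stop is requested.
            if self._stop.wait(self._interval_s):
                break

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()

    @property
    def peak_mb(self):
        return max(self.samples, default=0.0)


# Fake reader standing in for a VmRSS lookup.
fake_values = itertools.cycle([10.0, 30.0, 20.0])
with RssSampler(lambda: next(fake_values), interval_s=0.01) as sampler:
    time.sleep(0.1)  # stand-in for the work being profiled
print(sampler.peak_mb)
```

Waiting on an Event instead of calling time.sleep lets the thread stop promptly when the profiled block finishes, rather than after one more full interval.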

class q2m3.profiling.memory.ParentSideMonitor(pid, interval_s=0.1)

Bases: object

Monitor a child process’s RSS from the parent side via /proc/PID/status.

This provides an independent measurement that validates the child’s self-reported ru_maxrss. Critical for detecting measurement blind spots where getrusage misses certain allocation patterns (mmap, LLVM JIT, etc.).

Parameters:
  • pid (int)

  • interval_s (float)

property peak_hwm_mb: float

VmHWM: kernel-tracked high-water mark for RSS (most reliable).

Timing Utilities

General-purpose timing utilities for q2m3.

Provides lightweight profiling tools for measuring code section and function execution times. These utilities are independent of QPE-specific profiling and can be used anywhere in the codebase.

q2m3.profiling.timing.profile_section(name, verbose=True)

Context manager for profiling a code section.

Parameters:
  • name (str) – Name of the section being profiled

  • verbose (bool) – Whether to print timing information

Yields:

dict – A timing info dictionary that gets updated with elapsed time
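One way such a context manager could be written is sketched here. The name timed_section and the exact dictionary keys are assumptions for illustration; only the 'elapsed' key is confirmed by the documented example.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_section(name, verbose=True):
    """Yield a dict whose 'elapsed' entry is filled in when the block exits."""
    timing = {"name": name, "elapsed": None}
    start = time.perf_counter()
    try:
        yield timing
    finally:
        # Runs even if the body raises, so the timing is always recorded.
        timing["elapsed"] = time.perf_counter() - start
        if verbose:
            print(f"[{name}] {timing['elapsed']:.3f}s")


with timed_section("sum of squares", verbose=False) as timing:
    total = sum(i * i for i in range(1000))
print(total, timing["elapsed"])
```

Because the dictionary is yielded before the body runs, the elapsed value is only meaningful after the with block has exited.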

Example:

with profile_section("QPE calculation") as timing:
    result = run_qpe()
print(f"Elapsed: {timing['elapsed']:.3f}s")

q2m3.profiling.timing.profile_function(func=None, *, verbose=True)

Decorator for profiling a function’s execution time.

Parameters:
  • func (Callable) – The function to profile

  • verbose (bool) – Whether to print timing information

Returns:

Wrapped function that tracks execution time

Return type:

Callable
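The func=None signature is what allows the decorator to be used both bare and with keyword arguments. A minimal sketch of that pattern follows; the name timed and the last_elapsed attribute are hypothetical and are not claimed to match how profile_function stores its result.

```python
import functools
import time


def timed(func=None, *, verbose=True):
    """Decorator usable as @timed or as @timed(verbose=False)."""

    def decorate(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return f(*args, **kwargs)
            finally:
                wrapper.last_elapsed = time.perf_counter() - start
                if verbose:
                    print(f"{f.__name__}: {wrapper.last_elapsed:.3f}s")

        wrapper.last_elapsed = None  # hypothetical attribute for this sketch
        return wrapper

    # Bare use passes the function directly; parenthesized use passes None.
    if func is None:
        return decorate
    return decorate(func)


@timed
def add(a, b):
    return a + b


@timed(verbose=False)
def multiply(a, b):
    return a * b


sum_result = add(2, 3)
product_result = multiply(4, 5)
```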

Example:

@profile_function
def compute_energy():
    ...

@profile_function(verbose=False)
def silent_compute():
    ...

class q2m3.profiling.timing.ProfilingStats(name='default')

Bases: object

Accumulator for profiling statistics across multiple runs.

Parameters:

name (str)

record(elapsed)

Record a timing measurement.

Parameters:

elapsed (float)

Return type:

None

property total: float

Total elapsed time.

property count: int

Number of recorded timings.

property mean: float

Mean elapsed time.

property min: float

Minimum elapsed time.

property max: float

Maximum elapsed time.

summary()

Return a summary string of profiling stats.

Return type:

str
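An accumulator with this interface might look like the sketch below. The class name TimingStats, the samples list, the zero defaults for an empty accumulator, and the summary format are all assumptions; the source does not say how ProfilingStats behaves before any timing is recorded.

```python
from dataclasses import dataclass, field


@dataclass
class TimingStats:
    """Collect elapsed times and expose simple aggregate statistics."""

    name: str = "default"
    samples: list = field(default_factory=list)

    def record(self, elapsed):
        self.samples.append(float(elapsed))

    @property
    def total(self):
        return sum(self.samples)

    @property
    def count(self):
        return len(self.samples)

    @property
    def mean(self):
        # Assumed behaviour: report 0.0 rather than raise when empty.
        return self.total / self.count if self.samples else 0.0

    @property
    def min(self):
        return min(self.samples, default=0.0)

    @property
    def max(self):
        return max(self.samples, default=0.0)

    def summary(self):
        return (
            f"{self.name}: n={self.count} total={self.total:.3f}s "
            f"mean={self.mean:.3f}s min={self.min:.3f}s max={self.max:.3f}s"
        )


stats = TimingStats("qpe")
for elapsed in (1.0, 2.0, 3.0):
    stats.record(elapsed)
print(stats.summary())
```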

Catalyst IR Analysis

Catalyst IR stage analysis utilities.

WARNING: ir_output_dir() uses os.chdir(), which modifies global process state. It is NOT thread-safe; do not call it concurrently from multiple threads. It must be the outermost context manager and must wrap the @qjit decoration itself.

q2m3.profiling.catalyst_ir.ir_output_dir(path=None)

Manage Catalyst IR workspace directory.

Must wrap @qjit decoration (not just the call), because Catalyst captures os.getcwd() at decoration time to determine the IR output location.

Parameters:

path (str | None) – If provided, IR files are written here and preserved after exit. If None, a temporary directory is used and auto-cleaned on exit.

Yields:

The workspace directory path (user-specified or auto-created tempdir).

Return type:

Generator[str, None, None]
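The directory handling described above can be sketched with the standard library alone. The name working_dir and the tempdir prefix are hypothetical; this sketch shows only the chdir and cleanup behaviour and does not involve Catalyst.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def working_dir(path=None):
    """Switch into a workspace directory and restore the old cwd on exit."""
    original = os.getcwd()
    is_temp = path is None
    workspace = tempfile.mkdtemp(prefix="ir_") if is_temp else os.path.abspath(path)
    os.makedirs(workspace, exist_ok=True)
    os.chdir(workspace)  # process-wide change, hence not thread-safe
    try:
        yield workspace
    finally:
        os.chdir(original)
        if is_temp:
            # Only auto-created directories are removed; user paths are kept.
            shutil.rmtree(workspace, ignore_errors=True)


before = os.getcwd()
with working_dir() as workspace:
    inside = os.getcwd()
after = os.getcwd()
```

Anything that records os.getcwd() while the block is active, as the text says Catalyst does at decoration time, will see the workspace path, which is why the decoration has to happen inside the block.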

q2m3.profiling.catalyst_ir.analyze_ir_stages(compiled_fn, stages=None)

Export IR text from each compilation stage and measure size/lines.

Parameters:
  • compiled_fn (Any) – A Catalyst @qjit compiled function.

  • stages (list[str] | None) – Override list of stage names. If None, uses COMPILATION_STAGES.

Returns:

List of (stage_name, size_kb, n_lines). Empty list if Catalyst is unavailable or all stages fail.

Return type:

list[tuple[str, float, int]]
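To make the shape of the returned tuples concrete, the sketch below measures plain strings. The helper measure_stage, the stage names, and the IR snippets are made up for illustration, and the choice of UTF-8 byte size and splitlines() line counting is an assumption about how size and lines are measured.

```python
def measure_stage(stage_name, ir_text):
    """Return (stage_name, size_kb, n_lines) for one stage's IR text."""
    size_kb = len(ir_text.encode("utf-8")) / 1024
    n_lines = len(ir_text.splitlines())
    return (stage_name, size_kb, n_lines)


# Hypothetical stage names and IR text, not real Catalyst output.
fake_stages = {
    "mlir": "module {\n  func.func @main() {\n  }\n}\n",
    "llvm_ir": "define void @main() {\n  ret void\n}\n",
}
report = [measure_stage(name, text) for name, text in fake_stages.items()]
for stage_name, size_kb, n_lines in report:
    print(f"{stage_name}: {size_kb:.3f} kB, {n_lines} lines")
```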

QPE Profiling

Three-phase QPE compilation profiling workflow.

Provides profile_* functions for measuring memory usage and timing across:
  • Phase A: PySCF → PennyLane Hamiltonian construction

  • Phase B: @qjit QPE circuit compilation (H_dynamic and H_fixed modes)

  • Phase C: Repeated execution of already-compiled circuit

Catalyst is an optional dependency. profile_qjit_compilation and profile_qjit_compilation_fixed raise ImportError at call time if catalyst is not installed.

q2m3.profiling.qpe_profiler.profile_hamiltonian_build(mol, n_est, n_trotter)

Profile Phase A: PySCF → PennyLane Hamiltonian construction.

Returns:

Tuple of (snapshot, ops, coeffs, hf_state, circuit_params)

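Parameters:
  • mol – Molecule configuration.

  • n_est – Number of estimation wires.

  • n_trotter – Trotter decomposition order.

Return type: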

tuple[MemorySnapshot, list, list[float], ndarray, dict]

q2m3.profiling.qpe_profiler.profile_qjit_compilation(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)

Profile Phase B: H_dynamic mode @qjit QPE circuit compilation.

This is the critical phase — first call triggers MLIR→LLVM compilation.

IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time to determine the IR workspace location.

Parameters:
  • ops (list) – PennyLane operators used to build the runtime Hamiltonian.

  • coeffs (list[float]) – Hamiltonian coefficients supplied at runtime.

  • hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.

  • circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().

  • ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.

  • keep_intermediate (bool) – If True, retain all 6 IR stages in memory for analysis. If False, only the final stage is kept, which makes it possible to test the memory impact of retaining the intermediate stages.

Returns:

Tuple of (snapshot, timeline, ir_analysis, compiled_fn)

Return type:

tuple[MemorySnapshot, MemoryTimeline, list, Any]

q2m3.profiling.qpe_profiler.profile_qjit_compilation_fixed(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)

Profile Phase B for H_fixed mode: Hamiltonian built OUTSIDE @qjit.

Coefficients are Python floats, so Catalyst can constant-fold them into the MLIR. The compiled function takes zero arguments.

IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time.

Parameters:
  • ops (list) – PennyLane operators used to build the fixed Hamiltonian.

  • coeffs (list[float]) – Hamiltonian coefficients embedded as Python floats before @qjit.

  • hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.

  • circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().

  • ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.

  • keep_intermediate (bool) – If True, retain all 6 IR stages for analysis.

Returns:

Tuple of (snapshot, timeline, ir_analysis, compiled_fn)

Return type:

tuple[MemorySnapshot, MemoryTimeline, list, Any]

q2m3.profiling.qpe_profiler.profile_execution(compiled_fn, coeffs, n_calls=5, is_fixed=False)

Profile Phase C: repeated execution of already-compiled circuit.

Returns:

Tuple of (snapshot, prob_sum_from_last_call)

Parameters:
  • compiled_fn (Any)

  • coeffs (list[float])

  • n_calls (int)

  • is_fixed (bool)

Return type:

tuple[MemorySnapshot, float]

Profiling Orchestrator

Subprocess orchestration and parameter sweep for QPE compilation profiling.

Provides functions to run profiling passes in isolated subprocesses (avoiding ru_maxrss accumulation) and to sweep across parameter grids.

All progress reporting uses an injectable on_progress callback — no direct console output (Rich, print, etc.) so callers can plug in any UI layer.

q2m3.profiling.orchestrator.run_single_profile(mol, n_est, n_trotter, mode='both', ir_dir=None, on_progress=None)

Execute all three profiling phases for one parameter combination.

Parameters:
  • mol (MoleculeConfig) – Molecule configuration.

  • n_est (int) – Number of estimation wires.

  • n_trotter (int) – Trotter decomposition order.

  • mode (str) – “dynamic”, “fixed”, or “both”.

  • ir_dir (str | None) – Directory to preserve IR files. None uses a tempdir.

  • on_progress (Callable[[str], None] | None) – Optional callback for progress messages.

Returns:

A single ProfileResult for “dynamic”/“fixed”, or a tuple of (fixed_result, dynamic_result) when mode=“both”.

Return type:

ProfileResult | tuple[ProfileResult, ProfileResult]

q2m3.profiling.orchestrator.run_single_profile_in_subprocess(mol_key, n_est, n_trotter, queue, mode='dynamic', ir_dir=None)

Run a single profiling pass inside a subprocess (avoids ru_maxrss accumulation).

Results are placed into queue — this function does not return the result.

Parameters:
  • mol_key (str)

  • n_est (int)

  • n_trotter (int)

  • queue (Queue)

  • mode (str)

  • ir_dir (str | None)

Return type:

None
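The queue-based hand-off can be sketched with multiprocessing as shown here. The worker function, its payload, and the result keys are hypothetical, and the sketch assumes a Unix platform where the "fork" start method is available; how q2m3 actually starts its subprocesses is not stated in this reference.

```python
import multiprocessing as mp
import os
import resource


def worker(n_bytes, queue):
    """Do some work in the child and report its own peak RSS via the queue."""
    payload = bytearray(n_bytes)  # stand-in for the profiled compilation
    usage = resource.getrusage(resource.RUSAGE_SELF)
    queue.put(
        {
            "pid": os.getpid(),
            "n_bytes": len(payload),
            "maxrss_raw": usage.ru_maxrss,  # KiB on Linux, bytes on macOS
        }
    )


# Assumption: "fork" is used so the sketch runs without a __main__ guard.
ctx = mp.get_context("fork")
queue = ctx.Queue()
proc = ctx.Process(target=worker, args=(4 * 1024 * 1024, queue))
proc.start()
result = queue.get(timeout=30)  # read before join() to avoid blocking on the pipe
proc.join()
print(result)
```

Because ru_maxrss is a high-water mark that never decreases within a process, measuring each configuration in a fresh child keeps one run's peak from masking the next.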

q2m3.profiling.orchestrator.run_both_modes(mol_key, n_est, n_trotter, ir_dir=None, on_progress=None)

Run fixed and dynamic modes in isolated subprocesses with parent-side monitoring.

Returns:

Tuple of (result_fixed, result_dynamic, parent_data_fixed, parent_data_dynamic).

Parameters:
  • mol_key (str)

  • n_est (int)

  • n_trotter (int)

  • ir_dir (str | None)

  • on_progress (Callable[[str], None] | None)

Return type:

tuple[ProfileResult, ProfileResult, dict, dict]

q2m3.profiling.orchestrator.run_sweep(mol_key='h2', mode='dynamic', grid=None, on_progress=None)

Run parameter sweep with subprocess isolation.

Parameters:
  • mol_key (str) – Key into MOLECULES dict.

  • mode (str) – “dynamic” or “fixed”.

  • grid (list[tuple[int, int]] | None) – List of (n_est, n_trotter) pairs. Defaults to H2_SWEEP_GRID.

  • on_progress (Callable[[str], None] | None) – Optional callback for progress messages.

Returns:

Dict mapping (n_est, n_trotter) to ProfileResult for successful runs. Failed grid points are omitted from the dict and logged at warning level; no exception is raised.

Return type:

dict[tuple[int, int], ProfileResult]
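The sweep contract (progress callback, failed points skipped and logged) can be sketched as follows. The names sweep, fake_run, and the logger name are hypothetical, and for brevity this sketch runs each grid point in-process rather than in an isolated subprocess as run_sweep does.

```python
import logging

logger = logging.getLogger("sweep_sketch")


def sweep(grid, run_one, on_progress=None):
    """Run run_one for each (n_est, n_trotter) pair, keeping only successes."""
    results = {}
    for n_est, n_trotter in grid:
        if on_progress is not None:
            on_progress(f"running n_est={n_est}, n_trotter={n_trotter}")
        try:
            results[(n_est, n_trotter)] = run_one(n_est, n_trotter)
        except Exception as exc:
            # Failed points are logged and left out of the result dict.
            logger.warning("grid point (%d, %d) failed: %s", n_est, n_trotter, exc)
    return results


def fake_run(n_est, n_trotter):
    """Stand-in for a profiling pass; fails for large n_trotter."""
    if n_trotter > 2:
        raise MemoryError("simulated out-of-memory")
    return n_est * n_trotter


messages = []
results = sweep([(2, 1), (2, 3), (4, 2)], fake_run, on_progress=messages.append)
print(results)
```

Passing messages.append as the callback shows how a caller can collect progress text without the sweep printing anything itself.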