Profiling API

Memory Profiling

Memory measurement infrastructure for QPE circuit compilation profiling.

Provides multi-layer memory measurement:

resource.getrusage(RUSAGE_SELF/RUSAGE_CHILDREN) → process lifetime peak RSS
/proc/self/status VmRSS/VmPeak → current RSS / peak virtual memory size (Linux kernel level)
tracemalloc → Python heap only

This module has zero project dependencies (stdlib only).
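The layers listed above can be compared side by side with the standard library alone. The following is a minimal sketch, not the module's implementation: the helper names maxrss_mb and python_heap_mb are hypothetical, and it assumes a Unix platform where the resource module is available (ru_maxrss is reported in KiB on Linux and in bytes on macOS).

```python
import resource
import sys
import tracemalloc


def maxrss_mb(who: int = resource.RUSAGE_SELF) -> float:
    """Lifetime peak RSS in MB (hypothetical helper, not part of q2m3)."""
    raw = resource.getrusage(who).ru_maxrss
    # ru_maxrss is KiB on Linux but bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return raw / divisor


def python_heap_mb() -> tuple[float, float]:
    """(current, peak) Python-heap usage in MB as seen by tracemalloc."""
    current, peak = tracemalloc.get_traced_memory()
    return current / 2**20, peak / 2**20


tracemalloc.start()
payload = [bytes(1024) for _ in range(10_000)]  # roughly 10 MB of Python objects
heap_current_mb, heap_peak_mb = python_heap_mb()
process_peak_mb = maxrss_mb()                          # this process
children_peak_mb = maxrss_mb(resource.RUSAGE_CHILDREN)  # waited-for children
tracemalloc.stop()

print(f"heap={heap_current_mb:.1f} MB, process peak={process_peak_mb:.1f} MB")
```

The tracemalloc figure only covers allocations made through Python's allocator, which is why it can be far below the RSS figures when native code (for example a compiler backend) allocates memory.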

class q2m3.profiling.memory.MemorySnapshot(label, rss_mb, vm_peak_mb, maxrss_mb, maxrss_children_mb, tracemalloc_peak_mb, tracemalloc_current_mb, elapsed_s=0.0)

Bases: object

Multi-layer memory snapshot at a single point in time.

Parameters:

label (str)
rss_mb (float)
vm_peak_mb (float)
maxrss_mb (float)
maxrss_children_mb (float)
tracemalloc_peak_mb (float)
tracemalloc_current_mb (float)
elapsed_s (float)

class q2m3.profiling.memory.ProfileResult(molecule, n_system_qubits, n_estimation_wires, n_trotter, n_terms, ir_scale, mode='dynamic', phase_a=None, phase_b=None, phase_c=None, timeline_peak_mb=0.0, timeline_samples=<factory>, ir_analysis=<factory>, prob_sum=0.0, error=None)

Bases: object

Complete profiling result for one parameter combination.

Parameters:

molecule (str)
n_system_qubits (int)
n_estimation_wires (int)
n_trotter (int)
n_terms (int)
ir_scale (int)
mode (str)
phase_a (MemorySnapshot | None)
phase_b (MemorySnapshot | None)
phase_c (MemorySnapshot | None)
timeline_peak_mb (float)
timeline_samples (list)
ir_analysis (list)
prob_sum (float)
error (str | None)

q2m3.profiling.memory.read_proc_status(pid='self')

Parse /proc/[pid]/status for VmRSS and VmPeak (Linux only).

Parameters: pid (int | str) – Process ID to read, or “self” for the current process.
Return type: dict[str, float]
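As an illustration of the parsing involved, here is a minimal sketch that extracts kB-valued fields from status-file text and converts them to MB. The function name parse_status_kb_fields and the sample text are hypothetical; the real read_proc_status may differ in its field handling and return keys.

```python
def parse_status_kb_fields(text: str, fields=("VmRSS", "VmPeak")) -> dict[str, float]:
    """Return {field: value_in_MB} for lines shaped like 'VmRSS:   51200 kB'."""
    result: dict[str, float] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            parts = rest.split()
            # Only accept well-formed '<integer> kB' values.
            if len(parts) >= 2 and parts[1] == "kB":
                result[name] = int(parts[0]) / 1024
    return result


# Hand-written sample in the layout of /proc/[pid]/status.
sample = "Name:\tpython3\nVmPeak:\t  204800 kB\nVmRSS:\t   51200 kB\nThreads:\t1\n"
parsed = parse_status_kb_fields(sample)
print(parsed)
```

On Linux the same function could be fed the contents of /proc/self/status read with pathlib.Path.read_text().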

q2m3.profiling.memory.read_smaps_rollup(pid='self')

Parse /proc/[pid]/smaps_rollup to categorize RSS into memory types.

Returns dict with MB values for:

Rss: total resident
Pss: proportional share
Anonymous: heap + anonymous mmap (C++ allocations, LLVM JIT code)
LazyFree, AnonHugePages, etc.

Parameters: pid (int | str)
Return type: dict[str, float]

q2m3.profiling.memory.take_snapshot(label)

Take a comprehensive memory snapshot from all measurement layers.

Captures both RUSAGE_SELF (the Python process) and RUSAGE_CHILDREN (the Catalyst subprocess) so that memory used when MLIR→LLVM compilation runs in a child process is not missed.

Parameters: label (str)
Return type: MemorySnapshot

class q2m3.profiling.memory.MemoryTimeline(interval_s=0.1)

Bases: object

Background daemon thread that samples VmRSS from /proc/self/status every interval_s seconds (default 0.1 s, i.e. 100 ms).

Also captures /proc/self/smaps_rollup at peak RSS to categorize memory.

Parameters: interval_s (float)

property peak_smaps: dict[str, float]

Memory categorization captured at the moment of peak RSS.
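The background-sampling pattern described for MemoryTimeline can be sketched as follows. This is an assumption-laden illustration, not the class's actual code: TimelineSketch, its context-manager interface, and the injectable read_mb callable are all hypothetical, and a fake reader replaces /proc/self/status so the example runs on any platform.

```python
import threading
import time
from typing import Callable


class TimelineSketch:
    """Hypothetical sampler: polls read_mb() on a daemon thread and tracks the peak."""

    def __init__(self, read_mb: Callable[[], float], interval_s: float = 0.1):
        self._read_mb = read_mb
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples: list[tuple[float, float]] = []  # (elapsed_s, value_mb)
        self.peak_mb = 0.0

    def _run(self) -> None:
        start = time.monotonic()
        while not self._stop.is_set():
            value = self._read_mb()
            self.peak_mb = max(self.peak_mb, value)
            self.samples.append((time.monotonic() - start, value))
            # Event.wait doubles as an interruptible sleep.
            self._stop.wait(self._interval_s)

    def __enter__(self) -> "TimelineSketch":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> bool:
        self._stop.set()
        self._thread.join()
        return False


# Fake readings stand in for VmRSS values; later samples repeat 5.0.
readings = iter([10.0, 42.0, 17.0])
with TimelineSketch(lambda: next(readings, 5.0), interval_s=0.01) as timeline:
    while len(timeline.samples) < 3:  # stand-in for the workload being profiled
        time.sleep(0.005)

print(f"peak={timeline.peak_mb} MB over {len(timeline.samples)} samples")
```

A sampler like this can miss spikes shorter than the sampling interval, which is one reason to cross-check it against kernel-tracked peaks.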

class q2m3.profiling.memory.ParentSideMonitor(pid, interval_s=0.1)

Bases: object

Monitor a child process’s RSS from the parent side via /proc/PID/status.

This provides an independent measurement that validates the child’s self-reported ru_maxrss. Critical for detecting measurement blind spots where getrusage misses certain allocation patterns (mmap, LLVM JIT, etc.).

Parameters:

pid (int)
interval_s (float)

property peak_hwm_mb: float

VmHWM: kernel-tracked high-water mark for RSS (most reliable).
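Parent-side monitoring of this kind can be sketched by polling a child's status file for VmHWM while the child runs. The helper read_hwm_mb and the inline child script are hypothetical, and the sketch assumes Linux; where /proc is unavailable the helper simply returns 0.0.

```python
import subprocess
import sys
import time
from pathlib import Path


def read_hwm_mb(pid: int) -> float:
    """Return VmHWM of `pid` in MB, or 0.0 if it cannot be read (hypothetical helper)."""
    try:
        text = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        return 0.0  # no /proc, or the process has already gone away
    for line in text.splitlines():
        if line.startswith("VmHWM:"):
            return int(line.split()[1]) / 1024  # value is reported in kB
    return 0.0  # e.g. a zombie process, which has no Vm* lines


# Child allocates about 32 MB, then idles briefly so the parent can observe it.
child_code = "x = b'x' * (32 * 2**20); import time; time.sleep(0.3)"
child = subprocess.Popen([sys.executable, "-c", child_code])

peak_hwm_mb = 0.0
while child.poll() is None:
    peak_hwm_mb = max(peak_hwm_mb, read_hwm_mb(child.pid))
    time.sleep(0.02)

print(f"child exited with {child.returncode}, observed VmHWM {peak_hwm_mb:.1f} MB")
```

Because VmHWM is maintained by the kernel, a single late read still reports the peak, whereas a VmRSS sample only reflects the instant it was taken.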

Timing Utilities

General-purpose timing utilities for q2m3.

Provides lightweight profiling tools for measuring code section and function execution times. These utilities are independent of QPE-specific profiling and can be used anywhere in the codebase.

q2m3.profiling.timing.profile_section(name, verbose=True)

Context manager for profiling a code section.

Parameters:

name (str) – Name of the section being profiled
verbose (bool) – Whether to print timing information

Yields:

dict – A timing info dictionary that gets updated with elapsed time

Example:

from q2m3.profiling.timing import profile_section

with profile_section("QPE calculation") as timing:
    result = run_qpe()  # placeholder for the code being timed
print(f"Elapsed: {timing['elapsed']:.3f}s")
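For readers curious how a context manager can yield a dictionary that is filled in on exit, here is a minimal sketch. The name section_timer and the printed format are hypothetical; only the 'elapsed' key is taken from the example above, and the real profile_section may record additional fields.

```python
import time
from contextlib import contextmanager


@contextmanager
def section_timer(name: str, verbose: bool = True):
    """Hypothetical stand-in: yields a dict whose 'elapsed' is set when the block exits."""
    timing = {"name": name, "elapsed": 0.0}
    start = time.perf_counter()
    try:
        yield timing
    finally:
        # Runs even if the timed block raises, so 'elapsed' is always populated.
        timing["elapsed"] = time.perf_counter() - start
        if verbose:
            print(f"[{name}] {timing['elapsed']:.3f}s")


with section_timer("sleep", verbose=False) as timing:
    time.sleep(0.02)

print(f"Elapsed: {timing['elapsed']:.3f}s")
```

The dictionary is mutated in place, so the caller's reference sees the final value only after the with block has finished.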

q2m3.profiling.timing.profile_function(func=None, *, verbose=True)

Decorator for profiling a function’s execution time.

Parameters:

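func (Callable | None) – The function to profile; None when the decorator is applied with keyword arguments, as in @profile_function(verbose=False)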
verbose (bool) – Whether to print timing information

Returns:

Wrapped function that tracks execution time

Return type:

Callable

Example:

from q2m3.profiling.timing import profile_function

@profile_function
def compute_energy():
    ...

@profile_function(verbose=False)
def silent_compute():
    ...
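A decorator that works both bare and with keyword arguments, as in the two usages above, is commonly written with an optional first parameter. The sketch below shows that pattern under assumptions: the name timed and the last_elapsed attribute are hypothetical and not part of q2m3.

```python
import functools
import time


def timed(func=None, *, verbose=True):
    """Hypothetical decorator usable as @timed or @timed(verbose=False)."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                wrapper.last_elapsed = time.perf_counter() - start
                if verbose:
                    print(f"{fn.__name__}: {wrapper.last_elapsed:.3f}s")

        wrapper.last_elapsed = None  # set after the first call
        return wrapper

    # Bare use passes the function directly; parameterized use passes nothing.
    if func is None:
        return decorate
    return decorate(func)


@timed
def compute(a, b):
    return a + b


@timed(verbose=False)
def silent_compute(a, b):
    return a * b


total = compute(2, 3)
product = silent_compute(2, 3)
```

Making verbose keyword-only is what lets the decorator tell the two call styles apart without ambiguity.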

class q2m3.profiling.timing.ProfilingStats(name='default')

Bases: object

Accumulator for profiling statistics across multiple runs.

Parameters: name (str)

record(elapsed)

Record a timing measurement.

Parameters: elapsed (float)
Return type: None

property total: float

Total elapsed time.

property count: int

Number of recorded timings.

property mean: float

Mean elapsed time.

property min: float

Minimum elapsed time.

property max: float

Maximum elapsed time.

summary()

Return a summary string of profiling stats.

Return type: str
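The accumulator interface above can be pictured with a small dataclass. This is a sketch only: StatsSketch is a hypothetical name, the summary format is invented for illustration, and returning 0.0 for empty statistics is an assumption about behavior the reference does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Hypothetical accumulator mirroring the record/total/count/mean/min/max interface."""

    name: str = "default"
    timings: list[float] = field(default_factory=list)

    def record(self, elapsed: float) -> None:
        self.timings.append(elapsed)

    @property
    def total(self) -> float:
        return sum(self.timings)

    @property
    def count(self) -> int:
        return len(self.timings)

    @property
    def mean(self) -> float:
        # Assumption: an empty accumulator reports 0.0 rather than raising.
        return self.total / self.count if self.timings else 0.0

    @property
    def min(self) -> float:
        return min(self.timings) if self.timings else 0.0

    @property
    def max(self) -> float:
        return max(self.timings) if self.timings else 0.0

    def summary(self) -> str:
        return f"{self.name}: n={self.count} total={self.total:.3f}s mean={self.mean:.3f}s"


stats = StatsSketch("compile")
for elapsed in (0.5, 1.5, 1.0):
    stats.record(elapsed)

print(stats.summary())
```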

Catalyst IR Analysis

Catalyst IR stage analysis utilities.

WARNING: ir_output_dir() uses os.chdir(), which modifies global process state, so it is NOT thread-safe. Do not call it concurrently from multiple threads. It must be the outermost context manager and must wrap the @qjit decoration.

q2m3.profiling.catalyst_ir.ir_output_dir(path=None)

Manage Catalyst IR workspace directory.

Must wrap @qjit decoration (not just the call), because Catalyst captures os.getcwd() at decoration time to determine the IR output location.

Parameters: path (str | None) – If provided, IR files are written here and preserved after exit. If None, a temporary directory is used and auto-cleaned on exit.
Yields: The workspace directory path (user-specified or auto-created tempdir).
Return type: Generator[str, None, None]
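The chdir-based behavior described above (switch into a workspace, restore the previous directory, clean up only temporary workspaces) can be sketched as below. The name workspace_dir is hypothetical and the code is an illustration of the pattern rather than the body of ir_output_dir; like the real function, it changes process-wide state and is not thread-safe.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def workspace_dir(path: Optional[str] = None) -> Iterator[str]:
    """Hypothetical sketch: chdir into `path` (kept) or a tempdir (removed on exit)."""
    original = os.getcwd()
    tmp = None
    if path is None:
        tmp = tempfile.TemporaryDirectory()
        target = tmp.name
    else:
        os.makedirs(path, exist_ok=True)
        target = path
    os.chdir(target)
    try:
        yield target
    finally:
        # Always restore the caller's working directory before cleaning up.
        os.chdir(original)
        if tmp is not None:
            tmp.cleanup()


before = os.getcwd()
with workspace_dir() as ws:
    inside = os.path.realpath(os.getcwd())
    ws_real = os.path.realpath(ws)
after = os.getcwd()

print(f"worked in {ws_real}, back in {after}")
```

Anything that reads os.getcwd() while the block is active, such as a decorator applied inside it, sees the workspace path; anything applied before entering the block does not.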

q2m3.profiling.catalyst_ir.analyze_ir_stages(compiled_fn, stages=None)

Export IR text from each compilation stage and measure size/lines.

Parameters:

compiled_fn (Any) – A Catalyst @qjit compiled function.
stages (list[str] | None) – Override list of stage names. If None, uses COMPILATION_STAGES.

Returns:

List of (stage_name, size_kb, n_lines). Empty list if Catalyst is unavailable or all stages fail.

Return type:

list[tuple[str, float, int]]
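Each returned tuple pairs a stage name with a size and a line count. A minimal sketch of computing such a row from IR text follows; measure_ir_text, the stage name, and the sample text are hypothetical, and measuring size as UTF-8 bytes divided by 1024 is an assumption about how size_kb is defined.

```python
def measure_ir_text(stage_name: str, ir_text: str) -> tuple[str, float, int]:
    """Return (stage_name, size_kb, n_lines) for one stage's IR text (hypothetical helper)."""
    size_kb = len(ir_text.encode("utf-8")) / 1024  # assumption: KB of UTF-8 bytes
    n_lines = len(ir_text.splitlines())
    return stage_name, size_kb, n_lines


# Hand-written placeholder text, not real Catalyst output.
fake_ir = "module {\n  func.func @main() {\n    return\n  }\n}\n"
row = measure_ir_text("example_stage", fake_ir)

print(f"{row[0]}: {row[1]:.3f} KB, {row[2]} lines")
```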

QPE Profiling

Three-phase QPE compilation profiling workflow.

Provides profile_* functions for measuring memory usage and timing across:

Phase A: PySCF → PennyLane Hamiltonian construction
Phase B: @qjit QPE circuit compilation (H_dynamic and H_fixed modes)
Phase C: Repeated execution of already-compiled circuit

Catalyst is an optional dependency. profile_qjit_compilation() and profile_qjit_compilation_fixed() raise ImportError at call time if Catalyst is not installed.

q2m3.profiling.qpe_profiler.profile_hamiltonian_build(mol, n_est, n_trotter)

Profile Phase A: PySCF → PennyLane Hamiltonian construction.

Returns:

Tuple of (snapshot, ops, coeffs, hf_state, circuit_params)

Parameters:

mol (MoleculeConfig)
n_est (int)
n_trotter (int)

Return type:

tuple[MemorySnapshot, list, list[float], ndarray, dict]

q2m3.profiling.qpe_profiler.profile_qjit_compilation(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)

Profile Phase B: H_dynamic mode @qjit QPE circuit compilation.

This is the critical phase — first call triggers MLIR→LLVM compilation.

IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time to determine the IR workspace location.

Parameters:

ops (list) – PennyLane operators used to build the runtime Hamiltonian.
coeffs (list[float]) – Hamiltonian coefficients supplied at runtime.
hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.
circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().
ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.
keep_intermediate (bool) – If True, retain all 6 IR stages in memory for analysis. If False, keep only the final stage (useful for testing the memory impact of retaining intermediates).

Returns:

Tuple of (snapshot, timeline, ir_analysis, compiled_fn)

Return type:

tuple[MemorySnapshot, MemoryTimeline, list, Any]

q2m3.profiling.qpe_profiler.profile_qjit_compilation_fixed(ops, coeffs, hf_state, circuit_params, ir_dir=None, keep_intermediate=True)

Profile Phase B for H_fixed mode: Hamiltonian built OUTSIDE @qjit.

Coefficients are Python floats → Catalyst can constant-fold them into MLIR. The compiled function takes zero arguments.

IMPORTANT: @qjit is applied functionally INSIDE ir_output_dir context, because Catalyst captures os.getcwd() at decoration time.

Parameters:

ops (list) – PennyLane operators used to build the fixed Hamiltonian.
coeffs (list[float]) – Hamiltonian coefficients embedded as Python floats before @qjit.
hf_state (ndarray) – Hartree-Fock occupation vector for system-state preparation.
circuit_params (dict) – Circuit metadata from profile_hamiltonian_build().
ir_dir (str | None) – If provided, IR files are preserved at this path. If None, a tempdir is used and auto-cleaned after IR analysis.
keep_intermediate (bool) – If True, retain all 6 IR stages for analysis.

Returns:

Tuple of (snapshot, timeline, ir_analysis, compiled_fn)

Return type:

tuple[MemorySnapshot, MemoryTimeline, list, Any]

q2m3.profiling.qpe_profiler.profile_execution(compiled_fn, coeffs, n_calls=5, is_fixed=False)

Profile Phase C: repeated execution of already-compiled circuit.

Returns:

Tuple of (snapshot, prob_sum_from_last_call)

Parameters:

compiled_fn (Any)
coeffs (list[float])
n_calls (int)
is_fixed (bool)

Return type:

tuple[MemorySnapshot, float]

Profiling Orchestrator

Subprocess orchestration and parameter sweep for QPE compilation profiling.

Provides functions to run profiling passes in isolated subprocesses (avoiding ru_maxrss accumulation) and to sweep across parameter grids.

All progress reporting uses an injectable on_progress callback — no direct console output (Rich, print, etc.) so callers can plug in any UI layer.

q2m3.profiling.orchestrator.run_single_profile(mol, n_est, n_trotter, mode='both', ir_dir=None, on_progress=None)

Execute all three profiling phases for one parameter combination.

Parameters:

mol (MoleculeConfig) – Molecule configuration.
n_est (int) – Number of estimation wires.
n_trotter (int) – Trotter decomposition order.
mode (str) – “dynamic”, “fixed”, or “both”.
ir_dir (str | None) – Directory to preserve IR files. None uses a tempdir.
on_progress (Callable[[str], None] | None) – Optional callback for progress messages.

Returns:

A single ProfileResult for “dynamic”/“fixed”, or a tuple of (fixed_result, dynamic_result) when mode=“both”.

Return type:

ProfileResult | tuple[ProfileResult, ProfileResult]

q2m3.profiling.orchestrator.run_single_profile_in_subprocess(mol_key, n_est, n_trotter, queue, mode='dynamic', ir_dir=None)

Run a single profiling pass inside a subprocess (avoids ru_maxrss accumulation).

Results are placed into queue — this function does not return the result.

Parameters:

mol_key (str)
n_est (int)
n_trotter (int)
queue (Queue)
mode (str)
ir_dir (str | None)

Return type:

None
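The reason for process isolation is that ru_maxrss is a lifetime high-water mark: once a process has peaked, later readings in that process never go down. The sketch below illustrates the general idea of taking each measurement in a fresh interpreter. It is not the orchestrator's mechanism (which hands results back through a queue); the CHILD_CODE script and peak_in_fresh_process helper are hypothetical, JSON on stdout stands in for the queue, and a Unix platform is assumed.

```python
import json
import subprocess
import sys

# Hypothetical child script: allocate N MB, then report its own raw ru_maxrss.
CHILD_CODE = """
import json, resource, sys
payload = b"x" * (int(sys.argv[1]) * 2**20)
usage = resource.getrusage(resource.RUSAGE_SELF)
print(json.dumps({"maxrss_raw": usage.ru_maxrss}))
"""


def peak_in_fresh_process(alloc_mb: int) -> int:
    """Run one measurement in its own interpreter and return the raw ru_maxrss."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(alloc_mb)],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)["maxrss_raw"]


# Each call starts from a new process, so one run's peak does not
# accumulate into the next run's reading within the same interpreter.
first = peak_in_fresh_process(64)
second = peak_in_fresh_process(1)

print(f"first={first} second={second} (platform-specific units)")
```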

q2m3.profiling.orchestrator.run_both_modes(mol_key, n_est, n_trotter, ir_dir=None, on_progress=None)

Run fixed and dynamic modes in isolated subprocesses with parent-side monitoring.

Returns:

Tuple of (result_fixed, result_dynamic, parent_data_fixed, parent_data_dynamic).

Parameters:

mol_key (str)
n_est (int)
n_trotter (int)
ir_dir (str | None)
on_progress (Callable[[str], None] | None)

Return type:

tuple[ProfileResult, ProfileResult, dict, dict]

q2m3.profiling.orchestrator.run_sweep(mol_key='h2', mode='dynamic', grid=None, on_progress=None)

Run parameter sweep with subprocess isolation.

Parameters:

mol_key (str) – Key into MOLECULES dict.
mode (str) – “dynamic” or “fixed”.
grid (list[tuple[int, int]] | None) – List of (n_est, n_trotter) pairs. Defaults to H2_SWEEP_GRID.
on_progress (Callable[[str], None] | None) – Optional callback for progress messages.

Returns:

Dict mapping (n_est, n_trotter) to ProfileResult for successful runs. Failed grid points are omitted from the result and logged at warning level.

Return type:

dict[tuple[int, int], ProfileResult]
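The sweep contract (an injectable progress callback, results keyed by grid point, failures skipped with a warning) can be sketched independently of the profiling itself. Everything here is hypothetical: sweep, fake_run, the logger name, and the message format are illustrative stand-ins, and the real run_sweep executes each point in a subprocess rather than calling a function directly.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("sweep_sketch")


def sweep(grid, run_one, on_progress: Optional[Callable[[str], None]] = None):
    """Hypothetical sweep loop: maps (n_est, n_trotter) to results, skipping failures."""
    results = {}
    for n_est, n_trotter in grid:
        if on_progress is not None:
            on_progress(f"n_est={n_est} n_trotter={n_trotter}")
        try:
            results[(n_est, n_trotter)] = run_one(n_est, n_trotter)
        except Exception as exc:
            # Failed points are left out of the result and reported via logging.
            logger.warning("skipping (%d, %d): %s", n_est, n_trotter, exc)
    return results


def fake_run(n_est: int, n_trotter: int) -> int:
    """Stand-in for one profiling run; fails for large n_trotter."""
    if n_trotter > 2:
        raise MemoryError("simulated failure")
    return n_est * n_trotter


messages: list[str] = []
results = sweep([(2, 1), (2, 3), (3, 2)], fake_run, on_progress=messages.append)

print(results)
```

Passing messages.append as the callback shows why the injectable design is convenient: the same loop can feed a list in tests, a logger in batch jobs, or a progress bar in an interactive front end.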