Skip to content

Pcpu delta spike#423

Draft
asmacdo wants to merge 10 commits intocon:mainfrom
asmacdo:pcpu-delta-spike
Draft

Pcpu delta spike#423
asmacdo wants to merge 10 commits intocon:mainfrom
asmacdo:pcpu-delta-spike

Conversation

@asmacdo
Copy link
Copy Markdown
Member

@asmacdo asmacdo commented Apr 28, 2026

Warning: PR description is currently just accumulated notes from claude-code session. Some inaccuracies to clean up, lots of cruft to remove.

Linux delta-pcpu sampler (fixes #399)

Why

ps -o pcpu on Linux reports a process's lifetime average
(cputime / realtime), not instantaneous CPU. Summing it across a
churn of short-lived high-CPU workers is unbounded — duct logged
5363% on a 20-core box (#399). See
findings/calc-sketch-linux-pcpu.md (notebook worktree) for the
full mechanism.

Changes

  • instantaneous-pcpu calculation For each pid, recover cumulative cputime
    from pcpu × etimes by taking a delta across consecutive samples, divide
    by delta etimes.
  • Audit-trail fields. ProcessStats gains pcpu_raw (the
    unmodified ps value) and etimes (float seconds). pcpu's
    semantics shift to "corrected when computable, equals
    pcpu_raw at fallback." The string etime is retained.
  • Fallback policy. When a pid is first observed, or
    Δetimes ≤ 0, pcpu falls back to pcpu_raw rather than 0.0.
    Silently reporting 0% for a churning workload would mislead
    worse than carrying the lifetime ratio. Tradeoff is documented
    in Limitations below.
  • State plumbing. The previous-sample raw values live on the
    Tracker (which already holds cross-sample state); Linux/Mac
    sampler functions stay pure with signature
    (session_id, prev) -> (Optional[Sample], prev').
  • Drop young pids (smon's pattern). Pids whose ps etime
    reads "00:00" (too young for ps to report a non-zero elapsed
    time) are skipped from the sample. Mitigates oddity from observing the plot: add max's to legend, and check what is "total" pcpu/plot average? #399's
    churn-of-short-lived-workers case more aggressively than the
    pcpu_raw fallback alone — inflated lifetime ratios from
    brand-new pids never enter the sum at all. Toggled by a
    module-level DROP_YOUNG_PIDS constant (default True); see
    TODOs for follow-up.
  • Empty-sample handling. When every pid in a ps round is
    filtered (common at process startup before etime ticks past
    1s), _get_sample_linux now returns (None, new_prev) instead
    of an empty Sample. Mirrors the existing _get_sample_mac
    no-pids path so the monitor loop skips the round rather than
    crashing on Averages.from_sample's not-None assertions.
  • Linux only. macOS ps -o pcpu is a 5/8-aging exponential
    average, not lifetime-cumulative; the delta math is invalid
    there. Darwin keeps current behavior; finding documented in
    findings/darwin-pcpu.md.
  • Schema bump 0.2.2 → 0.3.0. Reflects the additive fields and
    the pcpu/total_pcpu semantics shift.
  • Sub-1s sample-interval warning. cli.py::run_command emits
    a one-shot lgr.warning when --sample-interval < 1.0,
    flagging that phase artifacts (false pcpu spikes) are likely on
    Linux. Suppressed under --quiet / --log-level NONE.

Empirical validation

Workload: 4s busy + 10s idle Python loop, --s-i 1 --r-i 1.

peak pcpu average pcpu
main (lifetime calc) 300% 80.1%
this PR 100% 27.7%
ground truth ~100% 28.6%

Wall-time identical → no measurable overhead. Logs:
~/devel/duct/.duct/logs/2026.04.27T20.00.28-164596_* (main),
~/devel/duct/.worktrees/pcpu-delta-spike/.duct/logs/2026.04.27T20.00.07-164502_*.

Limitations (honest)

Captured in PROBLEMS.md (notebook worktree):

  • Sub-interval spikes are smoothed. A 0.2s 100% burst inside a
    1s sample reads as 20%. Strictly better than the lifetime
    method for late-run spikes, but lower-resolution than a
    hypothetical /proc/<pid>/stat source.
  • Short-lived workers lose their audit record entirely.
    DROP_YOUNG_PIDS discards the whole record (including
    cmd/rss/vsz/pmem), not just pcpu, for any pid ps
    reports with etime="00:00". Tradeoff in favor of mitigating
    oddity from observing the plot: add max's to legend, and check what is "total" pcpu/plot average? #399 churn over preserving full visibility on every pid. Pids
    that survive long enough for etime to tick past 1s are
    observed normally on subsequent samples.
  • First-observed pids with etime > 0 still fall back to
    pcpu_raw.
    A pid that's been alive for > 1s before duct
    first samples it has no prior to delta against, so its first
    recorded pcpu is the lifetime ratio. Less common than the
    etime="00:00" case but not eliminated by DROP_YOUNG_PIDS.
    Bounds-style representation (pcpu_lower/pcpu_upper) is the
    long-term direction; deferred to a follow-up PR.
  • Sub-second sample intervals produce phase artifacts. ps etime resolution is 1s, so when configured sample-interval is
    below that, Δetime collapses to 0 between many sample pairs.
    The corrected calc falls back to pcpu_raw (the lifetime
    ratio) on those samples, which can show as false pcpu spikes —
    especially for short-lived pids whose lifetime ratio is
    artificially high. At default 1.0s the issue does not occur in
    practice (positive overhead keeps actual interval ≥ 1s);
    cli.py warns when the configured value is below 1.0s.
  • First-sample volatility from procps. Occasionally the very
    first record for a brand-new pid reports nonsense etime
    (~10⁶-year readings observed once across ~7 manual runs). The
    fallback protects the calc, but the raw value persists in the
    audit record. Mitigation deferred.

Schema and back-compat

  • New required fields on ProcessStats: pcpu_raw: float,
    etimes: float. Additive in JSON.
  • pcpu and total_pcpu keep their names; semantics shift on
    Linux (corrected) and are unchanged on Darwin (lifetime).
  • __schema_version__ bumped to 0.3.0.
  • No deserialization shim is shipped: nothing in
    con-duct pp / plot / ls reconstructs ProcessStats from
    parsed JSON — all consumers read dicts. Old usage.jsonl files
    remain plottable; their pcpu values continue to reflect the
    old lifetime semantics (correct for files written by older
    duct).
  • Mixed-version caveat: users who have both new and old
    con-duct versions in rotation should be aware that
    pcpu/total_pcpu numbers from new-duct runs are not
    comparable to numbers from old-duct runs on the same workload.

TODOs left for follow-up

  • If a future consumer ever deserializes ProcessStats from
    parsed JSON, it will need defaults for pcpu_raw/etimes on
    pre-0.3.0 records. None exist today; flagged so the back-compat
    cost is visible if one is added.
  • ls.py:113-119 already branches on parse_version for
    pre-0.2.1 and pre-0.2.2 info.json files. If usage.jsonl ever
    needs analogous version-aware loading, the same pattern
    applies.
  • Promote DROP_YOUNG_PIDS from a module-level constant to an
    env var + CLI flag.
    Today the toggle lives only in
    _constants.py and is overridden in tests via a session-scoped
    monkeypatch in test/conftest.py. That mechanism doesn't reach
    subprocesses, which is why test_e2e.py had to bump several
    fixture sleeps to ≥1s instead. The proper fix is the same
    pattern as DUCT_SAMPLE_INTERVAL / DUCT_REPORT_INTERVAL:
    env-var-driven default in _constants.py, paired CLI flag in
    cli.py, parameter threaded through duct_executeReport
    _get_sample_linux. Once that lands, the e2e fixture sleeps
    can return to short values and the conftest override can use
    the env var directly. TODO (post-merge): file the issue
    using the draft at
    .worktrees/resource-measurement-notebook/FOLLOWUP_ISSUE_drop_young_pids.md
    — only if this PR is accepted upstream.
  • Consider bumping --sample-interval default from 1.0s to
    2.0s.
    At configured 1.0s with positive overhead the corrected
    calc is reliable (Δetime ≥ 1 in steady state), so the bump is
    not a correctness fix — it's a conservatism call that gives
    users an extra safety margin away from the 1s etime quantum.
    Pushing the default also pushes the observability floor (sub-2s
    pids become single-sample with pcpu_raw fallback). Worth a
    separate discussion / PR.
  • docs/resource-statistics.md rewrite on top of the new calc.
    Will supersede Document what duct's resource statistics actually measure #414. Tracked separately.

Test coverage

  • test/duct_main/test_duct_utils.py — parser
    (etime_to_etimes) and calculator (instantaneous_pcpu)
    green/red paths.
  • test/duct_main/test_aggregation.py and test_report.py
    fixtures updated for the new fields.
  • test/duct_main/test_sampling.py (new) — three-call test of
    _get_sample_linux with mocked ps: first-observation
    fallback, corrected pcpu via prev lookup, pid-disappearance
    handling in the rebuilt new_prev, and etime="00:00"
    drop assertions. Plus a dedicated all-young-dropped test
    asserting (None, new_prev) is returned. Both opt into
    DROP_YOUNG_PIDS=True via the enable_drop_young_pids
    fixture (the conftest disables the filter session-wide for
    in-process tests).
  • test/duct_main/test_report.py::test_collect_sample_round_trips_prev_raw
    (new) — regression guard for the load-bearing decision that
    the Tracker holds the cross-sample state.
  • test/conftest.py gains a session-autouse fixture that
    flips DROP_YOUNG_PIDS=False for in-process tests (so
    pre-existing fixtures that rely on observing sub-second
    workloads keep working) and a function-scoped opt-in
    fixture for the unit tests above.
  • test/duct_main/test_e2e.py — sample-asserting tests
    (test_spawn_children, test_session_modes,
    test_session_mode_behavior_difference) bumped from
    sub-second to sleep 2 so the workload outlives ps's
    etime quantum. Tests that don't assert on samples
    (test_sanity, test_logging_levels) keep their
    original short sleeps. The empty-sample (None, new_prev)
    return in _get_sample_linux keeps duct exiting cleanly
    on those short-sleep tests rather than crashing the
    monitor thread.

A multi-step trajectory test was considered and dropped: the
sampler rebuilds new_prev from scratch on each call, so a
3-step series cannot fail when a 2-step one passes. The
parameterized cases in test_instantaneous_pcpu_green already
cover the math.

Commits

  1. d6054d6 parser: etime_to_etimes
  2. 56bae66 math: instantaneous_pcpu calculator
  3. 7ec330f fallback: instantaneous_pcpu returns
    curr_pcpu (not None)
  4. ab382ad schema: persist pcpu_raw and etimes on
    ProcessStats
  5. a1e61ae sampling: wire delta-pcpu into the Linux sampler
  6. 3953290 constants: bump schema_version to 0.3.0
  7. dd18f13 tests: cover prev-state threading in
    _get_sample_linux
  8. ba24400 tests: cover prev_raw round-trip in
    Report.collect_sample
  9. a945ea6 sampling: drop young pids (etime=="00:00") by
    default (also brings the empty-sample None-return,
    conftest override, and targeted e2e sleep bumps)
  10. 0d742f5 cli: warn on sub-1s sample-interval

Each commit is atomic and pre-commit-clean; the series reads as a
coherent fix unit per Operating principles.

asmacdo and others added 10 commits April 27, 2026 17:55
Parses ps's elapsed-time string format ([[DD-]HH:]MM:SS) into float
seconds. The downstream instantaneous-pcpu calculation needs a
numeric elapsed time, but ps -o etime emits only the formatted
string. A future change may switch to ps -o etimes (which ps can
produce directly as seconds); the parser stays useful either way
for reading existing usage.jsonl logs that persist the string form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure function that recovers instantaneous %CPU between two ps
samples of the same pid by inverting the procps cumulative-pcpu
identity: cputime = pcpu * etime / 100. Subtracting cputime at two
samples and dividing by the interval gives a value in the same
units as the input pcpu.

Linux-only by assumption (Darwin's ps reports a 5/8-decayed EWMA,
not a cumulative ratio, so the identity does not hold there).
Returns None when the elapsed-time interval is non-positive --
either the pid has not aged a full ps tick, or etimes regressed
(suspected pid reuse).

This commit only adds the function and its tests; wiring into the
sampler is a separate change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add two required fields to ProcessStats so the sampler keeps a
parsed numeric elapsed time alongside ps's raw outputs:

- etimes: float seconds derived from etime via etime_to_etimes.
  Used by the upcoming delta-pcpu calc; etime (string) is retained
  in this commit for back-compat with existing usage.jsonl readers.
- pcpu_raw: copy of the value ps reported. Equal to pcpu in this
  commit; pcpu will diverge once the delta calc is wired and
  pcpu_raw becomes the audit trail for the derivation.

Both samplers (Linux and Darwin) populate the new fields. Aggregate
takes max for pcpu_raw and the later sample's etimes, mirroring
existing behavior for pcpu and etime. Test fixtures across
test_aggregation and test_report pick up matching values.

A backwards-compat layer for existing usage.jsonl logs that pre-date
these fields is intentionally deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous semantics returned None when the elapsed-time interval
was non-positive (etimes did not advance, or regressed). That
forced every caller into the same fallback boilerplate.

New semantics: return curr_pcpu in the non-positive case. A
sophisticated consumer can still detect "fallback was used" by
comparing the result against pcpu_raw post-hoc; the audit data is
unchanged. The function's return type tightens from float | None
to float, and the only existing caller (the test suite) is updated.

Rationale: returning curr_pcpu is the conservative answer that
keeps the value honest about "something is here" rather than
emitting a silent zero -- avoiding the false-zero failure mode of
under-reporting churning workloads as idle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Linux _get_sample_linux now accepts a prev-state dict mapping
pid -> (pcpu_raw, etimes), looks up each pid's prior raw inputs,
and computes corrected pcpu via instantaneous_pcpu. First-observed
pids fall back to pcpu_raw (no prior to delta against). The dict
is rebuilt fresh from the current sample, so pids that disappear
between samples are dropped automatically -- no unbounded growth
across long runs.

Mac sampler matches the new signature for uniform dispatch but
returns prev unchanged; Darwin's ps reports a 5/8-decayed EWMA,
not a cumulative ratio, so the delta calc does not apply there.

Report holds the prev_raw dict across collect_sample calls,
initialized empty in __init__. No public-API changes; tests that
construct Sample/ProcessStats directly are untouched.

Limitations (also captured in PROBLEMS.md in the
resource-measurement-notebook worktree):
- For workloads where no pid survives long enough for etimes to
  advance past the previous sample's value (the con#399
  trigger case of sequentially-churning short-lived workers),
  every observation falls back to pcpu_raw and the calc provides
  no improvement.
- Sub-second sample intervals frequently land both samples in the
  same etime quantum, yielding the fallback path.
- The persisted pcpu_raw alongside the corrected pcpu lets a
  consumer detect "fallback was used" by comparing the two.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ProcessStats gained two required fields (pcpu_raw, etimes) and
pcpu's semantics shifted on Linux from "lifetime average" to
"corrected when computable, equals pcpu_raw at fallback." The
schema is additive on the JSON wire (existing keys keep their
names), but consumers reading new files with old expectations
should know the meaning of pcpu/total_pcpu has changed --
hence the minor bump.

ls.py's ensure_compliant_schema patches info.json fields only
(working_directory at 0.2.1, message at 0.2.2). No 0.3.0 entry
is needed there: the additions live on usage.jsonl ProcessStats,
and no consumer in this codebase reconstructs ProcessStats from
parsed JSON (all read dicts directly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The unit tests in test_duct_utils.py exercise instantaneous_pcpu
in isolation, but nothing covers the wiring layer between it and
the live sampler -- the dict-keyed prev lookup, the
first-observation fallback, and the rebuild-from-scratch behavior
of new_prev that drops vanished pids.

Three-call test with a mocked ps subprocess:
- Call 1: empty prev -> pcpu falls back to pcpu_raw.
- Call 2: corrected via instantaneous_pcpu (matches the constant
  84% case from the parameterized green tests).
- Call 3: pid disappears, new pid appears -> new_prev contains
  only the surviving pid.

Catches regressions where the (pcpu_raw, etimes) tuple order is
swapped, the prev lookup is dropped, or new_prev accumulates
stale entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One of the load-bearing decisions in the delta-pcpu work is that
the Tracker holds the cross-sample state (rather than a
module-level dict or a sampler-as-class). Without a regression
test, refactoring self.prev_raw to a local would silently disable
the calc on every sample after the first.

Two-call test mocking _get_sample with a side_effect that
inspects each invocation's input prev. Asserts the first call
gets {} and the second gets the dict returned from the first;
asserts report.prev_raw lands on the latest returned value after
each call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adopts smon's pattern of skipping pids whose ps elapsed-time
reads "00:00" -- pids too young for ps to report a meaningful
etime. Mitigates the con#399 churn-of-short-lived-workers
case more aggressively than the pcpu_raw fallback alone, by
preventing inflated lifetime ratios from ever entering the
sample.

Tradeoff: loses the entire record (cmd/rss/vsz/pmem) for any
pid ps reports with etime=="00:00", not just the pcpu field.
Pids that survive long enough for etime to tick past 1s are
observed normally on subsequent samples.

Toggled by DROP_YOUNG_PIDS in _constants.py rather than a CLI
flag for now. Promoting it to an env-var + CLI flag (matching
the DUCT_SAMPLE_INTERVAL / DUCT_REPORT_INTERVAL pattern) is
tracked as a follow-up; see PR description.

Empty-sample handling:
_get_sample_linux now returns (None, new_prev) when every pid
in a ps round is filtered (common at process startup before
etime ticks past 1s). Mirrors the existing _get_sample_mac
no-pids path so the monitor loop skips the round rather than
crashing on Averages.from_sample's not-None assertions. The
return type narrows from Sample to Optional[Sample] to match.

Tests:
- test_sampling.py grows a young-pid line in the threading test
  (asserts both sample and new_prev exclude it) and a dedicated
  all-young-dropped test (asserts the (None, new_prev) return).
  Both opt into DROP_YOUNG_PIDS=True via enable_drop_young_pids
  applied as a usefixtures mark.
- test/conftest.py gains a session-autouse fixture that flips
  DROP_YOUNG_PIDS=False for in-process tests (so pre-existing
  fixtures asserting on sub-second workloads keep working) and
  a function-scoped enable_drop_young_pids opt-in for unit tests
  that specifically exercise the drop. The session override
  reaches in-process tests only; subprocess tests in test_e2e.py
  inherit the production default.
- test_e2e.py: sample-asserting tests (test_spawn_children,
  test_session_modes, test_session_mode_behavior_difference)
  bumped from sub-second to >=1s sleeps. Tests that don't assert
  on samples (test_sanity, test_logging_levels) keep their short
  sleeps -- the empty-sample handling makes them exit cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ps reports etime at 1-second resolution, so configured
sample-interval below 1s produces phase artifacts in the
corrected pcpu calc -- adjacent sample pairs frequently land
in the same etime quantum, the calc falls back to pcpu_raw
(lifetime ratio), and short-lived pids with high startup CPU
show as false spikes.

At the default 1.0s with positive sampling overhead the issue
does not occur in practice (actual interval >= 1s, Δetime >= 1
in steady state). The warning fires only when the user
explicitly configures a sub-1s interval; --quiet and
--log-level NONE suppress it through the normal log machinery.

Bumping the default to 2.0s for an extra safety margin is
captured as a TODO in the PR description; not a correctness
fix, just a conservatism call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 96.22642% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.99%. Comparing base (2162da9) to head (0d742f5).
⚠️ Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
src/con_duct/cli.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #423      +/-   ##
==========================================
+ Coverage   91.87%   91.99%   +0.12%     
==========================================
  Files          15       15              
  Lines        1120     1162      +42     
  Branches      139      147       +8     
==========================================
+ Hits         1029     1069      +40     
- Misses         69       71       +2     
  Partials       22       22              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

oddity from observing the plot: add max's to legend, and check what is "total" pcpu/plot average?

1 participant