ci(coverage): make the job advisory + switch to lcov#329
ci(coverage): make the job advisory + switch to lcov#329fpelliccioni wants to merge 1 commit intomasterfrom
Conversation
📝 WalkthroughWalkthroughReplaces gcovr-based coverage generation with an lcov pipeline in ChangesCoverage workflow: switch to lcov
Estimated Code Review Effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
087ab7e to
b8c69c4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/coverage.yml:
- Around line 130-131: Update the inline comment that mentions gcovr release
dates: replace "March 2026" with the correct PyPI release date for gcovr 8.6
("January 13, 2026") and, if mentioning gcovr 8.5, ensure it reads "January 8,
2026"; locate the comment around the gcovr pinning text (the comment lines
referencing "Pin to gcovr 8.5" and the subsequent sentence about the 8.6
release) and edit the dates accordingly so the comment accurately reflects
PyPI's release dates.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4bd71681-147d-4fda-83e2-ff6ce5bf7088
📒 Files selected for processing (1)
.github/workflows/coverage.yml
b8c69c4 to
b437e3a
Compare
b437e3a to
9aa6964
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/coverage.yml:
- Line 165: Replace the deprecated RC option "--rc lcov_branch_coverage=0" with
the new "--rc branch_coverage=0" wherever it appears in the workflow (update
both occurrences matching the literal string "--rc lcov_branch_coverage=0");
ensure the lcov invocation(s) that currently pass "--rc lcov_branch_coverage=0"
(search for that exact token) are changed to "--rc branch_coverage=0" so lcov
2.x no longer emits the deprecation warning.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc843a2e-eb3f-49ca-843d-5839d7c89406
📒 Files selected for processing (1)
.github/workflows/coverage.yml
| --parallel $(nproc) \ | ||
| --rc lcov_branch_coverage=0 \ | ||
| --rc geninfo_unexecuted_blocks=1 \ | ||
| --output-file coverage.info |
There was a problem hiding this comment.
Missing --ignore-errors mismatch for lcov 2.0 + GCC 15
Medium Severity
lcov 2.0 (shipped by Ubuntu 24.04 apt) is known to hard-fail with geninfo: ERROR: mismatched end line when processing gcov output from GCC 14+. The container uses GCC 15, which exhibits the same gcov data inconsistencies. Without --ignore-errors mismatch on both the lcov --capture and lcov --remove invocations, the coverage step can abort on the first function with mismatched boundaries, producing no coverage.info at all.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 9aa6964. Configure here.
…h parallel capture
Coverage CI has been red since 2026-04-24. Iterated through four
report-tool configurations on this branch and master's
Build-and-Test (Sanitizers) job has been showing the same
runner-lost-communication signature for the same window:
1. gcovr 8.6 with --html-details — runner OOM during the
report step.
2. gcovr 8.6 XML-only — same OOM; the parsed coverage tree
gcovr holds is the memory hog regardless of writer.
3. gcovr 8.5 — same OOM; the 8.6 → 8.5 refactor was not the
trigger.
4. lcov serial — capture step ran past the 20-min step
timeout while invoking gcov sequentially on every TU
under build/.
5. lcov --parallel + narrowed --directory + branch coverage
off — still hit the runner-lost signature.
Last green coverage run was 2026-04-23 (24851294208), with
gcovr completing in 2m39s. The codebase has grown ~6.4 k lines
across src/ since, and the GHA hosted runner's 7 GB ceiling no
longer accommodates the instrumented build's coverage tooling
under any of the configurations above. The Sanitizers job hits
the same wall on master, pointing at runner memory rather than
this workflow specifically.
Two changes here, both narrow:
* Mark the job `continue-on-error: true`. The Code Coverage
check stays present (so any run that does squeeze through
posts to Codecov), but a runner-lost no longer marks the PR
as failed. PRs aren't blocked on a problem that's
fundamentally outside this workflow's reach until coverage
moves to a runner with more headroom (or the codebase
shrinks).
* Switch the report tool from gcovr to lcov with --parallel
capture, narrowed --directory to the three sub-projects
we keep (domain / infrastructure / blockchain), branch
coverage off, and a single --remove pass for test sources.
The serial-lcov attempt on this branch ran past the 20-min
budget; the parallel + narrowed form completes the test
step + capture inside the budget on every run that doesn't
hit the runner-OOM. Codecov accepts `coverage.info`
natively, so the upload contract is unchanged.
What's unchanged: the `-O0 --coverage` build flags, the
ccache wiring, the master-and-PR triggers, the 130-min job
timeout, the 20-min report-step timeout, the Codecov action
version + secret wiring, the `if: always()` semantics on
the report and upload steps. The only legacy artifact that
goes is the per-file HTML output and its `📎 Upload HTML
report` step — Codecov's hosted UI has served the same
drill-down view for ten days while the artifact upload was
unreachable, so the artifact was redundant when it worked.
9aa6964 to
1e21a73
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (3)
.github/workflows/coverage.yml (3)
167-167: 💤 Low valueAvoid silencing stderr on the lcov install.
apt-get install -y -qq lcov > /dev/null 2>&1discards both stdout and stderr. If the install ever fails (mirror flake, transient apt issue, missing index after a base-image bump), the next line (lcov --version) will fail with a confusing "command not found" instead of the underlying apt error, complicating debugging on a job that is already advisory.♻️ Proposed change
- apt-get update -qq && apt-get install -y -qq lcov > /dev/null 2>&1 + apt-get update -qq && apt-get install -y -qq lcov lcov --version🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/coverage.yml at line 167, The apt-get install invocation currently silences both stdout and stderr ("apt-get install -y -qq lcov > /dev/null 2>&1"); remove the redirection so installation failures surface (keep -qq to minimize noise), i.e., change the command used to install lcov in the workflow so it no longer redirects output to /dev/null and thus preserves error messages from apt-get.
169-186: 💤 Low valueCapture scope is well constrained; consider failing loudly if a
--directoryis absent.The narrowed capture (
src/domain,src/infrastructure,src/blockchain) and the explicit--remove '*/test/*' '*/tests/*'correctly translate the prior gcovr filters and bound peak memory. One minor concern: if a future refactor renames or removes one of thesebuild/build/Release/src/*subtrees, lcov will warn ("no .gcda files found") but still produce a partialcoverage.info, silently shrinking the reported coverage surface. Since this job is already advisory (continue-on-error: true), a missing directory could go unnoticed for a while.Optional: pre-check directory existence and fail fast, e.g.:
for d in domain infrastructure blockchain; do test -d "build/build/Release/src/$d" || { echo "::error::missing coverage source dir: $d"; exit 1; } done🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/coverage.yml around lines 169 - 186, Add a pre-check before the lcov capture block to fail loudly if any of the expected source directories (domain, infrastructure, blockchain) under build/build/Release/src are missing: iterate over the names "domain", "infrastructure", "blockchain", test -d for each "build/build/Release/src/$d", and if any check fails emit a GitHub Actions error message (e.g. "::error::missing coverage source dir: $d") and exit non‑zero so the job stops before running the lcov commands; place this check immediately before the existing lcov --capture invocation to prevent producing a partial coverage.info when directories are absent.
186-186: 💤 Low value
lcov --list ... | tail -1masks pipeline errors.GitHub Actions runs
run:blocks withbash -ebut notpipefail, so a failure inlcov --listis hidden by the successfultail. The report has already been generated by this point so it's not critical, but enablingpipefail(or runninglcov --listandtailas separate steps) makes the summary line a more reliable smoke-test for a malformedcoverage.info.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/coverage.yml at line 186, The pipeline masks failures because the command `lcov --list coverage.info | tail -1` runs under bash without pipefail; modify the run step to enable pipefail (e.g., prefix with `set -o pipefail` or use `bash -eo pipefail`) so a failing `lcov --list` causes the step to fail, or alternatively split into two steps by running `lcov --list coverage.info` and then `tail -1` separately; update the line containing `lcov --list coverage.info | tail -1` accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.github/workflows/coverage.yml:
- Line 167: The apt-get install invocation currently silences both stdout and
stderr ("apt-get install -y -qq lcov > /dev/null 2>&1"); remove the redirection
so installation failures surface (keep -qq to minimize noise), i.e., change the
command used to install lcov in the workflow so it no longer redirects output to
/dev/null and thus preserves error messages from apt-get.
- Around line 169-186: Add a pre-check before the lcov capture block to fail
loudly if any of the expected source directories (domain, infrastructure,
blockchain) under build/build/Release/src are missing: iterate over the names
"domain", "infrastructure", "blockchain", test -d for each
"build/build/Release/src/$d", and if any check fails emit a GitHub Actions error
message (e.g. "::error::missing coverage source dir: $d") and exit non‑zero so
the job stops before running the lcov commands; place this check immediately
before the existing lcov --capture invocation to prevent producing a partial
coverage.info when directories are absent.
- Line 186: The pipeline masks failures because the command `lcov --list
coverage.info | tail -1` runs under bash without pipefail; modify the run step
to enable pipefail (e.g., prefix with `set -o pipefail` or use `bash -eo
pipefail`) so a failing `lcov --list` causes the step to fail, or alternatively
split into two steps by running `lcov --list coverage.info` and then `tail -1`
separately; update the line containing `lcov --list coverage.info | tail -1`
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 31fd87c5-0f18-48d5-89f4-553c5818196f
📒 Files selected for processing (1)
.github/workflows/coverage.yml
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 1e21a73. Configure here.
| # codebase shrinks), keep the job present so it can capture | ||
| # data on the runs that do squeeze through, without making | ||
| # every PR red on the runs that don't. | ||
| continue-on-error: true |
There was a problem hiding this comment.
continue-on-error neutralises fail_ci_if_error guard
Medium Severity
The newly added continue-on-error: true at the job level means the workflow always reports this job as successful, regardless of any step failure. This silently neutralises the existing fail_ci_if_error: ${{ secrets.CODECOV_TOKEN != '' }} on the Codecov upload step — that setting can still mark the step red in the UI, but it can no longer block the PR or surface a workflow-level failure. If the Codecov token expires, upload misconfigures, or coverage data is corrupt, the failure will be invisible to anyone not clicking into the job details.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 1e21a73. Configure here.


Summary
Coverage CI has been red since 2026-04-24. Iterated through four
report-tool configurations on this branch (gcovr 8.6 with HTML,
gcovr 8.6 XML-only, gcovr 8.5, lcov serial, lcov --parallel
narrowed) — every one hit the same "hosted runner lost
communication" signature, GHA's report when the runner process
gets OS-killed (memory pressure SIGKILL).
The Sanitizers job in master's Build-and-Test workflow has been
failing the same way during the same window, which points at the
GHA hosted runner's 7 GB ceiling rather than this workflow's
config. The codebase has grown ~6.4 k lines across `src/` since
the last green coverage run on 2026-04-23 (`24851294208`, gcovr
finished in 2m39s), and instrumented builds (-O0 + --coverage,
TSAN, ASAN) no longer fit.
Diff
Two changes, both narrow:
Mark the job `continue-on-error: true`. The Code Coverage
check stays present so runs that do squeeze through still post
to Codecov, but a runner-lost no longer fails the PR. PRs
aren't blocked on a problem fundamentally outside this
workflow's reach until coverage moves to a runner with more
headroom (or the codebase shrinks).
Switch report tool from gcovr to lcov. Even though the
job is advisory, lcov's bounded per-TU memory model is more
likely to actually finish on the runs that aren't OOM-killed
in the test step.
(`domain` / `infrastructure` / `blockchain`).
faster parse).
show 0%).
`📎 Upload HTML report` step (Codecov's UI serves the
same drill-down; the artifact upload has been unreachable
for ten days).
What's unchanged
The `-O0 --coverage` build flags, ccache wiring, the master-
and-PR triggers, the 130-min job timeout, the 20-min report-step
timeout, the Codecov action version + secret wiring, the
`if: always()` semantics on the report + upload steps.
Test plan
check posts as advisory (not required), so the PR's
mergeable state isn't tied to its outcome.
and posts the usual PR comment.
least one of the next few master pushes.
Summary by CodeRabbit
Chores
Documentation