CI speedups #7407

Open
christianbaroni wants to merge 11 commits into @christian/fix-e2e-brittleness from @christian/ci-speedups

christianbaroni (Member) commented Apr 29, 2026

What changed (plus any additional context for devs)

Adds duration-based e2e sharding and persisted Prettier/ESLint caches on top of #7406.

E2E sharding now runs as a feedback loop instead of a fixed file-order split: each successful run writes updated pass/retry timings to timings.tsv, and later runs use those weights to keep the shard workload balanced as flow runtimes change.

Shard balancing

  • scripts/e2e-run.sh plans shards from timing history, placing the highest-weight remaining flow on the shard with the lowest planned total
  • Each platform restores timing data from all four shard caches before planning, so every shard uses the same suite-wide history. Flows without history start at a 60s default weight.
  • Retry cost is tracked separately from pass duration. Retried flows affect the next plan, but retry time is capped at the passing duration and smoothed across runs.
  • E2E artifacts include plan.tsv and timings.tsv, which show the produced shard assignment and the timing data saved for later runs
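
The planning and retry rules above can be sketched roughly as follows. This is an illustrative Python sketch, not the contents of scripts/e2e-run.sh: the function names, the smoothing factor of 0.5, the exponential-average form of the smoothing, and the choice to add the smoothed retry cost to the pass duration are all assumptions. Only the greedy placement rule, the 60s default, and the cap at the passing duration come from the description.

```python
import heapq

DEFAULT_WEIGHT_S = 60.0  # from the PR: flows without history start at 60s
SMOOTHING = 0.5          # assumption: the PR does not state a smoothing factor


def plan_shards(flows, history, shard_count=4):
    """Place the heaviest remaining flow on the shard with the lowest planned total."""
    weights = {flow: history.get(flow, DEFAULT_WEIGHT_S) for flow in flows}
    # Heaviest first; ties broken by name so the plan is deterministic.
    ordered = sorted(flows, key=lambda flow: (-weights[flow], flow))
    # Heap entries are (planned_total, shard_index); the lightest shard pops first.
    heap = [(0.0, index) for index in range(shard_count)]
    shards = [[] for _ in range(shard_count)]
    for flow in ordered:
        total, index = heapq.heappop(heap)
        shards[index].append(flow)
        heapq.heappush(heap, (total + weights[flow], index))
    totals = [sum(weights[flow] for flow in shard) for shard in shards]
    return shards, totals


def next_weight(previous_retry_s, pass_s, retry_s):
    """Return (planning weight, smoothed retry cost) after one successful run."""
    capped = min(retry_s, pass_s)  # retry time is capped at the passing duration
    # Assumed smoothing: exponential moving average over runs.
    smoothed = SMOOTHING * capped + (1 - SMOOTHING) * previous_retry_s
    return pass_s + smoothed, smoothed
```

With this sketch, a flow that passed in 40s after 100s of retries would be planned at 60s on the next run (40s pass plus half of the capped 40s retry cost), so one bad run nudges the plan without dominating it.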

Lint

  • lint:ci uses cache-aware Prettier and ESLint commands with explicit .cache/prettier and .cache/eslint paths
  • The unit test workflow restores those caches from dependency, lint config, and TypeScript inputs, so warm PR runs skip unchanged files without reusing stale lint results
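
One way to picture the cache key described above is a hash over every file that can change lint results. The sketch below is an assumption about the general approach, not the workflow's actual key expression; `cache_key` and the example file names are hypothetical.

```python
import hashlib
from pathlib import Path


def cache_key(prefix, input_paths):
    """Derive a cache key from the contents of the files that affect lint output."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in input_paths):
        # Hash the path as well as the contents so renames also change the key.
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
        digest.update(b"\0")
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

A call such as `cache_key("lint", ["yarn.lock", "tsconfig.json"])` (hypothetical inputs) yields a new key whenever either file changes. A file that influences lint results but is missing from the input list will not invalidate the cache, which is where stale results would come from.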

Workflow cancellation

  • Android/iOS build, shard, and perf jobs use PR-number concurrency groups
  • Newer commits cancel older e2e shard jobs for the same PR shard
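
As a rough model of the cancellation behavior above: jobs that share a concurrency group are mutually exclusive, and a newer run replaces an older one in the same group. The group name format and both function names below are hypothetical; the sketch only illustrates why keying the group on PR number and shard keeps cancellation scoped to the same PR shard.

```python
def concurrency_group(workflow, pr_number, shard=None):
    """Build a group name keyed on workflow, PR number, and (optionally) shard."""
    parts = [workflow, f"pr-{pr_number}"]
    if shard is not None:
        parts.append(f"shard-{shard}")
    return "-".join(parts)


def runs_to_cancel(active_runs, new_run):
    """Return ids of in-progress runs that share the new run's group."""
    return [
        run["id"]
        for run in active_runs
        if run["group"] == new_run["group"] and run["id"] != new_run["id"]
    ]
```

Under this model, a new commit on one PR cancels only that PR's older run of the same shard; other shards and other PRs land in different groups and keep running.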

Results

Each value is the median of the latest five successful runs for each PR. Measurements use active GitHub job/step time and exclude runner/device wait time.

| Area | PR 7406 baseline | PR 7407 speedups | Change |
| --- | --- | --- | --- |
| Unit/lint job | 4:08 | 3:54 | 14s faster |
| Lint step only | 1:49 | 1:04 | 45s faster |
| Android slowest e2e shard | 9:10 | 7:48 | 1:22 faster |
| Android fastest/slowest gap | 3:33 | 1:30 | 2:03 tighter |
| iOS slowest e2e shard | 10:31 | 9:18 | 1:13 faster |
| iOS fastest/slowest gap | 4:56 | 0:49 | 4:07 tighter |
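
The Change column can be re-derived from the two median columns. The sketch below uses only the values in the results table; the helper names are illustrative.

```python
def to_seconds(clock):
    """Convert an 'm:ss' string to whole seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def to_clock(total_seconds):
    """Convert whole seconds back to 'm:ss'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


# (PR 7406 baseline, PR 7407 speedups), copied from the results table.
MEDIANS = {
    "Unit/lint job": ("4:08", "3:54"),
    "Lint step only": ("1:49", "1:04"),
    "Android slowest e2e shard": ("9:10", "7:48"),
    "Android fastest/slowest gap": ("3:33", "1:30"),
    "iOS slowest e2e shard": ("10:31", "9:18"),
    "iOS fastest/slowest gap": ("4:56", "0:49"),
}

CHANGES = {
    area: to_clock(to_seconds(baseline) - to_seconds(current))
    for area, (baseline, current) in MEDIANS.items()
}
```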

Sharding evidence

Planned weights come from plan.tsv. Measured totals come from timings.tsv. Weights and totals are seconds, ordered from oldest to newest successful run.

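The gap is the difference between the largest and smallest measured shard total in a run. Android drops from a 3:30 gap on run 1 to sub-minute gaps in runs 3-5. iOS drops from 1:13 to 0:12 on run 2 and stays at 0:15 or better through run 5.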

Android:

| Run | Planned shard weights | Measured flow totals | Fastest vs. slowest gap |
| --- | --- | --- | --- |
| 1 | 323 / 322 / 332 / 318 | 259 / 355 / 469 / 337 | 3:30 |
| 2 | 266 / 291 / 275 / 264 | 267 / 274 / 337 / 261 | 1:16 |
| 3 | 266 / 266 / 297 / 280 | 242 / 241 / 300 / 275 | 0:59 |
| 4 | 260 / 293 / 280 / 264 | 239 / 274 / 262 / 259 | 0:35 |
| 5 | 267 / 277 / 261 / 287 | 254 / 273 / 242 / 291 | 0:49 |

iOS:

| Run | Planned shard weights | Measured flow totals | Fastest vs. slowest gap |
| --- | --- | --- | --- |
| 1 | 407 / 441 / 420 / 421 | 422 / 454 / 415 / 381 | 1:13 |
| 2 | 421 / 407 / 425 / 418 | 410 / 408 / 398 / 404 | 0:12 |
| 3 | 419 / 408 / 419 / 416 | 416 / 416 / 409 / 417 | 0:08 |
| 4 | 418 / 409 / 419 / 417 | 418 / 413 / 420 / 414 | 0:07 |
| 5 | 415 / 405 / 417 / 426 | 427 / 420 / 433 / 418 | 0:15 |
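
The gap column in both tables is the largest measured shard total minus the smallest. A short sketch that recomputes it from the measured totals listed above (helper names are illustrative):

```python
# Measured flow totals in seconds, one tuple per run, copied from the tables.
ANDROID_MEASURED = [
    (259, 355, 469, 337),
    (267, 274, 337, 261),
    (242, 241, 300, 275),
    (239, 274, 262, 259),
    (254, 273, 242, 291),
]
IOS_MEASURED = [
    (422, 454, 415, 381),
    (410, 408, 398, 404),
    (416, 416, 409, 417),
    (418, 413, 420, 414),
    (427, 420, 433, 418),
]


def shard_gap(totals):
    """Seconds between the slowest and fastest shard in one run."""
    return max(totals) - min(totals)


def as_clock(seconds):
    """Format whole seconds as 'm:ss'."""
    return f"{seconds // 60}:{seconds % 60:02d}"


ANDROID_GAPS = [as_clock(shard_gap(run)) for run in ANDROID_MEASURED]
IOS_GAPS = [as_clock(shard_gap(run)) for run in IOS_MEASURED]
```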

github-actions Bot commented Apr 29, 2026

🧪 Flashlight Performance Report (AWS Device Farm)

🔀 Commit: 80ac448

📎 View Artifacts

| Metric | Current | Δ vs Baseline |
| --- | --- | --- |
| Time to Interactive (TTI) | 5611 ms | |
| Average FPS | 56.73 | |
| Average RAM | 394.5 MB | |

@github-actions

Launch in simulator or device for eb15bc3

christianbaroni force-pushed the @christian/ci-speedups branch from e2279de to 4933037 (April 29, 2026 04:03)
@github-actions

Launch in simulator or device for f6fe291

@github-actions

Launch in simulator or device for 26912a7

@github-actions

Launch in simulator or device for 2103b7c

@github-actions

Launch in simulator or device for f93f6d5

christianbaroni force-pushed the @christian/ci-speedups branch from 4b8b33a to 0f85ea3 (April 30, 2026 18:51)
@github-actions

Launch in simulator or device for 0418a29

@github-actions

Launch in simulator or device for 23bf33c

@github-actions

Launch in simulator or device for b67aa9f

github-actions Bot commented May 1, 2026

Launch in simulator or device for d8a47fe

github-actions Bot commented May 1, 2026

Launch in simulator or device for fde5471

github-actions Bot commented May 1, 2026

Launch in simulator or device for 8417987

github-actions Bot commented May 1, 2026

Launch in simulator or device for 943d247

github-actions Bot commented May 1, 2026

Launch in simulator or device for 2bdc0ab

christianbaroni force-pushed the @christian/ci-speedups branch from 4035340 to 364c760 (May 1, 2026 04:06)

github-actions Bot commented May 1, 2026

Launch in simulator or device for 80ac448

christianbaroni changed the title from "[Test] CI speedups" to "CI speedups" on May 1, 2026
olerass (Contributor) commented May 1, 2026

Can we split this into 3 PRs? There's some genuinely good stuff here, but some of it is worth discussing further and some is clearly just a win:

PR 1) Prettier caching -> I don't think this has any downsides? Seems like a genuinely all-positive improvement.

PR 2) ESLint fixes/caching -> also a good improvement, but it carries more risk around forgetting to add new files to the hash/cache keys. Also, historically I've seen this type of caching have issues with @typescript-eslint, causing flakiness or missing critical lints entirely. That doesn't mean it's not valuable, but it's maybe worth diving into this in isolation, I think.

PR 3) The E2E sharding. Seems novel and looks like it works, but given we're moving E2E out of PRs entirely soon, I'm not sure it's worth the complexity? No one will care if E2E is 25 or 30 mins per run when it's not blocking, and they shouldn't... Anyway, we should have this discussion in that PR/context and not block the other two because of it.

olerass (Contributor) commented May 1, 2026

Also why is this stacked on christian/fix-e2e-brittleness?

christianbaroni (Member, Author) commented

> Also why is this stacked on christian/fix-e2e-brittleness?

So that this can be tested without random failures
