CI speedups #7407
Can we split this into 3 PRs? There's some genuinely good stuff here, but some of it is worth more discussion and some is clearly just a win:
PR 1) Prettier caching -> I don't think this has any downsides? Seems like a genuine all-positive improvement.
PR 2) ESLint fixes/caching -> also a good improvement, but it has more risk around forgetting to add new files to the hash/cache keys. Also, historically I've seen this type of caching have issues with
PR 3) The E2E sharding. Seems novel and looks like it works, but given we're moving E2E out of PRs entirely soon, I'm not sure it's worth the complexity? No one will care if E2E is 25 or 30 mins per run when it's not blocking, and they shouldn't... Anyway, we should have this discussion in that PR/context and not block the other two because of it |
|
Also why is this stacked on |
So that this can be tested without random failures |
What changed (plus any additional context for devs)
Adds duration-based e2e sharding and persisted Prettier/ESLint caches on top of #7406.
E2E sharding now runs as a feedback loop instead of a fixed file-order split: each successful run writes updated pass/retry timings to timings.tsv, and later runs use those weights to keep the shard workload balanced as flow runtimes change.
Shard balancing
- scripts/e2e-run.sh plans shards from timing history, placing the highest-weight remaining flow on the shard with the lowest planned total
- 60s default weight.
- plan.tsv and timings.tsv, which show the produced shard assignment and the timing data saved for later runs
Lint
- lint:ci uses cache-aware Prettier and ESLint commands with explicit .cache/prettier and .cache/eslint paths
Workflow cancellation
Results
Each value is the median of the latest five successful runs for each PR. Measurements use active GitHub job/step time and exclude runner/device wait time.
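
| Baseline | This PR | Change |
| --- | --- | --- |
| 4:08 | 3:54 | 14s faster |
| 1:49 | 1:04 | 45s faster |
| 9:10 | 7:48 | 1:22 faster |
| 3:33 | 1:30 | 2:03 tighter |
| 10:31 | 9:18 | 1:13 faster |
| 4:56 | 0:49 | 4:07 tighter |

Sharding evidence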
Planned weights come from plan.tsv. Measured totals come from timings.tsv. Weights and totals are seconds, ordered from oldest to newest successful run.
Android drops from a 3:30 gap to sub-minute gaps in runs 3-5. iOS drops to 0:12 on run 2 and stays at 0:15 or better.
Android:

| Run | Planned weights | Measured totals | Gap |
| --- | --- | --- | --- |
| 1 | 323 / 322 / 332 / 318 | 259 / 355 / 469 / 337 | 3:30 |
| 2 | 266 / 291 / 275 / 264 | 267 / 274 / 337 / 261 | 1:16 |
| 3 | 266 / 266 / 297 / 280 | 242 / 241 / 300 / 275 | 0:59 |
| 4 | 260 / 293 / 280 / 264 | 239 / 274 / 262 / 259 | 0:35 |
| 5 | 267 / 277 / 261 / 287 | 254 / 273 / 242 / 291 | 0:49 |

iOS:

| Run | Planned weights | Measured totals | Gap |
| --- | --- | --- | --- |
| 1 | 407 / 441 / 420 / 421 | 422 / 454 / 415 / 381 | 1:13 |
| 2 | 421 / 407 / 425 / 418 | 410 / 408 / 398 / 404 | 0:12 |
| 3 | 419 / 408 / 419 / 416 | 416 / 416 / 409 / 417 | 0:08 |
| 4 | 418 / 409 / 419 / 417 | 418 / 413 / 420 / 414 | 0:07 |
| 5 | 415 / 405 / 417 / 426 | 427 / 420 / 433 / 418 | 0:15 |
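
For readers who want to see the balancing rule and the gap figures in code, here is a rough Python sketch. It is not the contents of scripts/e2e-run.sh. The function names, the flow names, and the in-memory dict standing in for timings.tsv are all hypothetical. It also assumes that each flow's weight is a single number of seconds, that the 60s default applies to flows with no timing history, and that the reported gap is the slowest shard's measured total minus the fastest shard's.

```python
from typing import Dict, List, Tuple

# 60s default weight from the PR description. Applying it to flows that have
# no timing history is an assumption of this sketch.
DEFAULT_WEIGHT_S = 60


def plan_shards(
    flows: List[str], timings: Dict[str, int], shard_count: int
) -> Tuple[List[List[str]], List[int]]:
    """Greedy plan: the highest-weight remaining flow goes to the shard
    with the lowest planned total."""
    weighted = [(timings.get(flow, DEFAULT_WEIGHT_S), flow) for flow in flows]
    # Heaviest first; ties are broken by flow name so the plan is deterministic.
    weighted.sort(key=lambda item: (-item[0], item[1]))

    shards: List[List[str]] = [[] for _ in range(shard_count)]
    totals = [0] * shard_count
    for weight, flow in weighted:
        target = totals.index(min(totals))  # lowest total; lowest index on ties
        shards[target].append(flow)
        totals[target] += weight
    return shards, totals


def gap_seconds(measured_totals: List[int]) -> int:
    """Spread between the slowest and fastest shard, in seconds."""
    return max(measured_totals) - min(measured_totals)


def fmt_mss(seconds: int) -> str:
    """Format whole seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


if __name__ == "__main__":
    # Hypothetical flows and weights; "e" has no history and gets the default.
    history = {"a": 120, "b": 90, "c": 80, "d": 30}
    shards, totals = plan_shards(["a", "b", "c", "d", "e"], history, 2)
    print(shards, totals)

    # Measured totals from Android run 1 in the table above.
    print(fmt_mss(gap_seconds([259, 355, 469, 337])))
```

Under those assumptions, the gap column in both tables matches max minus min of the measured totals for every run, for example 469 - 259 = 210s = 3:30 for Android run 1.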