Add overview release selector, competitive analysis revamp, DP parallelism, LLM-D URL sync & filter fixes #19

Merged
Harshith-umesh merged 5 commits into openshift-psap:main from Harshith-umesh:main-production
Apr 13, 2026

Conversation


@Harshith-umesh Harshith-umesh commented Apr 13, 2026

Summary

RHAIIS Dashboard — Overview & Competitive Analysis

  • Multi-release Overview selector: The Overview page now has a dropdown to compare
    any back-to-back release pair (RHAIIS-3.4-EA2 vs EA1, EA1 vs 3.3, 3.3 vs 3.2.5)
    instead of being hardcoded to a single pair. _compute_overview_data is now
    parameterized with current, previous, upstream, and additional arguments.
  • Neutral dead-zone for win/loss: Introduced a ±2% NEUTRAL_THRESHOLD_PCT;
    changes within this range are classified as "neutral" rather than as wins or
    losses. The win rate is now a "non-regression rate" ((total − losses) / total),
    with thresholds tightened to 90%/70% (previously 80%/60%).
  • vLLM parity table expansion: The upstream vLLM parity scorecard now includes
    TTFT P95 and ITL P95 deltas alongside throughput, displayed in a proper HTML table
    with per-metric win/loss/tie breakdowns.
  • Competitive Analysis revamp: Added a release selector dropdown
    (RHAIIS-3.4-EA2 vs RHAIIS-3.3 configs), restructured comparison groups per release,
    and added interactive per-metric drill-down dialog charts (click a metric button to
    see baseline vs competitor across concurrency levels).
  • Custom ISL/OSL awareness: Overview, Compare Versions, and heatmap sections now
    include custom_isl_osl in combo keys, preventing data from different ISL/OSL
    pairs from being incorrectly merged. A custom ISL/OSL pair filter was added to
    Compare Versions.
  • Accelerator display names: All user-facing accelerator references now use full
    names (e.g. "NVIDIA H200", "AMD MI300X") via _accel_display(). B200 pricing added
    to cost analysis.

RHAIIS Dashboard — Data Parallelism (DP) Support

  • Import script: import_manual_runs_json_v2.py now accepts --dp as an
    alternative to --tp, writing a DP column; the two flags are validated as
    mutually exclusive.
  • Compare Versions: Handles cross-parallelism comparisons (TP→DP across versions)
    with automatic pair matching, labels like (TP=4 → DP=2), and user warnings.
  • Performance Plots: Run identifier and legend include DP when present; sort order
    includes DP column.

LLM-D Dashboard

  • Fix empty-filter rendering: Graphs previously rendered with all data when no
    model was selected, because if selected_X: guards skipped filtering for empty
    lists. Replaced with unconditional boolean-mask filtering (matching RHAIIS), so
    an empty selection now yields an empty result.
  • "No Data" banner: When filters produce no matching data, a styled banner now
    shows per-model expandable trees of the available filter combinations, replacing
    a plain warning that broke the sidebar navigation.
  • URL query parameter sync: Filter state (accelerators, models, versions, profile,
    custom ISL/OSL, TP, replicas, prefill/decode pods, section) is now encoded into URL
    params on every rerun and decoded on first load, enabling shareable/bookmarkable URLs.
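The empty-filter fix above can be illustrated with a small pandas sketch (the column and function names are hypothetical, not the dashboard's): applying the boolean mask unconditionally means an empty selection returns an empty frame, because .isin([]) is all-False.

```python
import pandas as pd

# Toy data standing in for the benchmark DataFrame.
df = pd.DataFrame({"model": ["llama", "mistral", "llama"], "tps": [10, 12, 11]})


def apply_filter(df: pd.DataFrame, selected_models: list[str]) -> pd.DataFrame:
    # Buggy pattern: `if selected_models:` skips the mask for [], leaking all rows.
    # Fixed pattern: always apply the mask; df["model"].isin([]) is all-False,
    # so empty selection -> empty result.
    return df[df["model"].isin(selected_models)]
```

The same unconditional-mask shape composes cleanly with `&` across several filter columns, which is why the fix could be applied uniformly.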

Styles & UX

  • vLLM parity table styling: New .vllm-parity-table CSS for the expanded parity
    breakdown table.
  • Seamless transitions: Added CSS for content fade-in, stale-element dimming, and
    batch-reveal of filter rows to reduce visual flicker during Streamlit reruns.

Lint & Type Fixes

  • Fixed 6 ruff errors: F841 unused wins variable, SIM118 .keys(), SIM102/SIM114
    collapsible conditionals, 2× SIM105 try/except/pass → contextlib.suppress.
  • Fixed 2 mypy errors: added pd.DataFrame type annotation on filtered_df, type hint
    on combo_dict.

Changed Files

  • dashboard.py — Overview, Competitive Analysis, Compare Versions, Performance Plots,
    Cost Analysis, Model Performance Comparison, energy section (B200 pricing/references)
  • dashboard_styles.py — vLLM parity table CSS, rerun transition animations
  • llmd_dashboard.py — filter logic, empty-data banner, URL sync, lint/type fixes
  • manual_runs/scripts/import_manual_runs_json_v2.py — DP column support

Summary by CodeRabbit

Release Notes

  • New Features

    • Added custom performance profile pairing selection in the dashboard
    • Introduced shareable filter states via URL parameters for easy bookmarking
    • Added data parallelism parameter support for benchmark imports
  • Improvements

    • Enhanced UI transitions and reduced visual flickering during page updates
    • Sidebar now opens by default for better navigation visibility
    • Improved handling when benchmark data doesn't match selected filters

@coderabbitai

coderabbitai Bot commented Apr 13, 2026

Warning

Rate limit exceeded

@Harshith-umesh has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 22 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 89b6a2fd-1914-4fa4-aac4-b94402af8631

📥 Commits

Reviewing files that changed from the base of the PR and between 9d0689f and 0d1f320.

📒 Files selected for processing (2)
  • llmd_dashboard.py
  • mlperf_datacenter.py
📝 Walkthrough

The pull request enhances the LLMD dashboard with custom ISL/OSL pairing support, implements bidirectional URL syncing for filters and section state, refines CSS styling for smoother UI transitions and prevents layout collapsing, and extends the data import pipeline to support data parallelism (DP) alongside tensor parallelism (TP).

Changes

  • Dashboard Styling & Configuration (dashboard_styles.py): Added
    .vllm-parity-table styling; introduced seamless transition animations for
    .block-container, fade-in effects for stMain, opacity-dimming for stale data,
    and animated reveals for horizontal selectbox blocks; applied layout
    containment to reduce reflow; changed the initial sidebar state from collapsed
    to expanded.
  • Dashboard Filtering & URL State (llmd_dashboard.py): Implemented a custom
    ISL/OSL pairing system with a clean_profile_name() mapping and a
    format_custom_isl_osl() formatter; added a secondary selectbox for custom pair
    selection when the profile is "Custom"; refactored filter application to use
    explicit mask computation with combined filtering; introduced
    _decode_llmd_url_filters() for URL parameter parsing; added bidirectional URL
    syncing to persist filter/section state and performance plot axes; extended
    the "Compare Versions" section with a custom pair selector; adjusted the UI
    layout for filter actions.
  • Data Import Pipeline (manual_runs/scripts/import_manual_runs_json_v2.py):
    Added an optional dp_size parameter to process_benchmark_section() and
    parse_guidellm_json(); updated run naming to use either the TP or DP size via
    parallelism_tag; extended the exported row data with a new "DP" column;
    refactored the CLI arguments to make --tp optional and added a --dp option
    with mutual-exclusivity validation; updated processing logs to reflect the TP
    or DP designation.

Sequence Diagram

sequenceDiagram
    participant User as User/UI
    participant FilterState as Filter State<br/>(Session)
    participant Dashboard as Dashboard<br/>Rendering
    participant URLParams as URL Params<br/>(st.query_params)
    participant DataEngine as Data Engine<br/>(DataFrame)

    User->>FilterState: Select profile (e.g., "Custom")
    User->>FilterState: Select custom ISL/OSL pair
    FilterState->>Dashboard: Trigger rerender
    Dashboard->>DataEngine: Apply filters (profile + custom pair)
    DataEngine->>Dashboard: Return filtered results
    Dashboard->>User: Render filtered dashboard view
    Dashboard->>URLParams: Write filter state + section slug<br/>+ plot axes to URL
    User->>URLParams: (Browser) Navigate with URL params
    URLParams->>FilterState: Decode & initialize session state<br/>from query parameters
    FilterState->>Dashboard: Restore prior filter selections
    Dashboard->>User: Render dashboard with restored state

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #5: Extends modifications to clean_profile_name() and filter UI behavior in llmd_dashboard.py, building directly on custom profile handling logic.
  • PR #15: Overlaps on dashboard styling and URL state syncing implementation, both adding filter persistence via query parameters and CSS animation enhancements.
  • PR #11: Modifies the same import_manual_runs_json_v2.py import pipeline for guidellm data, directly affecting benchmark section processing.

Poem

🐰 A dashboard now dances with filters so spry,
ISL and OSL pairs paint the sky,
URLs remember what once would be lost,
While silky transitions glide smooth—no reflow cost!
From TP to DP, the data flows free,
A dashboard evolved, for all eyes to see!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit's high-level summary
    is enabled.
  • Title Check: ✅ Passed. The PR title covers multiple major features (overview
    release selector, competitive analysis revamp, DP parallelism, LLM-D URL sync,
    filter fixes) matching the changeset's scope across dashboard_styles.py,
    llmd_dashboard.py, and import_manual_runs_json_v2.py.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 92.86%, which is above
    the required threshold of 80.00%.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (2)
llmd_dashboard.py (2)

3185-3233: URL sync may cause infinite rerun loops if from_dict triggers state changes.

The st.query_params.from_dict(desired_params) call at line 3233 replaces all query parameters. If this triggers a Streamlit rerun (which it can when parameters change), and the code then reads different values, it could cause a loop. The contextlib.suppress(Exception) wrapper hides any errors but doesn't prevent potential rerun issues.

Consider comparing current params with desired params before updating to avoid unnecessary writes.

💡 Suggested fix to avoid unnecessary updates
     # --- Sync filter state to URL query params ---
     with contextlib.suppress(Exception):
         desired_params: dict[str, str] = {}
         # ... (existing param building code) ...

-        st.query_params.from_dict(desired_params)
+        # Only update if params actually changed to avoid rerun loops
+        current_params = dict(st.query_params)
+        if desired_params != current_params:
+            st.query_params.from_dict(desired_params)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llmd_dashboard.py` around lines 3185 - 3233, Compare the existing query
params with the computed desired_params before calling st.query_params.from_dict
to avoid unnecessary writes / reruns: compute desired_params (as done in this
block using filter_selections, LLMD_SECTION_SLUG_MAP, current_section and
session keys like "llmd_selected_custom_isl_osl", "llmd_perf_plots_x_axis",
"llmd_perf_plots_y_axis"), then fetch the current params via
st.query_params.to_dict() (or equivalent) and only call
st.query_params.from_dict(desired_params) when they differ; ensure you normalize
types/ordering for list-joined values so comparisons are stable and skip the
update when identical to prevent infinite rerun loops.

2866-2933: Consider adding basic input sanitization for URL-decoded values.

While the function validates values against available data (which is good), string values from URL parameters are used directly without sanitization. Although Streamlit's query_params provides some protection, consider stripping potentially problematic characters or limiting string lengths to prevent edge cases.

💡 Suggested defensive sanitization
+def _sanitize_param(value: str, max_length: int = 500) -> str:
+    """Basic sanitization for URL parameter values."""
+    return value.strip()[:max_length] if value else ""
+
+
 def _decode_llmd_url_filters(df: pd.DataFrame) -> dict:
     ...
     if "accelerators" in qp:
         result["accelerators"] = [
-            a.strip()
+            _sanitize_param(a)
             for a in qp["accelerators"].split(",")
-            if a.strip() in all_accelerators
+            if _sanitize_param(a) in all_accelerators
         ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llmd_dashboard.py` around lines 2866 - 2933, The _decode_llmd_url_filters
function currently trusts raw query_params strings for fields like
custom_isl_osl, pp_x, pp_y, section, prefill_pods and decode_pods; sanitize
these by trimming, enforcing a max length (e.g., 100 chars), stripping
control/whitespace-only input, and filtering to an allowed character set (e.g.,
letters, numbers, spaces, hyphen, underscore, dot) or replacing disallowed chars
before storing; apply the same sanitization to each list element produced for
prefill_pods/decode_pods and to single-value fields (custom_isl_osl, pp_x, pp_y,
section), and keep the existing validation checks for
accelerator/model/version/profile/tp/replicas intact in _decode_llmd_url_filters
so only safe, size-limited strings are returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@llmd_dashboard.py`:
- Line 3043: The empty filter_selections dict causes downstream code that calls
filter_selections.get("models", []) to always see an empty list when
_show_global_filters is False; fix this by initializing filter_selections with
sensible defaults instead of {} (e.g., ensure it contains keys like "models",
"datasets", "runs" with empty lists) where filter_selections is declared, or
alternatively gate the "No Data" banner logic behind a check of
_show_global_filters so that the banner only runs when global filters are shown;
reference the variables filter_selections and _show_global_filters and the "No
Data" banner logic to locate where to apply the change.
- Around line 1026-1060: The custom_mask logic currently returns the scalar True
when selected_custom_isl_osl is falsy, causing inconsistent behavior vs other
filters; change it so an empty/no selection yields an empty boolean mask like
the other .isin-based filters. Concretely, update the custom_mask expression
(and any similar special-case logic) to use
df["custom_isl_osl"].isin(selected_custom_isl_osl) when selected_custom_isl_osl
is truthy, and otherwise produce an empty mask (e.g.,
df["custom_isl_osl"].isin([]) or a Series of False of len(df)); ensure
filtered_df still composes custom_mask with the other .isin(...) filters so
empty selections uniformly produce empty results.
- Around line 183-196: The clean_profile_name function currently references
_CUSTOM_ISL_OSL_LABELS before it's defined, causing a NameError; fix it by
moving the _CUSTOM_ISL_OSL_LABELS dictionary declaration so it appears before
clean_profile_name (or otherwise ensure the constant is defined at module import
time prior to any function that calls it), and keep format_custom_isl_osl using
the same dictionary name to avoid duplication.

In `@manual_runs/scripts/import_manual_runs_json_v2.py`:
- Around line 299-310: Add validation after parsing to reject non-positive
parallelism values: check args.tp and args.dp and if either is not None and <= 0
call parser.error with a clear message (e.g., "--tp must be a positive integer"
/ "--dp must be a positive integer"). Apply the same check in the other
validation block that currently handles presence/mutual exclusivity (the logic
working with args.tp and args.dp around the later validation at the 336-341
region) so both argument positions are validated consistently.
- Around line 49-50: The run identifier currently uses parallelism_tag = tp_size
if tp_size is not None else dp_size and builds full_model_name =
f"{accelerator}-{model_name}-{parallelism_tag}", which collides when TP and DP
values are equal; change parallelism_tag to encode which type it is (e.g.,
"tp{tp_size}" when tp_size is not None else "dp{dp_size}") and update
full_model_name accordingly so full_model_name =
f"{accelerator}-{model_name}-{parallelism_tag}" will now contain a
disambiguating prefix ("tp" or "dp") to avoid key collisions.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63c01830-52d9-4f63-a303-135f4462c3ba

📥 Commits

Reviewing files that changed from the base of the PR and between b086744 and 9d0689f.

📒 Files selected for processing (4)
  • dashboard.py
  • dashboard_styles.py
  • llmd_dashboard.py
  • manual_runs/scripts/import_manual_runs_json_v2.py

Comment on lines +1026 to +1060
# Apply all filters unconditionally (empty selection → empty result, not "show all")
custom_mask = (
    (df["custom_isl_osl"] == selected_custom_isl_osl)
    if selected_custom_isl_osl and "custom_isl_osl" in df.columns
    else True
)

_pf_nums = [v for v in selected_prefill_pods if v != "N/A"]
_pf_na = "N/A" in selected_prefill_pods
prefill_mask = (
    df["prefill_pod_count"].isin(_pf_nums)
    | (_pf_na & df["prefill_pod_count"].isna())
    if selected_prefill_pods
    else True
)

if selected_accelerators:
    filtered_df = filtered_df[  # type: ignore[assignment]
        filtered_df["accelerator"].isin(selected_accelerators)
    ]
if selected_profiles:
    filtered_df = filtered_df[filtered_df["profile"].isin(selected_profiles)]  # type: ignore[assignment]
if selected_versions:
    filtered_df = filtered_df[filtered_df["version"].isin(selected_versions)]  # type: ignore[assignment]
if selected_models:
    filtered_df = filtered_df[filtered_df["model"].isin(selected_models)]  # type: ignore[assignment]
if selected_tp:
    filtered_df = filtered_df[filtered_df["TP"].isin(selected_tp)]  # type: ignore[assignment]
if selected_replicas:
    filtered_df = filtered_df[filtered_df["replicas"].isin(selected_replicas)]  # type: ignore[assignment]
if selected_prefill_pods:
    _pf_nums = [v for v in selected_prefill_pods if v != "N/A"]
    _pf_na = "N/A" in selected_prefill_pods
    filtered_df = filtered_df[  # type: ignore[assignment]
        filtered_df["prefill_pod_count"].isin(_pf_nums)
        | (_pf_na & filtered_df["prefill_pod_count"].isna())
    ]
if selected_decode_pods:
    _dc_nums = [v for v in selected_decode_pods if v != "N/A"]
    _dc_na = "N/A" in selected_decode_pods
    filtered_df = filtered_df[  # type: ignore[assignment]
        filtered_df["decode_pod_count"].isin(_dc_nums)
        | (_dc_na & filtered_df["decode_pod_count"].isna())
    ]
_dc_nums = [v for v in selected_decode_pods if v != "N/A"]
_dc_na = "N/A" in selected_decode_pods
decode_mask = (
    df["decode_pod_count"].isin(_dc_nums) | (_dc_na & df["decode_pod_count"].isna())
    if selected_decode_pods
    else True
)

filtered_df: pd.DataFrame = df[  # type: ignore[assignment]
    df["accelerator"].isin(selected_accelerators)
    & df["model"].isin(selected_models)
    & df["version"].isin(selected_versions)
    & (df["profile"].isin(selected_profiles) if selected_profiles else True)
    & df["TP"].isin(selected_tp)
    & df["replicas"].isin(selected_replicas)
    & custom_mask
    & prefill_mask
    & decode_mask
].copy()

⚠️ Potential issue | 🟡 Minor

Boolean mask logic is inconsistent between custom_mask and filter application.

When selected_custom_isl_osl is falsy (None or empty), custom_mask is set to True (scalar), which means "include all rows". However, the filtering at lines 1051-1055 uses .isin(selected_accelerators) etc., which returns an empty result when the selection list is empty.

This creates inconsistent behavior: selecting no custom ISL/OSL shows all rows, but selecting no accelerators shows no rows. The comment says "empty selection → empty result, not 'show all'", but custom_mask doesn't follow this pattern.

💡 Suggested fix for consistency

If the intent is truly "empty selection → empty result", then custom_mask should also return an empty mask when selected_custom_isl_osl is None/empty but profile is "Custom":

     custom_mask = (
         (df["custom_isl_osl"] == selected_custom_isl_osl)
         if selected_custom_isl_osl and "custom_isl_osl" in df.columns
-        else True
+        else (df["profile"] != "Custom") if "custom_isl_osl" in df.columns else True
     )

However, if the current behavior (showing all custom rows when no specific pair selected) is intentional, consider updating the comment to clarify this exception.


@Harshith-umesh Harshith-umesh merged commit 8c947d5 into openshift-psap:main Apr 13, 2026
5 checks passed