Add Exo7 French MCQ math benchmark #8

Open
Lduignan1 wants to merge 10 commits into main from exo7
Conversation

@Lduignan1
Member

Summary

Adds community_tasks/exo7.py: Exo7, a multi-label multiple-choice math benchmark for French undergraduate students (dataset: OpenLLM-BPI/Exo7MCQ, source: http://exo7.emath.fr/). Many items have more than one correct answer.

Two zero-shot scoring paths are exposed:

  • Logprob path (MCF and Hybrid formulations): a TruthfulQA MC2-style probability-mass metric (Exo7MCMetric) computed over length-normalized choice logprobs. The CF formulation is not included because many questions do not carry the context required to answer them, e.g. "Quels sont les propositions vraies ?" ("Which propositions are true?").
  • Generative path: the model emits an answer line such as Réponse : A, C, which is scored with set-F1 and exact-set-match. A \boxed{...} answer format is accepted as a fallback.
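
For reviewers who want a concrete picture of the two scoring paths, here is a minimal sketch of what they compute. It is not the code in community_tasks/exo7.py: the function names (mc2_score, parse_answer, set_f1, exact_set_match), the regular expressions, and the choice of normalizing each logprob by a caller-supplied length are all assumptions made for illustration.

```python
import math
import re


def mc2_score(choice_logprobs, choice_lengths, gold_indices):
    """Share of probability mass on the correct choices (MC2-style).

    Assumption: each logprob is divided by a length (tokens or characters)
    supplied by the caller before being exponentiated.
    """
    normalized = [lp / max(n, 1) for lp, n in zip(choice_logprobs, choice_lengths)]
    # Subtracting the max keeps exp() numerically stable; it cancels in the ratio.
    top = max(normalized)
    probs = [math.exp(x - top) for x in normalized]
    gold = set(gold_indices)
    return sum(p for i, p in enumerate(probs) if i in gold) / sum(probs)


# Hypothetical patterns for the two accepted answer formats.
ANSWER_RE = re.compile(r"Réponse\s*:\s*([A-Z]\b(?:\s*,\s*[A-Z]\b)*)")
BOXED_RE = re.compile(r"\\boxed\{([^}]*)\}")


def parse_answer(text):
    """Extract the set of predicted letters, trying the \\boxed{...} form second."""
    match = ANSWER_RE.search(text) or BOXED_RE.search(text)
    if match is None:
        return set()
    return set(re.findall(r"[A-Z]", match.group(1)))


def set_f1(pred, gold):
    """F1 between the predicted and gold letter sets."""
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def exact_set_match(pred, gold):
    """1.0 only when the predicted set equals the gold set exactly."""
    return float(pred == gold)
```

With gold answers {A, B, C}, a completion ending in Réponse : A, C parses to {A, C}, which gets partial credit from set-F1 and zero from exact-set-match. This is why reporting both is useful on a multi-label benchmark.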

Test plan

  • Task is discoverable via the lighteval task registry and imports cleanly.
  • Run the Exo7 task end-to-end on a small instruct model in both logprob (MCF / Hybrid) and generative variants; confirm metrics are produced.

@Lduignan1 requested a review from Jeronymous May 6, 2026 14:52
@Lduignan1 force-pushed the exo7 branch 2 times, most recently from ad1fa9e to 4f7b738 May 6, 2026 16:30