Actions: huggingface/lighteval

Quality

375 workflow run results

new metrics and pr-fouras dataset add
Quality #2752: Pull request #558 synchronize by clefourrier
June 6, 2025 08:05 Action required BertrandCabotIDRIS:main
Add Bulgarian and Macedonian literals (#769)
Quality #2750: Commit 9771f3a pushed by clefourrier
June 6, 2025 08:02 1m 59s main
June 6, 2025 06:49 2m 7s
[IFEval] Speed up think tag removal (#792)
Quality #2744: Commit 53df859 pushed by lewtun
June 4, 2025 14:29 15s main
add a regex to remove think tags before evaluating ifeval (#791)
Quality #2740: Commit 8260f59 pushed by lewtun
June 4, 2025 12:22 2m 48s main
fix: multiple typos of different value (#782)
Quality #2735: Commit 7887172 pushed by clefourrier
May 28, 2025 20:42 4m 5s main
Making bootstrap_iters an arg (#697)
Quality #2734: Commit 8805035 pushed by clefourrier
May 28, 2025 20:42 2m 9s main
Adds GSM-PLUS (#780)
Quality #2731: Commit 9619194 pushed by NathanHB
May 28, 2025 12:51 2m 9s main
Bump dev version to 0.10.1.dev0 (#777)
Quality #2723: Commit 9dc2e53 pushed by NathanHB
May 23, 2025 11:31 2m 7s main
Async vllm (#693)
Quality #2711: Commit c4826ea pushed by clefourrier
May 22, 2025 12:35 2m 7s main
Bump ruff version (#774)
Quality #2708: Commit c9c19e1 pushed by NathanHB
May 22, 2025 12:00 2m 2s main
Nanotron, Multilingual tasks update + misc (#756)
Quality #2701: Commit 034c23b pushed by NathanHB
May 22, 2025 09:40 2m 21s main
add dependencies to run after pip install (#767)
Quality #2693: Commit 2651750 pushed by NathanHB
May 21, 2025 14:44 2m 29s main
fix custom model example (#766)
Quality #2692: Commit cce0bfc pushed by NathanHB
May 21, 2025 14:44 2m 34s main
Adds template for custom path saving results (#755)
Quality #2689: Commit b6816a8 pushed by NathanHB
May 21, 2025 12:15 2m 1s main
May 21, 2025 08:13 2m 11s
Add MCQ support to Yourbench evaluation (#734)
Quality #2666: Commit 317cb50 pushed by alozowski
May 20, 2025 12:12 2m 1s main
Fix task metric type mismatch (#743)
Quality #2665: Commit 3bb8a50 pushed by NathanHB
May 20, 2025 11:50 1m 58s main
Adds multimodal support and MMMU pro (#675)
Quality #2660: Commit 1607dc1 pushed by NathanHB
May 19, 2025 16:56 2m 9s main
Added Flores 200 (#717)
Quality #2648: Commit 63be4b0 pushed by clefourrier
May 19, 2025 13:20 2m 4s main
Update main_endpoint.py (#739)
Quality #2639: Commit d18f11a pushed by NathanHB
May 19, 2025 11:18 2m 8s main
fix litellm (#736)
Quality #2634: Commit a590376 pushed by NathanHB
May 16, 2025 16:17 2m 2s main
Adds More Generative tasks (#694)
Quality #2632: Commit c6d1231 pushed by clefourrier
May 16, 2025 15:23 2m 8s main
Update README.md (#733)
Quality #2626: Commit f684d35 pushed by NathanHB
May 15, 2025 16:38 2m 10s main
Fix revision arg for vLLM tokenizer (#721)
Quality #2623: Commit d3da6b9 pushed by lewtun
May 15, 2025 12:21 2m 6s main