Actions: huggingface/lighteval

Tests

375 workflow run results

new metrics and pr-fouras dataset add
Tests #2751: Pull request #558 synchronize by clefourrier
June 6, 2025 08:05 Action required BertrandCabotIDRIS:main
Add Bulgarian and Macedonian literals (#769)
Tests #2749: Commit 9771f3a pushed by clefourrier
June 6, 2025 08:02 2m 9s main
June 6, 2025 06:49 2m 29s
[IFEval] Speed up think tag removal (#792)
Tests #2743: Commit 53df859 pushed by lewtun
June 4, 2025 14:27 2m 8s main
add a regex to remove think tags before evaluating ifeval (#791)
Tests #2739: Commit 8260f59 pushed by lewtun
June 4, 2025 12:22 2m 10s main
fix: multiple typos of different value (#782)
Tests #2734: Commit 7887172 pushed by clefourrier
May 28, 2025 20:42 2m 26s main
Making bootstrap_iters an arg (#697)
Tests #2733: Commit 8805035 pushed by clefourrier
May 28, 2025 20:42 2m 45s main
Adds GSM-PLUS (#780)
Tests #2730: Commit 9619194 pushed by NathanHB
May 28, 2025 12:51 2m 30s main
Bump dev version to 0.10.1.dev0 (#777)
Tests #2722: Commit 9dc2e53 pushed by NathanHB
May 23, 2025 11:31 2m 24s main
Async vllm (#693)
Tests #2710: Commit c4826ea pushed by clefourrier
May 22, 2025 12:35 2m 4s main
Bump ruff version (#774)
Tests #2707: Commit c9c19e1 pushed by NathanHB
May 22, 2025 12:00 2m 38s main
Nanotron, Multilingual tasks update + misc (#756)
Tests #2700: Commit 034c23b pushed by NathanHB
May 22, 2025 09:40 2m 1s main
add dependencies to run after pip install (#767)
Tests #2692: Commit 2651750 pushed by NathanHB
May 21, 2025 14:44 2m 45s main
fix custom model example (#766)
Tests #2691: Commit cce0bfc pushed by NathanHB
May 21, 2025 14:44 2m 14s main
Adds template for custom path saving results (#755)
Tests #2688: Commit b6816a8 pushed by NathanHB
May 21, 2025 12:15 1m 31s main
May 21, 2025 08:13 1m 29s
Add MCQ support to Yourbench evaluation (#734)
Tests #2665: Commit 317cb50 pushed by alozowski
May 20, 2025 12:12 1m 44s main
Fix task metric type mismatch (#743)
Tests #2664: Commit 3bb8a50 pushed by NathanHB
May 20, 2025 11:50 1m 36s main
Adds multimodal support and MMMU pro (#675)
Tests #2659: Commit 1607dc1 pushed by NathanHB
May 19, 2025 16:56 2m 31s main
Added Flores 200 (#717)
Tests #2647: Commit 63be4b0 pushed by clefourrier
May 19, 2025 13:20 2m 14s main
Update main_endpoint.py (#739)
Tests #2638: Commit d18f11a pushed by NathanHB
May 19, 2025 11:18 1m 51s main
fix litellm (#736)
Tests #2633: Commit a590376 pushed by NathanHB
May 16, 2025 16:17 1m 34s main
Adds More Generative tasks (#694)
Tests #2631: Commit c6d1231 pushed by clefourrier
May 16, 2025 15:23 2m 6s main
Update README.md (#733)
Tests #2625: Commit f684d35 pushed by NathanHB
May 15, 2025 16:38 2m 36s main
Fix revision arg for vLLM tokenizer (#721)
Tests #2622: Commit d3da6b9 pushed by lewtun
May 15, 2025 12:21 1m 29s main