Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[feat]: improve performance of XQA-MLA for sm120
#5087 opened Jun 10, 2025 by lowsfer
[feat] Support torch compile for attention dp
#5086 opened Jun 10, 2025 by liji-nv
[fix] Fix test_attention_mla
#5084 opened Jun 10, 2025 by jinyangyuan-nvidia
chore: Mass integration of release/0.20
#5082 opened Jun 10, 2025 by amirkl94
ci: waive test [NVBUGS/5301492]
#5081 opened Jun 10, 2025 by stnie
refactor: improve code readability
#5079 opened Jun 10, 2025 by Shixiaowei02
infra[TRTLLM-5635] remove package stage in CI build
#5075 opened Jun 10, 2025 by niukuo
[TRTLLM-5786][test] Add llama3.2-3b test case
#5073 opened Jun 10, 2025 by crazydemo
chore: rename IOFormatter to BaseCacheFormatter
#5068 opened Jun 10, 2025 by zhengd-nv
Draft: test: [CI] remove closed bugs
#5063 opened Jun 10, 2025 by xinhe-nv Draft
chore: build the wheel using NIXL as the default
#5061 opened Jun 10, 2025 by Shixiaowei02
None: fix OOM because of unnecessary mha workspace
#5056 opened Jun 10, 2025 by ttyio