[Hackathon 8th No.29] Reproduce the ModernBERT model in PaddleNLP #10686

robinbg · 2025-06-01T04:59:33Z

Before submitting

Lint code. If there are lint issues, please format the code first.

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py

Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

This commit implements the ModernBERT model for PaddleNLP, including:
- ModernBertConfig configuration class
- Rotary Position Embeddings (RoPE)
- GeGLU activation function
- Sliding window attention mechanism
- Parameter conversion from HuggingFace weights
- Core model components and task-specific heads
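For readers unfamiliar with two of the components listed above, here is a minimal pure-Python sketch of a GeGLU gate and a symmetric sliding-window attention mask. It is not the code from this PR: the function names, the choice of exact (erf-based) GELU, and the `window // 2` convention for the neighborhood on each side are assumptions made for illustration only.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(projected: list[float]) -> list[float]:
    """Apply a GeGLU gate to an already-projected vector.

    The vector is split into two halves (a, b) and the result is
    GELU(a) * b, computed elementwise.
    """
    if len(projected) % 2 != 0:
        raise ValueError("projected vector must have even length")
    half = len(projected) // 2
    a, b = projected[:half], projected[half:]
    return [gelu(u) * v for u, v in zip(a, b)]


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True where token i may attend to token j.

    Assumed convention (hypothetical): a token sees window // 2
    neighbors on each side, plus itself.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    print(geglu([0.0, 1.0, 2.0, 3.0]))
    for row in sliding_window_mask(5, 2):
        print(["x" if allowed else "." for allowed in row])
```

In a real model both operations would run on batched tensors, and the mask would typically be added to the attention scores as large negative values rather than kept as booleans.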

paddle-bot · 2025-06-01T04:59:39Z

Thanks for your contribution!

DrownFish19 · 2025-06-04T04:04:52Z

Please run the files through pre-commit and push a follow-up commit so the formatting is consistent. You can use the following commands:

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py

DrownFish19 · 2025-06-04T04:30:34Z

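XXXPretrainedModel needs to implement _get_name_mappings (for parameter conversion), _get_tensor_parallel_mappings (for splitting parameters under tensor parallelism), and _get_fuse_or_split_param_mappings (for automatic parameter fusion and splitting).
Please follow the Qwen2 model's support for parallel strategies so that the model can be trained.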
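To illustrate what a name mapping for parameter conversion has to do, here is a small pure-Python sketch. It does not use PaddleNLP's actual `_get_name_mappings` signature or return type, and the source and target key names are hypothetical placeholders, not the names used in this PR. The one fact it relies on is that `torch.nn.Linear` stores weights as `[out_features, in_features]` while `paddle.nn.Linear` stores them as `[in_features, out_features]`, so linear weights need a transpose during conversion.

```python
from typing import Optional

# (source key, target key, action); every key name here is illustrative only.
NameMapping = tuple[str, str, Optional[str]]


def build_name_mappings(num_layers: int) -> list[NameMapping]:
    """Build a hypothetical source-to-target key table for a small encoder."""
    mappings: list[NameMapping] = [
        # Embedding tables have the same layout on both sides: no action.
        ("model.embeddings.tok_embeddings.weight",
         "modernbert.embeddings.tok_embeddings.weight", None),
    ]
    for i in range(num_layers):
        for name in ("attn.Wqkv", "attn.Wo", "mlp.Wi", "mlp.Wo"):
            # Linear weights are stored transposed relative to each other.
            mappings.append((f"model.layers.{i}.{name}.weight",
                             f"modernbert.layers.{i}.{name}.weight",
                             "transpose"))
    return mappings


def transpose(matrix: list[list[float]]) -> list[list[float]]:
    """Transpose a 2-D matrix represented as nested lists."""
    return [list(row) for row in zip(*matrix)]


def convert_state_dict(source_state: dict, mappings: list[NameMapping]) -> dict:
    """Rename keys and apply the per-key action; unmapped keys are skipped."""
    converted = {}
    for source, target, action in mappings:
        if source not in source_state:
            continue
        value = source_state[source]
        converted[target] = transpose(value) if action == "transpose" else value
    return converted


if __name__ == "__main__":
    table = build_name_mappings(num_layers=1)
    state = {"model.layers.0.attn.Wo.weight": [[1, 2, 3], [4, 5, 6]]}
    print(convert_state_dict(state, table))
```

The tensor-parallel and fuse-or-split mappings requested above follow the same table-driven idea, but describe how each parameter is partitioned or merged rather than how it is renamed.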

robinbg added 3 commits June 1, 2025 12:05

Add ModernBERT model to transformers package

e7ee187

Add ModernBERT configuration and tokenizer

90e0caa

paddle-bot bot added the contributor label Jun 1, 2025

paddle-bot bot assigned lugimzzz Jun 1, 2025

robinbg added 2 commits June 1, 2025 16:02

feat: Add ModernBERT configuration, tokenizer and init files

8e2ddf1

test: Add unit tests for ModernBERT

04e39b3

- Add test_modeling.py for testing model components
- Add test_tokenizer.py for testing tokenizer
- Test coverage includes:
  - Basic model functionality
  - Task-specific heads (MLM, QA, Classification, etc.)
  - Tokenizer operations
  - Configuration

luotao1 mentioned this pull request Jun 3, 2025

[Hackathon 8th] Open Source Contribution Individual Challenge PaddlePaddle/Paddle#71310

Open

luotao1 added the hackathon label Jun 3, 2025

luotao1 assigned luotao1 and DrownFish19 and unassigned lugimzzz Jun 3, 2025

DrownFish19 mentioned this pull request Jun 4, 2025

[Hackathon 8th No.29] Reproduce the ModernBERT model in PaddleNLP #10685

Closed

2 tasks
