
[Hackathon 8th No.29] Reproduce the ModernBERT model in PaddleNLP #10686


Open
wants to merge 5 commits into
base: develop
from

Conversation


@robinbg robinbg commented Jun 1, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py
  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

robinbg added 3 commits June 1, 2025 12:05
This commit implements ModernBERT model for PaddleNLP, including:
- ModernBertConfig configuration class
- Rotary Position Embeddings (RoPE)
- GeGLU activation function
- Sliding window attention mechanism
- Parameter conversion from HuggingFace weights
- Core model components and task-specific heads
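
For readers unfamiliar with two of the components listed above, here is a minimal pure-Python sketch of a GeGLU gate and a sliding-window attention mask. It is not taken from this PR: the function names, the erf-based GELU, and the symmetric window rule `|i - j| <= window // 2` are assumptions chosen for illustration, and a real implementation would operate on Paddle tensors rather than lists.

```python
import math


def gelu(x: float) -> float:
    """Exact (erf-based) GELU for a single value."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(gate: list[float], value: list[float]) -> list[float]:
    """GeGLU on already-projected halves: GELU(gate) * value, elementwise."""
    if len(gate) != len(value):
        raise ValueError("gate and value must have the same length")
    return [gelu(g) * v for g, v in zip(gate, value)]


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query i may attend to key j.

    Assumption: a symmetric (bidirectional) window, |i - j| <= window // 2.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


# Each of 6 tokens sees itself and one neighbour on each side.
mask = sliding_window_mask(seq_len=6, window=2)
# A zero gate blocks the value; a large positive gate passes it almost unchanged.
gated = geglu([0.0, 10.0], [5.0, 2.0])
```

In a full layer the two inputs to `geglu` would typically be the two halves of a single linear projection, and the boolean mask would be converted to additive `-inf` biases before the softmax.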

paddle-bot bot commented Jun 1, 2025

Thanks for your contribution!

robinbg added 2 commits June 1, 2025 16:02
- Add test_modeling.py for testing model components
- Add test_tokenizer.py for testing tokenizer
- Test coverage includes:
  - Basic model functionality
  - Task-specific heads (MLM, QA, Classification, etc.)
  - Tokenizer operations
  - Configuration
@DrownFish19
Collaborator

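Please run the files through pre-commit and then push a follow-up commit so that the formatting is consistent. You can use the following commands: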

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py

@DrownFish19
Collaborator

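  1. XXXPretrainedModel needs to additionally implement _get_name_mappings (to support parameter conversion), _get_tensor_parallel_mappings (to support splitting parameters for model parallelism), and _get_fuse_or_split_param_mappings (to support automatic parameter fusion and splitting).
  2. Parallel strategies need to be supported, following the Qwen2 model, so that the model can be trained.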
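
To make the first two requested hooks more concrete, the sketch below shows the kind of data they usually describe: a table of `[source_name, target_name, action]` rows for weight conversion, and a column-wise split of a weight matrix across tensor-parallel ranks. It is plain Python and does not call PaddleNLP. The helper names and the ModernBERT parameter names are hypothetical, and the real hooks are classmethods on the pretrained model class whose exact return types should be taken from the Qwen2 implementation. The `"transpose"` action reflects that PyTorch stores `Linear` weights as `[out_features, in_features]` while Paddle stores them as `[in_features, out_features]`.

```python
Matrix = list[list[float]]


def build_name_mappings(num_hidden_layers: int) -> list[list[str]]:
    """Build [source_name, target_name, action] rows (names are hypothetical)."""
    emb = "embeddings.tok_embeddings.weight"
    rows = [[emb, emb, ""]]  # embeddings are copied without any action
    for i in range(num_hidden_layers):
        for name in ("attn.Wqkv", "attn.Wo", "mlp.Wi", "mlp.Wo"):
            key = f"layers.{i}.{name}.weight"
            rows.append([key, key, "transpose"])  # Linear weights are transposed
    return rows


def transpose(matrix: Matrix) -> Matrix:
    """Swap rows and columns, as the 'transpose' action would."""
    return [list(col) for col in zip(*matrix)]


def split_columns(matrix: Matrix, degree: int, rank: int) -> Matrix:
    """Return the column shard owned by `rank` out of `degree` parallel ranks."""
    cols = len(matrix[0])
    if cols % degree != 0:
        raise ValueError("column count must be divisible by the parallel degree")
    width = cols // degree
    return [row[rank * width:(rank + 1) * width] for row in matrix]


rows = build_name_mappings(num_hidden_layers=2)
shard = split_columns([[1, 2, 3, 4], [5, 6, 7, 8]], degree=2, rank=1)
```

A fused projection such as a combined QKV weight cannot simply be cut into contiguous column blocks, because each rank needs its own slice of Q, K, and V; that is presumably the case the third hook, _get_fuse_or_split_param_mappings, is meant to handle.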


4 participants