[Hackathon 8th No.28] Reproduce Phi3 in PaddleNLP #10688
Open: robinbg wants to merge 7 commits into PaddlePaddle:develop from robinbg:feature/add_phi3
- Add Phi3 model configuration, tokenizer, and modeling classes
- Support both phi3-small (3B) and phi3-base (14B) variants
- Add comprehensive unit tests for model and tokenizer
- Implement grouped query attention and rotary embeddings
- Add support for gradient checkpointing and generation
- Follow PaddleNLP coding standards and conventions
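The grouped query attention and rotary embedding items above can be illustrated with a minimal, framework-free sketch. It is not taken from this PR: the function names (`kv_head_for_query_head`, `repeat_kv`, `apply_rope`), the plain-list data layout, the default `base=10000.0`, and the "rotate half" pairing convention are all assumptions chosen for illustration. The actual implementation in `paddlenlp/transformers/phi3/modeling.py` operates on Paddle tensors and may differ.

```python
import math


def kv_head_for_query_head(q_head, num_heads, num_kv_heads):
    """Return the key/value head index shared by a given query head.

    In grouped query attention, query heads are split into num_kv_heads
    groups of equal size, and every query head in a group reads the same
    key/value head.
    """
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    group_size = num_heads // num_kv_heads
    return q_head // group_size


def repeat_kv(kv_heads, n_rep):
    """Repeat each key/value head n_rep times, keeping copies adjacent.

    After this expansion, position i in the result is the key/value head
    used by query head i, so ordinary multi-head attention code can be reused.
    """
    return [head for head in kv_heads for _ in range(n_rep)]


def apply_rope(vec, position, base=10000.0):
    """Apply a rotary position embedding to one head vector (hypothetical helper).

    Uses the "rotate half" convention (an assumption): element i is paired
    with element i + dim/2, and each pair is rotated by an angle that
    depends on the token position and the pair index.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("head dimension must be even")
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        theta = position * base ** (-2.0 * i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * cos_t - x2 * sin_t
        out[i + half] = x2 * cos_t + x1 * sin_t
    return out


# Usage: 8 query heads sharing 2 key/value heads (group size 4).
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
expanded = repeat_kv(["kv0", "kv1"], 4)
rotated = apply_rope([1.0, 0.0, 0.0, 1.0], position=3)
```

Because each pair is rotated rather than scaled, the vector norm is unchanged, and position 0 leaves the vector as it was.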
Thanks for your contribution!
DrownFish19 reviewed Jun 4, 2025
"""

resource_files_names = {"vocab_file": "vocab.model", "tokenizer_config_file": "tokenizer_config.json"}
pretrained_resource_files_map = {
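There is no need to configure download paths here; we can convert the model and upload it directly later.
The model configuration entries can also be removed.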
@@ -0,0 +1 @@
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
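This is incomplete.
Please run the files through pre-commit and then push a follow-up commit so the formatting is consistent. The following commands can be used as a reference:
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install
# Process previous code files separately
pre-commit run --files XXXX.py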
robinbg pushed a commit to robinbg/PaddleNLP that referenced this pull request Jun 8, 2025
Fix(phi3): Address comments from PR PaddlePaddle#10688

This commit incorporates your suggestions and requirements from the review comments on PR PaddlePaddle#10688 for the Phi3 model implementation. The following changes were made:

1. **Tokenizer Configuration Cleanup:**
   - Removed `pretrained_resource_files_map`, `pretrained_init_configuration`, and `max_model_input_sizes` from `paddlenlp/transformers/phi3/tokenizer.py` as you requested, to decouple it from specific pre-trained model download paths.
2. **Test Init File Completion:**
   - Added a docstring to `tests/transformers/phi3/__init__.py` to ensure it's a valid and non-empty Python module initialization file.
3. **PretrainedModel Mapping Methods:**
   - Implemented `_get_name_mappings`, `_get_tensor_parallel_mappings`, and `_get_fuse_or_split_param_mappings` in the `Phi3PreTrainedModel` class in `paddlenlp/transformers/phi3/modeling.py`. These methods are crucial for model conversion and tensor parallelism, based on the Qwen2 model's implementation.
4. **Parallel Strategy Support:**
   - Integrated support for sequence parallelism and recomputation into `paddlenlp/transformers/phi3/modeling.py`.
   - This includes:
     - Configuration flags for enabling/disabling these features.
     - Modifications to `Phi3Model`, `Phi3DecoderLayer`, `Phi3Attention`, and `Phi3MLP` to handle sequence-parallel linear layers and recomputation logic (full layer, full attention, and core attention granularities).
     - Necessary imports and utilities for sequence parallelism (ScatterOp, GatherOp, sequence-parallel linear layers) and recomputation.
     - Tensor parallelism considerations for weight initialization and layer configurations.
5. **Code Formatting:**
   - Applied `pre-commit` to all modified files to ensure code style consistency and address linting issues. This included removing some unused imports and a duplicated code segment.
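As a rough illustration of the scatter and gather steps that sequence parallelism relies on, here is a single-process sketch. It only shows the data movement conceptually: the helper names `scatter_sequence` and `gather_sequence` are hypothetical, the token sequence is a plain Python list, and no communication takes place. The `ScatterOp` and `GatherOp` utilities named in the commit message work on tensors across devices, and their exact signatures are not shown in this PR excerpt, so none are assumed here.

```python
def scatter_sequence(tokens, world_size):
    """Split a sequence into equal contiguous shards, one per rank.

    Hypothetical helper: stands in for the idea of each rank keeping only
    its own slice of the sequence dimension.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if len(tokens) % world_size != 0:
        raise ValueError("sequence length must be divisible by world_size")
    chunk = len(tokens) // world_size
    return [tokens[rank * chunk:(rank + 1) * chunk] for rank in range(world_size)]


def gather_sequence(shards):
    """Concatenate per-rank shards back into the full sequence, in rank order."""
    return [token for shard in shards for token in shard]


# Usage: an 8-token sequence split across 4 ranks, then reassembled.
full = list(range(8))
shards = scatter_sequence(full, world_size=4)
restored = gather_sequence(shards)
```

Splitting along the sequence dimension means each rank holds activations for only its own tokens between the layers that need the full sequence, which is where the gather step comes in.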