
【Hackathon 8th No.28】Reproduce Phi3 in PaddleNLP #10688


Open
robinbg wants to merge 7 commits into develop from robinbg:feature/add_phi3

Conversation


@robinbg robinbg commented Jun 1, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py
  • Add test cases into the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

robinbg added 3 commits June 1, 2025 12:05
- Add Phi3 model configuration, tokenizer, and modeling classes
- Support both phi3-small (3B) and phi3-base (14B) variants
- Add comprehensive unit tests for model and tokenizer
- Implement grouped query attention and rotary embeddings
- Add support for gradient checkpointing and generation
- Follow PaddleNLP coding standards and conventions
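For context on the grouped query attention item above, here is a minimal sketch of the KV-head repetition step that GQA implementations typically use in Paddle. The function name, tensor layout, and shapes are illustrative assumptions, not code taken from this PR.

```python
# Sketch only: names and shapes are illustrative, not from this PR.
import paddle


def repeat_kv(hidden_states: paddle.Tensor, n_rep: int) -> paddle.Tensor:
    """Repeat key/value heads so they match the number of query heads.

    hidden_states: [batch, num_kv_heads, seq_len, head_dim]
    """
    if n_rep == 1:
        return hidden_states
    batch, num_kv_heads, seq_len, head_dim = hidden_states.shape
    # Insert a repeat axis after the kv-head axis, tile it, then fold it
    # back so the result has num_kv_heads * n_rep heads.
    hidden_states = hidden_states.unsqueeze(2).tile([1, 1, n_rep, 1, 1])
    return hidden_states.reshape([batch, num_kv_heads * n_rep, seq_len, head_dim])
```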

paddle-bot bot commented Jun 1, 2025

Thanks for your contribution!

"""

resource_files_names = {"vocab_file": "vocab.model", "tokenizer_config_file": "tokenizer_config.json"}
pretrained_resource_files_map = {
A collaborator commented:

There is no need to configure download paths here; we can convert the model and upload it for use later.
The model configuration entries can also be removed.

@@ -0,0 +1 @@
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
A collaborator commented:

This is incomplete.

@DrownFish19 (Collaborator) commented:

Please run the files through pre-commit and re-submit them so that the formatting is consistent. You can refer to the following commands:

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --file XXXX.py

@luotao1 (Collaborator) commented Jun 4, 2025

  • You can download 「如流」 (Infoflow) and scan the QR code to join the 8th Hackathon discussion group.

@DrownFish19 (Collaborator) commented:

  1. `XXXPretrainedModel` needs to implement `_get_name_mappings` (to support parameter conversion), `_get_tensor_parallel_mappings` (to support splitting parameters for model parallelism), and `_get_fuse_or_split_param_mappings` (to support automatic fusing and splitting of parameters).
  2. Please refer to the Qwen2 model's parallel-strategy support so that the model can be trained.
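For reference, a minimal sketch of what the tensor-parallel mapping hook might look like, loosely following the Qwen2/Llama pattern in PaddleNLP. The weight names (`qkv_proj`, `gate_up_proj`, ...), the config fields, and the assumption that `split_or_merge_func` lives in `paddlenlp.transformers.conversion_utils` should be verified against the actual implementation; this is a sketch under those assumptions, not the PR's code.

```python
# Sketch only: helper and weight names are assumptions based on how
# Qwen2/Llama implement this hook in PaddleNLP.
from functools import partial

from paddlenlp.transformers.conversion_utils import split_or_merge_func


def _get_tensor_parallel_mappings(config, is_split=True):
    # In the real model this would be a @classmethod on Phi3PretrainedModel.
    fn = split_or_merge_func(
        is_split=is_split,
        tensor_parallel_degree=config.tensor_parallel_degree,
        tensor_parallel_rank=config.tensor_parallel_rank,
        num_attention_heads=config.num_attention_heads,
    )
    mappings = {}
    for i in range(config.num_hidden_layers):
        prefix = f"phi3.layers.{i}."
        # Column-parallel weights: split along the output dimension.
        mappings[prefix + "self_attn.qkv_proj.weight"] = partial(fn, is_column=True)
        mappings[prefix + "mlp.gate_up_proj.weight"] = partial(fn, is_column=True)
        # Row-parallel weights: split along the input dimension.
        mappings[prefix + "self_attn.o_proj.weight"] = partial(fn, is_column=False)
        mappings[prefix + "mlp.down_proj.weight"] = partial(fn, is_column=False)
    return mappings
```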

robinbg pushed a commit to robinbg/PaddleNLP that referenced this pull request Jun 8, 2025
Fix(phi3): Address comments from PR PaddlePaddle#10688

This commit incorporates the suggestions and requirements from the review comments on PR PaddlePaddle#10688 for the Phi3 model implementation.

The following changes were made:

1.  **Tokenizer Configuration Cleanup:**
    - Removed `pretrained_resource_files_map`, `pretrained_init_configuration`, and `max_model_input_sizes` from `paddlenlp/transformers/phi3/tokenizer.py` as requested, to decouple it from specific pre-trained model download paths.

2.  **Test Init File Completion:**
    - Added a docstring to `tests/transformers/phi3/__init__.py` to ensure it's a valid and non-empty Python module initialization file.

3.  **PretrainedModel Mapping Methods:**
    - Implemented `_get_name_mappings`, `_get_tensor_parallel_mappings`, and `_get_fuse_or_split_param_mappings` in the `Phi3PreTrainedModel` class in `paddlenlp/transformers/phi3/modeling.py`. These methods are crucial for model conversion and tensor parallelism, based on the Qwen2 model's implementation.

4.  **Parallel Strategy Support:**
    - Integrated support for sequence parallelism and recomputation into `paddlenlp/transformers/phi3/modeling.py` (a recompute sketch follows this list).
    - This includes:
        - Configuration flags for enabling/disabling these features.
        - Modifications to `Phi3Model`, `Phi3DecoderLayer`, `Phi3Attention`, and `Phi3MLP` to handle sequence-parallel linear layers and recomputation logic (full layer, full attention, and core attention granularities).
        - Necessary imports and utilities for sequence parallelism (ScatterOp, GatherOp, sequence-parallel linear layers) and recomputation.
        - Tensor parallelism considerations for weight initialization and layer configurations.

5.  **Code Formatting:**
    - Applied `pre-commit` to all modified files to ensure code style consistency and address linting issues. This included removing some unused imports and a duplicated code segment.
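For context, a minimal sketch of how full-layer recomputation is commonly wired into a PaddleNLP decoder stack. The config flags (`recompute`, `recompute_granularity`) follow other PaddleNLP models such as Llama/Qwen2 and are assumed here for Phi3; `recompute` itself is the `paddle.distributed.fleet.utils.recompute` utility. This is a sketch under those assumptions, not the PR's implementation.

```python
# Sketch only: flag names are assumed from other PaddleNLP models.
from paddle.distributed.fleet.utils import recompute


def run_decoder_layer(layer, hidden_states, attention_mask, config, training=True):
    """Optionally recompute a decoder layer's forward pass to save activation memory."""
    if training and config.recompute and config.recompute_granularity == "full":
        # Recompute the whole decoder layer during the backward pass.
        return recompute(layer, hidden_states, attention_mask)
    # "full_attn" / "core_attn" granularities would instead wrap only the
    # attention block (or its softmax(QK^T)V core) inside the layer itself.
    return layer(hidden_states, attention_mask)
```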

CLAassistant commented Jun 9, 2025

CLA assistant check
All committers have signed the CLA.

robinbg force-pushed the feature/add_phi3 branch from ff63b2e to dbb9d76 on June 9, 2025 at 07:03.