Starter Task 1: Get learning rate for llm_pte_finetuning example from config file #11445

Merged (1 commit) on Jun 7, 2025.

2 changes: 2 additions & 0 deletions examples/llm_pte_finetuning/llama3_config.yaml
@@ -24,6 +24,8 @@ dataset:
seed: null
shuffle: True

+learning_rate: 5e-3
+
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Llama-3.2-1B-Instruct/

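The same top-level `learning_rate: 5e-3` key is added to each of the example configs in this PR. As a quick sanity check that every config the runner might be pointed at actually defines the key, something like the OmegaConf snippet below works; the file list is taken from the paths in this PR, and loading with OmegaConf is an assumption based on how torchtune-style YAML configs are usually consumed.

```python
# Hypothetical sanity check (not part of this PR): confirm each example
# config defines a top-level learning_rate key for the runner to read.
from omegaconf import OmegaConf

CONFIG_PATHS = [
    "examples/llm_pte_finetuning/llama3_config.yaml",
    "examples/llm_pte_finetuning/phi3_alpaca_code_config.yaml",
    "examples/llm_pte_finetuning/phi3_config.yaml",
    "examples/llm_pte_finetuning/qwen_05b_config.yaml",
]

for path in CONFIG_PATHS:
    cfg = OmegaConf.load(path)
    assert "learning_rate" in cfg, f"{path} is missing learning_rate"
    print(f"{path}: learning_rate = {cfg.learning_rate}")
```
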
2 changes: 2 additions & 0 deletions examples/llm_pte_finetuning/phi3_alpaca_code_config.yaml
@@ -13,6 +13,8 @@ batch_size: 1
loss:
  _component_: torch.nn.CrossEntropyLoss

+learning_rate: 5e-3
+
model:
  _component_: torchtune.models.phi3.lora_phi3_mini
  lora_attn_modules: ['q_proj', 'v_proj']

2 changes: 2 additions & 0 deletions examples/llm_pte_finetuning/phi3_config.yaml
@@ -12,6 +12,8 @@ batch_size: 1
loss:
  _component_: torch.nn.CrossEntropyLoss

+learning_rate: 5e-3
+
model:
  _component_: torchtune.models.phi3.lora_phi3_mini
  lora_attn_modules: ['q_proj', 'v_proj']

4 changes: 4 additions & 0 deletions examples/llm_pte_finetuning/qwen_05b_config.yaml
@@ -13,12 +13,16 @@ batch_size: 1
loss:
  _component_: torch.nn.CrossEntropyLoss

+learning_rate: 5e-3
+
model:
  _component_: torchtune.models.qwen2.lora_qwen2_0_5b
  lora_attn_modules: ['q_proj', 'k_proj', 'v_proj']
  apply_lora_to_mlp: False
  lora_rank: 32
  lora_alpha: 64
+  # lr parameter is not supported by lora_qwen2_0_5b function
+  # lr: 5e-3

checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer

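The commented-out `lr` entry records that the `lora_qwen2_0_5b` builder does not accept a learning-rate argument, so the value stays a top-level config key and is applied by the runner at update time instead. Below is a minimal sketch of what a config-driven update step could look like, assuming a plain SGD-style step over matching (parameter, gradient) tensor pairs; it is illustrative only, not the runner's actual code.

```python
# Illustrative sketch: apply p <- p - lr * g in place, with lr taken from
# the config rather than from the model builder.
import torch


def sgd_step(
    params: list[torch.Tensor],
    grads: list[torch.Tensor],
    learning_rate: float,
) -> None:
    """Plain SGD update applied in place to each parameter tensor."""
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(learning_rate * g)
```
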
2 changes: 1 addition & 1 deletion examples/llm_pte_finetuning/runner.py
@@ -84,7 +84,7 @@ def main() -> None:
# params run from [param_start, outputs_end]
grad_start = et_mod.run_method("__et_training_gradients_index_forward", [])[0]
param_start = et_mod.run_method("__et_training_parameters_index_forward", [])[0]
-learning_rate = 5e-3
+learning_rate = cfg.learning_rate
f.seek(0)
losses = []
for i, batch in tqdm(enumerate(train_dataloader), total=num_training_steps):

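With this change the hard-coded `5e-3` is gone and the learning rate comes from whichever YAML file is passed to the runner, so each example config can set its own value. A sketch of how that lookup resolves, assuming the config is loaded with OmegaConf (the explicit `OmegaConf.load` call here is illustrative, not quoted from runner.py):

```python
from omegaconf import OmegaConf

# Illustrative: load one of the example configs and read the new key.
cfg = OmegaConf.load("examples/llm_pte_finetuning/qwen_05b_config.yaml")
learning_rate = cfg.learning_rate  # 0.005, from the top-level YAML key
print(learning_rate)
```
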
4 changes: 0 additions & 4 deletions examples/llm_pte_finetuning/training_lib.py
@@ -106,10 +106,6 @@ def eval_model(
token_size = tokens.shape[1]
labels_size = labels.shape[1]

-tokens, labels = batch["tokens"], batch["labels"]
-token_size = tokens.shape[1]
-labels_size = labels.shape[1]
-
# Fixed length for now. We need to resize as the input shapes
# should be the same passed as examples to the export function.
if token_size > max_seq_len:

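The four deleted lines were a verbatim duplicate of the `tokens`/`labels` unpacking just above them, so removing them changes no behavior. The surrounding comment refers to resizing each batch to a fixed `max_seq_len` so its shape matches the example inputs used when the program was exported; the snippet below sketches that idea with truncation and zero right-padding for a `[batch, seq_len]` token tensor, as an illustration rather than the file's exact implementation.

```python
import torch
import torch.nn.functional as F


def resize_to_max_seq_len(tokens: torch.Tensor, max_seq_len: int) -> torch.Tensor:
    """Truncate or zero-pad a [batch, seq_len] tensor to exactly max_seq_len."""
    seq_len = tokens.shape[1]
    if seq_len > max_seq_len:
        return tokens[:, :max_seq_len]
    # F.pad pads the last dimension by (left, right) amounts; pad on the right.
    return F.pad(tokens, (0, max_seq_len - seq_len), value=0)
```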