Skip to content

[WIP] [GenAI] Lora Finetune #7288

Open
Open
@LittleLittleCloud

Description

@LittleLittleCloud

Lora fine-tuning is an adapter-based technique to fine-tune an LLM. It changes LLM model architecture by adding learnable lora layers to transformers. During fine-tuning, only lora weights are adjustable and the LLM weights are frozen, so it requires much less GPU memory comparing to a full-layer fine-tuning. Based on this table, it requires 16GB memory to fine-tuning a 7B size model in 16bits, which can be fit in rtx 3090, 4080 and 4090. A wider range of GPUs can be fit on 3.8B LLMs like phi-3.5-mini

API design (wip)

Package: Microsoft.ML.GenAI.Lora

interface ICausalLMLoraPipeline {} // pipeline for loading causal LM + lora layers

class LoraConfiguration // lora configuration

Metadata

Metadata

Labels

AutoML.NETAutomating various steps of the machine learning processenhancementNew feature or request

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions