How to Fine-Tune LLMs with Kubeflow
Warning

This feature is in alpha stage, and the Kubeflow community is looking for your feedback. Please share your experience via the #kubeflow-training-operator Slack channel or the Kubeflow Training Operator GitHub repository.

This page describes how to use the train API from the Training Python SDK, which simplifies fine-tuning LLMs with distributed PyTorchJob workers.
If you want to learn more about how the fine-tuning API fits into the Kubeflow ecosystem, head to the explanation guide.
Prerequisites
You need to install the Training Python SDK with fine-tuning support to run this API.
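For example, a typical setup installs the SDK from PyPI with its HuggingFace extra (the package and extra names here follow the published kubeflow-training distribution; pin the version that matches your Training Operator release):

pip install -U "kubeflow-training[huggingface]"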
How to use the Fine-Tuning API?
You need to provide the following parameters to use the train API:
- Pre-trained model parameters.
- Dataset parameters.
- Trainer parameters.
- Number of PyTorch workers and resources per worker.
For example, you can use the train API as follows to fine-tune the BERT model on the Yelp Review dataset from the HuggingFace Hub:
import transformers
from peft import LoraConfig

from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceTrainerParams,
    HuggingFaceDatasetParams,
)

TrainingClient().train(
    name="fine-tune-bert",
    # BERT model URI and type of Transformer to train it.
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://google-bert/bert-base-cased",
        transformer_type=transformers.AutoModelForSequenceClassification,
    ),
    # Use 3000 samples from Yelp dataset.
    dataset_provider_parameters=HuggingFaceDatasetParams(
        repo_id="yelp_review_full",
        split="train[:3000]",
    ),
    # Specify HuggingFace Trainer parameters. In this example, we will skip evaluation and model checkpoints.
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(
            output_dir="test_trainer",
            save_strategy="no",
            evaluation_strategy="no",
            do_eval=False,
            disable_tqdm=True,
            log_level="info",
        ),
        # Set LoRA config to reduce number of trainable model parameters.
        lora_config=LoraConfig(
            r=8,
            lora_alpha=8,
            lora_dropout=0.1,
            bias="none",
        ),
    ),
    num_workers=4,  # nnodes parameter for torchrun command.
    num_procs_per_worker=2,  # nproc-per-node parameter for torchrun command.
    resources_per_worker={
        "gpu": 2,
        "cpu": 5,
        "memory": "10G",
    },
)
After you execute train, the Training Operator will orchestrate the appropriate PyTorchJob resources to fine-tune the LLM.
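Once the job is created, you can follow its progress with the same TrainingClient. The sketch below is a minimal example; it reuses the job name from the snippet above, and the status helpers (get_job_logs, is_job_succeeded) should be verified against your Training SDK version:

from kubeflow.training import TrainingClient

client = TrainingClient()

# Stream training logs from the master PyTorchJob pod.
client.get_job_logs(name="fine-tune-bert", follow=True)

# Check whether the fine-tuning job completed successfully.
if client.is_job_succeeded(name="fine-tune-bert"):
    print("Fine-tuning is complete.")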
Next Steps
- Run the example to fine-tune the TinyLlama LLM.
- Check this example to compare the create_job and train Python APIs for fine-tuning the BERT LLM.
- Understand the architecture behind the train API.