Choosing a Fine-Tuning Method in Generative AI

You can fine-tune the meta.llama-3-70b-instruct, cohere.command, and cohere.command-light base models with your own dataset.

Important

  • For the meta.llama-3-70b-instruct model, OCI Generative AI fine-tunes on your data with the LoRA (Low-Rank Adaptation) method.
  • For the cohere.command and cohere.command-light models, OCI Generative AI offers two training methods: T-Few and Vanilla.

Use the following guidelines to choose the best training method when you fine-tune the cohere.command or cohere.command-light base model.
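For illustration, the following is a minimal sketch of how a training method might be specified when creating a fine-tuned model with the OCI Python SDK. The class names, field names, and OCIDs shown are assumptions and placeholders rather than confirmed API details; check the SDK reference for your version.

```python
# Hedged sketch (not verified against a specific SDK version): create a
# fine-tuned cohere.command model and choose the training method.
# oci.generative_ai class and field names below are assumptions.
import oci

config = oci.config.from_file()  # default ~/.oci/config profile
client = oci.generative_ai.GenerativeAiClient(config)

# Pick the training method here: TFewTrainingConfig for small datasets,
# VanillaTrainingConfig for very large ones (see the guidelines below).
# Hyperparameters are omitted, so service defaults are assumed.
training_config = oci.generative_ai.models.TFewTrainingConfig()

fine_tune_details = oci.generative_ai.models.FineTuneDetails(
    dedicated_ai_cluster_id="ocid1.generativeaidedicatedaicluster.oc1..example",
    training_dataset=oci.generative_ai.models.ObjectStorageDataset(
        namespace_name="my-namespace",        # hypothetical Object Storage values
        bucket_name="fine-tune-data",
        object_name="training_data.jsonl",
    ),
    training_config=training_config,
)

details = oci.generative_ai.models.CreateModelDetails(
    compartment_id="ocid1.compartment.oc1..example",
    display_name="command-tfew-custom",
    vendor="cohere",
    base_model_id="ocid1.generativeaimodel.oc1..example",  # cohere.command base model OCID
    fine_tune_details=fine_tune_details,
)

response = client.create_model(details)
print(response.data.lifecycle_state)
```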

Features, options, and recommendations

Training methods for cohere.command and cohere.command-light
  • T-Few
  • Vanilla

Dataset size
  • Use T-Few for small datasets (a few thousand samples or less).
  • Use Vanilla for large datasets (from a hundred thousand samples to millions of samples).

Using a small dataset with the Vanilla method might cause overfitting. Overfitting happens when the trained model performs well on the training data but can't generalize to unseen data.
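As an illustration of these dataset-size guidelines, the following Python helper counts the records in a JSONL training file and suggests a method. The thresholds and the one-record-per-line JSONL format are assumptions made for this sketch, not limits enforced by the service.

```python
# Illustrative only: map a training-file sample count onto the dataset-size
# guidelines above. Thresholds are rough readings of the guidance, not
# service-enforced values.
def recommend_training_method(dataset_path: str) -> str:
    """Suggest T-Few or Vanilla based on the number of training samples."""
    with open(dataset_path, "r", encoding="utf-8") as f:
        num_samples = sum(1 for line in f if line.strip())

    if num_samples <= 5_000:        # "a few thousand samples or less"
        return f"T-Few ({num_samples} samples: small dataset)"
    if num_samples >= 100_000:      # "a hundred thousand samples" and up
        return f"Vanilla ({num_samples} samples: large dataset)"
    # In between, the guidelines don't prescribe a method; lean toward T-Few
    # to reduce the risk of overfitting with Vanilla.
    return f"T-Few (default for {num_samples} samples; mid-sized dataset)"

print(recommend_training_method("training_data.jsonl"))
```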

Complexity
  • Use T-Few for format following or instruction following.
  • Use Vanilla to improve complex semantic understanding, such as a model's understanding of medical cases.

Hosting
  • Use T-Few if you plan to host several fine-tuned models on the same hosting dedicated AI cluster. If all the models are trained on the same base model, you can host them on the same cluster. This stacked-serving feature saves cost and offers good performance when user traffic to each T-Few fine-tuned model is relatively low. See Hosting Clusters with Many Endpoints, and the sketch after this list.
  • Each model that's fine-tuned with the Vanilla method requires its own hosting dedicated AI cluster.
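The following is a hedged sketch of stacked serving with the OCI Python SDK: several T-Few fine-tuned models, all derived from the same base model, are given endpoints on one hosting dedicated AI cluster. Class names, field names, and OCIDs are assumptions and placeholders; confirm them against the SDK reference for your version.

```python
# Hedged sketch of stacked serving: create one endpoint per T-Few fine-tuned
# model, all pointing at the same hosting dedicated AI cluster. OCIDs are
# placeholders; oci.generative_ai class and field names are assumptions.
import oci

config = oci.config.from_file()
client = oci.generative_ai.GenerativeAiClient(config)

hosting_cluster_id = "ocid1.generativeaidedicatedaicluster.oc1..example"
fine_tuned_model_ids = [
    "ocid1.generativeaimodel.oc1..tfew-model-a",
    "ocid1.generativeaimodel.oc1..tfew-model-b",
]

for model_id in fine_tuned_model_ids:
    details = oci.generative_ai.models.CreateEndpointDetails(
        compartment_id="ocid1.compartment.oc1..example",
        dedicated_ai_cluster_id=hosting_cluster_id,  # same cluster for each endpoint
        model_id=model_id,
    )
    response = client.create_endpoint(details)
    print(model_id, response.data.lifecycle_state)
```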