Large Language Model Meta AI - LLaMA 2
© Rk Rahul
LLaMA - Overview
● LLaMA is a family of large language models (LLMs)
● LLaMA was trained in four model sizes: 7, 13, 33, and 65 billion parameters
● LLaMA was developed by Meta
● First released in February 2023.
LLaMA 2 - Overview
● LLaMA 2 is a family of large language models (LLMs)
● LLaMA 2 is an auto-regressive language model
● First released on July 18, 2023, by Meta in partnership with Microsoft as an open-source large language model.
● LLaMA 2 pretrained models are trained on 2 trillion tokens and have double the context length of LLaMA 1.
● Three model sizes were trained: 7, 13, and 70 billion parameters.
● LLaMA 2 is available for free for research and commercial use.
LLaMA 2 – Can Do
● Generate creative text in many formats, such as poems, code, scripts, musical pieces, emails, letters, etc.
● Translate languages.
● Write different kinds of creative content.
● Answer your questions in an informative way, even if they are open-ended, challenging, or strange.
● Help you with coding tasks.
● Generate dialogue for chatbots and other conversational AI systems (see the sketch below).
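As a concrete illustration of these capabilities, here is a minimal sketch of prompting a LLaMA 2 chat model through Hugging Face transformers. The checkpoint name is an assumption (access is gated behind Meta's license), and a GPU is recommended for the 7B model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Usage sketch; the model id assumes the Hugging Face hosting of LLaMA 2.
model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write a two-line poem about open-source language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```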
LLaMA 2 - Improvements
● Increased Training on Tokens: Llama 2 is trained on 40% more tokens than LLaMA 1.
● Longer Context Length: Llama 2 has a longer context length of 4k tokens.
● Fine-Tuning for Dialogues: The fine-tuned versions of Llama 2 (labelled Llama 2-Chat) are optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF).
Fine-Tuning Process and LLaMA-2-Chat
LLaMA 2 Building Process
1. Pre-Training
2. Supervised Fine-Tuning
3. Reinforcement Learning from Human Feedback (RLHF)
4. Reward Model
LLaMA 2 Pre-Training
● The pretraining approach uses an optimized auto-regressive transformer, with several changes made to improve performance.
● Grouped-query attention (GQA) is also used to improve inference scalability (a minimal sketch follows this list).
● Trained on 2 trillion tokens of data for good performance.
● The model architecture uses the standard transformer architecture.
● Pre-normalization using RMSNorm (Root Mean Square Layer Normalization)
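A minimal PyTorch sketch of the grouped-query attention idea, assuming the query/key/value tensors are already projected and omitting the causal mask and rotary position embeddings used in the real model:

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, n_kv_heads):
    """Attention where several query heads share one key/value head.

    q: (batch, n_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    """
    n_heads = q.shape[1]
    n_rep = n_heads // n_kv_heads          # query heads per key/value head
    # repeat each KV head so it lines up with its group of query heads
    k = k.repeat_interleave(n_rep, dim=1)
    v = v.repeat_interleave(n_rep, dim=1)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# toy shapes: 8 query heads sharing 2 KV heads
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v, n_kv_heads=2).shape)  # (1, 8, 16, 64)
```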
LLaMA 2 Pre-Training Normalization
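The normalization used is RMSNorm, which scales activations by their root mean square (with no mean subtraction) before each sub-layer. A minimal sketch, with the epsilon value assumed:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root Mean Square Layer Normalization: x / rms(x), scaled by a learned gain."""
    def __init__(self, dim: int, eps: float = 1e-5):  # eps value assumed
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)
```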
LLaMA 2 - Pretraining Functionality
● Trained using the AdamW optimizer (β1 = 0.9, β2 = 0.95, eps = 10⁻⁵)
● Uses the SwiGLU activation function
● Uses a cosine learning rate schedule (with a warmup of 2,000 steps), decaying the learning rate down to 10% of its peak value
● Weight decay of 0.1 and gradient clipping of 1.0 (a sketch of these settings follows)
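A minimal sketch of these optimizer settings in PyTorch; the peak learning rate and total step count below are illustrative placeholders, not values from the paper:

```python
import math
import torch

model = torch.nn.Linear(128, 128)                          # stand-in for the transformer
peak_lr, total_steps, warmup_steps = 3e-4, 10_000, 2_000   # peak_lr/total_steps are placeholders

# AdamW with the betas, eps, and weight decay listed above
optimizer = torch.optim.AdamW(
    model.parameters(), lr=peak_lr, betas=(0.9, 0.95), eps=1e-5, weight_decay=0.1
)

def lr_lambda(step: int) -> float:
    if step < warmup_steps:                                # linear warmup
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # cosine decay from the peak LR down to 10% of the peak
    return 0.1 + 0.45 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# inside the training loop, after loss.backward():
#   torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```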
LLaMA 2 - Training Hardware
● LLaMA 2 was pre-trained on Meta's Research Super Cluster
(RSC) as well as internal production clusters.
● Both clusters use NVIDIA A100 GPUs.
● RSC uses NVIDIA Quantum InfiniBand, while the production clusters use RoCE (RDMA over Converged Ethernet).
LLaMA 2 - Supervised Fine-Tuning (SFT)
● SFT uses a next-token prediction objective that is nearly identical to the one used in pre-training (a sketch of the loss follows this list).
● Text is encoded for SFT with the same tokenizer used during pre-training.
● Supervised fine-tuning uses a cosine learning rate schedule with an initial learning rate of 2 × 10⁻⁵, a weight decay of 0.1, a batch size of 64, and a sequence length of 4096 tokens.
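A minimal sketch of the next-token prediction loss shared by pre-training and SFT: the model's logits at each position are scored against the token that follows it. The helper name below is for illustration only:

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Cross-entropy between predictions at position t and the token at t+1.

    logits: (batch, seq, vocab) model outputs; tokens: (batch, seq) input ids.
    """
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..T-2
    shift_labels = tokens[:, 1:]       # targets are the next tokens
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```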
LLaMA 2 - Tokenizer
● To encode text, the tokenizer first splits all numbers into individual digits (see the example below); because LLaMA 2 is a subword language model, it can learn to represent numbers using a small number of subwords.
● LLaMA 2 uses a byte-pair encoding (BPE) tokenizer based on the SentencePiece implementation.
● The total vocabulary size is 32k tokens.
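A small sketch of the digit-splitting behaviour using the Hugging Face tokenizer; the checkpoint name is an assumption and access to it is gated behind Meta's license:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed model id

print(tok.tokenize("Released in 2023"))
# each digit becomes its own token, e.g. [..., '▁2', '0', '2', '3']
print(tok.vocab_size)  # 32000
```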
LLaMA 2 - RLHF
● Reinforcement learning from human feedback (RLHF) is a model training
procedure that is applied to a fine-tuned language model to further align model
behavior with human preferences and instruction following.
● RLHF collects data that represents sampled human preferences: human annotators compare two model outputs and select which one they prefer (a toy example of such a preference record follows this list).
● Safety-based data is also collected during RLHF.
● This human feedback is subsequently used to train a reward model, which
learns patterns in the preferences of the human annotators and can then
automate preference decisions.
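A toy preference record, invented for illustration, showing the kind of comparison data collected from annotators (LLaMA 2 also records how strongly the chosen response is preferred):

```python
# Illustrative only; field names are assumptions, not Meta's data schema.
preference_example = {
    "prompt": "Explain what a reward model does.",
    "chosen": "A reward model scores a response for helpfulness and safety ...",
    "rejected": "idk",
    "preference_strength": "significantly better",
}
```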
LLaMA 2 - Reward Model
● The reward model is responsible for telling the language model what constitutes a good response, scoring responses based on how helpful and safe they are.
● The reward model takes a model response and its corresponding prompt as inputs and outputs a scalar score indicating the quality of the model generation (a sketch of the training loss follows this list).
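A minimal sketch of the pairwise ranking loss typically used to train such a reward model: the score of the human-preferred ("chosen") response should exceed the score of the rejected one, optionally separated by a margin reflecting how strong the preference was. The helper below is illustrative, not Meta's code:

```python
import torch
import torch.nn.functional as F

def ranking_loss(chosen_scores, rejected_scores, margin=None):
    """-log(sigmoid(r_chosen - r_rejected - margin)), averaged over the batch."""
    diff = chosen_scores - rejected_scores
    if margin is not None:
        diff = diff - margin
    return -F.logsigmoid(diff).mean()

# toy usage with scalar scores for a batch of 3 comparisons
chosen = torch.tensor([1.2, 0.3, 2.0])
rejected = torch.tensor([0.5, 0.1, 1.9])
print(ranking_loss(chosen, rejected))
```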
LLaMA 2 - Model Evaluations
Reference
● Deep (Learning) Focus -
https://cameronrwolfe.substack.com/p/llama-2-from-the-ground-up
● Meta AI - https://ai.meta.com/
● Research Article - Llama 2: Open Foundation and Fine-Tuned
Chat Models
Thanks!