Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT

Anant Corporation
Anant CorporationAnant Corporation
NoCode, Data & AI
LLM Inside Bootcamp
Fundamentals of LLM
What is a large language model, how is it trained, how are
different from traditional machine learning models.
Rahul Xavier Singh Anant Corporation
Nocode Data & AI
To most , LLMs seem like
magic. In computing &
technology, LLMs show
great promise in bridging
the gap between human
computer interaction.
Our Customers
NoCode, Data & AI
LLM Inside Bootcamp
with Cassandra
Full day bootcamp to familiarize product managers, software
professionals, and data engineers to creating next generation
experts, assistants, and platforms powered by Generative AI
with Large Language Models (LLM, OpenAI, GPT)
Rahul Xavier Singh Anant Corporation
Nocode Data & AI
kono.io/bootcamp
Agenda
● I: Strategy & Theory
● II: LLM Design Patterns
● III: NoCode/Code LLM Stacks
● IV: Build a Custom ChatBot
with LLM your Data
Today’s Agenda
1. Fundamentals of ML
2. Transformers Architecture
3. How LLMs Work
4. LLMs other than ChatGPT/GPT
Fundamentals of
ML/Transformers
● History of LLMs (Large Language Models)
● What is Machine Learning / AI?
● Transformer Architecture
History of Large Language Models
1. Everything
before GPT-3
(2020) was trash.
2. ChatGPT made
GPT-3 popular.
3. Now everyone
wants in on the
party.
https://voicebot.ai/large-language-models
-history-timeline/
Most of the hype, growth
relating to LLMs have
happened in the last 6 months
( November 2022 till now , May
2023
Machine Learning in a Nutshell
https://www.avenga.com/magazine/machi
ne-learning-programming/
1. In machine learning, the
computer trains on your
data, and gives you the
most likely answer. The
better the data, the
better the algorithm.
2. Neural networks process
input data through layers
to predict outcomes
based on patterns and
relationships learned
during training.
What can Neural Neworks do?
https://thedatascientist.com/wp-content/uploads/2018/03/Deep-Neural-
Network-What-is-Deep-Learning-Edureka.png
1. Artificial neural networks (ANN)
can recognize patterns and
relationships in data.
2. They can classify and categorize
data accurately.
3. They can make predictions based
on input data.
4. Neural networks can be used for
image and speech recognition.
5. Deep neural network is an ANN
that has many layers and can do
more complex predictions.
6. They can be trained to improve
their accuracy over time.
https://www.analyticsvidhya.com/blog/202
1/05/convolutional-neural-networks-cnn/
What is the big deal about Transformers?
1. Because ANNs are implementations in matrix math -
and that relates to the Matrix of Leadership …
2. Transformers improve natural language processing,
enabling better chatbots and language translation
tools.
3. Transformers are a neural network architecture that
outperforms previous models on various NLP tasks.
4. Attention mechanisms in Transformers better model
long-term dependencies in sequential data.
5. Transformers are a hardware accelerator that
speeds up AI computations by several orders of
magnitude.
6. Transformers were invented by Elon Musk
The encoder-decoder structure of the Transformer
architecture
Taken from “Attention Is All You Need“
How LLMs Work &
What LLMs Do
● Transformers Decoder/Encoder
● What LLMs Do: Predict Words
● What LLMs Do: Narrow Possibilities
● What LLMs Do: Verse Jumping
● What LLMs Do: Document Construction
How does a Large Language Model Work?
1. The transformer architecture consists of two
components: the encoder and decoder.
2. The encoder processes the input sequence and
generates embeddings through self-attention
mechanisms.
3. The decoder takes the encoder's embeddings as
input and generates an output sequence, while
also using self-attention mechanisms to attend to
relevant parts of the input sequence.
4. Together, they enable the transformer to learn
complex patterns and relationships within
sequences, making it a powerful tool for natural
language processing and other sequence modeling
tasks.
The encoder-decoder structure of the Transformer
architecture
Taken from “Attention Is All You Need“
What LLMs Do: Predict Words
1. A language model uses deep learning
algorithms to learn patterns and
relationships in large sets of text data.
2. It is trained on a large corpus of text, such
as books, articles, and websites, to
recognize and understand the underlying
structure and meaning of language.
3. Once trained, the model can generate
new text based on the input it receives,
by predicting the most likely sequence of
words to follow.
4. The model uses a probabilistic approach
to generate text, allowing it to produce
diverse and creative responses to different
inputs.
5. LLMs have a wide range of applications,
including language translation, chatbots,
content creation, and more.
https://vectara.com/avoiding-hallucinations-in-llm-powered-applications/
What LLMs Do: Narrow Possibilities
1. A LLM is like a really
smart guesser that's
been trained on a lot
of text.
2. When you give it a
prompt, it starts
guessing what the
next word might be.
3. Instead of guessing
randomly, it predicts
the best possible
word.
4. As you add words to
your prompt, you are
narrowing down the
overall “document”
you get back.
What LLMs Do: Verse Jumping
1. It’s a simulator of the real world, but it isn’t a real
world. Each prompt is a portal to a a possible
realistic universe.
2. It contains probabilities of words or tokens from the
tokenverse strung together which we can call a
“Document”
3. As you give it more words, the universe of possible
“Documents” reduces.
https://now.tufts.edu/2022/05/31/exploring-shape-
our-universe-and-multiverse
What LLMs Do: Document Construction
1. Each model has a
“tokenverse” which it
picks words from.
GPT4 has 100k tokens.
2. Document A &
Document B are
possible path through
all of the tokens in the
tokenverse for a
particular model.
3. If you start with certain
words, a Prompt A’,
the possibility of
getting Document A
increases
4.
A’
B’
LLMs other than
ChatGPT/GPT
● Popular LLMs Available
● Popular Open Source LLMs Available
● Cloud Providers LLM Offerings
Popular Public LLMs Available Today
1. OpenAI: ChatGPT,
GPT3.5-Turbo,
Text-Davinci-003,
GPT4 (Waitlist)
2. Anthropic: Claude,
Claude-Instant
3. Cohere: Baseline,
allows training
https://vectara.com/top-large-language-models-llms-gpt-4-llama-g
ato-bloom-and-when-to-choose-one-over-the-other/
If you are starting out, just use GPT-3.5 Turbo.
It’s easy to get access to, and there are lots of
code examples on Github
Leaked @Google: “We Have No Moat…”
“We Have No Moat, And
Neither Does OpenAI"
https://lmsys.org/blog/2023-03-30-vicuna/
https://www.semianalysis.com/p/google-we-have-
no-moat-and-neither ● Meta LLaMa Open Sourced
● GPT Answers used to Train
● LoRA - Low rank adaptation
● Retraining models is hard
● Small models iterating
better
● Data quality scales better
● Battling open source means
failure
● Companies need users /
researchers
● Individuals can use
different licenses
● Be your customer
● Let open source do the
work
● OpenAI no different than
Google
Example Open LLM: Stanford Alpaca
https://crfm.stanford.edu/2023/03/13/alpaca.html
https://lmsys.org/blog/2023-03-30-vicuna/
Popular Open LLMs Available Today
Leaderboard
1. Vicuna-13b
2. Koala-13b
3. Oast-pythia-12b
Others to Look into
4. StableLM
5. Dolly
6. ChatGLM
https://chat.lmsys.org/
If you don’t want to send your data to a public
LLM, you can host your own open model, or use
Azure OpenAI, Amazon Bedrock
Cost of Fine Tuning: Alpaca/Vicuna
https://lmsys.org/blog/2023-03-30-vicuna/
Public Cloud Offerings of LLM
1. Azure OpenAI
2. Amazon Bedrock
3. NVidia NeMo
4. Google Vertex (batteries
not Included)
https://venturebeat.com/ai/amazon-launches-bedrock-for-generative
-ai-escalating-ai-cloud-wars/
Azure OpenAI is the most mature, and probably the
best. Amazon’s Bedrock offers managed hosting of
Claude, StableLM, etc. Google’s offering requires
work to get it to work.
25
Key Takeaways: History Foundations of LLM
Neural Networks : 1940s/50s
Transformers/Attention: 2017
GPT3: 2020, GPT3.5: 2022
Tensorflow : 2015/ Pytorch 2016
- People have been hacking away at
ML/AI since the 1940s. Until GPUs, TPUs,
Cloud Infrastructure, very few
companies could do “Deep Learning”
- Deep Learning enabled great stuff in
vision, speech, and starts to generative
AI. It wasn’t until the Transformers paper,
that things took off.
- LLMs are good at predicting the “next
word” or token from a tokenverse given
an input.
- The quality / characteristics of the
prompt given, narrows down a
Document from a multiverse of
documents.
TPUs / GPT: 2018, GPT2: 2019
Everything Else: 2023 Q1/Q2
26
Thank you and Dream Big.
Hire us
- Design Workshops
- Innovation Sprints
- Service Catalog
Anant.us
- Read our Playbook
- Join our Mailing List
- Read up on Data Platforms
- Watch our Videos
- Download Examples
1 von 26

Recomendados

The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT! von
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!taozen
1K views13 Folien
OpenAI Chatgpt.pptx von
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxNawroz University
870 views32 Folien
Generative Models and ChatGPT von
Generative Models and ChatGPTGenerative Models and ChatGPT
Generative Models and ChatGPTLoic Merckel
652 views19 Folien
Introduction to ChatGPT von
Introduction to ChatGPTIntroduction to ChatGPT
Introduction to ChatGPTDamian T. Gordon
4.9K views38 Folien
ChatGPT 101 - Vancouver ChatGPT Experts von
ChatGPT 101 - Vancouver ChatGPT ExpertsChatGPT 101 - Vancouver ChatGPT Experts
ChatGPT 101 - Vancouver ChatGPT ExpertsAli Tavanayan
1.2K views29 Folien
ChatGPT-the-revolution-is-coming.pdf von
ChatGPT-the-revolution-is-coming.pdfChatGPT-the-revolution-is-coming.pdf
ChatGPT-the-revolution-is-coming.pdfLiang Yan
2.9K views8 Folien

Más contenido relacionado

Was ist angesagt?

Let's talk about GPT: A crash course in Generative AI for researchers von
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersSteven Van Vaerenbergh
856 views71 Folien
Prompting is an art / Sztuka promptowania von
Prompting is an art / Sztuka promptowaniaPrompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowaniaMichal Jaskolski
289 views49 Folien
Deep dive into ChatGPT von
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPTvaluebound
447 views9 Folien
How Does Generative AI Actually Work? (a quick semi-technical introduction to... von
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
971 views14 Folien
Large Language Models - Chat AI.pdf von
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfDavid Rostcheck
711 views19 Folien
intro chatGPT workshop.pdf von
intro chatGPT workshop.pdfintro chatGPT workshop.pdf
intro chatGPT workshop.pdfpeterpur
1.1K views20 Folien

Was ist angesagt?(20)

Let's talk about GPT: A crash course in Generative AI for researchers von Steven Van Vaerenbergh
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
Prompting is an art / Sztuka promptowania von Michal Jaskolski
Prompting is an art / Sztuka promptowaniaPrompting is an art / Sztuka promptowania
Prompting is an art / Sztuka promptowania
Michal Jaskolski289 views
Deep dive into ChatGPT von valuebound
Deep dive into ChatGPTDeep dive into ChatGPT
Deep dive into ChatGPT
valuebound447 views
How Does Generative AI Actually Work? (a quick semi-technical introduction to... von ssuser4edc93
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
ssuser4edc93971 views
Large Language Models - Chat AI.pdf von David Rostcheck
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
David Rostcheck711 views
intro chatGPT workshop.pdf von peterpur
intro chatGPT workshop.pdfintro chatGPT workshop.pdf
intro chatGPT workshop.pdf
peterpur1.1K views
Revolutionary-ChatGPT von 9 series
Revolutionary-ChatGPTRevolutionary-ChatGPT
Revolutionary-ChatGPT
9 series1.2K views
LLMs_talk_March23.pdf von ChaoYang81
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
ChaoYang8194 views
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap von Anant Corporation
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Anant Corporation664 views
Build an LLM-powered application using LangChain.pdf von AnastasiaSteele10
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdf
AnastasiaSteele10148 views
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in... von David Talby
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
David Talby665 views
ChatGPT.pdf von dhatura
ChatGPT.pdfChatGPT.pdf
ChatGPT.pdf
dhatura1.8K views
Introduction to Chat GPT von DianaGray10
Introduction to Chat GPTIntroduction to Chat GPT
Introduction to Chat GPT
DianaGray101.2K views
ChatGPT Deck.pptx von omornahid1
ChatGPT Deck.pptxChatGPT Deck.pptx
ChatGPT Deck.pptx
omornahid14.2K views

Similar a Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT

Train foundation model for domain-specific language model von
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language modelBenjaminlapid1
32 views45 Folien
Nautral Langauge Processing - Basics / Non Technical von
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
666 views20 Folien
LangChain Intro by KeyMate.AI von
LangChain Intro by KeyMate.AILangChain Intro by KeyMate.AI
LangChain Intro by KeyMate.AIOzgurOscarOzkan
393 views19 Folien
LanGCHAIN Framework von
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN FrameworkKeymate.AI
1.5K views19 Folien
Technologies for startup von
Technologies for startupTechnologies for startup
Technologies for startupDzung Nguyen
125 views23 Folien
The Guide to becoming a full stack developer in 2018 von
The Guide to becoming a full stack developer in 2018The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018Amit Ashwini
50 views14 Folien

Similar a Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT(20)

Train foundation model for domain-specific language model von Benjaminlapid1
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language model
Benjaminlapid132 views
Nautral Langauge Processing - Basics / Non Technical von Dhruv Gohil
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
Dhruv Gohil666 views
LanGCHAIN Framework von Keymate.AI
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN Framework
Keymate.AI1.5K views
Technologies for startup von Dzung Nguyen
Technologies for startupTechnologies for startup
Technologies for startup
Dzung Nguyen125 views
The Guide to becoming a full stack developer in 2018 von Amit Ashwini
The Guide to becoming a full stack developer in 2018The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018
Amit Ashwini50 views
Dmdh winter 2015 session #1 von sarahkh12
Dmdh winter 2015 session #1Dmdh winter 2015 session #1
Dmdh winter 2015 session #1
sarahkh12892 views
DMDS Winter 2015 Workshop 1 slides von Paige Morgan
DMDS Winter 2015 Workshop 1 slidesDMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slides
Paige Morgan835 views
Google cloud Study Jam 2023.pptx von GDSCNiT
Google cloud Study Jam 2023.pptxGoogle cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptx
GDSCNiT622 views
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s... von Mihai Criveti
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Mihai Criveti253 views
Deprecating the state machine: building conversational AI with the Rasa stack von Justina Petraitytė
Deprecating the state machine: building conversational AI with the Rasa stackDeprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stack
Deprecating the state machine: building conversational AI with the Rasa stack... von PyData
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...
PyData3.6K views
Build an LLM-powered application using LangChain.pdf von StephenAmell4
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdf
StephenAmell4956 views
LangChain + Docugami Webinar von Taqi Jaffri
LangChain + Docugami WebinarLangChain + Docugami Webinar
LangChain + Docugami Webinar
Taqi Jaffri96 views
10 Limitations of Large Language Models and Mitigation Options von Mihai Criveti
10 Limitations of Large Language Models and Mitigation Options10 Limitations of Large Language Models and Mitigation Options
10 Limitations of Large Language Models and Mitigation Options
Mihai Criveti62 views
Build an LLM-powered application using LangChain.pdf von MatthewHaws4
Build an LLM-powered application using LangChain.pdfBuild an LLM-powered application using LangChain.pdf
Build an LLM-powered application using LangChain.pdf
MatthewHaws416 views

Más de Anant Corporation

Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot von
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotAnant Corporation
24 views22 Folien
YugabyteDB Developer Tools von
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer ToolsAnant Corporation
23 views14 Folien
Machine Learning Orchestration with Airflow von
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowAnant Corporation
40 views11 Folien
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ... von
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Anant Corporation
40 views11 Folien
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S... von
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
19 views33 Folien
Data Engineer's Lunch #85: Designing a Modern Data Stack von
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
150 views27 Folien

Más de Anant Corporation(20)

Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot von Anant Corporation
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ... von Anant Corporation
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S... von Anant Corporation
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #85: Designing a Modern Data Stack von Anant Corporation
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
Anant Corporation150 views
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg von Anant Corporation
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Anant Corporation219 views
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps von Anant Corporation
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra von Anant Corporation
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache... von Anant Corporation
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Anant Corporation103 views
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness von Anant Corporation
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms von Anant Corporation
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Anant Corporation115 views
Data Engineer’s Lunch #67: Machine Learning - Feature Selection von Anant Corporation
Data Engineer’s Lunch #67: Machine Learning - Feature SelectionData Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer’s Lunch #67: Machine Learning - Feature Selection
Data Engineer's Lunch #80: Apache Spark Resource Managers von Anant Corporation
Data Engineer's Lunch #80: Apache Spark Resource ManagersData Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for ... von Anant Corporation
Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for ...Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for ...
Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for ...
Data Engineer's Lunch #76: Airflow and Google Dataproc von Anant Corporation
Data Engineer's Lunch #76: Airflow and Google DataprocData Engineer's Lunch #76: Airflow and Google Dataproc
Data Engineer's Lunch #76: Airflow and Google Dataproc
Data Engineer's Lunch #67: Machine Learning - Feature Selection von Anant Corporation
Data Engineer's Lunch #67: Machine Learning - Feature SelectionData Engineer's Lunch #67: Machine Learning - Feature Selection
Data Engineer's Lunch #67: Machine Learning - Feature Selection
Anant Corporation123 views
Data Engineer's Lunch #63: Building a Cryptocurrency Data Catalogue von Anant Corporation
Data Engineer's Lunch #63: Building a Cryptocurrency Data CatalogueData Engineer's Lunch #63: Building a Cryptocurrency Data Catalogue
Data Engineer's Lunch #63: Building a Cryptocurrency Data Catalogue
Anant Corporation126 views

Último

The Role of Patterns in the Era of Large Language Models von
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language ModelsYunyao Li
80 views65 Folien
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... von
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...ShapeBlue
101 views17 Folien
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... von
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...ShapeBlue
158 views20 Folien
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue von
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueShapeBlue
176 views20 Folien
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N... von
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...James Anderson
156 views32 Folien
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue von
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueShapeBlue
94 views13 Folien

Último(20)

The Role of Patterns in the Era of Large Language Models von Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li80 views
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... von ShapeBlue
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
ShapeBlue101 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... von ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue158 views
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue von ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue176 views
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N... von James Anderson
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
James Anderson156 views
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue von ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
ShapeBlue94 views
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue von ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueVNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
ShapeBlue163 views
Future of AR - Facebook Presentation von Rob McCarty
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook Presentation
Rob McCarty62 views
NTGapps NTG LowCode Platform von Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu365 views
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue von ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
ShapeBlue103 views
DRBD Deep Dive - Philipp Reisner - LINBIT von ShapeBlue
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBIT
ShapeBlue140 views
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ... von ShapeBlue
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
ShapeBlue85 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows von Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software385 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... von ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue98 views
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... von ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue123 views
Data Integrity for Banking and Financial Services von Precisely
Data Integrity for Banking and Financial ServicesData Integrity for Banking and Financial Services
Data Integrity for Banking and Financial Services
Precisely78 views
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit... von ShapeBlue
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
ShapeBlue117 views
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... von ShapeBlue
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
ShapeBlue132 views
Confidence in CloudStack - Aron Wagner, Nathan Gleason - Americ von ShapeBlue
Confidence in CloudStack - Aron Wagner, Nathan Gleason - AmericConfidence in CloudStack - Aron Wagner, Nathan Gleason - Americ
Confidence in CloudStack - Aron Wagner, Nathan Gleason - Americ
ShapeBlue88 views
Why and How CloudStack at weSystems - Stephan Bienek - weSystems von ShapeBlue
Why and How CloudStack at weSystems - Stephan Bienek - weSystemsWhy and How CloudStack at weSystems - Stephan Bienek - weSystems
Why and How CloudStack at weSystems - Stephan Bienek - weSystems
ShapeBlue197 views

Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT

  • 1. NoCode, Data & AI LLM Inside Bootcamp Fundamentals of LLM What is a large language model, how is it trained, how are different from traditional machine learning models. Rahul Xavier Singh Anant Corporation Nocode Data & AI
  • 2. To most , LLMs seem like magic. In computing & technology, LLMs show great promise in bridging the gap between human computer interaction.
  • 4. NoCode, Data & AI LLM Inside Bootcamp with Cassandra Full day bootcamp to familiarize product managers, software professionals, and data engineers to creating next generation experts, assistants, and platforms powered by Generative AI with Large Language Models (LLM, OpenAI, GPT) Rahul Xavier Singh Anant Corporation Nocode Data & AI kono.io/bootcamp
  • 5. Agenda ● I: Strategy & Theory ● II: LLM Design Patterns ● III: NoCode/Code LLM Stacks ● IV: Build a Custom ChatBot with LLM your Data
  • 6. Today’s Agenda 1. Fundamentals of ML 2. Transformers Architecture 3. How LLMs Work 4. LLMs other than ChatGPT/GPT
  • 7. Fundamentals of ML/Transformers ● History of LLMs (Large Language Models) ● What is Machine Learning / AI? ● Transformer Architecture
  • 8. History of Large Language Models 1. Everything before GPT-3 (2020) was trash. 2. ChatGPT made GPT-3 popular. 3. Now everyone wants in on the party. https://voicebot.ai/large-language-models -history-timeline/ Most of the hype, growth relating to LLMs have happened in the last 6 months ( November 2022 till now , May 2023
  • 9. Machine Learning in a Nutshell https://www.avenga.com/magazine/machi ne-learning-programming/ 1. In machine learning, the computer trains on your data, and gives you the most likely answer. The better the data, the better the algorithm. 2. Neural networks process input data through layers to predict outcomes based on patterns and relationships learned during training.
  • 10. What can Neural Neworks do? https://thedatascientist.com/wp-content/uploads/2018/03/Deep-Neural- Network-What-is-Deep-Learning-Edureka.png 1. Artificial neural networks (ANN) can recognize patterns and relationships in data. 2. They can classify and categorize data accurately. 3. They can make predictions based on input data. 4. Neural networks can be used for image and speech recognition. 5. Deep neural network is an ANN that has many layers and can do more complex predictions. 6. They can be trained to improve their accuracy over time. https://www.analyticsvidhya.com/blog/202 1/05/convolutional-neural-networks-cnn/
  • 11. What is the big deal about Transformers? 1. Because ANNs are implementations in matrix math - and that relates to the Matrix of Leadership … 2. Transformers improve natural language processing, enabling better chatbots and language translation tools. 3. Transformers are a neural network architecture that outperforms previous models on various NLP tasks. 4. Attention mechanisms in Transformers better model long-term dependencies in sequential data. 5. Transformers are a hardware accelerator that speeds up AI computations by several orders of magnitude. 6. Transformers were invented by Elon Musk The encoder-decoder structure of the Transformer architecture Taken from “Attention Is All You Need“
  • 12. How LLMs Work & What LLMs Do ● Transformers Decoder/Encoder ● What LLMs Do: Predict Words ● What LLMs Do: Narrow Possibilities ● What LLMs Do: Verse Jumping ● What LLMs Do: Document Construction
  • 13. How does a Large Language Model Work? 1. The transformer architecture consists of two components: the encoder and decoder. 2. The encoder processes the input sequence and generates embeddings through self-attention mechanisms. 3. The decoder takes the encoder's embeddings as input and generates an output sequence, while also using self-attention mechanisms to attend to relevant parts of the input sequence. 4. Together, they enable the transformer to learn complex patterns and relationships within sequences, making it a powerful tool for natural language processing and other sequence modeling tasks. The encoder-decoder structure of the Transformer architecture Taken from “Attention Is All You Need“
  • 14. What LLMs Do: Predict Words 1. A language model uses deep learning algorithms to learn patterns and relationships in large sets of text data. 2. It is trained on a large corpus of text, such as books, articles, and websites, to recognize and understand the underlying structure and meaning of language. 3. Once trained, the model can generate new text based on the input it receives, by predicting the most likely sequence of words to follow. 4. The model uses a probabilistic approach to generate text, allowing it to produce diverse and creative responses to different inputs. 5. LLMs have a wide range of applications, including language translation, chatbots, content creation, and more. https://vectara.com/avoiding-hallucinations-in-llm-powered-applications/
  • 15. What LLMs Do: Narrow Possibilities 1. A LLM is like a really smart guesser that's been trained on a lot of text. 2. When you give it a prompt, it starts guessing what the next word might be. 3. Instead of guessing randomly, it predicts the best possible word. 4. As you add words to your prompt, you are narrowing down the overall “document” you get back.
  • 16. What LLMs Do: Verse Jumping 1. It’s a simulator of the real world, but it isn’t a real world. Each prompt is a portal to a a possible realistic universe. 2. It contains probabilities of words or tokens from the tokenverse strung together which we can call a “Document” 3. As you give it more words, the universe of possible “Documents” reduces. https://now.tufts.edu/2022/05/31/exploring-shape- our-universe-and-multiverse
  • 17. What LLMs Do: Document Construction 1. Each model has a “tokenverse” which it picks words from. GPT4 has 100k tokens. 2. Document A & Document B are possible path through all of the tokens in the tokenverse for a particular model. 3. If you start with certain words, a Prompt A’, the possibility of getting Document A increases 4. A’ B’
  • 18. LLMs other than ChatGPT/GPT ● Popular LLMs Available ● Popular Open Source LLMs Available ● Cloud Providers LLM Offerings
  • 19. Popular Public LLMs Available Today 1. OpenAI: ChatGPT, GPT3.5-Turbo, Text-Davinci-003, GPT4 (Waitlist) 2. Anthropic: Claude, Claude-Instant 3. Cohere: Baseline, allows training https://vectara.com/top-large-language-models-llms-gpt-4-llama-g ato-bloom-and-when-to-choose-one-over-the-other/ If you are starting out, just use GPT-3.5 Turbo. It’s easy to get access to, and there are lots of code examples on Github
  • 20. Leaked @Google: “We Have No Moat…” “We Have No Moat, And Neither Does OpenAI" https://lmsys.org/blog/2023-03-30-vicuna/ https://www.semianalysis.com/p/google-we-have- no-moat-and-neither ● Meta LLaMa Open Sourced ● GPT Answers used to Train ● LoRA - Low rank adaptation ● Retraining models is hard ● Small models iterating better ● Data quality scales better ● Battling open source means failure ● Companies need users / researchers ● Individuals can use different licenses ● Be your customer ● Let open source do the work ● OpenAI no different than Google
  • 21. Example Open LLM: Stanford Alpaca https://crfm.stanford.edu/2023/03/13/alpaca.html https://lmsys.org/blog/2023-03-30-vicuna/
  • 22. Popular Open LLMs Available Today Leaderboard 1. Vicuna-13b 2. Koala-13b 3. Oast-pythia-12b Others to Look into 4. StableLM 5. Dolly 6. ChatGLM https://chat.lmsys.org/ If you don’t want to send your data to a public LLM, you can host your own open model, or use Azure OpenAI, Amazon Bedrock
  • 23. Cost of Fine Tuning: Alpaca/Vicuna https://lmsys.org/blog/2023-03-30-vicuna/
  • 24. Public Cloud Offerings of LLM 1. Azure OpenAI 2. Amazon Bedrock 3. NVidia NeMo 4. Google Vertex (batteries not Included) https://venturebeat.com/ai/amazon-launches-bedrock-for-generative -ai-escalating-ai-cloud-wars/ Azure OpenAI is the most mature, and probably the best. Amazon’s Bedrock offers managed hosting of Claude, StableLM, etc. Google’s offering requires work to get it to work.
  • 25. 25 Key Takeaways: History Foundations of LLM Neural Networks : 1940s/50s Transformers/Attention: 2017 GPT3: 2020, GPT3.5: 2022 Tensorflow : 2015/ Pytorch 2016 - People have been hacking away at ML/AI since the 1940s. Until GPUs, TPUs, Cloud Infrastructure, very few companies could do “Deep Learning” - Deep Learning enabled great stuff in vision, speech, and starts to generative AI. It wasn’t until the Transformers paper, that things took off. - LLMs are good at predicting the “next word” or token from a tokenverse given an input. - The quality / characteristics of the prompt given, narrows down a Document from a multiverse of documents. TPUs / GPT: 2018, GPT2: 2019 Everything Else: 2023 Q1/Q2
  • 26. 26 Thank you and Dream Big. Hire us - Design Workshops - Innovation Sprints - Service Catalog Anant.us - Read our Playbook - Join our Mailing List - Read up on Data Platforms - Watch our Videos - Download Examples