Minerva - Solving Quantitative Reasoning Problems with Language Models


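Language models have achieved remarkable results on a wide range of tasks that require natural language understanding. However, state-of-the-art models generally still struggle with the quantitative reasoning needed to solve college-level mathematics, science, and engineering problems. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without using external tools. We also evaluate it on more than 200 undergraduate-level problems in physics, biology, chemistry, economics, and other fields that require quantitative reasoning, and find that the model correctly answers nearly a third of them.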


Minerva: Solving Quantitative Reasoning Problems with Language Models
Google Research
조해창 (presenter), 박희수, 허정원

CONTENTS
Quantitative Reasoning Problems
Minerva - Overview
Prompt
Majority voting
Conclusions

Quantitative Reasoning Problems
• Quantitative reasoning problems:
  • Math problems, science problems
  • Problems that involve numerical computation and have a definite correct answer
• Datasets:
  • Math Word Problem (MWP)
  • Grade School Math (GSM8K)
  • Massive Multitask Language Understanding (MMLU)

Quantitative Reasoning Problems
• Earlier approaches used GPT and BERT.
• Language models do not learn quantitative information (such as numbers) well.
• They may have to rely on external tools.

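Quantitative Reasoning Problems
<Math word problem>
A regular octagon has the same perimeter as a regular hexagon whose sides are 24 cm long. Find the length, in cm, of one side of the octagon.
<Solution process> (generated by the AI model)
a = 24          # side length of the hexagon (cm)
b = 6           # number of sides of the hexagon
c = 8           # number of sides of the octagon
y = a * b // c  # shared perimeter divided by the octagon's side count
print(y)
<Answer> (computed by the external tool)
18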
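The pipeline on this slide (the model writes a program, an external tool runs it) can be sketched roughly as follows. This is only an illustration: `generated_program` is a hard-coded stand-in for model output, and `run_with_external_tool` is a hypothetical helper name, not anything from the slides or the paper.

```python
import contextlib
import io

# Assumption: stand-in for what a model would generate for the word problem.
generated_program = """
a = 24
b = 6
c = 8
y = a * b // c
print(y)
"""


def run_with_external_tool(program: str) -> str:
    """Execute generated Python code and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # exec is acceptable for a trusted toy string; real model output
        # would need sandboxing.
        exec(program, {})
    return buffer.getvalue().strip()


print(run_with_external_tool(generated_program))  # prints 18
```

Minerva, by contrast, is described in this deck as producing the answer without such a tool.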

Minerva
• Training data:
  • Math web pages
  • arXiv
  • General natural language data
• Methods:
  • Few-shot prompting
  • Chain-of-thought and scratchpad prompting
  • Majority voting with nucleus sampling
• Based on the PaLM general language models

Prompt
• A hint that reminds someone of their next line
user@host:/home$ _
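As a rough illustration of few-shot prompting with worked solutions, the sketch below concatenates solved examples and ends with the new problem, leaving the solution open for the model to continue. The `Problem:` / `Solution:` labels, the example text, and the function name `build_few_shot_prompt` are assumptions made for this sketch, not the exact prompt format from the slides.

```python
def build_few_shot_prompt(examples, question):
    """Join worked (problem, solution) pairs, then append the new problem."""
    parts = []
    for problem, solution in examples:
        parts.append(f"Problem: {problem}\nSolution: {solution}\n")
    # The final "Solution:" is left open so the model writes the reasoning.
    parts.append(f"Problem: {question}\nSolution:")
    return "\n".join(parts)


# Hypothetical worked examples that show step-by-step reasoning.
examples = [
    ("A square has sides of 5 cm. What is its perimeter?",
     "A square has 4 equal sides, so the perimeter is 4 * 5 = 20 cm. Final answer: 20"),
    ("A regular pentagon has a perimeter of 35 cm. How long is one side?",
     "A regular pentagon has 5 equal sides, so one side is 35 / 5 = 7 cm. Final answer: 7"),
]
question = ("A regular octagon has the same perimeter as a regular hexagon "
            "with 24 cm sides. How long is one side of the octagon?")

prompt = build_few_shot_prompt(examples, question)
print(prompt)
```

Because every example ends with an explicit final answer, the model's continuation tends to follow the same pattern, which makes the answer easy to extract afterwards.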

Majority voting
• Sample 256 or more outputs.
• Effective because the set of possible answers is not restricted.
Nucleus sampling
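The two ideas on this slide can be sketched together on a toy example. This is a simplified illustration under stated assumptions: real nucleus sampling is applied to the next-token distribution at each decoding step, whereas here `toy_answer_probs` is a made-up distribution over whole final answers, and the function names are hypothetical.

```python
import random
from collections import Counter


def nucleus_sample(probs, top_p, rng):
    """Sample from the smallest set of most likely items whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, mass = [], 0.0
    for item, p in ranked:
        nucleus.append((item, p))
        mass += p
        if mass >= top_p:
            break  # everything less likely is cut off
    items, weights = zip(*nucleus)
    # random.choices renormalizes the weights within the nucleus.
    return rng.choices(items, weights=weights, k=1)[0]


def majority_vote(answers):
    """Return the most frequent final answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


# Assumption: toy distribution standing in for a model's sampled final answers.
toy_answer_probs = {"18": 0.55, "16": 0.25, "24": 0.15, "144": 0.05}
rng = random.Random(0)
samples = [nucleus_sample(toy_answer_probs, top_p=0.9, rng=rng) for _ in range(256)]
print(majority_vote(samples))
```

With `top_p=0.9` the least likely answer falls outside the nucleus and is never sampled, and the vote then picks whichever remaining answer appears most often across the 256 samples.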

Conclusions
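• With a high-quality dataset, quantitative problems can be solved effectively without the help of external tools.
• There is no way to verify that the solution steps themselves are correct, only the final answer.
• Combining the model with a code generation model may lead to strong performance.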
