<style>
.gsm8k-page * { box-sizing: border-box; }
.gsm8k-page h1, .gsm8k-page h2, .gsm8k-page h3, .gsm8k-page h4, .gsm8k-page h5, .gsm8k-page h6, .gsm8k-page p, .gsm8k-page ul, .gsm8k-page ol, .gsm8k-page li, .gsm8k-page pre, .gsm8k-page blockquote, .gsm8k-page table, .gsm8k-page td, .gsm8k-page th { margin: 0; padding: 0; }
.gsm8k-page {
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
color: var(--el-text-color-primary);
background: var(--el-bg-color);
line-height: 1.6;
}
.gsm8k-page a { text-decoration: none; color: inherit; }
.gsm8k-page a:hover { text-decoration: none; }
.gsm8k-page ul { list-style: none; }
.markdown-body .gsm8k-page a { color: inherit !important; text-decoration: none !important; }
.markdown-body .gsm8k-page a:hover { text-decoration: none !important; }
.markdown-body .gsm8k-page a.s-btn-primary,
.markdown-body .gsm8k-page a.btn-cta-light { color: #ffffff !important; }
.markdown-body .gsm8k-page a.s-btn-secondary { color: var(--el-text-color-primary) !important; }
.markdown-body .gsm8k-page a.btn-cta-ghost { color: #94a3b8 !important; }
.markdown-body .gsm8k-page a.btn-cta-ghost:hover { color: #e2e8f0 !important; }
.markdown-body .gsm8k-page h1, .markdown-body .gsm8k-page h2 { border-bottom: none !important; padding-bottom: 0 !important; }
.gsm8k-page .s-container { max-width: 1200px; margin: 0 auto; padding: 0 24px; }
.gsm8k-page .s-container-narrow { max-width: 800px; margin: 0 auto; padding: 0 24px; }
.gsm8k-page .s-container-wide { max-width: 1100px; margin: 0 auto; padding: 0 32px; }
.gsm8k-page .s-section { padding: 80px 0; }
.gsm8k-page .s-section-lg { padding: 100px 0; }
.gsm8k-page .s-section-sm { padding: 48px 0; }
.gsm8k-page .s-bg-white { background: var(--el-bg-color); }
.gsm8k-page .s-bg-gray { background: var(--el-bg-color-page); }
.gsm8k-page .s-bg-dark { background: #0f172a; color: #f8fafc; }
.gsm8k-page .s-header { text-align: center; margin-bottom: 64px; }
.gsm8k-page .s-header h2 {
font-size: clamp(28px, 4vw, 40px);
font-weight: 700;
color: var(--el-text-color-primary);
letter-spacing: normal;
margin-bottom: 20px;
line-height: 1.15;
}
.gsm8k-page .s-header p {
font-size: clamp(16px, 2vw, 18px);
color: var(--el-text-color-regular);
max-width: 640px;
margin: 0 auto;
line-height: 1.6;
}
.gsm8k-page .s-bg-dark .s-header h2 { color: #f8fafc; }
.gsm8k-page .s-bg-dark .s-header p { color: var(--el-text-color-secondary); }
.gsm8k-page .s-btn-primary {
display: inline-flex; align-items: center; gap: 6px;
padding: 14px 28px;
background: #059669; color: #ffffff !important;
border-radius: 9999px; font-size: 15px; font-weight: 600;
transition: background 0.2s, transform 0.15s;
border: none; cursor: pointer;
text-decoration: none !important;
}
.gsm8k-page .s-btn-primary:hover { background: #047857; transform: translateY(-1px); text-decoration: none !important; }
.gsm8k-page .s-btn-secondary {
display: inline-flex; align-items: center; gap: 6px;
padding: 14px 28px;
background: var(--el-bg-color); color: var(--el-text-color-primary) !important;
border: 1px solid var(--el-border-color-light);
border-radius: 9999px; font-size: 15px; font-weight: 600;
transition: border-color 0.2s, background 0.2s;
cursor: pointer;
text-decoration: none !important;
}
.gsm8k-page .s-btn-secondary:hover { background: var(--el-bg-color-page); text-decoration: none !important; }
.gsm8k-hero {
padding: 100px 0 80px;
text-align: center;
background: var(--el-bg-color);
position: relative;
overflow: hidden;
}
.gsm8k-hero::before {
content: '';
position: absolute;
top: -200px; left: 50%;
transform: translateX(-50%);
width: 900px; height: 500px;
background: radial-gradient(ellipse, rgba(5, 150, 105, 0.06) 0%, transparent 70%);
pointer-events: none;
}
.gsm8k-page .hero-badge {
display: inline-flex; align-items: center; gap: 8px;
padding: 6px 16px;
background: var(--el-bg-color-page); border: 1px solid var(--el-border-color-light);
border-radius: 9999px; font-size: 13px; font-weight: 600; color: var(--el-text-color-regular);
margin-bottom: 28px;
}
.gsm8k-page .hero-badge .badge-dot {
width: 6px; height: 6px; background: #10b981; border-radius: 50%;
display: inline-block;
}
.gsm8k-hero h1 {
font-size: clamp(36px, 5vw, 60px);
font-weight: 700; line-height: 1.05;
letter-spacing: normal; color: var(--el-text-color-primary);
margin-bottom: 20px;
position: relative;
}
.gsm8k-hero h1 span { color: #059669; }
.gsm8k-page .hero-subtitle {
font-size: clamp(16px, 2vw, 20px);
color: var(--el-text-color-regular); line-height: 1.6;
max-width: 620px; margin: 0 auto 56px;
position: relative;
}
.gsm8k-page .hero-actions {
display: flex; gap: 12px; justify-content: center;
flex-wrap: wrap; margin-bottom: 56px; position: relative;
}
.gsm8k-page .hero-highlights {
display: flex; align-items: center; justify-content: center;
gap: 16px; flex-wrap: wrap; position: relative;
}
.gsm8k-page .hero-highlights .h-item { font-size: 14px; color: var(--el-text-color-regular); font-weight: 500; }
.gsm8k-page .hero-highlights .h-div { width: 1px; height: 16px; background: var(--el-border-color-light); }
@media (max-width: 640px)
{ .gsm8k-page .hero-highlights .h-div { display: none; } .gsm8k-page .hero-highlights { gap: 8px 16px; } .gsm8k-page .hero-actions { flex-direction: column; align-items: center; } .gsm8k-page .hero-actions a { width: 100%; max-width: 280px; justify-content: center; } } .gsm8k-page .hero-cover { max-width: 720px; margin: 48px auto 0; border-radius: 16px; overflow: hidden; box-shadow: 0 8px 32px rgba(0,0,0,0.10); } .gsm8k-page .hero-cover img { width: 100%; height: auto; display: block; } .gsm8k-stats { padding: 48px 0; background: var(--el-bg-color-page); border-top: 1px solid var(--el-border-color-lighter); border-bottom: 1px solid var(--el-border-color-lighter); } .gsm8k-page .stats-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 32px; text-align: center; } .gsm8k-page .stat-icon { font-size: 28px; margin-bottom: 12px; } .gsm8k-page .stat-val { font-size: clamp(28px, 4vw, 40px); font-weight: 700; color: var(--el-text-color-primary); letter-spacing: normal; margin-bottom: 4px; } .gsm8k-page .stat-lbl { font-size: 14px; color: var(--el-text-color-secondary); font-weight: 500; } @media (max-width: 768px) { .gsm8k-page .stats-grid { grid-template-columns: repeat(2, 1fr); gap: 24px; } } @media (max-width: 480px) { .gsm8k-page .stats-grid { grid-template-columns: 1fr; gap: 20px; } } .gsm8k-page .features-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 24px; } .gsm8k-page .feat-card { padding: 32px 28px; border: none; border-radius: 20px; box-shadow: 0 2px 12px 0 rgba(0,0,0,0.08); background: var(--el-bg-color); transition: border-color 0.2s, box-shadow 0.2s, transform 0.15s; } .gsm8k-page .feat-card:hover { box-shadow: 0 8px 24px 0 rgba(0,0,0,0.12); transform: translateY(-2px); } .gsm8k-page .feat-icon { font-size: 32px; margin-bottom: 16px; } .gsm8k-page .feat-card h3 { font-size: 18px; font-weight: 700; color: var(--el-text-color-primary); margin-bottom: 8px; } .gsm8k-page .feat-card p { font-size: 15px; color: 
var(--el-text-color-regular); line-height: 1.6; } @media (max-width: 1024px) { .gsm8k-page .features-grid { grid-template-columns: repeat(2, 1fr); } } @media (max-width: 640px) { .gsm8k-page .features-grid { grid-template-columns: 1fr; } } .gsm8k-page .usecases-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 20px; } .gsm8k-page .uc-card { padding: 28px 24px; background: var(--el-bg-color); border: none; border-radius: 20px; box-shadow: 0 2px 12px 0 rgba(0,0,0,0.08); text-align: center; transition: border-color 0.2s, box-shadow 0.2s, transform 0.15s; } .gsm8k-page .uc-card:hover { box-shadow: 0 8px 24px 0 rgba(0,0,0,0.12); transform: translateY(-2px); } .gsm8k-page .uc-icon { font-size: 36px; margin-bottom: 16px; } .gsm8k-page .uc-card h3 { font-size: 17px; font-weight: 700; color: var(--el-text-color-primary); margin-bottom: 8px; } .gsm8k-page .uc-card p { font-size: 14px; color: var(--el-text-color-regular); line-height: 1.6; } @media (max-width: 1024px) { .gsm8k-page .usecases-grid { grid-template-columns: repeat(2, 1fr); } } @media (max-width: 480px) { .gsm8k-page .usecases-grid { grid-template-columns: 1fr; } } .gsm8k-page .code-wrap { border-radius: 16px !important; overflow: hidden !important; border: 1px solid #334155 !important; background: #0f172a !important; max-width: 860px; margin: 0 auto; } .markdown-body .gsm8k-page .code-wrap { border-radius: 16px !important; overflow: hidden !important; border: 1px solid #334155 !important; background: #0f172a !important; } .gsm8k-page .code-bar { display: flex !important; align-items: center !important; justify-content: space-between !important; padding: 12px 20px !important; background: #1e293b !important; border-bottom: 1px solid #334155 !important; } .gsm8k-page .code-dots { display: flex; gap: 6px; } .gsm8k-page .code-dots i { width: 10px; height: 10px; border-radius: 50%; display: inline-block; } .gsm8k-page .code-dots .r { background: #ef4444; } .gsm8k-page .code-dots .y { background: 
#f59e0b; } .gsm8k-page .code-dots .g { background: #10b981; } .gsm8k-page .code-lang { font-size: 12px; color: var(--el-text-color-secondary); font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; } .gsm8k-page .code-block { padding: 24px !important; margin: 0 !important; overflow-x: auto !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; font-size: 13.5px !important; line-height: 1.7 !important; color: #e2e8f0 !important; white-space: pre !important; background: transparent !important; border: none !important; border-radius: 0 !important; } .markdown-body .gsm8k-page .code-block { padding: 24px !important; margin: 0 !important; overflow-x: auto !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; font-size: 13.5px !important; line-height: 1.7 !important; color: #e2e8f0 !important; white-space: pre !important; background: transparent !important; border: none !important; border-radius: 0 !important; } .gsm8k-page .steps-row { display: flex; align-items: flex-start; justify-content: center; margin-bottom: 48px; } .gsm8k-page .stp-card { flex: 1; max-width: 320px; text-align: center; padding: 0 24px; } .gsm8k-page .stp-num { font-size: clamp(48px, 6vw, 72px); font-weight: 700; color: #e2e8f0; letter-spacing: -0.04em; line-height: 1; margin-bottom: 20px; } .gsm8k-page .stp-card h3 { font-size: 18px; font-weight: 700; color: var(--el-text-color-primary); margin-bottom: 10px; } .gsm8k-page .stp-card p { font-size: 15px; color: var(--el-text-color-regular); line-height: 1.6; } .gsm8k-page .stp-conn { width: 60px; height: 2px; background: var(--el-border-color-light); margin-top: 36px; flex-shrink: 0; } .gsm8k-page .steps-cta { text-align: center; } @media (max-width: 768px) { .gsm8k-page .steps-row { flex-direction: column; align-items: center; gap: 32px; } .gsm8k-page .stp-conn { width: 2px; height: 32px; margin: 0; } .gsm8k-page .stp-card { max-width: 100%; } } .gsm8k-cta { padding: 
100px 0; background: #0f172a; text-align: center; position: relative; overflow: hidden; } .gsm8k-cta::before { content: ''; position: absolute; top: -100px; left: 50%; transform: translateX(-50%); width: 700px; height: 400px; background: radial-gradient(ellipse, rgba(5, 150, 105, 0.12) 0%, transparent 70%); pointer-events: none; } .gsm8k-cta h2 { font-size: clamp(28px, 4vw, 44px); font-weight: 700; color: #f8fafc; letter-spacing: normal; margin-bottom: 28px; position: relative; } .gsm8k-cta > div > p { font-size: clamp(16px, 2vw, 18px); color: var(--el-text-color-secondary); max-width: 520px; margin: 0 auto 56px; line-height: 1.6; position: relative; } .gsm8k-page .cta-actions { display: flex; gap: 12px; justify-content: center; flex-wrap: wrap; position: relative; } .gsm8k-page .btn-cta-light { display: inline-flex; align-items: center; gap: 6px; padding: 14px 32px; background: #059669; color: #ffffff !important; border-radius: 9999px; font-size: 15px; font-weight: 700; transition: background 0.2s, transform 0.15s; text-decoration: none !important; } .gsm8k-page .btn-cta-light:hover { background: #047857; transform: translateY(-1px); text-decoration: none !important; } .gsm8k-page .btn-cta-ghost { display: inline-flex; align-items: center; padding: 14px 32px; background: transparent; color: #94a3b8 !important; border: 1px solid #334155; border-radius: 9999px; font-size: 15px; font-weight: 600; transition: border-color 0.2s, color 0.2s; text-decoration: none !important; } .gsm8k-page .btn-cta-ghost:hover { border-color: var(--el-text-color-regular); color: #e2e8f0 !important; text-decoration: none !important; } .gsm8k-page code { background: #ecfdf5 !important; padding: 2px 8px !important; border-radius: 5px !important; font-size: 13px !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; color: #064e3b !important; border: 1px solid #6ee7b7 !important; } .gsm8k-page .s-text-dark { color: var(--el-text-color-primary); } .gsm8k-page 
.s-text-brand { color: #059669; } .gsm8k-page .s-section-body { font-size: 16px; color: var(--el-text-color-regular); line-height: 1.8; text-align: center; max-width: 680px; margin: 0 auto; } .gsm8k-page .s-section-body p + p { margin-top: 16px; } .gsm8k-page .tag-row { display: flex; gap: 8px; flex-wrap: wrap; justify-content: center; margin-top: 16px; } .gsm8k-page .tag-item
{
padding: 4px 12px; background: var(--el-bg-color-page);
border: 1px solid var(--el-border-color-light); border-radius: 9999px;
font-size: 12px; font-weight: 600; color: var(--el-text-color-regular);
}
html.dark .gsm8k-page { background: var(--el-bg-color); color: var(--el-text-color-primary); }
html.dark .gsm8k-page a { color: inherit; }
html.dark .markdown-body .gsm8k-page a { color: inherit !important; }
html.dark .markdown-body .gsm8k-page a.s-btn-primary,
html.dark .markdown-body .gsm8k-page a.btn-cta-light { color: #ffffff !important; }
html.dark .markdown-body .gsm8k-page a.s-btn-secondary { color: var(--el-text-color-primary) !important; }
html.dark .markdown-body .gsm8k-page a.btn-cta-ghost { color: #94a3b8 !important; }
html.dark .markdown-body .gsm8k-page a.btn-cta-ghost:hover { color: var(--el-text-color-primary) !important; }
html.dark .gsm8k-page .s-bg-white { background: var(--el-bg-color); }
html.dark .gsm8k-page .s-bg-gray { background: var(--el-bg-color-page); }
html.dark .gsm8k-page .s-bg-dark { background: var(--el-bg-color); }
html.dark .gsm8k-page .s-header h2 { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .s-header p { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .s-btn-primary { background: #059669; color: #ffffff !important; }
html.dark .gsm8k-page .s-btn-primary:hover { background: #047857; }
html.dark .gsm8k-page .s-btn-secondary {
background: #1e293b; color: var(--el-text-color-primary) !important;
border-color: #475569;
}
html.dark .gsm8k-page .s-btn-secondary:hover { background: var(--el-border-color); border-color: var(--el-text-color-regular); }
html.dark .gsm8k-hero { background: var(--el-bg-color); }
html.dark .gsm8k-hero::before {
background: radial-gradient(ellipse, rgba(5, 150, 105, 0.15) 0%, transparent 70%);
}
html.dark .gsm8k-page .hero-badge { background: var(--el-bg-color-page); border-color: var(--el-border-color); color: var(--el-text-color-secondary); }
html.dark .gsm8k-hero h1 { color: var(--el-text-color-primary); }
html.dark .gsm8k-hero h1 span { color: #34d399; }
html.dark .gsm8k-page .hero-subtitle { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .hero-highlights .h-item { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .hero-highlights .h-div { background: var(--el-border-color); }
html.dark .gsm8k-stats { background: var(--el-bg-color-page); border-color: var(--el-border-color); }
html.dark .gsm8k-page .stat-val { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .stat-lbl { color: var(--el-text-color-regular); }
html.dark .gsm8k-page .feat-card {
background: var(--el-bg-color-page); border-color: var(--el-border-color);
}
html.dark .gsm8k-page .feat-card:hover { border-color: var(--el-text-color-regular); box-shadow: 0 4px 16px rgba(0,0,0,0.3); }
html.dark .gsm8k-page .feat-card h3 { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .feat-card p { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .uc-card { background: var(--el-bg-color-page); border-color: var(--el-border-color); }
html.dark .gsm8k-page .uc-card:hover { border-color: var(--el-text-color-regular); box-shadow: 0 4px 16px rgba(0,0,0,0.3); }
html.dark .gsm8k-page .uc-card h3 { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .uc-card p { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .stp-num { color: #334155; }
html.dark .gsm8k-page .stp-card h3 { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .stp-card p { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .stp-conn { background: var(--el-border-color); }
html.dark .gsm8k-page code {
background: #022c22 !important; color: #a7f3d0 !important; border-color: #064e3b !important;
}
html.dark .gsm8k-page .s-text-dark { color: var(--el-text-color-primary); }
html.dark .gsm8k-page .s-text-brand { color: #34d399; }
html.dark .gsm8k-page .s-section-body { color: var(--el-text-color-secondary); }
html.dark .gsm8k-page .tag-item { background: var(--el-border-color); border-color: var(--el-text-color-regular); color: var(--el-text-color-secondary); }
html.dark .gsm8k-cta { background: #022c22; }
html.dark .gsm8k-cta::before {
background: radial-gradient(ellipse, rgba(52, 211, 153, 0.2) 0%, transparent 70%);
}
html.dark .gsm8k-page .btn-cta-light { color: #ffffff !important; }
html.dark .gsm8k-page .btn-cta-ghost { color: #94a3b8 !important; }
html.dark .gsm8k-page .btn-cta-ghost:hover { color: var(--el-text-color-primary) !important; }
</style>
<div class="gsm8k-page">
<section class="gsm8k-hero">
<div class="s-container-narrow">
<div class="hero-badge">
<span class="badge-dot"></span>
GSM8K Math Reasoning Dataset
</div>
<h1>
GSM8K Math Reasoning<br/><span>Dataset</span>
</h1>
<p class="hero-subtitle">
GSM8K (Grade School Math 8K) is a high-quality benchmark dataset of grade school math word problems created by OpenAI, expanded to 17.6K problems with detailed step-by-step solutions. It has become a standard benchmark for evaluating the mathematical reasoning abilities of large language models. Each problem takes 2 to 8 steps of basic arithmetic to solve.
Dataset Highlights
The gold standard benchmark dataset for evaluating the mathematical reasoning abilities of large language models
Elementary Math Problems
The problems cover elementary-level arithmetic: addition, subtraction, multiplication, division, fractions, and percentages. They require no advanced math knowledge and test reasoning rather than computational complexity.
Step-by-step Solutions
Each problem comes with a detailed step-by-step solution that spells out the intermediate calculations and the final answer, providing high-quality annotated data for Chain-of-Thought training.
Language Diversity
The problems describe a wide variety of real-life scenarios in natural language, covering themes such as shopping, speed, and age. The phrasing is varied rather than formulaic, which brings the problems closer to real-world questions.
Multi-step Reasoning
Each problem takes 2 to 8 reasoning steps to solve, so a model must reason coherently and keep track of intermediate state. This makes the dataset effective at distinguishing the reasoning levels of different models.
Natural Language Questions
All problems are presented as natural language text, without formulas or symbolic expressions, testing the model's ability to understand mathematical relationships in text and extract the key values.
Standard Benchmark
Widely used to evaluate the mathematical reasoning abilities of mainstream large language models such as GPT-4, Claude, and Gemini, it is one of the most cited mathematical benchmarks in academic papers and technical reports.
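Because the solutions state their intermediate calculations inline, those calculations can be checked mechanically. The following is a minimal sketch, assuming each calculation is written as an arithmetic expression, then `=`, then a number (optionally prefixed with `$`). The name `check_calculations` is illustrative and not part of any GSM8K tooling.

```python
import ast
import operator
import re

# Binary operators allowed in a solution step.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

# "<expression> = <number>", e.g. "48/2 = 24" or "100 - 50 - 30 - 15 = $5".
_CALC = re.compile(r"(\d[\d \t.+\-*/()]*?)\s*=\s*\$?(\d+(?:\.\d+)?)")


def _eval(node):
    """Evaluate a parsed expression made only of numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def check_calculations(solution):
    """Return (expression, stated_value, is_correct) for each calculation found."""
    results = []
    for expr, stated in _CALC.findall(solution):
        expr = expr.strip()
        try:
            value = _eval(ast.parse(expr, mode="eval"))
        except (SyntaxError, ValueError, ZeroDivisionError):
            continue  # not a plain arithmetic expression; skip it
        results.append((expr, float(stated), abs(value - float(stated)) < 1e-9))
    return results


sample = (
    "Natalia sold 48/2 = 24 clips in May.\n"
    "Natalia sold 48 + 24 = 72 clips altogether in April and May.\n"
    "#### 72"
)
for expr, stated, ok in check_calculations(sample):
    print(expr, "=", stated, "OK" if ok else "MISMATCH")
```

The expression is parsed with `ast` rather than passed to `eval`, so only numbers and the four basic operators are ever evaluated.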
Applicable Scenarios
Supports mathematical reasoning work from model evaluation to educational AI development
Mathematical Reasoning Assessment
Evaluating the multi-step mathematical reasoning abilities of large language models, quantifying the model's accuracy and reasoning quality on basic math problems
Chain-of-Thought Training
Using step-by-step solution data to train Chain-of-Thought reasoning, enhancing the model's ability to decompose complex problems step by step
LLM Benchmark Testing
As a standardized benchmark to compare the mathematical reasoning performance of different models, tracking the trend of capability changes between model iterations
Educational AI Development
Building intelligent math tutoring systems and automatic problem-solving tools, providing training and evaluation data for AI applications in the K-12 education sector
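Accuracy on this kind of benchmark is typically computed as an exact match on the final numeric answer. The sketch below shows one way to do that, assuming the model's final answer has already been extracted as a string. The function names and the normalization rules (dropping `,`, `$`, and a trailing `%`) are assumptions for illustration, not an official scoring script.

```python
from decimal import Decimal, InvalidOperation


def normalize_number(text):
    """Parse a numeric answer such as '$1,250.00' into a Decimal, or None."""
    cleaned = text.strip().replace(",", "").replace("$", "").rstrip("%")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose numeric value equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p, r = normalize_number(pred), normalize_number(ref)
        if p is not None and p == r:
            correct += 1
    return correct / len(references)


# Hypothetical model outputs compared against gold final answers.
print(exact_match_accuracy(["72", "$5.00", "ten"], ["72", "5", "10"]))
```

Comparing `Decimal` values rather than raw strings means `5`, `5.0`, and `$5.00` all count as the same answer, while non-numeric output counts as wrong.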
Data Preview
Below are typical problems with their step-by-step solutions from the GSM8K dataset
{
"question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
"answer": "Natalia sold 48/2 = 24 clips in May.\nNatalia sold 48 + 24 = 72 clips altogether in April and May.\n#### 72"
}
{
"question": "Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?",
"answer": "In the beginning, Betty has only 100 / 2 = $50.\nBetty's grandparents gave her 15 * 2 = $30.\nThis means, Betty needs 100 - 50 - 30 - 15 = $5 more.\n#### 5"
}
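In the records above, the reasoning steps are separated by newlines and the final answer follows a `####` marker on the last line. A minimal sketch for splitting an `answer` field into its steps and final answer follows. The helper name `split_answer` is hypothetical.

```python
record = {
    "question": (
        "Natalia sold clips to 48 of her friends in April, and then she sold "
        "half as many clips in May. How many clips did Natalia sell altogether "
        "in April and May?"
    ),
    "answer": (
        "Natalia sold 48/2 = 24 clips in May.\n"
        "Natalia sold 48 + 24 = 72 clips altogether in April and May.\n"
        "#### 72"
    ),
}


def split_answer(answer):
    """Split an answer field into (reasoning steps, final answer string)."""
    reasoning, marker, final = answer.rpartition("####")
    if not marker:
        raise ValueError("answer has no '####' final-answer marker")
    steps = [line.strip() for line in reasoning.splitlines() if line.strip()]
    return steps, final.strip()


steps, final = split_answer(record["answer"])
print(len(steps), final)  # prints: 2 72
```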
3 Steps to Get Started Quickly
From browsing to usage, you can start your mathematical reasoning research in just a few minutes
Browse the Dataset
View the details of the GSM8K dataset on the Ace Data Cloud platform to understand the question format, answer structure, and license terms.
Obtain the Data
Access the complete set of 17.6K math problems and step-by-step solutions via API, in a standardized, ready-to-use data format.
Evaluation and Training
Use the dataset to evaluate the model's mathematical reasoning capabilities, or as a training data source for Chain-of-Thought fine-tuning.
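One common way to use the step-by-step solutions is as few-shot exemplars in a Chain-of-Thought prompt. The sketch below assumes records with `question` and `answer` fields and a `####` marker before the final answer. The `Question:` / `Answer:` template and the function name `build_cot_prompt` are illustrative choices, not a prescribed format.

```python
def build_cot_prompt(exemplars, question):
    """Format solved records as few-shot examples, then append a new question."""
    blocks = []
    for ex in exemplars:
        # Turn the '#### N' marker line into a plain sentence so every
        # exemplar ends with the same "The answer is N." pattern.
        reasoning, _, final = ex["answer"].rpartition("####")
        solution = " ".join(
            line.strip() for line in reasoning.splitlines() if line.strip()
        )
        blocks.append(
            f"Question: {ex['question']}\n"
            f"Answer: {solution} The answer is {final.strip()}."
        )
    # Leave the last answer open for the model to complete.
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


exemplars = [
    {
        "question": "Tom has 3 apples and buys 4 more. How many apples does he have?",
        "answer": "Tom has 3 + 4 = 7 apples.\n#### 7",
    }
]
print(build_cot_prompt(exemplars, "A pen costs $2. How much do 6 pens cost?"))
```

The exemplar in the usage example is made up for illustration. In practice the exemplars would be drawn from the training portion of the data so that evaluation problems are never shown in the prompt.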
Start Exploring the GSM8K Mathematical Reasoning Data
A standard mathematical reasoning benchmark produced by OpenAI and available under the MIT open-source license. Whether you are evaluating large language models or training reasoning models, GSM8K is an indispensable dataset. Get it now.
