<style>
.pixelprose-page * { box-sizing: border-box; }
.pixelprose-page h1, .pixelprose-page h2, .pixelprose-page h3, .pixelprose-page h4, .pixelprose-page h5, .pixelprose-page h6, .pixelprose-page p, .pixelprose-page ul, .pixelprose-page ol, .pixelprose-page li, .pixelprose-page pre, .pixelprose-page blockquote, .pixelprose-page table, .pixelprose-page td, .pixelprose-page th { margin: 0; padding: 0; }
.pixelprose-page {
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
color: var(--el-text-color-primary);
background: var(--el-bg-color);
line-height: 1.6;
}
.pixelprose-page a { text-decoration: none; color: inherit; }
.pixelprose-page a:hover { text-decoration: none; }
.pixelprose-page ul { list-style: none; }
.markdown-body .pixelprose-page a { color: inherit !important; text-decoration: none !important; }
.markdown-body .pixelprose-page a:hover { text-decoration: none !important; }
.markdown-body .pixelprose-page a.s-btn-primary,
.markdown-body .pixelprose-page a.btn-cta-light { color: #ffffff !important; }
.markdown-body .pixelprose-page a.s-btn-secondary { color: var(--el-text-color-primary) !important; }
.markdown-body .pixelprose-page a.btn-cta-ghost { color: #94a3b8 !important; }
.markdown-body .pixelprose-page a.btn-cta-ghost:hover { color: #e2e8f0 !important; }
.markdown-body .pixelprose-page h1, .markdown-body .pixelprose-page h2 { border-bottom: none !important; padding-bottom: 0 !important; }
.pixelprose-page .s-container { max-width: 1200px; margin: 0 auto; padding: 0 24px; }
.pixelprose-page .s-container-narrow { max-width: 800px; margin: 0 auto; padding: 0 24px; }
.pixelprose-page .s-container-wide { max-width: 1100px; margin: 0 auto; padding: 0 32px; }
.pixelprose-page .s-section { padding: 80px 0; }
.pixelprose-page .s-section-lg { padding: 100px 0; }
.pixelprose-page .s-section-sm { padding: 48px 0; }
.pixelprose-page .s-bg-white { background: var(--el-bg-color); }
.pixelprose-page .s-bg-gray { background: var(--el-bg-color-page); }
.pixelprose-page .s-bg-dark { background: #0f172a; color: #f8fafc; }
.pixelprose-page .s-header { text-align: center; margin-bottom: 64px; }
.pixelprose-page .s-header h2 {
font-size: clamp(28px, 4vw, 40px);
font-weight: 700;
color: var(--el-text-color-primary);
letter-spacing: normal;
margin-bottom: 20px;
line-height: 1.15;
}
.pixelprose-page .s-header p {
font-size: clamp(16px, 2vw, 18px);
color: var(--el-text-color-regular);
max-width: 640px;
margin: 0 auto;
line-height: 1.6;
}
.pixelprose-page .s-bg-dark .s-header h2 { color: #f8fafc; }
.pixelprose-page .s-bg-dark .s-header p { color: var(--el-text-color-secondary); }
.pixelprose-page .s-btn-primary {
display: inline-flex; align-items: center; gap: 6px;
padding: 14px 28px;
background: #f59e0b; color: #ffffff !important;
border-radius: 9999px; font-size: 15px; font-weight: 600;
transition: background 0.2s, transform 0.15s;
border: none; cursor: pointer;
text-decoration: none !important;
}
.pixelprose-page .s-btn-primary:hover { background: #d97706; transform: translateY(-1px); text-decoration: none !important; }
.pixelprose-page .s-btn-secondary {
display: inline-flex; align-items: center; gap: 6px;
padding: 14px 28px;
background: var(--el-bg-color); color: var(--el-text-color-primary) !important;
border: 1px solid var(--el-border-color-light);
border-radius: 9999px; font-size: 15px; font-weight: 600;
transition: border-color 0.2s, background 0.2s;
cursor: pointer;
text-decoration: none !important;
}
.pixelprose-page .s-btn-secondary:hover { background: var(--el-bg-color-page); text-decoration: none !important; }
.pixelprose-hero {
padding: 100px 0 80px;
text-align: center;
background: var(--el-bg-color);
position: relative;
overflow: hidden;
}
.pixelprose-hero::before {
content: '';
position: absolute;
top: -200px; left: 50%;
transform: translateX(-50%);
width: 900px; height: 500px;
background: radial-gradient(ellipse, rgba(245, 158, 11, 0.06) 0%, transparent 70%);
pointer-events: none;
}
.pixelprose-page .hero-badge {
display: inline-flex; align-items: center; gap: 8px;
padding: 6px 16px;
background: var(--el-bg-color-page); border: 1px solid var(--el-border-color-light);
border-radius: 9999px; font-size: 13px; font-weight: 600; color: var(--el-text-color-regular);
margin-bottom: 28px;
}
.pixelprose-page .hero-badge .badge-dot {
width: 6px; height: 6px; background: #10b981; border-radius: 50%;
display: inline-block;
}
.pixelprose-hero h1 {
font-size: clamp(36px, 5vw, 60px);
font-weight: 700; line-height: 1.05;
letter-spacing: normal; color: var(--el-text-color-primary);
margin-bottom: 20px;
position: relative;
}
.pixelprose-hero h1 span { color: #f59e0b; }
.pixelprose-page .hero-subtitle {
font-size: clamp(16px, 2vw, 20px);
color: var(--el-text-color-regular); line-height: 1.6;
max-width: 620px; margin: 0 auto 56px;
position: relative;
}
.pixelprose-page .hero-actions {
display: flex; gap: 12px; justify-content: center;
flex-wrap: wrap; margin-bottom: 56px; position: relative;
}
.pixelprose-page .hero-highlights {
display: flex; align-items: center; justify-content: center;
gap: 16px; flex-wrap: wrap; position: relative;
}
.pixelprose-page .hero-highlights .h-item { font-size: 14px; color: var(--el-text-color-regular); font-weight: 500; }
.pixelprose-page .hero-highlights .h-div { width: 1px; height: 16px; background: var(--el-border-color-light); }
@media (max-width: 640px) 

{ .pixelprose-page .hero-highlights .h-div { display: none; } .pixelprose-page .hero-highlights { gap: 8px 16px; } .pixelprose-page .hero-actions { flex-direction: column; align-items: center; } .pixelprose-page .hero-actions a { width: 100%; max-width: 280px; justify-content: center; } } .pixelprose-page .hero-cover { max-width: 720px; margin: 48px auto 0; border-radius: 16px; overflow: hidden; box-shadow: 0 8px 32px rgba(0,0,0,0.10); } .pixelprose-page .hero-cover img { width: 100%; height: auto; display: block; } .pixelprose-stats { padding: 48px 0; background: var(--el-bg-color-page); border-top: 1px solid var(--el-border-color-lighter); border-bottom: 1px solid var(--el-border-color-lighter); } .pixelprose-page .stats-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 32px; text-align: center; } .pixelprose-page .stat-icon { font-size: 28px; margin-bottom: 12px; } .pixelprose-page .stat-val { font-size: clamp(28px, 4vw, 40px); font-weight: 700; color: var(--el-text-color-primary); letter-spacing: normal; margin-bottom: 4px; } .pixelprose-page .stat-lbl { font-size: 14px; color: var(--el-text-color-secondary); font-weight: 500; } @media (max-width: 768px) { .pixelprose-page .stats-grid { grid-template-columns: repeat(2, 1fr); gap: 24px; } } @media (max-width: 480px) { .pixelprose-page .stats-grid { grid-template-columns: 1fr; gap: 20px; } } .pixelprose-page .features-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 24px; } .pixelprose-page .feat-card { padding: 32px 28px; border: none; border-radius: 20px; box-shadow: 0 2px 12px 0 rgba(0,0,0,0.08); background: var(--el-bg-color); transition: border-color 0.2s, box-shadow 0.2s, transform 0.15s; } .pixelprose-page .feat-card:hover { box-shadow: 0 8px 24px 0 rgba(0,0,0,0.12); transform: translateY(-2px); } .pixelprose-page .feat-icon { font-size: 32px; margin-bottom: 16px; } .pixelprose-page .feat-card h3 { font-size: 18px; font-weight: 700; color: var(--el-text-color-primary); 
margin-bottom: 8px; } .pixelprose-page .feat-card p { font-size: 15px; color: var(--el-text-color-regular); line-height: 1.6; } @media (max-width: 1024px) { .pixelprose-page .features-grid { grid-template-columns: repeat(2, 1fr); } } @media (max-width: 640px) { .pixelprose-page .features-grid { grid-template-columns: 1fr; } } .pixelprose-page .usecases-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 20px; } .pixelprose-page .uc-card { padding: 28px 24px; background: var(--el-bg-color); border: none; border-radius: 20px; box-shadow: 0 2px 12px 0 rgba(0,0,0,0.08); text-align: center; transition: border-color 0.2s, box-shadow 0.2s, transform 0.15s; } .pixelprose-page .uc-card:hover { box-shadow: 0 8px 24px 0 rgba(0,0,0,0.12); transform: translateY(-2px); } .pixelprose-page .uc-icon { font-size: 36px; margin-bottom: 16px; } .pixelprose-page .uc-card h3 { font-size: 17px; font-weight: 700; color: var(--el-text-color-primary); margin-bottom: 8px; } .pixelprose-page .uc-card p { font-size: 14px; color: var(--el-text-color-regular); line-height: 1.6; } @media (max-width: 1024px) { .pixelprose-page .usecases-grid { grid-template-columns: repeat(2, 1fr); } } @media (max-width: 480px) { .pixelprose-page .usecases-grid { grid-template-columns: 1fr; } } .pixelprose-page .code-wrap { border-radius: 16px !important; overflow: hidden !important; border: 1px solid #334155 !important; background: #0f172a !important; max-width: 860px; margin: 0 auto; } .markdown-body .pixelprose-page .code-wrap { border-radius: 16px !important; overflow: hidden !important; border: 1px solid #334155 !important; background: #0f172a !important; } .pixelprose-page .code-bar { display: flex !important; align-items: center !important; justify-content: space-between !important; padding: 12px 20px !important; background: #1e293b !important; border-bottom: 1px solid #334155 !important; } .pixelprose-page .code-dots { display: flex; gap: 6px; } .pixelprose-page .code-dots i { width: 10px; 
height: 10px; border-radius: 50%; display: inline-block; } .pixelprose-page .code-dots .r { background: #ef4444; } .pixelprose-page .code-dots .y { background: #f59e0b; } .pixelprose-page .code-dots .g { background: #10b981; } .pixelprose-page .code-lang { font-size: 12px; color: var(--el-text-color-secondary); font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; } .pixelprose-page .code-block { padding: 24px !important; margin: 0 !important; overflow-x: auto !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; font-size: 13.5px !important; line-height: 1.7 !important; color: #e2e8f0 !important; white-space: pre !important; background: transparent !important; border: none !important; border-radius: 0 !important; } .markdown-body .pixelprose-page .code-block { padding: 24px !important; margin: 0 !important; overflow-x: auto !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; font-size: 13.5px !important; line-height: 1.7 !important; color: #e2e8f0 !important; white-space: pre !important; background: transparent !important; border: none !important; border-radius: 0 !important; } .pixelprose-page .steps-row { display: flex; align-items: flex-start; justify-content: center; margin-bottom: 48px; } .pixelprose-page .stp-card { flex: 1; max-width: 320px; text-align: center; padding: 0 24px; } .pixelprose-page .stp-num { font-size: clamp(48px, 6vw, 72px); font-weight: 700; color: #e2e8f0; letter-spacing: -0.04em; line-height: 1; margin-bottom: 20px; } .pixelprose-page .stp-card h3 { font-size: 18px; font-weight: 700; color: var(--el-text-color-primary); margin-bottom: 10px; } .pixelprose-page .stp-card p { font-size: 15px; color: var(--el-text-color-regular); line-height: 1.6; } .pixelprose-page .stp-conn { width: 60px; height: 2px; background: var(--el-border-color-light); margin-top: 36px; flex-shrink: 0; } .pixelprose-page .steps-cta { text-align: center; } @media (max-width: 768px) 
{ .pixelprose-page .steps-row { flex-direction: column; align-items: center; gap: 32px; } .pixelprose-page .stp-conn { width: 2px; height: 32px; margin: 0; } .pixelprose-page .stp-card { max-width: 100%; } } .pixelprose-cta { padding: 100px 0; background: #0f172a; text-align: center; position: relative; overflow: hidden; } .pixelprose-cta::before { content: ''; position: absolute; top: -100px; left: 50%; transform: translateX(-50%); width: 700px; height: 400px; background: radial-gradient(ellipse, rgba(245, 158, 11, 0.12) 0%, transparent 70%); pointer-events: none; } .pixelprose-cta h2 { font-size: clamp(28px, 4vw, 44px); font-weight: 700; color: #f8fafc; letter-spacing: normal; margin-bottom: 28px; position: relative; } .pixelprose-cta > div > p { font-size: clamp(16px, 2vw, 18px); color: var(--el-text-color-secondary); max-width: 520px; margin: 0 auto 56px; line-height: 1.6; position: relative; } .pixelprose-page .cta-actions { display: flex; gap: 12px; justify-content: center; flex-wrap: wrap; position: relative; } .pixelprose-page .btn-cta-light { display: inline-flex; align-items: center; gap: 6px; padding: 14px 32px; background: #f59e0b; color: #ffffff !important; border-radius: 9999px; font-size: 15px; font-weight: 700; transition: background 0.2s, transform 0.15s; text-decoration: none !important; } .pixelprose-page .btn-cta-light:hover { background: #d97706; transform: translateY(-1px); text-decoration: none !important; } .pixelprose-page .btn-cta-ghost { display: inline-flex; align-items: center; padding: 14px 32px; background: transparent; color: #94a3b8 !important; border: 1px solid #334155; border-radius: 9999px; font-size: 15px; font-weight: 600; transition: border-color 0.2s, color 0.2s; text-decoration: none !important; } .pixelprose-page .btn-cta-ghost:hover { border-color: var(--el-text-color-regular); color: #e2e8f0 !important; text-decoration: none !important; } .pixelprose-page code { background: #fef3c7 !important; padding: 2px 8px !important; 
border-radius: 5px !important; font-size: 13px !important; font-family: 'JetBrains Mono', 'Fira Code', 'SF Mono', monospace !important; color: #d97706 !important; border: 1px solid #fde68a !important; } .pixelprose-page .s-text-dark { color: var(--el-text-color-primary); } .pixelprose-page .s-text-brand { color: #f59e0b; } .pixelprose-page .s-section-body { font-size: 16px; color: var(--el-text-color-regular); line-height: 1.8; text-align: center; max-width: 680px; margin: 0 auto; } .pixelprose-page .s-section-body p + p { margin-top: 16px; } .pixelprose-page .tag-row { display: flex; gap: 8px; flex-wrap: wrap; justify-content: center; margin-top: 16px; } .pixelprose-page .tag-item

{
padding: 4px 12px; background: var(--el-bg-color-page);
border: 1px solid var(--el-border-color-light); border-radius: 9999px;
font-size: 12px; font-weight: 600; color: var(--el-text-color-regular);
}
html.dark .pixelprose-page { background: var(--el-bg-color); color: var(--el-text-color-primary); }
html.dark .pixelprose-page a { color: inherit; }
html.dark .markdown-body .pixelprose-page a { color: inherit !important; }
html.dark .markdown-body .pixelprose-page a.s-btn-primary,
html.dark .markdown-body .pixelprose-page a.btn-cta-light { color: #ffffff !important; }
html.dark .markdown-body .pixelprose-page a.s-btn-secondary { color: var(--el-text-color-primary) !important; }
html.dark .markdown-body .pixelprose-page a.btn-cta-ghost { color: #94a3b8 !important; }
html.dark .markdown-body .pixelprose-page a.btn-cta-ghost:hover { color: var(--el-text-color-primary) !important; }
html.dark .pixelprose-page .s-bg-white { background: var(--el-bg-color); }
html.dark .pixelprose-page .s-bg-gray { background: var(--el-bg-color-page); }
html.dark .pixelprose-page .s-bg-dark { background: var(--el-bg-color); }
html.dark .pixelprose-page .s-header h2 { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .s-header p { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .s-btn-primary { background: #f59e0b; color: #ffffff !important; }
html.dark .pixelprose-page .s-btn-primary:hover { background: #d97706; }
html.dark .pixelprose-page .s-btn-secondary {
background: #1e293b; color: var(--el-text-color-primary) !important;
border-color: #475569;
}
html.dark .pixelprose-page .s-btn-secondary:hover { background: var(--el-border-color); border-color: var(--el-text-color-regular); }
html.dark .pixelprose-hero { background: var(--el-bg-color); }
html.dark .pixelprose-hero::before {
background: radial-gradient(ellipse, rgba(245, 158, 11, 0.15) 0%, transparent 70%);
}
html.dark .pixelprose-page .hero-badge { background: var(--el-bg-color-page); border-color: var(--el-border-color); color: var(--el-text-color-secondary); }
html.dark .pixelprose-hero h1 { color: var(--el-text-color-primary); }
html.dark .pixelprose-hero h1 span { color: #fbbf24; }
html.dark .pixelprose-page .hero-subtitle { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .hero-highlights .h-item { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .hero-highlights .h-div { background: var(--el-border-color); }
html.dark .pixelprose-stats { background: var(--el-bg-color-page); border-color: var(--el-border-color); }
html.dark .pixelprose-page .stat-val { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .stat-lbl { color: var(--el-text-color-regular); }
html.dark .pixelprose-page .feat-card {
background: var(--el-bg-color-page); border-color: var(--el-border-color);
}
html.dark .pixelprose-page .feat-card:hover { border-color: var(--el-text-color-regular); box-shadow: 0 4px 16px rgba(0,0,0,0.3); }
html.dark .pixelprose-page .feat-card h3 { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .feat-card p { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .uc-card { background: var(--el-bg-color-page); border-color: var(--el-border-color); }
html.dark .pixelprose-page .uc-card:hover { border-color: var(--el-text-color-regular); box-shadow: 0 4px 16px rgba(0,0,0,0.3); }
html.dark .pixelprose-page .uc-card h3 { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .uc-card p { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .stp-num { color: #334155; }
html.dark .pixelprose-page .stp-card h3 { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .stp-card p { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .stp-conn { background: var(--el-border-color); }
html.dark .pixelprose-page code {
background: #78350f !important; color: #fcd34d !important; border-color: #f59e0b !important;
}
html.dark .pixelprose-page .s-text-dark { color: var(--el-text-color-primary); }
html.dark .pixelprose-page .s-text-brand { color: #fbbf24; }
html.dark .pixelprose-page .s-section-body { color: var(--el-text-color-secondary); }
html.dark .pixelprose-page .tag-item { background: var(--el-border-color); border-color: var(--el-text-color-regular); color: var(--el-text-color-secondary); }
html.dark .pixelprose-cta { background: #020617; }
html.dark .pixelprose-cta::before {
background: radial-gradient(ellipse, rgba(245, 158, 11, 0.2) 0%, transparent 70%);
}
html.dark .pixelprose-page .btn-cta-light { color: #ffffff !important; }
html.dark .pixelprose-page .btn-cta-ghost { color: #94a3b8 !important; }
html.dark .pixelprose-page .btn-cta-ghost:hover { color: var(--el-text-color-primary) !important; }
</style>
<div class="pixelprose-page">
<section class="pixelprose-hero">
<div class="s-container-narrow">
<div class="hero-badge">
<span class="badge-dot"></span>
PixelProse Dataset
</div>
<h1>
PixelProse<br/><span>Dataset</span>
</h1>
<p class="hero-subtitle">
PixelProse is a large-scale dense image captioning dataset with detailed text descriptions for more than 16 million images. It is suited to training vision-language models and to image understanding research.

16 million+ images | Dense descriptions | Open license | Multi-source data
🖼️
16M+
Total images
📝
Dense
Dense description annotations
🌐
Multi
Multi-source image data
📜
Open
Open license agreement

Dataset Highlights

A large-scale dense image description dataset that provides a solid foundation for vision-language model research

📖

Dense Descriptions

Each image comes with a detailed text description of its visual content, covering scenes, objects, attributes, and relationships, and carries far more information than a short caption.

🌍

Multi-source Images

Images are drawn from multiple public datasets and internet sources and span a wide range of visual domains, including natural scenes, people, animals, architecture, and art.

📊

Ultra-large Scale

With over 16 million images and their dense descriptions, it is one of the largest open dense image description datasets and is sized for large-scale model training.

🎯

High-quality Annotations

The descriptions are carefully generated and quality-checked for semantic accuracy and completeness, providing a reliable supervision signal for model training.

🔗

Vision-Language Alignment

Each image is paired with its text description, which suits research on vision-language pre-training, image-text alignment, and cross-modal representation learning.

🏛️

Open Research Use

The dataset is released under an open license that supports academic research and non-commercial use, furthering open science in vision-language understanding.

Applicable Scenarios

From model pre-training to downstream tasks, covering the full vision-language research pipeline

🤖

Vision-Language Model Training

Serve as pre-training data for large-scale vision-language models (VLMs), improving image understanding and description generation

💬

Image Description Generation

Train and evaluate image captioning models that generate accurate, detailed image descriptions

❓

Visual Question Answering

Use the rich image descriptions to build visual question answering (VQA) training data and strengthen a model's reasoning about visual content

🔍

Image Retrieval

Build a text-image retrieval system on dense descriptions for precise cross-modal retrieval, both text-to-image and image-to-text

Image Description | Visual Language Model | VLM | Multimodal | Computer Vision

Data Preview

The following is an example entry from the dataset. Each record contains an image URL and its corresponding dense description text.

JSON
{
"image_url": "https://example.com/images/000001.jpg",
"caption": "A golden retriever sits on a wooden dock by a calm lake at sunset. The dog's fur is illuminated by warm orange light, and its tongue hangs out happily. Behind the dog, the lake reflects the pink and purple hues of the sky. Tall pine trees line the far shore, their silhouettes dark against the colorful horizon. A small red canoe is tied to the dock on the left side of the frame. The wooden planks of the dock show signs of weathering, with some moss growing between the cracks.",
"source": "flickr",
"image_width": 1920,
"image_height": 1280
}
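
A minimal sketch of how a record shaped like the example entry could be parsed and sanity-checked in Python. The `CaptionRecord` class and the `parse_record` helper are hypothetical names used only for illustration, and the field names are taken from the sample entry, so they may differ in the released files.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionRecord:
    """One image/caption pair, mirroring the fields of the sample entry."""
    image_url: str
    caption: str
    source: str
    image_width: int
    image_height: int

    @property
    def aspect_ratio(self) -> float:
        """Width divided by height."""
        return self.image_width / self.image_height


def parse_record(raw: str) -> CaptionRecord:
    """Parse one JSON object; raise on missing fields or invalid values."""
    data = json.loads(raw)
    record = CaptionRecord(
        image_url=data["image_url"],
        caption=data["caption"].strip(),
        source=data["source"],
        image_width=int(data["image_width"]),
        image_height=int(data["image_height"]),
    )
    if not record.caption:
        raise ValueError("empty caption")
    if record.image_width <= 0 or record.image_height <= 0:
        raise ValueError("non-positive image dimensions")
    return record


# Shortened copy of the sample entry (caption truncated to its first sentence).
sample = json.dumps({
    "image_url": "https://example.com/images/000001.jpg",
    "caption": "A golden retriever sits on a wooden dock by a calm lake at sunset.",
    "source": "flickr",
    "image_width": 1920,
    "image_height": 1280,
})
record = parse_record(sample)
print(record.source, record.aspect_ratio)  # flickr 1.5
```

Note that `json.loads` rejects a raw line break inside a string value, so a multi-sentence caption has to stay on one line or use the `\n` escape.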

3 Steps to Get Started Quickly

From browsing to loading, you can start your vision-language research project in just a few minutes.

01

Browse the Dataset

View the dataset details on the Ace Data Cloud platform to review the data scale, field descriptions, and license terms.

02

Download the Data

Get the dataset files through the platform's download options, which support downloading selected shards on demand or the complete dataset.

03

Load and Use

Use datasets.load_dataset("pixelprose") to load the data, then begin training and studying vision-language models.
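
A hedged sketch of the loading step. The identifier `"pixelprose"` is copied from the step above and may need to be replaced with the repository name or local path that the platform actually provides. The `split="train"` value is an assumption, while `streaming=True` is an existing option of the Hugging Face `datasets` library that iterates over records without downloading everything first. The `iter_large_images` helper and its `min_side` threshold are hypothetical, and the field names follow the sample entry in the data preview.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_large_images(
    records: Iterable[Dict[str, Any]], min_side: int = 512
) -> Iterator[Dict[str, Any]]:
    """Yield records whose shorter image side is at least min_side pixels."""
    for rec in records:
        if min(rec["image_width"], rec["image_height"]) >= min_side:
            yield rec


def load_pixelprose_stream(name: str = "pixelprose"):
    """Open the dataset in streaming mode (needs the `datasets` package)."""
    # Imported lazily so the filter above can be used without the dependency.
    from datasets import load_dataset

    # split="train" is an assumption about how the data is organised.
    return load_dataset(name, split="train", streaming=True)


# Offline demonstration on in-memory records shaped like the sample entry.
demo = [
    {"image_url": "https://example.com/images/000001.jpg",
     "caption": "placeholder caption", "image_width": 1920, "image_height": 1280},
    {"image_url": "https://example.com/images/000002.jpg",
     "caption": "placeholder caption", "image_width": 320, "image_height": 240},
]
kept = list(iter_large_images(demo))
print(len(kept))  # 1
```

With the `datasets` package installed and a valid identifier, `iter_large_images(load_pixelprose_stream())` would apply the same filter to the streamed records.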

Start Exploring the PixelProse Dataset

More than 16 million images with dense descriptions, released under an open license and available now. Whether you are a multimodal researcher or a vision-language model developer, this dataset is an ideal choice.