datasets
updated
argilla/distilabel-intel-orca-dpo-pairs
Viewer
• 12.9k • 3.08k
• 181
Viewer
• 66.4k • 1.23k
• 229
argilla/ultrafeedback-binarized-preferences-cleaned
Viewer
• 60.9k • 3.03k
• 160
Viewer
• 15.3k • 64
• 19
theblackcat102/evol-codealpaca-v1
Viewer
• 111k • 3.68k
• 176
Viewer
• 395k • 18.8k
• 447
glaiveai/glaive-code-assistant-v2
Viewer
• 215k • 630
• 49
Viewer
• 12.9k • 1.49k
• 319
Viewer
• 183k • 1.14k
• 295
garage-bAInd/Open-Platypus
Viewer
• 24.9k • 5.38k
• 415
LLM360/CrystalCoderDatasets
• 1.11k
• 21
protectai/deberta-v3-base-prompt-injection
Text Classification
• 0.2B
• 32.5k
• 98
nampdn-ai/tiny-orca-textbooks
Viewer
• 147k • 44
• 43
code-search-net/code_search_net
Viewer
• 4.14M • 11.1k
• 320
WhiteRabbitNeo/WRN-Chapter-1
Viewer
• 7.75k • 49
• 52
WhiteRabbitNeo/WRN-Chapter-2
Viewer
• 11.1k • 32
• 21
Text Generation
• 167
• 205
Viewer
• 31.1M • 11.5k
• 674
Viewer
• 3.54k • 89
• 55
NousResearch/json-mode-eval
Viewer
• 100 • 504
• 41
Viewer
• 2.75M • 4.41k
• 380
Viewer
• 518k • 5
• 1
laurentiubp/openhermes-scored
Viewer
• 185k • 5
• 1
Towards Best Practices for Open Datasets for LLM Training
Paper
• 2501.08365
• 62