datasets_for_fine_tuning_llm
iamtarun/python_code_instructions_18k_alpaca • Viewer • Updated Jul 27, 2023 • 18.6k • 19.1k • 346
HuggingFaceTB/smollm-corpus • Viewer • Updated Sep 6, 2024 • 237M • 32.2k • 467
databricks/databricks-dolly-15k • Viewer • Updated Jun 30, 2023 • 15k • 36.3k • 986
ise-uiuc/Magicoder-Evol-Instruct-110K • Viewer • Updated Dec 28, 2023 • 111k • 22.8k • 180
inference_provided_llm_only
medicalai/ClinicalBERT • Fill-Mask • Updated Apr 14, 2025 • 24.7k • 375
google/embeddinggemma-300m • Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.6M • 1.75k
BAAI/bge-small-en-v1.5 • Feature Extraction • 33.4M • Updated Feb 22, 2024 • 61.8M • 497
redis/langcache-embed-v1 • Sentence Similarity • 0.1B • Updated Dec 8, 2025 • 279k • 15