Daniel Rosehill PRO

danielrosehill

https://www.danielrosehill.com

danielsrosehill
danielrosehill
danielrosehill
@danielrosehill.bsky.social

AI & ML interests

Speech to text (STT), voice workflows, MCP, AI agents, orchestration, automation.

Recent Activity

updated a collection about 1 month ago

Israel Open Data

updated a dataset about 1 month ago

danielrosehill/Israel-Open-Data-Catalogue

published a dataset about 1 month ago

danielrosehill/Israel-Open-Data-Catalogue

Organizations

danielrosehill 's collections 164

Israel Open Data

danielrosehill/Israel-Open-Data-Catalogue

Updated May 6 • 38
danielrosehill/Jerusalem-Air-Quality-Shabbat

Viewer • Updated Apr 28 • 7.95M • 252

Disfluency

4i-ai/BERT_disfluency_cls

Text Classification • 0.1B • Updated Aug 25, 2023 • 514 • 1
amaai-lab/DisfluencySpeech

Viewer • Updated Jun 27, 2024 • 5k • 412 • 21
adjaysagar/english-DisfluencySpeech

Viewer • Updated Feb 7 • 4.5k • 4
arielcerdap/disfluency-fluencybank

Viewer • Updated Mar 17 • 17.8k • 63

Hebrew Punctuation Restoration

verbit/hebrew_punctuation

Updated Oct 6, 2024 • 12 • 1

English Hebrew Translation

ashercn97/english-hebrew-translation

Translation • 77M • Updated Nov 12, 2023 • 24 • 2

Hebrew Diacritic Restoration Models

baravninaor/punctuation-restoration-deberta-alepgbert-hebrew

Updated Mar 22, 2023 • 1

Hebrew-TTS

Yzamari/f5tts-hebrew-v2

Text-to-Speech • 0.3B • Updated Mar 28 • 62 • 1
notmax123/Zonos-Hebrew

Text-to-Speech • Updated Sep 11, 2025 • 13.9k • 3

Utilities

Running

Featured

1.05k

Can You Run It? LLM version

🚀

1.05k

Check if your GPU can run a chosen LLM model
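
As a rough illustration of the kind of check such a tool performs, the sketch below estimates the memory needed for model weights alone and compares it with available VRAM. It does not reproduce how the listed Space works: the function names, the 1 GB = 1e9 bytes convention, and the `overhead_factor` of 1.2 (a stand-in for activations and KV cache) are all assumptions made for this example.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_param: int = 16) -> float:
    """Memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits_on_gpu(
    params_billions: float,
    vram_gb: float,
    bits_per_param: int = 16,
    overhead_factor: float = 1.2,  # assumed headroom, not a measured value
) -> bool:
    """Heuristic: weights plus assumed overhead must fit in VRAM."""
    needed = estimate_weight_memory_gb(params_billions, bits_per_param) * overhead_factor
    return needed <= vram_gb


if __name__ == "__main__":
    # An 8B-parameter model at 16-bit precision versus 4-bit quantization
    print(estimate_weight_memory_gb(8, 16))
    print(fits_on_gpu(8, vram_gb=16, bits_per_param=16))
    print(fits_on_gpu(8, vram_gb=8, bits_per_param=4))
```

Real requirements also depend on context length, batch size, and the inference runtime, so a result from this heuristic is only a first approximation.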

MWP-TTS-Candidates

ResembleAI/chatterbox-turbo

Text-to-Speech • Updated Dec 15, 2025 • 652

ASR-To-Try

zai-org/GLM-ASR-Nano-2512

Automatic Speech Recognition • 2B • Updated Apr 7 • 118k • 370
nvidia/canary-qwen-2.5b

Automatic Speech Recognition • 3B • Updated Apr 21 • 39.5k • 437
microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 481k • 1.6k
facebook/omniASR-LLM-7B

Automatic Speech Recognition • Updated Nov 28, 2025 • 32

Evaluation Datasets

danielrosehill/Small-STT-Eval-Audio-Dataset

Viewer • Updated Dec 10, 2025 • 92 • 39

Voxtral Originals (Mistral)

The two official variants of Voxtral (audio multimodal model) released by Mistral in July 2025

mistralai/Voxtral-Mini-3B-2507

5B • Updated Jul 28, 2025 • 309k • 655
mistralai/Voxtral-Small-24B-2507

Audio-Text-to-Text • 24B • Updated Dec 20, 2025 • 43.7k • 498

Flux 2 Quants

city96/FLUX.2-dev-gguf

Image-to-Image • 32B • Updated Nov 29, 2025 • 86.7k • 146
gguf-org/flux2-dev-gguf

Image-to-Image • 18B • Updated Jan 1 • 6.55k • 57

My Public Audio Datasets

Open-sourced audio datasets for STT/ASR. All recordings by me (Daniel Rosehill) unless otherwise credited.

danielrosehill/English-Hebrew-Mixed-Sentences

Viewer • Updated Nov 17, 2025 • 516 • 59
danielrosehill/Tech-Sentences-For-ASR-Training

Viewer • Updated Nov 26, 2025 • 205 • 201 • 2
danielrosehill/Sample-Voice-Context-Data

Viewer • Updated Nov 30, 2025 • 159 • 27

TTS Models

Xenova/speecht5_tts

Text-to-Speech • Updated Aug 27, 2025 • 5.9k • 42
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.4M • 6.31k
microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k
neuphonic/neutts-air

Text-to-Speech • 0.7B • Updated Feb 12 • 18.8k • 874

ASR Resources

Useful datasets and models for ASR projects and fine tuning

ggerganov/whisper.cpp

Automatic Speech Recognition • Updated Oct 29, 2024 • 1.45k
openslr/librispeech_asr

Viewer • Updated Jul 25, 2025 • 585k • 103k • 228
speechcolab/gigaspeech

Viewer • Updated Feb 7 • 11.9M • 27k • 164
agentlans/high-quality-english-sentences

Viewer • Updated Oct 1, 2024 • 1.71M • 410 • 37

My-ASR-Finetunes

danielrosehill/Whisper-Hebrish

0.8B • Updated Nov 18, 2025 • 60
Sleeping

Agents

Whisper Hebrish

🎤

Compare fine-tuned vs stock Whisper models

FUTO Models

futo-org/acft-whisper-tiny

Automatic Speech Recognition • 57.7M • Updated Jun 25, 2024 • 3 • 1
futo-org/acft-whisper-small.en

Automatic Speech Recognition • 0.3B • Updated Jun 25, 2024 • 6 • 2
futo-org/acft-whisper-base.en

Automatic Speech Recognition • 99.1M • Updated Jun 25, 2024 • 4 • 2
futo-org/acft-whisper-tiny.en

Automatic Speech Recognition • 57.7M • Updated Jun 25, 2024 • 3 • 1

ASR Benchmarking

Running on CPU Upgrade

Agents

Featured

1.37k

Open ASR Leaderboard

🏆

1.37k

Explore and compare speech recognition model benchmarks
Sleeping

Agents

2

Asr Metrics

👀

2

Analyze ASR accuracy by comparing text files
danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15

API Price Comparisons

Running

Agentic LLM Price Comparisons

🤖

218 tool-calling LLMs analyzed by cost and context
danielrosehill/Open-Router-API-Pricing-Analysis

Viewer • Updated Nov 10, 2025 • 2.38k • 125

My LORAS

danielrosehill/Jerusalem-Images

Text-to-Image • Updated Nov 6, 2025 • 3
danielrosehill/Tel-Aviv-Street-Style

Text-to-Image • Updated Nov 6, 2025 • 3
danielrosehill/Herman-Poppleberry

Text-to-Image • Updated Nov 6, 2025 • 4
danielrosehill/Dinosaur-Sloth

Text-to-Image • Updated Nov 6, 2025 • 3

To Check Out

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated Dec 23, 2025 • 128k • 1.5k
stabilityai/sp4d

Updated Nov 5, 2025 • 25 • 13
Falconsai/text_summarization

Summarization • 60.5M • Updated Feb 17, 2024 • 141k • 293
distil-whisper/distil-large-v3

Automatic Speech Recognition • 0.8B • Updated Apr 21 • 931k • 375

Architecture Related Models

Collection of models gathered together for my wife who is an architect (of buildings!)

Muapi/jj-s-architecture-office-building

Text-to-Image • Updated 29 days ago • 12
prithivMLmods/Canopus-Interior-Architecture-0.1

Text-to-Image • Updated Aug 4, 2024 • 318 • 26

Image to 3D

Running on Zero

MCP

631

TRELLIS

🏢

631

Scalable and Versatile 3D Generation from images

My Ideas

Running

2

Claude Agent Picker Pattern

🎯

2

Pattern for managing multi-agent crews in Claude Code

Whisper Base + variants

openai/whisper-base

Automatic Speech Recognition • 72.6M • Updated Feb 29, 2024 • 4.36M • 271
openai/whisper-base.en

Automatic Speech Recognition • 72.6M • Updated Jan 22, 2024 • 22.4k • 43
onnx-community/whisper-base_timestamped

Automatic Speech Recognition • Updated Mar 5, 2025 • 3.53k • 32
Systran/faster-whisper-base

Automatic Speech Recognition • Updated Nov 23, 2023 • 1.33M • 28

Voice Modality Apps

Runtime error

Agents

2

Voice Generated Visions

🦀

2

Cloud-Native Voice-to-Image Generation using LLMs and GenAI

Worlds (3D, Games)

Running on Zero

Agents

183

HunyuanWorld-Mirror

🌍

183

Universal 3D World Reconstruction with Any Prior Prompting

Demos

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison
Running

Nano Banana Sketch Cleanup

🎨

AI sketch cleanup with before/after comparisons

Background Noise Removal

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison
Running

Agents

171

DeepFilterNet2

💩

171

Denoise your recordings and view spectrograms
Running

Agents

17

DeepFilterNet2 No File Size Limit

😻

17

Use DeepFilterNet2 to denoise audio no file size limit

Vibe Coding

Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Project Indexes

Running

AI Project Index

🏆

Navigable index of AI projects, tools, and agents
Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

AI UIs

Sleeping

Agents

Pen Pal AI

📮

Write letters and get thoughtful AI replies

Voice Enhancement

Running

Agents

19

BroadcastAudioUpscaling

🌖

19

Enhance broadcast audio with super‑resolution upscaling
Running on Zero

Agents

55

Apollo

💻

55

Restore and enhance audio files with AI models

Taxonomies

Sleeping

Agents

Multimodal AI Taxonomy

🌍
danielrosehill/multimodal-ai-taxonomy

Viewer • Updated Oct 22, 2025 • 85 • 76

Veo 3.1

Running

Agents

591

veo3.1-fast

🐨

591

Generate videos from text prompts or images

Architecture

benlehrburger/modern-architecture

Viewer • Updated May 31, 2023 • 1.09k • 262 • 4
Sleeping

Agents

2

ArchitectureClassifier

📈

2

Classify architectural styles in images
Running

Agents

17

Rocco Architecture Render

🚀

17

Generate interior and exterior designs from sketches
Sleeping

Agents

1

London Architecture

💻

1

Classify architectural styles in images

Style Transfer

Running

367

SD Artists Browser

🤘

367

Explore artist styles and build SDXL prompts
Running on Zero

MCP

65

StyleAligned Transfer

🐠

65

Generate images in the style of a reference image
Running

Agents

17

StyleFeatureEditor

💻

17

Edit images with predefined styles or text prompts
Runtime error

Agents

12

Kontext Style LoRAs

🌍

12

Transform images using selected styles

Geolocation Utilities

Running

Agents

10

Location Predictor

🌍

10

Identify image location on a map

Image Generation

Running on Zero

Agents

313

Sketch2lineart

🚀

313

Generate lineart images from your photos

Security Tools

Running

Agents

25

GLiNER-Multi-PII

💻

25

Extract personally identifiable information from text

Developer Utilities

Running on Zero

Agents

Featured

925

Screenshot to HTML

⚡

925

Generate HTML code from a website screenshot

Medical

Runtime error

Agents

3

Pharmacology Knowledge Graph

💊

3

Explore drug interactions and effects using AI predictions
Running

Agents

67

Medical Diagnosis

📉

67

Classify symptoms to diagnose health issues
Running

25

MediAI Medical AI Agent

🚀

25

AI-Powered Diagnosis & Treatment Assistant
Sleeping

Agents

Lisdexamfetamine Split Dose Modeller

🚀

Model split-dose protocols for lisdexamfetamine/Vyvanse

CAD Utilities

Running on CPU Upgrade

Agents

Featured

136

SGS 1

🚀

136

Generate 3D CAD models from images

Subtitle generation

Running

Agents

11

Whisper WebUI

🚀

11

Generate subtitles from audio or video files

Game creation

Running

Featured

235

3D Game Maker

🏢

235

create games with AI

OSINT

Spaces and models that may have applications in open source intelligence (OSINT)

Running

147

Reverse Face Search

📉

147

Search Face Online
Runtime error

Agents

SATINT Analyst

👁

Professional satellite imagery intelligence analysis
Running

Agents

10

Location Predictor

🌍

10

Identify image location on a map

Interesting ideas

AI use-cases and applications that I found interesting (a repo for myself to explore!)

Running on Zero

Agents

Featured

2.26k

MagicQuill

🪶

2.26k

Edit images with scribble‑based color and edge control
Build error

Agents

20

AutoPR

🚀

20

Generate a Twitter or Xiaohongshu post from a research PDF
Running

147

Reverse Face Search

📉

147

Search Face Online
Runtime error

Agents

16

AI STORYTELLER

🏢

16

Generate a video from a story

Background Removal

Running on Zero

MCP

2.86k

Background Removal

🌘

2.86k

Remove backgrounds from images instantly
Running on Zero

Agents

Featured

623

Video Background Removal

📽

623

Remove/Change background of video.
Running on Zero

Agents

948

BRIA RMBG 2.0

🐢

948

remove background from any image

Image captioning

Salesforce/blip-image-captioning-base

Image-to-Text • Updated Feb 3, 2025 • 2.06M • 860

Video Generation Quants

QuantStack/Wan2.2-I2V-A14B-GGUF

Image-to-Video • 14B • Updated Jul 29, 2025 • 217k • 349

OCR & Document Processing

rednote-hilab/dots.ocr

Image-Text-to-Text • 3B • Updated Oct 31, 2025 • 236k • 1.31k
Runtime error

15

Ui Rev Doc Model

😻

15

Analysis of data on an invoice
Paused

Agents

Featured

144

Deepdoctection

🏃

144

Convert PDFs and images to structured text and layout data
Running

13

Docsifer

📚

13

Convert documents into clean, LLM-ready Markdown.

Fast video generation

lightx2v/Wan2.2-Lightning

Text-to-Video • Updated Nov 13, 2025 • 64 • 618
ByteDance/AnimateDiff-Lightning

Text-to-Video • Updated Jan 6, 2025 • 9.87k • 991

Agentic code generation capable

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

256K Context

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

Video Generation

chetwinlow1/Ovi

Image-to-Video • 12B • Updated Nov 15, 2025 • 292 • 298
lightx2v/Wan2.2-Lightning

Text-to-Video • Updated Nov 13, 2025 • 64 • 618
Wan-AI/Wan2.2-T2V-A14B

Text-to-Video • Updated Aug 7, 2025 • 3.91k • 499
QuantStack/Wan2.2-I2V-A14B-GGUF

Image-to-Video • 14B • Updated Jul 29, 2025 • 217k • 349

Reasoning Models

ai21labs/AI21-Jamba-Reasoning-3B

Text Generation • 3B • Updated Oct 8, 2025 • 1.38k • 137
LLM360/K2-Think

Text Generation • 33B • Updated Nov 19, 2025 • 110 • 366
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF

Text Generation • 8B • Updated Jun 16, 2025 • 56.2k • 414

Instructional LLMs

LLMs optimised for instruction following rather than conversational use - quants and original models

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.89M • 6.06k
Qwen/Qwen3-Omni-30B-A3B-Instruct

Any-to-Any • 35B • Updated Sep 22, 2025 • 1.57M • 936
moonshotai/Kimi-K2-Instruct-0905

Text Generation • 1T • Updated Jan 30 • 2.02M • 722
mistralai/Mistral-7B-Instruct-v0.3

7B • Updated Dec 3, 2025 • 3.28M • 2.63k

Deep research

FractalAIResearch/Fathom-Search-4B

Text Generation • 4B • Updated Oct 10, 2025 • 23 • 121
Running

Agents

25

Fathom DeepResearch

📊

25

DeepResearch with the fathom search and synthesizer models

Mobile LLMs

LLMs optimised for running "on-device" (specifically, for this collection, on smartphones with standard inference capabilities)

facebook/MobileLLM-Pro

Text Generation • 1B • Updated Nov 11, 2025 • 4 • 162

Agentic LLMs

zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k
AgentFlow/agentflow-planner-7b

8B • Updated Oct 12, 2025 • 1.37k • 63
HuggingFaceH4/zephyr-7b-beta

Text Generation • 7B • Updated Oct 16, 2024 • 147k • 1.85k
zai-org/GLM-4.5-Air

Text Generation • 110B • Updated Aug 11, 2025 • 320k • 608

Local model collection

LLMs (mostly quants) that are small enough to run locally (on my hardware). My go-tos.

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 7.6M • 951
zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k
Qwen/Qwen3-VL-8B-Thinking

Image-Text-to-Text • 9B • Updated Nov 26, 2025 • 254k • 210
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.89M • 6.06k

My Image Datasets

danielrosehill/Hebrew-Language-Signage

Viewer • Updated Nov 6, 2025 • 68 • 22
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

Datasets

danielrosehill/Zapier-Integrations-260825

Viewer • Updated Aug 26, 2025 • 8.64k • 7

Text Transformation

danielrosehill/Shakespearean-Text-Transformation-Prompts

Viewer • Updated Apr 21, 2025 • 1 • 26
danielrosehill/Speech-To-Text-System-Prompts-2

Viewer • Updated Apr 9, 2025 • 2 • 35 • 1
Sleeping

Agents

System Prompt Reformatter

📚

Reformats system prompts in the 2nd person and other edits
Sleeping

Agents

BLUF Email Formatter

📧

Format emails with clear subject lines and summaries

Evaluations

danielrosehill/ChatGPT-AI-Vs-API

Updated Apr 22, 2025 • 2
danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
danielrosehill/STT-Voice-Notes-Evals

Updated Aug 11, 2025 • 11

Jerusalem

danielrosehill/Jerusalem-Emergency-Shelters-0925

Viewer • Updated Sep 19, 2025 • 149 • 13
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

ISO Standards

ISO standard related projects

danielrosehill/ISO-3166-4217-Consolidated

Preview • Updated Sep 3, 2025 • 9

Sustainability Projects

Datasets and projects related to sustainability, especially impact investing

danielrosehill/GHG-Emissions-Data

Viewer • Updated Dec 20, 2024 • 78 • 461
danielrosehill/Global-Value-Factor-Database-Refactor-V2

Updated Sep 2, 2025 • 149
danielrosehill/ifvi_valuefactors_deriv

Updated Aug 21, 2025 • 296
danielrosehill/pay-for-outcomes-instruments

Preview • Updated May 21, 2025 • 6

Character Creation Datasets

Datasets for generating characters from 2D/3D

danielrosehill/Corn-The-Sloth

Viewer • Updated Apr 18, 2025 • 109 • 106

Israel Photo Galleries

Photo galleries of places in Israel for world generation use cases, among others

danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

3D-General

Paused

Agents

Featured

2.14k

Hunyuan3D-2.1

👻

2.14k

Image-to-3D Generation
Running on Zero

MCP

631

TRELLIS

🏢

631

Scalable and Versatile 3D Generation from images
Running on Zero

Agents

3.31k

Hunyuan3D-2.0

🌍

3.31k

Text-to-3D and Image-to-3D Generation

QR Art

Running on Zero

Agents

Featured

1.99k

QR Code AI Art Generator

📱

1.99k

QR Code AI Art Generator Blend QR codes with AI Art

Upscalers

Running on Zero

Agents

2.13k

Finegrain Image Enhancer

🖼

2.13k

Clarity AI Upscaler Reproduction
Running on Zero

Agents

154

RealESRGAN Pytorch

🔥

154

User Friendly Image & Video Upscaler!

Speech To Text (STT)

Running on Zero

Agents

Featured

2.77k

Whisper

📉

2.77k

Transcribe audio files into text instantly
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 5.05M • 5.81k
ggerganov/whisper.cpp

Automatic Speech Recognition • Updated Oct 29, 2024 • 1.45k

Image To Video (No Audio)

Running on Zero

Agents

3.75k

Live Portrait

🤪

3.75k

Apply the motion of a video on a portrait
Paused

Agents

Featured

5.12k

Wan2.2 Animate

👁

5.12k

Wan2.2 Animate
Running on Zero

MCP

Featured

2.02k

Stable Video Diffusion 1.1

📺

2.02k

Generate a short video from a single image
Running on Zero

MCP

Featured

1.61k

Wan2.1 Fast

🎥

1.61k

Animate a still image into a short video using a prompt

Image Editing Utilities

Running on Zero

MCP

2.86k

Background Removal

🌘

2.86k

Remove backgrounds from images instantly
Running on Zero

Agents

2.97k

CLIP Interrogator

🕵

2.97k

Generate detailed prompts from any image
Paused

Agents

441

NoWatermark

⚡

441

Powerful Watermark Removal API
Running

Agents

138

Vectorizer AI

🌍

138

Convert images to SVG vectors with customizable settings

Global Value Factor Database (GVFD) - Visualisation And Data

Version controlled refactors for data analysis of the Global Value Factor Database (GVFD) by the International Foundation for Valuing Impacts (IFVI).

Sleeping

Agents

GVFD Navigator

📉

Data visualisation utility for GVFD by IFVI (unofficial)
danielrosehill/Global-Value-Factor-Database-Refactor-V2

Updated Sep 2, 2025 • 149
danielrosehill/ifvi_valuefactors_deriv

Updated Aug 21, 2025 • 296

Text Reformatting Apps

Implementations of apps for simple text reformatting tasks

Sleeping

Agents

BLUF Email Formatter

📧

Format emails with clear subject lines and summaries
Runtime error

Agents

System Prompt Depersonaliser

😻

Converts personal system prompts for general use
Sleeping

Shakespeare AI

😻

Rewrites .. stuff .. in Shakespearean English

Hebrew AI Spaces

Spaces to do with Hebrew language AI

Running on CPU Upgrade

Agents

46

Hebrew LLM Leaderboard

🥇

46

Explore LLM benchmark leaderboard with searchable filters
Running

Agents

Hebrew GPT Neo - Science Fiction and Fantasy

🧙

Generate Hebrew text for science fiction and fantasy stories
Sleeping

Agents

מחולל נונסנס רובושאול

🤖

Generate fictional Shaul Amsterdamski quotes
Build error

Agents

Hebrew Sentiment

😻

Benchmarks

Running on CPU Upgrade

Agents

Featured

1.37k

Open ASR Leaderboard

🏆

1.37k

Explore and compare speech recognition model benchmarks
Running

Agents

31

Hebrew Transcription Leaderboard

🥇

31

Benchmarking Hebrew Speech-to-Text Models
Running

Agents

449

Agent Leaderboard

💬

449

Ranking of LLMs for agentic tasks

Vintage-LLMs

openai-community/gpt2

Text Generation • 0.1B • Updated Feb 19, 2024 • 13.1M • 3.3k

Hebrew Large Language Models

Language models supporting TTS (predominantly the text generation task). These are almost exclusively fine-tunes of Gemma, Mistral, or the Qwen models

GiliGold/Knesset-DictaBERT

Fill-Mask • 0.2B • Updated Dec 28, 2024 • 56 • 2
yam-peleg/Hebrew-Gemma-11B

Text Generation • 10B • Updated Mar 16, 2024 • 7 • 38
yam-peleg/Hebrew-Mistral-7B

Text Generation • 8B • Updated Apr 26, 2024 • 1k • 73
yam-peleg/Hebrew-Mistral-7B-200K

Text Generation • 8B • Updated May 6, 2024 • 297 • 15

Acronym Identification

Kamakshi88/t5_acronym

0.2B • Updated Feb 1, 2024 • 2

LLMS-Im-Testing

google/gemma-4-E4B

Any-to-Any • 8B • Updated 9 days ago • 650k • 311

Hebrew Sentiment Classification Models

DGurgurov/xlm-r_hebrew_sentiment

Text Classification • 0.3B • Updated Jun 8, 2024 • 2

Hebrew OCR Models

sivan22/testing-trOCR-hebrew-handwritten

Image-Text-to-Text • Updated May 17, 2023 • 85 • 1

Hebrew ASR

imvladikon/wav2vec2-large-xlsr-53-hebrew

Automatic Speech Recognition • 0.3B • Updated May 6, 2023 • 446 • 7
Mizurodp/wav2vec2-large-xls-r-300m-hebrew-colab

Automatic Speech Recognition • Updated Dec 17, 2022 • 37 • 1
imvladikon/wav2vec2-xls-r-300m-lm-hebrew

Automatic Speech Recognition • 0.3B • Updated Sep 15, 2023 • 78 • 4
imvladikon/wav2vec2-xls-r-1b-hebrew

Automatic Speech Recognition • 1.0B • Updated Sep 12, 2023 • 14 • 2

Streaming-Speech-To-Text

nvidia/nemotron-speech-streaming-en-0.6b

Automatic Speech Recognition • Updated 2 days ago • 7.24k • 572
mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated Mar 11 • 1.11M • 875

Agentic Code Gen 301225

Quick collection of models I'm evaluating for code generation. Minimum inclusion criteria include tool usage and MCP. Looking for something fast and goo

zai-org/GLM-4.7

Text Generation • 358B • Updated Jan 29 • 66.3k • 2.04k
MiniMaxAI/MiniMax-M2.1

Text Generation • 229B • Updated Feb 13 • 10.2k • 1.35k
XiaomiMiMo/MiMo-V2-Flash

Text Generation • 310B • Updated Apr 20 • 70.6k • 737
unsloth/MiniMax-M2.1-GGUF

Text Generation • 229B • Updated Feb 14 • 4.69k • 194

Video Understanding

lmms-lab/LLaVA-Video-7B-Qwen2

Video-Text-to-Text • 8B • Updated Oct 25, 2024 • 20.4k • 127

Image Evaluations

danielrosehill/Hebrew-Image-Eval-111225

Updated Dec 11, 2025 • 671

Audio Understanding Datasets

nvidia/AF-Think

Preview • Updated Apr 5 • 418 • 24
nvidia/AudioSkills

Preview • Updated Jan 8 • 4.01k • 102
nvidia/LongAudio

Preview • Updated Apr 5 • 270 • 23

Audio Multimodal Models

Open source models with audio understanding. Tracking mostly vendor releases in the audio and text to text subclassification of multimodal.

stepfun-ai/Step-Audio-R1

Audio-Text-to-Text • 33B • Updated Dec 2, 2025 • 222 • 144
Qwen/Qwen2-Audio-7B

Audio-Text-to-Text • 8B • Updated Nov 20, 2024 • 5.06k • 171
Qwen/Qwen2-Audio-7B-Instruct

Audio-Text-to-Text • 8B • Updated Jan 12, 2025 • 686k • 538
FreedomIntelligence/Soundwave

Audio-Text-to-Text • 9B • Updated Mar 16, 2025 • 20 • 15

My Whisper ACFT Fine Tunes

Whisper fine tunes for use with FUTO keyboard on Android (training: Modal based on Whisper-ACFT skeleton from FUTO)

danielrosehill/daniel_whisper_acft_base_v2

99.1M • Updated Nov 25, 2025 • 3
danielrosehill/daniel_whisper_acft_small_v2

0.3B • Updated Nov 25, 2025 • 3
danielrosehill/daniel_whisper_acft_tiny_v2

57.7M • Updated Nov 25, 2025 • 5

My Whisper Fine-Tunes (V2)

Whisper fine-tunes for my voice and vocab (tech, Hebrew). About 1 hour of training data so still very much POCs!

danielrosehill/daniel_whisper_finetune_large_v3_turbo_v2

Automatic Speech Recognition • 0.8B • Updated Nov 23, 2025 • 3
danielrosehill/daniel_whisper_finetune_medium_v2

Automatic Speech Recognition • 0.8B • Updated Nov 23, 2025 • 4
danielrosehill/daniel_whisper_finetune_tiny_v2

Automatic Speech Recognition • 37.8M • Updated Nov 23, 2025 • 1
danielrosehill/daniel_whisper_finetune_base_v2

Automatic Speech Recognition • 72.6M • Updated Nov 23, 2025 • 5

ASR Beyond Whisper

nvidia/parakeet-tdt-0.6b-v3

Automatic Speech Recognition • 0.6B • Updated 22 days ago • 106k • 918
ibm-granite/granite-speech-3.3-8b

Automatic Speech Recognition • 9B • Updated Apr 2 • 54.7k • 171
facebook/seamless-m4t-v2-large

Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 427k • 985
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.08M • 398

Whisper Hebrish

Fine-tune of Whisper Large (V3, Turbo) on a small corpus of mixed-language sentences (English with Hebrew) to improve transcription accuracy

Sleeping

Agents

Whisper Hebrish

🎤

Compare fine-tuned vs stock Whisper models
danielrosehill/Whisper-Hebrish

0.8B • Updated Nov 18, 2025 • 60
danielrosehill/English-Hebrew-Mixed-Sentences

Viewer • Updated Nov 17, 2025 • 516 • 59

Old LLMs

Useful models for demonstrating what early and pre-Transformer LLMs looked like and how they functioned

TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF

1B • Updated Dec 31, 2023 • 197k • 229

My-Evaluations

danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15
danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
Sleeping

Agents

Podcast ASR Evaluation

🎙

ASR benchmark comparing local and cloud models
Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs

Whisper Fine Tunes

Whisper fine-tuned on 1 hour of my voice

Running

Whisper Fine-Tune vs Commercial APIs

🎤

Local fine-tunes beat commercial STT APIs

PDF Downloads

Running

3.88k

The Ultra-Scale Playbook

🌌

3.88k

The ultimate guide to training LLM on large GPU Clusters

STT Components

Models that work in unison with core STT models in voice workflows

pyannote/voice-activity-detection

Automatic Speech Recognition • Updated May 10, 2024 • 2.63M • 235
pyannote/speaker-diarization-3.1

Automatic Speech Recognition • Updated May 10, 2024 • 8.18M • 2.28k
pyannote/overlapped-speech-detection

Automatic Speech Recognition • Updated May 10, 2024 • 92.2k • 58
pipecat-ai/smart-turn-v3

Voice Activity Detection • Updated Jan 7 • 168

Video background removal

Running on Zero

Agents

Featured

360

Remove Video Background

🎞

360

Easily remove your video's background!

STT Fine Tune Resources

unsloth/whisper-large-v3

Automatic Speech Recognition • 2B • Updated May 14, 2025 • 3.84k • 16
unsloth/whisper-small

Automatic Speech Recognition • 0.2B • Updated May 14, 2025 • 637 • 6
unsloth/CrisperWhisper

Automatic Speech Recognition • 2B • Updated May 14, 2025 • 45 • 16
unsloth/whisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated May 14, 2025 • 1.93k • 10

Concept Outlines

Running

2

Claude Agent Picker Pattern

🎯

2

Pattern for managing multi-agent crews in Claude Code

STT Evaluations

Sleeping

Agents

Local STT Eval One Sample

😻

Single sample eval for WER on various Whisper models
danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15
Sleeping

Agents

Podcast ASR Evaluation

🎙

ASR benchmark comparing local and cloud models
Running

STT Comparison

🦀

Comparing STT models against audio
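
The word error rate (WER) mentioned in the Local STT Eval entry above can be sketched as follows. This is a minimal, dependency-free illustration of the standard definition (word-level edit distance divided by the number of reference words), not the code used by any of the listed Spaces. The function name and the lowercase, whitespace-split normalization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Evaluation pipelines usually apply heavier text normalization (punctuation, numerals, casing) before scoring, and that choice can change the reported WER noticeably.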

Whisper variants

nyrahealth/CrisperWhisper

Automatic Speech Recognition • 2B • Updated Apr 7 • 70.1k • 334

Entertainment Recommendations

Running

Agents

9

MovieReccomender

📚

9

Get a personalized recommendation using AI

Proofs of Concept

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison

Deep Filter

Running

Agents

17

DeepFilterNet2 No File Size Limit

😻

17

Use DeepFilterNet2 to denoise audio no file size limit

Gemini

Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Claude Code

Running

Agents

Claude Code Slash Commands

⚡

Interactive browser for Claude Code slash commands
Running

Claude Code Linux Desktop Slash Commands

🖥

Slashes for Linux Desktop admin

Shakespeare AI

Just for fun projects for converting conventional text into Shakespearean English

Paused

Agents

Featured

30

Diffusion GPT

🖊

30

Generate Shakespearean text using a diffusion model
Sleeping

Shakespeare AI

😻

Rewrites .. stuff .. in Shakespearean English

Real Time Video To Video

Running on CPU Upgrade

Featured

101

Krea Realtime Video

👁

101

Generate AI videos from webcam, video, or text
krea/krea-realtime-video

Text-to-Video • Updated Nov 14, 2025 • 3.18k • 280

Multimodal AI

Sleeping

Agents

Multimodal AI Taxonomy

🌍
danielrosehill/multimodal-ai-taxonomy

Viewer • Updated Oct 22, 2025 • 85 • 76
Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Context Utilities

Experimental workflows (mostly) that aim to rethink how contextual data can be gathered to achieve personalised inference in AI systems

Sleeping

2

AI Context Generation Interviews

⚡

2

Model workflow using agents to proactively develop context
Sleeping

Agents

Context Cruncher

🎙

Transform voice recordings into structured AI context data

Avatar Videos

Running on Zero

MCP

Featured

604

LatentSync

👄

604

Audio Conditioned LipSync with Latent Diffusion Models
Build error

Agents

Featured

1.43k

SadTalker

😭

1.43k

Generate a talking face video from an image and audio
Running

Agents

181

Gradio Lipsync Wav2lip

👄

181

Create lip‑synced videos from a face image and audio
Running

Agents

70

Wav2lip Gpu

🌍

70

Create a talking‑head video from a photo and audio

Resume Utilities

Runtime error

2

ATS Resume Checker

🌖

2

Parse resumes and match with job descriptions
Running

Agents

2

Resume Matcher

🚀

2

Match resume with job description

Multi LLM Experiments

Running on CPU Upgrade

58

Outsmart

🧠

58

Watch LLMs negotiate in a real‑time battle arena

Leaderboards

Running

22

AudioBench Leaderboard

🥇

22

View and compare audio model performance rankings
Running

110

AI Phone Leaderboard

📱

110

AI Phone Leaderboard
Running

Agents

356

VBench Leaderboard

📊

356

Submit video model evaluation results to a public benchmark

Object Detection

Running

Agents

Featured

101

Yolov10

📉

101

Detect objects in images with customizable YOLOv10 models

Text Processing Utilities

Running

Agents

25

GLiNER-Multi-PII

💻

25

Extract personally identifiable information from text
Running

4

MarkItDown Microsoft

🐠

4

Extract Text in Structured Format

Data Visualization

Running

3

neulab/conala

🗺

3

Display interactive data visualizations

Hugging Face Utilities

Sleeping

Agents

7

HF Downloader

📈

7

Download Hugging Face repositories to run locally.✔

Scraping

Running

MCP

104

Web Scraper

🚀

104

Scrape a website and download its content as markdown

Video editing utilities

Runtime error

Agents

31

LaMa Video Watermark Remover

🌖

31

Remove watermarks from videos
Running

Agents

9

AudioDenoiser

👁

9

Remove background noise from videos
Paused

11

Image Video Colorization

🎥

11

Colorize black and white videos into full color

Audio editing utilities

Running

Agents

322

Remove Silence From Audio

🦀

322

Remove Silence From Audio
Running on Zero

Agents

386

Audio🔹Separator

🏃

386

Vocal and background audio separator
Runtime error

Agents

Featured

327

Audio Editing

🎧

327

Edit audios with text prompts
Paused

Agents

472

Resemble Enhance

🚀

472

Enhance and denoise your audio files instantly

Prompt engineering

Running

149

Prompt Lab

😻

149

Enhance your prompt into a structured, expert‑level version

Data Processing Utilities

Running

620

MinerU Document Extraction Tools

📚

620

Embedded MinerU document extraction demo
Running

MCP

104

Web Scraper

🚀

104

Scrape a website and download its content as markdown
Running

Agents

102

PDF to Dataset

📄

102

Convert PDFs to a Hugging Face dataset

Image To Video

Running on Zero

MCP

Featured

1.51k

LTX Video Fast

🎥

1.51k

ultra-fast video model, LTX 0.9.8 13B distilled

Video To Video

QuantStack/Wan2.2-Animate-14B-GGUF

Video-to-Video • 17B • Updated Sep 20, 2025 • 29.6k • 202

TTS With Dialog Support

nari-labs/Dia-1.6B

Text-to-Speech • 2B • Updated Jun 1, 2025 • 8.2k • 2.88k

Diarisation

pyannote/speaker-diarization-community-1

Automatic Speech Recognition • Updated Sep 29, 2025 • 2.67M • 533

Long speech synthesis

microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k

Browser use capable

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

Code Generation Models

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

General LLM Quants

ai21labs/AI21-Jamba-Mini-1.7

52B • Updated Feb 2 • 70 • 42

Embedding Models

google/embeddinggemma-300m

Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.67M • 1.71k
nvidia/omni-embed-nemotron-3b

Sentence Similarity • 5B • Updated May 6 • 12.4k • 125
Qwen/Qwen3-Embedding-0.6B

Feature Extraction • 0.6B • Updated Apr 20 • 8.59M • 1.06k

Image Generation Models

tencent/HunyuanImage-3.0

Text-to-Image • 83B • Updated Jan 28 • 1.05M • 1.09k
black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16, 2024 • 323k • 5.09k
stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 1.39M • 7.81k
Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 173k • 2.51k

Image Generation Quants

Quantized versions of image generation models

lightx2v/Qwen-Image-Lightning

Text-to-Image • Updated Nov 3, 2025 • 422k • 802

LLMs

zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k

Voice Cloning

neuphonic/neutts-air

Text-to-Speech • 0.7B • Updated Feb 12 • 18.8k • 874
coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 8.84M • 3.59k
Running on Zero

Agents

83

Voice Cloning Studio

🚀

83

This space offers an easy-to-use interface for voice cloning

Vision Language Models

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 7.6M • 951
moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Apr 9 • 276k • 650

Audio Datasets

danielrosehill/ai-generated-podcast-episodes

Viewer • Updated Aug 26, 2025 • 66 • 3
danielrosehill/Small-STT-Eval-Audio-Dataset

Viewer • Updated Dec 10, 2025 • 92 • 39

Context Data

danielrosehill/Software-Wish-List-Context-Data

Updated Apr 28, 2025 • 16

Experiments

danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
Sleeping

Agents

AI Agent UN - Multi-Agent Simulation Framework

🏛

Explore UN voting simulations with AI agents
Sleeping

Max Output Tokens Analysis

📊

Display max output tokens for models over time

AI Agents

AI agent network configs and projects

danielrosehill/Code-Gen-Agents-0925

Updated Sep 15, 2025 • 56
Sleeping

Agents

AI Agent UN - Multi-Agent Simulation Framework

🏛

Explore UN voting simulations with AI agents
Sleeping

Agents

Code Gen Agents Network

🔥

Code generation agent network with config navigator

Israel

Projects related to Israel

danielrosehill/Israel-Alerting-Zones

Preview • Updated May 9, 2025 • 17
danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Jerusalem-Emergency-Shelters-0925

Viewer • Updated Sep 19, 2025 • 149 • 13

Reference / Lookup Datasets

danielrosehill/ISO-3166-4217-Consolidated

Preview • Updated Sep 3, 2025 • 9

Voice Note Audio And Training

Datasets for STT training, classification fine-tuning - especially voice notes

danielrosehill/Voice-Note-Audio

Preview • Updated Oct 27, 2025 • 201

My System Prompt Collections

Collections of system prompts for assistants, agents, and workflows

danielrosehill/Data-Utils-System-Prompts

Updated Apr 9, 2025 • 23
danielrosehill/Email-Management-System-Prompts

Updated Apr 9, 2025 • 20
danielrosehill/General-Purpose-System-Prompts

Updated Apr 9, 2025 • 27 • 1
danielrosehill/Geopolitical-System-Prompts

Updated Apr 9, 2025 • 29

3D Human Digital Humans

Running on Zero

Agents

191

PSHuman

🏃

191

PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
Runtime error

Agents

11

Pifuhd

🐠

11

Generate 3D human models from images
Running on Zero

Agents

10

HumanWild

⚡

10

Generate 3D human reconstructions from images
Runtime error

51

HSMR

💀

51

Convert images of humans to biomechanically accurate 3D skeletons

Image To Image

Running on Zero

Agents

Featured

1.64k

Expression Editor

🐨

1.64k

Quickly edit the expression of a face
Running on Zero

Agents

Featured

1.54k

InstructPix2Pix

🚀

1.54k

Edit images using text instructions
Qwen/Qwen-Image-Edit-2509

Image-to-Image • Updated Sep 22, 2025 • 355k • 1.16k
Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 173k • 2.51k

Generative-AI-Favorites

Paused

Agents

Featured

5.12k

Wan2.2 Animate

👁

5.12k

Wan2.2 Animate

Single Shot Image To Image (Reference)

Running on Zero

Agents

Featured

1.94k

PhotoMaker

📷

1.94k

Generate personalized photos of a person from a prompt

Text To Speech (TTS)

Runtime error

Agents

Featured

2.77k

XTTS

🐸

2.77k

Generate speech from text using a reference voice
microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.4M • 6.31k

Music Generation

Running on Zero

Agents

Featured

5.07k

MusicGen

🎵

5.07k

Generate music from a text description and optional melody
suno/bark

Text-to-Speech • Updated Oct 4, 2023 • 18k • 1.53k
Running on L40S

Featured

744

Song Generation

🎵

744

Generate a song from your lyrics and prompts

Character-Generation

Running on Zero

Agents

191

PSHuman

🏃

191

PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
Runtime error

MCP

11

3D Game Environment Builder

📚

11

3D Game Environment Builder MCP

AI Experiments

Experiments in AI and prompt engineering

Sleeping

Max Output Tokens Analysis

📊

Display max output tokens for models over time
Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs
Running

Single Shot Brevity Training

📈

Using one example to train an LLM for informational brevity
Sleeping

Agents

Local STT Eval One Sample

😻

Single sample eval for WER on various Whisper models

Useful-Models

modularai/Llama-3.1-8B-Instruct-GGUF

Text Generation • 8B • Updated Sep 9, 2024 • 4.89k • 17
MaziyarPanahi/WizardLM-2-7B-GGUF

Text Generation • 7B • Updated Apr 15, 2024 • 157k • 83
MaziyarPanahi/mathstral-7B-v0.1-GGUF

Text Generation • 7B • Updated Jul 16, 2024 • 156k • 7
MaziyarPanahi/phi-4-GGUF

Text Generation • 15B • Updated Jan 8, 2025 • 156k • 8

Hebrew datasets

Datasets in Hebrew

imvladikon/hebrew_speech_coursera

Viewer • Updated May 5, 2023 • 21k • 275 • 10
imvladikon/hebrew_speech_kan

Viewer • Updated May 5, 2023 • 10k • 197 • 13
sivan22/hebrew-handwritten-dataset

Viewer • Updated May 8, 2023 • 5.09k • 73 • 15
imvladikon/hebrew_speech_campus

Viewer • Updated Nov 20, 2023 • 75.9k • 888 • 6

ASR Models

Collection of ASR models for customized STT model training

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated Apr 13 • 365k • 1.5k
ibm-granite/granite-speech-3.3-8b

Automatic Speech Recognition • 9B • Updated Apr 2 • 54.7k • 171
nvidia/canary-qwen-2.5b

Automatic Speech Recognition • 3B • Updated Apr 21 • 39.5k • 437
facebook/omniASR-W2V-1B

Automatic Speech Recognition • Updated Nov 27, 2025 • 6

Fav-Code-Generation-Models

Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • 33B • Updated Jan 12, 2025 • 1.54M • 2.04k
deepseek-ai/DeepSeek-V2.5-1210

Text Generation • 236B • Updated Dec 11, 2024 • 614 • 257

LLM-Experiments

Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs
GiliGold/Knesset-DictaBERT

Fill-Mask • 0.2B • Updated Dec 28, 2024 • 56 • 2

Israel Open Data

danielrosehill/Israel-Open-Data-Catalogue

Updated May 6 • 38
danielrosehill/Jerusalem-Air-Quality-Shabbat

Viewer • Updated Apr 28 • 7.95M • 252

Acronym Identification

Kamakshi88/t5_acronym

0.2B • Updated Feb 1, 2024 • 2

Disfluency

4i-ai/BERT_disfluency_cls

Text Classification • 0.1B • Updated Aug 25, 2023 • 514 • 1
amaai-lab/DisfluencySpeech

Viewer • Updated Jun 27, 2024 • 5k • 412 • 21
adjaysagar/english-DisfluencySpeech

Viewer • Updated Feb 7 • 4.5k • 4
arielcerdap/disfluency-fluencybank

Viewer • Updated Mar 17 • 17.8k • 63

LLMS-Im-Testing

google/gemma-4-E4B

Any-to-Any • 8B • Updated 9 days ago • 650k • 311

Hebrew Punctuation Restoration

verbit/hebrew_punctuation

Updated Oct 6, 2024 • 12 • 1

Hebrew Sentiment Classification Models

DGurgurov/xlm-r_hebrew_sentiment

Text Classification • 0.3B • Updated Jun 8, 2024 • 2

English Hebrew Translation

ashercn97/english-hebrew-translation

Translation • 77M • Updated Nov 12, 2023 • 24 • 2

Hebrew OCR Models

sivan22/testing-trOCR-hebrew-handwritten

Image-Text-to-Text • Updated May 17, 2023 • 85 • 1

Hebrew Diacritic Restoration Models

baravninaor/punctuation-restoration-deberta-alepgbert-hebrew

Updated Mar 22, 2023 • 1

Hebrew ASR

imvladikon/wav2vec2-large-xlsr-53-hebrew

Automatic Speech Recognition • 0.3B • Updated May 6, 2023 • 446 • 7
Mizurodp/wav2vec2-large-xls-r-300m-hebrew-colab

Automatic Speech Recognition • Updated Dec 17, 2022 • 37 • 1
imvladikon/wav2vec2-xls-r-300m-lm-hebrew

Automatic Speech Recognition • 0.3B • Updated Sep 15, 2023 • 78 • 4
imvladikon/wav2vec2-xls-r-1b-hebrew

Automatic Speech Recognition • 1.0B • Updated Sep 12, 2023 • 14 • 2

Hebrew-TTS

Yzamari/f5tts-hebrew-v2

Text-to-Speech • 0.3B • Updated Mar 28 • 62 • 1
notmax123/Zonos-Hebrew

Text-to-Speech • Updated Sep 11, 2025 • 13.9k • 3

Streaming-Speech-To-Text

nvidia/nemotron-speech-streaming-en-0.6b

Automatic Speech Recognition • Updated 2 days ago • 7.24k • 572
mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated Mar 11 • 1.11M • 875

Utilities

Running

Featured

1.05k

Can You Run It? LLM version

🚀

1.05k

Check if your GPU can run a chosen LLM model

Agentic Code Gen 301225

Quick collection of models I'm evaluating for code generation. Minimum inclusion criteria include tool usage and MCP. Looking for something fast and goo

zai-org/GLM-4.7

Text Generation • 358B • Updated Jan 29 • 66.3k • 2.04k
MiniMaxAI/MiniMax-M2.1

Text Generation • 229B • Updated Feb 13 • 10.2k • 1.35k
XiaomiMiMo/MiMo-V2-Flash

Text Generation • 310B • Updated Apr 20 • 70.6k • 737
unsloth/MiniMax-M2.1-GGUF

Text Generation • 229B • Updated Feb 14 • 4.69k • 194

MWP-TTS-Candidates

ResembleAI/chatterbox-turbo

Text-to-Speech • Updated Dec 15, 2025 • 652

Video Understanding

lmms-lab/LLaVA-Video-7B-Qwen2

Video-Text-to-Text • 8B • Updated Oct 25, 2024 • 20.4k • 127

ASR-To-Try

zai-org/GLM-ASR-Nano-2512

Automatic Speech Recognition • 2B • Updated Apr 7 • 118k • 370
nvidia/canary-qwen-2.5b

Automatic Speech Recognition • 3B • Updated Apr 21 • 39.5k • 437
microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 481k • 1.6k
facebook/omniASR-LLM-7B

Automatic Speech Recognition • Updated Nov 28, 2025 • 32

Image Evaluations

danielrosehill/Hebrew-Image-Eval-111225

Updated Dec 11, 2025 • 671

Evaluation Datasets

danielrosehill/Small-STT-Eval-Audio-Dataset

Viewer • Updated Dec 10, 2025 • 92 • 39

Audio Understanding Datasets

nvidia/AF-Think

Preview • Updated Apr 5 • 418 • 24
nvidia/AudioSkills

Preview • Updated Jan 8 • 4.01k • 102
nvidia/LongAudio

Preview • Updated Apr 5 • 270 • 23

Voxtral Originals (Mistral)

The two official variants of Voxtral (audio multimodal model) released by Mistral in July 2025

mistralai/Voxtral-Mini-3B-2507

5B • Updated Jul 28, 2025 • 309k • 655
mistralai/Voxtral-Small-24B-2507

Audio-Text-to-Text • 24B • Updated Dec 20, 2025 • 43.7k • 498

Audio Multimodal Models

Open source models with audio understanding. Tracking mostly vendor releases in the audio and text to text subclassification of multimodal.

stepfun-ai/Step-Audio-R1

Audio-Text-to-Text • 33B • Updated Dec 2, 2025 • 222 • 144
Qwen/Qwen2-Audio-7B

Audio-Text-to-Text • 8B • Updated Nov 20, 2024 • 5.06k • 171
Qwen/Qwen2-Audio-7B-Instruct

Audio-Text-to-Text • 8B • Updated Jan 12, 2025 • 686k • 538
FreedomIntelligence/Soundwave

Audio-Text-to-Text • 9B • Updated Mar 16, 2025 • 20 • 15

Flux 2 Quants

city96/FLUX.2-dev-gguf

Image-to-Image • 32B • Updated Nov 29, 2025 • 86.7k • 146
gguf-org/flux2-dev-gguf

Image-to-Image • 18B • Updated Jan 1 • 6.55k • 57

My Whisper ACFT Fine Tunes

Whisper fine tunes for use with FUTO keyboard on Android (training: Modal based on Whisper-ACFT skeleton from FUTO)

danielrosehill/daniel_whisper_acft_base_v2

99.1M • Updated Nov 25, 2025 • 3
danielrosehill/daniel_whisper_acft_small_v2

0.3B • Updated Nov 25, 2025 • 3
danielrosehill/daniel_whisper_acft_tiny_v2

57.7M • Updated Nov 25, 2025 • 5

My Public Audio Datasets

Open-sourced audio datasets for STT/ASR. All recordings by me (Daniel Rosehill) unless otherwise credited.

danielrosehill/English-Hebrew-Mixed-Sentences

Viewer • Updated Nov 17, 2025 • 516 • 59
danielrosehill/Tech-Sentences-For-ASR-Training

Viewer • Updated Nov 26, 2025 • 205 • 201 • 2
danielrosehill/Sample-Voice-Context-Data

Viewer • Updated Nov 30, 2025 • 159 • 27

My Whisper Fine-Tunes (V2)

Whisper fine-tunes for my voice and vocab (tech, Hebrew). About 1 hour of training data so still very much POCs!

danielrosehill/daniel_whisper_finetune_large_v3_turbo_v2

Automatic Speech Recognition • 0.8B • Updated Nov 23, 2025 • 3
danielrosehill/daniel_whisper_finetune_medium_v2

Automatic Speech Recognition • 0.8B • Updated Nov 23, 2025 • 4
danielrosehill/daniel_whisper_finetune_tiny_v2

Automatic Speech Recognition • 37.8M • Updated Nov 23, 2025 • 1
danielrosehill/daniel_whisper_finetune_base_v2

Automatic Speech Recognition • 72.6M • Updated Nov 23, 2025 • 5

TTS Models

Xenova/speecht5_tts

Text-to-Speech • Updated Aug 27, 2025 • 5.9k • 42
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.4M • 6.31k
microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k
neuphonic/neutts-air

Text-to-Speech • 0.7B • Updated Feb 12 • 18.8k • 874

ASR Beyond Whisper

nvidia/parakeet-tdt-0.6b-v3

Automatic Speech Recognition • 0.6B • Updated 22 days ago • 106k • 918
ibm-granite/granite-speech-3.3-8b

Automatic Speech Recognition • 9B • Updated Apr 2 • 54.7k • 171
facebook/seamless-m4t-v2-large

Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 427k • 985
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.08M • 398

ASR Resources

Useful datasets and models for ASR projects and fine tuning

ggerganov/whisper.cpp

Automatic Speech Recognition • Updated Oct 29, 2024 • 1.45k
openslr/librispeech_asr

Viewer • Updated Jul 25, 2025 • 585k • 103k • 228
speechcolab/gigaspeech

Viewer • Updated Feb 7 • 11.9M • 27k • 164
agentlans/high-quality-english-sentences

Viewer • Updated Oct 1, 2024 • 1.71M • 410 • 37

Whisper Hebrish

Fine tune of Whisper Large (V3, Turbo) using a small corpus of mixed language English sentences with Hebrew to improve accuracy

Sleeping

Agents

Whisper Hebrish

🎤

Compare fine-tuned vs stock Whisper models
danielrosehill/Whisper-Hebrish

0.8B • Updated Nov 18, 2025 • 60
danielrosehill/English-Hebrew-Mixed-Sentences

Viewer • Updated Nov 17, 2025 • 516 • 59

My-ASR-Finetunes

danielrosehill/Whisper-Hebrish

0.8B • Updated Nov 18, 2025 • 60
Sleeping

Agents

Whisper Hebrish

🎤

Compare fine-tuned vs stock Whisper models

Old LLMs

Useful models for demonstrating what early and pre-Transformer LLMs looked like and how they functioned

TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF

1B • Updated Dec 31, 2023 • 197k • 229

FUTO Models

futo-org/acft-whisper-tiny

Automatic Speech Recognition • 57.7M • Updated Jun 25, 2024 • 3 • 1
futo-org/acft-whisper-small.en

Automatic Speech Recognition • 0.3B • Updated Jun 25, 2024 • 6 • 2
futo-org/acft-whisper-base.en

Automatic Speech Recognition • 99.1M • Updated Jun 25, 2024 • 4 • 2
futo-org/acft-whisper-tiny.en

Automatic Speech Recognition • 57.7M • Updated Jun 25, 2024 • 3 • 1

My-Evaluations

danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15
danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
Sleeping

Agents

Podcast ASR Evaluation

🎙

ASR benchmark comparing local and cloud models
Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs

ASR Benchmarking

Running on CPU Upgrade

Agents

Featured

1.37k

Open ASR Leaderboard

🏆

1.37k

Explore and compare speech recognition model benchmarks
Sleeping

Agents

2

Asr Metrics

👀

2

Analyze ASR accuracy by comparing text files
danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15

Whisper Fine Tunes

Whisper fine-tuned on 1 hour of my voice

Running

Whisper Fine-Tune vs Commercial APIs

🎤

Local fine-tunes beat commercial STT APIs

API Price Comparisons

Running

Agentic LLM Price Comparisons

🤖

218 tool-calling LLMs analyzed by cost and context
danielrosehill/Open-Router-API-Pricing-Analysis

Viewer • Updated Nov 10, 2025 • 2.38k • 125

PDF Downloads

Running

3.88k

The Ultra-Scale Playbook

🌌

3.88k

The ultimate guide to training LLMs on large GPU clusters

My LORAS

danielrosehill/Jerusalem-Images

Text-to-Image • Updated Nov 6, 2025 • 3
danielrosehill/Tel-Aviv-Street-Style

Text-to-Image • Updated Nov 6, 2025 • 3
danielrosehill/Herman-Poppleberry

Text-to-Image • Updated Nov 6, 2025 • 4
danielrosehill/Dinosaur-Sloth

Text-to-Image • Updated Nov 6, 2025 • 3

STT Components

Models that work in unison with core STT models in voice workflows

pyannote/voice-activity-detection

Automatic Speech Recognition • Updated May 10, 2024 • 2.63M • 235
pyannote/speaker-diarization-3.1

Automatic Speech Recognition • Updated May 10, 2024 • 8.18M • 2.28k
pyannote/overlapped-speech-detection

Automatic Speech Recognition • Updated May 10, 2024 • 92.2k • 58
pipecat-ai/smart-turn-v3

Voice Activity Detection • Updated Jan 7 • 168

To Check Out

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated Dec 23, 2025 • 128k • 1.5k
stabilityai/sp4d

Updated Nov 5, 2025 • 25 • 13
Falconsai/text_summarization

Summarization • 60.5M • Updated Feb 17, 2024 • 141k • 293
distil-whisper/distil-large-v3

Automatic Speech Recognition • 0.8B • Updated Apr 21 • 931k • 375

Video background removal

Running on Zero

Agents

Featured

360

Remove Video Background

🎞

360

Easily remove your video's background!

Architecture Related Models

Collection of models gathered together for my wife who is an architect (of buildings!)

Muapi/jj-s-architecture-office-building

Text-to-Image • Updated 29 days ago • 12
prithivMLmods/Canopus-Interior-Architecture-0.1

Text-to-Image • Updated Aug 4, 2024 • 318 • 26

STT Fine Tune Resources

unsloth/whisper-large-v3

Automatic Speech Recognition • 2B • Updated May 14, 2025 • 3.84k • 16
unsloth/whisper-small

Automatic Speech Recognition • 0.2B • Updated May 14, 2025 • 637 • 6
unsloth/CrisperWhisper

Automatic Speech Recognition • 2B • Updated May 14, 2025 • 45 • 16
unsloth/whisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated May 14, 2025 • 1.93k • 10

Image to 3D

Running on Zero

MCP

631

TRELLIS

🏢

631

Scalable and Versatile 3D Generation from images

Concept Outlines

Running

2

Claude Agent Picker Pattern

🎯

2

Pattern for managing multi-agent crews in Claude Code

My Ideas

Running

2

Claude Agent Picker Pattern

🎯

2

Pattern for managing multi-agent crews in Claude Code

STT Evaluations

Sleeping

Agents

Local STT Eval One Sample

😻

Single sample eval for WER on various Whisper models
danielrosehill/Podcast-ASR-Evaluation

Viewer • Updated Nov 14, 2025 • 27 • 15
Sleeping

Agents

Podcast ASR Evaluation

🎙

ASR benchmark comparing local and cloud models
Running

STT Comparison

🦀

Comparing STT models against audio

Whisper Base + variants

openai/whisper-base

Automatic Speech Recognition • 72.6M • Updated Feb 29, 2024 • 4.36M • 271
openai/whisper-base.en

Automatic Speech Recognition • 72.6M • Updated Jan 22, 2024 • 22.4k • 43
onnx-community/whisper-base_timestamped

Automatic Speech Recognition • Updated Mar 5, 2025 • 3.53k • 32
Systran/faster-whisper-base

Automatic Speech Recognition • Updated Nov 23, 2023 • 1.33M • 28

Whisper variants

nyrahealth/CrisperWhisper

Automatic Speech Recognition • 2B • Updated Apr 7 • 70.1k • 334

Voice Modality Apps

Runtime error

Agents

2

Voice Generated Visions

🦀

2

Cloud-Native Voice-to-Image Generation using LLMs and GenAI

Entertainment Recommendations

Running

Agents

9

MovieReccomender

📚

9

Get a personalized recommendation using AI

Worlds (3D, Games)

Running on Zero

Agents

183

HunyuanWorld-Mirror

🌍

183

Universal 3D World Reconstruction with Any Prior Prompting

Proofs of Concept

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison

Demos

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison
Running

Nano Banana Sketch Cleanup

🎨

AI sketch cleanup with before/after comparisons

Deep Filter

Running

Agents

17

DeepFilterNet2 No File Size Limit

😻

17

Use DeepFilterNet2 to denoise audio with no file size limit

Background Noise Removal

Runtime error

Agents

Baby Noise Cancellation Demo

👶

AI-powered baby noise removal demo with STT comparison
Running

Agents

171

DeepFilterNet2

💩

171

Denoise your recordings and view spectrograms
Running

Agents

17

DeepFilterNet2 No File Size Limit

😻

17

Use DeepFilterNet2 to denoise audio with no file size limit

Gemini

Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Vibe Coding

Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Claude Code

Running

Agents

Claude Code Slash Commands

⚡

Interactive browser for Claude Code slash commands
Running

Claude Code Linux Desktop Slash Commands

🖥

Slashes for Linux Desktop admin

Project Indexes

Running

AI Project Index

🏆

Navigable index of AI projects, tools, and agents
Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Shakespeare AI

Just-for-fun projects for converting conventional text into Shakespearean English

Paused

Agents

Featured

30

Diffusion GPT

🖊

30

Generate Shakespearean text using a diffusion model
Sleeping

Shakespeare AI

😻

Rewrites .. stuff .. in Shakespearean English

AI UIs

Sleeping

Agents

Pen Pal AI

📮

Write letters and get thoughtful AI replies

Real Time Video To Video

Running on CPU Upgrade

Featured

101

Krea Realtime Video

👁

101

Generate AI videos from webcam, video, or text
krea/krea-realtime-video

Text-to-Video • Updated Nov 14, 2025 • 3.18k • 280

Voice Enhancement

Running

Agents

19

BroadcastAudioUpscaling

🌖

19

Enhance broadcast audio with super‑resolution upscaling
Running on Zero

Agents

55

Apollo

💻

55

Restore and enhance audio files with AI models

Multimodal AI

Sleeping

Agents

Multimodal AI Taxonomy

🌍
danielrosehill/multimodal-ai-taxonomy

Viewer • Updated Oct 22, 2025 • 85 • 76
Running

Gemini Vibe Coded POCs

🚀

Explore AI tools for productivity, creativity, analysis, and utilities

Taxonomies

Sleeping

Agents

Multimodal AI Taxonomy

🌍
danielrosehill/multimodal-ai-taxonomy

Viewer • Updated Oct 22, 2025 • 85 • 76

Context Utilities

Experimental workflows (mostly) that aim to rethink how contextual data can be gathered to achieve personalised inference in AI systems

Sleeping

2

AI Context Generation Interviews

⚡

2

Model workflow using agents to proactively develop context
Sleeping

Agents

Context Cruncher

🎙

Transform voice recordings into structured AI context data

Veo 3.1

Running

Agents

591

veo3.1-fast

🐨

591

Generate videos from text prompts or images

Avatar Videos

Running on Zero

MCP

Featured

604

LatentSync

👄

604

Audio Conditioned LipSync with Latent Diffusion Models
Build error

Agents

Featured

1.43k

SadTalker

😭

1.43k

Generate a talking face video from an image and audio
Running

Agents

181

Gradio Lipsync Wav2lip

👄

181

Create lip‑synced videos from a face image and audio
Running

Agents

70

Wav2lip Gpu

🌍

70

Create a talking‑head video from a photo and audio

Architecture

benlehrburger/modern-architecture

Viewer • Updated May 31, 2023 • 1.09k • 262 • 4
Sleeping

Agents

2

ArchitectureClassifier

📈

2

Classify architectural styles in images
Running

Agents

17

Rocco Architecture Render

🚀

17

Generate interior and exterior designs from sketches
Sleeping

Agents

1

London Architecture

💻

1

Classify architectural styles in images

Resume Utilities

Runtime error

2

ATS Resume Checker

🌖

2

Parse resumes and match with job descriptions
Running

Agents

2

Resume Matcher

🚀

2

Match resume with job description

Style Transfer

Running

367

SD Artists Browser

🤘

367

Explore artist styles and build SDXL prompts
Running on Zero

MCP

65

StyleAligned Transfer

🐠

65

Generate images in the style of a reference image
Running

Agents

17

StyleFeatureEditor

💻

17

Edit images with predefined styles or text prompts
Runtime error

Agents

12

Kontext Style LoRAs

🌍

12

Transform images using selected styles

Multi LLM Experiments

Running on CPU Upgrade

58

Outsmart

🧠

58

Watch LLMs negotiate in a real‑time battle arena

Geolocation Utilities

Running

Agents

10

Location Predictor

🌍

10

Identify image location on a map

Leaderboards

Running

22

AudioBench Leaderboard

🥇

22

View and compare audio model performance rankings
Running

110

AI Phone Leaderboard

📱

110

AI Phone Leaderboard
Running

Agents

356

VBench Leaderboard

📊

356

Submit video model evaluation results to a public benchmark

Image Generation

Running on Zero

Agents

313

Sketch2lineart

🚀

313

Generate lineart images from your photos

Object Detection

Running

Agents

Featured

101

Yolov10

📉

101

Detect objects in images with customizable YOLOv10 models

Security Tools

Running

Agents

25

GLiNER-Multi-PII

💻

25

Extract personally identifiable information from text

Text Processing Utilities

Running

Agents

25

GLiNER-Multi-PII

💻

25

Extract personally identifiable information from text
Running

4

MarkItDown Microsoft

🐠

4

Extract Text in Structured Format

Developer Utilities

Running on Zero

Agents

Featured

925

Screenshot to HTML

⚡

925

Generate HTML code from a website screenshot

Data Visualization

Running

3

neulab/conala

🗺

3

Display interactive data visualizations

Medical

Runtime error

Agents

3

Pharmacology Knowledge Graph

💊

3

Explore drug interactions and effects using AI predictions
Running

Agents

67

Medical Diagnosis

📉

67

Classify symptoms to diagnose health issues
Running

25

MediAI Medical AI Agent

🚀

25

AI-Powered Diagnosis & Treatment Assistant
Sleeping

Agents

Lisdexamfetamine Split Dose Modeller

🚀

Model split-dose protocols for lisdexamfetamine/Vyvanse

Hugging Face Utilities

Sleeping

Agents

7

HF Downloader

📈

7

Download Hugging Face repositories to run locally.✔

CAD Utilities

Running on CPU Upgrade

Agents

Featured

136

SGS 1

🚀

136

Generate 3D CAD models from images

Scraping

Running

MCP

104

Web Scraper

🚀

104

Scrape a website and download its content as markdown

Subtitle generation

Running

Agents

11

Whisper WebUI

🚀

11

Generate subtitles from audio or video files

Video editing utilities

Runtime error

Agents

31

LaMa Video Watermark Remover

🌖

31

Remove watermarks from videos
Running

Agents

9

AudioDenoiser

👁

9

Remove background noise from videos
Paused

11

Image Video Colorization

🎥

11

Colorize black and white videos into full color

Game creation

Running

Featured

235

3D Game Maker

🏢

235

create games with AI

Audio editing utilities

Running

Agents

322

Remove Silence From Audio

🦀

322

Remove Silence From Audio
Running on Zero

Agents

386

Audio🔹Separator

🏃

386

Vocal and background audio separator
Runtime error

Agents

Featured

327

Audio Editing

🎧

327

Edit audios with text prompts
Paused

Agents

472

Resemble Enhance

🚀

472

Enhance and denoise your audio files instantly

OSINT

Spaces and models that may have applications in open source intelligence (OSINT)

Running

147

Reverse Face Search

📉

147

Search Face Online
Runtime error

Agents

SATINT Analyst

👁

Professional satellite imagery intelligence analysis
Running

Agents

10

Location Predictor

🌍

10

Identify image location on a map

Prompt engineering

Running

149

Prompt Lab

😻

149

Enhance your prompt into a structured, expert‑level version

Interesting ideas

AI use cases and applications that I found interesting (a repo for myself to explore!)

Running on Zero

Agents

Featured

2.26k

MagicQuill

🪶

2.26k

Edit images with scribble‑based color and edge control
Build error

Agents

20

AutoPR

🚀

20

Generate a Twitter or Xiaohongshu post from a research PDF
Running

147

Reverse Face Search

📉

147

Search Face Online
Runtime error

Agents

16

AI STORYTELLER

🏢

16

Generate a video from a story

Data Processing Utilities

Running

620

MinerU Document Extraction Tools

📚

620

Embedded MinerU document extraction demo
Running

MCP

104

Web Scraper

🚀

104

Scrape a website and download its content as markdown
Running

Agents

102

PDF to Dataset

📄

102

Convert PDFs to a Hugging Face dataset

Background Removal

Running on Zero

MCP

2.86k

Background Removal

🌘

2.86k

Remove backgrounds from images instantly
Running on Zero

Agents

Featured

623

Video Background Removal

📽

623

Remove/Change background of video.
Running on Zero

Agents

948

BRIA RMBG 2.0

🐢

948

remove background from any image

Image To Video

Running on Zero

MCP

Featured

1.51k

LTX Video Fast

🎥

1.51k

ultra-fast video model, LTX 0.9.8 13B distilled

Image captioning

Salesforce/blip-image-captioning-base

Image-to-Text • Updated Feb 3, 2025 • 2.06M • 860

Video To Video

QuantStack/Wan2.2-Animate-14B-GGUF

Video-to-Video • 17B • Updated Sep 20, 2025 • 29.6k • 202

Video Generation Quants

QuantStack/Wan2.2-I2V-A14B-GGUF

Image-to-Video • 14B • Updated Jul 29, 2025 • 217k • 349

TTS With Dialog Support

nari-labs/Dia-1.6B

Text-to-Speech • 2B • Updated Jun 1, 2025 • 8.2k • 2.88k

OCR & Document Processing

rednote-hilab/dots.ocr

Image-Text-to-Text • 3B • Updated Oct 31, 2025 • 236k • 1.31k
Runtime error

15

Ui Rev Doc Model

😻

15

Analysis of data on an invoice
Paused

Agents

Featured

144

Deepdoctection

🏃

144

Convert PDFs and images to structured text and layout data
Running

13

Docsifer

📚

13

Convert documents into clean, LLM-ready Markdown.

Diarisation

pyannote/speaker-diarization-community-1

Automatic Speech Recognition • Updated Sep 29, 2025 • 2.67M • 533

Fast video generation

lightx2v/Wan2.2-Lightning

Text-to-Video • Updated Nov 13, 2025 • 64 • 618
ByteDance/AnimateDiff-Lightning

Text-to-Video • Updated Jan 6, 2025 • 9.87k • 991

Long speech synthesis

microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k

Agentic code generation capable

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

Browser use capable

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

256K Context

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

Code Generation Models

Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.93M • 1.1k

Video Generation

chetwinlow1/Ovi

Image-to-Video • 12B • Updated Nov 15, 2025 • 292 • 298
lightx2v/Wan2.2-Lightning

Text-to-Video • Updated Nov 13, 2025 • 64 • 618
Wan-AI/Wan2.2-T2V-A14B

Text-to-Video • Updated Aug 7, 2025 • 3.91k • 499
QuantStack/Wan2.2-I2V-A14B-GGUF

Image-to-Video • 14B • Updated Jul 29, 2025 • 217k • 349

General LLM Quants

ai21labs/AI21-Jamba-Mini-1.7

52B • Updated Feb 2 • 70 • 42

Reasoning Models

ai21labs/AI21-Jamba-Reasoning-3B

Text Generation • 3B • Updated Oct 8, 2025 • 1.38k • 137
LLM360/K2-Think

Text Generation • 33B • Updated Nov 19, 2025 • 110 • 366
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF

Text Generation • 8B • Updated Jun 16, 2025 • 56.2k • 414

Embedding Models

google/embeddinggemma-300m

Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.67M • 1.71k
nvidia/omni-embed-nemotron-3b

Sentence Similarity • 5B • Updated May 6 • 12.4k • 125
Qwen/Qwen3-Embedding-0.6B

Feature Extraction • 0.6B • Updated Apr 20 • 8.59M • 1.06k

Instructional LLMs

LLMs optimised for instruction following rather than conversational use - quants and original models

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.89M • 6.06k
Qwen/Qwen3-Omni-30B-A3B-Instruct

Any-to-Any • 35B • Updated Sep 22, 2025 • 1.57M • 936
moonshotai/Kimi-K2-Instruct-0905

Text Generation • 1T • Updated Jan 30 • 2.02M • 722
mistralai/Mistral-7B-Instruct-v0.3

7B • Updated Dec 3, 2025 • 3.28M • 2.63k

Image Generation Models

tencent/HunyuanImage-3.0

Text-to-Image • 83B • Updated Jan 28 • 1.05M • 1.09k
black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16, 2024 • 323k • 5.09k
stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 1.39M • 7.81k
Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 173k • 2.51k

Deep research

FractalAIResearch/Fathom-Search-4B

Text Generation • 4B • Updated Oct 10, 2025 • 23 • 121
Running

Agents

25

Fathom DeepResearch

📊

25

DeepResearch with the fathom search and synthesizer models

Image Generation Quants

Quantized versions of image generation models

lightx2v/Qwen-Image-Lightning

Text-to-Image • Updated Nov 3, 2025 • 422k • 802

Mobile LLMs

LLMs optimised for running "on-device" (specifically, for this collection, on smartphones with standard inference capabilities)

facebook/MobileLLM-Pro

Text Generation • 1B • Updated Nov 11, 2025 • 4 • 162

LLMs

zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k

Agentic LLMs

zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k
AgentFlow/agentflow-planner-7b

8B • Updated Oct 12, 2025 • 1.37k • 63
HuggingFaceH4/zephyr-7b-beta

Text Generation • 7B • Updated Oct 16, 2024 • 147k • 1.85k
zai-org/GLM-4.5-Air

Text Generation • 110B • Updated Aug 11, 2025 • 320k • 608

Voice Cloning

neuphonic/neutts-air

Text-to-Speech • 0.7B • Updated Feb 12 • 18.8k • 874
coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 8.84M • 3.59k
Running on Zero

Agents

83

Voice Cloning Studio

🚀

83

This space offers an easy-to-use interface for voice cloning

Local model collection

LLMs (mostly quants) that are small enough to run locally (on my hardware). My go-tos.

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 7.6M • 951
zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 13.6k • 1.23k
Qwen/Qwen3-VL-8B-Thinking

Image-Text-to-Text • 9B • Updated Nov 26, 2025 • 254k • 210
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.89M • 6.06k

Vision Language Models

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 7.6M • 951
moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Apr 9 • 276k • 650

My Image Datasets

danielrosehill/Hebrew-Language-Signage

Viewer • Updated Nov 6, 2025 • 68 • 22
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

Audio Datasets

danielrosehill/ai-generated-podcast-episodes

Viewer • Updated Aug 26, 2025 • 66 • 3
danielrosehill/Small-STT-Eval-Audio-Dataset

Viewer • Updated Dec 10, 2025 • 92 • 39

Datasets

danielrosehill/Zapier-Integrations-260825

Viewer • Updated Aug 26, 2025 • 8.64k • 7

Context Data

danielrosehill/Software-Wish-List-Context-Data

Updated Apr 28, 2025 • 16

Text Transformation

danielrosehill/Shakespearean-Text-Transformation-Prompts

Viewer • Updated Apr 21, 2025 • 1 • 26
danielrosehill/Speech-To-Text-System-Prompts-2

Viewer • Updated Apr 9, 2025 • 2 • 35 • 1
Sleeping

Agents

System Prompt Reformatter

📚

Reformats system prompts into the second person and applies other edits
Sleeping

Agents

BLUF Email Formatter

📧

Format emails with clear subject lines and summaries

Experiments

danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
Sleeping

Agents

AI Agent UN - Multi-Agent Simulation Framework

🏛

Explore UN voting simulations with AI agents
Sleeping

Max Output Tokens Analysis

📊

Display max output tokens for models over time

Evaluations

danielrosehill/ChatGPT-AI-Vs-API

Updated Apr 22, 2025 • 2
danielrosehill/Long-Prompt-Experiment

Viewer • Updated Aug 19, 2025 • 92 • 63
danielrosehill/STT-Voice-Notes-Evals

Updated Aug 11, 2025 • 11

AI Agents

AI agent network configs and projects

danielrosehill/Code-Gen-Agents-0925

Updated Sep 15, 2025 • 56
Sleeping

Agents

AI Agent UN - Multi-Agent Simulation Framework

🏛

Explore UN voting simulations with AI agents
Sleeping

Agents

Code Gen Agents Network

🔥

Code generation agent network with config navigator

Jerusalem

danielrosehill/Jerusalem-Emergency-Shelters-0925

Viewer • Updated Sep 19, 2025 • 149 • 13
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

Israel

Projects related to Israel

danielrosehill/Israel-Alerting-Zones

Preview • Updated May 9, 2025 • 17
danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-Streetscapes

Viewer • Updated Nov 6, 2025 • 413 • 19
danielrosehill/Jerusalem-Emergency-Shelters-0925

Viewer • Updated Sep 19, 2025 • 149 • 13

ISO Standards

ISO standard related projects

danielrosehill/ISO-3166-4217-Consolidated

Preview • Updated Sep 3, 2025 • 9

Reference / Lookup Datasets

danielrosehill/ISO-3166-4217-Consolidated

Preview • Updated Sep 3, 2025 • 9

Sustainability Projects

Datasets and projects related to sustainability, especially impact investing

danielrosehill/GHG-Emissions-Data

Viewer • Updated Dec 20, 2024 • 78 • 461
danielrosehill/Global-Value-Factor-Database-Refactor-V2

Updated Sep 2, 2025 • 149
danielrosehill/ifvi_valuefactors_deriv

Updated Aug 21, 2025 • 296
danielrosehill/pay-for-outcomes-instruments

Preview • Updated May 21, 2025 • 6

Voice Note Audio And Training

Datasets for STT training, classification fine-tuning - especially voice notes

danielrosehill/Voice-Note-Audio

Preview • Updated Oct 27, 2025 • 201

Character Creation Datasets

Datasets for generating characters from 2D/3D

danielrosehill/Corn-The-Sloth

Viewer • Updated Apr 18, 2025 • 109 • 106

My System Prompt Collections

Collections of system prompts for assistants, agents, and workflows

danielrosehill/Data-Utils-System-Prompts

Updated Apr 9, 2025 • 23
danielrosehill/Email-Management-System-Prompts

Updated Apr 9, 2025 • 20
danielrosehill/General-Purpose-System-Prompts

Updated Apr 9, 2025 • 27 • 1
danielrosehill/Geopolitical-System-Prompts

Updated Apr 9, 2025 • 29

Israel Photo Galleries

Photo galleries of places in Israel for world generation use cases, among others

danielrosehill/Tel-Aviv-Pics

Updated Nov 6, 2025 • 22
danielrosehill/Jerusalem-High-Rise-Development

Viewer • Updated Nov 6, 2025 • 56 • 10

3D Human Digital Humans

Running on Zero

Agents

191

PSHuman

🏃

191

PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
Runtime error

Agents

11

Pifuhd

🐠

11

Generate 3D human models from images
Running on Zero

Agents

10

HumanWild

⚡

10

Generate 3D human reconstructions from images
Runtime error

51

HSMR

💀

51

Convert images of humans to biomechanically accurate 3D skeletons

3D-General

Paused

Agents

Featured

2.14k

Hunyuan3D-2.1

👻

2.14k

Image-to-3D Generation
Running on Zero

MCP

631

TRELLIS

🏢

631

Scalable and Versatile 3D Generation from images
Running on Zero

Agents

3.31k

Hunyuan3D-2.0

🌍

3.31k

Text-to-3D and Image-to-3D Generation

Image To Image

Running on Zero

Agents

Featured

1.64k

Expression Editor

🐨

1.64k

Quickly edit the expression of a face
Running on Zero

Agents

Featured

1.54k

InstructPix2Pix

🚀

1.54k

Edit images using text instructions
Qwen/Qwen-Image-Edit-2509

Image-to-Image • Updated Sep 22, 2025 • 355k • 1.16k
Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 173k • 2.51k

QR Art

Running on Zero

Agents

Featured

1.99k

QR Code AI Art Generator

📱

1.99k

Blend QR codes with AI art

Generative-AI-Favorites

Paused

Agents

Featured

5.12k

Wan2.2 Animate

👁

5.12k

Wan2.2 Animate

Upscalers

Running on Zero

Agents

2.13k

Finegrain Image Enhancer

🖼

2.13k

Clarity AI Upscaler Reproduction
Running on Zero

Agents

154

RealESRGAN Pytorch

🔥

154

User Friendly Image & Video Upscaler!

Single Shot Image To Image (Reference)

Running on Zero

Agents

Featured

1.94k

PhotoMaker

📷

1.94k

Generate personalized photos of a person from a prompt

Speech To Text (STT)

Running on Zero

Agents

Featured

2.77k

Whisper

📉

2.77k

Transcribe audio files into text instantly
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 5.05M • 5.81k
ggerganov/whisper.cpp

Automatic Speech Recognition • Updated Oct 29, 2024 • 1.45k

Text To Speech (TTS)

Runtime error

Agents

Featured

2.77k

XTTS

🐸

2.77k

Generate speech from text using a reference voice
microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 109k • 2.39k
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.4M • 6.31k

Image To Video (No Audio)

Running on Zero

Agents

3.75k

Live Portrait

🤪

3.75k

Apply the motion of a video on a portrait
Paused

Agents

Featured

5.12k

Wan2.2 Animate

👁

5.12k

Wan2.2 Animate
Running on Zero

MCP

Featured

2.02k

Stable Video Diffusion 1.1

📺

2.02k

Generate a short video from a single image
Running on Zero

MCP

Featured

1.61k

Wan2.1 Fast

🎥

1.61k

Animate a still image into a short video using a prompt

Music Generation

Running on Zero

Agents

Featured

5.07k

MusicGen

🎵

5.07k

Generate music from a text description and optional melody
suno/bark

Text-to-Speech • Updated Oct 4, 2023 • 18k • 1.53k
Running on L40S

Featured

744

Song Generation

🎵

744

Generate a song from your lyrics and prompts

Image Editing Utilities

Running on Zero

MCP

2.86k

Background Removal

🌘

2.86k

Remove backgrounds from images instantly
Running on Zero

Agents

2.97k

CLIP Interrogator

🕵

2.97k

Generate detailed prompts from any image
Paused

Agents

441

NoWatermark

⚡

441

Powerful Watermark Removal API
Running

Agents

138

Vectorizer AI

🌍

138

Convert images to SVG vectors with customizable settings

Character-Generation

Running on Zero

Agents

191

PSHuman

🏃

191

PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
Runtime error

MCP

11

3D Game Environment Builder

📚

11

3D Game Environment Builder MCP

Global Value Factor Database (GVFD) - Visualisation And Data

Version controlled refactors for data analysis of the Global Value Factor Database (GVFD) by the International Foundation for Valuing Impacts (IFVI).

Sleeping

Agents

GVFD Navigator

📉

Data visualisation utility for GVFD by IFVI (unofficial)
danielrosehill/Global-Value-Factor-Database-Refactor-V2

Updated Sep 2, 2025 • 149
danielrosehill/ifvi_valuefactors_deriv

Updated Aug 21, 2025 • 296

AI Experiments

Experiments in AI and prompt engineering

Sleeping

Max Output Tokens Analysis

📊

Display max output tokens for models over time
Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs
Running

Single Shot Brevity Training

📈

Using one example to train an LLM for informational brevity
Sleeping

Agents

Local STT Eval One Sample

😻

Single sample eval for WER on various Whisper models

Text Reformatting Apps

Implementations of apps for simple text reformatting tasks

Sleeping

Agents

BLUF Email Formatter

📧

Format emails with clear subject lines and summaries
Runtime error

Agents

System Prompt Depersonaliser

😻

Converts personal system prompts for general use
Sleeping

Shakespeare AI

😻

Rewrites .. stuff .. in Shakespearean English

Useful-Models

modularai/Llama-3.1-8B-Instruct-GGUF

Text Generation • 8B • Updated Sep 9, 2024 • 4.89k • 17
MaziyarPanahi/WizardLM-2-7B-GGUF

Text Generation • 7B • Updated Apr 15, 2024 • 157k • 83
MaziyarPanahi/mathstral-7B-v0.1-GGUF

Text Generation • 7B • Updated Jul 16, 2024 • 156k • 7
MaziyarPanahi/phi-4-GGUF

Text Generation • 15B • Updated Jan 8, 2025 • 156k • 8

Hebrew AI Spaces

Spaces to do with Hebrew language AI

Running on CPU Upgrade

Agents

46

Hebrew LLM Leaderboard

🥇

46

Explore LLM benchmark leaderboard with searchable filters
Running

Agents

Hebrew GPT Neo - Science Fiction and Fantasy

🧙

Generate Hebrew text for science fiction and fantasy stories
Sleeping

Agents

מחולל נונסנס רובושאול

🤖

Generate fictional Shaul Amsterdamski quotes
Build error

Agents

Hebrew Sentiment

😻

Hebrew datasets

Datasets in Hebrew

imvladikon/hebrew_speech_coursera

Viewer • Updated May 5, 2023 • 21k • 275 • 10
imvladikon/hebrew_speech_kan

Viewer • Updated May 5, 2023 • 10k • 197 • 13
sivan22/hebrew-handwritten-dataset

Viewer • Updated May 8, 2023 • 5.09k • 73 • 15
imvladikon/hebrew_speech_campus

Viewer • Updated Nov 20, 2023 • 75.9k • 888 • 6

Benchmarks

Running on CPU Upgrade

Agents

Featured

1.37k

Open ASR Leaderboard

🏆

1.37k

Explore and compare speech recognition model benchmarks
Running

Agents

31

Hebrew Transcription Leaderboard

🥇

31

Benchmarking Hebrew Speech-to-Text Models
Running

Agents

449

Agent Leaderboard

💬

449

Ranking of LLMs for agentic tasks

ASR Models

Collection of ASR models for customized STT model training

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated Apr 13 • 365k • 1.5k
ibm-granite/granite-speech-3.3-8b

Automatic Speech Recognition • 9B • Updated Apr 2 • 54.7k • 171
nvidia/canary-qwen-2.5b

Automatic Speech Recognition • 3B • Updated Apr 21 • 39.5k • 437
facebook/omniASR-W2V-1B

Automatic Speech Recognition • Updated Nov 27, 2025 • 6

Vintage-LLMs

openai-community/gpt2

Text Generation • 0.1B • Updated Feb 19, 2024 • 13.1M • 3.3k

Fav-Code-Generation-Models

Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • 33B • Updated Jan 12, 2025 • 1.54M • 2.04k
deepseek-ai/DeepSeek-V2.5-1210

Text Generation • 236B • Updated Dec 11, 2024 • 614 • 257

Hebrew Large Language Models

Language models supporting Hebrew (predominantly for the text generation task). These are almost exclusively fine-tunes of Gemma, Mistral, or Qwen models.

GiliGold/Knesset-DictaBERT

Fill-Mask • 0.2B • Updated Dec 28, 2024 • 56 • 2
yam-peleg/Hebrew-Gemma-11B

Text Generation • 10B • Updated Mar 16, 2024 • 7 • 38
yam-peleg/Hebrew-Mistral-7B

Text Generation • 8B • Updated Apr 26, 2024 • 1k • 73
yam-peleg/Hebrew-Mistral-7B-200K

Text Generation • 8B • Updated May 6, 2024 • 297 • 15

LLM-Experiments

Running

Agents

1

LLM Long Output Experiment (Code Generation)

📈

1

Evaluating max single output length of code gen LLMs
GiliGold/Knesset-DictaBERT

Fill-Mask • 0.2B • Updated Dec 28, 2024 • 56 • 2

