Daniel Rosehill PRO
- zai-org/GLM-ASR-Nano-2512
  Automatic Speech Recognition • 2B • 118k • 370
- nvidia/canary-qwen-2.5b
  Automatic Speech Recognition • 3B • 39.5k • 437
- microsoft/Phi-4-multimodal-instruct
  Automatic Speech Recognition • 6B • 481k • 1.6k
- facebook/omniASR-LLM-7B
  Automatic Speech Recognition • 32
- futo-org/acft-whisper-tiny
  Automatic Speech Recognition • 57.7M • 3 • 1
- futo-org/acft-whisper-small.en
  Automatic Speech Recognition • 0.3B • 6 • 2
- futo-org/acft-whisper-base.en
  Automatic Speech Recognition • 99.1M • 4 • 2
- futo-org/acft-whisper-tiny.en
  Automatic Speech Recognition • 57.7M • 3 • 1
- openai/whisper-base
  Automatic Speech Recognition • 72.6M • 4.36M • 271
- openai/whisper-base.en
  Automatic Speech Recognition • 72.6M • 22.4k • 43
- onnx-community/whisper-base_timestamped
  Automatic Speech Recognition • 3.53k • 32
- Systran/faster-whisper-base
  Automatic Speech Recognition • 1.33M • 28
- Baby Noise Cancellation Demo
  Runtime error • Agents
  👶 AI-powered baby noise removal demo with STT comparison
- DeepFilterNet2
  Running • Agents • 171
  💩 Denoise your recordings and view spectrograms
- DeepFilterNet2 No File Size Limit
  Running • Agents • 17
  😻 Use DeepFilterNet2 to denoise audio with no file size limit
- benlehrburger/modern-architecture
  Viewer • 1.09k • 262 • 4
- ArchitectureClassifier
  Sleeping • Agents • 2
  📈 Classify architectural styles in images
- Rocco Architecture Render
  Running • Agents • 17
  🚀 Generate interior and exterior designs from sketches
- London Architecture
  Sleeping • Agents • 1
  💻 Classify architectural styles in images
- SD Artists Browser
  Running • 367
  🤘 Explore artist styles and build SDXL prompts
- StyleAligned Transfer
  Running on Zero • MCP • 65
  🐠 Generate images in the style of a reference image
- StyleFeatureEditor
  Running • Agents • 17
  💻 Edit images with predefined styles or text prompts
- Kontext Style LoRAs
  Runtime error • Agents • 12
  🌍 Transform images using selected styles
- Pharmacology Knowledge Graph
  Runtime error • Agents • 3
  💊 Explore drug interactions and effects using AI predictions
- Medical Diagnosis
  Running • Agents • 67
  📉 Classify symptoms to diagnose health issues
- MediAI Medical AI Agent
  Running • 25
  🚀 AI-Powered Diagnosis & Treatment Assistant
- Lisdexamfetamine Split Dose Modeller
  Sleeping • Agents
  🚀 Model split-dose protocols for lisdexamfetamine/Vyvanse
- MagicQuill
  Running on Zero • Agents • Featured • 2.26k
  🪶 Edit images with scribble-based color and edge control
- AutoPR
  Build error • Agents • 20
  🚀 Generate a Twitter or Xiaohongshu post from a research PDF
- Reverse Face Search
  Running • 147
  📉 Search Face Online
- AI STORYTELLER
  Runtime error • Agents • 16
  🏢 Generate a video from a story
- rednote-hilab/dots.ocr
  Image-Text-to-Text • 3B • 236k • 1.31k
- Ui Rev Doc Model
  Runtime error • 15
  😻 Analysis of data on an invoice
- Deepdoctection
  Paused • Agents • Featured • 144
  🏃 Convert PDFs and images to structured text and layout data
- Docsifer
  Running • 13
  📚 Convert documents into clean, LLM-ready Markdown.
- danielrosehill/Shakespearean-Text-Transformation-Prompts
  Viewer • 1 • 26
- danielrosehill/Speech-To-Text-System-Prompts-2
  Viewer • 2 • 35 • 1
- System Prompt Reformatter
  Sleeping • Agents
  📚 Reformats system prompts in the 2nd person and other edits
- BLUF Email Formatter
  Sleeping • Agents
  📧 Format emails with clear subject lines and summaries
- Live Portrait
  Running on Zero • Agents • 3.75k
  🤪 Apply the motion of a video on a portrait
- Wan2.2 Animate
  Paused • Agents • Featured • 5.12k
  👁 Wan2.2 Animate
- Stable Video Diffusion 1.1
  Running on Zero • MCP • Featured • 2.02k
  📺 Generate a short video from a single image
- Wan2.1 Fast
  Running on Zero • MCP • Featured • 1.61k
  🎥 Animate a still image into a short video using a prompt
- Background Removal
  Running on Zero • MCP • 2.86k
  🌘 Remove backgrounds from images instantly
- CLIP Interrogator
  Running on Zero • Agents • 2.97k
  🕵 Generate detailed prompts from any image
- NoWatermark
  Paused • Agents • 441
  ⚡ Powerful Watermark Removal API
- Vectorizer AI
  Running • Agents • 138
  🌍 Convert images to SVG vectors with customizable settings
- Hebrew LLM Leaderboard
  Running on CPU Upgrade • Agents • 46
  🥇 Explore LLM benchmark leaderboard with searchable filters
- Hebrew GPT Neo - Science Fiction and Fantasy
  Running • Agents
  🧙 Generate Hebrew text for science fiction and fantasy stories
- מחולל נונסנס רובושאול (RoboShaul Nonsense Generator)
  Sleeping • Agents
  🤖 Generate fictitious Shaul Amsterdamski quotes
- Hebrew Sentiment
  Build error • Agents
  😻
- Open ASR Leaderboard
  Running on CPU Upgrade • Agents • Featured • 1.37k
  🏆 Explore and compare speech recognition model benchmarks
- Hebrew Transcription Leaderboard
  Running • Agents • 31
  🥇 Benchmarking Hebrew Speech-to-Text Models
- Agent Leaderboard
  Running • Agents • 449
  💬 Ranking of LLMs for agentic tasks
- imvladikon/wav2vec2-large-xlsr-53-hebrew
  Automatic Speech Recognition • 0.3B • 446 • 7
- Mizurodp/wav2vec2-large-xls-r-300m-hebrew-colab
  Automatic Speech Recognition • 37 • 1
- imvladikon/wav2vec2-xls-r-300m-lm-hebrew
  Automatic Speech Recognition • 0.3B • 78 • 4
- imvladikon/wav2vec2-xls-r-1b-hebrew
  Automatic Speech Recognition • 1.0B • 14 • 2
- danielrosehill/daniel_whisper_finetune_large_v3_turbo_v2
  Automatic Speech Recognition • 0.8B • 3
- danielrosehill/daniel_whisper_finetune_medium_v2
  Automatic Speech Recognition • 0.8B • 4
- danielrosehill/daniel_whisper_finetune_tiny_v2
  Automatic Speech Recognition • 37.8M • 1
- danielrosehill/daniel_whisper_finetune_base_v2
  Automatic Speech Recognition • 72.6M • 5
- nvidia/parakeet-tdt-0.6b-v3
  Automatic Speech Recognition • 0.6B • 106k • 918
- ibm-granite/granite-speech-3.3-8b
  Automatic Speech Recognition • 9B • 54.7k • 171
- facebook/seamless-m4t-v2-large
  Automatic Speech Recognition • 2B • 427k • 985
- facebook/wav2vec2-base-960h
  Automatic Speech Recognition • 94.4M • 1.08M • 398
- danielrosehill/Podcast-ASR-Evaluation
  Viewer • 27 • 15
- danielrosehill/Long-Prompt-Experiment
  Viewer • 92 • 63
- Podcast ASR Evaluation
  Sleeping • Agents
  🎙 ASR benchmark comparing local and cloud models
- LLM Long Output Experiment (Code Generation)
  Running • Agents • 1
  📈 Evaluating max single output length of code gen LLMs
- pyannote/voice-activity-detection
  Automatic Speech Recognition • 2.63M • 235
- pyannote/speaker-diarization-3.1
  Automatic Speech Recognition • 8.18M • 2.28k
- pyannote/overlapped-speech-detection
  Automatic Speech Recognition • 92.2k • 58
- pipecat-ai/smart-turn-v3
  Voice Activity Detection • 168
- unsloth/whisper-large-v3
  Automatic Speech Recognition • 2B • 3.84k • 16
- unsloth/whisper-small
  Automatic Speech Recognition • 0.2B • 637 • 6
- unsloth/CrisperWhisper
  Automatic Speech Recognition • 2B • 45 • 16
- unsloth/whisper-large-v3-turbo
  Automatic Speech Recognition • 0.8B • 1.93k • 10
- Local STT Eval One Sample
  Sleeping • Agents
  😻 Single sample eval for WER on various Whisper models
- danielrosehill/Podcast-ASR-Evaluation
  Viewer • 27 • 15
- Podcast ASR Evaluation
  Sleeping • Agents
  🎙 ASR benchmark comparing local and cloud models
- STT Comparison
  Running
  🦀 Comparing STT models against audio
- LatentSync
  Running on Zero • MCP • Featured • 604
  👄 Audio Conditioned LipSync with Latent Diffusion Models
- SadTalker
  Build error • Agents • Featured • 1.43k
  😭 Generate a talking face video from an image and audio
- Gradio Lipsync Wav2lip
  Running • Agents • 181
  👄 Create lip-synced videos from a face image and audio
- Wav2lip Gpu
  Running • Agents • 70
  🌍 Create a talking-head video from a photo and audio
- Remove Silence From Audio
  Running • Agents • 322
  🦀 Remove Silence From Audio
- Audio🔹Separator
  Running on Zero • Agents • 386
  🏃 Vocal and background audio separator
- Audio Editing
  Runtime error • Agents • Featured • 327
  🎧 Edit audio with text prompts
- Resemble Enhance
  Paused • Agents • 472
  🚀 Enhance and denoise your audio files instantly
- PSHuman
  Running on Zero • Agents • 191
  🏃 PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
- Pifuhd
  Runtime error • Agents • 11
  🐠 Generate 3D human models from images
- HumanWild
  Running on Zero • Agents • 10
  ⚡ Generate 3D human reconstructions from images
- HSMR
  Runtime error • 51
  💀 Convert images of humans to biomechanically accurate 3D skeletons
- Expression Editor
  Running on Zero • Agents • Featured • 1.64k
  🐨 Quickly edit the expression of a face
- InstructPix2Pix
  Running on Zero • Agents • Featured • 1.54k
  🚀 Edit images using text instructions
- Qwen/Qwen-Image-Edit-2509
  Image-to-Image • 355k • 1.16k
- Qwen/Qwen-Image
  Text-to-Image • 173k • 2.51k
- Max Output Tokens Analysis
  Sleeping
  📊 Display max output tokens for models over time
- LLM Long Output Experiment (Code Generation)
  Running • Agents • 1
  📈 Evaluating max single output length of code gen LLMs
- Single Shot Brevity Training
  Running
  📈 Using one example to train an LLM for informational brevity
- Local STT Eval One Sample
  Sleeping • Agents
  😻 Single sample eval for WER on various Whisper models
- modularai/Llama-3.1-8B-Instruct-GGUF
  Text Generation • 8B • 4.89k • 17
- MaziyarPanahi/WizardLM-2-7B-GGUF
  Text Generation • 7B • 157k • 83
- MaziyarPanahi/mathstral-7B-v0.1-GGUF
  Text Generation • 7B • 156k • 7
- MaziyarPanahi/phi-4-GGUF
  Text Generation • 15B • 156k • 8
- nvidia/parakeet-tdt-0.6b-v2
  Automatic Speech Recognition • 365k • 1.5k
- ibm-granite/granite-speech-3.3-8b
  Automatic Speech Recognition • 9B • 54.7k • 171
- nvidia/canary-qwen-2.5b
  Automatic Speech Recognition • 3B • 39.5k • 437
- facebook/omniASR-W2V-1B
  Automatic Speech Recognition • 6