Melvin Vivas
- Omni Video Factory 🏆 (Running on Zero · Agents · Featured · 1.17k): text to video, image to video, video extend
- Wan2.2 14B Preview 🐌 (Running on Zero · MCP · 2.71k): generate a video from an image with a text prompt
- Wan2.2 14B Fast 🎥 (Running on Zero · MCP · Featured · 3.17k): generate a video from an image with a text prompt
- LTX 2.3 Sync 🕺 (Running on Zero · Agents · Featured · 167): Portrait animation & lipsync with LTX 2.3
- Qwen-3-VL-8B OCR Receipts 🚀 (Sleeping · 1): structured data parser from receipt images
- Qwen3 Omni Demo ⚡ (Running · Agents · Featured · 264): Chat with AI using text, audio, images, or video
- VLM Object Understanding 🦀 (Running on Zero · Agents · Featured · 115): Explore object detection, visual grounding, keypoint Detecti
- Dataset Card Drafter 😻 (Sleeping · Agents · 2): Create dataset descriptions and open PRs automatically
- VibeVoice-Realtime-0.5B 🐨 (Running on Zero · Agents · Featured · 185): Generate natural speech from text with selectable voices
- microsoft/VibeVoice-1.5B: Text-to-Speech • 3B • 56.8k • 2.39k
- Qwen3 TTS Demo 🚀 (Running · Agents · Featured · 403): Generate spoken audio from your text in many voices
- mradermacher/Qwen3-1.7B-Multilingual-TTS-GGUF: 2B • 2.1k • 12
- BRIA RMBG 2.0 🐢 (Running on Zero · Agents · 947): remove background from any image
- Omni Image Editor 🖼 (Running on CPU Upgrade · Agents · 1.85k): Image edit, text to image, image upscale, remove watermark
- Qwen-Image-Edit-2511-LoRAs-Fast 🎃 (Running on Zero · MCP · Featured · 1.61k): Demo of the Collection of Qwen Image Edit LoRAs
- Open VLM Leaderboard 🌎 (Running on CPU Upgrade · Agents · 1.02k): VLMEvalKit Evaluation Results Collection
- DeepSeek OCR 2 Demo 🚀 (Running on Zero · Agents · Featured · 470): Try out DeepSeek-OCR-2 on your PDFs or images
- Multimodal OCR3 🌖 (Running on Zero · MCP · 69): Chandra-OCR / Nanonets-OCR2 / olmOCR-2 / Dots.OCR
- Qwen/Qwen3-VL-30B-A3B-Instruct: Image-Text-to-Text • 31B • 991k • 576
- openai/whisper-large-v3: Automatic Speech Recognition • 2B • 5.36M • 5.78k
- openai/whisper-large-v3-turbo: Automatic Speech Recognition • 0.8B • 8.62M • 3.06k
- Whisper Large V3 🤫 (Running on Zero · MCP · Featured · 846): Transcribe audio or YouTube videos to text
- Kugel Audio 👀 (Running on Zero · Agents · Featured · 90): Generate natural-sounding speech in European languages with voice cloning