Model Detail human-review-recommended

CosyVoice 300M

Alibaba's CosyVoice. Multilingual TTS with emotion control.

COSYVOICE · apache-2.0 · 2024-08
Parameters: 0.3B (dense model)
Context window: Standard context
Architecture: decoder-only transformer, TTS
Quality score: 70 (planner signal)

Task Fit

Code Agent: Not a fit

Not marked for code agent in the current library.

Code: Not a fit

Not marked for code in the current library.

Chat: Not a fit

Not marked for chat in the current library.

RAG: Not a fit

Not marked for RAG in the current library.

Vision: Not a fit

Not marked for vision in the current library.

Image Generation: Not a fit

Not marked for image generation in the current library.

Video Generation: Not a fit

Not marked for video generation in the current library.

Voice: Supported

Speech recognition, TTS, or audio workflows.

Source Confidence

Overall: high · 86/100
Parameters: Reviewed / seeded
Task fit: Reviewed / seeded
Memory: Seeded artifact
License: Source / seed
Benchmarks: Missing
Hardware fit: Calculated
Review flags: missing benchmarks

Variants and Quant Artifacts

Choose the artifact first; hardware fit follows from RAM, VRAM, format, and runtime.

2 artifacts
Quant   Format   Quality   Min RAM   Reco RAM   Runtime
FP16    gguf     high      4GB       8GB        ollama, llama.cpp, lm-studio
Q8      gguf     high      4GB       8GB        ollama, llama.cpp, lm-studio
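
The memory part of that fit rule can be sketched as a simple threshold check. This is a minimal illustration, not the planner's actual logic: the `Artifact` class, the `fit_level` function, and the three outcome labels are hypothetical names, and the rule itself (below Min RAM fails, at or above Reco RAM is comfortable) is an assumption. Only the 4GB and 8GB figures come from the table above. Format and runtime support are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One row of the artifact table (class and field names are illustrative)."""
    quant: str
    fmt: str
    min_ram_gb: int
    reco_ram_gb: int


# Values copied from the artifact table above.
ARTIFACTS = [
    Artifact("FP16", "gguf", min_ram_gb=4, reco_ram_gb=8),
    Artifact("Q8", "gguf", min_ram_gb=4, reco_ram_gb=8),
]


def fit_level(artifact: Artifact, available_gb: int) -> str:
    """Classify a machine against an artifact's memory requirements.

    Assumed rule: below Min RAM does not fit, at or above Reco RAM is
    comfortable, and anything in between is a tight fit.
    """
    if available_gb < artifact.min_ram_gb:
        return "not a fit"
    if available_gb < artifact.reco_ram_gb:
        return "tight"
    return "comfortable"


for artifact in ARTIFACTS:
    print(artifact.quant, fit_level(artifact, available_gb=16))
```

With 16GB available, both artifacts clear the recommended figure, which is consistent with the 16GB machine listed as compatible under Recommended Hardware.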

Recommended Hardware

Cheapest That Works
Minisforum UM890 (Ryzen 9 8945HS)
Compatible

Lowest estimated 5-year cost that can run this model.

32GB RAM · $597 / 5y

Best Value
Mac mini M4 16GB
Compatible

Enough unified/system memory with a balanced 5-year cost.

16GB RAM · $670 / 5y

Best Performance
NVIDIA RTX 6000 Blackwell 96GB
Compatible

Highest local performance signal among compatible hardware.

96GB VRAM · $12,376 / 5y
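
The "Cheapest That Works" pick can be reproduced from the figures listed here. The sketch below is an assumption-laden illustration: `Hardware` and `cheapest_that_works` are hypothetical names, compatibility is reduced to a memory threshold (the 4GB Min RAM from the artifact table), and the candidate list contains only the three machines shown above. The criteria behind "Best Value" and "Best Performance" are not specified on this page, so they are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hardware:
    """A hardware recommendation entry (class and field names are illustrative)."""
    name: str
    memory_gb: int     # RAM or VRAM as listed
    cost_5y_usd: int   # estimated 5-year cost as listed


# Entries copied from the recommendations above.
CANDIDATES = [
    Hardware("Minisforum UM890 (Ryzen 9 8945HS)", 32, 597),
    Hardware("Mac mini M4 16GB", 16, 670),
    Hardware("NVIDIA RTX 6000 Blackwell 96GB", 96, 12376),
]

MIN_MEMORY_GB = 4  # Min RAM from the artifact table


def cheapest_that_works(candidates: list[Hardware],
                        min_memory_gb: int) -> Optional[Hardware]:
    """Return the lowest 5-year-cost entry that meets the memory minimum."""
    compatible = [h for h in candidates if h.memory_gb >= min_memory_gb]
    if not compatible:
        return None
    return min(compatible, key=lambda h: h.cost_5y_usd)


pick = cheapest_that_works(CANDIDATES, MIN_MEMORY_GB)
print(pick.name if pick else "no compatible hardware")
```

Because all three machines exceed the 4GB minimum, the choice reduces to the lowest 5-year cost, which is the $597 entry.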

Benchmarks

No benchmark data is available for this model yet.

Source and Review

Hugging Face: FunAudioLLM/CosyVoice-300M
Ollama: cosyvoice:300m
Verification: human-review-recommended
Artifact source: seeded
Default variant: CosyVoice 300M Coder
Tool calling: Not marked

Similar Models