# Chatterbox Turbo - WebGPU Compatible
This is a WebGPU-compatible version of ResembleAI/chatterbox-turbo-ONNX.
## Changes from Original
The original model contains int64 Cast operations and int64 tensors, which WebGPU cannot execute.
This version converts all int64 operations and tensors to int32, enabling direct WebGPU inference; a simplified sketch of the rewrite follows the list below.
### Modifications Made
- conditional_decoder: 521 Cast nodes inserted (376 Shape/Range ops)
- speech_encoder: 350 Cast nodes inserted (243 Shape/Range ops)
- language_model: 3 Cast nodes inserted
- embed_tokens: 1 Cast node inserted
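For reference, this kind of rewrite boils down to three steps: downcast int64 initializers, retarget Cast nodes that emit int64, and insert Cast-to-int32 nodes after ops such as Shape and Range that produce int64 by spec. The sketch below illustrates the idea with the `onnx` Python API; it is **not** the actual `convert_int64_to_int32.py` script, the file names are placeholders, and it omits details a real conversion must handle (int64 graph inputs/outputs, Constant node attributes, values that overflow int32).

```python
# Illustrative sketch only (not the local.core script): an int64 -> int32
# rewrite of an ONNX graph using the onnx Python API.
import onnx
from onnx import TensorProto, helper, numpy_helper


def downcast_int64_to_int32(in_path: str, out_path: str) -> None:
    model = onnx.load(in_path)
    graph = model.graph

    # 1. Downcast int64 initializers (weights/constants) to int32.
    for init in graph.initializer:
        if init.data_type == TensorProto.INT64:
            arr = numpy_helper.to_array(init).astype("int32")
            init.CopyFrom(numpy_helper.from_array(arr, init.name))

    # 2. Retarget existing Cast nodes so anything cast to int64 becomes int32.
    for node in graph.node:
        if node.op_type == "Cast":
            for attr in node.attribute:
                if attr.name == "to" and attr.i == TensorProto.INT64:
                    attr.i = TensorProto.INT32

    # 3. Shape and Range emit int64 per the ONNX spec, so route their outputs
    #    through a new Cast-to-int32 node: rewire every consumer to the casted
    #    tensor, then insert the Cast right after its producer so the node
    #    list stays topologically sorted.
    to_cast = {n.output[0]: n.output[0] + "_i32"
               for n in graph.node if n.op_type in ("Shape", "Range")}
    for node in graph.node:
        for i, name in enumerate(node.input):
            if name in to_cast:
                node.input[i] = to_cast[name]
    for idx in reversed(range(len(graph.node))):
        node = graph.node[idx]
        if node.op_type in ("Shape", "Range"):
            out = node.output[0]
            graph.node.insert(idx + 1, helper.make_node(
                "Cast", inputs=[out], outputs=[to_cast[out]],
                to=TensorProto.INT32, name=out + "_cast_i32"))

    onnx.save(model, out_path)


# Hypothetical file names, for illustration only.
downcast_int64_to_int32("model.onnx", "model_int32.onnx")
```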
## Usage with Transformers.js
```js
import { AutoModel, AutoProcessor } from '@huggingface/transformers';

const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
```
## Model Size
- Total: ~539 MB (q4f16 quantization)
- Same architecture as the original; the only change is the int64→int32 conversion
## License
MIT (same as original)
## Credits
- Original model: ResembleAI/chatterbox-turbo-ONNX
- Conversion script: local.core/scripts/convert_int64_to_int32.py