Chatterbox Turbo - WebGPU Compatible

This is a WebGPU-compatible version of ResembleAI/chatterbox-turbo-ONNX.

Changes from Original

The original model contains int64 Cast operations and int64-valued tensors that WebGPU cannot execute. This version converts all int64 operations to int32, enabling direct WebGPU inference.

Modifications Made:

  • conditional_decoder: 521 Cast nodes inserted (376 Shape/Range ops)
  • speech_encoder: 350 Cast nodes inserted (243 Shape/Range ops)
  • language_model: 3 Cast nodes inserted
  • embed_tokens: 1 Cast node inserted
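
To sanity-check the converted components in the browser, you can open one of them directly with onnxruntime-web (the runtime Transformers.js uses under the hood) and request the WebGPU execution provider. This is a minimal sketch, not part of the model card's tooling; the file path onnx/language_model_q4f16.onnx is an assumption about the repository layout, so point it at whichever component you want to inspect.

import * as ort from 'onnxruntime-web';

// Assumed file layout; adjust the path to the component you want to check.
const modelUrl =
  'https://huggingface.co/spacekaren/chatterbox-turbo-webgpu/resolve/main/onnx/language_model_q4f16.onnx';

// Create an inference session backed by the WebGPU execution provider,
// then list the model's input and output names.
const session = await ort.InferenceSession.create(modelUrl, {
  executionProviders: ['webgpu'],
});

console.log('Inputs:', session.inputNames);
console.log('Outputs:', session.outputNames);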

Usage with Transformers.js

import { AutoModel, AutoProcessor } from '@huggingface/transformers';

const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
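
Not every browser exposes WebGPU, so it can help to guard the load with a feature check and fall back to the WASM backend. A minimal sketch, assuming Transformers.js accepts 'wasm' as the fallback device value (the processor load is the same as in the snippet above):

import { AutoModel } from '@huggingface/transformers';

// WebGPU is exposed via navigator.gpu; requestAdapter() resolves to null
// when no suitable adapter is available, so fall back to WASM in that case.
async function pickDevice() {
  if (typeof navigator !== 'undefined' && navigator.gpu) {
    const adapter = await navigator.gpu.requestAdapter();
    if (adapter) return 'webgpu';
  }
  return 'wasm';
}

const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: await pickDevice(),
  dtype: 'q4f16',
});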

Model Size

  • Total: ~539 MB (q4f16 quantization)
  • Same architecture as the original; only the int64→int32 conversion was applied
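
Because the checkpoint ships as several ONNX components, Transformers.js can also take a per-component dtype map instead of a single string, which lets you trade size against quality for individual parts. A hedged sketch: the keys below reuse the component names from the modification list above and are an assumption about the actual ONNX file names and available dtypes in this repository.

import { AutoModel } from '@huggingface/transformers';

// Per-component dtypes. The keys are assumed to match the ONNX file names
// (embed_tokens, language_model, speech_encoder, conditional_decoder);
// check the repository's onnx/ folder for the variants that actually exist.
const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: {
    embed_tokens: 'fp16',
    language_model: 'q4f16',
    speech_encoder: 'q4f16',
    conditional_decoder: 'fp16',
  },
});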

License

MIT (same as original)

Credits

Original model: ResembleAI/chatterbox-turbo-ONNX by Resemble AI.