---
tags:
- mlx
- apple-silicon
- text-generation
- gemma3
- instruction-tuned
library_name: mlx-lm
pipeline_tag: text-generation
base_model: Daizee/Dirty-Calla-4B
license: apache-2.0
---

# 🖤 Dirty-Calla-4B — **MLX** builds for Apple Silicon

**Dirty-Calla-4B-mlx** provides Apple Silicon–optimized versions of **Daizee/Dirty-Calla-4B**, a fine-tuned **Gemma 3 (4B)** model developed by **Daizee** for expressive, humanlike, and emotionally textured responses.

This conversion uses Apple's **MLX** framework for local inference on **M1, M2, and M3 Macs**.
Each variant trades size for speed or precision, so you can choose what fits your workflow.

> 🧩 **Note on vocab padding:**
> The tokenizer and embedding matrix were padded to the next multiple of 64 tokens (262,208 total).
> Added tokens are labeled `<pad_ex_*>` — they will not appear in normal generations.

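The padding above is a plain round-up to the next multiple of 64. A quick sketch of the arithmetic (the `pad_to_64` helper name is ours, and the exact pre-padding vocab size is not stated here, so 262,145 is only an illustration):

```bash
# round a vocab size up to the next multiple of 64
pad_to_64() { echo $(( ( $1 + 63 ) / 64 * 64 )); }

pad_to_64 262145   # any size from 262,145 through 262,208 rounds to 262208
```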
---

## ⚙️ Variants

| Folder | Bits | Group Size | Description |
|----------------|------|------------|--------------|
| `mlx/g128/` | int4 | 128 | Smallest & fastest (lightest memory use) |
| `mlx/g64/` | int4 | 64 | Balanced: slightly slower, more stable |
| `mlx/int8/` | int8 | — | Closest to fp16 precision, best coherence |

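Because each variant lives in its own folder, you can fetch just the one you need. A sketch using the Hugging Face CLI (the `--include` pattern and local directory name are assumptions based on the layout above, not part of this repo):

```bash
# download only the int4 / group-size-64 variant
huggingface-cli download Daizee/Dirty-Calla-4B-mlx \
  --include "mlx/g64/*" \
  --local-dir Dirty-Calla-4B-mlx
```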
---

## 🚀 Quickstart

### Run directly from Hugging Face
```bash
python -m mlx_lm.generate \
  --model hf://Daizee/Dirty-Calla-4B-mlx/mlx/g64 \
  --prompt "Describe a rainy city from the perspective of a poet." \
  --max-tokens 150 --temp 0.4
```