---
tags:
- mlx
- apple-silicon
- text-generation
- gemma3
- instruction-tuned
library_name: mlx-lm
pipeline_tag: text-generation
base_model: Daizee/Dirty-Calla-4B
license: apache-2.0
---

# 🖤 Dirty-Calla-4B — **MLX** builds for Apple Silicon

**Dirty-Calla-4B-mlx** provides Apple Silicon–optimized versions of **Daizee/Dirty-Calla-4B**, a fine-tuned **Gemma 3 (4B)** model developed by **Daizee** for expressive, humanlike, and emotionally textured responses.

This conversion uses Apple's **MLX** framework for local inference on **M1, M2, and M3 Macs**.
Each variant trades size for speed or precision, so you can choose what fits your workflow.

> 🧩 **Note on vocab padding:**
> The tokenizer and embedding matrix were padded to the next multiple of 64 tokens (262,208 total).
> Added tokens are labeled `<pad_ex_*>` — they will not appear in normal generations.

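The padding above is a plain round-up to the next multiple of 64. A quick sketch of the arithmetic (the `pad_to_64` helper name is ours, and the exact pre-padding vocab size is not stated here, so 262,145 is only an illustration):

```bash
# round a vocab size up to the next multiple of 64
pad_to_64() { echo $(( ( $1 + 63 ) / 64 * 64 )); }

pad_to_64 262145   # any size from 262,145 through 262,208 rounds to 262208
```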
---

## ⚙️ Variants

| Folder | Bits | Group Size | Description |
|----------------|------|------------|--------------|
| `mlx/g128/` | int4 | 128 | Smallest & fastest (lightest memory use) |
| `mlx/g64/` | int4 | 64 | Balanced: slightly slower, more stable |
| `mlx/int8/` | int8 | — | Closest to fp16 precision, best coherence |

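Because each variant lives in its own folder, you can fetch just the one you need. A sketch using the Hugging Face CLI (the `--include` pattern and local directory name are assumptions based on the layout above, not part of this repo):

```bash
# download only the int4 / group-size-64 variant
huggingface-cli download Daizee/Dirty-Calla-4B-mlx \
  --include "mlx/g64/*" \
  --local-dir Dirty-Calla-4B-mlx
```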
---

## 🚀 Quickstart

### Run directly from Hugging Face
```bash
python -m mlx_lm.generate \
  --model hf://Daizee/Dirty-Calla-4B-mlx/mlx/g64 \
  --prompt "Describe a rainy city from the perspective of a poet." \
  --max-tokens 150 --temp 0.4
```