# Phi-3.5-mini-instruct-GGUF

## Original Model

[microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct)
## Run with LlamaEdge

- LlamaEdge version: v0.14.0 and above

- Prompt template

  - Prompt type: `phi-3-chat`
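  - Prompt string (a reference sketch of the standard Phi-3 chat format; the `{...}` placeholders are illustrative, not literal):

    ```text
    <|system|>
    {system_message}<|end|>
    <|user|>
    {user_message}<|end|>
    <|assistant|>
    ```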
- Context size: `128000`
Run as LlamaEdge service
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Phi-3.5-mini-instruct-Q5_K_M.gguf \
llama-api-server.wasm \
--prompt-template phi-3-chat \
--ctx-size 128000 \
--model-name phi-3-mini
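  Once the server is up, it serves an OpenAI-compatible API. A minimal smoke test with `curl`, assuming the default listen address of port `8080`:

  ```bash
  # Request a chat completion from the running llama-api-server.
  # "model" must match the --model-name passed above (phi-3-mini).
  curl -X POST http://localhost:8080/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
          "model": "phi-3-mini",
          "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"}
          ]
        }'
  ```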
- Run as LlamaEdge command app

  ```bash
  wasmedge --dir .:. --nn-preload default:GGML:AUTO:Phi-3.5-mini-instruct-Q5_K_M.gguf \
    llama-chat.wasm \
    --prompt-template phi-3-chat \
    --ctx-size 128000
  ```
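Both commands expect the model file and the LlamaEdge apps in the current directory. A sketch of how to fetch them, assuming this model is hosted under the `second-state` Hugging Face organization (adjust the URL to wherever the GGUF files actually live):

```bash
# Download a quantized model (Q5_K_M shown; the repo path is an assumption)
curl -LO https://huggingface.co/second-state/Phi-3.5-mini-instruct-GGUF/resolve/main/Phi-3.5-mini-instruct-Q5_K_M.gguf

# Download the LlamaEdge apps from the project's GitHub releases
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm
```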
## Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |
| ---- | ------------ | ---- | ---- | -------- |
| Phi-3.5-mini-instruct-Q2_K.gguf | Q2_K | 2 | 1.42 GB | smallest, significant quality loss - not recommended for most purposes |
| Phi-3.5-mini-instruct-Q3_K_L.gguf | Q3_K_L | 3 | 2.09 GB | small, substantial quality loss |
| Phi-3.5-mini-instruct-Q3_K_M.gguf | Q3_K_M | 3 | 1.96 GB | very small, high quality loss |
| Phi-3.5-mini-instruct-Q3_K_S.gguf | Q3_K_S | 3 | 1.68 GB | very small, high quality loss |
| Phi-3.5-mini-instruct-Q4_0.gguf | Q4_0 | 4 | 2.18 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| Phi-3.5-mini-instruct-Q4_K_M.gguf | Q4_K_M | 4 | 2.39 GB | medium, balanced quality - recommended |
| Phi-3.5-mini-instruct-Q4_K_S.gguf | Q4_K_S | 4 | 2.19 GB | small, greater quality loss |
| Phi-3.5-mini-instruct-Q5_0.gguf | Q5_0 | 5 | 2.64 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| Phi-3.5-mini-instruct-Q5_K_M.gguf | Q5_K_M | 5 | 2.82 GB | large, very low quality loss - recommended |
| Phi-3.5-mini-instruct-Q5_K_S.gguf | Q5_K_S | 5 | 2.64 GB | large, low quality loss - recommended |
| Phi-3.5-mini-instruct-Q6_K.gguf | Q6_K | 6 | 3.14 GB | very large, extremely low quality loss |
| Phi-3.5-mini-instruct-Q8_0.gguf | Q8_0 | 8 | 4.06 GB | very large, extremely low quality loss - not recommended |
| Phi-3.5-mini-instruct-f16.gguf | f16 | 16 | 7.64 GB | |
*Quantized with llama.cpp b3499.*