RedHatAI/Llama-3.2-3B-Instruct-quantized.w8a8

The model is compressed, but it was incorrectly saved with "quantization_status" set to "frozen", which causes it to load incorrectly through transformers. This change sets the status to "compressed".

Files changed (1)

config.json +1 -1

@@ -75,6 +75,6 @@
     ],
     "kv_cache_scheme": null,
     "quant_method": "compressed-tensors",
-    "quantization_status": "frozen"
+    "quantization_status": "compressed"
   }
 }
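
For anyone holding a local copy of a checkpoint with the same problem, the one-line change in this diff could be applied with a small script. The following is a minimal sketch, not part of this PR: the function name `fix_quantization_status` is hypothetical, and it assumes only that `config.json` is valid JSON containing a `"quantization_status"` key somewhere in its nested objects. It uses the standard library alone and does not call transformers or compressed-tensors.

```python
import json
from pathlib import Path


def fix_quantization_status(config_path, old="frozen", new="compressed"):
    """Rewrite every "quantization_status": old entry to new.

    Hypothetical helper for illustration. Returns the number of
    entries changed; the file is rewritten only if something changed.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    def walk(node):
        # Recurse through nested dicts and lists so the key is found
        # regardless of which parent object holds it.
        changed = 0
        if isinstance(node, dict):
            if node.get("quantization_status") == old:
                node["quantization_status"] = new
                changed += 1
            for value in node.values():
                changed += walk(value)
        elif isinstance(node, list):
            for item in node:
                changed += walk(item)
        return changed

    changed = walk(config)
    if changed:
        # Note: this re-serializes the whole file with 2-space indentation,
        # so whitespace may differ from the original.
        path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

Calling `fix_quantization_status("config.json")` on an affected checkpoint should report one change, and a second call should report zero, so the script is safe to rerun.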