Read our How to Run Qwen-Image-2512 Guide! 💜

This is a GGUF quantized version of Qwen-Image-2512.
unsloth/Qwen-Image-2512-GGUF uses Unsloth Dynamic 2.0 methodology for SOTA performance.

Samples


💜 Qwen Chat | 🤗 Hugging Face | 🤖 ModelScope | 📑 Tech Report | 📑 Blog
🖥️ Demo | 💬 WeChat | 🫨 Discord | GitHub

Introduction

We are excited to introduce Qwen-Image-2512, the December update of Qwen-Image's text-to-image foundational model. You are welcome to try the latest model at Qwen Chat. Compared to the base Qwen-Image model released in August, Qwen-Image-2512 features the following key improvements:

  • Enhanced Human Realism: Qwen-Image-2512 significantly reduces the "AI-generated" look and substantially enhances overall image realism, especially for human subjects.
  • Finer Natural Detail: Qwen-Image-2512 delivers notably more detailed rendering of landscapes, animal fur, and other natural elements.
  • Improved Text Rendering: Qwen-Image-2512 improves the accuracy and quality of textual elements, achieving better layout and more faithful multimodal (text + image) composition.

Model Performance

We conducted over 10,000 rounds of blind model evaluations on AI Arena. The results show that Qwen-Image-2512 is currently the strongest open-source model, while remaining highly competitive even among closed-source models.

Quick Start

Install the latest version of diffusers

pip install git+https://github.com/huggingface/diffusers
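
After installing, it can help to confirm which diffusers build is actually active in the current environment. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and is not part of diffusers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("diffusers")
    if found is None:
        print("diffusers is not installed in this environment")
    else:
        # A source install from GitHub typically reports a development version string.
        print(f"diffusers version: {found}")
```

The helper returns None rather than raising, so a setup script can decide whether to install or upgrade.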

The following snippet shows how to use Qwen-Image-2512:

from diffusers import DiffusionPipeline
import torch

model_name = "Qwen/Qwen-Image-2512"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype).to(device)

# Generate image
prompt = '''A 20-year-old East Asian girl with delicate, charming features and large, bright brown eyes, expressive and lively, with a cheerful or subtly smiling expression. Her naturally wavy long hair is either loose or tied in twin ponytails. She has fair skin and light makeup accentuating her youthful freshness. She wears a modern, cute dress or relaxed outfit in bright, soft colors, with lightweight fabric and a minimalist cut. She stands indoors at an anime convention, surrounded by banners, posters, or stalls. Lighting is typical indoor illumination with no staged lighting, and the image resembles a casual iPhone snapshot: unpretentious composition, yet brimming with vivid, fresh, youthful charm.'''

negative_prompt = "ไฝŽๅˆ†่พจ็އ๏ผŒไฝŽ็”ป่ดจ๏ผŒ่‚ขไฝ“็•ธๅฝข๏ผŒๆ‰‹ๆŒ‡็•ธๅฝข๏ผŒ็”ป้ข่ฟ‡้ฅฑๅ’Œ๏ผŒ่œกๅƒๆ„Ÿ๏ผŒไบบ่„ธๆ— ็ป†่Š‚๏ผŒ่ฟ‡ๅบฆๅ…‰ๆป‘๏ผŒ็”ป้ขๅ…ทๆœ‰AIๆ„Ÿใ€‚ๆž„ๅ›พๆททไนฑใ€‚ๆ–‡ๅญ—ๆจก็ณŠ๏ผŒๆ‰ญๆ›ฒใ€‚"


# Generate with different aspect ratios
aspect_ratios = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1104),
    "3:4": (1104, 1472),
    "3:2": (1584, 1056),
    "2:3": (1056, 1584),
}

width, height = aspect_ratios["16:9"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=torch.Generator(device=device).manual_seed(42),  # use the selected device so the CPU fallback also works
).images[0]

image.save("example.png")
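
The preset table in the snippet above can also be looked up by target size instead of by name. The following is a minimal sketch, assuming you want the preset whose aspect ratio is nearest to an arbitrary width and height. The helper name closest_preset is hypothetical and not part of diffusers; the resolutions are copied from the snippet.

```python
import math
from typing import Dict, Tuple

# Resolution presets copied from the Quick Start snippet, stored as (width, height).
ASPECT_RATIOS: Dict[str, Tuple[int, int]] = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1104),
    "3:4": (1104, 1472),
    "3:2": (1584, 1056),
    "2:3": (1056, 1584),
}


def closest_preset(width: int, height: int) -> Tuple[str, Tuple[int, int]]:
    """Return (name, (width, height)) of the preset nearest in aspect ratio.

    Ratios are compared in log space so that landscape and portrait
    orientations are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(name: str) -> float:
        w, h = ASPECT_RATIOS[name]
        return abs(math.log(w / h) - target)

    best = min(ASPECT_RATIOS, key=distance)
    return best, ASPECT_RATIOS[best]


if __name__ == "__main__":
    name, (w, h) = closest_preset(1920, 1080)
    print(name, w, h)
```

The returned width and height can be passed to the pipeline call in place of the hard-coded aspect_ratios["16:9"] lookup.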

Showcase

Enhanced Human Realism

In Qwen-Image-2512, human depiction has been substantially refined. Compared to the August release, Qwen-Image-2512 adds significantly richer facial details and better environmental context. For example:

A Chinese female college student, around 20 years old, with a very short haircut that conveys a gentle, artistic vibe. Her hair naturally falls to partially cover her cheeks, projecting a tomboyish yet charming demeanor. She has cool-toned fair skin and delicate features, with a slightly shy yet subtly confident expression, her mouth crooked in a playful, youthful smirk. She wears an off-shoulder top, revealing one shoulder, with a well-proportioned figure. The image is framed as a close-up selfie: she dominates the foreground, while the background clearly shows her dormitory, with a neatly made bed with white linens on the top bunk, a tidy study desk with organized stationery, and wooden cabinets and drawers. The photo is captured on a smartphone under soft, even ambient lighting, with natural tones, high clarity, and a bright, lively atmosphere full of youthful, everyday energy.

For the same prompt, Qwen-Image-2512 yields notably more lifelike facial features, and background objects (e.g., the desk, stationery, and bedding) are rendered with significantly greater clarity than in Qwen-Image.

A 20-year-old East Asian girl with delicate, charming features and large, bright brown eyes, expressive and lively, with a cheerful or subtly smiling expression. Her naturally wavy long hair is either loose or tied in twin ponytails. She has fair skin and light makeup accentuating her youthful freshness. She wears a modern, cute dress or relaxed outfit in bright, soft colors, with lightweight fabric and a minimalist cut. She stands indoors at an anime convention, surrounded by banners, posters, or stalls. Lighting is typical indoor illumination with no staged lighting, and the image resembles a casual iPhone snapshot: unpretentious composition, yet brimming with vivid, fresh, youthful charm.

Here, hair strands serve as a key differentiator: Qwen-Image's August version tends to blur them together, losing fine detail, whereas Qwen-Image-2512 renders individual strands with precision, resulting in a more natural and realistic appearance.

Another case:

An East Asian teenage boy, aged 15 to 18, with soft, fluffy black short hair and refined facial contours. His large, warm brown eyes sparkle with energy. His fair skin and sunny, open smile convey an approachable, friendly demeanor, with no makeup or blemishes. He wears a blue-and-white summer uniform shirt, slightly unbuttoned, made of thin breathable fabric, with black headphones hanging around his neck. His hands are in his pockets, body leaning slightly forward in a relaxed pose, as if engaged in conversation. Behind him lies a summer school playground: lush green grass and a red rubber track in the foreground, blurred school buildings in the distance, a clear blue sky with fluffy white clouds. The bright, airy lighting evokes a joyful, carefree adolescent atmosphere.

In this example, Qwen-Image-2512 adheres more closely to semantic instructions. For instance, the prompt specifies "body leaning slightly forward," and Qwen-Image-2512 accurately captures this posture, unlike its predecessor.

An elderly Chinese couple in their 70s in a clean, organized home kitchen. The woman has a kind face and a warm smile, wearing a patterned apron; the man stands behind her, also smiling, as they both gaze at a steaming pot of buns on the stove. The kitchen is bright and tidy, exuding warmth and harmony. The scene is captured with a wide-angle lens to fully show the subjects and their surroundings.

This comparison starkly highlights the gap between the August and December models. The original Qwen-Image struggles to accurately render aged facial features (e.g., wrinkles), resulting in an artificial "AI look." In contrast, Qwen-Image-2512 precisely captures age cues, dramatically boosting realism.

Finer Natural Detail

Qwen-Image-2512's enhanced detail rendering extends beyond humans to landscapes, wildlife, and more. For instance:

A turquoise river winds through a lush canyon. Thick moss and dense ferns blanket the rocky walls; multiple waterfalls cascade from above, enveloped in mist. At noon, sunlight filters through the dense canopy, dappling the river surface with shimmering light. The atmosphere is humid and fresh, pulsing with primal jungle vitality. No humans, text, or artificial traces present.

Side by side, Qwen-Image-2512 exhibits superior fidelity in water flow, foliage, and waterfall mist, and renders richer gradation in greens. Another example (wave rendering):

At dawn, a thin mist veils the sea. An ancient stone lighthouse stands at the cliff's edge, its beacon faintly visible through the fog. Black rocks are pounded by waves, sending up bursts of white spray. The sky glows in soft blue-purple hues under cool, hazy light, evoking solitude and solemn grandeur.

Fur detail is another highlight. Here, a golden retriever portrait:

An ultra-realistic close-up of a golden retriever outdoors under soft daylight. Hair is exquisitely detailed: strands distinct, color transitioning naturally from warm gold to light cream, light glinting delicately at the tips; a gentle breeze adds subtle volume. Undercoat is soft and dense; guard hairs are long and well-defined, with visible layering. Eyes are moist, expressive; nose is slightly damp with fine specular highlights. Background is softly blurred to emphasize the dog's tangible texture and vivid expression.

Similarly, texture quality improves in depictions of rugged wildlife, for example a male argali sheep:

A male argali stands atop a barren, rocky mountainside. Its coarse, dense grey-brown coat covers a powerful, muscular body. Most striking are its massive, thick, outward-spiraling horns, a symbol of wild strength. Its gaze is alert and sharp. The background reveals steep alpine terrain: jagged peaks, sparse low vegetation, and abundant sunlight, conveying the harsh yet majestic wilderness and the animal's resilient vitality.

Improved Text Rendering

Qwen-Image-2512 further elevates text rendering, already a strength of the original, by improving accuracy, layout, and multimodal integration.

For instance, this prompt requests a complete PPT slide illustrating Qwen-Image's development roadmap (generation and editing tracks):

่ฟ™ๆ˜ฏไธ€ๅผ ็Žฐไปฃ้ฃŽๆ ผ็š„็ง‘ๆŠ€ๆ„Ÿๅนป็ฏ็‰‡๏ผŒๆ•ดไฝ“้‡‡็”จๆทฑ่“่‰ฒๆธๅ˜่ƒŒๆ™ฏใ€‚ๆ ‡้ข˜ๆ˜ฏโ€œQwen-Imageๅ‘ๅฑ•ๅކ็จ‹โ€ใ€‚ไธ‹ๆ–นไธ€ๆกๆฐดๅนณๅปถไผธ็š„ๅ‘ๅ…‰ๆ—ถ้—ด่ฝด๏ผŒ่ฝด็บฟไธญ้—ดๅ†™็€โ€œ็”Ÿๅ›พ่ทฏ็บฟโ€ใ€‚็”ฑๅทฆไพงๆทก่“่‰ฒๆธๅ˜ไธบๅณไพงๆทฑ็ดซ่‰ฒ๏ผŒๅนถไปฅ็ฒพ่‡ด็š„็ฎญๅคดๆ”ถๅฐพใ€‚ๆ—ถ้—ด่ฝดไธŠๆฏไธช่Š‚็‚น้€š่ฟ‡่™š็บฟ่ฟžๆŽฅ่‡ณไธ‹ๆ–น้†’็›ฎ็š„่“่‰ฒๅœ†่ง’็Ÿฉๅฝขๆ—ฅๆœŸๆ ‡็ญพ๏ผŒๆ ‡็ญพๅ†…ไธบๆธ…ๆ™ฐ็™ฝ่‰ฒๅญ—ไฝ“๏ผŒไปŽๅทฆๅ‘ๅณไพๆฌกๅ†™็€๏ผšโ€œ2025ๅนด5ๆœˆ6ๆ—ฅ Qwen-Image ้กน็›ฎๅฏๅŠจโ€โ€œ2025ๅนด8ๆœˆ4ๆ—ฅ Qwen-Image ๅผ€ๆบๅ‘ๅธƒโ€โ€œ2025ๅนด12ๆœˆ31ๆ—ฅ Qwen-Image-2512 ๅผ€ๆบๅ‘ๅธƒโ€ ๏ผˆๅ‘จๅ›ดๅ…‰ๆ™•ๆ˜พ่‘—๏ผ‰ๅœจไธ‹ๆ–นไธ€ๆกๆฐดๅนณๅปถไผธ็š„ๅ‘ๅ…‰ๆ—ถ้—ด่ฝด๏ผŒ่ฝด็บฟไธญ้—ดๅ†™็€โ€œ็ผ–่พ‘่ทฏ็บฟโ€ใ€‚็”ฑๅทฆไพงๆทก่“่‰ฒๆธๅ˜ไธบๅณไพงๆทฑ็ดซ่‰ฒ๏ผŒๅนถไปฅ็ฒพ่‡ด็š„็ฎญๅคดๆ”ถๅฐพใ€‚ๆ—ถ้—ด่ฝดไธŠๆฏไธช่Š‚็‚น้€š่ฟ‡่™š็บฟ่ฟžๆŽฅ่‡ณไธ‹ๆ–น้†’็›ฎ็š„่“่‰ฒๅœ†่ง’็Ÿฉๅฝขๆ—ฅๆœŸๆ ‡็ญพ๏ผŒๆ ‡็ญพๅ†…ไธบๆธ…ๆ™ฐ็™ฝ่‰ฒๅญ—ไฝ“๏ผŒไปŽๅทฆๅ‘ๅณไพๆฌกๅ†™็€๏ผšโ€œ2025ๅนด8ๆœˆ18ๆ—ฅ Qwen-Image-Edit ๅผ€ๆบๅ‘ๅธƒโ€โ€œ2025ๅนด9ๆœˆ22ๆ—ฅ Qwen-Image-Edit-2509 ๅผ€ๆบๅ‘ๅธƒโ€โ€œ2025ๅนด12ๆœˆ19ๆ—ฅ Qwen-Image-Layered ๅผ€ๆบๅ‘ๅธƒโ€โ€œ2025ๅนด12ๆœˆ23ๆ—ฅ Qwen-Image-Edit-2511 ๅผ€ๆบๅ‘ๅธƒโ€

We can even generate a before-and-after comparison slide to highlight the leap from "AI-blurry" to "photorealistic":

่ฟ™ๆ˜ฏไธ€ๅผ ็Žฐไปฃ้ฃŽๆ ผ็š„็ง‘ๆŠ€ๆ„Ÿๅนป็ฏ็‰‡๏ผŒๆ•ดไฝ“้‡‡็”จๆทฑ่“่‰ฒๆธๅ˜่ƒŒๆ™ฏใ€‚้กถ้ƒจไธญๅคฎไธบ็™ฝ่‰ฒๆ— ่กฌ็บฟ็ฒ—ไฝ“ๅคงๅญ—ๆ ‡้ข˜โ€œQwen-Image-2512้‡็ฃ…ๅ‘ๅธƒโ€ใ€‚็”ป้ขไธปไฝ“ไธบๆจชๅ‘ๅฏนๆฏ”ๅ›พ๏ผŒ่ง†่ง‰็„ฆ็‚น้›†ไธญไบŽไธญ้—ด็š„ๅ‡็บงๅฏนๆฏ”ๅŒบๅŸŸใ€‚ๅทฆไพงไธบ้ข้ƒจๅ…‰ๆป‘ๆฒกๆœ‰ไปปไฝ•็ป†่Š‚็š„ๅฅณๆ€งไบบๅƒ๏ผŒ่ดจๆ„Ÿๅทฎ๏ผ›ๅณไพงไธบ้ซ˜ๅบฆๅ†™ๅฎž็š„ๅนด่ฝปๅฅณๆ€ง่‚–ๅƒ๏ผŒ็šฎ่‚คๅ‘ˆ็Žฐ็œŸๅฎžๆฏ›ๅญ”็บน็†ไธŽ็ป†ๅพฎๅ…‰ๅฝฑๅ˜ๅŒ–๏ผŒๅ‘ไธๆ นๆ นๅˆ†ๆ˜Ž๏ผŒ็œผ็œธ้€ไบฎ๏ผŒ่กจๆƒ…่‡ช็„ถ๏ผŒๆ•ดไฝ“่ดจๆ„ŸๆŽฅ่ฟ‘ๅ†™ๅฎžๆ‘„ๅฝฑใ€‚ไธคๅ›พๅƒไน‹้—ดไปฅไธ€ไธช็ปฟ่‰ฒๆต็บฟๅž‹็ฎญๅคด้“พๆŽฅใ€‚้€ ๅž‹็ง‘ๆŠ€ๆ„Ÿๅ่ถณ๏ผŒไธญ้ƒจๆ ‡ๆณจโ€œ2512่ดจๆ„Ÿๅ‡็บงโ€๏ผŒไฝฟ็”จ็™ฝ่‰ฒๅŠ ็ฒ—ๅญ—ไฝ“๏ผŒๅฑ…ไธญๆ˜พ็คบใ€‚็ฎญๅคดไธคไพงๆœ‰ๅพฎๅผฑๅ…‰ๆ™•ๆ•ˆๆžœ๏ผŒๅขžๅผบๅŠจๆ€ๆ„Ÿใ€‚ๅœจๅ›พๅƒไธ‹ๆ–น๏ผŒไปฅ็™ฝ่‰ฒๆ–‡ๅญ—ๅ‘ˆ็Žฐไธ‰่กŒ่ฏดๆ˜Ž๏ผšโ€œโ— ๆ›ด็œŸๅฎž็š„ไบบ็‰ฉ่ดจๆ„Ÿใ€‚ๅคงๅน…ๅบฆ้™ไฝŽไบ†็”Ÿๆˆๅ›พ็‰‡็š„AIๆ„Ÿ๏ผŒๆๅ‡ไบ†ๅ›พๅƒ็œŸๅฎžๆ€ง โ— ๆ›ด็ป†่…ป็š„่‡ช็„ถ็บน็†ใ€‚ๅคงๅน…ๅบฆๆๅ‡ไบ†็”Ÿๆˆๅ›พ็‰‡็š„็บน็†็ป†่Š‚ใ€‚้ฃŽๆ™ฏๅ›พ๏ผŒๅŠจ็‰ฉๆฏ›ๅ‘ๅˆป็”ปๆ›ด็ป†่…ปใ€‚โ— ๆ›ดๅคๆ‚็š„ๆ–‡ๅญ—ๆธฒๆŸ“ใ€‚ๅคงๅน…ๆๅ‡ไบ†ๆ–‡ๅญ—ๆธฒๆŸ“็š„่ดจ้‡ใ€‚ๅ›พๆ–‡ๆททๅˆๆธฒๆŸ“ๆ›ดๅ‡†็กฎ๏ผŒๆŽ’็‰ˆๆ›ดๅฅฝโ€

A more complex infographic example:

A professional-grade industrial technical infographic with a deep blue, high-tech background and soft, even lighting that creates a calm, precise modern industrial atmosphere. The image is divided into left and right panels with a clear layout and a distinct visual hierarchy. The left panel is titled "实际发生的现象" (Phenomena That Actually Occur), highlighted in a light blue rounded-rectangle frame, and contains three dark blue button-style entries. The first entry shows an icon of water droplets falling onto a pile of brown powdered raw material, with the text "团聚/结块" (Agglomeration / Caking), followed by a green check mark; the second entry shows a conical flask holding blue liquid with rising bubbles, with the text "产生气泡/缺陷" (Bubble Formation / Defects), followed by a green check mark; the third entry shows two rusty gears, with the text "设备腐蚀/催化剂失活" (Equipment Corrosion / Catalyst Deactivation), followed by a green check mark. The right panel is titled "【不会】发生的现象" (Phenomena That Do [NOT] Occur), presented in a beige rounded-rectangle frame, and its four entries each sit in a dark grey background box. The icons are: a set of precisely meshed metal gears, with the text "反应效率【显著提高】" (Reaction efficiency [significantly improved]), overlaid with a prominent red cross; a bundle of neatly arranged metal tubes, with the text "成品内部【绝对无气泡/孔隙】" (Finished product interior [absolutely free of bubbles/pores]), overlaid with a prominent red cross; a sturdy metal chain under tension, with the text "材料强度与耐久性【得到增强】" (Material strength and durability [enhanced]), overlaid with a prominent red cross; and a pile of corroded wrenches, with the text "加工过程【零腐蚀/零副反应风险】" (Processing [zero corrosion / zero side-reaction risk]), overlaid with a prominent red cross. At the bottom center is a line of small annotation text: "注：水分的存在通常会导致负面或干扰性的结果，而非理想或增强的状态" (Note: the presence of moisture usually leads to negative or interfering results rather than an ideal or enhanced state), in white type, clear and legible. The overall style is modern and minimalist with strong color contrast, the graphic symbols accurately convey the technical logic, and the piece is suited to industrial training or science-communication presentations.

Or even a full educational poster:

A photorealistic work laid out as a 3×4 grid of twelve panels on the theme "健康的一天" (A Healthy Day). The style is clean and clear; each panel is a self-contained scene, yet all are unified by the narrative thread of a daily rhythm. The first row shows: "06:00 晨跑唤醒身体" (06:00 Morning run to wake the body): a facial close-up of a woman in a grey sports outfit, with the rising sun and lush green trees in the background; "06:30 动态拉伸激活关节" (06:30 Dynamic stretching to activate the joints): the woman in yoga wear doing morning stretches on a balcony, her body extended, against a pale pink sky and the outline of distant mountains; "07:30 均衡营养早餐" (07:30 Balanced, nutritious breakfast): whole-wheat bread, avocado, and a glass of orange juice on the table, the woman smiling as she gets ready to eat; "08:00 补水润燥" (08:00 Hydrate and relieve dryness): lemon slices floating in a clear glass, the woman holding the glass and sipping, sunlight slanting into the room from the left, water droplets sliding down the side of the glass. The second row shows: "09:00 专注高效工作" (09:00 Focused, efficient work): the woman typing intently, the screen showing a clean interface, a cup of coffee and a potted plant beside her; "12:00 静心阅读时光" (12:00 Quiet reading time): the woman sitting at a desk leafing through a paper book, a desk lamp casting warm light, the pages yellowed, half a cup of black tea beside her; "12:30 午后轻松漫步" (12:30 Easy afternoon stroll): the woman strolling along a tree-lined path, facial close-up; "15:00 茶香伴午后" (15:00 An afternoon with the aroma of tea): the woman holding a bone-china teacup by the window, with a city street view and drifting clouds outside, the aroma of tea curling upward. The third row shows: "18:00 运动释放压力" (18:00 Exercise to release stress): in a gym, the woman practicing yoga; "19:00 美味晚餐" (19:00 Delicious dinner): the woman chopping vegetables in an open kitchen, tomatoes and green peppers on the cutting board, steam rising from the pot, warm lighting; "21:00 冥想助眠" (21:00 Meditation for better sleep): the woman sitting cross-legged on a soft rug meditating, hands resting lightly on her knees, eyes closed and serene; "21:30 进入睡眠" (21:30 Falling asleep): the woman lying in bed resting. Natural light dominates throughout, the palette is based on warm white and beige-grey, light and shadow are clearly layered, and the image is full of a warm, homey feel and a sense of regular rhythm.

These are the core enhancements in this update. We hope you enjoy using Qwen-Image-2512!

Citation

If Qwen-Image-2512 proves helpful in your research, we'd greatly appreciate your citation 📝 :)

@misc{wu2025qwenimagetechnicalreport,
      title={Qwen-Image Technical Report}, 
      author={Chenfei Wu and Jiahao Li and Jingren Zhou and Junyang Lin and Kaiyuan Gao and Kun Yan and Sheng-ming Yin and Shuai Bai and Xiao Xu and Yilei Chen and Yuxiang Chen and Zecheng Tang and Zekai Zhang and Zhengyi Wang and An Yang and Bowen Yu and Chen Cheng and Dayiheng Liu and Deqing Li and Hang Zhang and Hao Meng and Hu Wei and Jingyuan Ni and Kai Chen and Kuan Cao and Liang Peng and Lin Qu and Minggang Wu and Peng Wang and Shuting Yu and Tingkun Wen and Wensen Feng and Xiaoxiao Xu and Yi Wang and Yichang Zhang and Yongqiang Zhu and Yujia Wu and Yuxuan Cai and Zenan Liu},
      year={2025},
      eprint={2508.02324},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.02324}, 
}

GGUF
Model size: 20B params
Architecture: qwen_image
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
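
For a rough sense of how the listed bit widths translate into weight storage, the parameter count can be multiplied by the bits per weight. The following is a minimal sketch, assuming a uniform bit width across all 20B parameters. Real GGUF files will differ, since quantized formats store per-block scales and may keep some layers at higher precision, so treat the numbers as ballpark estimates only. The function name approx_weight_gib is hypothetical.

```python
N_PARAMS = 20e9  # "20B params" as listed on the model card


def approx_weight_gib(bits_per_weight: float, n_params: float = N_PARAMS) -> float:
    """Estimate weight storage in GiB, assuming every parameter uses the same bit width."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    for bits in (2, 3, 4, 5, 6, 8, 16):
        print(f"{bits:>2}-bit: about {approx_weight_gib(bits):.1f} GiB")
```

Under this assumption, 16-bit weights come to roughly 37 GiB and 4-bit weights to roughly 9 GiB, before any activation or text-encoder memory is counted.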
