
Model Description

This model uses K-Means clustering to group wholesale customers by their annual spending patterns.
It identifies patterns across the six numerical spending features of the Wholesale Customers Dataset.

Intended Uses & Limitations

This model is intended only for assignment and analytical use in the context of CSC310 coursework.
It is not designed for production systems or real-world deployment.

Intended Uses

  • Demonstrate how to apply K-Means to tabular data.
  • Show how to share models with documentation using Hugging Face.
  • Provide an example of an unsupervised machine learning workflow.

Limitations

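  • The dataset is small and not representative of larger consumer populations.
  • The spending features are right-skewed and contain outliers, both of which can pull K-Means centroids away from the bulk of the data.
  • K-Means assumes roughly spherical, similarly sized clusters, which real-world data may not follow.
  • Clusters are statistical groupings, not value-based categories.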

How to Get Started with the Model

You can download and test this model or notebook directly from this repository.

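from huggingface_hub import hf_hub_download
import joblib
import pandas as pd

Download trained model

# local_dir="." saves model.joblib in the working directory so it can be loaded by file name
hf_hub_download(repo_id="CSC310-fall25/wholesale_clustering", filename="model.joblib", local_dir=".")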

Hyperparameters

Hyperparameter  Value
algorithm       lloyd
copy_x          True
init            k-means++
max_iter        300
n_clusters      3
n_init          auto
random_state    42
tol             0.0001
verbose         0

Model Plot

from sklearn import cluster

km3 = cluster.KMeans(n_clusters=3, random_state=42)

km3

Load model

model = joblib.load("model.joblib")

Model Card Authors

Name: Steven Doss

Model Card Contact

[email protected]


Citation

https://archive.ics.uci.edu/dataset/292/wholesale+customers

Evaluation Results

1. Silhouette Score - The score is 0.46, which indicates a reasonable level of separation between clusters. Spending differs enough across the clusters that customers within a group are similar to each other and distinct from customers in other groups.
2. Adjusted Mutual Information - This score is low (0.10). However, this is expected because K-Means found three spending groups while the reference labels define only two (Horeca and Retail).
