Model description

This model is a Gaussian Naive Bayes classifier trained on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. It predicts whether a breast tumor is malignant (cancerous) or benign (non-cancerous) based on 30 numerical features computed from a digitized image of a fine needle aspirate (FNA) of a breast mass.

Gaussian Naive Bayes was chosen because the features are continuous and are modeled as approximately normally distributed within each class. (Standardization rescales each feature but does not change the shape of its distribution.) PCA was also used to visualize structure and variance in the dataset; the projection showed that the classes were fairly separable in reduced dimensions.
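
For reference, the standard Gaussian Naive Bayes decision rule can be written as follows. The notation is generic and not taken from the training notebook: x_j is the j-th of the 30 features, P(y) is the class prior, and mu_{y,j} and sigma^2_{y,j} are the per-class mean and variance estimated for feature j.

```latex
\hat{y} \;=\; \arg\max_{y \in \{\text{benign},\ \text{malignant}\}}
  \; P(y) \prod_{j=1}^{30}
  \frac{1}{\sqrt{2\pi\sigma_{y,j}^{2}}}
  \exp\!\left( -\frac{(x_j - \mu_{y,j})^{2}}{2\sigma_{y,j}^{2}} \right)
```

The product over features is where the independence assumption enters: each feature contributes its own one-dimensional Gaussian likelihood, with no covariance terms.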

Intended uses & limitations

Intended use:
This model is for educational and research purposes only. It demonstrates binary classification, feature scaling, and probabilistic modeling using Gaussian Naive Bayes.

Not for clinical or real-world use.
The dataset comes from a medical diagnostic context, but this model has not been clinically validated or certified for healthcare applications.

Limitations:

  • Naive Bayes assumes the features are conditionally independent given the class. This does not hold exactly here: radius, perimeter, and area, for example, are strongly correlated.
  • Performance depends heavily on data quality and class balance.
  • Should not be used to make real patient predictions or medical decisions.

Training Procedure

Training data:
Wisconsin Diagnostic Breast Cancer dataset, containing 569 samples and 30 numeric features (radius, texture, area, smoothness, etc.).

Preprocessing:

  • Features were standardized using StandardScaler.
  • PCA (2 components) was used for visualization only, not for model fitting.
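
A minimal sketch of these two preprocessing steps, assuming scikit-learn's bundled copy of the dataset (`load_breast_cancer`) stands in for the data used in training. The variable names are illustrative, and the scaler is fitted on the full dataset here only because the projection is for visualization.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled WDBC copy stands in for the training data.
X, y = load_breast_cancer(return_X_y=True)

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Project onto the first two principal components, for plotting only.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)
print(pca.explained_variance_ratio_)  # share of variance kept by each component
```

`X_2d` can then be passed to any scatter-plot routine, colored by `y`, to inspect class separation.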

Training/testing split:

  • 80% training / 20% testing (stratified)
  • The training portion was further split into subtrain and validation sets (75%/25% of the training portion, i.e. 60%/20% of the total).
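
The split can be sketched as below. This assumes the validation set takes 25% of the training portion (20% of the total), which is one reading of the ratios above; `random_state=42` and the variable names are assumptions, not values from the original notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# 80% training / 20% testing, stratified on the class label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42  # seed is an assumption
)

# Hold out 25% of the training portion as a validation set (assumed ratio).
X_sub, X_val, y_sub, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=42
)

print(len(X_sub), len(X_val), len(X_test))
```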

Model:

  • Gaussian Naive Bayes (default parameters)
  • Fitted on scaled features.
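
One way to fit the classifier on scaled features is to chain the scaler and the model in a pipeline, so that the scaler is fitted on the training split only. This is a sketch under assumptions: the original notebook may have applied the scaler separately, and the dataset loader and seed are stand-ins.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42  # seed is an assumption
)

# Scaling statistics are learned from the training split only.
model = make_pipeline(StandardScaler(), GaussianNB())
model.fit(X_train, y_train)

# In scikit-learn's copy of the dataset, 0 = malignant and 1 = benign.
print(model.predict(X_test[:5]))
print(model.predict_proba(X_test[:5]).round(3))
```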

Evaluation metrics:

  • Confusion matrix and classification report were generated.
  • The model achieved high precision (0.97) and recall (0.90) on the test set, which suggests it generalizes well to held-out data; full scores are listed under Evaluation Results.
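
The two reports mentioned above can be produced along these lines. This is a sketch rather than the original evaluation code: the data loader, seed, and pipeline are assumptions, so the printed numbers need not match the scores reported in this card.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.20,
    stratify=data.target, random_state=42,  # seed is an assumption
)

model = make_pipeline(StandardScaler(), GaussianNB()).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes (0 = malignant, 1 = benign).
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall, and F1, plus overall accuracy.
print(classification_report(y_test, y_pred, target_names=data.target_names))
```

Because precision and recall are reported per class, it is worth stating which class (malignant or benign) a single headline figure refers to.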

Hyperparameters

Hyperparameter    Value
priors            None
var_smoothing     1e-09
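
The effect of these two defaults can be inspected on a fitted estimator. The sketch below fits on unscaled data from scikit-learn's bundled copy of the dataset purely for illustration; it is not the training setup of this model.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
clf = GaussianNB(priors=None, var_smoothing=1e-9).fit(X, y)

# With priors=None, the class priors are the class frequencies in the data.
print(clf.class_prior_)

# var_smoothing is relative: the value added to every per-class variance is
# var_smoothing times the largest feature variance in the training data.
print(clf.epsilon_, 1e-9 * X.var(axis=0).max())
```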

Model Plot

GaussianNB()

Evaluation Results

Metric       Score
Accuracy     0.95
Precision    0.97
Recall       0.90
F1-score     0.93
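
As a consistency check, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values reported above:

```latex
F_1 \;=\; \frac{2 \cdot P \cdot R}{P + R}
    \;=\; \frac{2 \times 0.97 \times 0.90}{0.97 + 0.90}
    \;=\; \frac{1.746}{1.87}
    \;\approx\; 0.93
```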

How to Get Started with the Model

Follow the instructions posted on the course website to download the model.

Example input: a single data sample (replace with your own)

    import pandas as pd

    sample = pd.DataFrame({
        "mean radius": [14.5],
        "mean texture": [20.5],
        "mean perimeter": [94.0],
        "mean area": [600.0],
        # include all 30 feature columns here, as in training
    })
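
To avoid typing all 30 column names by hand, a complete one-row sample can be built from the feature names in scikit-learn's bundled copy of the dataset. This is a sketch under assumptions: the column names match that copy, and the per-feature medians are placeholders to be replaced with real measurements. Loading the downloaded model is not shown because its file name is not given here.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer(as_frame=True)

# Start from the per-feature median as a placeholder for every column.
sample = data.data.median().to_frame().T

# Overwrite the measurements that are known for this sample.
sample = sample.assign(**{
    "mean radius": 14.5,
    "mean texture": 20.5,
    "mean perimeter": 94.0,
    "mean area": 600.0,
})

print(sample.shape)
# Apply the same scaling as in training before passing `sample`
# to the downloaded model's predict method.
```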

Model Card Authors

This model card was written by the following author:

Jad Alsassa

Model Card Contact

You can contact the model card author through the following channel: [email protected]

Citation

Wolberg, W., Mangasarian, O., Street, N., & Street, W. (1993). Breast Cancer Wisconsin (Diagnostic) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5DW2B.

