Model description
This model is a Gaussian Naive Bayes classifier trained on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. It predicts whether a breast tumor is malignant (cancerous) or benign (non-cancerous) based on 30 numerical features computed from a digitized image of a fine needle aspirate (FNA) of a breast mass.
Gaussian Naive Bayes was chosen because the features are continuous and are treated as approximately normally distributed within each class; the features were standardized before fitting. PCA was also used to visualize the structure and variance of the dataset, and the two-component projection suggested that the classes are fairly separable in reduced dimensions.
Intended uses & limitations
Intended use:
This model is for educational and research purposes only. It demonstrates binary classification, feature scaling, and probabilistic modeling using Gaussian Naive Bayes.
Not for clinical or real-world use.
The dataset comes from a medical diagnostic context, but this model has not been clinically validated or certified for healthcare applications.
Limitations:
- Naive Bayes assumes feature independence, which may not hold perfectly in real data.
- Performance depends heavily on data quality and class balance.
- Should not be used to make real patient predictions or medical decisions.
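The independence caveat can be checked directly. The following is a minimal sketch, assuming scikit-learn's bundled `load_breast_cancer` copy of the dataset stands in for the data used here; it measures how strongly two size-related features are correlated.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy of WDBC is used as a stand-in
data = load_breast_cancer(as_frame=True)
X = data.data

# Pearson correlation between two geometrically related features
corr = X["mean radius"].corr(X["mean perimeter"])
print(f"corr(mean radius, mean perimeter) = {corr:.3f}")
```

A correlation close to 1 means the two features carry nearly the same information, so Naive Bayes effectively counts that evidence twice. This tends to make predicted probabilities overconfident even when the predicted labels remain reasonable.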
Training Procedure
Training data:
Wisconsin Diagnostic Breast Cancer dataset, containing 569 samples and 30 numeric features (radius, texture, area, smoothness, etc.).
Preprocessing:
- Features were standardized using StandardScaler.
- PCA (2 components) was used for visualization only, not for model fitting.
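The preprocessing steps can be sketched as follows. This is an illustration only: it assumes scikit-learn's bundled `load_breast_cancer` data and fits the scaler on the full dataset, which is acceptable for a visualization but not for model evaluation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled copy of WDBC is used as a stand-in
X, y = load_breast_cancer(return_X_y=True)

# Standardize each feature to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components (visualization only)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)
print(pca.explained_variance_ratio_)
```

Plotting the two columns of `X_2d` colored by `y` gives the kind of 2D view described above; `explained_variance_ratio_` reports how much of the total variance each component retains.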
Training/testing split:
- 80% training / 20% testing (stratified)
- The training data was further split into subtrain and validation sets (60%/20% of the total, i.e. 75%/25% of the training portion).
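A two-stage split of this kind might look like the sketch below. It assumes the validation set is 25% of the training portion (20% of the total); the `random_state` value and the stratification of the second split are illustrative choices, not taken from the original training run.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled copy of WDBC is used as a stand-in
X, y = load_breast_cancer(return_X_y=True)

# Stage 1: 80% training / 20% testing, stratified on the label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0  # random_state is arbitrary
)

# Stage 2: split the training portion into subtrain / validation (75% / 25%)
X_sub, X_val, y_sub, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0
)

print(len(X_sub), len(X_val), len(X_test))
```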
Model:
- Gaussian Naive Bayes (default parameters)
- Fitted on scaled features.
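One way to fit a model like this is sketched below. Wrapping the scaler and classifier in a `Pipeline` is an assumption made here for convenience (the original may have scaled the data in a separate step); it ensures the scaler is fitted on training data only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled copy of WDBC is used as a stand-in
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0  # random_state is arbitrary
)

# Scale, then fit Gaussian Naive Bayes with default parameters
model = make_pipeline(StandardScaler(), GaussianNB())
model.fit(X_train, y_train)

# Class probabilities for the first test sample (columns follow model.classes_)
proba = model.predict_proba(X_test[:1])
print(model.classes_, proba)
```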
Evaluation metrics:
- Confusion matrix and classification report were generated.
- The model achieved high precision (0.97) and recall (0.90), suggesting good generalization on the held-out test set; see Evaluation Results.
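The evaluation step can be sketched as below, again assuming scikit-learn's bundled data and an arbitrary `random_state`, so the printed numbers will not necessarily match the reported scores. Note that in scikit-learn's encoding of this dataset, label 0 is malignant and label 1 is benign, which matters when deciding which class counts as "positive".

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled copy of WDBC is used as a stand-in
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.20, stratify=data.target, random_state=0
)

model = make_pipeline(StandardScaler(), GaussianNB()).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall and F1
print(classification_report(y_test, y_pred, target_names=data.target_names))
```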
Hyperparameters
| Hyperparameter | Value |
|---|---|
| priors | None |
| var_smoothing | 1e-09 |
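These values are scikit-learn's defaults for `GaussianNB`, which can be confirmed with a short check:

```python
from sklearn.naive_bayes import GaussianNB

# Default hyperparameters of an unconfigured GaussianNB estimator
params = GaussianNB().get_params()
print(params)
```

`priors=None` means class priors are estimated from the training label frequencies, and `var_smoothing` adds a small fraction of the largest feature variance to all variances for numerical stability.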
Model Plot
GaussianNB()
Evaluation Results
| Metric | Score |
|---|---|
| Accuracy | 0.95 |
| Precision | 0.97 |
| Recall | 0.90 |
| F1-score | 0.93 |
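As a quick consistency check, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values in the table:

```python
# Values taken from the Evaluation Results table
precision, recall = 0.97, 0.90

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))
```

The result rounds to 0.93, in agreement with the reported F1-score.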
How to Get Started with the Model
Follow the instructions posted on the course website to download the model.

```python
import pandas as pd

# Example input: a single data sample (replace with your own)
sample = pd.DataFrame({
    "mean radius": [14.5],
    "mean texture": [20.5],
    "mean perimeter": [94.0],
    "mean area": [600.0],
    # include all 30 feature columns here as in training
})
```
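Typing out all 30 columns by hand is error-prone. As a hedged alternative, assuming the model was trained with the same column names as scikit-learn's bundled `load_breast_cancer` data, a complete one-row input can be taken from that dataset and edited as needed:

```python
from sklearn.datasets import load_breast_cancer

# Assumption: training used the same 30 column names as scikit-learn's bundled data
data = load_breast_cancer(as_frame=True)

# Double brackets keep the result a one-row DataFrame rather than a Series
sample = data.data.iloc[[0]]

print(sample.shape)
print(list(sample.columns[:4]))
```

The resulting `sample` has the column order used by the bundled dataset and can be passed to the downloaded model's `predict` or `predict_proba` method once the model has been loaded.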
Model Card Authors
This model card was written by the following author:
Jad Alsassa
Model Card Contact
You can contact the model card author through the following channel: [email protected]
Citation
Wolberg, W., Mangasarian, O., Street, N., & Street, W. (1993). Breast Cancer Wisconsin (Diagnostic) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5DW2B.