Model Card
Model Card Authors
Mathew
Model Description
This is a linear regression model trained on the UCI Automobile dataset to predict the 'symboling' insurance risk rating from 17 car features, including price, horsepower, bore, and curb-weight, among other continuous variables. Symboling is an integer ranging from -3 to +3, with higher values indicating higher risk.
Intended Uses & Limitations
This model is for educational purposes only. It is not suitable for production use because the dataset is small (only about 200 entries), outdated (1980s), and contains many missing values (41 rows, around 20% of the total, have no normalized-losses entry). Although the missing data was imputed, predictions should not be used for real insurance decisions.
Training Data
Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/automobile). It contains ~200 cars with mixed numeric and categorical features. Missing values were imputed using MICE (Multiple Imputation by Chained Equations).
Evaluation Metrics
- R2: 0.603
- RMSE: 0.713
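For readers who want to reproduce these two metrics, the following is a minimal sketch of how R2 and RMSE are conventionally defined. The function name `r2_and_rmse` and the sample values are illustrative assumptions, not the card's actual evaluation code or data.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Return (R2, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical symboling values, for illustration only (not the card's data).
y_true = [3, 1, 0, -1, 2]
y_pred = [2.5, 1.0, 0.5, -1.0, 1.5]
r2, rmse = r2_and_rmse(y_true, y_pred)
```

Read this way, an R2 of 0.603 means the model explains about 60% of the variance in symboling on the evaluated data, and an RMSE of 0.713 is measured in the same units as the rating itself, which spans -3 to +3.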
Ethical Considerations
The 'symboling' risk value is determined not only by continuous variables but also by categorical ones, which the model does not account for. Although features such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use the brand and type of car (luxury, sport, etc.), along with a variety of other variables, to determine risk. Because the model does not take these variables into account, its predictions are very unreliable.
Audit Questions
- What features most strongly influence predictions?
- Are residuals randomly scattered or patterned?
- How reliable are the evaluation metrics?
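One rough way to approach the residuals question numerically is to compute the residuals and check whether they correlate with the predictions. The sketch below assumes held-out true and predicted values are available as plain lists; the helper names and sample values are hypothetical.

```python
import math


def residuals(y_true, y_pred):
    """Residual = actual value minus predicted value, per example."""
    return [t - p for t, p in zip(y_true, y_pred)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical held-out values, for illustration only (not the card's data).
y_true = [3, 1, 0, -1, 2]
y_pred = [2.5, 1.0, 0.5, -1.0, 1.5]
res = residuals(y_true, y_pred)
corr = pearson(y_pred, res)
```

A correlation far from zero suggests the errors grow or shrink with the predicted value. A value near zero does not rule out a curved or banded pattern, so this check complements the residuals plot and does not replace it.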
Coefficients
| Feature | Coefficient |
|---|---|
| price | -1.73704e-05 |
| highway-mpg | 0.0438076 |
| city-mpg | -0.0610687 |
| peak-rpm | -5.49499e-05 |
| horsepower | 0.00207246 |
| compression-ratio | 0.0187334 |
| stroke | -0.555667 |
| bore | -0.827261 |
| engine-size | 0.013724 |
| num-of-cylinders | -0.498651 |
| curb-weight | -5.04019e-05 |
| height | 0.0239754 |
| width | 0.195005 |
| length | 0.0120506 |
| wheel-base | -0.153431 |
| num-of-doors | -0.428882 |
| normalized-losses | 0.0116676 |
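To make the table concrete, here is a minimal sketch of how a linear regression turns these coefficients into a prediction: a weighted sum of the features plus an intercept. The intercept is not reported in this card, so it is a required argument here. The `to_rating` helper, and the assumption that num-of-doors and num-of-cylinders are encoded as numeric counts, are illustrative guesses and not confirmed by the card.

```python
# Coefficients copied from the table above.
COEFFICIENTS = {
    "price": -1.73704e-05,
    "highway-mpg": 0.0438076,
    "city-mpg": -0.0610687,
    "peak-rpm": -5.49499e-05,
    "horsepower": 0.00207246,
    "compression-ratio": 0.0187334,
    "stroke": -0.555667,
    "bore": -0.827261,
    "engine-size": 0.013724,
    "num-of-cylinders": -0.498651,
    "curb-weight": -5.04019e-05,
    "height": 0.0239754,
    "width": 0.195005,
    "length": 0.0120506,
    "wheel-base": -0.153431,
    "num-of-doors": -0.428882,
    "normalized-losses": 0.0116676,
}


def predict_symboling(features, intercept):
    """Return the raw (continuous) prediction: intercept + sum(coef * value).

    `intercept` is not given in the card and must be supplied by the caller.
    """
    missing = COEFFICIENTS.keys() - features.keys()
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return intercept + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )


def to_rating(raw):
    """Hypothetical post-processing: round to an integer and clip to [-3, 3]."""
    return max(-3, min(3, round(raw)))
```

The magnitudes in the table are not directly comparable across features, because each coefficient is expressed per unit of its own feature (price in dollars, bore in inches, and so on). A small coefficient on a wide-ranging feature can move the prediction more than a large coefficient on a narrow-ranging one.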
Plots
Predicted vs Actual
Residuals Plot

