Sensitivity of Machine Learning Models to Sampling Variability

Department

Computer Science and Cybersecurity

Document Type

Poster

Abstract

Machine learning is increasingly used in healthcare for disease prediction, but small datasets pose challenges. With limited samples, the choice of sampling strategy (how the data are divided for training and validation) can significantly affect reported model performance. Two common approaches, K-Fold Cross-Validation and Repeated Random Sub-sampling, may yield different results for the same model. Understanding which algorithms are most sensitive to this variability is critical for clinical applications where consistent, reliable predictions matter. This study evaluates the sensitivity of three classification models (K-Nearest Neighbors, Gaussian Naive Bayes, and Neural Networks) to sampling strategy using the Heart Disease dataset.
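The difference between the two sampling strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, a small synthetic two-class dataset in place of the Heart Disease dataset, and a simple K-Nearest Neighbors classifier as the only model. The function names, the 5 folds, the 5 repeats, the 20% test fraction, and k=3 neighbors are all assumptions chosen for illustration.

```python
import random
import statistics


def k_fold_splits(n, k, rng):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def random_subsampling_splits(n, repeats, test_fraction, rng):
    """Yield (train, test) index lists from independent random splits.

    Unlike k-fold, test sets from different repeats may overlap.
    """
    n_test = max(1, round(n * test_fraction))
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def accuracies(X, y, splits):
    """Return one test accuracy per (train, test) split."""
    scores = []
    for train, test in splits:
        tX = [X[i] for i in train]
        ty = [y[i] for i in train]
        correct = sum(knn_predict(tX, ty, X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return scores


def make_toy_data(n, rng):
    """Synthetic two-class data (an assumption; NOT the Heart Disease dataset)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(label * 1.5, 1.0), rng.gauss(label * 1.5, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(60, rng)
kfold_scores = accuracies(X, y, k_fold_splits(len(X), 5, rng))
subsample_scores = accuracies(X, y, random_subsampling_splits(len(X), 5, 0.2, rng))

for name, scores in [("k-fold", kfold_scores), ("random sub-sampling", subsample_scores)]:
    print(f"{name}: mean={statistics.mean(scores):.3f} sd={statistics.stdev(scores):.3f}")
```

Comparing the mean and standard deviation of the per-split accuracies under each strategy is one simple way to quantify how much a model's reported performance depends on the sampling strategy; the study's actual sensitivity measure may differ.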

Publication Date

Spring 4-9-2026

Comments

Spring 2026: Student Research Conference

Distinguished Presenter Award: Amina Mohamud

Excellence in Knowledge Sharing Award: Amina Mohamud

Recommended Citation

Haxton, H., Mohamud, A., Reitz, S. & Perez-Villa, I. (2026, April 9). Sensitivity of machine learning models to sampling variability [Poster presentation]. Student Research Conference Spring 2026, Saint Paul, MN, United States. https://metroworks.metrostate.edu/student-scholarship/37

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Student Scholarship
