This paper examines a critical question in the application of machine learning models in the social sciences: the use of performance metrics to evaluate models on binary classification tasks. More specifically, it investigates how sample data imbalance as measured by prevalence level...