Prepare for the Society of Actuaries PA Exam with our comprehensive quizzes. Our interactive questions and detailed explanations are designed to help guide you through the exam process with confidence.

Each practice test/flash card set has 50 randomly selected questions from a bank of over 500. You'll get a new set of questions each time!

What effect do undersampling and oversampling have on predictions?

  1. They worsen the predictions for the minority class

  2. They balance the performance between minority and majority classes

  3. They make predictions more unreliable

  4. They have no effect on the predictions

The correct answer is: They balance the performance between minority and majority classes

The correct answer reflects how both undersampling and oversampling address class imbalance in a dataset. When the target distribution is skewed, so that one class (the minority class) is far less represented than the others, these techniques rebalance the class proportions in the data used to fit the model.

Oversampling increases the number of instances in the minority class, either by duplicating existing instances or by creating synthetic data points. This gives the minority class more weight during fitting, which helps the model learn its characteristics and predict that class more accurately.

Undersampling, on the other hand, reduces the number of instances in the majority class. This helps prevent the model from being biased toward the majority class and improves its ability to recognize patterns from the minority class.

Both methods aim to create a more balanced dataset so that the model performs well across all classes and does not disproportionately favor the majority class. As a result, predictive performance improves, especially for the minority class, which is crucial in applications such as fraud detection or disease diagnosis, where identifying rare events is vital.
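
To make the two techniques concrete, here is a minimal Python sketch of random oversampling and random undersampling on a toy dataset. It is an illustration only: the helper names (`random_oversample`, `random_undersample`, `class_counts`), the 95/5 class split, and the choice to balance the classes to exactly equal sizes are assumptions made for this example, not part of the exam material. In practice, resampling is applied to the training data only, so that the test data keeps its original class proportions.

```python
import random
from collections import Counter

def class_counts(rows):
    """Count how many rows carry each label; each row is (features, label)."""
    return Counter(label for _, label in rows)

def random_oversample(rows, minority_label, rng):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    minority = [r for r in rows if r[1] == minority_label]
    majority = [r for r in rows if r[1] != minority_label]
    # Sample WITH replacement, since we need more rows than the minority class has.
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra

def random_undersample(rows, minority_label, rng):
    """Keep a random subset of majority rows equal in size to the minority class."""
    minority = [r for r in rows if r[1] == minority_label]
    majority = [r for r in rows if r[1] != minority_label]
    # Sample WITHOUT replacement, so no majority row is kept twice.
    kept = rng.sample(majority, k=len(minority))
    return kept + minority

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical imbalanced training set: 95 majority rows (label 0), 5 minority rows (label 1).
train = [((rng.random(),), 0) for _ in range(95)] + [((rng.random(),), 1) for _ in range(5)]

oversampled = random_oversample(train, minority_label=1, rng=rng)
undersampled = random_undersample(train, minority_label=1, rng=rng)

print(class_counts(train))         # 95 rows of class 0, 5 rows of class 1
print(class_counts(oversampled))   # 95 rows of each class
print(class_counts(undersampled))  # 5 rows of each class
```

Note the trade-off the sketch exposes: oversampling keeps every original row but repeats minority rows, while undersampling discards 90 of the 95 majority rows.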