Channel: Trending Questions - Cross Validated Meta

Class Imbalance in machine learning (again) [closed]

Over the last few days, I have been trying to write a "canonical Q&A" on the class imbalance "problem" for Data Science Stack Exchange, because there is so much confusion about the topic over there. I think @Dave will know what I mean.

As many here will know, on CV we have some great threads on this topic, including these:

What is the root cause of the class imbalance problem?

When is unbalanced data really a problem in Machine Learning?

Area under the ROC curve when there is imbalance: is there a problem, and if not, why does this rumor exist?

Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?

Brier Score and extreme class imbalance

There are so many great answers and comments, and happily a reasonable consensus has formed. I wanted to attribute the ideas to the right people, and I have tried to do so.

I'm sure that I have missed things, and if anyone would like me to add their contributions, please let me know!

Here is the link to the Data Science SE thread: Is class imbalance really a problem in machine learning?

