| Algorithm | Strengths | Weaknesses |
|---|---|---|
| k-nearest neighbors | • Makes no assumptions about the underlying data distribution | • Does not produce a model, limiting the ability to understand how the features are related to the class<br>• If one class has many more samples than the others, the dominant class tends to control the classification and cause misclassification |
| Naive Bayes | • Requires relatively few examples for training | • Relies on an often-faulty assumption of equally important and independent features<br>• Not ideal for datasets with many numeric features |
| Decision tree | • Can be used on small datasets<br>• Model is easy to interpret | • Easy to overfit or underfit the model<br>• Small changes in the training data can result in large changes to the decision logic |
| Neural network | • Conceptually similar to human neural function<br>• Capable of modeling more complex patterns | • Very prone to overfitting the training data<br>• Susceptible to multicollinearity |
| Support vector machines | • High accuracy; not overly influenced by noisy data and not very prone to overfitting<br>• Easier for users because several well-supported SVM implementations exist<br>• Among the most commonly used classifiers | • Finding the best model requires testing various combinations of kernels and model parameters |
| Random forest | • Can handle noisy or missing data<br>• Suitable for class-imbalance problems | • The model is not easily interpretable |
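
To see these trade-offs side by side in practice, the sketch below cross-validates each of the listed classifiers on a small benchmark dataset. This is a minimal sketch, assuming scikit-learn is available; the dataset (`load_breast_cancer`), the hyperparameters, and the fold count are illustrative choices, not part of the original comparison.

```python
# Minimal sketch: cross-validating the classifiers from the table above.
# Assumes scikit-learn; dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Neural network": MLPClassifier(max_iter=2000, random_state=0),
    "Support vector machine": SVC(kernel="rbf", C=1.0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    # Standardize features inside the pipeline: kNN, SVMs, and neural
    # networks are scale-sensitive; tree-based models are unaffected.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(f"{name:24s} accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Putting every model behind the same scaling pipeline keeps the comparison fair, and the per-fold standard deviation hints at the sensitivity issues noted in the table, such as the decision tree's instability under small changes in the training data.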