Table 3 Comparison of strengths and weaknesses of machine learning algorithms in electronic nose studies

From: Diagnosis of ventilator-associated pneumonia using electronic nose sensor array signals: solutions to improve the application of machine learning in respiratory research

 

k-nearest neighbors

Strengths
• Makes no assumptions about the underlying data distribution

Weaknesses
• Does not produce a model, limiting the ability to understand how the features are related to the class
• If one class has many more samples than the other, the dominant class will control the classification and cause misclassification

Naive Bayes

Strengths
• Requires relatively few examples for training

Weaknesses
• Relies on an often-faulty assumption of equally important and independent features
• Not ideal for datasets with many numeric features

Decision tree

Strengths
• Can be used on small datasets
• The model is easy to interpret

Weaknesses
• Easy to overfit or underfit the model
• Small changes in the training data can result in large changes to the decision logic

Neural network

Strengths
• Conceptually similar to human neural function
• Capable of modeling more complex patterns

Weaknesses
• Very prone to overfitting the training data
• Susceptible to multicollinearity

Support vector machines

Strengths
• High accuracy; not overly influenced by noisy data and not very prone to overfitting
• Easier for users because several well-supported SVM implementations exist
• Most commonly used

Weaknesses
• Finding the best model requires testing various combinations of kernels and model parameters

Random forest

Strengths
• Can handle noisy or missing data
• Suitable for class imbalance problems

Weaknesses
• The model is not easily interpretable

  1. Summarized from [27, 53, 54]
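
The table contrasts the algorithms at a conceptual level. As a concrete point of reference, the sketch below shows how such a comparison is commonly set up with scikit-learn. Everything in it (the synthetic stand-in for sensor-array features, the class-imbalance ratio, and all hyperparameters) is an assumption for illustration only and is not taken from the study's analysis pipeline.

```python
# Minimal, illustrative sketch (not the paper's pipeline): cross-validated
# comparison of the six classifier families in Table 3, using synthetic data
# as a stand-in for electronic-nose sensor-array features.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Assumed dataset shape: 200 samples, 32 sensor features, mildly imbalanced
# classes (e.g., VAP vs. non-VAP).
X, y = make_classification(n_samples=200, n_features=32, n_informative=10,
                           weights=[0.7, 0.3], random_state=0)

# Hyperparameters below are illustrative defaults, not tuned values.
models = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                                    random_state=0),
    "Support vector machine": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

for name, model in models.items():
    # Feature scaling matters for distance- and margin-based methods
    # (kNN, SVM, neural network); it is harmless for the tree-based models.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc")
    print(f"{name:>25s}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```

In practice, the SVM and neural network entries would normally be wrapped in a hyperparameter search (e.g., over kernels, C, gamma, or hidden-layer sizes), which reflects the tuning burden noted in the table.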