The following pages and posts are tagged with

Title Type Excerpt
AdaBoost Page * Short for Adaptive Boosting, AdaBoost is another ensemble algorithm. ## Differences with decision trees * In random forests, we build complete trees each time, but in AdaBoost, each tree consists of only one node and two leaves (which is called a stump). * Another difference from RF is that...
Decision trees Page * Decision trees are one of the most popular ML algorithms. * They can be used for regression and classification. * They categorize data in a way similar to human thinking. * Therefore, they are also easy to understand and interpret. * Succinctly, in a decision tree, each node represents...
Gradient-boosted trees Page ## Gradient-boosted trees * This algorithm can also be used for regression and classification. * It builds trees one after another, each new tree correcting the errors of the previous one. * It involves no randomization by default. * It uses shallow trees (maximum depth about 5). Therefore, it requires less...
K-Nearest Neighbors (KNN) Page * KNN is a nonparametric learning algorithm, i.e., it does not make any assumptions about the structure of the data. * It is a distance-based majority-vote algorithm. It checks the categories of the k nearest neighbors of a data sample, and assigns the sample to the category of the majority of...
Logistic Regression Page ## Rationale * Logistic regression works similarly to linear regression, except that the outcome is binary. * We need a different function than in linear regression: a function that, given any input, produces an output between 0 and 1. * The equation for the binary decision used in logistic regression is: $$...
Naive Bayes Page * Naive Bayes is a probabilistic algorithm that treats the features as _independent_, hence the term naive. * Generally, it is similar to linear models, but it is faster to train and generalizes worse. * There are three kinds of NB: Gaussian, Bernoulli, and multinomial. *...
Neural Networks Page ## What is a neural network * A "neuron" in a neural network is a function. It accepts some inputs, applies some calculations to them, and then returns a single number. * For regression...
Random Forests Page ## Ensemble algorithms * Random forests are an ensemble algorithm. * Ensemble algorithms combine multiple ML algorithms to create a more powerful one. * In competitions, ensemble algorithms are usually the winners. * The two most common ensemble algorithms are _random forests_ and _gradient boosted decision trees_. ## Random forests * A...
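
The KNN entry in the list above describes a distance-based majority vote. A minimal sketch of that idea in plain Python is shown here. It assumes Euclidean distance, and the helper name `knn_predict` and the toy data are hypothetical, since the excerpt specifies none of them.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Assumption: Euclidean distance; the excerpt does not name a metric.
    """
    # Distance from the query to every training point, paired with its label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the labels among those neighbors and return the most frequent one
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
```

In this sketch a tied vote is resolved in favor of the label encountered first among the nearest neighbors; choosing an odd `k` for two-class problems is a common way to avoid such ties.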