
# TOP MACHINE LEARNING ALGORITHMS YOU SHOULD KNOW

The top 10 Machine Learning Algorithms are as follows:

1. Linear Regression
2. Logistic Regression
3. Linear Discriminant Analysis
4. Classification and Regression Trees
5. Naive Bayes
6. K-Nearest Neighbours (KNN)
7. Learning Vector Quantization (LVQ)
8. Support Vector Machines (SVM)
9. Bagging and Random Forest
10. Boosting and AdaBoost

### 1 )  Linear Regression

• Linear Regression is one of the most well-known and well-understood algorithms in Machine Learning.
• The model predicts an output by finding the coefficients that minimize the prediction error on the training data.

Let us consider an example with a single input variable:

y = B0 + B1 * x

where y is the output variable, x is the input variable, and B0 and B1 are the coefficients (the intercept and the slope).

Given an input x, the model predicts y. The goal of linear regression is to find the values of the coefficients B0 and B1.
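
To make this concrete, here is a minimal sketch of how B0 and B1 could be found for one input variable using the ordinary least squares formulas. The function name `fit_simple_linear_regression` and the toy data are assumptions made for illustration; real projects would normally rely on a library.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing squared error for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


# Toy data generated from y = 1 + 2 * x (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(xs, ys)
prediction = b0 + b1 * 5.0  # predict y for a new input x = 5
```

Because the toy data lies exactly on a line, the fitted coefficients recover the intercept and slope used to generate it.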

### 2) Logistic Regression

• Logistic regression is the go-to method for binary classification problems (problems with two class values).
• Like linear regression, it works by finding the values of the coefficients, but here the output is predicted using a non-linear function called the logistic function.
• The logistic function looks like a big S and will transform any value into the range 0 to 1.
• This is useful because we can apply a rule to the output of the logistic function to snap values to 0 or 1 (e.g. IF less than 0.5 THEN output 0, otherwise output 1) and predict a class value.
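
As a rough sketch of the idea, the snippet below applies the logistic function to a linear combination of the input and then thresholds the result at 0.5. The coefficient values and the helper names `logistic` and `predict_class` are assumptions for illustration; in practice the coefficients are learned from training data.

```python
import math


def logistic(z):
    """Map any real value into the range 0 to 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_class(x, b0, b1, threshold=0.5):
    """Snap the logistic output to class 0 or class 1."""
    probability = logistic(b0 + b1 * x)
    return 1 if probability >= threshold else 0


# Hypothetical coefficients, chosen by hand for the example.
b0, b1 = -1.0, 1.0
low = predict_class(0.0, b0, b1)   # logistic(-1) is below 0.5
high = predict_class(2.0, b0, b1)  # logistic(1) is above 0.5
```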

### 3)  Linear Discriminant Analysis

• Logistic regression is a classification algorithm traditionally limited to two-class classification problems.
• If there are more than two classes, Linear Discriminant Analysis (LDA) is the preferred linear classification technique.
• The representation of LDA is pretty straightforward. It consists of statistical properties of your data, calculated for each class. For a single input variable, this includes:
• The mean value for each class.
• The variance calculated across all classes.
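
The sketch below shows one way these statistics could be computed and used for a single input variable. It assumes the standard LDA discriminant function, which also uses the class priors (the fraction of training points in each class); the function names and the toy data are made up for the example.

```python
import math


def fit_lda_1d(xs, labels):
    """Estimate per-class means, a pooled variance, and class priors."""
    classes = sorted(set(labels))
    n = len(xs)
    means, priors = {}, {}
    for c in classes:
        members = [x for x, label in zip(xs, labels) if label == c]
        means[c] = sum(members) / len(members)
        priors[c] = len(members) / n
    # One variance shared by all classes, measured around each class mean.
    squared = sum((x - means[label]) ** 2 for x, label in zip(xs, labels))
    variance = squared / (n - len(classes))
    return means, variance, priors


def predict_lda_1d(x, means, variance, priors):
    """Return the class with the largest discriminant value."""
    def score(c):
        return (x * means[c] / variance
                - means[c] ** 2 / (2 * variance)
                + math.log(priors[c]))
    return max(means, key=score)


# Toy data: class 0 clusters around 2, class 1 around 8.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
means, variance, priors = fit_lda_1d(xs, labels)
```

The same code works unchanged with more than two classes, since each class simply contributes its own mean and prior.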

### 4 )  Classification and Regression Trees

•  The representation of the decision tree model is a binary tree.
•  Each node represents a single input variable (x) and a split point on that variable.
•  The leaf nodes of the tree contain an output variable (y) which is used to make a prediction. Predictions are made by walking the splits of the tree until arriving at a leaf node and outputting the class value at that leaf node.
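
Walking the splits can be sketched in a few lines. The tree below is a hand-written, hypothetical example stored as nested dictionaries (the keys `feature`, `split`, `left`, `right`, and `value` are assumptions for this sketch); a real CART implementation would learn the splits from data.

```python
def predict(node, row):
    """Follow the splits from the root until a leaf is reached."""
    while "value" not in node:
        if row[node["feature"]] < node["split"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["value"]


# Hypothetical tree: each internal node tests one input variable.
tree = {
    "feature": "x1", "split": 5.0,
    "left": {
        "feature": "x2", "split": 2.0,
        "left": {"value": "A"},
        "right": {"value": "B"},
    },
    "right": {"value": "B"},
}

first = predict(tree, {"x1": 3.0, "x2": 1.0})   # left, then left
second = predict(tree, {"x1": 3.0, "x2": 4.0})  # left, then right
third = predict(tree, {"x1": 7.0, "x2": 1.0})   # right at the root
```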

### 5 )  Naive Bayes

•  Naive Bayes is a simple but popular algorithm for predictive modeling.
•  The model consists of two types of probabilities that can be calculated directly from your training data:

1) The probability of each class;

2) The conditional probability of each x value given each class.

•  When your data is real-valued it is common to assume a Gaussian distribution so that you can easily estimate these probabilities.
•  Naive Bayes is called naive because it assumes that each input variable is independent of the others, given the class.
•  This is a strong assumption that rarely holds for real data; nevertheless, the technique is effective on a wide range of complex problems.
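
Here is a minimal sketch of Gaussian Naive Bayes for real-valued inputs. It stores the class probability plus a mean and variance per input variable for each class, then multiplies the probabilities together (done as a sum of logarithms to avoid underflow). The function names and toy data are assumptions, and the sketch assumes every class has at least two rows and no zero-variance column.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Return {class: (prior, [(mean, variance), ...])}."""
    model = {}
    for c in set(labels):
        members = [row for row, label in zip(rows, labels) if label == c]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / (len(column) - 1)
            stats.append((mean, var))
        model[c] = (len(members) / len(rows), stats)
    return model


def log_gaussian(x, mean, var):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior plus log likelihoods."""
    def log_posterior(c):
        prior, stats = model[c]
        return math.log(prior) + sum(
            log_gaussian(v, mean, var) for v, (mean, var) in zip(row, stats)
        )
    return max(model, key=log_posterior)


# Toy data with two input variables and two classes.
rows = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.0, 4.1), (3.2, 3.9), (2.8, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(rows, labels)
```

Summing the per-variable log likelihoods is exactly where the independence assumption enters: each input variable is treated separately.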

### 6 ) K-Nearest Neighbours

•  Predictions are made for a new data point by searching through the entire training set for the K most similar instances (the neighbours) and summarizing the output variable for those K instances.
•  The similarity between two data instances can be calculated directly from the difference in each input value, for example with the Euclidean distance when the inputs share the same scale.
• KNN requires a lot of memory to store all of the data, but it only performs a calculation when a prediction is needed (just in time).
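
The procedure can be sketched as follows for classification, assuming Euclidean distance and a majority vote over the K neighbours. The function name `knn_predict` and the toy data are illustrative; `math.dist` needs Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours."""
    # Nothing is learned ahead of time: every prediction scans all rows.
    by_distance = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated groups.
train_rows = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]
near_a = knn_predict(train_rows, train_labels, (1.5, 1.5))
near_b = knn_predict(train_rows, train_labels, (6.5, 6.5))
```

For a regression problem, the vote would typically be replaced by the mean of the neighbours' output values.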

### 7 )  Learning Vector Quantization

• Learning Vector Quantization is an artificial neural network algorithm that lets you choose how many training instances to hang onto and learns exactly what those instances should look like.
• The representation of LVQ is a collection of codebook vectors.
• To make a prediction, the most similar codebook vector is found by calculating the distance between each codebook vector and the new data instance, and its class is returned.
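
A small sketch of both steps is shown below: prediction by nearest codebook vector, and a single training update. The update follows the basic LVQ1 rule (move the closest codebook vector toward the training instance if the classes match, away from it otherwise), which is an assumption here since the text above does not spell out the training procedure. The dictionary layout and function names are also illustrative.

```python
import math


def best_matching_unit(codebooks, row):
    """Return the codebook entry closest to `row`."""
    return min(codebooks, key=lambda cb: math.dist(cb["vector"], row))


def lvq_predict(codebooks, row):
    """Predict the class of the most similar codebook vector."""
    return best_matching_unit(codebooks, row)["label"]


def lvq1_update(codebooks, row, label, learning_rate=0.1):
    """Apply one LVQ1 training step in place."""
    bmu = best_matching_unit(codebooks, row)
    # Attract on a class match, repel on a mismatch.
    sign = 1.0 if bmu["label"] == label else -1.0
    bmu["vector"] = [
        w + sign * learning_rate * (x - w)
        for w, x in zip(bmu["vector"], row)
    ]


# Two hypothetical codebook vectors, one per class.
codebooks = [
    {"vector": [0.0, 0.0], "label": "a"},
    {"vector": [5.0, 5.0], "label": "b"},
]
before = lvq_predict(codebooks, [1.0, 1.0])
lvq1_update(codebooks, [1.0, 1.0], "a", learning_rate=0.5)
```

After the update, the first codebook vector has moved halfway toward the training instance.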

### 8 ) Support Vector Machines

•       In SVM, a hyperplane is used to separate the points in the input variable space by their class, either class 0 or class 1.
•       The distance between the hyperplane and the closest data points is referred to as the margin.
•       The best or optimal hyperplane that can separate the two classes is the one that has the largest margin.
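
The snippet below does not train an SVM; it only illustrates the margin idea by measuring the margin of two hand-picked candidate hyperplanes, written as w·x + b = 0. The weights, offsets, points, and function names are assumptions for the example.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (dot + b) / norm


def classify(w, b, x):
    """Class 1 on the positive side of the hyperplane, class 0 otherwise."""
    return 1 if signed_distance(w, b, x) >= 0 else 0


def margin(w, b, points):
    """Distance from the hyperplane to the closest data point."""
    return min(abs(signed_distance(w, b, p)) for p in points)


# Toy points: the first two belong to class 0, the last two to class 1.
points = [(1, 1), (2, 0), (4, 4), (5, 3)]
w = (1, 1)
wide = margin(w, -5, points)    # hyperplane x1 + x2 = 5
narrow = margin(w, -3, points)  # hyperplane x1 + x2 = 3
```

Both hyperplanes separate the two classes, but the first sits further from the closest points, so it is the better choice by the largest-margin criterion.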

### 9 )  Bagging and Random Forest

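•  The bootstrap is a powerful statistical method for estimating a quantity, such as a mean, from a data sample: you draw many samples from your data with replacement, calculate the quantity on each sample, and average the results.
•  In bagging, the same approach is used, but for estimating entire statistical models, most commonly decision trees, instead of a single quantity. Each model is trained on a different sample of the training data and their predictions are combined.
•  Random Forest is a variation on bagging in which each decision tree considers only a random subset of the input variables at each split, which makes the trees less similar to one another.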
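
The sketch below illustrates both ideas: a bootstrap estimate of a mean, and bagging with a majority vote. To keep it short, the base model is a hypothetical one-variable decision stump (`fit_stump`) rather than a full decision tree; all names, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def bootstrap_mean(values, n_resamples=200, seed=0):
    """Average the means of many bootstrap samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        sample = bootstrap_sample(values, rng)
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)


def fit_stump(pairs):
    """Hypothetical base model: threshold halfway between the class means."""
    zeros = [x for x, y in pairs if y == 0]
    ones = [x for x, y in pairs if y == 1]
    if not zeros or not ones:  # the sample happened to contain one class
        constant = 1 if ones else 0
        return lambda x: constant
    mean0, mean1 = sum(zeros) / len(zeros), sum(ones) / len(ones)
    threshold = (mean0 + mean1) / 2
    ones_above = mean1 > mean0
    return lambda x: int((x >= threshold) == ones_above)


def bagging_fit(pairs, n_models=25, seed=0):
    """Train one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(pairs, rng)) for _ in range(n_models)]


def bagging_predict(models, x):
    """Combine the models by majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


estimate = bootstrap_mean([2.0, 4.0, 6.0, 8.0])
pairs = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
models = bagging_fit(pairs)
```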

### 10 ) Boosting and AdaBoost

•  Boosting is an ensemble technique that attempts to create a strong classifier from a number of weak classifiers.
•  This is done by building a model from the training data, then creating a second model that attempts to correct the errors of the first model, and so on for each model that is added.
•  AdaBoost was the first really successful boosting algorithm developed for binary classification.
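
The following is a compact sketch of AdaBoost for binary classification with one input variable. It assumes labels encoded as -1 and +1 and uses decision stumps as the weak classifiers; the function names and the toy data are illustrative. After each round, misclassified points receive more weight so that the next stump concentrates on them.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: `polarity` at or above the threshold, else its opposite."""
    return polarity if x >= threshold else -polarity


def fit_weighted_stump(xs, ys, weights):
    """Return (error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_fit(xs, ys, n_rounds=3):
    """Return a list of (alpha, threshold, polarity) weak classifiers."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(n_rounds):
        error, threshold, polarity = fit_weighted_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of mistakes, lower the weight of correct points.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of all weak classifiers."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can classify perfectly.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```

On this toy data the best single stump still misclassifies two of the six points, while the weighted vote of three stumps classifies all of them correctly.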