This repository contains comprehensive notes on machine learning, covering topics from basic principles to advanced algorithms.
- Hypothesis Space
- Bayes Classifier
- Linear Regression
- Generalized Linear Regression
- Non-parametric Density Estimation
- Parzen Window Estimate
- K-Nearest Neighbour (KNN)
- Linear Discriminant Analysis (LDA)
- Support Vector Machine (SVM)
- Neural Networks
- Backpropagation
- Decision Trees
- Ensemble Learning
- Bagging and Random Forest
- Boosting
- XGBoost
- Principal Component Analysis (PCA)
- K-means Clustering
- Expectation Maximization (EM) Algorithm
- Miscellaneous Machine Learning Terms
The hypothesis space is defined as the set of all possible hypothesis functions that map feature vectors to labels. It's represented as:
H = {h : X → Y}
The Bayes classifier is defined as:
h*(x) = argmax_{y ∈ Y} P(Y = y | X = x)
It can be shown to be optimal under the 0-1 loss function, i.e., it minimizes the probability of misclassification.
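For concreteness, here is a minimal sketch (not from the notes) that evaluates the Bayes classifier on a made-up discrete joint distribution:

```python
import numpy as np

# Hypothetical joint distribution P(X = x, Y = y) over 3 feature values and 2
# labels. Rows index x in {0, 1, 2}; columns index y in {0, 1}. Values are
# purely illustrative.
joint = np.array([[0.10, 0.20],
                  [0.25, 0.05],
                  [0.15, 0.25]])

def bayes_classifier(x):
    """Return argmax_y P(Y = y | X = x).

    Since P(Y = y | X = x) = P(X = x, Y = y) / P(X = x) and the denominator
    does not depend on y, maximizing the joint probability suffices.
    """
    return int(np.argmax(joint[x]))

for x in range(3):
    print(f"h*({x}) = {bayes_classifier(x)}")
```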
Linear regression models the relationship between input features and output as a linear function. The notes cover the formulation, ideal regressor derivation, and the closed-form solution.
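As an illustration, a minimal NumPy sketch of the closed-form (normal-equation) solution on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))           # 100 samples, 2 features
y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so the first weight acts as the intercept.
X_design = np.hstack([np.ones((100, 1)), X])

# Closed-form solution w = (X^T X)^(-1) X^T y; solve() is used instead of an
# explicit matrix inverse for numerical stability.
w = np.linalg.solve(X_design.T @ X_design, X_design.T @ y)
print(w)   # approximately [3.0, 1.5, -2.0]
```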
This extends linear regression by mapping the inputs into a higher-dimensional feature space before fitting a linear model, allowing more complex relationships to be captured.
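A sketch of this idea with a polynomial basis expansion (the degree and data below are arbitrary choices for illustration):

```python
import numpy as np

def polynomial_features(x, degree):
    """Map a 1-D input to the basis [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=50)
y = np.sin(np.pi * x) + rng.normal(scale=0.1, size=50)

# Ordinary linear regression in the expanded feature space fits a
# non-linear curve in the original input x.
Phi = polynomial_features(x, degree=5)
w = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
print(w)
```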
Non-parametric density estimation techniques estimate the probability density function directly from the data without assuming a specific functional form.
Also known as kernel density estimation, this method places a window (kernel) function at each data point and averages them to estimate the probability density function.
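A minimal sketch of a 1-D Gaussian Parzen window estimate (the bandwidth `h` is an assumed hyperparameter):

```python
import numpy as np

def parzen_density(x, samples, h):
    """Estimate p(x) as the average of Gaussian kernels centered at the samples:
    p_hat(x) = (1/n) * sum_i N(x; x_i, h^2).
    """
    z = (x - samples[:, None]) / h                       # shape (n, len(x))
    kernels = np.exp(-0.5 * z**2) / (h * np.sqrt(2 * np.pi))
    return kernels.mean(axis=0)

rng = np.random.default_rng(2)
samples = rng.normal(loc=0.0, scale=1.0, size=200)
grid = np.linspace(-4, 4, 9)
print(parzen_density(grid, samples, h=0.5))
```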
KNN is a non-parametric method used for classification and regression. The algorithm and its formulation are explained in detail.
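A minimal sketch of KNN classification with Euclidean distance and majority voting (`k` and the toy data are placeholders):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(distances)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.8, 0.9])))   # -> 1
```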
LDA is explained from a Bayesian perspective, including the derivation of the decision boundary.
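As a rough sketch of where that derivation lands: with Gaussian class conditionals sharing one covariance matrix, the decision boundary is linear, `w^T x + b = 0`. The pooled-covariance estimate below is one common choice, not necessarily the exact formulation used in the notes:

```python
import numpy as np

def lda_boundary(X0, X1):
    """Two-class LDA from the Bayesian view: shared-covariance Gaussians
    yield w = Sigma^(-1)(mu1 - mu0) and a bias from the quadratic terms
    plus the log prior ratio."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled (shared) covariance estimate.
    cov = (np.cov(X0, rowvar=False) * (n0 - 1)
           + np.cov(X1, rowvar=False) * (n1 - 1)) / (n0 + n1 - 2)
    cov_inv = np.linalg.inv(cov)
    w = cov_inv @ (mu1 - mu0)
    b = -0.5 * (mu1 @ cov_inv @ mu1 - mu0 @ cov_inv @ mu0) + np.log(n1 / n0)
    return w, b   # predict class 1 when w @ x + b > 0

rng = np.random.default_rng(6)
X0 = rng.normal([-1, -1], 0.5, size=(100, 2))
X1 = rng.normal([1, 1], 0.5, size=(100, 2))
w, b = lda_boundary(X0, X1)
print(w @ np.array([1.0, 1.0]) + b > 0)   # True: classified as class 1
```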
The notes cover SVM for both linearly separable and non-linearly separable data, as well as the kernel trick for handling non-linear decision boundaries.
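To illustrate the kernel trick, a sketch of the RBF (Gaussian) kernel matrix, which computes feature-space inner products without ever constructing the feature map (`gamma` is an assumed hyperparameter):

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """K[i, j] = exp(-gamma * ||x_i - z_j||^2), the Gaussian/RBF kernel.

    This equals an inner product <phi(x_i), phi(z_j)> in an
    infinite-dimensional feature space that is never built explicitly.
    """
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Z**2, axis=1)[None, :]
                - 2 * X @ Z.T)
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(rbf_kernel(X, X))
```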
The notes provide a mathematical formulation of neural networks and explain the importance of non-linear activation functions.
A detailed derivation of the backpropagation algorithm used for training neural networks is provided.
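A compact sketch of backpropagation for a one-hidden-layer network trained with squared error (the architecture, data, and learning rate are illustrative, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(16, 2))                 # batch of 16 inputs
y = (X[:, :1] * X[:, 1:] > 0).astype(float)  # toy XOR-like targets, shape (16, 1)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.1

for _ in range(1000):
    # Forward pass with tanh hidden activations.
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    # Backward pass: apply the chain rule layer by layer.
    d_out = 2 * (y_hat - y) / len(X)         # dL/dy_hat for mean squared error
    dW2, db2 = h.T @ d_out, d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * (1 - h**2)        # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ d_h, d_h.sum(axis=0)
    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(float(np.mean((y_hat - y) ** 2)))      # final training MSE
```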
The notes cover how decision trees work, including the growing and pruning processes, and metrics like Gini Impurity and Mean Squared Error.
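A sketch of the Gini impurity computation used when scoring candidate splits (integer class labels are assumed):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: 1 - sum_k p_k^2, where p_k is the fraction of class k.
    It is 0 for a pure node and maximal for a uniform class mix."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p**2)

print(gini_impurity(np.array([0, 0, 0, 0])))   # 0.0 (pure node)
print(gini_impurity(np.array([0, 1, 0, 1])))   # 0.5 (maximally mixed, 2 classes)
```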
An introduction to ensemble learning techniques, which combine multiple models to improve overall performance.
Bagging (Bootstrap Aggregating) and Random Forest, which applies bagging to decision trees and additionally samples a random subset of features at each split, are explained.
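A minimal bagging sketch using scikit-learn decision trees as the base learner (the estimator count and vote aggregation are illustrative choices; Random Forest would add the per-split feature subsampling on top of this):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # base learner; any model works

def bagging_predict(X_train, y_train, X_test, n_estimators=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement),
    then aggregate test predictions by majority vote."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    votes = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)          # bootstrap sample indices
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_test))
    votes = np.stack(votes)                       # (n_estimators, n_test)
    # Majority vote across estimators for each test point (integer labels).
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
print(bagging_predict(X, y, np.array([[0.5], [2.5]])))   # majority-vote labels
```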
The notes provide a mathematical formulation of boosting, explaining how it sequentially trains models to correct errors of previous ones.
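As one concrete instance of that formulation, a sketch of AdaBoost with decision stumps; labels are assumed to be in {-1, +1}, and the stump base learner is an illustrative choice:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    """Each round reweights the data so the next weak learner focuses on
    the points the ensemble currently misclassifies."""
    n = len(X)
    w = np.full(n, 1.0 / n)                 # uniform initial sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)      # up-weight mistakes
        w /= w.sum()
        stumps.append(stump); alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Weighted vote of the weak learners."""
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)

X = np.linspace(-1, 1, 20).reshape(-1, 1)
y = np.where(X.ravel() > 0, 1, -1)
stumps, alphas = adaboost_fit(X, y)
print(adaboost_predict(stumps, alphas, X))
```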
XGBoost, a specific implementation of gradient boosting, is explained in detail.
PCA, a dimensionality reduction technique, is explained step-by-step, including the intuition behind it.
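A step-by-step PCA sketch via eigendecomposition of the covariance matrix (an SVD of the centered data is an equally valid route):

```python
import numpy as np

def pca(X, n_components):
    """PCA: project the data onto the directions of maximal variance."""
    X_centered = X - X.mean(axis=0)            # 1. center the data
    cov = np.cov(X_centered, rowvar=False)     # 2. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # 3. eigendecomposition (symmetric)
    order = np.argsort(eigvals)[::-1]          # 4. sort by decreasing variance
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components, components

rng = np.random.default_rng(4)
# Synthetic data with most variance along the first axis.
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.1])
X_proj, components = pca(X, n_components=2)
print(X_proj.shape)   # (100, 2)
```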
K-means clustering, an unsupervised learning algorithm, is explained along with its connection to the Expectation-Maximization algorithm.
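A sketch of Lloyd's algorithm for K-means; the assignment and update steps mirror the E- and M-steps of EM (the sketch assumes no cluster becomes empty):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step (analogous to the E-step of EM).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step (analogous to the M-step).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)   # near (0, 0) and (3, 3)
```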
The EM algorithm, used for finding maximum likelihood estimates of parameters in statistical models with latent variables, is derived and explained.
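A sketch of EM for a two-component 1-D Gaussian mixture (the initialization and fixed iteration count are arbitrary choices for illustration):

```python
import numpy as np

def em_gmm_1d(x, n_iters=50):
    """E-step: responsibilities r_ik = P(component k | x_i).
    M-step: re-estimate weights, means, and std devs from the r_ik."""
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])          # break symmetry at init
    sigma = np.array([1.0, 1.0])
    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each point.
        z = (x[:, None] - mu) / sigma
        dens = pi * np.exp(-0.5 * z**2) / (sigma * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 1.0, 300)])
print(em_gmm_1d(x))   # weights near [0.5, 0.5], means near [-2, 2]
```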
Various important machine learning terms and concepts are explained, including epochs, batch size, gradient descent variants, batch normalization, layer normalization, dropout, and N-fold cross-validation.
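For one of these, a sketch of how N-fold cross-validation index splits can be generated (the shuffling and fold count are illustrative):

```python
import numpy as np

def n_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, val_idx) pairs: the data is shuffled once and split
    into N disjoint validation folds; each fold is held out exactly once."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    folds = np.array_split(perm, n_folds)
    for i in range(n_folds):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, val_idx

for train_idx, val_idx in n_fold_indices(10, n_folds=5):
    print(len(train_idx), len(val_idx))   # 8 2 for each of the 5 folds
```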
Contributions to improve or expand these notes are welcome. Please feel free to submit a pull request or open an issue for discussion.
This project is licensed under the MIT License - see the LICENSE.md file for details.