Hi, this is my repository where I try out data science and AI projects.
Most projects follow a similar format: I test my own knowledge by implementing a mathematical model, then apply it to a real-world context (usually a financial one, since stock data is easily available).
While I try to make my implementations as efficient as I can, I focus mainly on the mathematical concepts rather than on optimal programming (especially in older projects, written when I was less familiar with Python).
Project titles link to the respective source code files
List of contents:
Cubic Interpolation: Self-Implementation and Real-World Stock Data Application
-My own implementation of cubic spline interpolation, using NumPy for Gaussian elimination and Matplotlib for visualisation
-Starts with a manual example to test my mathematical understanding, then automates the process to scale to larger datasets
-Interpolates one month of daily AAPL price data as a real-world application, using yfinance and pandas
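
To give a rough idea of the spline step, here is a minimal sketch rather than the code from the project itself. It assumes natural boundary conditions (zero second derivative at both ends) and uses `np.linalg.solve` in place of a hand-written Gaussian elimination. The function names and the toy price values are hypothetical and are not real AAPL data.

```python
import numpy as np

def natural_cubic_spline(x, y):
    """Coefficients (a, b, c, d) of S_i(t) = a + b*dt + c*dt^2 + d*dt^3 per interval.

    Assumption: natural boundary conditions, i.e. S'' = 0 at both end knots.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) - 1            # number of intervals
    h = np.diff(x)            # interval widths

    # Linear system for the quadratic coefficients c_0..c_n
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0] = 1.0             # natural boundary: c_0 = 0
    A[n, n] = 1.0             # natural boundary: c_n = 0
    for i in range(1, n):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        rhs[i] = 3.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])

    c = np.linalg.solve(A, rhs)   # library solver instead of manual elimination
    a = y[:-1]
    b = (y[1:] - y[:-1]) / h - h * (2.0 * c[:-1] + c[1:]) / 3.0
    d = (c[1:] - c[:-1]) / (3.0 * h)
    return a, b, c[:-1], d

def evaluate_spline(x_knots, coeffs, x_new):
    """Evaluate the piecewise cubic at x_new (points inside the knot range)."""
    a, b, c, d = coeffs
    x_knots = np.asarray(x_knots, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    # Index of the interval each point falls in; the last knot uses the last interval
    idx = np.clip(np.searchsorted(x_knots, x_new, side="right") - 1, 0, len(a) - 1)
    dx = x_new - x_knots[idx]
    return a[idx] + b[idx] * dx + c[idx] * dx**2 + d[idx] * dx**3

# Toy values for illustration only (not real price data)
days = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
prices = np.array([10.0, 10.5, 10.2, 10.8, 11.0])
coeffs = natural_cubic_spline(days, prices)
fine_days = np.linspace(days[0], days[-1], 41)
fine_prices = evaluate_spline(days, coeffs, fine_days)
```

The curve passes through every knot by construction, so evaluating it at `days` should give back `prices`. `fine_prices` can then be plotted against `fine_days` with Matplotlib.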
MNIST Number Recognition/NN with Numpy.ipynb
-recognition of the MNIST handwritten digit dataset with my own dense neural net model built using NumPy (>90% accuracy)
-uses simple gradient descent as the optimiser (the easiest for me to implement)
-I also want to try random pixel inputs (a bit like diffusion) to see which pixels the model emphasises for each recognised number
-I will try deep learning with Keras to scale up to image recognition (likely classifying cats and dogs and playing with different optimisers)
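
For readers who want a feel for the approach, below is a minimal sketch of a dense network trained with plain full-batch gradient descent in NumPy. It is not the notebook's code: the single ReLU hidden layer, the layer sizes, the learning rate, and the synthetic clustered data (used here instead of MNIST so the example runs without a download) are all assumptions for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(n_in, n_hidden, n_out, rng):
    # He-style initialisation (an assumption, chosen to suit ReLU)
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, np.sqrt(2.0 / n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, X):
    a1 = np.maximum(X @ params["W1"] + params["b1"], 0.0)   # ReLU hidden layer
    probs = softmax(a1 @ params["W2"] + params["b2"])       # class probabilities
    return a1, probs

def cross_entropy(probs, y):
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

def gradient_step(params, X, y, lr):
    """One full-batch gradient descent update; returns the loss before the update."""
    n = len(y)
    a1, probs = forward(params, X)
    dz2 = probs.copy()
    dz2[np.arange(n), y] -= 1.0               # gradient of softmax + cross-entropy
    dz2 /= n
    dz1 = (dz2 @ params["W2"].T) * (a1 > 0)   # backpropagate through ReLU
    grads = {
        "W1": X.T @ dz1, "b1": dz1.sum(axis=0),
        "W2": a1.T @ dz2, "b2": dz2.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]
    return cross_entropy(probs, y)

# Synthetic stand-in data: three noisy clusters in 20 dimensions (not MNIST)
rng = np.random.default_rng(0)
n_classes, n_features = 3, 20
centres = rng.normal(0.0, 1.0, (n_classes, n_features))
y = rng.integers(0, n_classes, 300)
X = centres[y] + rng.normal(0.0, 0.5, (300, n_features))

params = init_params(n_features, 16, n_classes, rng)
losses = [gradient_step(params, X, y, lr=0.05) for _ in range(300)]
_, probs = forward(params, X)
accuracy = np.mean(probs.argmax(axis=1) == y)
```

For MNIST the same functions would take 784 input features and 10 output classes. Swapping in a different optimiser would only change the parameter update loop inside `gradient_step`.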
Human Keyboard Spam Recognition/Keyboard Spam Bayes.ipynb
-classifies human keyboard spam versus pseudo-random numbers generated by Python's random library
-data collected using Google Forms, where participants were asked to keyboard spam (form link: https://forms.gle/kLjdNS3ZsyCz9y9U7)
-tried a dense neural net with the same architecture as the MNIST recogniser (~53% accuracy, so it did not work very well)
-tried the multinomial naive Bayes algorithm (~68% accuracy)
-my implementation is consistently slightly better than the sklearn library version (surprising, but the library probably has some other parameters to tune and is still much more concise)
-will try Gaussian naive Bayes next (probably the library version instead of a self-implemented one)
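
As a rough illustration of the multinomial naive Bayes idea, here is a minimal sketch. It is not the notebook's implementation: the character-count features, the Laplace smoothing value, the class name, and the toy strings are all assumptions made up for this example and are not the collected form data.

```python
import numpy as np

class SimpleMultinomialNB:
    """Hypothetical minimal multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha   # smoothing strength (assumed value)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Total count of each feature within each class, plus smoothing
        counts = np.array([X[y == c].sum(axis=0) for c in self.classes_])
        smoothed = counts + self.alpha
        self.log_likelihood_ = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # Log posterior up to a constant: counts . log P(feature | class) + log P(class)
        scores = np.asarray(X, dtype=float) @ self.log_likelihood_.T + self.log_prior_
        return self.classes_[np.argmax(scores, axis=1)]

def char_counts(text, alphabet):
    """Count how often each character of the alphabet appears in the text."""
    return [text.count(ch) for ch in alphabet]

# Toy strings for illustration only: label 1 = human keyboard spam, 0 = other
alphabet = "abcdefghijklmnopqrstuvwxyz"
train_texts = ["asdfasdfjkl", "jkljklasdf", "fdsajkl",
               "qzmxpwovbn", "tyqpzmcnvb", "wbxqzpoieu"]
train_labels = [1, 1, 1, 0, 0, 0]

X_train = [char_counts(t, alphabet) for t in train_texts]
model = SimpleMultinomialNB(alpha=1.0).fit(X_train, train_labels)

X_test = [char_counts(t, alphabet) for t in ["asdfjkljkl", "zqxwvpmnbt"]]
preds = model.predict(X_test)
```

On these toy strings the home-row-heavy text is assigned to class 1 and the other to class 0. With real data the choice of features (single characters, character pairs, and so on) would likely matter far more than the classifier code.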
More information is available in the code comments of the respective source code files.