This project is an end-to-end automated machine learning (AutoML) model that streamlines data acquisition and preprocessing, along with building, comparing, evaluating, testing, visualizing, and saving machine learning models for both classification and regression tasks. It is designed to help users with minimal data science expertise go quickly from raw data to a well-performing machine learning model.
The AutoML model follows a structured workflow and provides a user-friendly interface using Streamlit. It includes the following steps:
- Data Acquisition: Load your dataset and define the target variable.
- Data Exploration and Analysis (EDA): Perform exploratory data analysis to understand the dataset's characteristics, visualize distributions, and analyze relationships between features.
- Data Preparation: Preprocess the data by handling missing values, dropping duplicates, encoding categorical features, and scaling numerical features.
- Model Creation: Select from a variety of machine learning models for both classification and regression tasks. You can choose specific models or apply all models available.
- Model Comparison: Train and evaluate the selected models using training and validation data. The best-performing model is identified based on metrics such as accuracy, recall, precision (for classification), or mean absolute error, mean squared error, R-squared (for regression).
- Model Testing: Test the best model on a separate test dataset and generate a testing report.
- Model Visualization: Visualize the performance of the best model on training, validation, and test datasets using bar plots for various metrics.
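
To make the preparation, comparison, and testing steps above more concrete, here is a minimal sketch of how they could be wired together with Pandas and Scikit-learn for a classification task. It is not the code used in this repository: the function names `build_preprocessor` and `compare_models`, the synthetic dataset, the 60/20/20 split, the median imputation, and the choice of accuracy as the selection metric are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_preprocessor(X):
    """Hypothetical helper: impute and scale numeric columns, one-hot encode the rest."""
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = X.select_dtypes(exclude="number").columns.tolist()
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # handle missing values
        ("scale", StandardScaler()),                   # scale numerical features
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])


def compare_models(df, target, models, seed=0):
    """Hypothetical helper: pick the best model on validation data, then test it."""
    df = df.drop_duplicates()
    X, y = df.drop(columns=[target]), df[target]
    # Assumed 60/20/20 train/validation/test split.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, random_state=seed, stratify=y)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed, stratify=y_rest)

    scores, fitted = {}, {}
    for name, model in models.items():
        pipe = Pipeline([("prep", build_preprocessor(X)), ("model", model)])
        pipe.fit(X_train, y_train)
        scores[name] = accuracy_score(y_val, pipe.predict(X_val))
        fitted[name] = pipe

    best = max(scores, key=scores.get)  # highest validation accuracy wins
    test_score = accuracy_score(y_test, fitted[best].predict(X_test))
    return best, scores, test_score


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50_000, 15_000, n),
    "city": rng.choice(["A", "B", "C"], n),
})
df["target"] = (df["age"] + rng.normal(0, 5, n) > 40).astype(int)
df.loc[::25, "income"] = np.nan                       # inject missing values
df = pd.concat([df, df.iloc[:5]], ignore_index=True)  # inject duplicate rows

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
best, scores, test_score = compare_models(df, "target", models)
print(best, scores, test_score)
```

Wrapping the preprocessing and the estimator in a single `Pipeline` means the imputer, scaler, and encoder are fitted on the training split only, so the validation and test scores are not influenced by statistics from held-out data. A regression variant would swap in regressors and a metric such as mean absolute error, where lower is better.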
Requirements:
- Python 3.6+
- Streamlit
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
To install and run the app:
- Clone this repository to your local machine: `git clone https://github.com/elsayedelmandoh/automated_ml.git`
- Install the required dependencies: `pip install -r requirements.txt`
- Start the Streamlit interface: `streamlit run main.py`
Contributions are welcome! If you have suggestions, improvements, or additional content to contribute, feel free to open issues, submit pull requests, or provide feedback.
This repository is maintained by Elsayed Elmandoh, an AI Engineer. You can connect with Elsayed on LinkedIn and Twitter/X for updates and discussions related to machine learning, deep learning, and NLP.
Happy coding!