TryOmar/data-miner

DataMiner

License: MIT Python Streamlit

DataMiner is an interactive web application for data mining and machine learning workflows. Built with Streamlit, it lets users upload, inspect, and profile datasets as the groundwork for later analytics and model building.


🚀 Features

  • Multi-page Streamlit app with sidebar navigation
  • Single data upload: Upload once, use everywhere
  • Paginated data preview: Easily browse large datasets
  • Data profiling: Instantly see shape, columns, and types
  • Summary statistics: Numeric and categorical summaries, with clear explanations
  • Scalable architecture: Ready for future ML and data transformation features
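
The paginated preview comes down to a small amount of row arithmetic. The sketch below is not taken from the DataMiner source; `page_count` and `page_bounds` are hypothetical helper names, and the 1-based page numbering is an assumption.

```python
import math


def page_count(n_rows: int, page_size: int) -> int:
    """Number of pages needed to show n_rows; always at least 1."""
    if page_size < 1:
        raise ValueError("page_size must be a positive integer")
    return max(1, math.ceil(n_rows / page_size))


def page_bounds(n_rows: int, page: int, page_size: int) -> tuple:
    """Return (start, stop) row offsets for a 1-based page number.

    Out-of-range page numbers are clamped to the first or last page.
    """
    last_page = page_count(n_rows, page_size)
    page = min(max(page, 1), last_page)
    start = (page - 1) * page_size
    stop = min(start + page_size, n_rows)
    return start, stop
```

In a Streamlit page, the result would typically be used as `start, stop = page_bounds(len(df), page, page_size)` followed by `st.dataframe(df.iloc[start:stop])`, with `page` coming from a widget such as `st.number_input`.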

📸 Demo

Page | Screenshot Preview
Home | [image: Home]
Data Upload | [image: Data Upload]
Data Upload (Paged) | [image: Data Upload 2]
Profiling | [image: Profiling]
Profiling (Types) | [image: Profiling 2]
Summary Statistics | [image: Summary Statistics]

🛠️ How It Works

  1. Upload Data: Go to the "Data Upload" page and upload your CSV, Excel, or JSON file.
  2. Profile Data: Navigate to "Profiling" to view dataset shape, columns, and data types.
  3. View Summary: Check "Summary Statistics" for numeric and categorical summaries.
  4. Navigate Easily: Use the sidebar to switch between features. Your uploaded data is available on all pages.
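
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the names `READERS`, `DATA_KEY`, `reader_name`, `load_dataset`, `store_dataset`, `get_dataset`, and `profile` are hypothetical, and only the `.xlsx` Excel format is mapped because `openpyxl` is the only Excel engine listed in the requirements. The `state` argument is expected to be `st.session_state` in the app; any dict-like object works.

```python
from pathlib import Path

# Assumed mapping from file extension to the pandas reader function name.
READERS = {
    ".csv": "read_csv",
    ".xlsx": "read_excel",
    ".json": "read_json",
}

# Hypothetical session-state key under which the uploaded dataset is kept.
DATA_KEY = "uploaded_df"


def reader_name(filename: str) -> str:
    """Pick the pandas reader for a file based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {filename!r}")
    return READERS[suffix]


def load_dataset(file, filename: str):
    """Read an uploaded file (path or file-like object) into a DataFrame."""
    import pandas as pd  # imported lazily so reader_name() works without pandas

    return getattr(pd, reader_name(filename))(file)


def store_dataset(state, df) -> None:
    """Step 1: keep the DataFrame in session state so every page can use it."""
    state[DATA_KEY] = df


def get_dataset(state):
    """Step 4: fetch the shared DataFrame, or None if nothing was uploaded."""
    return state[DATA_KEY] if DATA_KEY in state else None


def profile(df) -> dict:
    """Step 2: shape, column names, and dtypes of a DataFrame."""
    return {
        "shape": df.shape,
        "columns": list(df.columns),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }
```

On the upload page this might be called as `store_dataset(st.session_state, load_dataset(uploaded, uploaded.name))`, where `uploaded` is the object returned by `st.file_uploader`. The other pages would then start with `df = get_dataset(st.session_state)` and show a prompt to upload data when it returns `None`. For step 3, `df.describe()` gives the numeric summary.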

📁 Folder Structure

.
├── Home.py                  # Landing page
├── requirements.txt         # Python dependencies
├── pages/
│   ├── Data_Upload.py       # Upload and preview data
│   ├── Profiling.py         # Dataset profiling
│   └── Summary_Statistics.py # Summary statistics

✅ Progress Checklist

Phase 1: Data Handling and Basic Preprocessing

  • Multi-page Streamlit app structure
  • Upload CSV, Excel, JSON files
  • Paginated data preview
  • Data profiling (shape, columns, types)
  • Summary statistics (numeric & categorical, with explanations)
  • Single upload shared across all pages

Next Phases (Planned)

  • Advanced data preprocessing (missing values, outliers, feature engineering)
  • Data transformation (scaling, normalization, dimensionality reduction)
  • Machine learning model training and evaluation
  • Model export and deployment
  • Results sharing and reporting

📦 Requirements

  • Python 3.9+
  • streamlit
  • pandas
  • numpy
  • openpyxl
  • pyarrow
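
A quick way to confirm the environment meets the list above is a small standard-library check. The script below is a sketch, not part of the repository; `REQUIRED`, `python_ok`, and `missing_packages` are hypothetical names, and it assumes each package's import name matches the name listed.

```python
import sys
from importlib.util import find_spec

# Package names copied from the requirements list above.
REQUIRED = ["streamlit", "pandas", "numpy", "openpyxl", "pyarrow"]


def python_ok(minimum=(3, 9)) -> bool:
    """True if the running interpreter is at least the minimum version."""
    return sys.version_info >= minimum


def missing_packages(names=REQUIRED) -> list:
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.9 or newer is required.")
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can be installed with `pip install -r requirements.txt`.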

🤝 Contributing

Contributions are welcome!

  1. Fork the repo
  2. Create a feature branch
  3. Commit your changes
  4. Open a pull request

📄 License

This project is licensed under the MIT License.

🚦 How to Run

  1. Clone the repository:
    git clone https://github.com/Omar7001-B/data-miner.git
    cd data-miner
  2. Install dependencies:
    pip install -r requirements.txt
  3. Start the app:
    streamlit run Home.py
  4. Open your browser at http://localhost:8501
