DataMiner is a modern, interactive web application for data mining and machine learning workflows. Built with Streamlit, it empowers users to upload, inspect, and profile datasets, laying the foundation for advanced analytics and model building.
- Multi-page Streamlit app with sidebar navigation
- Single data upload: Upload once, use everywhere
- Paginated data preview: Easily browse large datasets
- Data profiling: Instantly see shape, columns, and types
- Summary statistics: Numeric and categorical summaries, with clear explanations
- Scalable architecture: Ready for future ML and data transformation features
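
The paginated preview boils down to turning a page number into a row range. Below is a minimal sketch of that arithmetic, not DataMiner's actual code; the function name `page_bounds` and the clamping behaviour are assumptions made for illustration.

```python
import math


def page_bounds(total_rows: int, page_size: int, page: int) -> tuple[int, int]:
    """Return the (start, stop) row positions for a 1-based page number."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Always report at least one page, even for an empty dataset.
    total_pages = max(1, math.ceil(total_rows / page_size))
    # Clamp out-of-range page numbers instead of failing.
    page = min(max(page, 1), total_pages)
    start = (page - 1) * page_size
    stop = min(start + page_size, total_rows)
    return start, stop


# With a pandas DataFrame `df`, one page could then be shown as:
#   start, stop = page_bounds(len(df), page_size=50, page=3)
#   st.dataframe(df.iloc[start:stop])
```

Slicing with `iloc` keeps only the current page in the rendered table, which is what makes browsing large datasets practical.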
- Upload Data: Go to the "Data Upload" page and upload your CSV, Excel, or JSON file.
- Profile Data: Navigate to "Profiling" to view dataset shape, columns, and data types.
- View Summary: Check "Summary Statistics" for numeric and categorical summaries.
- Navigate Easily: Use the sidebar to switch between features. Your uploaded data is available on all pages.
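
Streamlit keeps `st.session_state` alive across the pages of a multi-page app, which is one way an upload can be made available everywhere. The sketch below assumes that approach; the key name `uploaded_df` and both helper functions are hypothetical and not taken from the project.

```python
from typing import Any, MutableMapping, Optional

DATA_KEY = "uploaded_df"  # hypothetical key name, not taken from the project


def store_dataset(state: MutableMapping[str, Any], df: Any) -> None:
    """Remember the uploaded dataset so other pages can reuse it."""
    state[DATA_KEY] = df


def get_dataset(state: MutableMapping[str, Any]) -> Optional[Any]:
    """Return the shared dataset, or None if nothing was uploaded yet."""
    return state[DATA_KEY] if DATA_KEY in state else None


# In a page script, `state` would be `st.session_state`:
#   df = get_dataset(st.session_state)
#   if df is None:
#       st.info("Upload a file on the Data Upload page first.")
```

Passing the state object in as an argument keeps the helpers testable with a plain `dict`, without starting a Streamlit session.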
.
├── Home.py # Landing page
├── requirements.txt # Python dependencies
├── pages/
│ ├── Data_Upload.py # Upload and preview data
│ ├── Profiling.py # Dataset profiling
│ └── Summary_Statistics.py # Summary statistics
Implemented features:
- Multi-page Streamlit app structure
- Upload CSV, Excel, JSON files
- Paginated data preview
- Data profiling (shape, columns, types)
- Summary statistics (numeric & categorical, with explanations)
- Single upload shared across all pages
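
Supporting CSV, Excel, and JSON uploads usually means choosing a pandas reader from the file extension. Here is a rough sketch of that dispatch. The helper names and the exact set of accepted extensions are assumptions, while `read_csv`, `read_excel`, and `read_json` are the standard pandas readers.

```python
from pathlib import Path

# Extension-to-reader mapping; the exact set of extensions is an assumption.
READERS = {
    ".csv": "read_csv",
    ".xlsx": "read_excel",  # pandas uses openpyxl to read .xlsx files
    ".json": "read_json",
}


def reader_name_for(filename: str) -> str:
    """Pick the pandas reader function name from the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return READERS[suffix]


def load_table(file, filename: str):
    """Load an uploaded file (path or file-like object) into a DataFrame."""
    import pandas as pd  # imported lazily so the helper above has no dependencies

    return getattr(pd, reader_name_for(filename))(file)
```

Rejecting unknown extensions up front gives a clearer error than letting a reader fail on content it cannot parse.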
Planned features:
- Advanced data preprocessing (missing values, outliers, feature engineering)
- Data transformation (scaling, normalization, dimensionality reduction)
- Machine learning model training and evaluation
- Model export and deployment
- Results sharing and reporting
Known issues:
- Summary statistics table shows many null values for non-numeric columns (#1)
- Improve dashboard navigation and layout for better usability (#2)
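
For issue #1, one common cause of such nulls is calling `DataFrame.describe(include="all")`, which fills every statistic that does not apply to a column with `NaN`. Whether DataMiner does this is an assumption. The sketch below shows one possible workaround, summarizing numeric and non-numeric columns in separate tables; `split_summaries` is a hypothetical name.

```python
import pandas as pd


def split_summaries(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Summarize numeric and non-numeric columns in separate tables."""
    numeric = df.select_dtypes(include="number")
    other = df.select_dtypes(exclude="number")
    # describe() raises on a frame without columns, so guard both cases.
    numeric_summary = numeric.describe() if len(numeric.columns) else pd.DataFrame()
    # Casting to object yields count/unique/top/freq for every remaining column.
    other_summary = (
        other.astype("object").describe() if len(other.columns) else pd.DataFrame()
    )
    return numeric_summary, other_summary


df = pd.DataFrame({"age": [30, 41, 25], "city": ["Cairo", "Giza", "Cairo"]})
numeric_summary, other_summary = split_summaries(df)
```

Each table then contains only the statistics that make sense for its columns, so neither needs padding with nulls.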
Requirements:
- Python 3.9+
- streamlit
- pandas
- numpy
- openpyxl
- pyarrow
Contributions are welcome!
- Fork the repo
- Create a feature branch
- Commit your changes
- Open a pull request
This project is licensed under the MIT License.
- Clone the repository:
git clone https://github.com/Omar7001-B/data-miner.git
cd data-miner
- Install dependencies:
pip install -r requirements.txt
- Start the app:
streamlit run Home.py
- Open your browser at http://localhost:8501





