This course will teach students to use popular tools for sourcing data, transforming it, building and optimizing models, communicating these as visual stories, and deploying them in production.
Updated Aug 24, 2022 - HTML
Comprehensive data governance pipeline for SSH honeypot logs, covering data profiling, cleansing, quality assurance, encryption, classification, and GDPR/CCPA/HIPAA compliance. Built with Pandas, Pandera, YData Profiling, and cryptography, with simulated Caesar cipher attacks to demonstrate practical data-security techniques.
A comprehensive tool for quick data profiling and exploratory data analysis.
This project analyzes inspections of food establishments in Chicago and Dallas. It involves dimensional modelling, building ETL pipelines, data warehousing, and data visualization.
Generates automated EDA reports using YData Profiling for quick data understanding.
Dockerized data governance platform for a hospital: PostgreSQL, Python profiling, Superset, and OpenMetadata.
This repository documents my process of learning to automate EDA with 'ydata-profiling'.
The team explored persona‑driven behavioural analytics to address risky resource planning practices. By combining detailed persona definitions, behavioural metrics, and deep analysis of forecasting and utilisation data, they designed a dashboard concept that highlights over‑optimistic planning, generic resource use, and weak feedback loops,...
An R Notebook that performs basic data profiling and exploratory data analysis on the FIFA19 players dataset and builds a dream team of the top 11 players based on various player attributes.
Collection of notebooks documenting best practices for data profiling and QA in R.
Collection of APIs for Informatica Intelligent Cloud Services (IICS) and Intelligent Data Management Cloud (IDMC), providing programmatic access to data integration, data governance, data quality, master data management, B2B gateway, and platform administration capabilities.
Optimizing and Forecasting Supply Chain Performance
An end-to-end Business Intelligence (BI) pipeline designed to process and analyze 141 million IMDb records for deriving insights on movies, ratings, and global cinema trends. The project demonstrates large-scale data engineering, ELT automation, and dashboard-driven analytics.
Data file examples and user guides for VerityPy and VerityDotNet libraries