CodeSamples

This repository contains two code samples. The first, VehicularAccidents_311ServiceRequestAnalysis, adapts a final project from my master’s program. The second, CapstoneDataProcessing, holds the core code for my capstone project, which is still in progress as of November 2025.

Together, these notebooks show how I work end to end with messy public-sector data: acquiring and cleaning datasets, joining across sources, building features, doing exploratory analysis, and organizing code in a way that is easy for others to follow.

VehicularAccidents_311ServiceRequestAnalysis

NYC Accidents & 311 Complaints.ipynb

Overall goal
Explore how New York City vehicular accidents and 311 service complaints line up in space and time, to see where traffic safety issues and resident-reported problems overlap.

What this notebook does

Loads open data on motor vehicle collisions and 311 complaints for New York City.
Cleans and standardizes fields such as dates, times, locations, and complaint categories.
Joins and aggregates the datasets by location and time (for example, neighborhood or borough and time of day).
Builds features that describe accident patterns and complaint patterns, such as counts, rates, and complaint types.
Uses charts and tables to highlight hotspots and possible relationships between accidents and specific 311 complaint categories.
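
As a rough illustration of the join-and-aggregate step, here is a minimal sketch in plain Python. It is not the notebook's actual code: the field names (`borough`, `timestamp`), the sample rows, and the choice of borough and hour of day as the join key are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(timestamp: str) -> int:
    """Parse an ISO-style timestamp and return the hour of day (0-23)."""
    return datetime.fromisoformat(timestamp).hour


def count_by_borough_hour(records):
    """Count records per (borough, hour) after normalizing borough names."""
    counts = Counter()
    for rec in records:
        borough = (rec.get("borough") or "").strip().upper()
        if not borough:
            continue  # skip rows with no usable location
        counts[(borough, hour_bucket(rec["timestamp"]))] += 1
    return counts


def join_counts(collisions, complaints):
    """Outer-join two count tables on (borough, hour), filling gaps with 0."""
    keys = sorted(set(collisions) | set(complaints))
    return [
        {
            "borough": borough,
            "hour": hour,
            "collisions": collisions.get((borough, hour), 0),
            "complaints": complaints.get((borough, hour), 0),
        }
        for borough, hour in keys
    ]


# Made-up sample rows, for illustration only.
collision_rows = [
    {"borough": "Brooklyn", "timestamp": "2024-03-01T08:15:00"},
    {"borough": "BROOKLYN ", "timestamp": "2024-03-02T08:40:00"},
    {"borough": None, "timestamp": "2024-03-02T09:00:00"},
]
complaint_rows = [
    {"borough": "brooklyn", "timestamp": "2024-03-01T08:05:00"},
    {"borough": "Queens", "timestamp": "2024-03-01T17:30:00"},
]

joined = join_counts(
    count_by_borough_hour(collision_rows),
    count_by_borough_hour(complaint_rows),
)
```

Normalizing the borough text before counting matters here because the two sources may spell the same place differently; an outer join keeps cells where only one dataset has activity.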

Skills showcased

Working with real-world city open data in Jupyter.
Data cleaning and feature engineering on multi-source datasets.
Joining and aggregating data across time and geography.
Exploratory data analysis and clear visualization to tell a story around safety and service delivery.

CapstoneDataProcessing (PROJECT IN PROGRESS)

The notebooks in this folder support my capstone project, which looks at the relationship between urban trees and building energy performance in New York City. The focus is on building a reliable, analysis-ready dataset from multiple open data sources. NOTE: This project is currently in progress and has been simplified to serve as a sample, so the code is still somewhat rough.

BuildingWork_Part1.ipynb

Overall goal
Create a building-level panel dataset of New York City energy and water benchmarking results that can be linked to building geometry and later joined to tree canopy measures.

What this notebook does

Pulls several years of Local Law 84 benchmarking data for New York City buildings.
Cleans and standardizes building identifiers across years, including handling missing or inconsistent IDs.
Geocodes buildings and assigns them to tax lots and building footprints.
Joins in additional building attributes such as zoning, height, and other physical characteristics.
Outputs a consistent building-level file that is ready for downstream spatial analysis and modeling.
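
The identifier-cleaning step can be sketched roughly as follows. This assumes the building identifier is a borough-block-lot (BBL) number, which is 10 digits with a leading borough code of 1 to 5; the notebook may key on a different ID. The function names, field names (`bbl`, `year`, `site_eui`), and sample rows are hypothetical.

```python
import re


def normalize_bbl(raw):
    """Return a 10-digit BBL string, or None if the value cannot be trusted."""
    if raw is None:
        return None
    text = str(raw).strip()
    # IDs read as floats often arrive as "1000010001.0"; drop the suffix.
    text = re.sub(r"\.0+$", "", text)
    # Remove common separators such as "1-00001-0001".
    digits = re.sub(r"[\s\-/]", "", text)
    # A valid BBL is one borough digit (1-5) followed by nine more digits.
    if not re.fullmatch(r"[1-5]\d{9}", digits):
        return None
    return digits


def clean_records(records):
    """Keep one row per (bbl, year); drop rows whose ID cannot be normalized."""
    cleaned = {}
    for rec in records:
        bbl = normalize_bbl(rec.get("bbl"))
        if bbl is None:
            continue
        # Later rows overwrite earlier duplicates for the same building-year.
        cleaned[(bbl, rec["year"])] = {**rec, "bbl": bbl}
    return [cleaned[key] for key in sorted(cleaned)]


# Made-up sample rows, for illustration only.
rows = [
    {"bbl": 1000010001.0, "year": 2021, "site_eui": 80.2},
    {"bbl": "1-00001-0001", "year": 2022, "site_eui": 78.9},
    {"bbl": "Not Available", "year": 2022, "site_eui": 65.0},
]
panel = clean_records(rows)
```

Returning `None` for anything that does not match the expected shape keeps bad IDs out of later joins, at the cost of dropping rows that a fuzzier match might have recovered.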

Sources used in this notebook include

New York City Open Data building energy and water disclosure datasets for Local Law 84 (multiple years).
New York City Open Data building historic records.
New York City Open Data building elevation and subgrade data.
New York City Planning MapPLUTO building footprint and tax lot data.
New York City Planning tree canopy change data derived from LIDAR.
New York City Planning Labs geosearch service for address cleaning and geocoding.

Skills showcased

Building robust ETL pipelines for large civic datasets.
Cleaning and reconciling multi-year records at the building level.
Basic geospatial processing and joining tabular data to spatial layers.
Preparing high-quality, reusable datasets for analysis.

TreeWork_Part2.ipynb

Overall goal
Construct a tree-level dataset that tracks which street trees exist in each year and when they are removed, based on inventory and work order records.

What this notebook does

Loads New York City street tree inventory data and forestry work order records from open data.
Cleans and aligns tree identifiers, locations, and key fields across the two datasets.
Uses work order histories to infer tree removal dates and statuses.
Restricts the analysis to closed work orders, so that inferred removals reflect completed work rather than pending requests.
Builds a yearly tree-level file that can be joined to nearby buildings for panel analysis.
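
One way to express the removal-inference logic is sketched below. The rules are assumptions chosen for illustration, not the notebook's actual ones: only closed orders of a hypothetical type `"Tree Removal"` count as removals, the earliest such closing date is taken as the removal date, and a tree counts as present in a year if it had not been removed before January 1 of that year. Planting dates are ignored for brevity.

```python
from datetime import date

# Hypothetical work order type label; real records may use other values.
REMOVAL_TYPES = {"Tree Removal"}


def infer_removal_dates(work_orders):
    """Map tree_id to the earliest closing date of a closed removal order."""
    removed = {}
    for order in work_orders:
        if order["status"] != "Closed" or order["type"] not in REMOVAL_TYPES:
            continue  # open or non-removal orders say nothing about removal
        tree_id, closed = order["tree_id"], order["closed_date"]
        if tree_id not in removed or closed < removed[tree_id]:
            removed[tree_id] = closed
    return removed


def yearly_presence(tree_ids, removed, years):
    """Build one row per (tree, year) with a boolean presence flag."""
    rows = []
    for tree_id in tree_ids:
        removal = removed.get(tree_id)
        for year in years:
            # Present if never removed, or removed on or after Jan 1 of the year.
            present = removal is None or removal >= date(year, 1, 1)
            rows.append({"tree_id": tree_id, "year": year, "present": present})
    return rows


# Made-up sample records, for illustration only.
work_orders = [
    {"tree_id": 1, "type": "Tree Removal", "status": "Closed",
     "closed_date": date(2021, 6, 15)},
    {"tree_id": 2, "type": "Tree Removal", "status": "Open",
     "closed_date": None},
    {"tree_id": 1, "type": "Prune", "status": "Closed",
     "closed_date": date(2020, 4, 2)},
]
removed = infer_removal_dates(work_orders)
tree_panel = yearly_presence([1, 2], removed, range(2020, 2023))
```

In this sample, tree 1 is flagged present in 2020 and 2021 and absent in 2022, while tree 2 stays present throughout because its removal order is still open.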

Sources used in this notebook include

New York City Open Data street tree inventory and point location data.
New York City Open Data tree work order and forestry service request records.
New York City Open Data geographic reference layers for streets and neighborhoods.
New York City Planning and Open Data portals for reference building and location context.

Skills showcased

Working with operational city data that is noisy and event based.
Designing rule-based logic to infer entity status over time (for example, whether a tree is still present).
Combining event histories with inventory data to build panel-style datasets.
Preparing features that can be linked to other spatial units such as buildings.

Analysis_Part3.ipynb

Overall goal
Bring together the building and tree datasets from Parts 1 and 2 to study how changes in tree canopy relate to building energy performance.

What this notebook does

Merges the building-level dataset with nearby tree and canopy information.
Filters to buildings with usable energy-use data and building characteristics.
Creates final features that capture tree canopy exposure and changes over time at the building level.
Runs exploratory analysis and first-pass models to examine relationships between canopy metrics and building energy intensity.
Documents open questions and next steps for improving the models and interpretation.
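
To make the idea of a building-level tree exposure feature concrete, here is a minimal sketch that counts trees within a fixed radius of a building using the haversine great-circle distance. The 50 m radius, the field names (`lat`, `lon`), and the coordinates are assumptions for the example; the notebook may define exposure differently, for instance from canopy area rather than tree counts.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def trees_within(building, trees, radius_m):
    """Count trees whose distance to the building is at most radius_m."""
    return sum(
        1
        for tree in trees
        if haversine_m(building["lat"], building["lon"],
                       tree["lat"], tree["lon"]) <= radius_m
    )


# Made-up coordinates, for illustration only.
building = {"lat": 40.7500, "lon": -73.9900}
trees = [
    {"lat": 40.7501, "lon": -73.9900},  # roughly 11 m north of the building
    {"lat": 40.7600, "lon": -73.9900},  # roughly 1.1 km north of the building
]
nearby_count = trees_within(building, trees, radius_m=50)
```

A brute-force loop like this is fine for a small sample. For a citywide dataset, a spatial index or a projected coordinate system would likely be needed to keep the join tractable.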

Status and skills showcased

This analysis is still in progress for the current semester, and I am continuing to refine the modeling and diagnostics.
Demonstrates the ability to stitch multi-step data pipelines into a single analysis.
Shows familiarity with panel-style data and modeling energy outcomes as a function of environmental and built-environment features.
Highlights communication of intermediate results, limitations, and planned future work.
