| Skills: Data Cleaning | Exploratory Data Analysis | Data Visualization | Large Dataset Processing | SQL/SQLite | Statistical Analysis | R (dplyr, ggplot2) |
Motorcycle accidents account for a significant portion of traffic injuries and fatalities each year. This project analyzes large-scale traffic collision data to identify demographic, environmental, and temporal patterns associated with motorcycle accidents, with the goal of informing safety interventions and policy improvements.
As a night-shift medical laboratory scientist, I frequently encounter trauma cases resulting from motorcycle accidents. This exposure motivated an investigation into whether large-scale traffic datasets could reveal patterns associated with higher accident risk or severity.
This analysis integrates three public transportation datasets:
1. California Traffic Collision Data (SWITRS)
2. Road Traffic Injury Dataset
3. U.S. Transportation Accident Data
Data preparation involved several steps:
Handling large datasets required memory management and selective filtering to focus on motorcycle-related incidents.
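The selective filtering described above can be sketched with RSQLite: push the filter into SQLite so that only motorcycle rows are loaded into R. This is a minimal illustration, not the project's actual code. The table name `collisions`, its columns, and the toy rows are all assumptions standing in for the real datasets.

```r
library(DBI)
library(RSQLite)

# Toy in-memory database standing in for a large on-disk collision file.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical schema: the real tables and column names will differ.
collisions <- data.frame(
  case_id      = 1:6,
  vehicle_type = c("motorcycle", "car", "motorcycle",
                   "truck", "bicycle", "motorcycle"),
  killed       = c(0, 0, 1, 0, 0, 0),
  injured      = c(1, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "collisions", collisions)

# Filter inside SQLite so only the matching rows and needed columns
# are pulled into R memory.
motorcycle <- dbGetQuery(
  con,
  "SELECT case_id, killed, injured
     FROM collisions
    WHERE vehicle_type = ?
    ORDER BY case_id",
  params = list("motorcycle")
)

dbDisconnect(con)
```

Selecting only the needed columns in the query keeps the in-memory data frame small, which matters more as the source table grows.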
The analysis focused on exploratory data analysis and statistical comparisons, including:
Visualization techniques included:
Approximately 3.2% of motorcycle accidents resulted in a fatality, and 82% resulted in an injury.
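Rates like these can be computed as the share of accidents with at least one fatality or injury. The sketch below assumes a one-row-per-accident table with hypothetical `killed` and `injured` count columns. The toy values are illustrative only and do not reproduce the figures reported here.

```r
library(dplyr)

# Hypothetical one-row-per-accident table with toy values.
accidents <- data.frame(
  case_id = 1:10,
  killed  = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
  injured = c(1, 1, 0, 1, 1, 0, 1, 1, 1, 1)
)

# The mean of a logical vector is the proportion of TRUE values,
# i.e. the share of accidents with at least one fatality or injury.
rates <- accidents %>%
  summarise(
    n_accidents   = n(),
    fatality_rate = mean(killed > 0),
    injury_rate   = mean(injured > 0)
  )
```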
Younger riders, particularly those in their twenties and early thirties, were involved in the highest number of accidents.
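One way to compare age groups is to bin rider ages into bands and count accidents per band. The following sketch assumes a hypothetical `age` column holding whole-number ages. The band boundaries are an illustrative choice and are not taken from the original analysis.

```r
library(dplyr)

# Toy rider ages (whole years); the real data would have one row per rider.
riders <- data.frame(age = c(19, 23, 27, 31, 34, 45, 52, 28, 22, 67))

age_counts <- riders %>%
  mutate(
    # cut() uses right-closed intervals by default, so with integer ages
    # (19, 29] corresponds to ages 20 through 29, and so on.
    age_group = cut(
      age,
      breaks = c(0, 19, 29, 39, 49, 59, Inf),
      labels = c("<20", "20-29", "30-39", "40-49", "50-59", "60+")
    )
  ) %>%
  # sort = TRUE puts the band with the most accidents first.
  count(age_group, sort = TRUE)
```

Raw counts per band do not account for how many riders of each age are on the road, so they describe involvement rather than per-rider risk.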
Motorcycle accidents were heavily skewed toward male riders, though some records contained unknown gender classifications.
Contrary to common assumptions, most motorcycle accidents occurred under normal weather, road, and lighting conditions, suggesting that rider behavior and traffic interactions may be more influential than environmental conditions.
Motorcycle accidents had a higher injury rate than accidents involving trucks, buses, and bicycles, but a slightly lower rate than those involving cars and pedestrians.
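A comparison of this kind can be sketched as a grouped summary: compute the injury rate within each vehicle type, then rank the types. The column names and toy values below are assumptions for illustration and do not reflect the actual results.

```r
library(dplyr)

# Hypothetical one-row-per-involved-party table with toy values.
parties <- data.frame(
  vehicle_type = rep(c("motorcycle", "car", "truck"), each = 4),
  injured      = c(1, 1, 1, 0,   1, 1, 1, 1,   1, 0, 0, 0),
  stringsAsFactors = FALSE
)

injury_by_type <- parties %>%
  group_by(vehicle_type) %>%
  summarise(
    n           = n(),
    injury_rate = mean(injured > 0),  # share of rows with an injury
    .groups     = "drop"
  ) %>%
  arrange(desc(injury_rate))
```

Keeping the group size `n` next to each rate makes it easier to spot categories where the rate rests on very few records.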
The findings highlight potential areas for safety intervention:
R, ggplot2, dplyr, tidyr, RSQLite, readxl