I am starting to dip my toe into the world of Data Science. I am teaching myself Python and SQL, and taking a Master's course in Data Science at Technological University Dublin. Below are sample projects I have produced for job interviews.
Brooklyn motor vehicle collisions case study
An investigation of motor vehicle collisions in Brooklyn, 2014-2017. The aim is to identify ways to reduce vehicle collisions and deaths.
Executive summary
- 190,981 motor vehicle collisions were recorded in Brooklyn in 2014-2017
  - 207 resulted in deaths
- Driver inattention is the most frequently recorded factor in motor vehicle collisions
  - Involved in 55,971 collisions
- Complex intersections on Atlantic Avenue and zip code 11207 are among the more dangerous areas
- Accidents are more frequent around typical rush hours
  - 8am and 4pm, Monday to Friday
Approach taken
- Data filtered by date (2014-2017) and borough (Brooklyn)
  - Using SQL on the open-source NYPD database of collisions
- Deaths
  - Data filtered by factors (no NULL and no Unspecified)
  - If no factors are recorded, it is unclear how these deaths could have been avoided
- Time / date
  - Analysed by weekday and by hour
  - Inspected data for Mon-Fri and weekends separately
- Location
  - Analysed by longitude and latitude, cross-street and zip code
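The filtering and grouping steps above can be sketched roughly as follows. This is a minimal illustration and not the queries actually used for the case study: the table name `collisions`, its column names, the timestamp format and the sample rows are all made up for the example, and the real NYPD export uses different column names. It runs the SQL through Python's built-in `sqlite3` module so the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema; the real NYPD export names its columns differently.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE collisions (
        crash_ts       TEXT,     -- 'YYYY-MM-DD HH:MM' (assumed format)
        borough        TEXT,
        persons_killed INTEGER,
        factor         TEXT      -- contributing factor, may be NULL
    )
    """
)

# Made-up sample rows, only to exercise the queries.
rows = [
    ("2014-03-03 08:15", "BROOKLYN", 0, "Driver Inattention/Distraction"),
    ("2015-06-10 16:40", "BROOKLYN", 1, "Failure to Yield Right-of-Way"),
    ("2016-01-09 23:05", "BROOKLYN", 1, "Unspecified"),
    ("2017-11-20 08:50", "BROOKLYN", 0, None),
    ("2016-05-02 08:30", "QUEENS", 0, "Driver Inattention/Distraction"),
    ("2018-02-01 16:10", "BROOKLYN", 0, "Driver Inattention/Distraction"),
]
conn.executemany("INSERT INTO collisions VALUES (?, ?, ?, ?)", rows)

# Step 1: restrict to Brooklyn, 2014-2017 (ISO timestamps compare correctly as text).
IN_SCOPE = """
    borough = 'BROOKLYN'
    AND crash_ts >= '2014-01-01' AND crash_ts < '2018-01-01'
"""

in_scope_total = conn.execute(
    f"SELECT COUNT(*) FROM collisions WHERE {IN_SCOPE}"
).fetchone()[0]

# Step 2: deadly collisions, keeping only rows with a usable factor.
deadly_factors = conn.execute(
    f"""
    SELECT factor, COUNT(*) AS n
    FROM collisions
    WHERE {IN_SCOPE}
      AND persons_killed > 0
      AND factor IS NOT NULL
      AND factor <> 'Unspecified'
    GROUP BY factor
    ORDER BY n DESC
    """
).fetchall()

# Step 3: counts by hour, with Mon-Fri and weekends kept separate.
# strftime('%w') returns '0' for Sunday through '6' for Saturday.
by_slot = conn.execute(
    f"""
    SELECT CASE WHEN strftime('%w', crash_ts) IN ('0', '6')
                THEN 'weekend' ELSE 'Mon-Fri' END AS day_type,
           CAST(strftime('%H', crash_ts) AS INTEGER) AS hr,
           COUNT(*) AS n
    FROM collisions
    WHERE {IN_SCOPE}
    GROUP BY day_type, hr
    ORDER BY n DESC, day_type, hr
    """
).fetchall()

print(in_scope_total)
print(deadly_factors)
print(by_slot)
```

Keeping the date and borough filter in one shared fragment means every later query works on the same subset, which makes the counts easier to compare.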
Key findings on accident patterns
- Most common factor associated with all collisions
  - Driver inattention
- Most common factors associated with deadly collisions
  - Driver inattention, Failure to Yield Right-of-Way, Disregard of Traffic Control
  - Driver inattention could lead to these other factors
- Collisions are most common during rush hour, Mon-Fri
- Collisions are most common at road intersections
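Rankings like the ones above come down to counting how often each factor or intersection appears. The sketch below shows one way to do that in Python with `collections.Counter`. The records and street names are invented placeholders, and excluding missing or "Unspecified" factors from the all-collisions ranking is an assumption here; the approach section only states that filter for deaths.

```python
from collections import Counter

# Invented placeholder records: (on_street, cross_street, persons_killed, factor)
records = [
    ("STREET A", "STREET B", 0, "Driver Inattention/Distraction"),
    ("STREET A", "STREET B", 1, "Failure to Yield Right-of-Way"),
    ("STREET A", "STREET B", 0, "Driver Inattention/Distraction"),
    ("STREET C", "STREET D", 1, "Driver Inattention/Distraction"),
    ("STREET C", None, 0, "Unspecified"),
]

# Factors that carry no usable information (assumed to be excluded everywhere).
EXCLUDED = {None, "Unspecified"}

# Factor counts across all collisions.
all_factors = Counter(f for _, _, _, f in records if f not in EXCLUDED)

# Factor counts for collisions with at least one death.
deadly_factors = Counter(
    f for _, _, killed, f in records if killed > 0 and f not in EXCLUDED
)

# Collision counts per intersection; rows without a cross street are skipped.
intersections = Counter(
    (on, cross) for on, cross, _, _ in records if on and cross
)

top_factor, top_factor_count = all_factors.most_common(1)[0]
top_spot, top_spot_count = intersections.most_common(1)[0]

print(top_factor, top_factor_count)
print(top_spot, top_spot_count)
```

The same counts could equally be produced in SQL with `GROUP BY` and `ORDER BY COUNT(*) DESC`.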
Recommendations to reduce collisions
- Reduce distracted driving
  - Driver Distraction/Inattention is the factor most often reported in all collisions and in collisions that cause deaths. Other factors (failure to yield, disregarded traffic controls) may also be attributable to driver distraction.
  - Recommend a data-driven distracted driving campaign to alert the public
    - New York launched a distracted driving campaign on April 9, 2019
- Reduce speeds at complex intersections
  - Collisions are most common at road intersections
  - The single biggest collision hotspot is a complex road intersection with multiple landmarks nearby
  - Reducing speed limits at these points may help reduce collisions and deaths