This work explains the method and process used to manipulate and analyze Waze traffic data. The project focuses on presenting a clear workflow for handling big data files and cleaning data with Hadoop and Hive. The cleaned data is then analyzed in Excel and Power BI, producing visuals such as maps, timelines, and charts of traffic conditions in LA County. Because access to the data was limited, this paper presents a method and a prototype analysis on a portion of the data; further insights can be obtained by applying the same workflow to the full dataset (100GB+).
The schemas for the files used were taken from the following source: https://docs.google.com/presentation/d/1jkfyaSE1hkv8tkrYFFgEmgB7L5LU5XZfaMvNzN6IOx0/edit?ts=5c05a21f#slide=id.g4625f8ce11_1_1231