Highlighted Learnings:
- Location Data is important to multiple industries, but understanding the paths people take is more complicated than just connecting the dots
- Citilabs’ Streetlytics calculates the path of every vehicle on the road by combining location data, logical predictions, and traffic counts using an optimization process it has built over the past 40 years
- Streetlytics calculates the average annual movement in the US for four day types (Monday through Thursday, Friday, Saturday, and Sunday) and will soon report average monthly movement
- Streetlytics provides the volumes, speeds, origins, destinations, home locations, and paths for every section of road in America
Location Data is exploding, and it is driving innovation. But beware! Understanding a sample of movement is not the same as understanding how people move. Companies sell Location Data collected from smartphones, GPS probes, and other such devices. Location Data can tell you where the sampled people are coming from and going to, when they are traveling, and where they live and work. But Location Data by itself cannot answer the questions of “how many” and “how did they get there.” There are two reasons. First, Location Data is collected from a segment of the overall population: users of an app, owners of a certain brand of car, or subscribers to a specific cell phone carrier. Second, Location Data does not typically have enough location points to determine the route and mode of travel.
For many industries and applications, answering these questions is very important. “How many people see my message on an average day?”, “How many people travel from one area to another area?”, and “How many vehicles will be on this segment of road at 5pm?” are all important questions. Without a rich understanding of the movement of people, they cannot be answered accurately.
Using Optimization to Provide Total Population Movement
In a simple weighting process, one takes Location Data collected from users of a mobile app, determines where those users live, and then estimates a weight based on the number of users of the app compared to the population of the area. This equates to proportionally scaling the sample to the full population based solely on the penetration rate of the app’s use in the area.
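The arithmetic behind simple weighting can be shown in a few lines. This is only a minimal sketch: the function names, area labels, and every number below are made up for illustration and do not come from Streetlytics or from any real dataset.

```python
def expansion_weights(app_users_by_area, population_by_area):
    """Weight for each home area = population / app users (inverse penetration rate)."""
    weights = {}
    for area, users in app_users_by_area.items():
        if users <= 0:
            raise ValueError(f"no app users observed in area {area!r}")
        weights[area] = population_by_area[area] / users
    return weights


def scale_trips(sample_trips, weights):
    """Scale sampled trips, keyed by (home_area, destination_area), by the home-area weight."""
    return {od: count * weights[od[0]] for od, count in sample_trips.items()}


# Hypothetical inputs: two areas with the same population but different app penetration.
app_users = {"A": 50, "B": 200}
population = {"A": 1000, "B": 1000}
sample_trips = {("A", "B"): 10, ("B", "A"): 10}

weights = expansion_weights(app_users, population)
scaled = scale_trips(sample_trips, weights)
print(weights)
print(scaled)
```

In this toy case each sampled user in area A stands in for 20 residents and each sampled user in area B for 5, so the same 10 observed trips scale to very different totals. Nothing in the calculation checks whether app users travel like their neighbors.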
While that may sound adequate, simple weighting alone is not very accurate. When the scaled results are compared against independent data sources, such as traffic counts and data quantifying the number of people who live and work in a location, they frequently fail to match.
Streetlytics uses a proprietary optimization process to accurately weight and adjust Location Data. The Streetlytics optimization process works in the following way:
- First, we create the location data view by scaling the source location data using traditional survey weighting processes. This provides the scaled estimate of total population movement based on the source Location Data.
- Second, we build a logical view of the probable movement of total population using modeling techniques that we have developed over the last 30 years. We apply predictive modeling algorithms, developed through the analysis of robust national and regional government travel surveys, to accurate, up-to-date household and employment data. These algorithms calculate “how many” trips are most likely to travel from each neighborhood to each neighborhood, what time of day those trips are most likely to occur, and what mode of travel is most likely being used, based on travel behavior captured in the travel surveys. For instance, if 500 households of a specific type live in a neighborhood, own a certain number of cars, have access to a certain level of transit service, encounter a certain level of traffic congestion, and have a certain number of shops nearby, we can determine ‘how many’ trips ‘should’ be occurring and ‘where’. This ‘logical view’ provides a rationality check in our weighting process and is completely independent of any source Location Data.
- Third, we collect all of the ground truth traffic counts available in each area. Traffic counts have been routinely gathered by governments for years. These counts may not be perfect, as they are also samples, are also weighted, and are collected using a variety of techniques, but they provide another independent view of “how many” vehicles and people are traveling on a region’s roadways. Knowing how many vehicles each roadway has historically served, including details about how the volumes vary by time of day, day of week, and during the course of a year, ensures that Streetlytics explicitly reflects local traffic dynamics.
- Lastly, we apply an optimization process that integrates each of these different ‘views’ of population movement. The weighted and corrected Location Data provides one view, the logical view provides a second, and the traffic counts provide a third. The optimization weights each view according to its characteristics and the quality of its underlying data, then combines the independent views into an optimized understanding of total population movement in a region. This process takes the best of each data source: it maintains the important spatial and temporal patterns provided by the Location Data, keeps the overall rationality that the logical view provides by applying behavioral models to current population and employment data, and constrains the calculated movements to the measured ground truth.
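The Streetlytics optimization itself is proprietary and is not described in detail here. As a generic illustration of what it means to combine independent views with quality-based weights, the sketch below uses a textbook technique, inverse-variance weighting, on a single road segment. The technique is swapped in for illustration only. The names `ViewEstimate` and `combine_views`, the volumes, and the variances are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ViewEstimate:
    name: str        # which view produced the estimate
    volume: float    # estimated daily vehicles on the segment (made-up number)
    variance: float  # assumed uncertainty of that view; smaller means more trusted


def combine_views(views):
    """Inverse-variance weighted average: more reliable views pull the result harder."""
    if any(v.variance <= 0 for v in views):
        raise ValueError("every variance must be positive")
    precisions = [1.0 / v.variance for v in views]
    weighted_sum = sum(p * v.volume for p, v in zip(precisions, views))
    return weighted_sum / sum(precisions)


# Three hypothetical views of the same road segment.
views = [
    ViewEstimate("scaled location data", 900.0, 4.0),
    ViewEstimate("logical view", 1200.0, 4.0),
    ViewEstimate("traffic count", 1000.0, 1.0),
]

combined = combine_views(views)
print(round(combined, 1))
```

With these invented inputs the combined estimate is about 1016.7, pulled toward the traffic count because it was assigned the smallest variance. A real system would have to work across a whole network of origins, destinations, and paths rather than one segment at a time.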
This process produces a robust and accurate understanding of the entire moving population—telling us where people are coming from and going to, what they pass by, when they travel, where they live and work, and what mode they are likely using.