High Dimensional Visualizations of Transportation Demand and Supply Conditions
Description
Abstract:
A growing number of public and private datasets allow transportation professionals to measure demand on a daily basis. Understanding how demand and supply conditions vary from day to day, and analyzing the patterns and relationships that emerge, can be of value to many applications. For example, modelers can improve their survey-based estimates of trip tables, and transportation practitioners can optimize the management of for-hire vehicle fleets, which are of growing importance to the evolution of mobility. This paper applies high-dimensional visualization algorithms and clustering techniques to high-dimensional supply and demand data, including daily taxi and Citi Bike zone-to-zone flows in New York for an entire year. Multidimensional scaling (MDS) is used to linearly project the high-dimensional space, and t-distributed stochastic neighbor embedding (t-SNE) is used to project the datasets onto a two-dimensional map. The resulting two-dimensional projections of the high-dimensional data provide novel insights into the similarity of zones based on the distribution and volume of the trips originating from them. In particular, the t-SNE projection of taxi and bike flows results in a map that is highly clustered by New York neighborhood. The similarities and separation of weekday and weekend travel can also be clearly seen in the t-SNE projections. MDS, on the other hand, is found to be much more effective than t-SNE in revealing the global structure and outliers in the data. Finally, metrics are introduced that quantify the separation of the different clusters seen on the projected maps and the inherent dimensionality of the high-dimensional transportation data.
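As an illustration of the linear projection step described in the abstract, the following is a minimal sketch of classical (Torgerson) MDS applied to a small synthetic origin-destination matrix. It is not the paper's implementation: the function names `zone_features` and `classical_mds`, the synthetic Poisson flows, the row normalization of flows into destination shares, and the choice of Euclidean distance are all assumptions made for the example.

```python
import numpy as np


def zone_features(flows):
    """Describe each origin zone by the share of its trips going to each
    destination (hypothetical feature choice for this sketch)."""
    flows = np.asarray(flows, dtype=float)
    totals = flows.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for zones with no trips
    return flows / totals


def classical_mds(X, n_components=2):
    """Classical (Torgerson) MDS on Euclidean distances between rows of X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of rows.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Double-center the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * (J @ d2 @ J)
    # Keep the eigenvectors with the largest eigenvalues.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    scale = np.sqrt(np.clip(vals[order], 0.0, None))
    return vecs[:, order] * scale


# Synthetic example (assumed data): 6 origin zones, 8 destination zones.
# Zones 0-2 mostly send trips to destinations 0-3, zones 3-5 to 4-7.
rng = np.random.default_rng(0)
rates = np.full((6, 8), 2.0)
rates[:3, :4] = 50.0
rates[3:, 4:] = 50.0
flows = rng.poisson(lam=rates)

features = zone_features(flows)
embedding = classical_mds(features, n_components=2)
print(embedding.shape)
```

In this toy setup the two groups of zones fall on opposite sides of the first MDS axis. A nonlinear map comparable to the t-SNE projection could be obtained from the same `features` array with an existing implementation such as scikit-learn's `sklearn.manifold.TSNE`, whose perplexity must be smaller than the number of samples.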