Research vision: Integrating Autonomy into Societal Systems
The modern world is a global one. With increased connectivity, it has become easier than ever to introduce changes into global society, and their effects can spread faster than ever. These effects may be positive or negative, intended or unintended; for example, early evidence suggests that social media algorithms may be linked with political polarization. Under accelerated change, we must proactively consider the impacts of changes before they are introduced into societal systems. However, we lack adequate tools to do so.
My long-term goal is thus to develop algorithms and tools that increase society's capacity to make good decisions by analyzing the impacts of proposed changes amidst the complexities of the open world. These proposed changes could take an overwhelmingly broad range of forms: new technologies, algorithms, artificial intelligence (AI) models, regulations, business models, and so on. I focus specifically on studying the impact of autonomy, or advanced forms of automation, with the aim of informing its integration into society.
Establishing an empirical science for societal decision making
Simply stated, instead of deploying changes into the open world and following a wait-and-see strategy, we deploy changes in simulation and analyze their counterfactual outcomes. Ongoing advances in high-performance computing and artificial intelligence are making fine-grained analysis of complex simulated systems increasingly feasible. Consider, for example, urban mobility. By conducting detailed analyses of autonomous vehicles (AVs) in simulations modeled from real cities, we are contributing to a still-nascent body of knowledge around the system-level impacts of AVs. In particular, we employ deep reinforcement learning (RL) algorithms to model the behavior of AVs, which may be optimizing for selfish, prosocial, or business interests. By studying isolated traffic phenomena that are well-characterized by traffic flow theory, we uncovered that a small fraction of AVs (5-10%) can significantly improve traffic congestion and travel times (a 40-150% improvement). In this way, negative externalities, such as traffic congestion, can be rigorously analyzed to inform public policy and business models. These policies could induce self-driving fleet operators to help address long-standing societal challenges, by promoting equity of access and opportunity and by reducing energy consumption through advanced congestion management strategies. As such, fine-grained simulation-based counterfactual analysis promises to help guide the long, arduous integration of AVs into society.
The project is open source and is supported by Amazon, MIT RSC, NSF, and DOE. The Flow project website contains our publications and tutorials. Our work has been featured by Science, Wired, TWIML AI Podcast (twice!), O’Reilly (Chinese version), Berkeley College of Engineering, ABC News, Berkeley Lab, India Times, and Russian Forbes.
Flow: A Modular Learning Framework for Autonomy in Traffic
Cathy Wu, Aboudy Kreidieh, Kanaad Parvate, Eugene Vinitsky, Alexandre Bayen
IEEE Transactions on Robotics (T-RO). In review.
pdf / arXiv / videos / github / project page
Lagrangian Control through Deep-RL: Applications to Bottleneck Decongestion
Eugene Vinitsky, Kanaad Parvate, Abdul Rahman Kreidieh, Cathy Wu, Alexandre Bayen
IEEE Intelligent Transportation Systems Conference (ITSC), 2018.
Scalable learning for networked decision problems
To analyze scenarios at the level of complexity required for integrating autonomy into societal systems, new and more scalable optimization techniques are required. Deep reinforcement learning offers a promising class of simulation-based optimization methods that has the potential to scale by implicitly capturing pertinent low-dimensional structure via deep neural networks. Societal systems are a composite of discrete and continuous dynamics, operating over a network of agents, such as humans, vehicles, infrastructure elements, and buildings. We therefore study a variety of approaches that exploit the specific structure of networked decision problems to improve the scalability of deep reinforcement learning, including variance reduction of policy gradient methods for high-dimensional control (pictured right), multi-agent reinforcement learning, transfer learning for networked systems, and deep RL for combinatorial optimization.
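To make the variance-reduction idea concrete, here is a minimal sketch on a toy problem; it is an illustration under stated assumptions, not the algorithm or code from the paper listed below. It assumes a Gaussian policy with independent action dimensions and a hypothetical, additively separable reward, so that the baseline for one action dimension can be written in closed form as a function of the other dimensions only. The function names (`reward`, `grad_samples`) and all parameters are invented for this example.

```python
import random
import statistics


def reward(action, target):
    """Toy separable reward: negative squared distance to a target (assumption)."""
    return -sum((a - t) ** 2 for a, t in zip(action, target))


def grad_samples(mean, target, sigma, n_samples, use_baseline, seed=0):
    """Sample score-function gradient estimates w.r.t. the first policy mean.

    Policy: a_i ~ Normal(mean_i, sigma^2), independent across dimensions.
    Estimator: (a_0 - mean_0) / sigma^2 * (R(a) - baseline).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        action = [rng.gauss(m, sigma) for m in mean]
        score_0 = (action[0] - mean[0]) / sigma ** 2
        if use_baseline:
            # The baseline for dimension 0 depends only on the OTHER action
            # dimensions, so it leaves the estimator unbiased. For this
            # separable reward it is the realized reward of the other
            # dimensions plus the expected reward of dimension 0.
            others = -sum((a - t) ** 2 for a, t in zip(action[1:], target[1:]))
            own_expected = -((mean[0] - target[0]) ** 2 + sigma ** 2)
            baseline = others + own_expected
        else:
            baseline = 0.0
        samples.append(score_0 * (reward(action, target) - baseline))
    return samples


if __name__ == "__main__":
    dims = 10
    mean, target = [0.0] * dims, [1.0] * dims
    for flag in (False, True):
        g = grad_samples(mean, target, 1.0, 20000, use_baseline=flag, seed=1)
        print(f"baseline={flag}: mean={statistics.mean(g):.2f}, "
              f"variance={statistics.variance(g):.1f}")
```

In this toy setting both estimators target the same gradient, namely -2 (mean_0 - target_0), but the one with the action-dependent baseline removes the noise contributed by the other action dimensions. That noise grows with the number of dimensions, which is why this kind of structure matters for high-dimensional control.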
Variance Reduction for Policy Gradient Using Action-Dependent Factorized Baselines
Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel
International Conference on Learning Representations (ICLR), 2018. Oral (2%).
Deep Reinforcement Learning Symposium (NIPS), 2017. Contributed talk.
arXiv / OpenReview
Coping with imperfections of data-driven decision making
Another major research initiative is motivated by the general concern around deploying AI systems, due to limited transparency and robustness guarantees. That is, even if we were successful in scaling learning methods and in conducting counterfactual analysis for autonomy, the practical implications of these insights may be limited because the learned agents would be unsafe to deploy in the real world. To this end, we study numerous ways to leverage human supervision to enable the safe deployment of complex AI models in the near term, despite their shortcomings. We call this human-compatible reinforcement learning. Promising approaches in this direction include training human operators to imperfectly but safely execute AI-based autonomy models (such as AVs); advancing interpretability for RL-based AI systems; improving the scalability of human supervision; and improving off-policy learning methods to incorporate shifts in data distribution.
Human-compatible and tractable ridesharing
Mobility is embedded in an overall socioeconomic system, and one major anticipated long-term impact of automated vehicles is induced demand, in which more people travel in response to the newly available roadway capacity. This additional demand on the mobility system may erode the gains in road speed and throughput while raising energy consumption. We therefore study the dynamics of the overall socioeconomic system and, in particular, its couplings with the mobility system. To this end, in collaboration with Microsoft, we investigate human mobility preferences based on a user study of employees at a major technology corporation. We select ridesharing as a promising design paradigm within the mobility system, with the potential to mitigate the effects of induced demand by dramatically improving the throughput (supply). We propose that, with lightly modified existing infrastructure and, crucially, taking into account complex human factors, ridesharing has the potential to dramatically improve (nearly triple) the throughput of the mobility system, in particular through the effective use of high-occupancy vehicle (HOV) lanes. We propose algorithms to solve the allocation problem, and find that the structure of the ridesharing problem makes it amenable to clustering algorithms from machine learning, which can be used for set partitioning within a combinatorial optimization framework. Our work provides an example of how a careful synthesis of understanding human behavior, incremental system design, and new algorithms can promote sustainable urban mobility.
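As a rough illustration of how clustering can serve as a set-partitioning heuristic for carpool allocation, consider the sketch below. It is a deliberately simple greedy nearest-neighbor grouping, not the algorithm from the publications listed in this section, and it ignores the human factors emphasized above. The function name `greedy_carpools`, the use of planar home coordinates, and the fixed vehicle capacity are all assumptions made for this example.

```python
import math


def greedy_carpools(homes, capacity):
    """Partition rider indices into groups of at most `capacity` nearby riders.

    `homes` is a list of (x, y) coordinates, one per rider (hypothetical data).
    Each rider ends up in exactly one group, so the result is a set partition.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    unassigned = list(range(len(homes)))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        # Order the remaining riders by distance to the seed rider.
        unassigned.sort(key=lambda j: math.dist(homes[seed], homes[j]))
        # The seed rides with its nearest neighbors, up to the capacity.
        groups.append([seed] + unassigned[:capacity - 1])
        unassigned = unassigned[capacity - 1:]
    return groups


if __name__ == "__main__":
    homes = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (0, 0.1), (5, 5.1)]
    print(greedy_carpools(homes, capacity=3))
```

A greedy pass like this is fast but can be far from optimal. It is meant only to show the shape of the problem, in which riders are partitioned into capacity-limited groups according to a proximity measure.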
Learning and Optimization for Mixed Autonomy Systems – A Mobility Context
Chapter 8: Human mobility preferences.
Thesis. PhD, Electrical Engineering and Computer Sciences, UC Berkeley, 2018.
Optimizing the diamond lane: A more tractable carpool problem and algorithms
Cathy Wu, K. Shankari, Ece Kamar, Randy Katz, David Culler, Christos Papadimitriou, Eric Horvitz, Alexandre Bayen
IEEE Intelligent Transportation Systems Conference (ITSC), 2016.
proceedings / pdf
Urban-scale Traffic State Estimation using Cellular Network Data
We cannot control what we cannot measure. Traffic flow estimation is notoriously difficult due to the shortage, unreliability, and cost of sensors such as induction coils embedded underneath roadways. We therefore explored the possibility of using cellular network information to improve estimation of traffic flow in urban-scale networks. We devised a new convex optimization framework and algorithm to exploit the unique structure of cellular network data, in particular its simplex structure. The accuracy, computational efficiency, and versatility of the proposed approach are validated on the I-210 corridor near Los Angeles. We achieve 90% route flow accuracy (as compared to a 50% baseline) with 1,033 traffic sensors and 1,000 cellular towers covering a large network of highways and arterials with more than 20,000 links. Due to the high accuracy, our work may enable new short-time-horizon traffic applications concerning prediction, control, and operations. Our system is open source and was a collaboration with AT&T.
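To illustrate what a simplex constraint contributes to an estimation problem of this kind, the sketch below solves a tiny least-squares problem by projected gradient descent, using the standard sort-based Euclidean projection onto the simplex. It is a generic textbook technique shown under simplifying assumptions, not the Cellpath algorithm or its implementation. The names `project_to_simplex` and `estimate_route_split`, the sensor-route incidence matrix `A`, and the example numbers are all hypothetical.

```python
def project_to_simplex(v, total=1.0):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = total}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - total) / k
        # The condition holds for a prefix of the sorted values; the last
        # index where it holds gives the shift theta.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def estimate_route_split(A, b, n_steps=1000, step_size=0.1):
    """Minimize 0.5 * ||A x - b||^2 over the unit simplex.

    A[i][j] = 1 if sensor i lies on route j (assumed incidence matrix),
    b[i] is the flow fraction observed at sensor i, and x[j] is the
    unknown fraction of flow taking route j.
    """
    n = len(A[0])
    x = [1.0 / n] * n
    for _ in range(n_steps):
        residual = [sum(a * x_j for a, x_j in zip(row, x)) - b_i
                    for row, b_i in zip(A, b)]
        grad = [sum(A[i][j] * residual[i] for i in range(len(A)))
                for j in range(n)]
        x = project_to_simplex([x_j - step_size * g
                                for x_j, g in zip(x, grad)])
    return x


if __name__ == "__main__":
    A = [[1, 1, 0], [0, 1, 1]]   # two sensors, three candidate routes
    b = [0.8, 0.5]               # observed flow fractions
    print(estimate_route_split(A, b))
```

With two sensors and three routes, the measurements alone do not determine the route split. Requiring the split to be nonnegative and sum to one, as it would for the routes leaving a single cell, is what makes the solution unique in this small example.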
Cellpath: fusion of cellular and traffic sensor data for route flow estimation via convex optimization
Cathy Wu, Jerome Thai, Steve Yadlowsky, Alexei Pozdnoukhov, Alexandre Bayen
Transportation Research: Part C, 2015.
International Symposium on Transportation and Traffic Theory (ISTTT), 2015. Oral (14%).
journal / pdf / github (system) / github (algorithm)
Block simplex signal recovery: a method comparison and an application to routing
Cathy Wu, Alexei Pozdnoukhov, Alexandre Bayen
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2019.
journal / github