Approximate Multiagent Reinforcement Learning for On-Demand Urban Mobility Problem on a Large Map (extended version)
arxiv(2023)
摘要
In this paper, we focus on the autonomous multiagent taxi routing problem for
a large urban environment where the location and number of future ride requests
are unknown a-priori, but can be estimated by an empirical distribution. Recent
theory has shown that a rollout algorithm with a stable base policy produces a
near-optimal stable policy. In the routing setting, a policy is stable if its
execution keeps the number of outstanding requests uniformly bounded over time.
Although, rollout-based approaches are well-suited for learning cooperative
multiagent policies with considerations for future demand, applying such
methods to a large urban environment can be computationally expensive due to
the large number of taxis required for stability. In this paper, we aim to
address the computational bottleneck of multiagent rollout by proposing an
approximate multiagent rollout-based two phase algorithm that reduces
computational costs, while still achieving a stable near-optimal policy. Our
approach partitions the graph into sectors based on the predicted demand and
the maximum number of taxis that can run sequentially given the user's
computational resources. The algorithm then applies instantaneous assignment
(IA) for re-balancing taxis across sectors and a sector-wide multiagent rollout
algorithm that is executed in parallel for each sector. We provide two main
theoretical results: 1) characterize the number of taxis m that is sufficient
for IA to be stable; 2) derive a necessary condition on m to maintain
stability for IA as time goes to infinity. Our numerical results show that our
approach achieves stability for an m that satisfies the theoretical
conditions. We also empirically demonstrate that our proposed two phase
algorithm has equivalent performance to the one-at-a-time rollout over the
entire map, but with significantly lower runtimes.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要