Structured Reinforcement Learning for Media Streaming at the Wireless Edge
arXiv (2024)
Abstract
Media streaming is the dominant application over wireless edge (access)
networks. The increasing softwarization of such networks has led to efforts at
intelligent control, wherein application-specific actions may be dynamically
taken to enhance the user experience. The goal of this work is to develop and
demonstrate learning-based policies for optimal decision making to determine
which clients to dynamically prioritize in a video streaming setting. We
formulate the policy design question as a constrained Markov decision problem
(CMDP), and observe that by using a Lagrangian relaxation we can decompose it
into single-client problems. Further, the optimal policy takes a threshold form
in the video buffer length, which enables us to design an efficient constrained
reinforcement learning (CRL) algorithm to learn it. Specifically, we show that
a natural policy gradient (NPG) based algorithm that is derived using the
structure of our problem converges to the globally optimal policy. We then
develop a simulation environment for training, and a real-world intelligent
controller attached to a WiFi access point for evaluation. We empirically show
that the structured learning approach enables fast learning. Furthermore, such
a structured policy can be easily deployed due to low computational complexity,
leading to policy execution taking only about 15μs. Using YouTube streaming
experiments in a resource-constrained scenario, we demonstrate that the CRL
approach can increase QoE by over 30%.
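
To make the threshold structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a client is prioritized when its video buffer falls below a learned threshold (the abstract states only that the policy is a threshold in buffer length, so the direction is an assumption), and that the Lagrangian relaxation charges a per-client price `lam` for each use of priority service. All names (`threshold_policy`, `lagrangian_reward`, `threshold_s`, `lam`) are hypothetical.

```python
def threshold_policy(buffer_len_s: float, threshold_s: float) -> int:
    """Return 1 (prioritize the client) or 0 (do not prioritize).

    Assumption: priority is given when the buffer is BELOW the threshold.
    """
    return 1 if buffer_len_s < threshold_s else 0


def lagrangian_reward(qoe: float, action: int, lam: float) -> float:
    """Single-client reward after relaxation (assumed form).

    The shared resource constraint is replaced by a price `lam` paid
    whenever the client receives priority service (action == 1).
    """
    return qoe - lam * action


# Hypothetical buffer lengths (seconds) for two clients.
clients = {"a": 1.5, "b": 12.0}
prioritized = [c for c, b in clients.items() if threshold_policy(b, 5.0)]
```

Because each decision is a single comparison per client, evaluating such a policy is cheap, which is consistent with the low execution time reported above.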