Deep Reinforcement Learning for Scheduling Uplink IoT Traffic with Strict Deadlines

2021 IEEE Global Communications Conference (GLOBECOM), 2021

Abstract
This paper considers the Multiple Access problem where N Internet of Things (IoT) devices share a common wireless medium to reach a central Base Station (BS). We propose a Reinforcement Learning (RL) method where the BS is the agent and the devices are part of the environment. A device is allowed to transmit only when the BS decides to schedule it. In addition to information packets, devices send additional messages such as the delay or the number of packets discarded since their last transmission. This information is used to design the RL reward function and constitutes the next observation that the agent can use to schedule the next device. Leveraging RL allows us to learn the sporadic and heterogeneous traffic patterns of the IoT devices and an optimal scheduling policy that maximizes the channel throughput. We adapt the Proximal Policy Optimization (PPO) algorithm with a Recurrent Neural Network (RNN) to handle the partial observability of our problem and exploit the temporal correlations of the users' traffic. We demonstrate the performance of our model through simulations with different numbers of heterogeneous devices with periodic traffic and individual latency constraints. We show that our RL algorithm outperforms traditional scheduling schemes and distributed medium access algorithms.
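The scheduling loop described in the abstract can be illustrated with a toy environment. The following is a minimal sketch under stated assumptions, not the paper's model: the class names `Device` and `UplinkEnv`, the slotted periodic traffic model, the observation tuple `(delay, discarded)`, and the reward (one point per delivered packet minus the reported discards) are all hypothetical choices made for illustration. The PPO agent and the RNN are omitted; a round-robin scheduler stands in as a simple baseline.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Device:
    """Hypothetical IoT device with periodic traffic and a per-packet deadline."""
    period: int        # a new packet arrives every `period` slots
    deadline: int      # a packet is discarded after waiting `deadline` slots
    offset: int = 0    # slot of the first arrival
    queue: List[int] = field(default_factory=list)  # arrival slots of buffered packets
    discarded: int = 0  # packets dropped since this device was last scheduled


class UplinkEnv:
    """Toy slotted uplink: the BS (agent) schedules exactly one device per slot."""

    def __init__(self, devices: List[Device]):
        self.devices = devices
        self.t = 0

    def step(self, action: int) -> Tuple[Tuple[int, int], int]:
        # 1. Packet arrivals at the start of slot t.
        for d in self.devices:
            if self.t >= d.offset and (self.t - d.offset) % d.period == 0:
                d.queue.append(self.t)

        # 2. Only the scheduled device transmits; it also reports the delay of
        #    the sent packet and its discard count since its last transmission.
        dev = self.devices[action]
        if dev.queue:
            delay = self.t - dev.queue.pop(0)
            delivered = 1
        else:
            delay = -1  # assumption: -1 marks a wasted (idle) slot
            delivered = 0
        obs = (delay, dev.discarded)
        reward = delivered - dev.discarded  # assumed reward shaping
        dev.discarded = 0

        # 3. Advance time and drop packets whose deadline has passed.
        self.t += 1
        for d in self.devices:
            kept = [a for a in d.queue if self.t - a < d.deadline]
            d.discarded += len(d.queue) - len(kept)
            d.queue = kept
        return obs, reward


def run_round_robin(env: UplinkEnv, slots: int) -> int:
    """Baseline scheduler: poll devices cyclically and sum the rewards."""
    total = 0
    for t in range(slots):
        _, r = env.step(t % len(env.devices))
        total += r
    return total
```

In this sketch the agent never sees the queues of unscheduled devices, which is the source of the partial observability mentioned above; a learned policy would replace `run_round_robin` and map the history of `(delay, discarded)` reports to the next device to schedule.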
Keywords
Multiple Access, Reinforcement Learning, Proximal Policy Optimization, POMDP, Internet of Things, Wireless sensor networks, scheduling