[1] Mao Jin, Xiong Ke, Wei Ning, et al. Power Control in Ultra Dense Network: A Deep Reinforcement Learning Based Method[J]. Journal of Nanjing Normal University (Engineering and Technology), 2022, 22(01): 16-23. [doi:10.3969/j.issn.1672-1292.2022.01.003]

Multi-User Uplink Power Control in Ultra-Dense Networks Based on Deep Reinforcement Learning

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
22
Issue:
2022, No. 01
Pages:
16-23
Section:
Machine Learning
Publication date:
2022-03-15

Article Info

Title:
Power Control in Ultra Dense Network: A Deep Reinforcement Learning Based Method
Article ID:
1672-1292(2022)01-0016-08
Author(s):
Mao Jin(1,2), Xiong Ke(1,2), Wei Ning(3,4), Zhang Yu(5), Zhang Ruichen(1,2)
(1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China) (2. Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing 100044, China) (3. ZTE Corporation, Shenzhen 518057, China) (4. State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518055, China) (5. State Grid Energy Research Institute Co., Ltd., Beijing 102209, China)
Keywords:
ultra-dense networks; power control; information capacity; quality of service (QoS); deep reinforcement learning
CLC number:
TP391
DOI:
10.3969/j.issn.1672-1292.2022.01.003
Document code:
A
Abstract:
In ultra-dense networks, users are numerous and close to one another, so mutual interference during communication is strong and spectrum utilization is low. To address this, an optimization problem is formulated that controls the transmit power to jointly increase the system information capacity and the number of users whose quality of service (QoS) requirements are met. Because the problem is non-convex and the power control variable is discrete, it is modeled as a Markov decision process. On this basis, a power control algorithm based on deep reinforcement learning is proposed, with a corresponding action space, state space, and reward function. Simulation results show that, compared with the maximum transmit power strategy and the random transmit power strategy, the proposed algorithm improves the information capacity by at least 15.9% and the user QoS satisfaction rate by at least 10.7%. Compared with an algorithm that does not consider improving the QoS satisfaction rate, the proposed algorithm raises the QoS satisfaction rate at the cost of a moderate reduction in information capacity.
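The two quantities the abstract trades off, information capacity and the share of users meeting their QoS requirement, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the interference model (a gain matrix and a fixed noise power), the QoS criterion (a minimum rate `rate_min`), the weighted-sum `reward`, and the exhaustive search over discrete power levels, which merely stands in for the learned deep reinforcement learning policy.

```python
import itertools
import math


def sinr(powers, gains, noise):
    """Per-user SINR. gains[i][j] is the (hypothetical) channel gain
    from user j's transmitter to user i's serving receiver."""
    values = []
    for i, p_i in enumerate(powers):
        interference = sum(
            p_j * gains[i][j] for j, p_j in enumerate(powers) if j != i
        )
        values.append(p_i * gains[i][i] / (interference + noise))
    return values


def evaluate(powers, gains, noise, rate_min):
    """Return (sum capacity in bit/s/Hz, fraction of users with rate >= rate_min)."""
    rates = [math.log2(1.0 + s) for s in sinr(powers, gains, noise)]
    satisfied = sum(1 for r in rates if r >= rate_min)
    return sum(rates), satisfied / len(rates)


def reward(powers, gains, noise, rate_min, weight=1.0):
    """Illustrative reward: capacity plus a weighted QoS satisfaction rate.
    The weighted-sum form is an assumption, not the paper's reward function."""
    capacity, qos_rate = evaluate(powers, gains, noise, rate_min)
    return capacity + weight * qos_rate


def best_discrete_powers(levels, gains, noise, rate_min, weight=1.0):
    """Exhaustive search over discrete power levels (tiny cases only).
    A stand-in for the learned policy, used here just to compare strategies."""
    n_users = len(gains)
    return max(
        itertools.product(levels, repeat=n_users),
        key=lambda p: reward(p, gains, noise, rate_min, weight),
    )


if __name__ == "__main__":
    gains = [[1.0, 0.5], [0.5, 1.0]]  # two mutually interfering users
    noise, rate_min, levels = 0.1, 1.0, (0.0, 0.5, 1.0)
    print("max power:", evaluate((1.0, 1.0), gains, noise, rate_min))
    print("one user silent:", evaluate((1.0, 0.0), gains, noise, rate_min))
    print("best by search:", best_discrete_powers(levels, gains, noise, rate_min))
```

In this toy setting, silencing one user removes the interference seen by the other and raises the sum capacity, but only half of the users then meet the rate threshold. Increasing `weight` in the hypothetical reward shifts the preferred power setting toward higher QoS satisfaction, which mirrors the capacity-versus-QoS trade-off described in the abstract.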


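Memo:
Received: 2021-08-31.
Funding: National Natural Science Foundation of China (62071033), National Key Research and Development Program of China (2020YFB1806903), and a research project of State Grid Energy Research Institute Co., Ltd. (526700190002).
Corresponding author: Xiong Ke, Ph.D., professor; research interests: wireless networks, Internet of Things, network information theory. E-mail: kxiong@bjtu.edu.cn
Last Update: 2022-03-15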