[1] Huang Jiangtao, Liu Gang, Zhou Pan, et al. Research on Autonomous Landing Control of Carrier-borne UCAV Based on Deep Reinforcement Learning Technology[J]. Journal of Nanjing Normal University (Engineering and Technology), 2022, 22(03): 63-71. [doi:10.3969/j.issn.1672-1292.2022.03.009]

Research on Autonomous Landing Control of Carrier-borne UCAV Based on Deep Reinforcement Learning Technology

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
22
Issue:
2022(03)
Pages:
63-71
Section:
Computer Science and Technology
Publication Date:
2022-09-15

Article Info

Title:
Research on Autonomous Landing Control of Carrier-borne UCAV Based on Deep Reinforcement Learning Technology
Article ID:
1672-1292(2022)03-0063-09
Author(s):
Huang Jiangtao, Liu Gang, Zhou Pan, Zhang Sheng, Du Xin
(Aerospace Technology Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China)
Keywords:
reinforcement learning; carrier-borne UAV; intelligent carrier landing; control surface command; deep neural network
CLC Number:
V211.3
DOI:
10.3969/j.issn.1672-1292.2022.03.009
Document Code:
A
Abstract:
Autonomous landing is an important problem and a key technology for future carrier-borne UAVs. Based on the TD3 algorithm, combined with the six-degree-of-freedom (6-DOF) motion model of the carrier aircraft and the motion model of the aircraft carrier, an interactive deep reinforcement learning simulation environment is constructed. During simulation training for typical sea states, a simplified carrier motion model is established that accounts for sea conditions, the carrier's three linear disturbances (surge, sway, and heave), and its three angular disturbances (roll, pitch, and yaw). Based on the aerodynamic data of a certain type of aircraft, an aerodynamic force model is built and a 6-DOF kinematics/dynamics model is established. Building on the TD3 reinforcement learning algorithm, an auxiliary network and adaptive variance and learning-step adjustment schemes are further introduced to accelerate convergence and improve training stability. Combined with feedforward deep neural network technology, an interactive carrier landing training environment is established on a high-performance GPU workstation. Through "trial-and-error" training of a certain type of carrier-borne UAV in this model-free interactive environment, the feasibility of AI technology for autonomous carrier landing control is verified.
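The TD3 mechanics the abstract builds on (clipped double-Q learning plus target policy smoothing) can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the tiny linear actor and critics, their weight shapes, and the state/action dimensions are stand-ins for the feedforward deep networks described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def actor(state, w):
    """Deterministic policy: maps a state to a control-surface command in [-1, 1]."""
    return np.tanh(state @ w)

def critic(state, action, w):
    """Q-value of a state-action pair under a linear stand-in approximator."""
    return np.concatenate([state, action]) @ w

def td3_target(r, s_next, gamma, w_actor, w_q1, w_q2,
               noise_std=0.2, noise_clip=0.5):
    """TD3 critic target: smoothed target action, then the minimum of two critics."""
    a_next = actor(s_next, w_actor)
    eps = np.clip(rng.normal(0.0, noise_std, a_next.shape),
                  -noise_clip, noise_clip)          # target policy smoothing
    a_next = np.clip(a_next + eps, -1.0, 1.0)       # keep the command in range
    q_min = min(critic(s_next, a_next, w_q1),
                critic(s_next, a_next, w_q2))       # clipped double-Q estimate
    return r + gamma * q_min

# Toy dimensions for illustration only.
state_dim, action_dim = 4, 2
w_actor = rng.normal(size=(state_dim, action_dim))
w_q1 = rng.normal(size=state_dim + action_dim)
w_q2 = rng.normal(size=state_dim + action_dim)

s_next = rng.normal(size=state_dim)
y = td3_target(1.0, s_next, 0.99, w_actor, w_q1, w_q2)
print(float(y))
```

Taking the minimum of the two critics counteracts the Q-value overestimation that destabilizes DDPG-style training, and the clipped noise on the target action smooths the value estimate along nearby commands, both of which matter when the learned policy outputs continuous control-surface deflections.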

References:

[1] AN Jun. Preliminary study on carrier air wake simulation and carrier landing control of carrier-based aircraft[D]. Wuhan: Huazhong University of Science and Technology, 2012.
[2] GONG Pengxiao, ZHAN Hao, LIU Zidong. Carrier landing control and simulation of carrier-based aircraft under the influence of ship air wake[J]. Advances in Aeronautical Science and Engineering, 2013, 4(3): 339-345, 357.
[3] ZHANG Xiaowei. Research on carrier landing control technology for a flying-wing carrier-based UAV[D]. Nanjing: Nanjing University of Aeronautics and Astronautics, 2017.
[4] ZHENG Fengying, GONG Huajun, ZHEN Ziyang. UAV carrier landing guidance algorithm based on dynamically changing coordinate systems[J]. Journal of Central South University (Science and Technology), 2016, 47(8): 2685-2693.
[5] ANDERSON M R. Inner and outer loop manual control of automatic carrier landing[C]//Proceedings of the 1996 Guidance, Navigation, and Control Conference (AIAA). San Diego, USA: AIAA, 1996.
[6] HUANG Xu, LIU Jiarun, JIA Chenhui, et al. Deep deterministic policy gradient algorithm for control of unmanned aerial vehicles[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(11): 397-407.
[7] YU Yang. Research and implementation of actor-critic algorithm models for autonomous carrier landing of aircraft[D]. Beijing: Beijing Jiaotong University, 2019.
[8] WU Zhaoxin, LI Hui, WANG Zhuang, et al. Design of an intelligent simulation platform based on deep reinforcement learning[J]. Tactical Missile Technology, 2020(4): 193-200.
[9] FANG Zhenping, CHEN Wanchun, ZHANG Shuguang. Flight dynamics of aerial vehicles[M]. Beijing: Beihang University Press, 2005.
[10] GARNETT T S. Investigation to study the aerodynamic ship wake turbulence generated by a DD963 Destroyer: ADA083663[R]. Philadelphia, USA: Boeing Vertol Co., 1979.
[11] COLAS C, SIGAUD O, OUDEYER P Y. GEP-PG: decoupling exploration and exploitation in deep reinforcement learning algorithms[C]//Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018.
[12] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. Computer Science, 2015, 8(6): A187.
[13] JIAO Licheng, ZHAO Jin, YANG Shuyuan, et al. Deep learning, optimization and recognition[M]. Beijing: Tsinghua University Press, 2017.
[14] TAN Lang. Research on the application of reinforcement learning in multi-agent confrontation[D]. Beijing: China Academy of Launch Vehicle Technology, 2019.
[15] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, MA, USA: MIT Press, 2018.


Memo:
Received: 2021-08-31.
Corresponding author: Du Xin, Ph.D., assistant research fellow; research interests: flight control technology. E-mail: f_yforever@126.com
Last Update: 2022-09-15