[1] Zhang Xiang, Xie Tian, Cao Jian, et al. Optimization of Process Parameters of Rotary Kiln Based on Model-Based Reinforcement Learning Under the Dual Objectives of Coal and Electricity[J]. Journal of Nanjing Normal University (Engineering and Technology), 2023, 23(01): 75-83. [doi:10.3969/j.issn.1672-1292.2023.01.010]

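Optimization of Process Parameters of Rotary Kiln Based on Model-Based Reinforcement Learning Under the Dual Objectives of Coal and Electricity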

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
23
Issue:
2023, No. 01
Pages:
75-83
Section:
Computer Science and Technology
Publication date:
2023-03-15

Article Info

Title:
Optimization of Process Parameters of Rotary Kiln Based on Model-Based Reinforcement Learning Under the Dual Objectives of Coal and Electricity
Article ID:
1672-1292(2023)01-0075-09
Author(s):
Zhang Xiang1, Xie Tian1, Cao Jian1, Zhu Yi2
(1. Luculent Smart Technology Co., Ltd., Nanjing 210005, China) (2. College of Information Engineering, Yangzhou University, Yangzhou 225000, China)
Keywords:
rotary kiln; process parameter optimization; probabilistic neural network; model-based offline policy optimization; coal-electricity dual objective
CLC number:
TP181
DOI:
10.3969/j.issn.1672-1292.2023.01.010
Document code:
A
Abstract:
To optimize rotary kiln process parameters under the dual objectives of coal and electricity consumption, this paper proposes a model-based reinforcement learning approach. First, historical process parameters and operating targets are processed and aggregated at fixed time intervals. Second, a probabilistic neural network is built to model the relationship between the kiln's control parameters, the influencing parameters, and the operating target values; this model serves as the reward model in the subsequent reinforcement learning framework. Then, a reinforcement learning algorithm based on model-based offline policy optimization is used to build an agent that recommends control parameters while jointly optimizing the coal and electricity consumption of the rotary kiln production process. Finally, a case study demonstrates the adaptability and efficiency of the proposed method for rotary kiln process parameter optimization.
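The abstract does not give the reward formula, so the following is only a minimal sketch of how an uncertainty-penalized reward in the style of model-based offline policy optimization might look for the coal-electricity objective. It assumes an ensemble of probabilistic models that each predict a mean and a standard deviation for coal and electricity consumption. The function name `penalized_reward`, the weighted-sum scalarization, the weights, and the penalty coefficient `lam` are all hypothetical and are not taken from the paper.

```python
import math


def penalized_reward(ensemble_preds, weights=(0.5, 0.5), lam=1.0):
    """Return a conservative reward for one candidate control setting.

    ensemble_preds: list of (means, stds) pairs, one per ensemble member,
        where means and stds are (coal, electricity) tuples.
    weights: assumed trade-off between coal and electricity (hypothetical).
    lam: assumed penalty coefficient on model uncertainty (hypothetical).
    """
    n = len(ensemble_preds)
    # Average the predicted consumption over the ensemble members.
    mean_coal = sum(means[0] for means, _ in ensemble_preds) / n
    mean_elec = sum(means[1] for means, _ in ensemble_preds) / n
    # Lower consumption is better, so the raw reward is the negated weighted sum.
    raw = -(weights[0] * mean_coal + weights[1] * mean_elec)
    # Uncertainty heuristic: largest norm of the predicted standard deviations.
    uncertainty = max(math.hypot(stds[0], stds[1]) for _, stds in ensemble_preds)
    # Subtracting the penalty discourages settings the models are unsure about.
    return raw - lam * uncertainty


preds = [((10.0, 20.0), (3.0, 4.0)), ((12.0, 22.0), (0.0, 1.0))]
print(penalized_reward(preds))
```

With `lam=0.0` the function reduces to the plain negated weighted consumption; larger values make the recommended settings stay closer to operating regions covered by the historical data.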

Memo:
Received: 2022-09-15.
Corresponding author: Zhu Yi, Ph.D.; research interests: data mining and knowledge graphs. E-mail: zhuyi@yzu.edu.cn
Last Update: 2023-03-15