A Multimodal Emotion Recognition Method Based on Decision Level Fusion (一种基于决策层融合的多模态情感识别方法)

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1672-1292/CN:32-1684/T]

Volume:: 22
Issue:: 2022, No. 02

Pages:: 35-40

Section:: Computer Science and Technology

Publication date:: 2022-06-30

Article Info

Title:: A Multimodal Emotion Recognition Method Based on Decision Level Fusion

Article ID:: 1672-1292(2022)02-0035-06

Author(s):: Han Tianyi (韩天翊)¹,²; Lin Rongheng (林荣恒)¹,²
(1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China)
(2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Keywords:: emotion recognition; convolutional neural network; combination of software and hardware; multimodal; decision-level fusion

CLC number:: TP391

DOI:: 10.3969/j.issn.1672-1292.2022.02.006

Document code:: A

Abstract:: This paper presents a multimodal emotion recognition system that combines software and hardware. The system uses two modalities, speech and facial expressions, and recognizes and classifies emotions with Mel-frequency cepstral coefficients (MFCCs) and convolutional neural networks. Speech emotion recognition is offloaded to a neural network compute stick to reduce the load on the host environment. The two modalities are combined by decision-level fusion to improve recognition accuracy. Experimental results show that the system achieves high recognition accuracy and maintains its running speed in low-performance environments.
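To make the idea of decision-level fusion concrete, the sketch below combines the per-class probabilities produced by two independent classifiers (one for speech, one for facial expressions) with a weighted sum. This is only an illustration under stated assumptions: the abstract does not specify the fusion rule, the emotion classes, or the weights, so the label set `EMOTIONS`, the function names, and the `speech_weight` parameter are all hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical label set; the abstract does not list the emotion classes used.
EMOTIONS = ("angry", "happy", "neutral", "sad")


def fuse_decisions(
    speech_probs: Sequence[float],
    face_probs: Sequence[float],
    speech_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted-sum late fusion of two per-class probability vectors.

    Each modality is classified on its own first; only the resulting
    class scores are combined here. The weighting rule is an assumption,
    not the rule reported in the paper.
    """
    if len(speech_probs) != len(EMOTIONS) or len(face_probs) != len(EMOTIONS):
        raise ValueError("each modality must score every emotion class")
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")

    face_weight = 1.0 - speech_weight
    fused = [
        speech_weight * s + face_weight * f
        for s, f in zip(speech_probs, face_probs)
    ]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalize so the fused scores again sum to 1.
    return {label: score / total for label, score in zip(EMOTIONS, fused)}


def predict(fused: Dict[str, float]) -> str:
    """Return the emotion label with the highest fused score."""
    return max(fused, key=fused.get)


if __name__ == "__main__":
    speech = [0.10, 0.55, 0.25, 0.10]  # made-up speech classifier output
    face = [0.05, 0.30, 0.60, 0.05]    # made-up face classifier output
    print(predict(fuse_decisions(speech, face, speech_weight=0.4)))
```

With these made-up inputs, the speech model alone would choose "happy" and the face model alone "neutral"; the weight decides which modality dominates the final decision. Because fusion happens only on the classifier outputs, each modality can run on different hardware, which fits the abstract's description of moving speech recognition to a separate compute stick.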

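Memo

Memo:: Received: 2021-08-31.
Funding: Key Research and Development Program of Jiangxi Province (20212BBE51002).
Corresponding author: Lin Rongheng, Ph.D., associate professor. Research interests: reinforcement learning and generative adversarial research, cloud and edge computing, industrial big data analytics, big data and artificial intelligence. E-mail: rhlin@bupt.edu.cn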
