[1] Wu Yanru, Zhu Jie, Guan Meijing. Research on Layout Inspection Technology of Modern Tibetan Prints Based on Deep Learning[J]. Journal of Nanjing Normal University (Engineering and Technology), 2021, 21(01): 44-48. [doi:10.3969/j.issn.1672-1292.2021.01.007]

Research on Layout Inspection Technology of Modern Tibetan Prints Based on Deep Learning

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
21
Issue:
2021, No. 01
Pages:
44-48
Section:
Computer Science and Technology
Publication date:
2021-03-15

Article Info

Title:
Research on Layout Inspection Technology of Modern Tibetan Prints Based on Deep Learning
Article ID:
1672-1292(2021)01-0044-05
Author(s):
Wu Yanru 1,2; Zhu Jie 1,2; Guan Meijing 1,2
(1. School of Information Science and Technology, Tibet University, Lhasa 850000, China)
(2. National and Local Joint Center for Tibetan Information Technology, Lhasa 850000, China)
Keywords:
deep learning; modern Tibetan prints; Faster R-CNN; layout detection
CLC number:
TP391
DOI:
10.3969/j.issn.1672-1292.2021.01.007
Document code:
A
Abstract:
To address the uneven distribution of text lines in modern Tibetan book layouts and the large variation among modern Tibetan fonts, a layout text line detection algorithm based on Faster R-CNN is proposed. The model is trained on a collated and annotated dataset, with a ResNet-50 network extracting layout features from modern Tibetan books. To improve the model's generalization ability, transfer learning is applied starting from a network model pre-trained on the COCO dataset. Experimental results show that the method can locate text lines in the layouts of modern Tibetan printed materials, with a detection accuracy of 83% and a recall of 95%, which significantly improves the accuracy of layout detection.
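
The abstract reports an accuracy and a recall figure for detected text lines but does not say how predicted boxes are matched to annotated ones. The following is a minimal sketch of one common way to compute such figures, assuming that "accuracy" is meant as precision, that boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, and that a prediction counts as correct when its intersection-over-union (IoU) with an unmatched annotated line is at least 0.5. The function names, the threshold, and the sample boxes are all illustrative assumptions, not the authors' evaluation protocol.

```python
from typing import List, Tuple

# Axis-aligned box: (x_min, y_min, x_max, y_max). This layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    predicted: List[Box],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # hypothetical threshold, not from the paper
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched annotated line."""
    matched = set()
    true_positives = 0
    for p in predicted:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(ground_truth):
            if j in matched:
                continue
            overlap = iou(p, g)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Made-up example: two annotated text lines, three predicted boxes.
ground_truth = [(10, 10, 200, 30), (10, 40, 200, 60)]
predicted = [(12, 11, 198, 31), (10, 41, 150, 59), (10, 100, 200, 120)]

precision, recall = precision_recall(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case both annotated lines are found and one prediction is spurious, so recall exceeds precision. That is the same direction as the reported 95% recall against 83% accuracy, which would mean the detector misses few lines but emits some extra boxes.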


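Memo

Received: 2020-08-08.
Funding: Tibet University Graduate "High-Level Talent Training Program" Project (2017-GSP-131); Key Project of Higher Education Teaching Reform Research of the Tibet Autonomous Region; Research Project on a Multidisciplinary Emerging-Engineering Innovation and Entrepreneurship Education System; Tibetan-Chinese Bidirectional Machine Translation Platform Construction Project for the Inheritance and Development of the Tibetan Language; National Team and Key Laboratory Construction Project for Computer and Tibetan Information Technology (藏大财指[2018]81号); Key Special Project of the National Key R&D Program (2017YFB140220).
Corresponding author: Zhu Jie, Ph.D., professor, doctoral supervisor. Research interests: Tibetan information processing, data mining. E-mail: 790139756@qq.com
Last Update: 2021-03-15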