[1]LI K,WANG S J,ZHANG X,et al. Pose recognition with cascade transformers[C]//IEEE Conference on Computer Vision and Pattern Recognition. Nashville,TN,USA,2021.
[2]CAO Z,SIMON T,WEI S E,et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//IEEE Conference on Computer Vision and Pattern Recognition. Honolulu,HI,USA,2017.
[3]KOCABAS M,KARAGOZ S,AKBAS E. Multiposenet:Fast multi-person pose estimation using pose residual network[C]//European Conference on Computer Vision. Munich,Germany,2018.
[4]PAPANDREOU G,ZHU T,CHEN L C,et al. Personlab:Person pose estimation and instance segmentation with a bottom-up,part-based,geometric embedding model[C]//European Conference on Computer Vision. Munich,Germany,2018.
[5]NEWELL A,HUANG Z A,DENG J. Associative embedding:End-to-end learning for joint detection and grouping[C]//NeurIPS. Long Beach,CA,USA,2017.
[6]INSAFUTDINOV E,PISHCHULIN L,ANDRES B,et al. Deepercut:A deeper,stronger,and faster multi-person pose estimation model[C]//European Conference on Computer Vision. Amsterdam,The Netherlands,2016.
[7]KONG Y H,QIN Y F,ZHANG K. A survey of deep-learning-based 2D human pose estimation methods[J]. Journal of Image and Graphics,2023,28(7):1965-1989. (in Chinese)
[8]CHENG B W,XIAO B,WANG J D,et al. HigherHRNet:Scale-aware representation learning for bottom-up human pose estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle,WA,USA,2020.
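[9]ZOU Y X,HE N,GUO Y X,et al. A survey of deep-learning-based human pose estimation[C]//27th Annual Conference on New Network Technologies and Applications,Network Application Branch of the China Computer Users Association. Zhenjiang,Jiangsu,China,2023. (in Chinese)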
[10]XIAO B,WU H P,WEI Y C. Simple baselines for human pose estimation and tracking[C]//European Conference on Computer Vision. Munich,Germany,2018.
[11]WANG J D,SUN K,CHENG T H,et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2021,43(10):3349-3361.
[12]CHENG B W,WEI Y C,FERIS R,et al. Decoupled classification refinement:Hard false positive suppression for object detection[J]. arXiv Preprint arXiv:1810.04002,2018.
[13]CHENG B W,WEI Y C,SHI H H,et al. Revisiting rcnn:On awakening the classification power of faster rcnn[C]//European Conference on Computer Vision. Munich,Germany,2018.
[14]NEWELL A,YANG K,DENG J. Stacked hourglass networks for human pose estimation[C]//European Conference on Computer Vision. Amsterdam,The Netherlands,2016.
[15]CHU X,YANG W,OUYANG W,et al. Multi-context attention for human pose estimation[C]//IEEE Conference on Computer Vision and Pattern Recognition. Honolulu,HI,USA,2017.
[16]SUN K,XIAO B,LIU D,et al. Deep high-resolution representation learning for human pose estimation[C]//IEEE Conference on Computer Vision and Pattern Recognition. Long Beach,CA,USA,2019.
[17]CARREIRA J,AGRAWAL P,FRAGKIADAKI K,et al. Human pose estimation with iterative error feedback[C]//IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas,NV,USA,2016.
[18]TOSHEV A,SZEGEDY C. Deeppose:Human pose estimation via deep neural networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Columbus,OH,USA,2014.
[19]CHU X,OUYANG W,LI H,et al. Structured feature learning for pose estimation[C]//IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas,NV,USA,2016.
[20]YANG W,OUYANG W,LI H,et al. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation[C]//IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas,NV,USA,2016.
[21]TOMPSON J,GOROSHIN R,JAIN A,et al. Efficient object localization using convolutional networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Boston,MA,USA,2015.
[22]WOO S,PARK J,LEE J Y,et al. Cbam:Convolutional block attention module[C]//European Conference on Computer Vision. Munich,Germany,2018.
[23]HU J,SHEN L,SUN G. Squeeze-and-excitation networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA,2018.
[24]JADERBERG M,SIMONYAN K,ZISSERMAN A,et al. Spatial transformer networks[J]. Advances in Neural Information Processing Systems,2015,28:2017-2025.
[25]FANG H S,XIE S Q,TAI Y W,et al. Rmpe:Regional multi-person pose estimation[C]//IEEE International Conference on Computer Vision. Venice,Italy,2017.
[26]YUAN Y,FU R,HUANG L,et al. High-resolution transformer for dense prediction[J]. arXiv Preprint arXiv:2110.09408,2021.
[27]MAJI D,NAGORI S,MATHEW M,et al. Yolo-pose:Enhancing yolo for multi person pose estimation using object keypoint similarity loss[C]//IEEE Conference on Computer Vision and Pattern Recognition. New Orleans,LA,USA,2022.
[28]ZHU X,LYU S,WANG X,et al. TPH-YOLOv5:Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//IEEE/CVF International Conference on Computer Vision Workshops. Montreal,QC,Canada,2021.
[29]YUAN Y H,FU R,HUANG L,et al. HRFormer:High-resolution transformer for dense prediction[J]. arXiv Preprint arXiv:2110.09408,2021.
[30]HU J,SHEN L,SUN G. Squeeze-and-excitation networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA,2018.
[31]WANG Q,WU B,ZHU P,et al. ECA-Net:Efficient channel attention for deep convolutional neural networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Seattle,WA,USA,2020.
[32]WANG X L,GIRSHICK R,GUPTA A,et al. Non-local neural networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA,2018.
[33]LOSHCHILOV I,HUTTER F. Decoupled weight decay regularization[J]. arXiv Preprint arXiv:1711.05101,2017.
[34]LIN T Y,MAIRE M,BELONGIE S,et al. Microsoft COCO:Common objects in context[C]//European Conference on Computer Vision. Zurich,Switzerland,2014.
[35]VARGHESE R,SAMBATH M. YOLOv8:A novel object detection algorithm with enhanced performance and robustness[C]//2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems. Chennai,India,2024.