参考文献/References:
[1]罗建豪,吴建鑫. 基于深度卷积特征的细粒度图像分类研究综述[J]. 自动化学报,2017,43(8):1306-1318.
[2]ZHAO B,FENG J S,WU X,et al. A survey on deep learning-based fine-grained object classification and semantic segmentation[J]. International Journal of Automation and Computing,2017,14(2):119-135.
[3]WEI X S,WU J X,CUI Q. Deep learning for fine-grained image analysis:a survey[J/OL]. arXiv Preprint arXiv:1907.03069v1,2019.
[4]LIN T Y,ROYCHOWDHURY A,MAJI S. Bilinear CNN models for fine-grained visual recognition[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago,Chile:IEEE,2015.
[5]LIN T Y,MAJI S. Improved bilinear pooling with CNNs[J/OL]. arXiv Preprint arXiv:1707.06772,2017.
[6]WANG Y M,MORARIU V I,DAVIS L S. Learning a discriminative filter bank within a CNN for fine-grained recognition[C]//Proceedings of the 2018 IEEE/CVF conference on Computer Vision and Pattern Recognition. Salt Lake City,USA:IEEE,2018:4148-4157.
[7]YANG Z,LUO T G,WANG D,et al. Learning to navigate for fine-grained classification[C]//Proceedings of the 15th European Conference on Computer Vision. Munich,Germany:ECCV,2018.
[8]LIN T Y,DOLLAR P,GIRSHICK R,et al. Feature pyramid networks for object detection[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Honolulu,USA:IEEE,2017.
[9]VASWANI A,SHAZEER N,PARMAR N,et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,USA:NIPS,2017.
[10]XIAO T J,XU Y C,YANG K Y,et al. The application of two-level attention models in deep convolutional neural network for fine-grained image classification[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Boston,USA:IEEE,2015.
[11]DOSOVITSKIY A,BEYER L,KOLESNIKOV A,et al. An image is worth 16×16 words:Transformers for image recognition at scale[J/OL]. arXiv Preprint arXiv:2010.11929,2021.
[12]李佳盈,蒋文婷,杨林,等. 基于ViT的细粒度图像分类[J]. 计算机工程与设计,2023,44(3):916-921.
[13]LIU Z,LIN Y T,CAO Y,et al. Swin transformer:Hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision(ICCV). Montreal,Canada:IEEE,2021.
[14]XU Y F,WEI H P,LIN M X,et al. Transformers in computational visual media:a survey[J]. Computational Visual Media,2022,8(1):33-62.
[15]CARION N,MASSA F,SYNNAEVE G,et al. End-to-end object detection with transformers[C]//Proceedings of the 16th European Conference on Computer Vision. Glasgow,UK:Springer,2020.
[16]MEINHARDT T,KIRILLOV A,LEAL-TAIXE L,et al. Trackformer:multi-object tracking with transformers[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). New Orleans,USA:IEEE,2022.
[17]YANG H H,FU Y W. Wavelet U-Net and the chromatic adaptation transform for single image dehazing[C]//Proceedings of the 2019 IEEE International Conference on Image Processing(ICIP). Taipei,China:IEEE,2019.
[18]MEI J B,WANG M M,LIN Y N,et al. TransVOS:video object segmentation with transformers[J/OL].(2021-06-01). arXiv Preprint arXiv:2106.00588,2021.
[19]HE K M,ZHANG X Y,REN S Q,et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas,USA:IEEE,2016.
[20]WAH C,BRANSON S,WELINDER P,et al. The caltech-UCSD birds-200-2011 dataset[R]. Pasadena,USA:California Institute of Technology,2011.
[21]MAJI S,RAHTU E,KANNALA J,et al. Fine-grained visual classification of aircraft[J/OL].(2013-06-21). arXiv Preprint arXiv:1306.5151,2013.
[22]KRAUSE J,STARK M,DENG J,et al. 3D object representations for fine-grained categorization[C]//Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops. Sydney,Australia:IEEE,2013.
[23]LIN T Y,ROYCHOWDHURY A,MAJI S. Bilinear CNN models for fine-grained visual recognition[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision(ICCV). Santiago,Chile:IEEE,2015.
[24]LI Z C,YANG Y,LIU X,et al. Dynamic computational time for visual attention[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops(ICCVW). Venice,Italy:IEEE,2017.
[25]ZHENG H L,FU J L,MEI T,et al. Learning multi-attention convolutional neural network for fine-grained image recognition[C]//Proceedings of 2017 IEEE International Conference on Computer Vision(ICCV). Venice,Italy:IEEE,2017.
[26]MOGHIMI M,BELONGIE S,SABERIAN M,et al. Boosted convolutional neural networks[C]//Proceedings of the 2016 British Machine Vision Conference(BMVC). York,UK:BMVA,2016.
[27]YANG Z,LUO T G,WANG D,et al. Learning to navigate for fine-grained classifycation[C]//Proceedings of the 15th European Conference on Computer Vision. Munich,Germany:ECCV,2018.
[28]HU T,QI H G,HUANG Q M,et al. See better before looking closer:weakly supervised data augmentation network for fine-grained visual classification[J/OL].(2019-01-26). arXiv Preprint arXiv:1901.09891,2019.