[2] EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS). Cambridge, MA: MIT Press, 2014: 2366-2374.
[3] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[4] WU K W, ZHANG S R, XIE Z. Monocular depth prediction with residual DenseASPP network[J]. IEEE Access, 2020, 8: 129899-129910.
[5] WU J W, ZHOU W J, LUO T, et al. Multiscale multilevel context and multimodal fusion for RGB-D salient object detection[J]. Signal Processing, 2021, 178: 107766.
[6] QI F, LIN C H, SHI G M, et al. A convolutional encoder-decoder network with skip connections for saliency prediction[J]. IEEE Access, 2019, 7: 60428-60438.
[7] ZHAO S Y, ZHANG L, SHEN Y, et al. Super-resolution for monocular depth estimation with multi-scale sub-pixel convolutions and a smoothness constraint[J]. IEEE Access, 2019, 7: 16323-16335.
[8] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]//Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2015: 234-241.
[9] QIU C C, ZHANG S Y, WANG C, et al. Improving transfer learning and squeeze-and-excitation networks for small-scale fine-grained fish image classification[J]. IEEE Access, 2018, 6: 78503-78512.
[10] LI L F, FANG Y, WU J, et al. Encoder-decoder full residual deep networks for robust regression and spatiotemporal estimation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(9): 4217-4230.
[11] PARK J S, JEONG Y, JOO K, et al. Adaptive cost volume fusion network for multi-modal depth estimation in changing environments[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 5095-5102.
[12] LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[OL]. (2019-01-04)[2023-06-29]. https://arxiv.org/abs/1711.05101.
[13] LOZA A, MIHAYLOVA L, BULL D, et al. Structural similarity-based object tracking in multimodality surveillance videos[J]. Machine Vision and Applications, 2009, 20(2): 71-83.
[14] LIU F Y, SHEN C H, LIN G S, et al. Learning depth from single monocular images using deep convolutional neural fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(10): 2024-2039.
[15] CAO Y Z H, WU Z F, SHEN C H. Estimating depth from monocular images as classification using deep fully convolutional residual networks[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(11): 3174-3182.
[16] XU X F, CHEN Z, YIN F L. Monocular depth estimation with multi-scale feature fusion[J]. IEEE Signal Processing Letters, 2021, 28: 678-682.
[17] KIM D, LEE S, LEE J, et al. Leveraging contextual information for monocular depth estimation[J]. IEEE Access, 2020, 8: 147808-147817.
[18] FU H, GONG M M, WANG C H, et al. Deep ordinal regression network for monocular depth estimation[C]//Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 2002-2011.
[19] YUAN W H, GU X D, DAI Z Z, et al. NeW CRFs: neural window fully-connected CRFs for monocular depth estimation[C]//Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE, 2022: 3906-3915.
[20] LEE J H, HAN M K, KO D W, et al. From big to small: multi-scale local planar guidance for monocular depth estimation[OL]. (2021-09-23)[2023-06-29]. https://arxiv.org/abs/1907.10326v5.