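GSIS Invited Paper | ISPRS President Christian Heipke: Deep Learning Meets Photogrammetry and Remote Sensing
Over the past few years, artificial intelligence based on deep learning, and on convolutional neural networks (CNN) in particular, has been a game changer for almost all tasks related to photogrammetry and remote sensing.
To better understand and analyze the breadth, depth and future development of deep learning in photogrammetry and remote sensing, GSIS invited Professor Christian Heipke, President of the International Society for Photogrammetry and Remote Sensing (ISPRS), to write "Deep learning for geometric and semantic tasks in photogrammetry and remote sensing". The paper summarizes the foundations of deep learning for photogrammetry and remote sensing and illustrates them with ongoing work at Leibniz University Hannover: geometric tasks (detection, description and matching of conjugate points; 3D surface reconstruction), automatic analysis of aerial imagery (land cover and land use classification, transfer learning, and bomb crater detection), and close-range applications (determining the relative pose of vehicles, pedestrian detection and tracking, and standardization of cultural heritage documentation).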
a. Dense matching
b. Land cover and land use classification
c. Bomb crater detection
d. 3D vehicle reconstruction
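At the end of the paper, the authors discuss the outlook for deep learning in photogrammetry and remote sensing. Although deep learning is applied very widely in the field, a CNN (like any deep learning method) is in essence a classifier, and deep learning “cannot learn the unseen”.
It is therefore subject to the same general limitations as any other classifier. Current deep learning approaches are mostly stand-alone solutions; in the long run, only a combination of different methods will lead to success.
Key viewpoints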
“In principle, a CNN can be considered a classifier.
In traditional classifiers (random forests, support vector machines, conditional random fields, maximum likelihood estimation, etc.) features representing the different classes are extracted from the data set in a pre-processing step, and classification is then performed based on these features. It is clear then that the results can only be as good as the selected features.
CNN overcome this problem by learning the features together with the corresponding label for each data sample.”
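The remark that results "can only be as good as the selected features" can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not taken from the paper: the synthetic two-ring data set, the nearest-class-mean classifier (a simple stand-in for the traditional classifiers named above), and the function names. Accuracy is measured on the same samples used to estimate the class means, purely for brevity.

```python
import numpy as np

def make_rings(n_per_class=200, seed=0):
    """Synthetic 2D data (assumed for illustration): inner disc vs. outer ring."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radii = np.concatenate([
        rng.uniform(0.0, 1.0, size=n_per_class),  # class 0: radius in [0, 1)
        rng.uniform(2.0, 3.0, size=n_per_class),  # class 1: radius in [2, 3)
    ])
    points = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
    labels = np.repeat([0, 1], n_per_class)
    return points, labels

def nearest_mean_accuracy(features, labels):
    """Assign each sample to the closest class mean in the given feature space."""
    means = np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])
    distances = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=2)
    return float((distances.argmin(axis=1) == labels).mean())

points, labels = make_rings()

# Hand-picked feature 1: the x coordinate only (a poor choice for this data).
acc_x_only = nearest_mean_accuracy(points[:, :1], labels)

# Hand-picked feature 2: distance from the origin (a good choice for this data).
radius = np.linalg.norm(points, axis=1, keepdims=True)
acc_radius = nearest_mean_accuracy(radius, labels)

print(f"x-coordinate feature: {acc_x_only:.2f}")
print(f"radius feature:       {acc_radius:.2f}")
```

The classifier is identical in both runs; only the feature changes. A CNN avoids this manual choice by estimating the feature representation from the labelled samples as part of training.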
“The strength of CNN is the combined estimation of the feature representation and the labels during classification, and it seems that deeper networks are practically guaranteed to yield better results than shallow networks, as long as enough training data is available.”
“A CNN needs a sufficient number of representative training data, well balanced with respect to the related classes. Otherwise there is a risk of overfitting the classifier to the training data and a bias is likely to be introduced into the results.
To increase the amount of training data, data augmentation, transfer learning, approaches which are able to tolerate a certain amount of incorrect labels (label noise), semi-supervised and unsupervised learning (clustering) can be employed and should be studied.
In some cases, simulation techniques may also help.”
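Two of the ideas in this quotation, data augmentation and class balance, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the paper: it assumes square image tiles with a per-pixel label mask, assumes that flips and 90-degree rotations are label-preserving (often reasonable for nadir aerial imagery), and uses inverse-frequency weighting as one common way to compensate for unbalanced classes. The function names are hypothetical.

```python
import numpy as np

def augment_tile(image, mask):
    """Return the 8 flip/rotation variants of an (H, W, C) tile and its (H, W) mask."""
    variants = []
    for k in range(4):
        rot_img = np.rot90(image, k, axes=(0, 1))
        rot_mask = np.rot90(mask, k, axes=(0, 1))
        variants.append((rot_img, rot_mask))
        # Mirror each rotation left-right; image and mask get the same transform.
        variants.append((np.flip(rot_img, axis=1), np.flip(rot_mask, axis=1)))
    return variants

def inverse_frequency_weights(mask, n_classes):
    """Per-class loss weights proportional to 1 / pixel count (0 for absent classes)."""
    counts = np.bincount(mask.ravel(), minlength=n_classes).astype(float)
    weights = np.zeros(n_classes)
    present = counts > 0
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights

# Toy tile (assumed data): 4 x 4 pixels, 3 channels, class 1 only in the first row.
image = np.arange(48, dtype=float).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=int)
mask[0, :] = 1

variants = augment_tile(image, mask)
weights = inverse_frequency_weights(mask, n_classes=2)
print(len(variants), weights)
```

One labelled tile yields eight training samples, and the rare class receives a larger weight than the frequent one. Neither step adds new information about unseen conditions; both only make better use of the labels that exist.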
“A CNN ‘cannot learn the unseen’, the generalization capabilities are limited to previously seen training data.”
“Incremental learning and forgetting (or ‘unlearning’) data, e.g. those which are not relevant anymore due to a changing environment, is a topic which has received little attention in our field so far, yet this area offers a large potential, in particular for multi-temporal analysis.”
“A number of design decisions need to be taken, e.g. with respect to the network architecture and the design of the loss function. It is not clear in general, how different choices influence the results, and how robust the classifiers are. Some works suggest that CNN can indeed be fooled relatively easily.”
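The remark about classifiers being fooled can be made concrete with a small sketch. It uses the fast gradient sign method, a well-known way of constructing adversarial inputs, applied to a logistic-regression classifier rather than to a CNN; that substitution, the random weights, and the toy 1000-pixel input are all assumptions made to keep the example short and self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, epsilon):
    """One fast-gradient-sign step against a logistic-regression classifier.

    For the cross-entropy loss, the gradient with respect to the input is
    (p - y) * w, where p is the predicted probability of class 1.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + epsilon * np.sign(grad_x)

# Hypothetical "trained" classifier: random weights, centred so a flat image scores 0.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
w = w - w.mean()
b = 0.0

# A toy input with pixel values near 0.5 that the classifier assigns to class 1.
x = 0.5 + 0.01 * np.sign(w)
y = 1.0

x_adv = fgsm_perturb(x, y, w, b, epsilon=0.02)

p_clean = sigmoid(w @ x + b)
p_adv = sigmoid(w @ x_adv + b)
print(f"P(class 1) before: {p_clean:.4f}, after: {p_adv:.4f}")
print(f"largest per-pixel change: {np.abs(x_adv - x).max():.3f}")
```

No pixel changes by more than 0.02, yet the predicted class flips, because many small changes aligned with the weight vector add up in a high-dimensional input. Deep networks are more complex, but the same gradient-based construction applies to them.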
“A CNN is based on correlations of different data sets. We argue that understanding a task to then reason about possible solutions in a way humans do is far beyond the scope of the currently employed methods (note that this does not mean that reasoning is not done, e.g. in a game of chess or Go. It does mean, however, that CNN does not have an intuition for possibly correct solutions and abstract deductive learning).”
“A CNN is largely a black box. While it may deliver very good results, it is largely unknown why and how exactly these results are being reached. Besides being a little frustrating from a scientific point of view, this means that the limitations of these methods cannot clearly be stated, resulting in some doubts whether the methods can be employed in real-world safety- and security-related areas – autonomous driving is a good example.”
“Thus, it seems that a number of difficult research questions still exist in our field.
Besides taking care of a better geometric and semantic accuracy of the results, improving their reliability is of great importance. This will only be possible by investigating better ways to explain why deep learning approaches give the results they do.
Another important aspect is the integration of deep learning approaches with other learning paradigms and prior knowledge, according to the motto, ‘Why learn what we already know?’.
So far, the approaches discussed in this paper are mainly stand-alone solutions. We believe that in the long run, only a combination of different methods will lead to success.”
Figures from the paper
Figure 3. Concept of a standard classifier (top) and a CNN classifier (bottom). The advantage of the latter is that the features and the model parameters are learned simultaneously from the training data.
Figure 4. Architecture of a typical Convolutional Neural Network for image analysis. The figure shows the successive steps of convolution and pooling to generate a feature vector which is classified in the final step, typically using the softmax classifier (the non-linear activation function is not depicted).
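The sequence described in the Figure 4 caption (convolution, pooling, feature vector, softmax) can be written out as a short forward pass. This is a minimal sketch under assumptions: a single-channel 10 x 10 input, one convolution/pooling stage instead of several, random untrained weights, and a ReLU activation inserted after the convolution (the caption notes that the activation is not depicted in the figure). All names and sizes are illustrative.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation, the operation used in CNN layers."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling (odd trailing rows/columns are dropped)."""
    h2, w2 = feature_map.shape[0] // 2, feature_map.shape[1] // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

def softmax(logits):
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()

def forward(image, kernels, weights, bias):
    """Convolution -> ReLU -> pooling per kernel, then flatten and classify."""
    maps = [max_pool2x2(np.maximum(conv2d_valid(image, k), 0.0)) for k in kernels]
    features = np.concatenate([m.ravel() for m in maps])
    return softmax(weights @ features + bias)

rng = np.random.default_rng(0)
image = rng.random((10, 10))              # toy single-channel input
kernels = rng.normal(size=(4, 3, 3))      # four 3 x 3 filters (untrained)
n_features = 4 * 4 * 4                    # 10x10 -> 8x8 after conv -> 4x4 after pooling, x 4 filters
weights = 0.1 * rng.normal(size=(3, n_features))  # three output classes
bias = np.zeros(3)

probs = forward(image, kernels, weights, bias)
print(probs, probs.sum())
```

In a trained network the kernels and the final weights are estimated together from labelled data, which is the simultaneous learning of features and model parameters that Figure 3 contrasts with a standard classifier.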
Figure 5. U-net architecture (an example of an encoder-decoder network with skip connections).
About the authors
Christian Heipke is a professor of photogrammetry and remote sensing at Leibniz University Hannover, where he currently leads a group of about 25 researchers. His professional interests comprise all aspects of photogrammetry, remote sensing, image understanding and their connection to computer vision and GIS. He has authored or coauthored more than 300 scientific papers, more than 70 of which appeared in peer-reviewed international journals. He is the recipient of the 1992 ISPRS Otto von Gruber Award, the 2012 ISPRS Fred Doyle Award, and the 2013 ASPRS Photogrammetric (Fairchild) Award. He is an ordinary member of various learned societies. From 2004 to 2009, he served as vice president of EuroSDR. From 2011 to 2014 he was chair of the German Geodetic Commission (DGK), and from 2012 to 2016 he was ISPRS Secretary General. Since July 2016 he has served as ISPRS President.
Franz Rottensteiner is an Associate Professor at Leibniz University Hannover (LUH) and leader of the research group “Photogrammetric Image Analysis” at the Institute of Photogrammetry and GeoInformation (IPI). He received the Dipl.-Ing. degree in surveying and the Ph.D. degree and venia docendi in photogrammetry, all from Vienna University of Technology (TUW), Vienna, Austria. His research interests include all aspects of image orientation, image classification, automated object detection and reconstruction from images and point clouds, and change detection from remote sensing data. Before joining LUH in 2008, he worked at TUW and at the Universities of New South Wales and Melbourne, both in Australia. He has authored or coauthored more than 150 scientific papers, 36 of which have appeared in peer-reviewed international journals. He received the Karl Rinner Award of the Austrian Geodetic Commission in 2004 and the Carl Pulfrich Award for Photogrammetry, sponsored by Leica Geosystems, in 2017. Since 2011, he has been the Associate Editor of the ISI-listed journal “Photogrammetrie Fernerkundung Geoinformation”. As Chairman of the ISPRS Working Group II/4, he initiated and conducted the ISPRS benchmark on urban object detection and 3D building reconstruction.
Deep-learning-related publications from the Institute of Photogrammetry and GeoInformation, Leibniz University Hannover
· Albert L., Rottensteiner F., and Heipke C. 2017. “A Higher Order Conditional Random Field Model for Simultaneous Classification of Land Cover and Land Use.” ISPRS Journal of Photogrammetry and Remote Sensing 130 (2017): 63–80.
· Blott G., Yu J., and Heipke C. 2019. “Multi-View Person Re-Identification in a Fisheye Camera Network with Different Viewing Directions.” PFG. doi:10.1007/s41064-019-00083-y.
· Blott G., Takami M., and Heipke C. 2018. “Semantic Segmentation of Fisheye Images.” Computer Vision – ECCV 2018 Workshops Part I – 6th Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving, Springer LNCS 11129, Cham, 181–196.
· Chen L., Rottensteiner F., and Heipke C. 2016. “Invariant Descriptor Learning Using a Siamese Convolutional Neural Network.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences III-3, Prague, Czech Republic, July 12–19.
· Clermont D., Kruse C., Rottensteiner F., and Heipke C. 2019. “Supervised Detection of Bomb Craters in Historical Aerial Images Using Convolutional Neural Networks.” International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W16: 67–74. doi:10.5194/isprs-archives-XLII-2-W16-67-2019.
· Coenen M., Rottensteiner F., and Heipke C. 2019. “Precise Vehicle Reconstruction for Autonomous Driving Applications.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W5: 21–28. doi:10.5194/isprs-annals-IV-2-W5-21-2019.
· Dorozynski M., Clermont D., and Rottensteiner F. 2019. “Multi-task Deep Learning with Incomplete Training Samples for the Image-based Prediction of Variables Describing Silk Fabrics.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W6: 47–54.
· i.c.sens. 2019. Accessed 20 November 2019. https://www.icsens.uni-hannover.de/start.html?&L=1
· Kang J., Chen L., Deng F., and Heipke C. 2019. “Context Pyramidal Network for Stereo Matching Regularized by Disparity Gradients.” ISPRS Journal of Photogrammetry and Remote Sensing 157 (2019): 201–215.
· Mehltretter M., and Heipke C. 2019. “CNN-based Cost Volume Analysis as Confidence Measure for Dense Matching.” ICCV Workshop on 3D Reconstruction in the Wild (3DRW2019). http://openaccess.thecvf.com/content_ICCVW_2019/papers/3DRW/Mehltretter_CNN-Based_Cost_Volume_Analysis_as_Confidence_Measure_for_Dense_Matching_ICCVW_2019_paper.pdf
· Nguyen U., Rottensteiner F., and Heipke C. 2019. “Confidence-aware Pedestrian Tracking Using a Stereo Camera.” ISPRS Annals IV-2/W5, Enschede, The Netherlands, June 10–14, 53–60.
· SILKNOW. 2019. Accessed 20 November 2019. http://silknow.eu/
· Wittich D., and Rottensteiner F. 2019. “Adversarial Domain Adaptation for the Classification of Aerial Images and Height Data Using Convolutional Neural Networks.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W7: 197–204.
· Yang C., Rottensteiner F., and Heipke C. 2018. “Classification of Land Cover and Land Use Based on Convolutional Neural Networks.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences IV-3: 251–258. doi:10.5194/isprs-annals-IV-3-251-2018.
· Yang C., Rottensteiner F., and Heipke C. 2019. “Classification of Land Cover and Land Use Based on Convolutional Neural Networks.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences III-3: 251–258.
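About Geo-spatial Information Science
Geo-spatial Information Science (GSIS) is an English-language journal of surveying, mapping and remote sensing hosted by Wuhan University. Its editor-in-chief is Professor Li Deren, a member of both the Chinese Academy of Sciences and the Chinese Academy of Engineering. The journal was indexed in SCIE in September 2020.
GSIS is published under an open access (OA) model: once an article is published, readers worldwide can download the full text immediately and free of charge, which gives your work greater exposure.
At present, publishing in GSIS is completely free, with no review fees, article processing charges, or any other fees. Researchers in surveying, mapping and remote sensing are welcome to submit. If you have a high-quality manuscript for which priority of publication matters, contact us at gsis@whu.edu.cn; the editor-in-chief or the international associate editor will handle it personally, and the editorial office will answer questions and provide status updates at any time.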
Journal website:
https://www.tandfonline.com/tgsi
Submission site:
https://rp.tandfonline.com/submission/create?journalCode=TGSI