
[IJAC Special Topic] AI & Image Processing

2017-09-19, IJAC Editorial Office, IJAC

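From September 17 to 20, the 2017 IEEE International Conference on Image Processing (ICIP 2017) was held at the China National Convention Center. As one of the major events in the image processing community, ICIP drew experts and scholars from around the world. Wherever the leading researchers go, IJAC goes too! Without further ado, here is a round of fresh photos from the venue: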

The China National Convention Center under blue skies and white clouds


A packed main hall


At the Springer booth: can you spot IJAC?

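Image processing is also one of IJAC's key topics. Over the past two years, many researchers have published excellent results in IJAC. This issue highlights the following 10 papers. For the detailed table of contents and free full-text access, click "Read the original" at the end of the article: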


1. Academician Tomaso Poggio, MIT, USA

Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review

Why and when deep networks, but not shallow ones, can avoid the curse of dimensionality: a review

Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao

Open Access:

https://link.springer.com/article/10.1007/s11633-017-1054-2


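Why we recommend it
From Prof. Tomaso Poggio of MIT. To date, the paper has been downloaded more than 2.4K times and its Altmetric score has reached 60! Deep learning architectures and machine learning models draw on advances in neuroscience; in other words, similar architectures exist in the cerebral cortex. Thousands of researchers are working on deep learning in different fields, such as autonomous driving and speech recognition. Yet we still do not know why deep learning works in these engineering applications, or what its underlying mechanism is. Answering this question would help us understand why the cerebral cortex is organized into different layers. In this paper, Prof. Poggio walks through the key theory, latest results, and open research questions of deep learning.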


[Abstract]

The paper reviews and extends an emerging body of theoretical results on deep learning including the conditions under which it can be exponentially better than shallow learning. A class of deep convolutional networks represent an important special case of these conditions, though weight sharing is not the main reason for their exponential advantage. Implications of a few key theorems are discussed, together with new results, open problems and conjectures.

[Keywords]

Machine learning, neural networks, deep and shallow networks, convolutional neural networks, function approximation, deep learning.

[Previously featured]

[IJAC Hot Paper] MIT Prof. Tomaso Poggio on the mechanisms of deep learning

http://mp.weixin.qq.com/s/AwmQyhREjpIew0beIuj6yA


2. Prof. Tie-Jun Huang, Peking University

Imitating the Brain with Neurocomputer - a New Way towards Artificial General Intelligence

The "imitationalism" of strong AI and the "five principles" of the neurocomputer

Tie-Jun Huang

Open Access:

https://link.springer.com/article/10.1007/s11633-017-1082-y


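Why we recommend it

From Prof. Tie-Jun Huang, head of the Department of Computer Science at Peking University. Historically, there have been roughly four approaches to realizing some form of artificial intelligence: symbolism, connectionism, behaviorism, and statistics. Each captures certain characteristics of intelligence from a different angle. Prof. Huang proposes imitationalism as a fifth school of AI methodology. This review became one of the highlights among IJAC's online-first papers and was retweeted several times by researchers abroad on Twitter. The paper explains how to break the deadlock on the road to artificial general intelligence and explores new ideas for building brain-like neurocomputers. It also describes in detail the three key technical levels of brain-like neurocomputers and the research progress in China and abroad.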


[Abstract]

To achieve the artificial general intelligence (AGI), imitate the intelligence? or imitate the brain? This is the question! Most artificial intelligence (AI) approaches set the understanding of the intelligence principle as their premise. This may be correct to implement specific intelligence such as computing, symbolic logic, or what the AlphaGo could do. However, this is not correct for AGI, because to understand the principle of the brain intelligence is one of the most difficult challenges for our human beings. It is not wise to set such a question as the premise of the AGI mission. To achieve AGI, a practical approach is to build the so-called neurocomputer, which could be trained to produce autonomous intelligence and AGI. A neurocomputer imitates the biological neural network with neuromorphic devices which emulate the bio-neurons, synapses and other essential neural components. The neurocomputer could perceive the environment via sensors and interact with other entities via a physical body. The philosophy under the “new” approach, so-called as imitationalism in this paper, is the engineering methodology which has been practiced for thousands of years, and for many cases, such as the invention of the first airplane, succeeded. This paper compares the neurocomputer with the conventional computer. The major progress about neurocomputer is also reviewed.

[Keywords]

Artificial general intelligence (AGI), neuromorphic computing, neurocomputer, brain-like intelligence, imitationalism.

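[Previously featured]

[IJAC Hot Paper] Tie-Jun Huang, Peking University: on the road to artificial general intelligence, should we "understand intelligence" first or "build intelligence" first?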

http://mp.weixin.qq.com/s/bM_9_6AzFl2QlQ-KQPzXjA


3. Team of Shuicheng Yan, Chief Scientist at 360

A Survey on Deep Learning-based Fine-grained Object Classification and Semantic Segmentation

Fine-grained image classification and semantic segmentation methods based on deep learning

Bo Zhao, Jiashi Feng, Xiao Wu, Shuicheng Yan


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-017-1053-3

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1901.shtml


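Why we recommend it

From the team of Shuicheng Yan, Chief Scientist of 360 and Dean of its Artificial Intelligence Institute. The paper reviews four types of deep learning-based fine-grained image classification methods, as well as deep learning-based semantic segmentation methods. How can a machine learn to "recognize" all kinds of birds? How can a machine learn to describe what it sees in a picture? The answers are in the paper.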


[Abstract]

The deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances of deep learning techniques bring encouraging performance to fine-grained image classification which aims to distinguish subordinate-level categories, such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches, including the general convolutional neural networks (CNNs), part detection based, ensemble of networks based and visual attention based fine-grained image classification approaches. Besides, the deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced respectively.

[Keywords]

Deep learning, fine-grained image classification, semantic segmentation, convolutional neural network (CNN), recurrent neural network (RNN).

[Previously featured]

[IJAC Feature] Shuicheng Yan's team explains deep learning, the ultimate weapon of "high-IQ" robots

https://mp.weixin.qq.com/s/KRBTTycNve3GY8T9AkvoSA


4. Research Professor Qing-Yi Gu, Institute of Automation, Chinese Academy of Sciences

Review of Some Advances and Applications in Real-time High-speed Vision: Our Views and Experiences

Advances and applications in real-time high-speed vision: our views and experiences

Qing-Yi Gu, Idaku Ishii


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-016-1024-0

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1808.shtml


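Why we recommend it

Vision research has become one of the hottest topics in academia. However, the processing speed of the cameras used in most vision systems is still limited by video signal formats (e.g., NTSC 30 fps, PAL 25 fps), so ordinary vision systems cannot be used to analyze high-speed motion and high-speed physical phenomena. Meanwhile, many high-speed phenomena that the human visual system cannot perceive are waiting to be unveiled. Demand for real-time high-speed vision sensors is currently strong in many application fields, such as factory automation, biomedicine, and robotics. The paper reviews the development and applications of real-time high-speed vision systems in various fields, including intelligent logging systems, vibration dynamics sensing, vision-based mechanical control, three-dimensional measurement/automated visual inspection, vision-based human interfaces, and biomedical applications.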


[Abstract]

The frame rate of conventional vision systems is restricted to the video signal formats (e.g., NTSC 30 fps and PAL 25 fps) that are designed on the basis of the characteristics of the human eye, which implies that the processing speed of these systems is limited to the recognition speed of the human eye. However, there is a strong demand for real-time high-speed vision sensors in many application fields, such as factory automation, biomedicine, and robotics, where high-speed operations are carried out. These high-speed operations can be tracked and inspected by using high-speed vision systems with intelligent sensors that work at hundreds of Hertz or more, especially when the operation is difficult to observe with the human eye. This paper reviews advances in developing real-time high-speed vision systems and their applications in various fields, such as intelligent logging systems, vibration dynamics sensing, vision-based mechanical control, three-dimensional measurement/automated visual inspection, vision-based human interface, and biomedical applications.

[Keywords]

Real-time high-speed vision, target tracking, abnormal behavior detection, behavior mining, vibration analysis, 3D shape measurement, cell sorting.


5. Yi-Ke Guo's team, Imperial College London, UK

Inferring Functional Connectivity in fMRI Using Minimum Partial Correlation

Inferring brain functional connectivity in fMRI using minimum partial correlation

Lei Nie, Xian Yang, Paul M. Matthews, Zhi-Wei Xu, Yi-Ke Guo


1) SpringerLink: 

https://link.springer.com/article/10.1007/s11633-017-1084-9

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1895.shtml


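Why we recommend it

Everyone carries a unique physiological identity card, and fingerprint recognition is currently the most widely used example. Likewise, the brain, the hub of all human intelligence, holds a one-of-a-kind map of neural connections. After fingerprints, could brain scans also serve as a means of identifying people? How can we better construct maps of brain functional connectivity? This study uses minimum partial correlation to construct the functional connectivity map in functional magnetic resonance imaging (fMRI), with no parameters required. The minimum partial correlation between two regions is the minimum of the absolute values of the partial correlations obtained by controlling for every possible subset of the other regions. Simulations show that the proposed method outperforms others in most cases, and its application is illustrated on a resting-state fMRI dataset from the Human Connectome Project.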


[Abstract]

Functional connectivity has emerged as a promising approach to study the functional organisation of the brain and to define features for prediction of brain state. The most widely used method for inferring functional connectivity is Pearson's correlation, but it cannot differentiate direct and indirect effects. This disadvantage is often avoided by computing the partial correlation between two regions controlling all other regions, but this method suffers from Berkson's paradox. Some advanced methods, such as regularised inverse covariance, have been applied. However, these methods usually depend on some parameters. Here we propose use of minimum partial correlation as a parameter-free measure for the skeleton of functional connectivity in functional magnetic resonance imaging (fMRI). The minimum partial correlation between two regions is the minimum of absolute values of partial correlations by controlling all possible subsets of other regions. Theoretically, there is a direct effect between two regions if and only if their minimum partial correlation is non-zero under faithfulness and Gaussian assumptions. The elastic PC-algorithm is designed to efficiently approximate minimum partial correlation within a computational time budget. The simulation study shows that the proposed method outperforms others in most cases and its application is illustrated using a resting-state fMRI dataset from the human connectome project.

[Keywords]

Functional connectivity, functional magnetic resonance imaging (fMRI), network modelling, partial correlation, PC-algorithm, resting-state networks.
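
The definition in the abstract above (the minimum, over all subsets of the other regions, of the absolute partial correlation) can be sketched in a few lines of Python. This is only a brute-force illustration, not the elastic PC-algorithm from the paper: it enumerates every conditioning subset, so it is practical only for a handful of regions. The function names `partial_corr` and `min_partial_corr` and the `(samples, regions)` data layout are assumptions made for this sketch.

```python
from itertools import combinations

import numpy as np


def partial_corr(cov, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j, *cond]
    # Invert the covariance restricted to i, j and the conditioning set.
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def min_partial_corr(data, i, j):
    """Brute-force minimum partial correlation between columns i and j.

    data has shape (samples, regions). Every subset of the remaining
    regions is tried as a conditioning set (exponential cost).
    """
    cov = np.cov(data, rowvar=False)
    others = [k for k in range(data.shape[1]) if k not in (i, j)]
    best = np.inf
    for size in range(len(others) + 1):
        for cond in combinations(others, size):
            best = min(best, abs(partial_corr(cov, i, j, cond)))
    return best
```

On a simulated chain x -> y -> z, the value for the pair (x, z) should be close to zero, because conditioning on y removes the indirect effect, while the value for (x, y) stays clearly above zero.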

[Previously featured]

[IJAC Feature] Yi-Ke Guo, Imperial College: drawing "neural fingerprints" with minimum partial correlation

http://mp.weixin.qq.com/s/b9_VgwyVFOqYyozaoogvJQ


6. Research Professor Min Tan, State Key Laboratory

PLS-CCA Heterogeneous Features Fusion-based Low-resolution Human Detection Method for Outdoor Video Surveillance

A low-resolution human detection method for outdoor video surveillance based on PLS-CCA heterogeneous feature fusion

Hong-Kai Chen, Xiao-Guang Zhao, Shi-Ying Sun, Min Tan


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-016-1029-8

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1902.shtml


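Why we recommend it

From the State Key Laboratory at the Institute of Automation, Chinese Academy of Sciences. The paper focuses on low-resolution human detection for outdoor video surveillance and proposes a method based on partial least squares-canonical correlation analysis (PLS-CCA) fusion of heterogeneous features. The method explores the relation between two individual heterogeneous features and robustly describes the visual appearance of humans with complementary information. Experiments show that, compared with some other methods, it achieves high accuracy, precision, recall, and area under curve (AUC) at the same time.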


[Abstract]

In this paper, we focus on low-resolution human detection and propose a partial least squares-canonical correlation analysis (PLS-CCA) for outdoor video surveillance. The analysis relies on heterogeneous features fusion-based human detection method. The proposed method can not only explore the relation between two individual heterogeneous features as much as possible, but also can robustly describe the visual appearance of humans with complementary information. Compared with some other methods, the experimental results show that the proposed method is effective and has a high accuracy, precision, recall rate and area under curve (AUC) value at the same time, and offers a discriminative and stable recognition performance.

[Keywords]

Low-resolution human detection, partial least squares, canonical correlation analysis, heterogeneous features, outdoor video surveillance.

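[Previously featured]

"State Key Laboratory Chronicles" (《国重记》) | Produced by: Institute of Automation, Chinese Academy of Sciences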

http://mp.weixin.qq.com/s/IhPcqo65SDLIBnbRZ0pDvQ


7. Tokyo Institute of Technology, Japan

Real-time 3D Microtubule Gliding Simulation Accelerated by GPU Computing

GPU-accelerated real-time simulation of 3D microtubule gliding

Gregory Gutmann, Daisuke Inoue, Akira Kakugo, Akihiko Konagaya


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-015-0947-1 

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1748.shtml


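Why we recommend it

A microtubule gliding assay is a biological experiment that observes the dynamics of microtubules driven by motor proteins on a glass surface. When the interactions are set up appropriately, microtubules often organize into higher-level dynamics such as ring and bundle structures. To reproduce these higher-level dynamics on a computer, the paper focuses on real-time 3D microtubule simulation. Adjusting simulation parameters in real time yields more knowledge about microtubule dynamics and swarm movement. Balancing 3D rendering against computing performance is a major technical challenge. GPU programming can effectively balance the millions of tasks involved and makes real-time 3D simulation possible. In addition, GPGPU programming allows the simulation to run in a massively parallel fashion. Finally, based on an analysis of the simulation, the paper builds a performance model.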


[Abstract]

A microtubule gliding assay is a biological experiment observing the dynamics of microtubules driven by motor proteins fixed on a glass surface. When appropriate microtubule interactions are set up on gliding assay experiments, microtubules often organize and create higher-level dynamics such as ring and bundle structures. In order to reproduce such higher-level dynamics on computers, we have been focusing on making a real-time 3D microtubule simulation. This real-time 3D microtubule simulation enables us to gain more knowledge on microtubule dynamics and their swarm movements by means of adjusting simulation parameters in a real-time fashion. One of the technical challenges when creating a real-time 3D simulation is balancing the 3D rendering and the computing performance. Graphics processor unit (GPU) programming plays an essential role in balancing the millions of tasks, and makes this real-time 3D simulation possible. By the use of general-purpose computing on graphics processing units (GPGPU) programming we are able to run the simulation in a massively parallel fashion, even when dealing with more complex interactions between microtubules such as overriding and snuggling. Due to performance being an important factor, a performance model has also been constructed from the analysis of the microtubule simulation and it is consistent with the performance measurements on different GPGPU architectures with regards to the number of cores and clock cycles.

[Keywords]

Microtubule gliding assay, 3D computer graphics and simulation, parallel computing, performance analysis, general-purpose computing on graphics processing units (GPGPU), compute unified device architecture (CUDA), DirectX.


8. Face recognition

Robust Face Recognition Against Expressions and Partial Occlusions

Face recognition under partial occlusion or expression changes

Fadhlan Kamaru Zaman, Amir Akramin Shafie, Yasir Mohd Mustafah


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-016-0974-6

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1650.shtml


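Why we recommend it

Overall face recognition performance degrades when facial expressions vary or the face is partially occluded. To address this, the contribution of the affected facial features to the final classification should be determined. The paper proposes a feature selection process that describes facial features as local independent component analysis (ICA) features, which are acquired using a locally lateral subspace (LLS) strategy. Then, through linear discriminant analysis (LDA), the paper investigates the intraclass and interclass representation of each local ICA feature. Tested on datasets of images with different facial expressions or partial occlusions, the proposed method achieves an overall accuracy of 90.7%.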


[Abstract]

Facial features under variant-expressions and partial occlusions could have degrading effect on overall face recognition performance. As a solution, we suggest that the contribution of these features on final classification should be determined. In order to represent facial features' contribution according to their variations, we propose a feature selection process that describes facial features as local independent component analysis (ICA) features. These local features are acquired using locally lateral subspace (LLS) strategy. Then, through linear discriminant analysis (LDA) we investigate the intraclass and interclass representation of each local ICA feature and express each feature's contribution via a weighting process. Using these weights, we define the contribution of each feature at local classifier level. In order to recognize faces under single sample constraint, we implement LLS strategy on locally linear embedding (LLE) along with the proposed feature selection. Additionally, we highlight the efficiency of the implementation of LLS strategy. The overall accuracy achieved by our approach on datasets with different facial expressions and partial occlusions such as AR, JAFFE, FERET and CK+ is 90.70%. We present together in this paper survey results on face recognition performance and physiological feature selection performed by human subjects.

[Keywords]

Face recognition, facial expressions, dimensionality reduction, single sample, feature selection. 


9. Huo-Sheng Hu's team, University of Essex, UK

Real-time Object Subspace Searching Based on Discrete Searching Paths and Local Energy

Real-time object subspace searching based on discrete searching paths and local energy

Wen-Ju Zhou, Zi-Xiang Fei, Huo-Sheng Hu, Li Liu, Jing-Na Li, Peter James Smith


1) SpringerLink: 

https://link.springer.com/article/10.1007/s11633-015-0946-2

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1746.shtml


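Why we recommend it

The paper proposes a novel strategy: adopting discrete radial search paths instead of searching every point in the image, which can reduce search time. To reduce the influence of the industrial environment, it also proposes a second method, local energy level set segmentation, which locates the object subspace more accurately and efficiently. Using the detection of crown caps as an example, the paper compares the detection results and computing time of several detection methods and analyzes the inspection mechanisms.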


[Abstract]

In automatic visual inspection, the object image subspace should be segmented and matched quickly so that the affine relationship can be built between the template image and the sample image. When the interference is strong and the illumination is uneven, for example in an industrial application, this can make it difficult to obtain an object's subspace quickly and accurately in real-time. In this paper, a novel strategy is proposed to adopt discrete radial search paths instead of searching all points in an image. Therefore, the searching time can be substantially reduced. In order to reduce the influence coming from the industrial environment, the paper proposes another method that is local energy level set segmentation, which can locate the object subspace more efficiently and accurately. The detection of "crown caps" is presented as an example in this paper. Detection effects and computing time are compared between several detection methods, and the mechanisms of inspection have also been analyzed.

[Keywords]

Real-time, object subspace, discrete paths, fast match, level set, local energy function.
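
To make the idea of discrete radial search paths concrete, here is a minimal Python sketch. It is an illustration only, not the paper's algorithm: the function name `radial_edge_points`, the fixed starting point, the number of paths, and the simple intensity threshold used as an edge criterion are all assumptions. The point it shows is that walking outward along a few rays visits far fewer pixels than scanning the whole image.

```python
import numpy as np


def radial_edge_points(image, center, n_paths=16, threshold=0.5):
    """Walk outward along n_paths evenly spaced rays from center.

    Returns the first pixel on each ray whose value drops below
    threshold (a hypothetical edge criterion for a bright object).
    """
    h, w = image.shape
    cy, cx = center
    max_r = int(np.hypot(h, w))
    hits = []
    for angle in np.linspace(0.0, 2.0 * np.pi, n_paths, endpoint=False):
        for r in range(max_r):
            y = int(round(cy + r * np.sin(angle)))
            x = int(round(cx + r * np.cos(angle)))
            if not (0 <= y < h and 0 <= x < w):
                break  # the ray left the image without finding an edge
            if image[y, x] < threshold:
                hits.append((y, x))
                break
    return hits
```

For a bright disk on a dark background, the returned points lie just outside the disk boundary, and they could then be passed to a circle or ellipse fit to locate the object.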


10. Medical imaging

Color Medical Image Enhancement Based on Adaptive Equalization of Intensity Numbers Matrix Histogram

Color medical image enhancement based on adaptive equalization of the intensity numbers matrix histogram

Ju-Ping Gu, Liang Hua, Xiao Wu, Hui Yang, Zhen-Tao Zhou


1) SpringerLink:

https://link.springer.com/article/10.1007/s11633-014-0871-9

2) IJAC website:

http://www.ijac.net/EN/abstract/abstract1785.shtml


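Why we recommend it

Enhancement techniques for color medical images help improve the resolution and accuracy of the original image. The paper proposes a new enhancement method that combines the Young-Helmholtz (Y-H) transformation with adaptive equalization of the intensity numbers matrix histogram. Experimental results show that the method achieves good enhancement with low computational complexity, providing a foundation for medical diagnosis and further processing of medical images.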


[Abstract]

The enhancement technique for color medical images is conducive to improving the resolution and accuracy of the original image. A new enhancement method combining the Young-Helmholtz (Y-H) transformation with the adaptive equalization of intensity numbers matrix histogram is proposed in this paper. The adaptive histogram equalization method is applied to strengthen the details, enhance the contrast, and suppress the noise of the original image effectively. The enhanced image can be displayed in the red-green-blue (RGB) color space through inverse Y-H transformation with the same hue and saturation. The experiment results demonstrate that the method has the enhancement effect with low computational complexity, which provides the foundation for the medical diagnosis and further processing of medical images.

[Keywords]

Young-Helmholtz (Y-H) color space, adaptive histogram equalization, medical image enhancement, image processing, color space transformation.
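
For readers unfamiliar with histogram equalization, the following Python sketch shows the basic, global version applied to a single 8-bit intensity channel. It is a simplified stand-in: it does not implement the paper's adaptive equalization of the intensity numbers matrix histogram or the Y-H transformation, and the function name `equalize_intensity` is an assumption for illustration.

```python
import numpy as np


def equalize_intensity(channel):
    """Global histogram equalization of a uint8 intensity channel."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest level present
    total = channel.size
    if total == cdf_min:
        return channel.copy()  # constant image: nothing to stretch
    # Map each gray level through the normalized cumulative histogram.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[channel]
```

In a color pipeline of the kind the abstract describes, a step like this would be applied to the intensity component only, so that hue and saturation are left unchanged when converting back to RGB.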


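The summaries in this article were compiled and translated by the IJAC editors. If you find any inaccuracies in the content or translation, please leave us a message. Click "Read the original" at the end of the article to access the detailed table of contents and read the full texts for free.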


For more, follow us at:

1) IJAC official websites:

http://link.springer.com/journal/11633

http://www.ijac.net

2) Linkedin: Int. J. of Automation and Computing

3) Sina Weibo: IJAC-国际自动化与计算杂志

4) Twitter: IJAC_Journal

5) Facebook: ijac journal

Edited by: Ou Licheng (欧梨成)
