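[Zhuanzhi Collection 25] OCR Text Recognition: Complete Resource Collection (Getting Started / Advanced / Papers / Surveys / Code / Experts, with viewing links)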
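[Introduction] Topic collections are one of Zhuanzhi's core features, offering systematic learning resources for the AI field. Each collection gathers the best (Awesome) material on a topic from across the web so that AI practitioners can study it conveniently and solve problems at work. Built on Zhuanzhi's AI topic knowledge tree, the collections are compiled by professional human editors with the help of algorithmic tools, and are updated continuously. The full collection can also be viewed on desktop or mobile at www.zhuanzhi.ai, where the listed links can be opened directly. This is an initial version; corrections and additions are welcome.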
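OCR: Text, License Plate, and CAPTCHA Recognition (Zhuanzhi Collection)
Getting Started
Papers and Code
Text Recognition
Text Detection
CAPTCHA Breaking
Handwriting Recognition
License Plate Recognition
Hands-on Projects
Videos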
Getting Started
End-to-end OCR: a CNN-based implementation
blog: [http://blog.xlvector.net/2016-05/mxnet-ocr-cnn/]
How to recognize a handwritten digit dataset with a convolutional neural network (CNN)
blog: [http://www.cnblogs.com/charlotte77/p/5671136.html]
What algorithms does OCR text recognition use?
[https://www.zhihu.com/question/20191727]
Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning
[https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/]
Segmentation-free end-to-end character recognition for license plates
[http://m.blog.csdn.net/Relocy/article/details/52174198]
Tencent OCR: automatic recognition technology, revealing the true face of text
[http://blog.xlvector.net/2016-05/mxnet-ocr-cnn/]
Getting started with the Tesseract OCR engine
[http://blog.csdn.net/xiaochunyong/article/details/7193744]
VIN code recognition on car windshields
[https://github.com/DoctorDYL/VINOCR]
Key techniques of license plate recognition algorithms and the current state of research
[http://www.siat.cas.cn/xscbw/xsqk/201012/W020101222564768411838.pdf]
End-to-end OCR: CAPTCHA recognition
[https://zhuanlan.zhihu.com/p/21344595?refer=xlvector]
Papers and Code
Text Recognition
Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
intro: Google. Ian J. Goodfellow
arxiv: [https://arxiv.org/abs/1312.6082]
End-to-End Text Recognition with Convolutional Neural Networks
paper: [http://www.cs.stanford.edu/~acoates/papers/wangwucoatesng_icpr2012.pdf]
honors thesis: [http://cs.stanford.edu/people/dwu4/HonorThesis.pdf]
Word Spotting and Recognition with Embedded Attributes
paper: [http://ieeexplore.ieee.org.sci-hub.org/xpl/articleDetails.jsp?arnumber=6857995&filter%3DAND%28p_IS_Number%3A6940341%29]
Reading Text in the Wild with Convolutional Neural Networks
arxiv: [http://arxiv.org/abs/1412.1842]
homepage: [http://www.robots.ox.ac.uk/~vgg/publications/2016/Jaderberg16/]
demo: [http://zeus.robots.ox.ac.uk/textsearch/#/search/]
code: [http://www.robots.ox.ac.uk/~vgg/research/text/]
Deep structured output learning for unconstrained text recognition
arxiv: [http://arxiv.org/abs/1412.5903]
Deep Features for Text Spotting
paper: [http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14/jaderberg14.pdf]
bitbucket: [https://bitbucket.org/jaderberg/eccv2014_textspotting]
gitxiv: [http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting]
Reading Scene Text in Deep Convolutional Sequences
arxiv: [http://arxiv.org/abs/1506.04395]
DeepFont: Identify Your Font from An Image
arxiv: [http://arxiv.org/abs/1507.03196]
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
intro: Convolutional Recurrent Neural Network
arxiv: [http://arxiv.org/abs/1507.05717]
github: [https://github.com/bgshih/crnn]
github: [https://github.com/meijieru/crnn.pytorch]
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
arxiv: [http://arxiv.org/abs/1603.03101]
Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks
arxiv: [http://arxiv.org/abs/1604.00974]
DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images
arxiv: [http://arxiv.org/abs/1605.07314]
End-to-End Interpretation of the French Street Name Signs Dataset
paper: [http://link.springer.com/chapter/10.1007%2F978-3-319-46604-0_30]
github: [https://github.com/tensorflow/models/tree/master/street]
End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance
arxiv: [https://arxiv.org/abs/1611.06159]
Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading
arxiv: [https://arxiv.org/abs/1611.07385]
Improving Text Proposals for Scene Images with Fully Convolutional Networks
intro: Universitat Autonoma de Barcelona & University of Florence
intro: International Conference on Pattern Recognition - DLPR workshop
arxiv: [https://arxiv.org/abs/1702.05089]
Scene Text Eraser
[https://arxiv.org/abs/1705.02772]
Attention-based Extraction of Structured Information from Street View Imagery
intro: University College London & Google Inc
arxiv: [https://arxiv.org/abs/1704.03549]
github: [https://github.com/tensorflow/models/tree/master/attention_ocr]
STN-OCR: A single Neural Network for Text Detection and Text Recognition
arxiv: [https://arxiv.org/abs/1707.08831]
github: [https://github.com/Bartzi/stn-ocr]
Sequence to sequence learning for unconstrained scene text recognition
intro: master thesis
arxiv: [http://arxiv.org/abs/1607.06125]
Drawing and Recognizing Chinese Characters with Recurrent Neural Network
arxiv: [https://arxiv.org/abs/1606.06539]
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
intro: correct rates: Dataset-CASIA 97.10% and Dataset-ICDAR 97.15%
arxiv: [https://arxiv.org/abs/1610.02616]
Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition
arxiv: [https://arxiv.org/abs/1610.04057]
Visual attention models for scene text recognition
[https://arxiv.org/abs/1706.01487]
Focusing Attention: Towards Accurate Text Recognition in Natural Images
intro: ICCV 2017
arxiv: [https://arxiv.org/abs/1709.02054]
Scene Text Recognition with Sliding Convolutional Character Models
[https://arxiv.org/abs/1709.01727]
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
[https://arxiv.org/abs/1710.03425]
A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition
[https://arxiv.org/abs/1711.02809]
Arbitrarily-Oriented Text Recognition
intro: A method used in ICDAR 2017 word recognition competitions
arxiv: [https://arxiv.org/abs/1711.04226]
Text Detection
Object Proposals for Text Extraction in the Wild
intro: ICDAR 2015
arxiv: [http://arxiv.org/abs/1509.02317]
github: [https://github.com/lluisgomez/TextProposals]
Text-Attentional Convolutional Neural Networks for Scene Text Detection
arxiv: [http://arxiv.org/abs/1510.03283]
Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network
arxiv: [http://arxiv.org/abs/1603.09423]
Synthetic Data for Text Localisation in Natural Images
intro: CVPR 2016
project page: [http://www.robots.ox.ac.uk/~vgg/data/scenetext/]
arxiv: [http://arxiv.org/abs/1604.06646]
paper: [http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf]
github: [https://github.com/ankush-me/SynthText]
Scene Text Detection via Holistic, Multi-Channel Prediction
arxiv: [http://arxiv.org/abs/1606.09002]
Detecting Text in Natural Image with Connectionist Text Proposal Network
intro: ECCV 2016
arxiv: [http://arxiv.org/abs/1609.03605]
github: [https://github.com/tianzhi0549/CTPN]
github: [https://github.com/qingswu/CTPN]
demo: [http://textdet.com/]
github: [https://github.com/eragonruan/text-detection-ctpn]
TextBoxes: A Fast Text Detector with a Single Deep Neural Network
intro: AAAI 2017
arxiv: [https://arxiv.org/abs/1611.06779]
github: [https://github.com/MhLiao/TextBoxes]
github: [https://github.com/xiaodiu2010/TextBoxes-TensorFlow]
Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection
intro: CVPR 2017
intro: F-measure 70.64%, outperforming the existing state-of-the-art method with F-measure 63.76%
arxiv: [https://arxiv.org/abs/1703.01425]
Detecting Oriented Text in Natural Images by Linking Segments
intro: CVPR 2017
arxiv: [https://arxiv.org/abs/1703.06520]
github: [https://github.com/dengdan/seglink]
Deep Direct Regression for Multi-Oriented Scene Text Detection
arxiv: [https://arxiv.org/abs/1703.08289]
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting
[https://arxiv.org/abs/1704.00834]
WordFence: Text Detection in Natural Images with Border Awareness
intro: ICIP 2017
arxiv: [https://arxiv.org/abs/1705.05483]
SSD-text detection: Text Detector
intro: A modified SSD model for text detection
github: [https://github.com/oyxhust/ssd-text_detection]
R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection
intro: Samsung R&D Institute China
arxiv: [https://arxiv.org/abs/1706.09579]
R-PHOC: Segmentation-Free Word Spotting using CNN
intro: ICDAR 2017
arxiv: [https://arxiv.org/abs/1707.01294]
Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks
[https://arxiv.org/abs/1707.03985]
EAST: An Efficient and Accurate Scene Text Detector
intro: CVPR 2017
arxiv: [https://arxiv.org/abs/1704.03155]
github: [https://github.com/argman/EAST]
Deep Scene Text Detection with Connected Component Proposals
intro: Amap Vision Lab, Alibaba Group
arxiv: [https://arxiv.org/abs/1708.05133]
Single Shot Text Detector with Regional Attention
intro: ICCV 2017
arxiv: [https://arxiv.org/abs/1709.00138]
github: [https://github.com/BestSonny/SSTD]
code: [http://sstd.whuang.org]
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
[https://arxiv.org/abs/1709.03272]
Deep Residual Text Detection Network for Scene Text
intro: IAPR International Conference on Document Analysis and Recognition 2017. Samsung R&D Institute of China, Beijing
arxiv: [https://arxiv.org/abs/1711.04147]
Feature Enhancement Network: A Refined Scene Text Detector
intro: AAAI 2018
arxiv: [https://arxiv.org/abs/1711.04249]
ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene
[https://arxiv.org/abs/1711.11249]
CAPTCHA Breaking
Using deep learning to break a Captcha system
intro: "Using Torch code to break simplecaptcha with 92% accuracy"
blog: [https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/]
github: [https://github.com/arunpatala/captcha]
Breaking reddit captcha with 96% accuracy
blog: [https://deepmlblog.wordpress.com/2016/01/05/breaking-reddit-captcha-with-96-accuracy/]
github: [https://github.com/arunpatala/reddit.captcha]
I’m not a human: Breaking the Google reCAPTCHA
paper: [https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf]
Neural Net CAPTCHA Cracker
slides: [http://www.cs.sjsu.edu/faculty/pollett/masters/Semesters/Spring15/geetika/CS298%20Slides%20-%20PDF]
github: [https://github.com/bgeetika/Captcha-Decoder]
demo: [http://cp-training.appspot.com/]
Recurrent neural networks for decoding CAPTCHAS
blog: [https://deepmlblog.wordpress.com/2016/01/12/recurrent-neural-networks-for-decoding-captchas/]
demo: [http://simplecaptcha.sourceforge.net/]
code: [http://sourceforge.net/projects/simplecaptcha/]
Reading irctc captchas with 95% accuracy using deep learning
github: [https://github.com/arunpatala/captcha.irctc]
I Am Robot: Learning to Break Semantic Image CAPTCHAs
intro: automatically solves 70.78% of the image reCaptcha challenges while requiring only 19 seconds per challenge; applied to the Facebook image captcha, it achieves an accuracy of 83.5%
paper: [http://www.cs.columbia.edu/~polakis/papers/sivakorn_eurosp16.pdf]
SimGAN-Captcha
intro: Solve captcha without manually labeling a training set
github: [https://github.com/rickyhan/SimGAN-Captcha]
Handwriting Recognition
High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps
arxiv: [http://arxiv.org/abs/1505.04925]
github: [https://github.com/zhongzhuoyao/HCCR-GoogLeNet]
Recognize your handwritten numbers
[https://medium.com/@o.kroeger/recognize-your-handwritten-numbers-3f007cbe46ff#.jllz62xgu]
Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras
blog: [http://machinelearningmastery.com/handwritten-digit-recognition-using-convolutional-neural-networks-python-keras/]
MNIST Handwritten Digit Classifier
github: [https://github.com/karandesai-96/digit-classifier]
LeNet – Convolutional Neural Network in Python
blog: [http://www.pyimagesearch.com/2016/08/01/lenet-convolutional-neural-network-in-python/]
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
arxiv: [http://arxiv.org/abs/1604.03286]
MLPaint: the Real-Time Handwritten Digit Recognizer
blog: [http://blog.mldb.ai/blog/posts/2016/09/mlpaint/]
github: [https://github.com/mldbai/mlpaint]
demo: [https://docs.mldb.ai/ipy/notebooks/_demos/_latest/Image%20Processing%20with%20Convolutions.html]
Training a Computer to Recognize Your Handwriting
[https://medium.com/@annalyzin/training-a-computer-to-recognize-your-handwriting-24b808fb584#.gd4pb9jk2]
Using TensorFlow to create your own handwriting recognition engine
blog: [https://niektemme.com/2016/02/21/tensorflow-handwriting/]
github: [https://github.com/niektemme/tensorflow-mnist-predict/]
Building a Deep Handwritten Digits Classifier using Microsoft Cognitive Toolkit
blog: [https://medium.com/@tuzzer/building-a-deep-handwritten-digits-classifier-using-microsoft-cognitive-toolkit-6ae966caec69#.c3h6o7oxf]
github: [https://github.com/tuzzer/ai-gym/blob/a97936619cf56b5ed43329c6fa13f7e26b1d46b8/MNIST/minist_softmax_cntk.py]
Hand Writing Recognition Using Convolutional Neural Networks
intro: This CNN-based model for recognizing handwritten digits attains a validation accuracy of 99.2% after training for 12 epochs. It is trained on the MNIST dataset on Kaggle.
github: [https://github.com/ayushoriginal/HandWritingRecognition-CNN]
Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling
intro: a compact model of 0.57 MB; performance decreases by only 0.91%.
arxiv: [https://arxiv.org/abs/1705.05207]
Handwritten digit string recognition by combination of residual network and RNN-CTC
[https://arxiv.org/abs/1710.03112]
License Plate Recognition
Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
arxiv: [http://arxiv.org/abs/1601.05610]
Number plate recognition with Tensorflow
blog: [http://matthewearl.github.io/2016/05/06/cnn-anpr/]
github: [https://github.com/matthewearl/deep-anpr]
end-to-end-for-plate-recognition
github: [https://github.com/szad670401/end-to-end-for-chinese-plate-recognition]
Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN
intro: International Workshop on Advanced Image Technology, January, 8-10, 2017. Penang, Malaysia. Proceeding IWAIT2017
arxiv: [https://arxiv.org/abs/1701.06439]
License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks
arxiv: [https://arxiv.org/abs/1703.07330]
api: [https://www.sighthound.com/products/cloud]
Adversarial Generation of Training Examples for Vehicle License Plate Recognition
[https://arxiv.org/abs/1707.03124]
Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks
[https://arxiv.org/abs/1709.08828]
Hands-on Projects
Multi-label classification: end-to-end Chinese license plate recognition based on MXNet
[https://github.com/szad670401/end-to-end-for-chinese-plate-recognition]
Optical recognition of second-generation Chinese ID cards
[https://github.com/KevinGong2013/ChineseIDCardOCR]
EasyPR: an open-source Chinese license plate recognition system
[https://github.com/liuruoze/EasyPR]
VIN code recognition on car windshields
[https://github.com/DoctorDYL/VINOCR]
CLSTM : A small C++ implementation of LSTM networks, focused on OCR
github: [https://github.com/tmbdev/clstm]
OCR text recognition using tensorflow with attention
github: [https://github.com/pannous/caffe-ocr]
github: [https://github.com/pannous/tensorflow-ocr]
Digit Recognition via CNN: digital meter numbers detection
github: [https://github.com/SHUCV/digit]
Attention-OCR: Visual Attention based OCR
github: [https://github.com/da03/Attention-OCR]
umaru: An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm
github: [https://github.com/edward-zhu/umaru]
Tesseract.js: Pure Javascript OCR for 62 Languages
homepage: [http://tesseract.projectnaptha.com/]
github: [https://github.com/naptha/tesseract.js]
DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet
github: [https://github.com/chongyangtao/DeepHCCR]
deep ocr: make a better Chinese character recognition OCR than Tesseract
[https://github.com/JinpengLI/deep_ocr]
Practical Deep OCR for scene text using CTPN + CRNN
[https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/blob/master/notebooks/OCR/readme.md]
Text-Detection-using-py-faster-rcnn-framework
github: [https://github.com/jugg1024/Text-Detection-with-FRCN]
ocropy: Python-based tools for document analysis and OCR
github: [https://github.com/tmbdev/ocropy]
Extracting text from an image using Ocropus
blog: [http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html]
Videos
LSTMs for OCR
youtube: [https://www.youtube.com/watch?v=5vW8faXvnrc]