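Neural networks and AI are simple! So... stop showing off and pretending you are a genius!
This post may read like an attack on the status quo, but that is not its purpose. The aim is to work out why AI experts went from being vanishingly rare to being everywhere in such a short time.
About the author: Brandon Wirtz, CEO and founder of Recognant
People regularly tell me about the impressive things they have achieved with AI, and 99% of those achievements are completely stupid. This post may read like an attack on the status quo, but that is not its purpose. The aim is to work out why AI experts went from being vanishingly rare to being everywhere in such a short time, and to expose one fact: most of these experts look like experts only because so few people point out that what they have built is pure crap.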
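Suppose you built a neural network from scratch, and it even runs on a phone...
Great. You converted 11 lines of Python that would fit on a T-shirt into Java, C, or C++. You have mastered what a cross compiler can do in 3 seconds.
Most people do not know how simple a neural network really is; they assume it is extraordinarily complex. Like fractals in mathematics, a neural network can handle tasks that look complex, but the complexity comes from repetition and a random number generator.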
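Suppose you built a deep neural network with 20 layers...
Congratulations! You took the code above and looped the loop one more time. That must have been very hard, deciding where to put another For statement and a colon.
"Deep learning" and N layers of depth are just a neural network that runs its output back through itself. Because you loop the loop, this is called a recursive neural network (RNN).
It is like learning to drive while only being able to turn right: you can still get almost anywhere. It may not be the most efficient way, but it is easier than turning left.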
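Suppose you trained a neural network on Nvidia GPUs and then moved it to a phone...
What is wrong (or not implemented) in those 11 lines above is that the seed is not set. Without setting the seed, I cannot guarantee that a second pass will get the same random numbers as the first, so the results could differ dramatically. Because a phone and a desktop will not give the same random numbers, and different phone chips may each give different ones, a neural network trained on a GPU-based system has a high probability of not working on a phone.
Because training can take millions or even billions of times longer than classifying in a locked system, building a neural network for a phone is close to impossible. There will also always be differences between devices. Plus or minus 5% does not matter much for speech recognition, but it matters a great deal for cancer detection and diagnosis.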
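Suppose you trained a neural network to do something no human has been able to do... such as telling whether someone is gay from a single photo.
No, you did not. Neural networks are dumb black-box systems. If you torture them enough, you can get a good fit to the test data, but you will not get good results on randomly sourced tests. AI is very good at spurious correlations: the marriage rate in Kentucky does not drive the drowning rate.
Likewise, the fact that a photo was taken close up does not prove that the animal in it is a cat rather than a lion. The shape of the horizon did not cause something to be a lion or a cat.
People want to ascribe mythical powers to AI, but by and large AI cannot do what a human cannot. There are occasional exceptions, but only for transparent AI. Neural networks are not transparent, and even in transparent systems a human would be able to replicate the final result.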
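Suppose you use TensorFlow to...
Remember those 11 lines of code above? TensorFlow is just a wrapper for those 11 lines. What it does well is help you visualize what is happening inside them. In many ways it resembles Google Analytics. All the data needed to do what Google Analytics does is probably in your server logs, but reading those logs is hard and reading Google Analytics is easy. At the same time, Google Analytics will tell you that your server is slow, but it will not tell you why.
Those of us who understand neural networks do not want or need TensorFlow, because we can visualize the data without the fancy charts and animations, and because by looking at the raw data and code we can work out the equivalent of why the server is slow.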
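Suppose you use neural networks for natural language processing (NLP) or natural language understanding (NLU)...
What neural networks simulate is not much above the intelligence of a slug. What are the odds that you taught a slug to understand English?
A neural network with one trait for every word in the English language would need as much computing power as all of Google. Raising that to one trait for every word sense in English would take the combined computing power of every cloud service on the planet.
AI can be built to do great things, but neural networks have their limits.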
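Suppose you have a self-defining neural network...
Congratulations, you know how to wrap the 11 lines of neural network code in the 9 lines of a genetic algorithm, or in the 44 lines of a distributed evolutionary algorithm. Time to write a press release, because your 55 lines of code are going to... oh, wait...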
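Suppose you trained a neural network to do... anything.
Congratulations, you are a data wrangler. That sounds impressive, but you are a dog trainer, your dog has the brains of a slug, and the only thing going for it is that you can have lots of them. There is no magic in owning a training set. Do not fool yourself or others into thinking you are anything more than a glorified slug trainer.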
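Suppose you combined neural networks and blockchain...
Congratulations, you know how to stack hype. Unfortunately, hash mining and neural networks have nothing in common, and running every data set through every node of a blockchain farm would not work. With data sets of normal size, a neural network starts to have problems once you "slice" the load more than about 16 ways. You can go further if you have billions of records, or if you are doing back propagation and want to test multiple orders of data presentation, but these techniques do not scale to thousands or millions of nodes.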
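I do not do much with neural networks.
There is neural network code in my toolbox, which is where it belongs: one tool among several, not the foundation of an entire product. Most of my work is in epistemology and self-defining heuristics. The combination of technologies is called Mind Simulation. Neural networks are supposed to model the hardware of the brain in software (which they do not), whereas Mind Simulation models the software of the brain in software: a brain emulator, as it were. Mind Simulation has existed for only about 10 years, while neural networks have been around for more than 50. Mind Simulation also differs in that it is transparent and takes millions of lines of code rather than dozens.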
Original English text: Neural network AI is simple. So... Stop pretending you are a genius.
On a regular basis people tell me about their impressive achievements using AI. 99% of these things are completely stupid. This post may come off as a rant, but that is not so much its intent as it is to point out why we went from having very few AI experts to having so many in so little time. It is also meant to convey that most of these experts only seem experty because so few people know how to call them on their bullshit.
So you built a neural network from scratch… And it runs on a phone…
Great. So you converted 11 lines of Python that would fit on a T-shirt to Java or C or C++. You have mastered what a cross compiler can do in 3 seconds.
Most people don’t know that a neural network is so simple. They think it is super complex. Like fractals, a neural network can do things that seem complex, but that complexity comes from repetition and a random number generator.
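The 11 lines referred to above are not reproduced in this post. As a rough stand-in, here is a minimal sketch in plain Python of the simplest possible case: a single sigmoid neuron trained by gradient descent. The toy dataset, the names `train` and `predict`, the learning rate, and the step count are all assumptions chosen for illustration, not the original code.

```python
import math
import random

# Toy dataset (an assumption for illustration): the target equals the first input column.
X = [[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]]
Y = [0, 1, 1, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, row):
    # Weighted sum of the inputs, squashed into the range (0, 1).
    return sigmoid(sum(w * x for w, x in zip(weights, row)))


def train(seed=1, steps=2000, learning_rate=0.5):
    rng = random.Random(seed)  # seeded so repeated runs start from the same weights
    weights = [rng.uniform(-1, 1) for _ in X[0]]
    for _ in range(steps):
        # Gradient of the cross-entropy loss, summed over the whole dataset.
        gradient = [0.0] * len(weights)
        for row, target in zip(X, Y):
            error = predict(weights, row) - target
            for i, x in enumerate(row):
                gradient[i] += error * x
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


weights = train()
predictions = [predict(weights, row) for row in X]
print([round(p, 2) for p in predictions])
```

The repetition the paragraph mentions is the `for _ in range(steps)` loop, and the random number generator only supplies the starting weights.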
So you built a neural network that is 20 layers deep…
Congrats! You took the above code, and looped the loop again. That must have been so hard, deciding where to put another For and a Colon.
“Deep Learning” with n layers of depth is just a neural network that runs its output through itself. This is called a Recursive Neural Network (RNN), because you loop the loop.
This is similar to learning to drive, and only being able to make right turns. You can get to almost anywhere doing this. It may not be the most efficient, but it is easier than making left turns.
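To make "looping the loop" concrete, here is a hedged sketch of a forward pass through 20 layers, written as one `for` loop over a list of weight matrices. The layer width, the random weights, and the helper names `make_layer`, `layer_forward`, and `forward` are assumptions for illustration. Note that in this sketch each layer has its own weights, as in a plain feed-forward stack; reusing a single weight matrix at every step would be the recurrent variant.

```python
import math
import random


def make_layer(rng, n_inputs, n_outputs):
    # One row of weights per output unit (hypothetical helper).
    return [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_outputs)]


def layer_forward(weights, inputs):
    # Each output unit is a sigmoid of a weighted sum of the inputs.
    return [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in weights
    ]


def forward(layers, inputs):
    activations = inputs
    for weights in layers:  # the "extra For and colon"
        activations = layer_forward(weights, activations)
    return activations


rng = random.Random(0)
layers = [make_layer(rng, 4, 4) for _ in range(20)]  # 20 layers deep
output = forward(layers, [0.5, -0.2, 0.1, 0.9])
print(output)
```

Going from one layer to twenty changes only the length of the `layers` list; the loop body stays the same.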
So you trained a neural network using Nvidia GPUs and moved it to the phone…
In those 11 lines of code above, something that is wrong (or not implemented) is that the seed is not set. Without setting the seed I can’t guarantee that I will get the same random numbers in a second pass as in the first pass. As a result, I could have dramatically different results. Since your phone and your desktop won’t give the same random numbers, and different phone chips could all have different random numbers, moving your training from a GPU-based system to a mobile system has a high probability of not working.
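As a small illustration of what setting the seed buys you, the sketch below draws random numbers from Python's standard `random` module with and without a shared seed. The helper name `draw` and the seed values are assumptions for illustration. It only demonstrates repeatability inside one Python environment; it says nothing about what a particular GPU or phone library does.

```python
import random


def draw(seed, count=5):
    # A dedicated generator, so the seed fully determines the sequence.
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


first_pass = draw(seed=1234)
second_pass = draw(seed=1234)
other_seed = draw(seed=4321)

print(first_pass == second_pass)  # same seed, same numbers
print(first_pass == other_seed)   # different seed, different numbers
```

With the seed fixed, a second pass starts from the same initial weights as the first, which is the guarantee the paragraph says is missing.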
Since training can take millions to billions of times longer than classifying in a locked system, building a neural network for a phone is pretty much impossible. There will always be differences between devices. Plus or minus 5% is not a big deal for voice recognition. It is a big deal for things like cancer detection/diagnosis.
So you trained a neural network to do something no human has been able to do…Like detect if people are gay just from a photo.
No. No you didn’t. Neural networks are dumb black box systems. If you torture them enough you can get a great fit to the test data, but you won’t get great results from randomly sourced tests. AI is really good at spurious correlations. The marriage rate in Kentucky is not driving the drowning rate.
Nor is the fact that a picture is taken close up a proof that the animal in the photo is a cat instead of a lion. So the shape of the horizon didn’t cause something to be a lion or a cat.
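The point about torturing a model until it fits can be sketched without any neural network at all. The snippet below is an illustration under made-up assumptions (pure noise data, 1000 candidate series, a sample of 20): it searches many unrelated random series for the one that best correlates with a random target, then checks how that kind of series behaves on fresh draws.

```python
import random


def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)
n_small, n_candidates, n_fresh = 20, 1000, 2000

# Target and candidates are all independent noise: no real relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_small)]
candidates = [[rng.gauss(0, 1) for _ in range(n_small)] for _ in range(n_candidates)]

# "Torture" step: keep whichever candidate happens to fit the small sample best.
best_fit = max(abs(pearson(c, target)) for c in candidates)

# Fresh draws from the same noise sources: the apparent relationship is gone.
fresh_target = [rng.gauss(0, 1) for _ in range(n_fresh)]
fresh_feature = [rng.gauss(0, 1) for _ in range(n_fresh)]
fresh_fit = abs(pearson(fresh_feature, fresh_target))

print(f"best fit on the small sample: {best_fit:.2f}")
print(f"fit on fresh data: {fresh_fit:.2f}")
```

Nothing here was learned; the strong fit on the small sample comes purely from trying enough candidates.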
People want to ascribe magic powers to AI, but for the most part AI can’t do anything a human can’t. There are some exceptions, but only for transparent AI. Neural Networks aren’t transparent, and even in the transparent systems a human would be able to replicate the final result.
So you use TensorFlow to…
Remember those 11 lines from above? TensorFlow is just a wrapper for those 11 lines. What it does well is help you visualize what is happening in those 11 lines. In many ways it is like Google Analytics. All of the data to do what Google Analytics does is probably in your server log, but looking at those logs is hard, and looking at Google Analytics is easy. At the same time while Google Analytics will tell you that your server is slow it won’t tell you why.
Those of us who understand neural networks don’t want or need TensorFlow, because we visualize the data without its fancy charts and animations, and because we look at the data and code raw, we can figure out the equivalent of why the server is slow.
So you use neural networks to do NLP/NLU…
Common sense, people. Neural networks are not simulating much more than a Slug’s level of intelligence. What are the odds you taught a slug to understand English?
Building a neural network with 1 trait for every word in the English language would require a network that used as much computing power as all of Google. Upping that to 1 trait for each word sense in the English language would be all of the computing in all of the cloud services on the planet.
AI can be built to do great things. Neural networks have limitations.
So you have a self-defining neural network…
Congrats, you know how to wrap the 11 lines of neural network code in the 9 lines of code for a genetic algorithm. Or the 44 lines for a distributed evolutionary algorithm. Write a press release because your 55 lines of code are going to... Oh, wait...
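As a sketch of what wrapping a network in a genetic algorithm can look like, the snippet below evolves the three weights of a single sigmoid neuron by selection, crossover, and mutation instead of gradient descent. It illustrates the general technique only and is not the 9 or 44 lines mentioned above; the dataset, the names `loss` and `evolve`, the population size, and the mutation scale are made up for illustration.

```python
import math
import random

# Toy dataset (an assumption): the target equals the first input column.
X = [[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]]
Y = [0, 1, 1, 0]


def loss(weights):
    # Squared error of a single sigmoid neuron over the dataset.
    total = 0.0
    for row, target in zip(X, Y):
        z = sum(w * x for w, x in zip(weights, row))
        prediction = 0.5 * (1.0 + math.tanh(z / 2.0))  # sigmoid that cannot overflow
        total += (prediction - target) ** 2
    return total


def evolve(generations=200, population_size=20, keep=5, mutation_scale=0.5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=loss)
        history.append(loss(population[0]))
        parents = population[:keep]  # selection: the fittest survive unchanged
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover picks each weight from one parent, then mutation adds noise.
            child = [rng.choice(pair) + rng.gauss(0, mutation_scale)
                     for pair in zip(mother, father)]
            children.append(child)
        population = parents + children
    population.sort(key=loss)
    history.append(loss(population[0]))
    return population[0], history


best_weights, history = evolve()
print(f"loss before: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Because the fittest individuals are carried over unchanged, the best loss can never get worse from one generation to the next.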
So you trained a neural network to…anything.
Congrats, you are a data wrangler. While that sounds impressive, you are a dog trainer. Only your dog has the brains of a slug, and the only thing it has going for it is that you can make lots of them. There is no magic in owning a training set. It might have been hard to track down, but don't fool yourself (or others) into thinking you are anything more than a glorified slug trainer.
So you combined neural networks and blockchain…
Congrats, you know how to make hype stack. Unfortunately, hash mining and neural networks don’t have anything in common, and trying to run all of your data sets through all of the nodes of a blockchain farm wouldn’t work. Neural networks start to have problems when you “slice” the load more than about 16 ways with data sets of normal size. You can go larger if you have billions of records, or if you are doing back propagation and want to test multiple orders of data presentation, but these techniques don’t scale to thousands or millions of nodes.
I don't do much with neural networks.
There is neural network code in my toolbox, but that is what it should be: a tool in the selection, not the basis for an entire product. Most of my work is in epistemology and self-defining heuristics. The combination of technologies is called Mind Simulation, because rather than modeling the hardware of the brain in software, as neural networks are supposed to (and don't), Mind Simulation is about modeling the software of the brain in software. A brain emulator, as it were. Mind Simulation has only been a thing for about 10 years, whereas neural networks have been around for 50+. Mind Simulation also differs in that it is transparent, and takes millions of lines of code, not dozens.
To learn more about AI that isn't neural network based, check out my follow-up article:
https://www.linkedin.com/pulse/8-ai-technologies-aint-neural-networks-brandon-wirtz/