
Digest | How Big Data Creates False Confidence

Swarma Club (集智俱乐部) 2018-12-21





“Decongestion of urban areas with hotspot-pricing”


“Untangling performance from success”

Translator: jeffersonchou

“Twelve principles for open innovation 2.0”

Translator: 胡鹏博

“A Schrödinger cat living in two boxes”

Translator: 王继康

“Poverty linked to epigenetic changes and mental illness”

Translator: 李宇峰



5. How Big Data Creates False Confidence



(Translated by -)

Although no one can quite agree how to define it, the general idea is to find datasets so enormous that they can reveal patterns invisible to conventional inquiry. The data are often generated by millions of real-world user actions, such as tweets or credit-card purchases, and they can take thousands of computers to collect, store, and analyze. To many companies and researchers, though, the investment is worth it because the patterns can unlock information about anything from genetic disorders to tomorrow’s stock prices. But there’s a problem: It’s tempting to think that with such an incredible volume of data behind them, studies relying on big data couldn’t be wrong. But the bigness of the data can imbue the results with a false sense of certainty. Many of them are probably bogus, and the reasons why should give us pause about any research that blindly trusts big data.


6. Habituation in non-neural organisms: evidence from slime moulds



(Translated by 高德, edited by 傅渥成)

Learning, defined as a change in behaviour evoked by experience, has hitherto been investigated almost exclusively in multicellular neural organisms. Evidence for learning in non-neural multicellular organisms is scant, and only a few unequivocal reports of learning have been described in single-celled organisms. Here we demonstrate habituation, an unmistakable form of learning, in the non-neural organism Physarum polycephalum. In our experiment, using chemotaxis as the behavioural output and quinine or caffeine as the stimulus, we showed that P. polycephalum learnt to ignore quinine or caffeine when the stimuli were repeated, but responded again when the stimulus was withheld for a certain time. Our results meet the principal criteria that have been used to demonstrate habituation: responsiveness decline and spontaneous recovery. To distinguish habituation from sensory adaptation or motor fatigue, we also show stimulus specificity. Our results point to the diversity of organisms lacking neurons, which likely display a hitherto unrecognized capacity for learning, and suggest that slime moulds may be an ideal model system in which to investigate fundamental mechanisms underlying learning processes. Besides, documenting learning in non-neural organisms such as slime moulds is centrally important to a comprehensive, phylogenetic understanding of when and where in the tree of life the earliest manifestations of learning evolved.

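Translation: Learning, understood as a change in behaviour brought about by experience, has so far been studied almost exclusively in multicellular organisms with nervous systems. Evidence of learning in multicellular organisms without neurons remains thin, and only a handful of clear-cut reports describe learning in single-celled organisms. In this paper we show habituation, an unmistakable form of learning, in the non-neural organism Physarum polycephalum. Taking chemotaxis as the behavioural output and quinine or caffeine as the stimulus, we found that P. polycephalum learned to ignore quinine or caffeine when the stimulus was presented repeatedly, but responded to it again once the stimulus had been withheld for a certain time. The results satisfy the main criteria used to establish habituation, namely responsiveness decline and spontaneous recovery. To separate habituation from sensory adaptation or motor fatigue, we also examined stimulus specificity. The findings point to the diversity of organisms without neurons, which may possess a capacity for learning that has gone unrecognized until now, and suggest that slime moulds could be an ideal model system for studying the basic mechanisms of learning. Documenting learning in non-neural organisms such as slime moulds also matters for a comprehensive, phylogenetic understanding of when and where in the tree of life the earliest forms of learning evolved.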

7. Crowdsourcing the Robin Hood effect in cities



(Translated by F7, edited by 傅渥成)

Socioeconomic inequalities in cities are embedded in space and result in neighborhood effects, whose harmful consequences have proved very hard to counterbalance efficiently by planning policies alone. Considering redistribution of money flows as a first step toward improved spatial equity, we study a bottom-up approach that would rely on a slight evolution of shopping mobility practices. Building on a database of anonymized credit card transactions in Madrid and Barcelona, we quantify the mobility effort required to reach a reference situation where commercial income is evenly shared among neighborhoods. The redirections of shopping trips preserve key properties of human mobility, including travel distances. Surprisingly, for both cities only a small fraction (∼5%) of trips need to be altered to reach equity situations, improving even other sustainability indicators. The method could be implemented in mobile applications that would assist individuals in reshaping their shopping practices, to promote the spatial redistribution of opportunities in the city.

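Translation: Socioeconomic inequality in cities is embedded in geographic space and gives rise to neighborhood effects, whose harms have proved very hard to offset effectively through planning policy alone. Treating the redistribution of money flows as a first step toward reducing spatial inequality, we study a bottom-up approach that requires only slight changes in shopping mobility. Using a database of anonymized credit card transactions in Madrid and Barcelona, we quantify the mobility effort needed to reach a reference situation in which commercial income is shared evenly among neighborhoods. The redirected shopping trips preserve key properties of human mobility, including travel distances. Surprisingly, in both cities only a small fraction (about 5%) of trips need to change to reach a more equitable situation, and other sustainability indicators improve as well. The method could be implemented in mobile applications that help individuals reshape their shopping habits, promoting the spatial redistribution of opportunities within the city.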

8. Combining complex networks and data mining: why and how



(Translated by 蔡嘉文, edited by 傅渥成)

The increasing power of computer technology does not dispense with the need to extract meaningful information out of data sets of ever growing size, and indeed typically exacerbates the complexity of this task. To tackle this general problem, two methods have emerged, at chronologically different times, that are now commonly used in the scientific community: data mining and complex network theory. Not only do complex network analysis and data mining share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but they also often address similar problems. In the face of that, a surprisingly low number of researchers turn out to resort to both methodologies. One may then be tempted to conclude that these two fields are either largely redundant or totally antithetic. The starting point of this review is that this state of affairs should be put down to contingent rather than conceptual differences, and that these two fields can in fact advantageously be used in a synergistic manner. An overview of both fields is first provided, some fundamental concepts of which are illustrated. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Contexts in which the appropriate integration of complex networks metrics can lead to improved classification rates with respect to classical data mining algorithms and, conversely, contexts in which data mining can be used to tackle important issues in complex network theory applications are illustrated. Finally, ways to achieve a tighter integration between complex networks and data mining, and open lines of research are discussed.


9. An Evolutionary Game Theoretic Approach to Multi-Sector Coordination and Self-Organization



(Translated by 凤兰-ATC-ABM-Canton)

Coordination games provide ubiquitous interaction paradigms to frame human behavioral features, such as information transmission, conventions and languages as well as socio-economic processes and institutions. By using a dynamical approach, such as Evolutionary Game Theory (EGT), one is able to follow, in detail, the self-organization process by which a population of individuals coordinates into a given behavior. Real socio-economic scenarios, however, often involve the interaction between multiple co-evolving sectors, with specific options of their own, that call for generalized and more sophisticated mathematical frameworks. In this paper, we explore a general EGT approach to deal with coordination dynamics in which individuals from multiple sectors interact. Starting from a two-sector, consumer/producer scenario, we investigate the effects of including a third co-evolving sector that we call public. We explore the changes in the self-organization process of all sectors, given the feedback that this new sector imparts on the other two.
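
For readers unfamiliar with the dynamical approach mentioned in this abstract, the following is a minimal sketch of how a single population can self-organize into one convention under the standard two-strategy replicator equation. It is not the paper's multi-sector model: the single-sector setup, the payoff values in `PAYOFF`, the step size, and the function names are all illustrative assumptions.

```python
def replicator_step(x, payoff, dt=0.01):
    """One Euler step of the two-strategy replicator equation.

    x is the fraction of the population playing strategy A;
    payoff[i][j] is the payoff to strategy i when it meets strategy j.
    """
    (a, b), (c, d) = payoff
    fitness_a = a * x + b * (1 - x)  # expected payoff of an A player
    fitness_b = c * x + d * (1 - x)  # expected payoff of a B player
    # Strategies that earn more than their rival grow in frequency.
    return x + dt * x * (1 - x) * (fitness_a - fitness_b)


def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the replicator step from an initial fraction x0."""
    x = x0
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical coordination game: matching on A pays 2, matching on B
# pays 1, and mismatching pays 0 (values chosen only for illustration).
PAYOFF = ((2.0, 0.0), (0.0, 1.0))

if __name__ == "__main__":
    for start in (0.3, 0.4):
        print(start, "->", round(simulate(start, PAYOFF), 4))
```

With these assumed payoffs the two strategies earn the same when one third of the population plays A, so that point separates the two outcomes: populations starting above it drift toward everyone playing A, and populations starting below it drift toward everyone playing B.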



10. Self-organization of river channels as a critical filter on climate signals



(Translated by Cicely)

Large floods should seemingly influence the depth and width of rivers. Phillips and Jerolmack, however, suggest that the self-organization of bedrock river channels blunts the impact of extreme rainfall events. River channel geometries from a wide range of coarse-grained rivers across the United States show that larger floods have very limited additional impact on channel geometry. River channel sculpting does increase as flood size increases, but the effect is most pronounced for moderate floods. This relationship may explain the long-term stability of rivers across shifts in climate.






