
ARP Research Mentor Interview | Harvard Professor Luke Miratrix: Why Statistics Fascinates Me

KingLead (京领新国际), 2020-02-04


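KingLead (京领新国际) recently interviewed Luke Miratrix, professor of education and statistics at Harvard University. He discussed his academic career, his teaching philosophy, and examples from his teaching, and shared his views on data science, creativity, and innovation.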


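Miratrix earned his undergraduate degree in computer science at the California Institute of Technology and continued his studies at the Massachusetts Institute of Technology (MIT). After completing his master's degree, he entered the mathematics education program at the University of California, Berkeley, where he earned his doctorate.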


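He is currently a professor of education and statistics at Harvard University. According to the U.S. News rankings, Harvard's statistics program ranks third in the United States and its education program ranks first.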


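He has taken part in many research projects, including work on elections and voting systems, large-scale evaluations in education, media analysis, a study of the management effectiveness of OSHA, a study of the effectiveness of teacher development, and human-computer interaction. A paper he co-authored, "Worth Weighting? How to Think About and Use Weights in Survey Experiments," won the 2019 Warren Miller Prize, and he has received the Early Career Award from the Society for Research on Educational Effectiveness (SREE).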


An experienced teacher with a great passion for education, he taught in American high schools for seven years before joining Harvard.


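ARP Program Mentor

Luke Miratrix

Assistant Professor of Education and Statistics, Harvard University

Ph.D. in mathematics education, University of California, Berkeley

Master's degree in computer science, Massachusetts Institute of Technology

Bachelor's degree in computer science, California Institute of Technology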



Q1

Could you start by telling us about your academic career?


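My university education began in computer science and mathematics. While I was studying artificial intelligence at MIT and working on my thesis, I realized that before committing myself fully to one field, I wanted exposure to more fields so that I could make a real choice.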


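I spent seven years teaching computer science and mathematics at two different secondary schools. During that time I came to appreciate the complexity of education, and I became deeply curious about how education works and how we can tell when it is working.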


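These questions led me to enroll in the education doctoral program at UC Berkeley, where I learned about education research and the field's major questions (with a particular focus on measurement and learning). I soon found that getting more out of the field required a deep understanding of the statistical methodology behind much of the research. So I transferred to a third graduate program, this time in statistics.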


My dissertation work focused mainly on randomized experiments, and I continued to work with education data.


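After graduating, I joined the statistics department at Harvard, where I deepened my knowledge of causal inference methods. Harvard statistics, known for housing many of the great thinkers in this area, was a perfect training ground. I began to refocus my research agenda on more applied work, and in the process I obtained my current position at the Harvard Graduate School of Education.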


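My work has now returned to the questions I started with: improving research practice in education to help our shared enterprise of understanding how people learn, when education programs work, and how to improve education for everyone.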


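In a sense, I speak two "languages": education and statistics. This lets me engage with the literature of both disciplines, work with my education colleagues to identify the pressing problems, and then use statistics to find the best-suited tools and methods.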


Thanks to my rigorous statistical training, I can also adapt these tools to the different kinds of problems that arise in education research.


I continue my research at Harvard, where what I learn and experience keeps pushing my work forward and focuses it on the questions that truly matter.


Q2

Could you tell us about some of the students you are most proud of?


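I am proud of so many students that it is hard to single out one story. I have taught high school students, undergraduates, and graduate students, and it has been a privilege to spend time learning alongside such outstanding students.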


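When I was a high school teacher, I taught courses in computer science and programming. For those courses I wrote a textbook called "Java, bots, and you," which was used to teach students their first Java programming course.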


This let students set their own pace. In class, they were always willing to try problems beyond the required material in order to extend and explore.


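One of my students, J, was so excited by this that he worked through the book as fast as he could and learned a great deal. He then put what he had learned to immediate use, writing distributed programs that ran on every computer in the lab to solve "magic squares" (a type of math problem).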


Watching him apply the tools and resources of the course, and the knowledge he had gained, directly to a real problem was a true delight.


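When I taught statistics courses at Harvard, I asked students to complete a final project at the end of the semester, and they did all sorts of exciting things for these projects.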


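The one that impressed me most came from a group of students who decided to study how much more organic food costs than conventional food. They first visited several food markets and drew up a standardized grocery list to build a large data set of food prices. They then wrote a report examining how those prices varied with factors such as the type of store and whether the food was organic.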


Watching students use what they had learned, together with their own data, to answer questions from everyday life made me truly proud.


I also advise seniors on their theses, and some of these students' work has been astonishing.


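For example, one student designed an evaluation of a relief program in Haiti after the earthquake. Admirably, she did not stop at plans on paper: she went to Haiti and worked with Oxfam, the international relief organization operating there, to make sure the evaluation was carried out. Many relief organizations do not appreciate how important this kind of evaluation is, even though it is what allows people to learn which aid programs work.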


I am also proud of all my graduate students, who have produced important work ranging from mathematical theory to practical application.


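For example, one of my students is very interested in what we can learn from large-scale administrative data on students. If we know that some schools are trying a new teaching method, how can we tell whether that method is effective?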


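One approach is a group comparison: find schools that resemble the schools using the new method in every respect except the method itself, then compare the two groups to see whether student outcomes (such as test scores) differ systematically. If this approach works, it could open the door to learning from many different educational experiments.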


But does it work?


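My student is examining a large set of randomized controlled trials and using them as a benchmark for these comparison-based methods, to check how well the methods perform. It is promising work that demands a deep understanding of both the data being used and the statistical methods needed for the evaluation.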


Q3

What is your teaching philosophy?


My main goal in teaching is to empower students by giving them tools for learning about the world.


Statistics and data science are ideal subjects for this. Indeed, the whole point of statistics and data science is to learn from data.


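To achieve this goal, my teaching must always stay anchored in the real problems we are trying to solve. Only when we focus on a real problem do the theory and programming have a clear, concrete point. It is much easier to understand a tool when we pay attention to what the tool is for.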


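In truth, statistics and data science are simply a collection of tools for understanding the world. They have enormous potential because they are so flexible and can be used in so many ways. When I teach, I try to convey how exciting this idea is. I hope that students who truly grasp it will be inspired to go further down this path.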


Q4

How do you talk with your students about creativity in data science?


Data science is almost entirely about creativity!


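Once students have the basic tools of data science and can take data and produce summaries and visualizations, they face the question: which summary or visualization should I make?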


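Fundamentally, this is a creative choice. Answering it well requires developing an aesthetic and learning to write about and describe data clearly and effectively. That is creativity!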


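I discuss this with students by having them practice the process and then talking over the results. Students see how their classmates answered the same question differently, and that variety of answers starts a conversation about how to approach a given data set. This is what makes students aware of the creative side of data science.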


Q5

What are your achievements and innovations in data science?


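I am a statistician and data scientist dedicated to improving research in education and the social sciences. I hope to achieve this by identifying and developing methods that are as open and transparent as possible while tailoring them to the particular features of a given problem.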


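I see my role as essentially one of service: I want to help social scientists reach their goals by providing tools they can use to solve their problems and by raising the quality of scientific discourse.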


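I see my work as a teacher in the field of education as directly serving these goals as well. I work mainly with large-scale randomized experiments, where I have built methods for extracting more information from such experiments without compromising their integrity. I have also carried the conceptual framework developed in this work into other data-science areas, such as geographic or spatial data and text analysis.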


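I am working to advance a way of doing statistics and quantitative research that rests on three core principles: making assumptions explicit, recognizing the importance of understanding and description, and knowing one's limitations. In my view, these principles directly shape statistical thinking and methodology. I try to understand when we can draw conclusions from data, and how to make those claims with the simplest and clearest tools.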


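We can choose and use a statistical tool appropriately only if we understand it. Much of my work therefore examines the scope of different statistical methods to determine when they can be used. Most of my theoretical work shows how certain methods are justified by known features of a study, such as the random assignment mechanism or the sampling mechanism, rather than by modeling assumptions.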


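My other work examines how flexible, adaptable statistical procedures tend to behave in practice. When people truly understand their tools, they are best placed to choose them and use them correctly.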


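I design these tools to analyze problems of real interest. When the results are made available for others to use and help people solve their own problems with their own data, I have achieved my goal.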


Q6

What sparked your passion for data science?


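Ever since I was a child, I have loved building and creating new things, and I have always loved the beauty and puzzles of mathematics. So in college I tried many different majors, including literature, chemistry, and physics, before finally choosing mathematics.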


I found mathematics incredibly inspiring and interesting: it is wonderful how simple rules and logic can lead to deeper and deeper insight.


I also really enjoyed writing computer programs, because I could create things for other people to use.


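So I went to graduate school in computer science, and from there I became interested in teaching (helping others learn is truly a privilege). My interest in teaching eventually led me to statistics and to the idea of using data to learn.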


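With statistics and data science, it was love at first sight. The field satisfies all my core passions: I enjoy creating tools for understanding the world, and these tools are rooted in mathematical ideas. To use the tools effectively I have to understand them deeply, which satisfies my desire to think hard about important things.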


Overall, I love statistics and data science because the field is truly interdisciplinary and stimulates every area of my brain!


Q7

Why do you think you are so creative?


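Because I am curious. I want to understand the world in all sorts of ways, and I spend a lot of time thinking about what I believe to be true, right, or beautiful. These three things lie behind what it means to be human, and I believe my creativity comes directly from them.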


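Creativity is sparked by seeing things in a different way, so I keep an open mind about the best way to solve a problem. I always try to look at a problem from many different angles.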



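To give more young people in China the chance to encounter cutting-edge theory and practice in statistics and data science, the KingLead ARP Ivy League Research Program has invited Luke Miratrix to serve as research mentor for this subject. He will travel to China to lead students through their research projects in person.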


Mentor's research areas

Applications in the social sciences; causal analysis; data analysis; and more


Program content


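Under Professor Miratrix's personal guidance, students will complete a report or paper that approaches the standard of undergraduate research at a leading university. In addition to earning the professor's written endorsement, students taking part in an international research project can expect to strengthen a range of skills: research ability, teamwork, leadership, English communication, subject expertise, and global perspective.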


Program theme: Using data science and statistics to explore social problems


Suitable fields: statistics, data science, education


Program dates: February 2020


About the ARP program


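ARP, the Advanced Research Program, is a global, high-end, specialized flagship research program for students. Building on five years of successful service, the KingLead research program has been upgraded to the ARP Ivy League Research Program.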


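ARP was jointly launched by an expert team made up mainly of distinguished scholars from leading universities in China and abroad, senior executives of world-renowned companies, and veteran experts in the education sector. It aims to promote the cultivation of internationally minded, innovative talent and to advance social innovation and development.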


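Program value

Note:

A personal letter of recommendation from the professor is not an obligation under the professor's contract with the program.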


ARP program mentors

ARP mentors are mainly current tenured full professors at Harvard University and U.S. Ivy League schools.



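Program schedule

Course period

Winter break


Program length

2 weeks


Target participants

High school and undergraduate students

Note:

The course schedule is arranged by the mentor personally; the final plan may be revised according to actual circumstances.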


Application process



To apply, call 010-56269331.


Sign up now to do research with a Harvard mentor.




The original interview follows:


KingLead:

Could you talk about your academic career? 


Luke Miratrix:

My own education began in computer science and mathematics. As I faced my dissertation in artificial intelligence at MIT, however, I realized that I wanted to discover more before I committed myself to a particular field. By happy accident I ended up teaching computer science and mathematics in two different high schools over the course of seven years, which gave me both a deep appreciation of the complexities of education and a strong curiosity as to how education and learning worked, and how we could tell when they worked.

These questions motivated me to enroll in an education doctorate program at UC Berkeley, where I learned about education research and the major questions of the field (with a particular focus on measurement and learning). I quickly identified that the main way I could benefit the field would be to acquire a deep knowledge of the statistical methodology behind much of the research, and then work on improving this methodology. Because of this I transferred to yet a third graduate program, this time in statistics.

My dissertation work primarily focused on randomized experiments, and I also continued to work with education data. Upon graduation, I joined the statistics department at Harvard, where I developed my knowledge in causal inference methods. Harvard statistics, renowned for housing some of the great thinkers in this area, was a perfect training ground. I also began bringing my research agenda back to a more applied focus, and as part of this I obtained my current post in the HGSE. I am now in the process of truly engaging with my original agenda of improving research practices in education in order to help our collective enterprise of understanding how people learn, when education programs are effective, and how we can improve education for everyone.

In a sense, I am bilingual, speaking the languages of education and of statistics. This allows me to engage with both literatures, working with colleagues in the former to understand what the pressing problems are, and then turning to the latter to identify what tools are the closest matches. Due to my intensive statistical training, I am also equipped to take those tools and customize them for the specific types of work required in the context of education. I recognize that I am simultaneously quite advanced, yet still a beginner, in the world of education research. One reason why I want to join a diverse cadre of Spencer fellows is to continue to have active discourse with others engaged in a common larger agenda. I believe that I have a lot to offer as well as a lot to learn.

Treatment heterogeneity itself has been an ongoing interest. It began as a purely theoretical question for me, when I was a more abstract statistical methodologist just beginning my career. In collaboration with two of my students at the time (now colleagues) this interest blossomed into an array of statistical methods.

 At the onset of this work, and associated grant, I was relatively naïve with respect to how theory engages with practice.  I am now splitting my time with MDRC and Harvard, embedding myself in actual evaluation programs in order to continue to develop and understand these ties. The real-world experiences I am becoming involved with are catapulting my research forward, giving it focus and direction into the questions that truly matter. This work is the next step in engaging deeply with these questions.


KingLead:

Could you tell us about the students you are most proud of?


Luke Miratrix:

I have so many students that I'm proud of that it is very difficult to pick any given story. I have taught high school students, college students, graduate students – and I've been privileged to have taught some of the very best minds I have ever met.

When I was a high school teacher I taught several courses in computer science and programming. As part of that I wrote a textbook, "Java, bots, and you," that was used to teach my students their first Java programming course. This allowed students to work at their own pace: they were always invited to try the extension problems and keep moving forward if they were able to. One of my students, J, was so excited by this that he raced through the book at top speed, learning an incredible amount. He then started using those tools to write distributed programs that used all of the computers in the computer lab to solve "magic squares," which are a type of math problem. Watching him take the material of the course and carry it directly to his own interests, using the resources at his disposal, was a true delight.

In the introductory statistics courses that I taught in the Harvard statistics department, I would always end the semester with a final project. Students would do all sorts of amazing things for these projects. One of my favorites came from a group of students who decided to investigate how much more expensive organic food was than conventional food. They first went to several markets and drew up a list of canonical grocery items in order to generate a large data set about food prices. They then wrote a report where they investigated how these prices depended on various factors such as the type of store and whether the food was organic or not. Getting to watch students answer real questions about the world they lived in using data was a real privilege.

I also mentored undergraduate students as they wrote their senior theses. Some of these theses were truly incredible. One student, for example, worked on designing an evaluation of a relief program for Haiti after the earthquake. She paid close attention to the critical issues involved in this evaluation, and she actually spent time in Haiti working with Oxfam to ensure that the evaluation was carried through. This kind of work allows us to understand what kinds of aid programs are effective, which is an incredibly important problem, as so many aid efforts do not realize their intent.

I am so proud of all of my graduate students. The students have all generated different kinds of very important work, ranging from pure statistics to very substantive. For example, one of my students is very interested in how much we can learn from large-scale administrative data on students. In particular, if we know that some group of schools is trying out a new kind of teaching method, how do we learn that teaching method is effective?

One approach is to find a new control group of schools that are otherwise similar to our group of experimenting schools, except for the use of this teaching method. We will then compare these two groups to see if there are systematic differences in student well-being or outcomes (such as student test scores). If this kind of approach works, it can really open the door to learning a lot about different educational experiments. But does it work? My student is examining how well this method works by looking at a whole bunch of randomized controlled trials, using them as a baseline for these more comparison-based methods. It is a beautiful set of work, involving careful understanding of the data that he is using, and a deep understanding of the statistical methodology necessary for evaluating it.
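As a rough illustration of the benchmarking idea described above (not the student's actual analysis), the following Python sketch simulates synthetic schools, estimates the effect of a teaching method once with random assignment and once with a matched comparison group, and compares the two. Every number, name, and the selection rule here is a hypothetical assumption chosen only to make the logic visible.

```python
import random
import statistics

TRUE_EFFECT = 5.0  # hypothetical effect of the new method, in test-score points


def simulate_outcome(rng, treated, baseline):
    """Synthetic outcome: prior achievement + treatment effect + noise."""
    return baseline + TRUE_EFFECT * treated + rng.gauss(0, 5)


def rct_estimate(rng, n_schools=2000):
    """Benchmark: schools are assigned to the method by coin flip."""
    treated, control = [], []
    for _ in range(n_schools):
        baseline = rng.gauss(50, 10)
        if rng.random() < 0.5:
            treated.append(simulate_outcome(rng, 1, baseline))
        else:
            control.append(simulate_outcome(rng, 0, baseline))
    return statistics.mean(treated) - statistics.mean(control)


def observational_estimates(rng, n_schools=2000):
    """Assumed selection: higher-achieving schools adopt the method more often."""
    treated, control = [], []  # lists of (baseline, outcome) pairs
    for _ in range(n_schools):
        baseline = rng.gauss(50, 10)
        adopts = baseline + rng.gauss(0, 10) > 50
        record = (baseline, simulate_outcome(rng, int(adopts), baseline))
        (treated if adopts else control).append(record)

    # Naive comparison ignores that adopters started out ahead.
    naive = (statistics.mean([y for _, y in treated])
             - statistics.mean([y for _, y in control]))

    # Comparison-group estimate: pair each adopting school with the
    # non-adopting school whose baseline score is closest.
    diffs = []
    for b_t, y_t in treated:
        _, y_c = min(control, key=lambda rec: abs(rec[0] - b_t))
        diffs.append(y_t - y_c)
    matched = statistics.mean(diffs)
    return naive, matched


if __name__ == "__main__":
    rng = random.Random(0)
    benchmark = rct_estimate(rng)
    naive, matched = observational_estimates(rng)
    print(f"RCT benchmark:       {benchmark:.2f}")
    print(f"Naive comparison:    {naive:.2f}")
    print(f"Matched comparison:  {matched:.2f}")
```

In this toy setup the matched estimate should land much closer to the randomized benchmark than the naive one, but only because the simulation matches on the single variable that drives selection; whether that holds with real administrative data is exactly the open question the paragraph raises.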


KingLead:

What is your teaching philosophy?


Luke Miratrix:

My primary goal with teaching is to empower my students by giving them tools to learn about the world. Statistics and data science are the perfect areas for this. The entire point of statistics and data science is to learn from data. To achieve my goal, it is vital that my teaching is always anchored in actual problems that we are trying to solve. If we are focused on a real problem, all of the theory and programming has a concrete point. It becomes a lot easier to understand what a tool is if we are also attending to what a tool is for.

In truth, statistics and data science are really just a collection of tools for learning about the world. They are exciting tools that have so much potential because they are so flexible and can be used in so many different ways. When I teach, I try to communicate how exciting this idea is. I hope that by seeing how powerful this topic is, students will be inspired to learn as much as they can and push themselves as far as they can!

I also believe that students learn best when they are practicing rather than passively listening and taking in information. We really only understand something when we use it for some purpose. Therefore when I teach I try to have lots of hands-on activities, or group demonstrations of concepts, in order to bring all of the different concepts to life and make them deeply understood by my students. I also believe that seeing things in different ways increases one's understanding of those things. Therefore I try to explain important concepts using multiple vantage points to give students a truly multifaceted and complete understanding of the material.


KingLead:

How do you talk with your students about creativity in data science?

 

Luke Miratrix:

Data science is almost entirely about creativity! Once you have the foundational tools of data science, once you are able to take data sets and create summaries and visualizations of those data, you are then faced with the question of what summary or what visualization you should make. This is fundamentally a creative choice: to be effective one has to develop an aesthetic, and one has to write and describe clearly and effectively. This is all creativity!

The way I talk about this with my students is by having them practice this process, and then discussing the results all together. Then students will see how different students did different things in response to the same question, and that range of answers starts the conversation of how we decide how to engage with the given data set, or communicate a given idea. This creates awareness of the creative aspect of data science.

 

KingLead:

What are your achievements and innovations in data science?

 

Luke Miratrix:

I am a statistician and data scientist dedicated to improving the quality of research in education and the social sciences. I hope to achieve this end by identifying and generating methodology that is as transparent as possible while still being tailored to the various idiosyncrasies of a given problem so as to be as efficient as possible. I view my role as essentially one of service: I want to help social scientists achieve their goals both by providing tools that they can use to solve their problems and by increasing the quality of scientific discourse. I view my work as a mentor and teacher in the field of education as directly serving these goals as well.  I have primarily worked in the context of large-scale randomized experiments, where I have built approaches for extracting more information from such experiments without compromising the integrity of the experiments themselves. But I have also brought the conceptual framing I have developed in this work to other more data-science related areas, such as geographic or spatial data and text analysis. 

I am trying to advance a style of doing statistics and of thinking quantitatively. My work is built on the core principles of making assumptions explicit, understanding the importance of description, and knowing one's limitations. These principles, in my mind, directly lead to patterns of statistical thinking that have direct consequence for which methodologies are preferable.  I seek to understand when we can make claims about data, and how to make these claims using the simplest and clearest tools possible.

We can only select and use statistical tools appropriately if we understand them. Much of my work therefore investigates the scope of different statistical approaches in order to identify when they will do what they are supposed to do. Most of my theoretical work is about demonstrating how certain methods are justified by known aspects of a study, such as a random assignment mechanism or sampling mechanism, rather than by assumptions about modeling. Other aspects of my work investigate how flexible and adaptable procedures tend to work in practice. Practitioners armed with a real understanding of their tools are best equipped to identify and use these tools correctly in context.

I design these tools and then use them to analyze real problems of interest.  This work, when published and made available to others, serves as a set of templates and ideas of what people can do with their own data and their own questions. I hope that my work helps others achieve their own goals of learning about the world.


KingLead:

What led to your passion for data science?

 

Luke Miratrix:

Since I was a child, I have always loved building and creating things, I have always wanted to understand things, and I have always loved the beauty and puzzle of mathematics. This meant that when I went to college I tried all sorts of different majors: I studied literature, chemistry, physics, and finally mathematics. I found mathematics to be incredibly inspiring and interesting: seeing how simple rules and relationships can lead to deeper and deeper insight was wonderful. But mathematics was not quite enough: I also really enjoyed writing computer programs because it was so satisfying to create things that other people could use. I therefore went to graduate school in computer science, and from there got interested in teaching (it is truly a pleasure to help others learn). And my interest in teaching eventually brought me to statistics and the idea of using data to learn. It was love at first sight. Statistics and data science manage to serve all of my core passions: I am creating tools to understand the world, and these tools are rooted in the ideas of mathematics. To use the tools effectively I have to understand them deeply, which fills my desire to think deeply about things that are important. Overall, I love statistics and data science because the field is truly interdisciplinary and stimulates all the areas of my brain!


KingLead:

What are the one or two main reasons you are so creative?


Luke Miratrix:

I believe I am creative because I'm curious. I want to know about the world in all sorts of ways. I spend a lot of time thinking about what I believe to be true, what I believe to be right, and what I believe to be beautiful. All three of these things are behind what it means to be human, and I believe my creativity comes directly out of all of these things. Usually, a flash of creativity comes from seeing something in a different way, and so I try to be open-minded about what the best approach to a problem is. I always try to look at a problem from many different perspectives.


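Phone: 010-56269331

WeChat: Kinglead-edu


[Office addresses]

Beijing

Rooms 218 and 219, Tianchuang Technology Building, Zhongguancun, Haidian District

Shanghai

15th floor, HSBC Building, IFC (国金中心), Pudong New Area

Shenzhen

6th floor, Tower D, China Resources Land Building, 19 Kefa Road, Nanshan District

5th floor, Youth Activity Center, Zhongshan Park, Nanshan District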


