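Recommended Resource | Song Chen: Syllabus for "Chinese Studies in the Digital Age" (《数字时代的中国研究》), 2023
Classroom: Mingfa East Building (明法东楼), Room A301
Class time:
July 24–28, 8:30 a.m. to 12:00 noon (4 class hours per day)
July 31–August 3, 9:00 a.m. to 12:00 noon (3 class hours per day)
Intended audience:
Graduate students with a background in quantitative research in political science or public administration, or in quantitative history (limited to 15 students)
Instructor: Song Chen (陈松), Associate Professor of Chinese History, Bucknell University, USA
Teaching assistant: Yang Duancheng (杨端程), Lecturer, School of Government (政府管理学院), East China University of Political Science and Law
Song Chen (陈松) is a Chinese American scholar who holds a PhD in East Asian Languages and Civilizations from Harvard University. He serves on the Steering Committee and the Executive Committee of the China Biographical Database (CBDB) and is currently Associate Professor of Chinese History in the Department of East Asian Studies at Bucknell University. His main research interests are the social and political history of the Tang and Song periods, popular religion, and digital humanities.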
This course is designed to help students master a range of digital methods and tools useful for Chinese studies. Students will learn to make effective use of computational methods to create, organize, and analyze large quantities of historical data, and to ask and answer questions from new perspectives. Specifically, students will learn to use regular expressions to annotate texts and extract data from them, understand the basic structure of relational databases and how to query them, and apply the basic methods of spatial and network analysis in their research. They will also learn to use common software for data visualization and quantitative analysis and, in doing so, develop a critical understanding of the strengths and limits of digital tools.
The course is divided into four modules. The first module covers text annotation and data extraction. It introduces the basic concepts of regular expressions and the XML file format, as well as common annotation platforms (e.g., MARKUS). Students will learn to use regular expressions to tag named entities and entity relationships in texts, and then use Microsoft Excel and similar tools to extract structured data from the annotated texts. The second module concerns relational databases. It discusses the rules and methods for organizing relational data and compares relational databases with graph databases. The third module focuses on spatial analysis. It introduces several related projects, such as the China Historical Geographic Information System (CHGIS), Harvard WorldMap, and the Academic Map Publishing Platform at Zhejiang University. Students will learn to create maps in QGIS/ArcGIS Pro and use spatial visualization to uncover patterns of spatial distribution and spatial relationships in their data. The fourth module teaches network analysis. It introduces the history and core concepts of social network analysis. Students will learn to use software such as Gephi and UCINET to draw network graphs and analyze the structural properties of networks. Throughout all four modules, instruction in methods and tools will be interwoven with reading and discussion of recent scholarship in Chinese studies that uses them.
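The first module's workflow (tag named entities with regular expressions, store the result as XML, then extract structured data) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by any annotation platform named above: the sample sentence, the name list, and the `<person>` and `<passage>` tag names are all assumptions made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample sentence and name list, invented for illustration.
text = "王安石與司馬光同朝,蘇軾嘗從歐陽修遊。"
known_names = ["王安石", "司馬光", "蘇軾", "歐陽修"]

# One alternation pattern; longer names come first so they win over shorter ones.
pattern = re.compile(
    "|".join(sorted(map(re.escape, known_names), key=len, reverse=True))
)

def annotate(raw: str) -> str:
    """Wrap every matched name in a <person> tag and return an XML string."""
    safe = escape(raw)  # protect &, <, > before adding our own tags
    tagged = pattern.sub(lambda m: f"<person>{m.group(0)}</person>", safe)
    return f"<passage>{tagged}</passage>"

def extract_people(xml_string: str) -> list[str]:
    """Parse the annotated XML and list the tagged names in document order."""
    root = ET.fromstring(xml_string)
    return [element.text for element in root.iter("person")]

annotated = annotate(text)
people = extract_people(annotated)
print(annotated)
print(people)
```

A dictionary-based pattern like this one only finds names already on the list; real annotation work usually combines such lists with context patterns and manual correction.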
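Readings are the basis for class discussion and must be completed before class on the day they are assigned. References and software manuals (Tutorials / Training Manuals) are intended only for students who wish to study a topic in greater depth and are not required during the course. Download links for some of the software are provided as hyperlinks; to save class time, please download and install the software before class.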
WEEK 1
Monday (July 24)
Topics: Data-Driven Research and Explorations 数据导向的研究与探索; Spatial Analysis 空间分析 (I)
·Text analysis vs. entity-relationship model (ERM)
·Explore spatial and network data with Palladio (timeline; facet filters)
·Introduction to GIS (layers; raster and vector data; feature types; catalogue view; shapefile basics)
·Thematic mapping
Tools:
·Palladio (Tutorial)
·ArcGIS Pro (paid software; runs only on Windows)
·QGIS (Training Manual), if ArcGIS Pro is unavailable
Readings:
·Song Chen, “Why Humanists Should Fall in Love with ‘Big Data,’ and How?” Dissertation Reviews, March 15, 2016. https://dissertationreviews.org/why-humanists-should-fall-in-love-with-big-data-and-how/
·GIS Commons, Chapter 1 (“Introduction”)
References:
·G. William Skinner, “The Structure of Chinese History.” Journal of Asian Studies 44.2 (1985): 271–92.
·Peter K. Bol, “What is a Geographic Perspective on China’s History?” In Chinese History in Geographical Perspective (Lanham: Lexington Books, 2013), pp. 197–204.
Tuesday (July 25)
Topics: Spatial Analysis 空间分析 (II)
·Symbology (rhetorical honesty; classification; normalization) and labeling
·Import and export layers (interoperability with Harvard WorldMap and ArcGIS Online)
·Toolkit of analysis (buffer; dissolve; spatial join)
·Layout design (legend, scale, labeling; map and project packages)
Tools:
·ArcGIS Pro / QGIS (Training Manual)
Readings:
·Mark Monmonier, “Lying with Maps,” Statistical Science 20.3 (2005): 215–22.
·Peter K. Bol, “GIS, Prosopography and History,” Annals of GIS 18.1 (2012): 3–15.
·Jason Protass, “Toward a Spatial History of Chan (禅宗空间历史初探:宋灯录记载之宗派),” Review of Religion and Chinese Society 3.2 (2016): 164–88. https://doi.org/10.1163/22143955-00302003
References:
·G. William Skinner, “Marketing and Social Structure in Rural China: Part I,” Journal of Asian Studies 24.1 (1964): 3–43.
·G. William Skinner, “Marketing and Social Structure in Rural China: Part II,” Journal of Asian Studies 24.2 (1965): 195–228.
·Robert M. Hartwell, “Demographic, Political, and Social Transformations of China, 750–1550,” Harvard Journal of Asiatic Studies 42.2 (1982): 365–442.
Wednesday (July 26)
Topics: Social Network Analysis 社会网络分析 (I)
·Understanding networks and network analysis (nodes, edges, and structural properties)
·Introduction to Gephi (nodes and edges tables)
·Analyzing nodes (centrality)
·Identifying clusters (k-core and modularity)
·Preview and export
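Two of the measures listed above, degree centrality and the k-core, are simple enough to compute by hand, which helps in reading what network software reports. The sketch below uses plain Python and a hypothetical five-node edge list invented for illustration; it is not how Gephi implements these statistics, and the normalization by n - 1 is one common convention among several.

```python
from collections import defaultdict

# Hypothetical undirected ties among five actors, invented for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

def build_adjacency(edge_list):
    """Map each node to the set of its neighbors."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def degree_centrality(adjacency):
    """Degree divided by the maximum possible degree (n - 1); assumes n > 1."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}

def k_core(adjacency, k):
    """Repeatedly drop nodes with fewer than k remaining neighbors."""
    remaining = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for node in list(remaining):
            if len(remaining[node]) < k:
                for nbr in remaining[node]:
                    remaining[nbr].discard(node)
                del remaining[node]
                changed = True
    return set(remaining)

adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
core = k_core(adjacency, 2)
print(centrality)
print(core)
```

In this toy network, removing the pendant node E lowers D's degree below 2, so D is pruned as well; the iterative pruning is what distinguishes a k-core from a simple degree filter.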
Tools:
·Gephi (Tutorial)
Readings:
·Nicolas Tackett, “The Capital Elite Marriage Network,” in The Destruction of the Medieval Chinese Aristocracy (Cambridge, Mass.: Harvard University Asia Center, 2016), pp. 93–107.
·Marcus Bingenheimer, “On the Use of Historical Social Network Analysis in the Study of Chinese Buddhism: The Case of Dao’an, Huiyuan, and Kumārajīva,” Journal of the Japanese Association for Digital Humanities 5.2 (2020): 84–131.
·Scott B. Weingart, “Demystifying Networks, Parts I & II,” Journal of Digital Humanities 1.1 (2011).
References:
·Song Chen, “Governing a Multicentered Empire: Prefects and Their Networks in the 1040s and 1210s.” In State Power in China, 900–1325, eds. Patricia Buckley Ebrey and Paul Jakov Smith (Seattle: University of Washington Press, 2016), 101–52.
·Peter K. Bol, “From Kinship to Collegiality: Changing Literati Networks, 1100–1400,” Journal of Historical Network Research 5.1 (2021): 87–113. https://doi.org/10.25517/jhnr.v5i1.121
·Song Chen and Henrike Rudolph. “Beyond Relationships and Guanxi: An Introduction to the Research of Chinese Historical Networks.” Journal of Historical Network Research 5.1 (2021): iii–xxxii. https://doi.org/10.25517/jhnr.v5i1.131
Textbooks for reference:
·Prell, Christina. Social Network Analysis: History, Theory & Methodology. SAGE Publications Inc., 2012.
·刘勇、杜一,《网络数据可视化与分析利器:Gephi中文教程(全彩)》。北京:电子工业出版社,2017年。
·Cherven, Ken. Mastering Gephi Network Visualization. Birmingham: Packt Publishing, 2015.
Thursday (July 27)
Topics: Data Extraction and Organization 数据提取与编排
·Regular expressions 正则表达式
·XML markup 文本标注与XML文档
·Relational databases 关系型数据库
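The idea behind a relational database, storing each kind of entity once in its own table and linking tables through shared keys, can be tried out with Python's built-in `sqlite3` module. The sketch below is a toy example under stated assumptions: the table names, column names, and rows are hypothetical and do not reproduce the schema of CBDB or any other database mentioned in this syllabus.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: people, places, and a linking table of office postings.
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE place  (place_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE posting (
    person_id INTEGER REFERENCES person(person_id),
    place_id  INTEGER REFERENCES place(place_id),
    year      INTEGER
);
""")

# Invented sample rows; each person and place is stored exactly once.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Person A"), (2, "Person B")])
conn.executemany("INSERT INTO place VALUES (?, ?)",
                 [(10, "Prefecture X"), (20, "Prefecture Y")])
conn.executemany("INSERT INTO posting VALUES (?, ?, ?)",
                 [(1, 10, 1045), (1, 20, 1050), (2, 10, 1048)])

# Join the three tables to ask: who served in Prefecture X, and when?
rows = conn.execute("""
    SELECT person.name, place.name, posting.year
    FROM posting
    JOIN person ON person.person_id = posting.person_id
    JOIN place  ON place.place_id  = posting.place_id
    WHERE place.name = ?
    ORDER BY posting.year
""", ("Prefecture X",)).fetchall()
print(rows)
```

Because postings refer to people and places only by ID, correcting a name in the `person` or `place` table updates every query result that uses it, which is the main practical benefit of this layout over a single flat spreadsheet.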
Readings:
·徐力恒、王宏苏,〈从历史记录到结构化人物传记数据:中文材料的半自动处理方式〉,《大数据与中国历史研究》第3辑。
·David J. Birnbaum, “What is XML and why should humanists care?”
·Susan Schreibman, Ray Siemens, and John Unsworth, eds., A Companion to Digital Humanities (Oxford: Blackwell, 2004), Ch. 15: “Databases” (only the sections before “Schema Design” are required).
Tools:
·EmEditor (free and portable edition)
·MARKUS (Tutorial)
·DocuSky, a digital humanities research platform (for reference only)
References:
·RegEx One (online lessons on regular expressions)
·彭维谦、程卉、陈诗沛,〈从全文到表格:地方志职官志中职官数据之半自动撷取〉,《数字典藏与数字人文期刊》2018年第1期,第79–125页。
·包弼德(Peter K. Bol)、王宏苏、傅君劢(Michael A. Fuller)、陈松、柳舟、朱厚权,〈中国历代人物传记数据库(CBDB)的历史、方法与未来〉,《数字人文研究》 2021年第1期,第21–33页。
Friday (July 28)
Topics: Social Network Analysis 社会网络分析 (II) & Project Consultation
·Networks as graph-theoretical representations (the varieties of networks)
·Directed networks (HITS; PageRank; in- and out-degrees)
·Two-mode networks (Multimodal transformations)
·More on Gephi (filters and workspaces; animate a network)
·Consultation on student projects
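The two-mode transformation listed among the topics above turns a person-by-group network into a person-by-person network in which two people are tied once for every group they share. The following sketch shows that projection in plain Python; the membership data are hypothetical, and the code is not the routine Gephi's plugins use.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical two-mode data: each group lists its members.
memberships = {
    "Club 1": ["Ann", "Ben", "Cai"],
    "Club 2": ["Ben", "Cai"],
    "Club 3": ["Ann", "Dee"],
}

def project_to_people(groups):
    """Return {(person, person): number of shared groups} for every pair."""
    weights = defaultdict(int)
    for members in groups.values():
        # Sort so each pair has one canonical order; set() drops duplicates.
        for a, b in combinations(sorted(set(members)), 2):
            weights[(a, b)] += 1
    return dict(weights)

co_membership = project_to_people(memberships)
print(co_membership)
```

The resulting weights can be exported as an edge table with source, target, and weight columns. Note that projection discards information: the one-mode network no longer records which group created a given tie.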
Tools:
·Gephi
Readings:
·秦颖,〈《唐语林》中对话网络的可视化和统计分析初探〉,《数字人文》2022年第1期,第53–86页。
·Kieran Healy, “Using Metadata to Find Paul Revere”
·Anne S. Chao, Zhandong Liu, and Qiwei Li, “Network of Words: A Co-Occurrence Analysis of Nation-Building Terms in the Writings of Liang Qichao and Chen Duxiu,” Journal of Historical Network Research 5.1 (2021): 154–86. https://doi.org/10.25517/jhnr.v5i1.122
References:
·陈松、赵薇,〈从隐喻到模型——作为研究和批评路径的网络分析〉,《数字人文》2022年第1期,第1–13页。
·Henrike Rudolph, “Structures of Empowerment: A Network Exploration of the Collective Biographies of Women Activists in Twentieth-Century China,” in Elites, Knowledge, and Power in Modern China: The Formation and Transformation of Elites in Modern China, eds. Christian Henriot, Sun Huei-min and Cécile Armand (Brill, 2022), pp. 113–44. https://doi.org/10.1163/9789004520479_006
·Hilde De Weerdt, et al., “Is There a Faction in This List?” Journal of Chinese History 4.2 (2020): 347–89. https://doi.org/10.1017/jch.2020.16
·Maciej Patryk Kurzynski, “On the Technology of the Sublime in Modern Chinese Narratives,” Shuzi renwen 数字人文 5.1 (2022): 87–115.
·赵薇,〈社会网络分析与「《大波》三部曲」的人物功能〉,《山东社会科学》2018年第9期,第50–64页。
WEEK 2
Monday (July 31)
Topics: Social Network Analysis 社会网络分析 (III)
·Networks as matrices
·Prepare data for UCINET (DL editor)
·Blockmodels
·Core-periphery analysis
·Equivalence (positional analysis)
Tools:
·UCINET (paid software with a 90-day free trial; please do not download and install it in advance. It runs only on Windows.)
Readings:
·David Snyder and Edward L. Kick, “Structural Position in the World System and Economic Growth, 1955–1970: A Multiple-Network Analysis of Transnational Interactions,” The American Journal of Sociology 84.5 (1979): 1096–126. https://doi.org/10.1086/226902
References:
·John F. Padgett and Christopher K. Ansell, “Robust Action and the Rise of the Medici, 1400–1434,” The American Journal of Sociology 98.6 (1993): 1259–319. https://doi.org/10.1086/230190
·Roger V. Gould, “Multiple Networks and Mobilization in the Paris Commune, 1871,” American Sociological Review 56.6 (1991): 716–29. https://doi.org/10.2307/2096251
Textbooks for reference:
·Borgatti, Stephen P., Martin G. Everett, and Jeffrey C. Johnson. Analyzing Social Networks. SAGE Publications Inc., 2013. For related online learning resources, see https://study.sagepub.com/borgatti2e.
·Hanneman, Robert A., and Mark Riddle. Introduction to Social Network Methods. http://faculty.ucr.edu/~hanneman/nettext/ (online textbook)
·刘军,《社会网络分析导论》,北京:社会科学文献出版社,2004年。
·刘军,《整体网分析:UCINET软件实用指南》(第2版),上海:格致出版社、上海人民出版社,2014年。
Tuesday (August 1)
Topics: Spatial Analysis 空间分析 (III); Networks and Space 网络与空间
·Georeferencing
·Projecting networks to space (Geo Layout; Export to Earth)
·Networks of space
·Position/collapse nodes by attribute (NetDraw)
Tools:
·Palladio; Gephi; UCINET; NetDraw
·ArcGIS Pro/QGIS
·Google Earth (runs in the browser; no download or installation required) or Google Earth Pro on Desktop
Readings:
·Søren Michael Sindbæk, “The Small World of the Vikings: Networks in Early Medieval Communication and Exchange,” Norwegian Archaeological Review 40.1 (2007): 59–74. https://doi.org/10.1080/00293650701327619.
·马昭仪、何捷、刘帅帅,〈从唐小说中的空间交互看都城长安的社会感知变迁〉,《数字人文》2022年第1期,第28–52页。
References:
·Song Chen, “Writing for Local Government Schools: Authors and Themes in Song-dynasty School Inscriptions,” Journal of Chinese History 4.2 (2020): 305–46. https://doi.org/10.1017/jch.2020.11 (Free access link). Chinese translation with revision: 陈松,〈为学作记——从网络分析和文本分析视角看宋代地方官学碑记的作者与主题〉,《数字人文》2021年第1期,第24–72页。
Wednesday (August 2)
Topics: Research Replication: From Data Cleaning to Data Analysis
·Replication of Chen’s “Patterns of Integration”
·Consultation on student projects
Readings:
·Song Chen, “Patterns of Integration: A Network Perspective on Popular Religious Connections in China’s Lower Yangzi, 1150–1350,” Religions 14.5 (2023): 577. https://doi.org/10.3390/rel14050577
Thursday (August 3)
Topics: Student Presentations