
12G Dataset | 230,000 Kickstarter Project Records

大邓 大邓和他的Python
2024-09-09


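About Kickstarter

Kickstarter was founded in New York, USA, in April 2009. It is a crowdfunding website that raises funds for ventures with creative ideas.

The platform works in a simple but effective way: on one side are people with creative ideas who want to make and build things, and on the other are people willing to fund them and then watch new inventions, works, and products come to life. Creative activity on Kickstarter spans music, web design, graphic design, animation, writing, and any other activity that can create something and influence others.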



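The 12G dataset

The Kickstarter crawler was written in March 2016 and has run once a month since. As of November 2022, the compressed files add up to 11.42G. See the end of this article for how to get the data.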


Reference papers

The dataset has research value for marketing, innovation and entrepreneurship, information management, and related fields. Some papers that use Kickstarter as their research object:

  • 王伟, 陈伟, 祝效国, 王洪伟. 众筹融资成功率与语言风格的说服性: 基于Kickstarter的实证研究. 管理世界, 2016(5): 81-98. (In Chinese; roughly "Crowdfunding success rate and the persuasiveness of linguistic style: an empirical study based on Kickstarter", Management World.)
  • Dai, Hengchen and Dennis J. Zhang. "Prosocial Goal Pursuit in Crowdfunding: Evidence from Kickstarter." Journal of Marketing Research 56 (2019): 498-517.
  • Gafni, H., Marom, D.M., Robb, A.M., & Sade, O. (2020). Gender Dynamics in Crowdfunding (Kickstarter): Evidence on Entrepreneurs, Backers, and Taste-Based Discrimination. Review of Finance.
  • Jensen, Lasse Skovgaard and Ali Gürcan Özkil. "Identifying challenges in crowdfunded product development: a review of Kickstarter projects." Design Science 4 (2018): n. pag.

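Inspecting the data

Unzipping any one of the zip files yields a JSON file. Note that the JSON files are not all structured the same way, so the code in this article may need adjusting.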

import pandas as pd

# Read the JSON file extracted from any one of the zip archives (one JSON object per line)
df = pd.read_json('data/Kickstarter_2022-06-09T03_20_03_365Z.json', lines=True)

df.head()

Run


len(df)

Run

    230346

# Select the data column, which holds the project records
projects = df['data']
projects

Run

    0         {'id': 947118202, 'photo': {'key': 'assets/029...
    1         {'id': 426094497, 'photo': {'key': 'assets/029...
    2         {'id': 44835253, 'photo': {'key': 'assets/034/...
    3         {'id': 1001767271, 'photo': {'key': 'assets/03...
    4         {'id': 1880345176, 'photo': {'key': 'assets/03...
                                    ...                        
    230341    {'id': 676753351, 'photo': {'key': 'assets/012...
    230342    {'id': 1579378115, 'photo': {'key': 'assets/02...
    230343    {'id': 1281094926, 'photo': {'key': 'assets/02...
    230344    {'id': 783009016, 'photo': {'key': 'assets/012...
    230345    {'id': 324368296, 'photo': {'key': 'assets/012...
    Name: data, Length: 230346, dtype: object
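
Because every entry of the data column is a nested dict, it can be convenient to flatten the records into ordinary columns. The following is only a sketch using `pandas.json_normalize` on two tiny hand-made records that imitate the structure above. The first reuses a few values from the first project in this article; the second is invented purely for illustration. On the real data one would presumably pass `df['data'].tolist()` instead of `records`.

```python
import pandas as pd

# Two small records imitating the nested structure of the `data` column.
# The second record is hypothetical and deliberately lacks `parent_name`.
records = [
    {"id": 947118202, "name": "Paint Rogue", "goal": 5000,
     "category": {"name": "Video Games", "parent_name": "Games"}},
    {"id": 1, "name": "Hypothetical project", "goal": 100,
     "category": {"name": "Rock"}},
]

# json_normalize expands nested dicts into dotted column names;
# keys missing from a record become NaN in that row.
flat = pd.json_normalize(records)

print(sorted(flat.columns))
print(flat.loc[0, "category.parent_name"])
```

Since the JSON files differ from one another, the set of columns produced this way may also differ from file to file.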

# Look at the data column of the first row
df.loc[0, 'data']

Run

    {'id': 947118202,
     'photo': {'key': 'assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png',
      'full': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=560&h=315&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=26209d432871ad2e9cca642527c291d9',
      'ed': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=352&h=198&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=db82255e6639d5951506e0f2ed4d7d8b',
      'med': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=272&h=153&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=f7b43116136000c8efa892bdbdd2d956',
      'little': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=208&h=117&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=a52e3c34066a020e040c517c614a8b36',
      'small': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=160&h=90&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=6c5f1c254119ffe914b50250f8e2899f',
      'thumb': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=48&h=27&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=0222f379ed51059eb73adc7436f07b1e',
      '1024x576': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=1024&h=576&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=d01546c5e88f3f47e0dddc48b5dce9df',
      '1536x864': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=1552&h=873&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=f47505da909a374642906e6d418474e7'},
     'name': 'Paint Rogue',
     'blurb': 'Roguelike | Platformer | Shooter',
     'goal': 5000,
     'pledged': 5268.22,
     'state': 'successful',
     'slug': 'paint-rogue',
     'disable_communication': False,
     'country': 'AU',
     'country_displayable_name': 'Australia',
     'currency': 'AUD',
     'currency_symbol': '$',
     'currency_trailing_code': True,
     'deadline': 1594247312,
     'state_changed_at': 1594247312,
     'created_at': 1591152439,
     'launched_at': 1591655312,
     'staff_pick': False,
     'is_starrable': False,
     'backers_count': 42,
     'static_usd_rate': 0.69681992,
     'usd_pledged': '3671.0006389424',
     'converted_pledged_amount': 3657,
     'fx_rate': 0.7200616400000001,
     'usd_exchange_rate': 0.69423473,
     'current_currency': 'USD',
     'usd_type': 'international',
     'creator': {'id': 1018782761,
      'name': 'Andrew Von Stieglitz',
      'is_registered': None,
      'is_email_verified': None,
      'chosen_currency': None,
      'is_superbacker': None,
      'avatar': {'thumb': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=40&h=40&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=bf4ce960e83b57310b93c40dda68e213',
       'small': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=80&h=80&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=a862ab30490c90cd08186f448884142d',
       'medium': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=160&h=160&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=38923ac11699d68a7aae93ce126b97b6'},
      'urls': {'web': {'user': 'https://www.kickstarter.com/profile/1018782761'},
       'api': {'user': 'https://api.kickstarter.com/v1/users/1018782761?signature=1654832212.14f9df54b2643f080ad98cacb07314f94757d9c1'}}},
     'location': {'id': 1105779,
      'name': 'Sydney',
      'slug': 'sydney-au',
      'short_name': 'Sydney, AU',
      'displayable_name': 'Sydney, AU',
      'localized_name': 'Sydney',
      'country': 'AU',
      'state': 'NSW',
      'type': 'Town',
      'is_root': False,
      'expanded_country': 'Australia',
      'urls': {'web': {'discover': 'https://www.kickstarter.com/discover/places/sydney-au',
        'location': 'https://www.kickstarter.com/locations/sydney-au'},
       'api': {'nearby_projects': 'https://api.kickstarter.com/v1/discover?signature=1654814982.2fcf49a7b611d4414d14b1dbe41ac53623192e6a&woe_id=1105779'}}},
     'category': {'id': 35,
      'name': 'Video Games',
      'analytics_name': 'Video Games',
      'slug': 'games/video games',
      'position': 7,
      'parent_id': 12,
      'parent_name': 'Games',
      'color': 51627,
      'urls': {'web': {'discover': 'http://www.kickstarter.com/discover/categories/games/video%20games'}}},
     'profile': {'id': 4007060,
      'project_id': 4007060,
      'state': 'active',
      'state_changed_at': 1594267960,
      'name': 'Paint Rogue',
      'blurb': 'Roguelike | Platformer | Shooter',
      'background_color': '',
      'text_color': 'ffffff',
      'link_background_color': '',
      'link_text_color': '',
      'link_text': 'Follow along!',
      'link_url': 'https://www.kickstarter.com/projects/1018782761/paint-rogue/',
      'show_feature_image': True,
      'background_image_opacity': 0.5700000000000001,
      'background_image_attributes': {'id': 29758105,
       'image_urls': {'default': 'https://ksr-ugc.imgix.net/assets/029/758/105/971e42c0e19ca75fbae0943aa874c3c2_original.png?ixlib=rb-4.0.2&w=1600&fit=max&v=1594267934&auto=format&frame=1&q=92&s=b91907c9e125e206a11a1bcef322c142',
        'baseball_card': 'https://ksr-ugc.imgix.net/assets/029/758/105/971e42c0e19ca75fbae0943aa874c3c2_original.png?ixlib=rb-4.0.2&w=460&fit=max&v=1594267934&auto=format&frame=1&q=92&s=e7af2286d1f74a51672fbb6060ad43c8'}},
      'should_show_feature_image_section': False,
      'feature_image_attributes': {'image_urls': {'default': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=1552&h=873&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=f47505da909a374642906e6d418474e7',
        'baseball_card': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=560&h=315&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=26209d432871ad2e9cca642527c291d9'}}},
     'spotlight': True,
     'urls': {'web': {'project': 'https://www.kickstarter.com/projects/1018782761/paint-rogue?ref=discovery_category_newest',
       'rewards': 'https://www.kickstarter.com/projects/1018782761/paint-rogue/rewards'}},
     'source_url': 'https://www.kickstarter.com/discover/categories/games/video%20games'}



Fields

Taking the first record as an example, look at the fields contained in each crowdfunding project record:

df.loc[0, 'data'].keys()

Run (the # comments in the output are field explanations added afterwards)

    dict_keys([
    'id', 'photo',  # id, image links
    'name', 'blurb',  # project name
    'goal',   # funding goal amount
    'pledged',
    'state',  # project state
    'slug',
    'disable_communication',
    'country', 'country_displayable_name',   # country
    'currency', 'currency_symbol', 'currency_trailing_code',  # currency
    'deadline', 'state_changed_at',  # funding deadline (timestamp format)
    'created_at',  # project creation time (timestamp format)
    'launched_at',  # project launch time (timestamp format)
    'staff_pick', 'is_starrable',
    'backers_count',  # number of backers
    'static_usd_rate', 'usd_pledged', 'converted_pledged_amount', 'fx_rate', 'usd_exchange_rate', 'current_currency', 'usd_type',
    'creator',  # project creator information
    'location',  # location
    'category',  # category the project belongs to
    'profile',  # basic project information
    'spotlight',
    'urls',  # project links
    'source_url'])

Again using the first record, look at these fields one by one.

# Project name
print('Project name', df.loc[0, 'data']['name'], end='\n\n')

print('Project links\n', df.loc[0, 'data']['urls'], end='\n\n')

# Funding goal of the project
print('Goal: {goal}{currency}'.format(goal=df.loc[0, 'data']['goal'],
                                      currency=df.loc[0, 'data']['currency']))

Run

    Project name Paint Rogue
    
    Project links
     {'web': {'project': 'https://www.kickstarter.com/projects/1018782761/paint-rogue?ref=discovery_category_newest', 'rewards': 'https://www.kickstarter.com/projects/1018782761/paint-rogue/rewards'}}
    
    Goal: 5000AUD
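
Goals are stored in each project's own currency (AUD here), so comparing projects usually means converting to one currency first. In the first record, `pledged` × `static_usd_rate` reproduces the stored `usd_pledged` value, which suggests `static_usd_rate` can be used as the conversion factor. That relationship is inferred from this single record, not from any official documentation, so treat the sketch below as an assumption to verify on more rows. The helper name `to_usd` is hypothetical.

```python
# Values copied from the first record shown in this article.
record = {
    "goal": 5000,
    "pledged": 5268.22,
    "currency": "AUD",
    "static_usd_rate": 0.69681992,
    "usd_pledged": "3671.0006389424",  # stored as a string in the data
}

def to_usd(amount, record):
    """Convert an amount in the project's currency to USD.

    Assumption: static_usd_rate is the multiplier to apply; this is
    inferred from one record only and should be checked on more data.
    """
    return amount * record["static_usd_rate"]

goal_usd = to_usd(record["goal"], record)
pledged_usd = to_usd(record["pledged"], record)

print(f"goal: {goal_usd:.2f} USD")
print(f"pledged: {pledged_usd:.2f} USD")
# The funded ratio does not depend on the currency.
print(f"funded ratio: {record['pledged'] / record['goal']:.2%}")
```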

# Project creator information
print('Creator info\n', df.loc[0, 'data']['creator'], end='\n\n')

print('Project profile\n', df.loc[0, 'data']['profile'], end='\n\n')

# Project location
print('Location: ', df.loc[0, 'data']['location'], end='\n\n')

# Project currency
print('Currency:', df.loc[0, 'data']['currency'], end='\n\n')

# Country of the project
print('Country: ', df.loc[0, 'data']['country_displayable_name'], end='\n\n')

Run

    Creator info
     {'id': 1018782761, 'name': 'Andrew Von Stieglitz', 'is_registered': None, 'is_email_verified': None, 'chosen_currency': None, 'is_superbacker': None, 'avatar': {'thumb': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=40&h=40&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=bf4ce960e83b57310b93c40dda68e213', 'small': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=80&h=80&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=a862ab30490c90cd08186f448884142d', 'medium': 'https://ksr-ugc.imgix.net/assets/024/628/691/39e3fdc8db723302f544f7161e32c4b7_original.png?ixlib=rb-4.0.2&w=160&h=160&fit=crop&v=1554207776&auto=format&frame=1&q=92&s=38923ac11699d68a7aae93ce126b97b6'}, 'urls': {'web': {'user': 'https://www.kickstarter.com/profile/1018782761'}, 'api': {'user': 'https://api.kickstarter.com/v1/users/1018782761?signature=1654832212.14f9df54b2643f080ad98cacb07314f94757d9c1'}}}
    
    Project profile
     {'id': 4007060, 'project_id': 4007060, 'state': 'active', 'state_changed_at': 1594267960, 'name': 'Paint Rogue', 'blurb': 'Roguelike | Platformer | Shooter', 'background_color': '', 'text_color': 'ffffff', 'link_background_color': '', 'link_text_color': '', 'link_text': 'Follow along!', 'link_url': 'https://www.kickstarter.com/projects/1018782761/paint-rogue/', 'show_feature_image': True, 'background_image_opacity': 0.5700000000000001, 'background_image_attributes': {'id': 29758105, 'image_urls': {'default': 'https://ksr-ugc.imgix.net/assets/029/758/105/971e42c0e19ca75fbae0943aa874c3c2_original.png?ixlib=rb-4.0.2&w=1600&fit=max&v=1594267934&auto=format&frame=1&q=92&s=b91907c9e125e206a11a1bcef322c142', 'baseball_card': 'https://ksr-ugc.imgix.net/assets/029/758/105/971e42c0e19ca75fbae0943aa874c3c2_original.png?ixlib=rb-4.0.2&w=460&fit=max&v=1594267934&auto=format&frame=1&q=92&s=e7af2286d1f74a51672fbb6060ad43c8'}}, 'should_show_feature_image_section': False, 'feature_image_attributes': {'image_urls': {'default': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=1552&h=873&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=f47505da909a374642906e6d418474e7', 'baseball_card': 'https://ksr-ugc.imgix.net/assets/029/542/393/098b025d0c25cc15e5b5f673b9ec992a_original.png?ixlib=rb-4.0.2&crop=faces&w=560&h=315&fit=crop&v=1592675400&auto=format&frame=1&q=92&s=26209d432871ad2e9cca642527c291d9'}}}
    
    Location:  {'id': 1105779, 'name': 'Sydney', 'slug': 'sydney-au', 'short_name': 'Sydney, AU', 'displayable_name': 'Sydney, AU', 'localized_name': 'Sydney', 'country': 'AU', 'state': 'NSW', 'type': 'Town', 'is_root': False, 'expanded_country': 'Australia', 'urls': {'web': {'discover': 'https://www.kickstarter.com/discover/places/sydney-au', 'location': 'https://www.kickstarter.com/locations/sydney-au'}, 'api': {'nearby_projects': 'https://api.kickstarter.com/v1/discover?signature=1654814982.2fcf49a7b611d4414d14b1dbe41ac53623192e6a&woe_id=1105779'}}}
    
    Currency: AUD
    
    Country:  Australia

# Project creation time
print('Created at: ', df.loc[0, 'data']['created_at'])

# Project launch time
print('Launched at: ', df.loc[0, 'data']['launched_at'])

# Project deadline
print('Deadline: ', df.loc[0, 'data']['deadline'])

Run

    Created at:  1591152439
    Launched at:  1591655312
    Deadline:  1594247312

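Converting timestamps to dates

1591152439 is a Unix timestamp: it represents a moment as the number of seconds elapsed since 00:00:00 UTC on January 1, 1970.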

# Convert a timestamp to a date string

import datetime

def timestamp2str(timestamp):
    # fromtimestamp() returns local time, so the result depends on the
    # machine's timezone; the output below was produced on a UTC+8 machine.
    d = datetime.datetime.fromtimestamp(timestamp)
    return '{year}-{month}-{day} {hour}:{minute}:{second}'.format(year=d.year,
                                                                 month=d.month,
                                                                 day=d.day,
                                                                 hour=d.hour,
                                                                 minute=d.minute,
                                                                 second=d.second)

print('Created at', timestamp2str(1591152439))
print('Launched at', timestamp2str(1591655312))
print('Deadline', timestamp2str(1594247312))

Run

    Created at 2020-6-3 10:47:19
    Launched at 2020-6-9 6:28:32
    Deadline 2020-7-9 6:28:32
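
When there are many records, converting one timestamp at a time becomes tedious. As an alternative sketch (not the method used above), `pandas.to_datetime` with `unit="s"` converts a whole column at once, and `utc=True` keeps the result independent of the machine's timezone. The small DataFrame `times` is built by hand here from the three timestamps of the first project.

```python
import pandas as pd

# Timestamps of the first project, copied from the output in this article.
times = pd.DataFrame({
    "created_at": [1591152439],
    "launched_at": [1591655312],
    "deadline": [1594247312],
})

# unit="s" treats the integers as seconds since the Unix epoch;
# utc=True yields timezone-aware UTC datetimes.
for col in ["created_at", "launched_at", "deadline"]:
    times[col] = pd.to_datetime(times[col], unit="s", utc=True)

# Campaign length as a Timedelta (deadline minus launch time).
times["duration"] = times["deadline"] - times["launched_at"]

print(times.loc[0, "created_at"])
print(times.loc[0, "duration"])
```

The values come out in UTC, eight hours behind the local-time strings printed earlier; something like `times["created_at"].dt.tz_convert("Asia/Shanghai")` should bring them back to UTC+8 if that is preferred.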

# Category the project belongs to
print('Category:', df.loc[0, 'data']['category'], end='\n\n')

# Root URL of the category listing
print('Category source URL:', df.loc[0, 'data']['source_url'])

Run

    Category: {'id': 35, 'name': 'Video Games', 'analytics_name': 'Video Games', 'slug': 'games/video games', 'position': 7, 'parent_id': 12, 'parent_name': 'Games', 'color': 51627, 'urls': {'web': {'discover': 'http://www.kickstarter.com/discover/categories/games/video%20games'}}}
    
    Category source URL: https://www.kickstarter.com/discover/categories/games/video%20games
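
Because the JSON files are not all alike, chained lookups such as `record['category']['parent_name']` may raise a `KeyError` on some records. One way to guard against that is a small helper that walks nested dicts and falls back to a default. `get_nested` below is a hypothetical helper written for this illustration, not part of pandas or of the dataset, and the record is a trimmed copy of the first project's category field.

```python
def get_nested(record, *keys, default=None):
    """Follow `keys` through nested dicts; return `default` if any step is missing."""
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Trimmed copy of the first record's category field from this article.
record = {
    "category": {
        "id": 35,
        "name": "Video Games",
        "parent_name": "Games",
        "urls": {"web": {"discover": "http://www.kickstarter.com/discover/categories/games/video%20games"}},
    }
}

print(get_nested(record, "category", "parent_name"))
print(get_nested(record, "category", "urls", "web", "discover"))
# This trimmed record has no `location` key, so the default is returned.
print(get_nested(record, "location", "country", default="unknown"))
```

On the full DataFrame, the same helper could be applied row by row, for example `df['data'].apply(lambda r: get_nested(r, 'category', 'parent_name'))`.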

How to get the data

Share this article to your WeChat Moments, collect 50+ likes, then add WeChat 372335839 with the note "Name-School-Major-Kickstarter".
