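Downloading SRA files is no longer as simple as running prefetch
I recently downloaded the data for a paper and found that 3 runs gave 3 different outcomes: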
$cat logs/down.log.4
$cat need.sra.list |while read id;do ( ~/biosoft/sratoolkit/sratoolkit.2.9.2-centos_linux64/bin/prefetch $id -O ./ );done
2019-10-03T02:25:45 prefetch.2.9.2: 1) Downloading 'SRR5908780'...
2019-10-03T02:25:45 prefetch.2.9.2: Downloading via https...
2019-10-03T02:32:55 prefetch.2.9.2 sys: timeout exhausted while reading file within network system module - mbedtls_ssl_read returned -76 ( NET - Reading information from the socket failed )
2019-10-03T02:32:55 prefetch.2.9.2 int: timeout exhausted while reading file within network system module - Cannot KStreamRead: https://sra-download.ncbi.nlm.nih.gov/traces/sra51/SRR/005770/SRR5908780
2019-10-03T02:32:55 prefetch.2.9.2: 1) failed to download SRR5908780
2019-10-03T02:32:58 prefetch.2.9.2: 1) Downloading 'SRR5908829'...
2019-10-03T02:32:58 prefetch.2.9.2: Downloading via https...
2019-10-03T02:51:48 prefetch.2.9.2: 1) 'SRR5908829' was downloaded successfully
2019-10-03T02:53:08 prefetch.2.9.2: 'SRR5908829' has 0 unresolved dependencies
2019-10-03T02:53:11 prefetch.2.9.2: 'SRR5908829' has remote vdbcache
2019-10-03T02:53:11 prefetch.2.9.2: Downloading vdbcache...
2019-10-03T02:53:11 prefetch.2.9.2: Downloading via https...
2019-10-03T02:55:07 prefetch.2.9.2: vdbcache was downloaded successfully
2019-10-03T02:55:10 prefetch.2.9.2: 1) Downloading 'SRR5908833'...
2019-10-03T02:55:10 prefetch.2.9.2: Downloading via https...
2019-10-03T03:14:08 prefetch.2.9.2: 1) 'SRR5908833' was downloaded successfully
2019-10-03T03:15:31 prefetch.2.9.2: 'SRR5908833' has 0 unresolved dependencies
2019-10-03T03:15:40 prefetch.2.9.2: 'SRR5908833' has no remote vdbcache
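With a longer accession list it is tedious to read the log by eye. Here is a minimal sketch for pulling the failed accessions out of a prefetch log. It assumes the failure lines look like the `failed to download SRR...` lines above; the function name `failed_ids` is my own and is not part of sratoolkit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list accessions that prefetch reported as failed.
# Assumes log lines of the form seen above, e.g.
#   "... prefetch.2.9.2: 1) failed to download SRR5908780"
failed_ids() {
    local log="$1"
    # Keep only the accession after "failed to download", de-duplicated.
    sed -n 's/.*failed to download \([A-Za-z0-9]*\).*/\1/p' "$log" | sort -u
}
```

For example, `failed_ids logs/down.log.4 > retry.sra.list` would give a list that can be fed back into the same `while read id` loop.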
Of these, SRR5908780 is a confirmed failure; rerunning it a few hours later gave the same error:
2019-10-03T05:17:48 prefetch.2.9.2: 1) Downloading 'SRR5908780'...
2019-10-03T05:17:48 prefetch.2.9.2: Downloading via https...
2019-10-03T05:23:25 prefetch.2.9.2 sys: timeout exhausted while reading file within network system module - mbedtls_ssl_read returned -76 ( NET - Reading information from the socket failed )
2019-10-03T05:23:25 prefetch.2.9.2 int: timeout exhausted while reading file within network system module - Cannot KStreamRead: https://sra-download.ncbi.nlm.nih.gov/traces/sra51/SRR/005770/SRR5908780
2019-10-03T05:23:25 prefetch.2.9.2: 1) failed to download SRR5908780
Yet when I paste this URL into a browser, the file downloads fine:
https://sra-download.ncbi.nlm.nih.gov/traces/sra51/SRR/005770/SRR5908780
The speed was decent too.
I cannot understand why the download fails from the command line!
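Since the URL works in a browser, one workaround is to fetch it directly from the command line with resume support. This is only a sketch: it assumes the file served at that URL is the same `.sra` file prefetch would have saved, and the function names `sra_name_from_url` and `fetch_sra_url` are made up for illustration. It also does not fetch the vdbcache that prefetch downloads for some runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the local file name from the download URL
# (last path segment plus ".sra").
sra_name_from_url() {
    printf '%s.sra\n' "${1##*/}"
}

# Hypothetical helper: download the URL with curl, resuming a partial file.
fetch_sra_url() {
    local url="$1"
    local out
    out="$(sra_name_from_url "$url")"
    # --fail: treat HTTP errors as failures; --location: follow redirects;
    # --retry 5: retry transient errors; --continue-at -: resume if $out exists.
    curl --fail --location --retry 5 --continue-at - --output "$out" "$url"
}
```

Calling `fetch_sra_url https://sra-download.ncbi.nlm.nih.gov/traces/sra51/SRR/005770/SRR5908780` would write `SRR5908780.sra` to the current directory. The server path is taken from the log above and may differ for other runs.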
Strangely, though, the very same command succeeded the next day:
$~/biosoft/sratoolkit/sratoolkit.2.9.2-centos_linux64/bin/prefetch SRR5908780 -O ./
2019-10-04T01:22:05 prefetch.2.9.2: 1) Downloading 'SRR5908780'...
2019-10-04T01:22:05 prefetch.2.9.2: Downloading via https...
2019-10-04T01:41:19 prefetch.2.9.2: 1) 'SRR5908780' was downloaded successfully
2019-10-04T01:42:37 prefetch.2.9.2: 'SRR5908780' has 0 unresolved dependencies
2019-10-04T01:42:40 prefetch.2.9.2: 'SRR5908780' has remote vdbcache
2019-10-04T01:42:40 prefetch.2.9.2: Downloading vdbcache...
2019-10-04T01:42:40 prefetch.2.9.2: Downloading via https...
2019-10-04T01:44:38 prefetch.2.9.2: vdbcache was downloaded successfully
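Since the same command failed one day and worked the next, the failure looks transient, so simply retrying may be enough. Below is a minimal retry wrapper as a sketch. `retry` is a hypothetical helper of my own, and it assumes prefetch returns a non-zero exit status when a download fails.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <max> times, waiting <delay>
# seconds between attempts. Usage: retry <max> <delay> <command> [args...]
retry() {
    local max="$1" delay="$2" n=1
    shift 2
    # Assumes the command signals failure through a non-zero exit status.
    until "$@"; do
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}
```

For example, `retry 5 600 ~/biosoft/sratoolkit/sratoolkit.2.9.2-centos_linux64/bin/prefetch SRR5908780 -O ./` would try up to five times, ten minutes apart.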
The resulting files:
3.8G Oct 4 09:41 SRR5908780.sra
3.7G Oct 3 10:51 SRR5908829.sra
3.5G Oct 3 11:14 SRR5908833.sra