Biotechnology Bulletin, 2024, Vol. 40, Issue 10: 221-232. doi: 10.13560/j.cnki.biotech.bull.1985.2024-0402
WANG Shang1, FENG Kai1, LI Tong1,2, WANG Jie3, GU Song-song1,2, YANG Xing-sheng1,2, LI Chun-ge1, DENG Ye1,2
Received: 2024-04-26
Online: 2024-10-26
Published: 2024-11-20
Contact: DENG Ye
E-mail: shangwang@rcees.ac.cn; yedeng@rcees.ac.cn
WANG Shang, FENG Kai, LI Tong, WANG Jie, GU Song-song, YANG Xing-sheng, LI Chun-ge, DENG Ye. Databases and Data Mining Methods for Environmental Pathogen Research[J]. Biotechnology Bulletin, 2024, 40(10): 221-232.
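Database | Website | Characteristics | Data type and volume |
---|---|---|---|
Databases for bacterial pathogens | |||
Virulence Factor Database [10] (VFDB) | http://www.mgc.ac.cn/VFs/main.htm | The most comprehensive and highest-quality database of bacterial virulence factors | Covers common bacterial pathogens from 32 genera and classifies virulence factors into 14 basic categories and more than 100 subcategories; the main categories include adherence, invasion, exotoxins, effector delivery systems, biofilm, nutritional/metabolic factors, and immune modulation. The database contains 11,693 virulence factor genes |
Pathosystems Resource Integration Center [11] (PATRIC) | https://www.patricbrc.org/ | Bacterial bioinformatics resource center providing a one-stop data analysis platform for important pathogens | PATRIC integrates more than 250,000 uniformly annotated, publicly available genomes and environmental data. In addition, the PATRIC-VF resource collects data on 1,572 virulence factor genes from 1,071 published papers for six bacterial pathogen genera: Mycobacterium, Salmonella, Escherichia, Shigella, Listeria, and Bartonella |
A Novel High-throughput qPCR Chip for Virulence Factor Genes [12] (VFG-Chip) | | High-throughput quantification technique developed for the virulence genes of four zoonotic pathogenic bacteria | Targets four typical human and animal pathogens in the environment, Klebsiella pneumoniae, Acinetobacter baumannii, Escherichia coli, and Salmonella enterica, and covers 96 virulence genes involved in five main functions: toxins, adhesion factors, secretion systems, immune evasion/invasion, and iron uptake |
Pan-pathogen Microarray [13] (PathoChip) | | Functional gene array designed specifically for virulence factor genes | Contains 3,715 specific probes covering 13 virulence factor genes from 1,397 bacterial species and 2,336 strains, 7,417 coding sequences in total; the main virulence factors include toxins, hemolysins, adhesion, pili, colonization factors, virulence proteins, secretions, and bacteriocins |
The Chinese Local Salmonella Genome Database version 2 [14] (CLSGDB v2) | https://nmdc.cn/clsgdbv2/ | Genome database of zoonotic Salmonella in China | Contains 7,997 genome sequences of zoonotic Salmonella from China with associated background data, spanning 164 Salmonella serotypes and 295 sequence types; currently the largest comprehensive, open-source Salmonella genome database in China. The database offers genome sequence downloads, among other functions |
Microbial index of pathogenic bacteria [15] (MIP) | https://github.com/qdu-bioinfo/mip | Database of opportunistic pathogens | Comprises three parts: (1) 149,252 full-length 16S rRNA gene sequences from the SILVA database; (2) 18,389 full-length 16S rRNA gene sequences of 300 opportunistic pathogens released by the Chinese Center for Disease Control and Prevention (CDC); (3) a pathogen-disease interaction network. Information on the affected organ is further refined into the oral and sensory systems, skin, circulatory system, digestive system, respiratory system, urogenital system, and other systems (locomotor, nervous, endocrine, etc.), forming a pathogen-disease-organ interaction network |
16S Pathogenic Identification Process [16] (16sPIP) | https://github.com/jjmiao1314/16sPIP | Rapid detection of pathogens in clinical samples | Contains a comprehensive database of full-length bacterial 16S rRNA gene sequences for environmental samples and a specific database of full-length 16S rRNA gene sequences for 346 pathogenic bacteria. The environmental database holds 252,567 sequences from 2,094 genera and 15,217 species; the pathogen database holds 29,258 sequences from the 346 pathogenic bacteria |
Multiple bacterial pathogen detection [17] (MBPD) | https://github.com/LorMeBioAI/MBPD | Rapid identification of combined contamination by multiple pathogens | A large pathogen database containing 72,685 16S rRNA gene sequences from 1,986 reported pathogenic bacterial species; currently the most comprehensive database of animal, plant, and zoonotic pathogenic bacteria |
Databases for eukaryotic pathogens | |||
Database of Fungal Virulence Factors [18] (DFVF) | http://sysbio.unl.edu/DFVF/ | Virulence factor database for fungal infections of plant and animal hosts | Collects the known virulence factors of fungal pathogens, comprising 2,058 pathogenicity genes from 228 fungal strains in 85 genera; protein sequences are available for download |
The Eukaryotic Pathogen, Vector and Host Informatics Resource [19] (VEuPathDB) | | Mainly invertebrate vectors and eukaryotic pathogens (protists and fungi) | Formed from VectorBase ( |
Databases for viruses | |||
Virus Pathogen Resource [20] (IRD/ViPR) | https://www.viprbrc.org | Now integrated into the BV-BRC resource center platform | The current public data cover 27,475 viruses in 121 families and 2,093 genera, with sequence information, gene and protein annotations, 3D protein structures, immune epitope locations, and clinical data |
ViralZone [21] | https://viralzone.expasy.org | Links knowledge of viral molecular biology with viral genome and protein sequences | Provides information on viruses in 128 families and 567 genera, including virion structure, replication cycle, and host-virus interactions. Describes the virion shape, molecular biology, and epidemiology of each viral genus in detail, with links to the corresponding annotated proteomes in UniProtKB |
Databases for multiple pathogens | |||
Global Catalogue of Pathogens (gcPathogen) | https://nmdc.cn/gcpathogen/ | Comprehensive genomic resource of human pathogens for public health | Currently includes 1,166,147 bacterial pathogen genomes from 497 bacterial taxa and 986,044 bacterial strains; 6,785 fungal pathogen genomes from 407 fungal taxa and 6,294 fungal strains; 90,029 viral genomes from 226 viral taxa and 13,689 viral strains; and 670 parasite genomes from 174 parasite taxa and 403 parasite strains |
PMDB | Hosted on the PMseq Datician pathogen expert system (paid platform) | Dedicated high-quality, clinical-grade database | Covers 17,500 pathogens, including bacteria, fungi, viruses, and parasites. The PMDB resistance/virulence database was rigorously screened through large-scale data testing and retains 32 representative resistance genes and 281 representative virulence genes |
Victors [22] | https://www.phidias.us/victors/ | Manually curated, web-based, interactive comprehensive database and analysis resource | Currently contains 5,304 virulence factors and 127 pathogens, mainly 51 bacterial, 54 viral, 13 parasitic, and 8 fungal species |
Pathogen-Host Interaction [23] (PHI-base) | http://www.phi-base.org/ | Pathogen-host interaction database | Hosts include animals, plants, fungi, and insects; pathogens include fungi, oomycetes, and bacteria. Curates experimentally verified pathogenicity, virulence, and effector genes. The current version covers 4,387 references and 8,411 genes from 279 pathogens, shown to affect 18,190 interactions with 228 hosts, involving 533 disease types |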
Table 1 Databases of pathogens in common use
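Method | Algorithm | Advantages | Limitations |
---|---|---|---|
KMCP | K-mer based | Combines k-mer similarity with genome coverage information to reduce the false-positive rate. Can handle low-depth clinical samples, which aids accurate pathogen detection | Developed for clinical samples with k set to 21. It searches for a series of k-mers of length 21, but these k-mers are not unique to any one pathogen, so for low-abundance pathogens in the environment there is no guarantee that the full series can still be found |
DCiPatho | Deep cross-fusion network | Uses deep learning to accurately identify pathogenic bacteria of humans, livestock, animals, and plants. Uses k-mer frequency features, which help capture sequence features specific to pathogenic microorganisms | k is set to 3-7; as k increases, the required computation and memory grow exponentially. When analyzing metagenomic data it requires MAGs to be as complete as possible, which limits its practicality for environmental pathogen detection |
SeqScreen | Ensemble machine learning model | High recall and precision; complements existing classification methods with functional information. Can distinguish known from unknown pathogens, giving more comprehensive pathogen detection | Relies on annotated sequences to identify SoCs, which may limit its ability to detect novel pathogens. Requires substantial computing resources, especially for large datasets |
GSMer | K-mer based | Identifies genome-specific sequence markers, giving highly specific identification of pathogenic microorganisms. Once the database is built, new samples can be classified and identified quickly. Also performs well on low-abundance pathogens | Performance depends heavily on the completeness and accuracy of the database used; pathogens missing from the database may go undetected. Analyzing large-scale genomic data requires substantial computing and storage resources |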
Table 2 Comparison of metagenomic methods for the detection of environmental pathogens
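
Two k-mer ideas recur in Table 2: k-mer frequency features and genome-specific markers. The toy Python sketch below illustrates both. It is not the implementation of KMCP, DCiPatho, or GSMer; the function names (`kmer_counts`, `kmer_frequency_vector`, `genome_specific_kmers`) are hypothetical, sequences are assumed to be plain A/C/G/T strings, and reverse complements are ignored for brevity.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_frequency_vector(seq: str, k: int) -> list[float]:
    """Relative frequency of all 4**k possible k-mers, in lexicographic order."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[m] / total if total else 0.0 for m in all_kmers]


def genome_specific_kmers(genomes: dict[str, str], k: int) -> dict[str, set[str]]:
    """For each genome, keep only the k-mers that occur in no other genome."""
    kmer_sets = {name: set(kmer_counts(seq, k)) for name, seq in genomes.items()}
    specific = {}
    for name, kmers in kmer_sets.items():
        # Union of the k-mers seen in every other reference genome.
        others = set().union(*(s for n, s in kmer_sets.items() if n != name))
        specific[name] = kmers - others
    return specific
```

The feature vector has 4^k entries (64 for k = 3, 16,384 for k = 7), which is the exponential growth in resources noted for DCiPatho. In the marker sketch, a k-mer shared by two reference genomes is discarded for both, so a genome missing from the reference set cannot be ruled out as the source of a marker. This mirrors the database-completeness limitation listed for GSMer.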
[1] | Gu W, Miller S, Chiu CY. Clinical metagenomic next-generation sequencing for pathogen detection[J]. Annu Rev Pathol, 2019, 14: 319-338. doi: 10.1146/annurev-pathmechdis-012418-012751 pmid: 30355154 |
[2] | Mina MJ, Andersen KG. COVID-19 testing: one size does not fit all[J]. Science, 2021, 371(6525): 126-127. doi: 10.1126/science.abe9187 pmid: 33414210 |
[3] | Delgado-Baquerizo M, Guerra CA, Cano-Díaz C, et al. The proportion of soil-borne pathogens increases with warming at the global scale[J]. Nat Clim Change, 2020, 10: 550-554. |
[4] | Carlson CJ, Albery GF, Merow C, et al. Climate change increases cross-species viral transmission risk[J]. Nature, 2022, 607(7919): 555-562. |
[5] | Morawska L, Allen J, Bahnfleth W, et al. A paradigm shift to combat indoor respiratory infection[J]. Science, 2021, 372(6543): 689-691. doi: 10.1126/science.abg2025 pmid: 33986171 |
[6] | Chaloner TM, Gurr SJ, Bebber DP. Plant pathogen infection risk tracks global crop yields under climate change[J]. Nat Clim Change, 2021, 11: 710-715. |
[7] | Bein T, Karagiannidis C, Quintel M. Climate change, global warming, and intensive care[J]. Intensive Care Med, 2020, 46(3): 485-487. doi: 10.1007/s00134-019-05888-4 pmid: 31820033 |
[8] | Mora C, McKenzie T, Gaw IM, et al. Over half of known human pathogenic diseases can be aggravated by climate change[J]. Nat Clim Chang, 2022, 12(9): 869-875. |
[9] | Levy K, Woster AP, Goldstein RS, et al. Untangling the impacts of climate change on waterborne diseases: a systematic review of relationships between diarrheal diseases and temperature, rainfall, flooding, and drought[J]. Environ Sci Technol, 2016, 50(10): 4905-4922. |
[10] | Liu B, Zheng DD, Zhou SY, et al. VFDB 2022: a general classification scheme for bacterial virulence factors[J]. Nucleic Acids Res, 2022, 50(D1): D912-D917. |
[11] | Davis JJ, Wattam AR, Aziz RK, et al. The PATRIC Bioinformatics Resource Center: expanding data and analysis capabilities[J]. Nucleic Acids Res, 2020, 48(D1): D606-D612. |
[12] | Xie ST, Ding LJ, Huang FY, et al. VFG-Chip: A high-throughput qPCR microarray for profiling virulence factor genes from the environment[J]. Environment International, 2023, 172: 107761. |
[13] | Lee YJ, van Nostrand JD, Tu QC, et al. The PathoChip, a functional gene array for assessing pathogenic properties of diverse microbial communities[J]. ISME J, 2013, 7(10): 1974-1984. |
[14] | Wang YN, Xu XB, Zhu BL, et al. Genomic analysis of almost 8,000 Salmonella genomes reveals drivers and landscape of antimicrobial resistance in China[J]. Microbiol Spectr, 2023, 11(6): e0208023. |
[15] | Sun Z, Liu XD, Jing GC, et al. Comprehensive understanding to the public health risk of environmental microbes via a microbiome-based index[J]. J Genet Genomics, 2022, 49(7): 685-688. |
[16] | Miao JJ, Han N, Qiang YJ, et al. 16SPIP: a comprehensive analysis pipeline for rapid pathogen detection in clinical samples based on 16S metagenomic sequencing[J]. BMC Bioinformatics, 2017, 18(Suppl 16): 568. doi: 10.1186/s12859-017-1975-3 pmid: 29297318 |
[17] | Yang XR, Jiang GF, Zhang YZ, et al. MBPD: a multiple bacterial pathogen detection pipeline for One Health practices[J]. iMeta, 2023, 2(1): e82. |
[18] | Lu T, Yao B, Zhang C. DFVF: database of fungal virulence factors[J]. Database, 2012, 2012: bas032. |
[19] | Amos B, Aurrecoechea C, Barba M, et al. VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center[J]. Nucleic Acids Res, 2022, 50(D1): D898-D911. |
[20] | Pickett BE, Sadat EL, Zhang Y, et al. ViPR: an open bioinformatics database and analysis resource for virology research[J]. Nucleic Acids Res, 2012, 40(Database issue): D593-D598. |
[21] | Hulo C, de Castro E, Masson P, et al. ViralZone: a knowledge resource to understand virus diversity[J]. Nucleic Acids Res, 2011, 39(Database issue): D576-D582. |
[22] | Sayers S, Li L, Ong E, et al. Victors: a web-based knowledge base of virulence factors in human and animal pathogens[J]. Nucleic Acids Res, 2019, 47(D1): D693-D700. |
[23] | Urban M, Cuzick A, Seager J, et al. PHI-base: the pathogen-host interactions database[J]. Nucleic Acids Res, 2020, 48(D1): D613-D620. |
[24] | Mi ZH. Prospect of bioinformatics in research on virulence genes of bacteria[J]. Chin J Clin Infect Dis, 2014, 7(1): 15-20. (in Chinese) |
[25] | Zhu L, Lian YL, Lin D, et al. Insights into microbial contamination in multi-type manure-amended soils: the profile of human bacterial pathogens, virulence factor genes and antibiotic resistance genes[J]. J Hazard Mater, 2022, 437: 129356. |
[26] | Liu HB, Zhang Y, Chen JG. Whole-genome sequencing and functional annotation of pathogenic causing human cellulitis[J]. Human Genomics, 2023, 17(1): 65. |
[27] | Liu YQ, Ji MK, Yu T, et al. A genome and gene catalog of glacier microbiomes[J]. Nat Biotechnol, 2022, 40(9): 1341-1348. |
[28] | Knight R, Vrbanac A, Taylor BC, et al. Best practices for analysing microbiomes[J]. Nat Rev Microbiol, 2018, 16(7): 410-422. doi: 10.1038/s41579-018-0029-9 pmid: 29795328 |
[29] | Gupta A, Kapil R, Dhakan DB, et al. MP3: a software tool for the prediction of pathogenic proteins in genomic and metagenomic data[J]. PLoS One, 2014, 9(4): e93907. |
[30] | Balaji A, Kille B, Kappell AD, et al. SeqScreen: accurate and sensitive functional screening of pathogenic sequences via ensemble learning[J]. Genome Biol, 2022, 23(1): 133. doi: 10.1186/s13059-022-02695-x pmid: 35725628 |
[31] | Jiang GF, Zhang JX, Zhang YZ, et al. DCiPatho: deep cross-fusion networks for genome scale identification of pathogens[J]. Brief Bioinform, 2023, 24(4): bbad194. |
[32] | Shen W, Xiang HY, Huang TQ, et al. KMCP: accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping[J]. Bioinformatics, 2023, 39(1): btac845. |
[33] | Tu QC, He ZL, Zhou JZ. Strain/species identification in metagenomes using genome-specific markers[J]. Nucleic Acids Res, 2014, 42(8): e67. |
[34] | de Nies L, Lopes S, Busi SB, et al. PathoFact: a pipeline for the prediction of virulence factors and antimicrobial resistance genes in metagenomic data[J]. Microbiome, 2021, 9(1): 49. doi: 10.1186/s40168-020-00993-9 pmid: 33597026 |
[35] | Beam AL, Kohane IS. Big data and machine learning in health care[J]. JAMA, 2018, 319(13): 1317-1318. doi: 10.1001/jama.2017.18391 pmid: 29532063 |
[36] | Deneke C, Rentzsch R, Renard BY. PaPrBaG: a machine learning approach for the detection of novel pathogens from NGS data[J]. Sci Rep, 2017, 7: 39194. doi: 10.1038/srep39194 pmid: 28051068 |
[37] | Ren J, Ahlgren NA, Lu YY, et al. VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data[J]. Microbiome, 2017, 5(1): 69. doi: 10.1186/s40168-017-0283-5 pmid: 28683828 |
[38] | Bartoszewicz JM, Seidel A, Rentzsch R, et al. DeePaC: predicting pathogenic potential of novel DNA with reverse-complement neural networks[J]. Bioinformatics, 2020, 36(1): 81-89. doi: 10.1093/bioinformatics/btz541 pmid: 31298694 |
[39] | Zheng DD. A deep learning model for the prediction of bacterial virulence factors[D]. Beijing: Chinese Academy of Medical Sciences Peking Union Medical College Hospital, 2021. (in Chinese) |
[40] | Ren J, Song K, Deng C, et al. Identifying viruses from metagenomic data using deep learning[J]. Quant Biol, 2020, 8(1): 64-77. doi: 10.1007/s40484-019-0187-4 |
[41] | Miao Y, Liu F, Hou T, et al. Virtifier: a deep learning-based identifier for viral sequences from metagenomes[J]. Bioinformatics, 2022, 38(5): 1216-1222. |
[42] | Alipanahi B, Frey BJ. Network cleanup[J]. Nat Biotechnol, 2013, 31(8): 714-715. doi: 10.1038/nbt.2657 pmid: 23929347 |
[43] | Xiao NJ, Zhou AF, Kempher ML, et al. Disentangling direct from indirect relationships in association networks[J]. Proc Natl Acad Sci USA, 2022, 119(2): e2109995119. |
[44] | Feng K, Peng X, Zhang Z, et al. iNAP: an integrated network analysis pipeline for microbiome studies[J]. iMeta, 2022, 1(2): e13. |
[45] | Gardner TS, Faith JJ. Reverse-engineering transcription control networks[J]. Phys Life Rev, 2005, 2(1): 65-88. doi: 10.1016/j.plrev.2005.01.001 pmid: 20416858 |
[46] | Montoya JM, Pimm SL, Solé RV. Ecological networks and their fragility[J]. Nature, 2006, 442(7100): 259-264. |
[47] | Boutin S, Bernatchez L, Audet C, et al. Network analysis highlights complex interactions between pathogen, host and commensal microbiota[J]. PLoS One, 2013, 8(12): e84772. |
[48] | Kerdraon L, Barret M, Laval V, et al. Differential dynamics of microbial community networks help identify microorganisms interacting with residue-borne pathogens: the case of Zymoseptoria tritici in wheat[J]. Microbiome, 2019, 7(1): 125. doi: 10.1186/s40168-019-0736-0 pmid: 31470910 |
[49] | Li J, Wang SM, Chen Z, et al. A bipartite network module-based project to predict pathogen-host association[J]. Front Genet, 2019, 10: 1357. doi: 10.3389/fgene.2019.01357 pmid: 32038713 |
[50] | Miryala SK, Ramaiah S. Cellular and molecular level host-pathogen interactions in Francisella tularensis: a microbial gene network study[J]. Comput Biol Chem, 2022, 96: 107601. |
[51] | Singh N, Rai S, Bhatnagar R, et al. Network analysis of host-pathogen protein interactions in microbe induced cardiovascular diseases[J]. In Silico Biol, 2021, 14(3/4): 115-133. |
[52] | Wang YF, Dong QB, Hu SX, et al. Decoding microbial genomes to understand their functional roles in human complex diseases[J]. iMeta, 2022, 1(2): e14. |
[53] | Wu ZN, Lyu HH, Liang W, et al. Microbial community in indoor dusts from university dormitories: characteristics, potential pathogens and influence factors[J]. Atmos Pollut Res, 2021, 12(3): 321-333. |
[54] | Sun CN, Yuan T, Chen L, et al. Occurrence of potentially pathogenic bacteria on shared bicycles[J]. Int J Hyg Environ Health, 2020, 224: 113442. |
[55] | Xu Y, Gao Y, Tan L, et al. Exploration of bacterial communities in products after composting rural wastes with different components: core microbiome and potential pathogenicity[J]. Environ Technol Innov, 2022, 25: 102222. |
[56] | Jing GC, Zhang YF, Cui WZ, et al. Meta-Apo improves accuracy of 16S-amplicon-based prediction of microbiome function[J]. BMC Genomics, 2021, 22(1): 9. doi: 10.1186/s12864-020-07307-1 pmid: 33407112 |
[57] | Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold[J]. Nature, 2021, 596(7873): 583-589. |
[58] | Jauss RT, Nowack A, Walden S, et al. To the canopy and beyond: air dispersal as a mechanism of ubiquitous protistan pathogen assembly in tree canopies[J]. Eur J Protistol, 2021, 80: 125805. |
[59] | Wu HM, Wu HM, Jiao YY, et al. The combination of biochar and PGPBs stimulates the differentiation in rhizosphere soil microbiome and metabolites to suppress soil-borne pathogens under consecutive monoculture regimes[J]. GCB Bioenergy, 2022, 14(1): 84-103. |
[60] | Zhu D, Ma J, Li G, et al. Soil plastispheres as hotspots of antibiotic resistance genes and potential pathogens[J]. ISME J, 2022, 16(2): 521-532. |
[61] | Zhou L, Liu L, Chen WY, et al. Stochastic determination of the spatial variation of potentially pathogenic bacteria communities in a large subtropical river[J]. Environ Pollut, 2020, 264: 114683. |
[62] | Liu S, Wang PF, Wang C, et al. Anthropogenic disturbances on antibiotic resistome along the Yarlung Tsangpo River on the Tibetan Plateau: ecological dissemination mechanisms of antibiotic resistance genes to bacterial pathogens[J]. Water Res, 2021, 202: 117447. |
[63] | Zhou JZ, Deng Y, Zhang P, et al. Stochasticity, succession, and environmental perturbations in a fluidic ecosystem[J]. Proc Natl Acad Sci USA, 2014, 111(9): E836-E845. |
[64] | Ning DL, Yuan MT, Wu LW, et al. A quantitative framework reveals ecological drivers of grassland microbial community assembly in response to warming[J]. Nat Commun, 2020, 11(1): 4717. doi: 10.1038/s41467-020-18560-z pmid: 32948774 |
[65] | Stegen JC, Lin XJ, Fredrickson JK, et al. Estimating and mapping ecological processes influencing microbial community assembly[J]. Front Microbiol, 2015, 6: 370. doi: 10.3389/fmicb.2015.00370 pmid: 25983725 |
[66] | Ning DL, Deng Y, Tiedje JM, et al. A general framework for quantitatively assessing ecological stochasticity[J]. Proc Natl Acad Sci USA, 2019, 116(34): 16892-16898. doi: 10.1073/pnas.1904623116 pmid: 31391302 |
[67] | Vilmi A, Gibert C, Escarguel G, et al. Dispersal-niche continuum index: a new quantitative metric for assessing the relative importance of dispersal versus niche processes in community assembly[J]. Ecography, 2021, 44(3): 370-379. |
[68] | Inda-Díaz JS, Lund D, Parras-Moltó M, et al. Latent antibiotic resistance genes are abundant, diverse, and mobile in human, animal, and environmental microbiomes[J]. Microbiome, 2023, 11(1): 44. doi: 10.1186/s40168-023-01479-0 pmid: 36882798 |
[69] | Bai YH, Wang QJ, Liang JS, et al. A method for assessing water health risk based on resistance gene and virulence factor gene: CN 111944914A[P]. 2020-11-17. (in Chinese) |
[70] | Zhang ZY, Zhang Q, Wang TZ, et al. Assessment of global health risk of antibiotic resistance genes[J]. Nat Commun, 2022, 13(1): 1553. doi: 10.1038/s41467-022-29283-8 pmid: 35322038 |