Biotechnology Bulletin ›› 2022, Vol. 38 ›› Issue (9): 180-190.doi: 10.13560/j.cnki.biotech.bull.1985.2022-0344

Assembly of Pepino Genome Based on PacBio's Third-generation Sequencing Technology

SI Cheng1,2, ZHONG Qi-wen1, YANG Shi-peng1

  1. Qinghai Academy of Agricultural and Forestry Sciences/Qinghai Key Laboratory of Vegetable Genetics and Physiology, Xining 810016
  2. College of Agriculture and Animal Husbandry, Qinghai University, Xining 810016
  • Received: 2022-03-22 Online: 2022-09-26 Published: 2022-10-11
  • Contact: YANG Shi-peng E-mail: qhdxsc@163.com; qhyysp@163.com

Abstract:

Pepino (Solanum muricatum) has a variety of biological activities, including antioxidant, antitumor, and antidiabetic activities. To enrich the genomic information available for Solanaceae crops and the understanding of their evolution, we obtained whole-genome sequencing data for pepino, which lays the foundation for pepino-related molecular studies. Using pepino plant tissue as experimental material, a small-fragment library was constructed and sequenced on the Illumina HiSeq platform for characterization and evaluation of the pepino genome. PacBio third-generation sequencing and Hi-C technology were then used to construct a whole-genome database of pepino. Different bioinformatics methods were used for assembly, functional annotation, and evolutionary analysis of the pepino genome. The results showed that a total of 54.11 Gb of Illumina HiSeq data were acquired, and 55.08 Gb of PacBio data were obtained with an average read length of 14 179 bp. Chromosome conformation capture (Hi-C) sequencing yielded 143 Gb of data. The total contig length of the assembled genome was 1.16 Gb, with a scaffold N50 of 22.63 Mb. A total of 1.12 Gb of sequence was mapped to 12 chromosomes, accounting for 97.16% of the total genome sequence. Of the mapped sequences, those whose order and orientation could be determined totaled 1.08 Gb, accounting for 96.11% of the total length of the chromosome-anchored sequences. Based on the estimated genome size (1.25 Gb), 64.22% of the sequence was predicted to be repetitive. A total of 41 571 genes were predicted, 99.06% of which could be annotated in NR, GO, KEGG, and other databases. The noncoding RNAs identified included 4 360 tRNAs, 5 677 rRNAs, and 154 miRNAs, and a total of 449 pseudogenes were identified. The divergence time between pepino and potato was estimated at approximately 12.82 million years ago (MYA).

Key words: pepino, genome, PacBio third-generation sequencing technology, gene annotation
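
Note on the scaffold N50 reported in the abstract: N50 is conventionally the length L such that sequences of length L or longer together cover at least half of the total assembly length. The sketch below shows that conventional calculation only; it is not the authors' pipeline, the function name n50 is hypothetical, and the lengths are made-up toy values rather than pepino data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Conventional definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches half of
    the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical values, not taken from the paper).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA file, and an assembly-evaluation tool would normally report this statistic directly.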