Talk title: Automated assembly of genomes using PacBio long error-prone reads alone
Talk time: October 19, 9:10-9:40
Euchromatic reference genomes have been released for some eukaryotes, e.g. human and Caenorhabditis elegans (C. elegans), but all of them were assembled semi-manually by combining different assemblers and advanced sequencing technologies. We report here that all genomes can be assembled automatically, telomere to telomere (T2T) and with no gaps left unresolved, using PacBio long error-prone reads alone. We devised a highly efficient and effective "foolish" algorithm, named CocSil, which automatically assembles genomes from ordinary PacBio long reads alone, leaving no gaps at all, by mimicking cocoon silking instead of traversing a graph. In an initial test on C. elegans VC2010, CocSil completely assembled the C. elegans genome, chromosome by chromosome, within 4 hours on an AMD EPYC 7742 (64 cores, 2L). The CocSil assemblies not only cover the very recently updated reference but also fill all the gaps corresponding to centromere/telomere regions and consecutive short repeats. We are strongly convinced that CocSil could be sufficient for gapless assembly of the human genome from PacBio reads alone.
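Guojun Li (李国君) is a Distinguished Professor at Shandong University. He received his Ph.D. in 1996 from the Institute of Mathematics and Systems Science, Chinese Academy of Sciences. His research covers graph theory, combinatorial optimization, and bioinformatics. In graph theory, his papers have appeared in JCTB, Combinatorica, JGT, and other journals; in combinatorial optimization and theoretical computer science, in SIAM J. Comput., ACM Trans. Algorithms, and other journals; in bioinformatics, he has published more than 20 papers as first or corresponding author in top bioinformatics-related journals. He has led two Key Program projects of the National Natural Science Foundation of China.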