
Detection of viral sequences at single-cell resolution identifies novel viruses associated with host gene expression changes





