TY - JOUR
T1 - DNAscan2: a versatile, scalable, and user-friendly analysis pipeline for human next-generation sequencing data
AU - Marriott, Heather
AU - Kabiljo, Renata
AU - Al Khleifat, Ahmad
AU - Dobson, Richard
AU - Al-Chalabi, Ammar
AU - Iacoangeli, Alfredo
N1 - Funding Information:
The authors acknowledge use of the research computing facility at King’s College London, Rosalind (https://rosalind.kcl.ac.uk). This work was supported by GlaxoSmithKline and the KCL Funded Centre for Doctoral Training (CDT) in Data-Driven Health to H.M. The authors receive funding from ALS Association Milton Safenowitz Research Fellowship, The Motor Neurone Disease Association, The Darby Rimmer Foundation, MND Scotland, Rosetrees Trust, The NIHR Maudsley Biomedical Research Centre at South London and Maudsley NHS Foundation Trust, an EU Joint Programme—Neurodegenerative Disease Research (JPND) project, Medical Research Council, Economic and Social Research Council, My Name’5 Doddie Foundation, and Alan Davidson Foundation.
Publisher Copyright:
© The Author(s) 2023. Published by Oxford University Press.
PY - 2023/4/3
Y1 - 2023/4/3
N2 - Summary: The current widespread adoption of next-generation sequencing (NGS) in all branches of basic research and clinical genetics means that users with highly variable informatics skills, computing facilities, and application purposes need to process, analyse, and interpret NGS data. In this landscape, versatility, scalability, and user-friendliness are key characteristics of NGS analysis software. We developed DNAscan2, a highly flexible, end-to-end pipeline for the analysis of NGS data, which (i) can be used for the detection of multiple variant types, including SNVs, small indels, transposable elements, short tandem repeats, and other large structural variants; (ii) covers all standard steps of NGS analysis, from quality control of raw data and genome alignment to variant calling, annotation, and generation of reports for the interpretation and prioritization of results; (iii) is highly adaptable as it can be deployed and run via either a graphical user interface for non-bioinformaticians or a command-line tool for personal computer usage; (iv) is scalable as it can be executed in parallel as a Snakemake workflow; and (v) is computationally efficient by minimizing RAM and CPU time requirements.
AB - Summary: The current widespread adoption of next-generation sequencing (NGS) in all branches of basic research and clinical genetics means that users with highly variable informatics skills, computing facilities, and application purposes need to process, analyse, and interpret NGS data. In this landscape, versatility, scalability, and user-friendliness are key characteristics of NGS analysis software. We developed DNAscan2, a highly flexible, end-to-end pipeline for the analysis of NGS data, which (i) can be used for the detection of multiple variant types, including SNVs, small indels, transposable elements, short tandem repeats, and other large structural variants; (ii) covers all standard steps of NGS analysis, from quality control of raw data and genome alignment to variant calling, annotation, and generation of reports for the interpretation and prioritization of results; (iii) is highly adaptable as it can be deployed and run via either a graphical user interface for non-bioinformaticians or a command-line tool for personal computer usage; (iv) is scalable as it can be executed in parallel as a Snakemake workflow; and (v) is computationally efficient by minimizing RAM and CPU time requirements.
UR - http://www.scopus.com/inward/record.url?scp=85152973672&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btad152
DO - 10.1093/bioinformatics/btad152
M3 - Article
SN - 1367-4803
VL - 39
JO - BIOINFORMATICS
JF - BIOINFORMATICS
IS - 4
M1 - btad152
ER -