TY - JOUR
T1 - Bigmelon
T2 - Tools for analysing large DNA methylation datasets
AU - Gorrie-Stone, Tyler J.
AU - Smart, Melissa C.
AU - Saffari, Ayden
AU - Malki, Karim
AU - Hannon, Eilis
AU - Burrage, Joe
AU - Mill, Jonathan
AU - Kumari, Meena
AU - Schalkwyk, Leonard C.
PY - 2019/3/15
Y1 - 2019/3/15
N2 - Motivation: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. Results: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. © 2018 The Author(s). Published by Oxford University Press.
AB - Motivation: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. Results: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. © 2018 The Author(s). Published by Oxford University Press.
UR - http://www.scopus.com/inward/record.url?scp=85057651419&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty713
DO - 10.1093/bioinformatics/bty713
M3 - Article
C2 - 30875430
AN - SCOPUS:85057651419
SN - 1367-4803
VL - 35
SP - 981
EP - 986
JO - BIOINFORMATICS
JF - BIOINFORMATICS
IS - 6
ER -