
Dear AnVIL users,

The AnVIL team is excited to announce the release of the studies listed below on the AnVIL platform. If you are interested in these datasets, please submit data access requests through dbGaP.

The datasets are now findable on the AnVIL Data Explorer <https://explore.anvilproject.org/datasets> for cohort building and on the AnVIL Data Library <https://duos.org/datalibrary/anvil> for dataset-level search. For additional resources on how to find datasets on AnVIL, please refer to the AnVIL Data Explorer Guide <https://explore.anvilproject.org/guides> or the DUOS Data Library user guide <https://support.terra.bio/hc/en-us/articles/31034718333851-How-to-access-and-export-controlled-data-to-Terra-via-DUOS>.

Study Name: Common Fund (CF) Genotype-Tissue Expression Project <https://www.gtexportal.org/home/aboutAdultGtex> (GTEx)
phsID: phs000424.v10.p2
DULs: GRU
Release Notes: Sample and subject annotation files were added.
Submitter blog post: Link <https://anvilproject.org/news/2024/11/20/gtexv10>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000424.v10.p2>
Link to dataset on AnVIL: Data Workspace <https://anvil.terra.bio/#workspaces/anvil-datastorage/AnVIL_GTEx_v10_hg38>

Study Name: NHGRI GREGoR Consortium <https://gregorconsortium.org/>: Genomics Research to Elucidate the Genetics of Rare Disease
phsID: phs003047.v3.p2
DULs: GRU, HMB
Release Notes: The new data release includes 8,840 participants and more than 3,000 families. Included in the release are short-read whole exomes and genomes, long-read whole genomes, and RNA-seq files.
Submitter blog post: Link <https://anvilproject.org/news/2025/07/25/gregor-consortium-data-release>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs003047.v3.p2>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs003047%22%5D%7D%5D>, Data Library <https://duos.org/studies/107>

Study Name: Impact of Genomic Variation on Function (IGVF) Consortium <https://igvf.org/>
phsID: phs003472.v1.p1
DULs: HMB-MDS
Release Notes: This first release contains data from seven participants and includes assay data such as single-cell ATAC-seq, single-cell RNA sequencing, and SHARE-seq data.
Submitter blog post: N/A
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs003472.v1.p1>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs003472%22%5D%7D%5D>, Data Library <https://duos.org/studies/191>

Study Name: Genomic Answers for Kids <https://www.childrensmercy.org/childrens-mercy-research-institute/studies-and-trials/genomic-answers-for-kids/>
phsID: phs002206.v5.p1
DULs: DS-PEDD-IRB
Release Notes: The new data release includes over 2,000 long-read genome sequences and 12,000 short-read genome and exome analyses, nearly 400 snapshots of patient transcriptomes and epigenomes in individual cells using single-cell RNA (scRNA) and single-cell open chromatin (scATAC), over 3,000 bulk whole genome bisulphite genome sequences for methylome interpretation, and over 200 functional assessments in available patient tissues using full-length cDNA sequences by IsoSeq (PacBio) methodology. This release also consolidates data from releases 4 and 5 into a single dataset for exporting.
Submitter blog post: Link <https://anvilproject.org/news/2025/08/25/ga4k-version-5-data-release>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs002206.v5.p1>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs002206%22%5D%7D%5D>, Data Library <https://duos.org/studies/106>

Study Name: Center for Common Disease Genomics [CCDG] - Neuropsychiatric: Epilepsy: Epi25 Consortium <http://epi-25.org>
phsID: phs001489.v4.p2
DULs: 32 consent codes. For a full list, please see the dbGaP study page <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001489.v4.p2>.
Release Notes: This new data release includes whole-genome genotype data on over 30,000 Epi25 participants, generated using Illumina’s Infinium GSA-MD v1 platform. Additionally, detailed clinical phenotypes related to epilepsy diagnosis are now available for both the GSA data and the whole exome sequencing (WES) data previously released in v3.
Submitter blog post: Link <https://anvilproject.org/news/2025/08/25/epi25-version-4-data-release>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001489.v4.p2>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs001489%22%5D%7D%5D>, Data Library <https://duos.org/studies/111>

Study Name: CARD Consortium <https://card.nih.gov/research-programs/long-read-sequencing>: North American Brain Expression Consortium
phsID: phs001300.v5.p1 (parent), phs003181.v2.p1 (child)
DULs: GRU
Release Notes: This new release includes 206 samples with haplotype-resolved assemblies, structural and small variant calls, as well as methylation calls for neurologically ‘normal’ prefrontal cortex ( and cortex ) brain tissue samples.
Submitter blog post: Link <https://anvilproject.org/news/2025/08/25/card-data-release-2>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001300.v5.p1>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs001300%22%5D%7D%5D>, Data Library <https://duos.org/studies/73>

Study Name: CARD Consortium <https://card.nih.gov/research-programs/long-read-sequencing>: Gene Expression in Postmortem DLPFC and Hippocampus from Schizophrenia and Mood Disorders
phsID: phs000979.v4.p2
DULs: GRU
Release Notes: This new release includes 155 samples with haplotype-resolved assemblies, structural and small variant calls, as well as methylation calls for neurologically ‘normal’ prefrontal cortex ( and cortex ) brain tissue samples.
Submitter blog post: Link <https://anvilproject.org/news/2025/08/25/card-data-release-2>
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000979.v4.p2>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs000979%22%5D%7D%5D>, Data Library <https://duos.org/studies/190>

Study Name: PAGE: The Charles Bronfman Institute for Personalized Medicine (IPM) BioMe Biobank
phsID: phs000925.v1.p1
DULs: GRU
Release Notes: Please see dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000925.v1.p1> for more information.
Submitter blog post: N/A
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000925.v1.p1>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs000925%22%5D%7D%5D>, Data Library <https://duos.org/studies/77>

Study Name: PAGE: Multi-Ethnic Cohort Study
phsID: phs000220.v2.p2
DULs: GRU
Release Notes: Please see dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000220.v2.p2> for more information.
Submitter blog post: N/A
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000220.v2.p2>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs000220%22%5D%7D%5D>, Data Library <https://duos.org/studies/78>

Study Name: PAGE: Global Reference Panel
phsID: phs001033.v1.p1
DULs: GRU
Release Notes: Please see dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001033.v1.p1> for more information.
Submitter blog post: N/A
Where to apply for access: dbGaP <https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001033.v1.p1>
Link to dataset on AnVIL: Explorer <https://explore.anvilproject.org/datasets?filter=%5B%7B%22categoryKey%22%3A%22datasets.registered_identifier%22%2C%22value%22%3A%5B%22phs001033%22%5D%7D%5D>, Data Library <https://duos.org/studies/76>

In addition to the data released above, the following developmental enhancements have been made to the AnVIL data:

- Inconsistencies in snapshot naming conventions that were causing indexing issues for the AnVIL Data Explorer have been resolved.
- MD5s in file metadata are now consistently Base64-encoded, matching what is provided in the GCS metadata.
- Values that were causing issues when importing data from DUOS or the AnVIL Data Explorer into Workspaces have been corrected.
- The presence of double pipes, which was causing issues with indexing certain datasets for the AnVIL Data Explorer, has been resolved.

Thank you,
The AnVIL Team

--
M. Kate Balaconis, PhD
Data Sciences Platform Program Manager
Broad Institute of MIT and Harvard
105 Broadway, Cambridge, MA 02142
kbalacon@broadinstitute.org