Whole genome sequencing data of 36 high-grade serous carcinoma (HGSC) patients (89 samples) sequenced with HiSeq X Ten.
Whole genome sequencing data of 21 high-grade serous carcinoma (HGSC) patients (59 samples) sequenced with MGISEQ-2000.
This submission includes targeted and whole exome paired-end FASTQ files.
Whole genome sequencing data of 35 high-grade serous carcinoma (HGSC) patients (112 samples) sequenced with Illumina NovaSeq 6000.
Genome and transcriptome sequence data from a patient with a rosette-forming glioneuronal tumor (RGNT), generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study.
Genome and transcriptome sequence data from a patient with a diffuse midline glioma, H3K27 mutant, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study.
Standards

The EGA is a long-standing supporter of the Global Alliance for Genomics & Health (GA4GH) and of its aim to enhance responsible sharing of human genetic data through the development of interoperable global standards for human data access. The EGA is one of the founding GA4GH Driver Projects and has contributed to the development and implementation of several GA4GH standards and APIs. Below is a list of the GA4GH standards and APIs that are currently available or planned for implementation at EGA.

Large Scale Genomics

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| htsget | A protocol for secure, efficient, and reliable access to sequencing read and variation data. | V1.3.0 | V1.0.0 | Specification, Documentation, Endpoint |
| Read File Formats (SAM/BAM/CRAM) | Specifications for storing next-generation sequencing read data. | V3.0.0 | V3.0.0 | Implementation, Example of Usage |
| Variation File Formats (VCF/BCF) | The specifications for Variant Call Format (VCF) files and their binary counterpart, BCF. | V4.0.0, V2.0.0 | V4.0.0, V2.0.0 | Implementation, Example of Usage |
| Crypt4GH v1.0 | Enables direct byte-level compatible random access to encrypted genetic data stored in community standards (e.g. CRAM, VCF). | V1.0 | V1.0 | Specification, Documentation, Endpoint |
| refget API | Enables access to reference sequences using an identifier derived from the sequence itself. | V1.2.6 | NA | Specification |
| RNAget API v1 | Provides a means of retrieving data from several types of RNA experiments, including (i) feature-level expression data from RNA-seq type measurements and (ii) coordinate-based signal/intensity data similar to a bigwig representation, via a client/server model. | V1.0.0 | NA | Documentation |

Discovery

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| Beacon v2 | Supports discovery of genomic variants, phenotypes, and individuals. | V1.0.1 | V0.3 | Web UI, API, Source Code |
| Service Info API v1 | The Service Info API is an endpoint for describing GA4GH service metadata, designed for extension and inclusion in other APIs. Service Info is used to describe a single service, while Service Registry is used to describe multiple services. | V1.0.0 | NA | Documentation |
| Service Registry API v1 | Provides information about other GA4GH services, primarily for the purpose of organizing services into networks or groups and for service discovery across organizational boundaries. | V1.0.0 | NA | Documentation |

Data Use & Researcher Identities

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| Data Use Ontology (DUO) | Allows users to semantically tag genomic datasets with usage restrictions, so that datasets become automatically discoverable based on a health, clinical, or biomedical researcher's authorisation level or intended use. | 2021-02-23 | 2021-02-23 | Specification, Documentation, Endpoint |
| Authentication & Authorization Infrastructure (AAI) | The GA4GH AAI specification leverages OpenID Connect (OIDC) servers to authenticate the identity of researchers who want to access clinical and genomic resources from data holders adhering to GA4GH standards, and to enable data holders to obtain security-related attributes of those researchers. | V1.2.0 | V1.2.0 | API URI: ega.ebi.ac.uk:8443, Documentation, Repository |
| Researcher IDs (passport, visa) | Specifies the collection of researchers that may access a dataset at any given time, and the credentials they must supply. | V1.0.1 | V1.0.1 | Specification, Documentation, Endpoint |

Cloud

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| Tool Registry Service API | TRS is a standard API for exchanging tools and workflows to analyze, read, and manipulate genomic data. | V2.0.1 | NA | Documentation, Repository |
| Data Repository Service API | The DRS API is a standard for building data repositories and adapting access tools to work with those repositories. It works with other approved APIs from the GA4GH Cloud Work Stream to allow researchers to discover algorithms across different cloud environments and send them to the datasets they wish to analyse. | V1.0.3 | NA | Documentation, Repository |
| Workflow Execution Service API | This API lets users run a single workflow (defined using CWL or WDL) on multiple different platforms, clouds, and environments, and be confident that it will work the same way. The API provides methods to request that a workflow be run, pass parameters to that workflow, get information about running workflows, and cancel a running workflow. | V1.0.1 | NA | Documentation, Repository |

Genomic Knowledge Standards

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| Variation Representation v1 | Provides a flexible framework of computational models, schemas, and algorithms to precisely and consistently exchange genetic variation data across communities. | V1.3.0 | EGA team is contributing to including it in the Beacon v2 Specification and ELIXIR Reference Implementation | Documentation, Repository |

Clin/Pheno Data Capture

| Technical Standards | Purpose | Specification Version | Supported Version | Implementation |
|---|---|---|---|---|
| Phenopackets | Provides information models with different levels of complexity so that both high-level and deep clinical phenotype information can be exchanged. | V2.0.0 | Included in ongoing submissions. EGA team is contributing to including it in the Beacon v2 Specification and ELIXIR Reference Implementation | Documentation, Repository |

Driver Project

The EGA, jointly coordinated by the EBI and the CRG, was announced in 2017 as one of the 15 Driver Projects for GA4GH. Driver Projects are international genomic data initiatives focussed on real projects and challenges that guide development efforts in order to accelerate and enable fully responsible and standardised data sharing by 2022. All chosen Driver Projects contribute across the different workstreams and play an important role in each. Thomas Keane, Jordi Rambla, Mallory Freeberg, and Aina Jené have been named Driver Project Champions for the EGA. The Driver Project Champions will lead this initiative over the coming years.
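The refget entry in the standards list above describes identifiers "derived from the sequence itself". The following is a minimal sketch of how such content-derived identifiers can be computed, assuming the digest schemes documented in the GA4GH refget specifications: an MD5 hex digest, a SHA-512 digest truncated to 24 bytes and hex-encoded, and the same truncated digest base64url-encoded behind an `SQ.` prefix. The function names and the uppercase normalisation step are illustrative assumptions, not part of any EGA service.

```python
import base64
import hashlib


def normalise(sequence: str) -> bytes:
    """Assumed normalisation: strip whitespace and uppercase the residues."""
    return "".join(sequence.split()).upper().encode("ascii")


def md5_identifier(sequence: str) -> str:
    """MD5 hex digest of the normalised sequence (32 hex characters)."""
    return hashlib.md5(normalise(sequence)).hexdigest()


def trunc512_identifier(sequence: str) -> str:
    """First 24 bytes of the SHA-512 digest, hex-encoded (48 hex characters)."""
    return hashlib.sha512(normalise(sequence)).digest()[:24].hex()


def ga4gh_identifier(sequence: str) -> str:
    """First 24 bytes of the SHA-512 digest, base64url-encoded, with an 'SQ.' prefix."""
    truncated = hashlib.sha512(normalise(sequence)).digest()[:24]
    return "SQ." + base64.urlsafe_b64encode(truncated).decode("ascii")


if __name__ == "__main__":
    seq = "ACGT"
    print(md5_identifier(seq))
    print(trunc512_identifier(seq))
    print(ga4gh_identifier(seq))
```

Because the identifier depends only on the sequence content, two servers holding the same reference sequence produce the same identifier without coordinating names. The hex and base64url forms encode the same 24 bytes, so one can be converted to the other without access to the sequence.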
In this prospective study, targeted deep sequencing was performed on a total of 160 primary tumors (474 regions) and 112 lymph nodes from 125 patients with stage I-III lung cancer (LuCaTH). Progressive evolution at clonal divergence scale was observed, while specific driver events were positively selected for clonal sweeps during tumor development. Between-region genetic divergence (BRGD) of tumors was assessed and correlated positively with tumor differentiation. A machine learning algorithm was employed to evaluate the clinicopathological and molecular parameters of primary tumors underlying lymph node metastasis (LNM). By analyzing clonal lineages and metastatic trajectories across multiple nodal stations, we uncovered a common sequential LNM seeding pattern with divergent modes of clonal spread.
The dataset contains plasma cfDNA samples from 90 lung cancers and 5 non-cancerous lung lesions, collected in EDTA blood collection tubes. Shallow WGS was performed on an Illumina NovaSeq (S4 flow cell, paired-end 150 bp). Samples are provided as raw reads without any prior processing.
Single-cell transcriptomes, generated using 10x Chromium 3' sequencing, for two tumour types (AT/RT and Ewing's sarcoma). For each individual, tumour and normal whole genome sequencing was also obtained using Illumina short-read sequencing to an average depth of 30x. These data were used to validate the accuracy of a method for identifying cancer cell transcriptomes based on the allelic shift produced by copy number changes.