Whole genome sequencing data of lung adenocarcinomas

This dataset includes the whole-genome sequencing data from a study entitled "Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was performed on HiSeq X Ten machines. The BAM files provided in this dataset are duplicate-marked, indel-realigned, and base-recalibrated.

Data Access Agreement (Version 1.0, updated in Dec 16, 2016)

These terms and conditions govern access to the managed access datasets (details of which are set out in Appendix I) provided by the Graduate School of Medical Science and Engineering at the Korea Advanced Institute of Science and Technology (KAIST-GSMSE) to which the User Institution has requested access. The User Institution agrees to be bound by these terms and conditions.

Definitions

Authorized Personnel: The individuals at the User Institution to whom KAIST-GSMSE grants access to the Data. This includes the User, the individuals listed in Appendix II and any other individuals for whom the User Institution subsequently requests access to the Data. Details of the initial Authorized Personnel are set out in Appendix II.
Data: The managed access datasets to which the User Institution has requested access.
Data Producers: KAIST-GSMSE and the collaborators listed in Appendix I responsible for the development, organization, and oversight of these Data.
External Collaborator: A collaborator of the User, working for an institution other than the User Institution.
Project: The project for which the User Institution has requested access to these Data. A description of the Project is set out in Appendix II.
Publications: Includes, without limitation, articles published in print journals, electronic journals, reviews, books, posters and other written and verbal presentations of research.
Research Participant: An individual whose data form part of these Data.
Research Purposes: Shall mean research that is seeking to advance the understanding of genetics and genomics, including the treatment of disorders, and work on statistical methods that may be applied to such research.
User: The principal investigator for the Project.
User Institution(s): The Institution that has requested access to the Data.

1. The User Institution agrees to only use these Data for the purpose of the Project (described in Appendix II) and only for Research Purposes. The User Institution further agrees that it will only use these Data for Research Purposes which are within the limitations (if any) set out in Appendix I.
2. The User Institution agrees to preserve, at all times, the confidentiality of these Data. In particular, it undertakes not to use, or attempt to use these Data to compromise or otherwise infringe the confidentiality of information on Research Participants. Without prejudice to the generality of the foregoing, the User Institution agrees to use at least the measures set out in Appendix I to protect these Data.
3. The User Institution agrees to protect the confidentiality of Research Participants in any research papers or publications that they prepare by taking all reasonable care to limit the possibility of identification.
4. The User Institution agrees not to link or combine these Data to other information or archived data available in a way that could re-identify the Research Participants, even if access to that data has been formally granted to the User Institution or is freely available without restriction.
5. The User Institution agrees only to transfer or disclose these Data, in whole or part, or any material derived from these Data, to the Authorized Personnel. Should the User Institution wish to share these Data with an External Collaborator, the External Collaborator must complete a separate application for access to these Data.
6. The User Institution agrees that the Data Producers, and all other parties involved in the creation, funding or protection of these Data: a) make no warranty or representation, express or implied as to the accuracy, quality or comprehensiveness of these Data; b) exclude to the fullest extent permitted by law all liability for actions, claims, proceedings, demands, losses (including but not limited to loss of profit), costs, awards damages and payments made by the Recipient that may arise (whether directly or indirectly) in any way whatsoever from the Recipient’s use of these Data or from the unavailability of, or break in access to, these Data for whatever reason; and c) bear no responsibility for the further analysis or interpretation of these Data.
7. The User Institution agrees to follow the Fort Lauderdale Guidelines (http://www.wellcome.ac.uk/stellent/groups/corporatesite/@policy_communications/documents/web_document/wtd003207.pdf) and the Toronto Statement (http://www.nature.com/nature/journal/v461/n7261/full/461168a.html). This includes but is not limited to recognizing the contribution of the Data Producers and including a proper acknowledgement in all reports or publications resulting from the use of these Data.
8. The User Institution agrees to follow the Publication Policy in Appendix III.
9. The User Institution agrees not to make intellectual property claims on these Data and not to use intellectual property protection in ways that would prevent or block access to, or use of, any element of these Data, or conclusion drawn directly from these Data.
10. The User Institution can elect to perform further research that would add intellectual and resource capital to these data and decide to obtain intellectual property rights on these downstream discoveries. In this case, the User Institution agrees to implement licensing policies that will not obstruct further research and to follow the U.S. National Institutes of Health Best Practices for the Licensing of Genomic Inventions (2005) (https://www.icgc.org/files/daco/NIH_BestPracticesLicensingGenomicInventions_2005_en.pdf) in conformity with the Organization for Economic Co-operation and Development Guidelines for the Licensing of the Genetic Inventions (2006) (http://www.oecd.org/science/biotech/36198812.pdf).
11. The User Institution agrees to destroy/discard the Data held, once it is no longer used for the Project, unless obliged to retain the data for archival purposes in conformity with audit or legal requirements.
12. The User Institution will notify KAIST-GSMSE within 30 days of any changes or departures of Authorized Personnel.
13. The User Institution will notify KAIST-GSMSE prior to any significant changes to the protocol for the Project.
14. The User Institution will notify KAIST-GSMSE as soon as it becomes aware of a breach of the terms or conditions of this agreement.
15. KAIST-GSMSE may terminate this agreement by written notice to the User Institution. If this agreement terminates for any reason, the User Institution will be required to destroy any Data held, including copies and backup copies. This clause does not prevent the User Institution from retaining these data for archival purposes in conformity with audit or legal requirements.
16. The User Institution accepts that it may be necessary for the Data Producers to alter the terms of this agreement from time to time. As an example, this may include specific provisions relating to the Data required by Data Producers other than KAIST-GSMSE. In the event that changes are required, the Data Producers or their appointed agent will contact the User Institution to inform it of the changes and the User Institution may elect to accept the changes or terminate the agreement.
17. If requested, the User Institution will allow data security and management documentation to be inspected to verify that it is complying with the terms of this agreement.
18. The User Institution agrees to distribute a copy of these terms to the Authorized Personnel. The User Institution will procure that the Authorized Personnel comply with the terms of this agreement.
19. This agreement (and any dispute, controversy, proceedings or claim of whatever nature arising out of this agreement or its formation) shall be construed, interpreted and governed by the laws of the Republic of Korea (South Korea) and shall be subject to the exclusive jurisdiction of the South Korean courts.

Agreed for User Institution*
* This should be signed by primary institutional officials who are authorized to sign sponsored research and technology transfer documents on behalf of User Institution (typically the Vice President of Research, Dean, or other positions with similar institutional authority).
Signature:
Name:
Title:
Date:

Principal Investigator
I confirm that I have read and understood this Agreement.
Signature:
Name:
Title:
Date:

Agreed for KAIST-GSMSE
Signature:
Name:
Title:
Date:

APPENDIX I – DATASET DETAILS
APPENDIX II – PROJECT DETAILS
APPENDIX III – PUBLICATION POLICY

APPENDIX I – DATASET DETAILS

Dataset reference (EGA Study ID and Dataset Details)
This dataset includes the whole-genome and transcriptome sequencing data used in a research article entitled "Complex Chromosomal Rearrangements by Single Catastrophic Pathogenesis in NUT Midline Carcinoma", which was published in Annals of Oncology (2017). The dataset is archived in the European Genome-phenome Archive (EGA) under accession number EGAS00001001934.

Name of project that created the dataset
The registered name of this project in EGA, which is different from the title of the publication, is "Genomic characterization of NUT midline carcinoma".

Names of other data producers/collaborators
Professor Young Seok Ju at KAIST-GSMSE is the main producer of this dataset. Collaborators include Professor Tae Min Kim at Seoul National University Hospital, and Drs. June-Koo Lee and Seongyeol Park at KAIST-GSMSE.

Specific limitations on areas of research and protection measures
The User and the User Institution should protect the confidentiality of research participants in any research papers or publications that they prepare by taking all reasonable care to limit the possibility of identification, especially for germline variants.
File access: Data can be held in unencrypted files on an institutional compute system, with Unix user group read/write access for one or more appropriate groups, but not Unix world read/write access, behind a secure firewall. Laptops holding these data should have password-protected logins and screenlocks (set to lock after 5 min of inactivity). If held on USB keys or other portable hard drives, the data must be encrypted.

APPENDIX II – PROJECT DETAILS (to be completed by the Requestor)

Details of dataset requested, i.e., EGA Study and Dataset Accession Number

Brief abstract of the Project in which the Data will be used

All individuals whom the User Institution wishes to be named as registered users
Name of Registered User    Email    Job Title    Supervisor*

All individuals that should have an account created at the EGA
Name of Registered User    Email    Job Title

APPENDIX III – PUBLICATION POLICY

In any publications based on these data, please describe how the data can be accessed, including the name of the hosting database (e.g., The European Genome-phenome Archive at the European Bioinformatics Institute) and its accession numbers (e.g., EGAS00001001934), and acknowledge its use in a form agreed by the User Institution with KAIST-GSMSE.
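
The file access measure in Appendix I (group read/write allowed, world read/write not allowed) can be checked mechanically on a POSIX system. The following is a minimal sketch in Python, not part of the agreement; the function names and the example directory are hypothetical, and it only inspects permission bits, not firewalls or encryption.

```python
import os
import stat


def world_accessible(path):
    """Return True if 'other' users have read or write permission on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def find_world_accessible(root):
    """Walk root and list entries that are world-readable or world-writable."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                if world_accessible(full):
                    flagged.append(full)
            except OSError:
                # Skip entries that cannot be inspected (e.g. broken symlinks).
                continue
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical location of the downloaded BAM files.
    for path in find_world_accessible("/data/ega_download"):
        print("world-accessible:", path)
```

An empty result suggests that no file or directory under the chosen root grants read or write access to users outside the owner and group. The root directory itself is not checked by this sketch.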

Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matching cancer normal genomes from patients.

Study ID Study Title Study Type
EGAS00001002801 Other

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID File Type Size Located in
EGAF00002396610 bam 174.7 GB
EGAF00002396611 bam 194.1 GB
EGAF00002396612 bam 175.0 GB
EGAF00002396613 bam 188.5 GB
EGAF00002396614 bam 182.4 GB
EGAF00002396615 bam 171.5 GB
EGAF00002396616 bam 529.7 GB
EGAF00002396617 bam 178.2 GB
EGAF00002396618 bam 321.9 GB
EGAF00002396619 bam 166.0 GB
EGAF00002396620 bam 188.4 GB
EGAF00002396621 bam 183.5 GB
EGAF00002396622 bam 189.6 GB
EGAF00002396623 bam 162.6 GB
EGAF00002396624 bam 306.8 GB
EGAF00002396625 bam 173.4 GB
EGAF00002396626 bam 171.7 GB
EGAF00002396627 bam 161.8 GB
EGAF00002396628 bam 207.2 GB
EGAF00002396629 bam 179.8 GB
EGAF00002396630 bam 184.6 GB
EGAF00002396631 bam 202.0 GB
EGAF00002396632 bam 147.6 GB
EGAF00002396633 bam 172.0 GB
EGAF00002396634 bam 144.5 GB
EGAF00002396635 bam 191.6 GB
EGAF00002396636 bam 189.8 GB
EGAF00002396637 bam 171.4 GB
EGAF00002396638 bam 187.8 GB
EGAF00002396639 bam 187.7 GB
EGAF00002396640 bam 165.5 GB
EGAF00002396641 bam 179.4 GB
EGAF00002396642 bam 166.1 GB
EGAF00002396643 bam 184.8 GB
EGAF00002396644 bam 152.8 GB
EGAF00002396645 bam 199.7 GB
EGAF00002396646 bam 322.8 GB
EGAF00002396647 bam 193.1 GB
EGAF00002396648 bam 164.7 GB
EGAF00002396649 bam 173.7 GB
EGAF00002396650 bam 156.9 GB
EGAF00002396651 bam 186.3 GB
EGAF00002396652 bam 346.6 GB
EGAF00002396653 bam 174.7 GB
EGAF00002396654 bam 152.4 GB
EGAF00002396655 bam 157.8 GB
EGAF00002396656 bam 205.1 GB
EGAF00002396657 bam 181.3 GB
EGAF00002396658 bam 545.8 GB
EGAF00002396659 bam 227.5 GB
EGAF00002396660 bam 189.9 GB
EGAF00002396661 bam 163.0 GB
EGAF00002396662 bam 303.3 GB
EGAF00002396663 bam 189.0 GB
EGAF00002396664 bam 351.9 GB
EGAF00002396665 bam 199.0 GB
EGAF00002396666 bam 150.9 GB
EGAF00002396667 bam 201.5 GB
EGAF00002396668 bam 334.3 GB
EGAF00002396669 bam 192.0 GB
EGAF00002396670 bam 229.1 GB
EGAF00002396671 bam 158.2 GB
EGAF00002396672 bam 227.9 GB
EGAF00002396673 bam 139.7 GB
EGAF00002396674 bam 221.9 GB
EGAF00002396675 bam 158.0 GB
EGAF00002396676 bam 224.8 GB
EGAF00002396677 bam 159.7 GB
EGAF00002396678 bam 205.5 GB
EGAF00002396679 bam 147.1 GB
EGAF00002396680 bam 174.2 GB
EGAF00002396681 bam 152.3 GB
EGAF00002396682 bam 184.2 GB
EGAF00002396683 bam 169.4 GB
EGAF00002396684 bam 177.3 GB
EGAF00002396685 bam 163.1 GB
EGAF00002396686 bam 213.2 GB
EGAF00002396687 bam 220.8 GB
EGAF00002396688 bam 234.7 GB
EGAF00002396689 bam 156.7 GB
EGAF00002396690 bam 214.8 GB
EGAF00002396691 bam 157.5 GB
EGAF00002396692 bam 236.3 GB
EGAF00002396693 bam 197.0 GB
EGAF00002396694 bam 218.4 GB
EGAF00002396695 bam 163.6 GB
EGAF00002396696 bam 217.5 GB
EGAF00002396697 bam 160.6 GB
EGAF00002396698 bam 229.5 GB
EGAF00002396699 bam 153.8 GB
EGAF00002396700 bam 206.1 GB
EGAF00002396701 bam 184.9 GB
EGAF00002396702 bam 210.2 GB
EGAF00002396703 bam 157.2 GB
EGAF00002396704 bam 231.8 GB
EGAF00002396705 bam 160.1 GB
EGAF00002396706 bam 218.5 GB
EGAF00002396707 bam 168.0 GB
98 Files (19.7 TB)
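
Before requesting a download, it can be useful to confirm the file count and total size from the listing above. The sketch below parses rows in the "ID File Type Size" layout shown in the table and sums the sizes. The function names are hypothetical, and it assumes decimal units (1 TB = 1000 GB), which the page does not state.

```python
import re

# One table row: file ID, file type, numeric size, unit.
ROW = re.compile(r"^(EGAF\d+)\s+(\w+)\s+([\d.]+)\s+(GB|TB)$")
# Assumption: decimal units; the listing does not say GB vs GiB.
UNIT_TO_GB = {"GB": 1.0, "TB": 1000.0}


def parse_file_table(text):
    """Return (file_id, file_type, size_in_gb) for each row that matches."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            file_id, file_type, size, unit = match.groups()
            rows.append((file_id, file_type, float(size) * UNIT_TO_GB[unit]))
    return rows


def summarize(rows):
    """Return (number_of_files, total_size_in_tb)."""
    total_gb = sum(size_gb for _, _, size_gb in rows)
    return len(rows), total_gb / 1000.0


if __name__ == "__main__":
    sample = """ID File Type Size Located in
EGAF00002396610 bam 174.7 GB
EGAF00002396611 bam 194.1 GB
EGAF00002396612 bam 175.0 GB
"""
    count, total_tb = summarize(parse_file_table(sample))
    print(f"{count} files, {total_tb:.2f} TB")
```

Header and footer lines such as "98 Files (19.7 TB)" do not match the row pattern and are skipped, so the whole listing can be pasted in unchanged.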