Test dataset: Sequence and variant data from public 1000 Genomes Project

This is a test dataset derived from public data of the 1000 Genomes Project. Its purpose is not to support any inference about cohort data or results, but to help bioinformaticians develop and test tools, and to help data consumers learn how to access information.

The dataset consists of 2508 samples from the 1000 Genomes Project (https://www.nature.com/articles/nature15393). Data for each sample (e.g. NA18534) can be accessed through the IGSR portal (e.g. https://www.internationalgenome.org/data-portal/sample/NA18534) or in the corresponding folder on the 1000 Genomes FTP site (e.g. http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/data/CHB/NA18534/exome_alignment/).

The dataset encompasses several types of data: Variant Call Format (VCF) files and their binary counterpart (BCF), both joint (e.g. ALL_chr22_20130502_2504Individuals.vcf.gz) and split (e.g. HG01775.chrY.vcf.gz); exome sequencing CRAM files (e.g. NA18534.GRCh38DH.exome.cram); and whole genome sequencing CRAM/BAM files (e.g. NA19239.cram).

Additionally, several files were sliced into shorter files to allow quick downloads. These are named using the format "{FILE-INFO}__{NUMBER-OF-READS}r__{CHR}.{START-COORDINATE}-{END-COORDINATE}.{FILETYPE}" (e.g. "HG01500.GRCh38DH__90r__3.10000-10500__4.10000-10500.cram"). These files can be downloaded directly with the EGA download client, PyEGA3 (https://github.com/EGA-archive/ega-download-client).
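The naming pattern for sliced files can be unpacked programmatically. The following is a minimal sketch, not part of the dataset's tooling: the names `parse_sliced_name`, `SlicedName`, and `Region` are hypothetical, and the code assumes that multiple regions are joined with a double underscore (as in the example filename) and that the only compound extension is "vcf.gz".

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


class SlicedName(NamedTuple):
    file_info: str
    n_reads: int
    regions: list
    filetype: str


# Assumption: compound extensions that must not be split at the last dot.
_COMPOUND_EXTENSIONS = ("vcf.gz",)
_REGION = re.compile(r"(?P<chrom>[^.]+)\.(?P<start>\d+)-(?P<end>\d+)")


def parse_sliced_name(name: str) -> SlicedName:
    """Split a sliced-file name into its documented components."""
    # Separate the file type from the rest of the name.
    for ext in _COMPOUND_EXTENSIONS:
        if name.endswith("." + ext):
            stem, filetype = name[: -len(ext) - 1], ext
            break
    else:
        stem, _, filetype = name.rpartition(".")

    # Fields are separated by double underscores.
    parts = stem.split("__")
    if len(parts) < 3 or not filetype:
        raise ValueError(f"not a sliced-file name: {name!r}")
    file_info, reads, *region_parts = parts

    reads_match = re.fullmatch(r"(\d+)r", reads)
    if reads_match is None:
        raise ValueError(f"bad read-count field: {reads!r}")

    regions = []
    for part in region_parts:
        region_match = _REGION.fullmatch(part)
        if region_match is None:
            raise ValueError(f"bad region field: {part!r}")
        regions.append(
            Region(
                region_match["chrom"],
                int(region_match["start"]),
                int(region_match["end"]),
            )
        )
    return SlicedName(file_info, int(reads_match.group(1)), regions, filetype)


parsed = parse_sliced_name(
    "HG01500.GRCh38DH__90r__3.10000-10500__4.10000-10500.cram"
)
print(parsed.file_info, parsed.n_reads, parsed.regions, parsed.filetype)
```

For the example filename, this yields a file-info field of "HG01500.GRCh38DH", a read count of 90, and two regions on chromosomes 3 and 4, each spanning coordinates 10000 to 10500.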

EGA Test Policy

...

Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matched cancer and normal genomes from patients.

Study ID Study Title Study Type
Other
Whole Genome Sequencing
Other

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID File Type Size Located in
EGAF00001753734 cram 45.1 GB
EGAF00001753735 crai 1.6 MB
EGAF00001753736 cram 38.2 GB
EGAF00001753737 crai 1.3 MB
EGAF00001753738 cram 38.4 GB
EGAF00001753739 crai 1.3 MB
EGAF00001753740 cram 34.8 GB
EGAF00001753741 crai 1.2 MB
EGAF00001753742 cram 44.1 GB
EGAF00001753743 crai 1.5 MB
EGAF00001753744 cram 48.3 GB
EGAF00001753745 crai 1.6 MB
EGAF00001753746 bam 143.5 GB
EGAF00001753747 bai 9.0 MB
EGAF00001753748 bam 4.3 GB
EGAF00001753749 bai 9.2 MB
EGAF00001753751 bai 9.2 MB
EGAF00001753752 bam 229.9 GB
EGAF00001753753 bai 9.4 MB
EGAF00001753754 bam 136.1 GB
EGAF00001753755 bai 9.0 MB
EGAF00001753756 bam 140.5 GB
EGAF00001753757 bai 9.0 MB
EGAF00001770106 bam 462.3 MB
EGAF00001770107 bam 3.6 GB
EGAF00001775034 bai 6.0 MB
EGAF00001775036 bai 4.8 MB
EGAF00005000662 vcf.gz 25.5 MB
EGAF00005000663 tbi 18.6 kB
EGAF00005000664 bcf 27.0 MB
EGAF00005000665 csi 14.5 kB
EGAF00005001623 vcf.gz 214.5 MB
EGAF00005001624 tbi 36.1 kB
EGAF00005001625 bcf 186.5 MB
EGAF00005001626 csi 27.6 kB
EGAF00005007180 cram 1.8 GB
EGAF00005007181 cram 2.9 GB
EGAF00005007323 vcf.gz 5.7 MB
EGAF00005007324 tbi 8.1 kB
EGAF00005007325 bcf 5.5 MB
EGAF00005007326 csi 6.3 kB
EGAF00005007327 vcf.gz 851.1 kB
EGAF00005007328 tbi 5.0 kB
EGAF00005007329 bcf 876.7 kB
EGAF00005007330 csi 4.7 kB
EGAF00005007331 crai 137.5 kB
EGAF00005007332 crai 229.4 kB
EGAF00007243773 1664408722194 194.9 kB
EGAF00007243774 1664408722194 135.2 kB
EGAF00007243775 1664408722194 23.0 kB
EGAF00007243776 1664408722194 2.0 kB
EGAF00007243777 1664408722194 122.5 kB
EGAF00007243778 1664408722194 112.3 kB
EGAF00007243779 1664408722194 15.4 kB
EGAF00007243780 1664408722194 2.0 kB
EGAF00007243781 1664408722194 27.1 kB
EGAF00007243782 1664408722194 29.6 kB
EGAF00007243783 1664408722194 95 Bytes
EGAF00007243784 1664408722194 92 Bytes
EGAF00007462299 1667557821299 16.0 GB
EGAF00007462300 1667557821299 7.4 GB
EGAF00007462301 1667565921046 69.7 GB
EGAF00007462302 1667563453328 34.9 GB
EGAF00007462303 1667560521104 8.2 GB
EGAF00007462304 1667557821299 4.2 GB
EGAF00007462305 1667562321090 68.9 GB
EGAF00007462306 1667561421091 37.0 GB
EGAF00007553557 1667492121465 16.8 GB
EGAF00007553559 1667478623405 2.4 GB
EGAF00007553560 1667477724543 2.7 MB
EGAF00007553561 1667477724543 434.2 kB
71 Files (1.2 TB)
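Rows in the listing above can be tallied by file type with a short script. This is a minimal sketch under stated assumptions: the helper names `parse_size` and `total_by_type` are hypothetical, the units are assumed to be decimal (1 kB = 1000 bytes), and only three rows from the table are included for illustration.

```python
from collections import defaultdict

# Assumption: sizes use decimal (SI) multipliers.
_UNITS = {"Bytes": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> float:
    """Convert a size such as '45.1 GB' into a number of bytes."""
    value, unit = text.rsplit(" ", 1)
    return float(value) * _UNITS[unit]


def total_by_type(rows):
    """Sum sizes per file type for rows of the form 'ID TYPE SIZE UNIT'."""
    totals = defaultdict(float)
    for row in rows:
        _file_id, file_type, size = row.split(maxsplit=2)
        totals[file_type] += parse_size(size)
    return dict(totals)


# A few rows copied from the table; extend with the full listing as needed.
rows = [
    "EGAF00001753734 cram 45.1 GB",
    "EGAF00001753735 crai 1.6 MB",
    "EGAF00007243783 1664408722194 95 Bytes",
]

for file_type, n_bytes in total_by_type(rows).items():
    print(f"{file_type}: {n_bytes / 10**9:.6f} GB")
```

Because the displayed sizes are rounded, a sum computed this way will only approximate the total reported for the dataset.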