Here you can download phenotypic and genotypic data from the database in bulk.
All data are archived in the digital library of the Technical University of Munich under the following DOI: https://doi.org/10.14459/2022mp1649709.
All data can also be downloaded through the HeliantHOME REST API. All REST endpoints are listed in the REST API documentation.
Phenotype Data

All Phenotypes for wild sunflower populations
www.helianthome.org/rest/population/phenotype_matrix/wild.csv
www.helianthome.org/rest/population/phenotype_matrix/wild.json

All Phenotypes for cultivated SAM sunflower populations
www.helianthome.org/rest/population/phenotype_matrix/sam.csv
www.helianthome.org/rest/population/phenotype_matrix/sam.json

For a specific wild sunflower population by population id
A single wild population can be downloaded by its population ID through the REST API. The following example downloads all phenotypes for the population ANN_01:
www.helianthome.org/rest/population/phenotype_matrix/ANN_01.csv
www.helianthome.org/rest/population/phenotype_matrix/ANN_01.json
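
The endpoints above all follow the same pattern, so a download can be scripted. The following is a minimal Python sketch, not an official client: the `https://` scheme is an assumption (the URLs above are listed without one), and the helper names `phenotype_matrix_url` and `download_phenotypes` are hypothetical.

```python
from urllib.request import urlopen

# Assumption: the endpoints are served over HTTPS; the page lists them without a scheme.
BASE_URL = "https://www.helianthome.org/rest/population/phenotype_matrix"


def phenotype_matrix_url(name: str, fmt: str = "csv") -> str:
    """Build the URL for a phenotype matrix.

    `name` is "wild", "sam", or a population ID such as "ANN_01".
    `fmt` is "csv" or "json", the two formats listed above.
    """
    if fmt not in ("csv", "json"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{name}.{fmt}"


def download_phenotypes(name: str, fmt: str = "csv", timeout: float = 30.0) -> bytes:
    """Fetch the phenotype matrix and return the raw response body."""
    with urlopen(phenotype_matrix_url(name, fmt), timeout=timeout) as response:
        return response.read()
```

For example, `download_phenotypes("ANN_01")` would request the CSV matrix for population ANN_01, and `download_phenotypes("wild", "json")` the JSON matrix for all wild populations.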

Imaging Data

High Quality Imaging Data

All high-quality images for 1449 wild sunflower individuals as a single TAR archive
HighQualityImagesWildIndividuals.tar

You can also download images for a single individual only. For this purpose, you can use the following URL endpoint:
www.helianthome.org/data/individuals/ANN1208.tar.gz

Replace the individual ID (here ANN1208) with any other individual ID.
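
The per-individual image archives can be fetched in the same way. This is a minimal Python sketch under stated assumptions: the host `www.helianthome.org` and the `https://` scheme are assumed to match the REST endpoints, and the helper names `individual_archive_url` and `download_individual_images` are hypothetical. The archive is streamed to disk rather than read into memory, since image archives can be large.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Assumption: same host as the REST endpoints, served over HTTPS.
IMAGE_BASE_URL = "https://www.helianthome.org/data/individuals"


def individual_archive_url(individual_id: str) -> str:
    """Build the archive URL for one individual, e.g. "ANN1208"."""
    return f"{IMAGE_BASE_URL}/{individual_id}.tar.gz"


def download_individual_images(
    individual_id: str, target_dir: str = ".", timeout: float = 60.0
) -> Path:
    """Stream the .tar.gz archive for one individual into target_dir."""
    target = Path(target_dir) / f"{individual_id}.tar.gz"
    with urlopen(individual_archive_url(individual_id), timeout=timeout) as response, \
            open(target, "wb") as out:
        # Copy in chunks so the whole archive never has to fit in memory.
        shutil.copyfileobj(response, out)
    return target
```

Calling `download_individual_images("ANN1208")` would save `ANN1208.tar.gz` in the current directory; looping over a list of individual IDs downloads several archives in turn.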

Genotype Data (external data)

Links to VCFs called on the Ha412HOv2.0 reference genome

A hard filter was applied to retain only bi-allelic SNPs within the 90% tranche, with MAF > 0.01 and genotype rate > 50%. An additional filter of MAF > 0.01 and genotype rate > 50% was also applied to each subset.
Imputed datasets were generated with Beagle on all available samples for each species, and the same filter was applied to each subset. (The SNP set differs slightly from the un-imputed one.)
The VCFs with perennials were filtered with VQSR using the same gold set as the corresponding intra-species VCF, to keep as many consistent SNPs as possible. To accommodate different requirements, multi-allelic SNPs are kept and filtered as follows: rare alleles with AF < 0.01 are removed while other alleles at the same site are kept; indels overlapping with a SNP are set as missing; sites that remain variable after filtering and have > 50% of samples called are kept. Downstream filtering based on the genotypes (e.g. SNPs with no more than one missing genotype in the perennial outgroup, SNPs fixed between the target species and perennials, or SNPs that are variable within the target species but fixed in the perennial outgroup) should be conducted according to one's purpose.
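
To illustrate what the MAF and genotype rate thresholds mentioned above mean for a single site, here is a small Python sketch. It is not the pipeline used to produce the VCFs (the filtering tools are not named here). It assumes diploid samples at a bi-allelic site, with each genotype encoded as the number of alternate alleles (0, 1 or 2) and `None` for a missing call. All function names are hypothetical.

```python
from typing import Optional, Sequence

Dosage = Optional[int]  # 0, 1 or 2 alternate alleles; None = missing genotype


def genotype_rate(dosages: Sequence[Dosage]) -> float:
    """Fraction of samples with a called (non-missing) genotype."""
    if not dosages:
        return 0.0
    called = [d for d in dosages if d is not None]
    return len(called) / len(dosages)


def minor_allele_frequency(dosages: Sequence[Dosage]) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_filter(
    dosages: Sequence[Dosage], min_maf: float = 0.01, min_rate: float = 0.5
) -> bool:
    """Keep a site only if MAF > min_maf and genotype rate > min_rate."""
    return (
        genotype_rate(dosages) > min_rate
        and minor_allele_frequency(dosages) > min_maf
    )
```

A site with genotypes `[0, 1, 2, None]` passes (genotype rate 0.75, MAF 0.5), while a monomorphic site such as `[0, 0, 0, 0]` or a mostly missing site such as `[1, None, None, None]` is dropped. Re-applying the check after subsetting samples mirrors the additional per-subset filter described above.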


Links to Datasets

Annuus:

Argophyllus:

Petiolaris:
GWAS Results (external data available at easyGWAS)
Legal Notice
Contact Information
Prof. Dr. Dominik Grimm
Professorship of Bioinformatics

TUM Campus Straubing for Biotechnology and Sustainability (TUMCS)
Weihenstephan-Triesdorf University of Applied Sciences (HSWT)
Petersgasse 18
93415 Straubing
Germany

dominik.grimm [at] hswt.de

Legal Information
Type of business: Research Institute

Disclaimer
This website provides information about public research data. We make no guarantees as to the accuracy, completeness or timeliness of the information on this website. We therefore accept no responsibility or liability for damages or losses resulting from the use of this website. This website provides links to other internet sites for the convenience of users. We are not responsible for the availability or content of these external sites, nor do we endorse, warrant, or guarantee any commercial product, service, site, law firm, attorney or information described or offered at these other internet sites.