Allium cepa analysis

This page reports on the reanalysis of an onion RAD-seq dataset.

Allium cepa RAD-seq dataset reanalysis

Reanalysis of the dataset from Lee et al. (2018) [3] using Stacks and a reference genome. The genome was not used in [3] because it was published well after Lee et al. published their RAD-seq analysis.

Context

This reanalysis was done to test some software before using it on a dataset that I expect to receive. The main focus is to try to find population structure in these datasets. I also wanted to study some population genetics concepts here. This is not necessarily a state-of-the-art reanalysis of the dataset.

Data description and collection

Raw reads

The raw reads consist of 192 Illumina paired-end (PE) libraries. Each library corresponds to one individual from one of four Korean companies [3]. The reads were retrieved with SRA-Explorer [1], using the SRA (Sequence Read Archive) accession code SRP150117.
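
Since every library is paired-end, it helps to confirm that both mate files arrived for each run before going further. The following is a minimal sketch of such a check. It assumes the downloaded files are named `<run>_1.fastq.gz` and `<run>_2.fastq.gz`; that naming scheme and the function name `pair_reads` are assumptions for illustration, not something stated on this page.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <run>_1.fastq.gz and <run>_2.fastq.gz
PAIR_PATTERN = re.compile(r"^(?P<run>[A-Za-z0-9]+)_(?P<mate>[12])\.fastq\.gz$")


def pair_reads(filenames):
    """Group FASTQ file names by run and return {run: (mate1, mate2)}.

    Files that do not match the assumed naming scheme are ignored.
    Raises ValueError if any run is missing one of its two mate files.
    """
    mates = defaultdict(dict)
    for name in filenames:
        match = PAIR_PATTERN.match(Path(name).name)
        if match is None:
            continue
        mates[match["run"]][match["mate"]] = name

    incomplete = sorted(run for run, found in mates.items() if set(found) != {"1", "2"})
    if incomplete:
        raise ValueError(f"runs missing a mate file: {incomplete}")

    return {run: (found["1"], found["2"]) for run, found in sorted(mates.items())}
```

With the full download in one directory, `len(pair_reads(os.listdir(reads_dir)))` should equal 192 if every library is complete (`reads_dir` is a hypothetical path).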

Reference genome

The reference genome was downloaded from NCBI using the assembly accession code GCA_905187595.1. The associated paper is [2].

Software used

  • Stacks V2.64
  • fastp V0.23.4
  • Samtools
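
As an illustration of how the read-cleaning step could be scripted, the sketch below builds a fastp command line for one paired-end library. The options used (`--in1`, `--in2`, `--out1`, `--out2`, `--json`, `--html`, `--thread`) are standard fastp options, but the output directory, the file naming and the thread count are assumptions, and no filtering settings from the actual run are implied.

```python
import shlex


def fastp_command(sample, r1, r2, out_dir="cleaned", threads=4):
    """Return the fastp argument list for one paired-end library.

    out_dir, the output file names and threads are illustrative defaults.
    """
    return [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--out1", f"{out_dir}/{sample}.1.fq.gz",
        "--out2", f"{out_dir}/{sample}.2.fq.gz",
        "--json", f"{out_dir}/{sample}.fastp.json",
        "--html", f"{out_dir}/{sample}.fastp.html",
        "--thread", str(threads),
    ]


if __name__ == "__main__":
    # "sample_01" and the input file names are hypothetical
    cmd = fastp_command("sample_01", "sample_01_1.fastq.gz", "sample_01_2.fastq.gz")
    print(shlex.join(cmd))
```

Returning a list rather than a string means the command can be passed directly to `subprocess.run(cmd, check=True)` without shell quoting problems.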

Pipeline run

Reference-guided genotyping
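
A minimal sketch of how this step could be driven with the Stacks `ref_map.pl` wrapper is given below. It assumes the reads have already been aligned to the reference genome and that one sorted BAM file per sample sits in a single directory; the aligner is not named on this page, so that step is left out. The population labels, paths and thread count are hypothetical. The population map is the tab-separated sample/population file that Stacks reads.

```python
def popmap_text(sample_to_pop):
    """Return Stacks population map text: one 'sample<TAB>population' line per sample."""
    return "".join(f"{sample}\t{pop}\n" for sample, pop in sorted(sample_to_pop.items()))


def ref_map_command(bam_dir, popmap, out_dir, threads=8):
    """Return the ref_map.pl argument list (paths and threads are illustrative)."""
    return [
        "ref_map.pl",
        "--samples", bam_dir,   # directory with one sorted BAM per sample
        "--popmap", popmap,     # tab-separated population map
        "-o", out_dir,          # output directory, must already exist
        "-T", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical samples and population labels, for illustration only
    print(popmap_text({"sample_01": "company_A", "sample_02": "company_B"}), end="")
    print(" ".join(ref_map_command("aligned", "popmap.tsv", "stacks_ref")))
```

Grouping samples by company in the population map is one possible choice here, since the company of origin is the only grouping described for this dataset.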

De novo genotyping
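
For the reference-free route, Stacks provides the `denovo_map.pl` wrapper. The sketch below only shows how such a call could be assembled for paired-end data. The values of `-M` and `-n`, the paths and the thread count are placeholders, not the settings used in this reanalysis. Trying several values of `-M` with `-n` set equal to it is a common way to explore these parameters, and the helper `parameter_sweep` is a hypothetical convenience for that.

```python
def denovo_map_command(reads_dir, popmap, out_dir, M=2, n=2, threads=8):
    """Return the denovo_map.pl argument list for paired-end reads.

    M and n are placeholder values, not tuned settings.
    """
    return [
        "denovo_map.pl",
        "--samples", reads_dir,
        "--popmap", popmap,
        "-o", out_dir,
        "--paired",
        "-M", str(M),   # mismatches allowed between stacks within an individual
        "-n", str(n),   # mismatches allowed between loci when building the catalog
        "-T", str(threads),
    ]


def parameter_sweep(reads_dir, popmap, out_prefix, values=(1, 2, 3, 4)):
    """Return one command per M value, with n = M and a separate output directory."""
    return [
        denovo_map_command(reads_dir, popmap, f"{out_prefix}_M{m}", M=m, n=m)
        for m in values
    ]


if __name__ == "__main__":
    for cmd in parameter_sweep("cleaned", "popmap.tsv", "stacks_denovo"):
        print(" ".join(cmd))
```

Each output directory has to exist before the corresponding command is run.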

References

[1] Ewels, P. SRA-explorer. http://sra-explorer.info/.
[2] Finkers, R. et al. 2021. Insights from the first genome assembly of onion (Allium cepa). G3. 11, 9 (2021), jkab243.
[3] Lee, J.-H. et al. 2018. SNP discovery of Korean short day onion inbred lines using double digest restriction site-associated DNA sequencing. PLoS ONE. 13, 8 (2018), e0201229.