Data from: First insights into the transcriptome and development of new genomic tools of a widespread circum-Mediterranean tree species, Pinus halepensis Mill.

  1. Pinosio, Sara 1
  2. González-Martínez, Santiago C. 2
  3. Bagnoli, Francesca 3
  4. Cattonaro, Federica 4
  5. Grivet, Delphine 5
  6. Marroni, Fabio 6
  7. Lorenzo, Zaida 5
  8. Pausas, Juli G. 7
  9. Verdú, Miguel 8
  10. Vendramin, Giovanni G. 1
  1. 1 Institute of Biosciences and Bioresources

    Bari, Italy

    ROR https://ror.org/01gtsa866

  2. 2 University of Lausanne

    Lausanne, Switzerland

    ROR https://ror.org/019whta54

  3. 3 Plant Protection Institute

    Budapest, Hungary

    ROR https://ror.org/052t9a145

  4. 4 Istituto di Genomica Applicata

    Udine, Italy

    ROR https://ror.org/057w9fs93

  5. 5 National Agriculture and Food Research Organization

    Tsukuba, Japan

    ROR https://ror.org/023v4bd62

  6. 6 Università di Udine

    Udine, Italy

    ROR https://ror.org/05ht0mh31

  7. 7 Consejo Superior de Investigaciones Científicas

    Madrid, Spain

    ROR https://ror.org/02gfc7t72

  8. 8 National Research Council

Publisher: Dryad

Publication year: 2014

Type: Dataset

Abstract

Pinus halepensis is a relevant conifer species for studying adaptive responses to drought and fire regimes in the Mediterranean region. Deciphering the molecular basis of Aleppo pine adaptation to the Mediterranean environment is therefore needed. In this study we performed Illumina RNA-Seq of two Pinus halepensis accessions that are phenotypically divergent for adaptive traits linked to fire adaptation and drought, with the aims of i) characterizing the transcriptome, ii) performing a functional annotation of the assembled transcriptome, iii) identifying genes with accelerated evolutionary rates, iv) studying the expression levels of the annotated genes, and v) developing gene-based markers for population genomic and association genetic studies. The assembled transcriptome consisted of 48,629 contigs and covered about 54.6 Mbp. The comparison of P. halepensis transcripts to Picea sitchensis protein-coding sequences resulted in the detection of 34,014 SNPs across species, with an average Ka/Ks value of 0.216, suggesting that the majority of the assembled genes are under negative selection. Among the assembled genes, those involved in protein synthesis were over-represented in expression. Several genes were differentially expressed between the two pine accessions with contrasting phenotypes, including glutathione S-transferase, cellulose synthase and the cobra-like protein. A large number of new markers (8,248 SSRs and 28,236 SNPs) have been identified, which should facilitate future population genomics and association genetics in this species. Our results show that Illumina next-generation sequencing is a valuable technology for obtaining an extensive overview of whole transcriptomes of non-model species with large genomes.
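
The step from an average Ka/Ks of 0.216 to "negative selection" follows the conventional reading of the ratio: values well below 1 indicate that non-synonymous changes are removed faster than synonymous ones. The minimal Python sketch below illustrates that reading only. It is not the pipeline used to produce this dataset; the function names, the `tolerance` band around 1, the choice to average per-gene ratios, and the example (Ka, Ks) pairs are all assumptions made for illustration.

```python
from statistics import mean


def classify_ka_ks(ratio: float, tolerance: float = 0.05) -> str:
    """Conventional interpretation of a Ka/Ks ratio.

    `tolerance` is a hypothetical band around 1 treated as neutral;
    it is an illustrative choice, not a value from the study.
    """
    if ratio < 1 - tolerance:
        return "negative (purifying) selection"
    if ratio > 1 + tolerance:
        return "positive selection"
    return "approximately neutral"


def mean_ka_ks(pairs):
    """Mean of per-gene Ka/Ks ratios, skipping genes with Ks == 0.

    Averaging per-gene ratios is an assumption; the abstract does not
    say how the average was computed.
    """
    ratios = [ka / ks for ka, ks in pairs if ks > 0]
    return mean(ratios)


# Hypothetical per-gene (Ka, Ks) values, NOT taken from the dataset.
example_pairs = [(0.02, 0.10), (0.05, 0.20), (0.03, 0.15)]

average = mean_ka_ks(example_pairs)
print(f"mean Ka/Ks = {average:.3f}: {classify_ka_ks(average)}")

# The average reported in the abstract falls in the same class.
print(f"reported 0.216: {classify_ka_ks(0.216)}")
```

Genes with Ks equal to zero are skipped here because the ratio is undefined for them; a real analysis would need an explicit policy for such cases.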