
mRNA Sequencing


Messenger RNA (mRNA) is the RNA that carries information from DNA to the ribosome, the site of protein synthesis (translation) in the cell. The coding sequence of the mRNA determines the amino acid sequence of the protein that is produced. Eukaryotic mRNA sequencing, or mRNA-Seq for short, targets the mRNA (protein-coding RNA) of all kinds of eukaryotes.

mRNA-Seq uses next-generation sequencing (NGS) to reveal the presence and quantity of messenger RNA in a biological sample at a given moment, capturing the continuously changing cellular transcriptome. Novogene’s mRNA-Seq, based on state-of-the-art Illumina NovaSeq platforms with a paired-end 150 bp sequencing strategy, offers complete solutions for gene expression quantification and differential gene expression analysis among groups of samples, as well as for identification of novel transcripts, alternative splicing, and gene fusion events. Our experienced bioinformaticians work closely with customers to provide standard and customized data analysis and publication-ready results for species with and without a reference genome.


For medical research:

  • Pathological mechanism
  • Tumor-subtypes classification
  • Molecular markers
  • Human evolution
  • Drug target
  • Clinical diagnostics
  • Personal health care

For agricultural research:

  • Development
  • Adaptability
  • Agronomic traits
  • Crop evolution


  • Extensive experience, with tens of thousands of projects successfully completed and multiple articles published in journals with high Impact Factors.
  • Unsurpassed data quality with a guaranteed Q30 score ≥ 80%, exceeding Illumina’s official benchmarks.
  • Comprehensive data analysis using widely accepted mainstream software and a mature in-house pipeline to detect differential expression, discover novel transcripts, and make functional annotations.
  • Free, powerful Novogene in-house software that lets clients visualize data analysis results flexibly through a user-friendly interface.
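
The Q30 figure quoted above is the share of base calls with a Phred quality score of at least 30. As a minimal sketch of how such a percentage can be computed from FASTQ quality strings, assuming the Phred+33 encoding (the function name and the example strings are illustrative and not part of Novogene's pipeline):

```python
def q30_fraction(quality_strings, offset=33):
    """Return the fraction of bases whose Phred quality is >= 30.

    Assumes Phred+33 encoded quality strings (an assumption, not stated
    in the text above).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= 30:
                passing += 1
    return passing / total if total else 0.0


# Hypothetical quality strings: 'I' = Q40, 'F' = Q37, '#' = Q2.
example_quals = ["IIIIFFFF", "II##IIII"]
fraction = q30_fraction(example_quals)
print(f"Q30 = {fraction:.1%}")
print("meets 80% threshold:", fraction >= 0.80)
```

In practice the same count would be accumulated over every read in a sample before comparing against the 80% threshold.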

Sample Requirements

Library type: Eukaryotic RNA-Seq (cDNA library)

  • Total RNA: ≥ 0.4 μg; RNA Integrity Number (Agilent 2100) ≥ 6.8 (animal) or ≥ 6.3 (plant and fungus), with smooth baseline; OD260/280 = 1.8-2.2; OD260/230 ≥ 1.8
  • Total RNA (blood): ≥ 0.8 μg
  • Total RNA (single cell): ≥ 100 ng
  • Amplified cDNA (double-stranded): ≥ 100 ng; fragments between 400 bp and 5000 bp with main peak at ~2000 bp; OD260/280 = 1.8-2.0; OD260/230 ≥ 1.8

Library type: Eukaryotic RNA-Seq (strand-specific library)

  • Total RNA: ≥ 0.8 μg; RNA Integrity Number (Agilent 2100) ≥ 6.8 (animal) or ≥ 6.3 (plant and fungus), with smooth baseline; OD260/280 = 1.8-2.2; OD260/230 ≥ 1.8


Note: For detailed information, please contact us.

Sequencing Parameters and Analysis Contents

Sequencing Platform: Illumina NovaSeq 6000
Read Length: Paired-end 150 bp
Recommended Data Output:

  • ≥ 20 million read pairs per sample for species with a reference genome
  • ≥ 50 million read pairs per sample for species without a reference genome (de novo transcriptome assembly projects)

Standard Data Analysis:

  • Data Quality Control
  • Mapping to reference genome/assembled genome
  • Gene expression quantification, differential expression profiling, and enrichment analysis
  • Protein-Protein Interaction (PPI) analysis
  • Transcription factor functional annotation analysis
  • Oncogene functional annotation analysis
  • SNP & InDel analysis
  • Alternative splicing analysis
  • Fusion gene prediction (only for tumor samples and cancer cell lines)


Note: Sequencing depths and bioinformatic analysis requests can be customized based on the project needs. Please contact us for more information.
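
For planning purposes, the recommended read-pair counts can be converted into raw data volume as read pairs × 2 reads per pair × 150 bp per read. The short sketch below does that arithmetic; the function name is illustrative, and the result is raw sequenced bases before any quality filtering:

```python
def raw_bases(read_pairs, read_length=150):
    """Raw sequenced bases for a paired-end run: pairs x 2 reads x read length."""
    return read_pairs * 2 * read_length


recommendations = [
    ("with reference genome", 20_000_000),
    ("without reference genome (de novo)", 50_000_000),
]
for label, pairs in recommendations:
    print(f"{label}: {raw_bases(pairs) / 1e9:.1f} Gb per sample")
```

This gives 6 Gb per sample for the 20 million read-pair recommendation and 15 Gb per sample for the 50 million read-pair recommendation.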

Project Workflow

Workflow steps: Total RNA, Library Construction, Bioinformatics Analysis

Quality control checkpoints: Sample Quality Control, Library Quality Control, Data Quality Control

Armadillo repeat containing 12 promotes neuroblastoma progression through interaction with retinoblastoma binding protein 4


Neuroblastoma (NB), one of the most common malignant solid tumors in the pediatric population, arises from neural crest-derived cells and accounts for 15% of cancer-related mortality in childhood. Clinical outcomes remain poor in patients with high-risk NB. The mechanisms essential for the aggressiveness and progression of NB still warrant further investigation.

Sampling & Sequencing Strategy:

Sample Preparation
• Tumor cells with/without MYCN amplification

Sequencing Strategy
• Library preparation: RNA-seq library
• Sequencing: Illumina HiSeq X Ten


Figure 1. Ectopic expression of ARMC12 represses the expression of PRC2 downstream tumor suppressive genes in NB cells.

(A) Volcano plots (left panel), Venn diagram (middle panel), and heatmap (right panel) revealing the alteration of gene expression (fold change > 2.0, FDR < 0.05) in SH-SY5Y cells stably transfected with empty vector (mock) or ARMC12. In the heatmap, red indicates high expression and blue indicates low expression.
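
The caption's selection rule (fold change > 2.0, FDR < 0.05) can be sketched as a simple filter. This is an illustration only: it assumes the fold-change cutoff applies in either direction, that FDR means Benjamini-Hochberg adjusted p-values, and that group means are nonzero. The gene names and numbers are hypothetical and are not data from the study.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_de_genes(genes, fold_cutoff=2.0, fdr_cutoff=0.05):
    """genes: list of (name, mean_control, mean_treated, pvalue) tuples.

    Keeps genes whose fold change exceeds the cutoff in either direction
    (an assumption) and whose adjusted p-value is below the FDR cutoff.
    """
    fdr = benjamini_hochberg([g[3] for g in genes])
    hits = []
    for (name, ctrl, treated, _), q in zip(genes, fdr):
        log2fc = math.log2(treated / ctrl)  # assumes nonzero means
        if abs(log2fc) > math.log2(fold_cutoff) and q < fdr_cutoff:
            hits.append((name, round(log2fc, 2), q))
    return hits


# Hypothetical example values.
example_genes = [
    ("geneA", 10.0, 45.0, 0.0001),
    ("geneB", 20.0, 22.0, 0.40),
    ("geneC", 30.0, 6.0, 0.002),
    ("geneD", 15.0, 35.0, 0.20),
]
for name, log2fc, q in select_de_genes(example_genes):
    print(name, log2fc, f"{q:.4f}")
```

Here geneD passes the fold-change cutoff but not the FDR cutoff, so only geneA and geneC are reported.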


ARMC12 plays a crucial role in tumor progression and could be a potential therapeutic approach for NB. Mechanistically, ARMC12 physically interacts with retinoblastoma binding protein 4 (RBBP4) to facilitate the formation and activity of polycomb repressive complex 2, resulting in transcriptional repression of tumor suppressive genes.

Targeting epigenetic crosstalk as a therapeutic strategy for EZH2-aberrant solid tumors


Mutations or aberrant upregulation of EZH2 occur frequently in human cancers, yet the clinical benefits of EZH2 inhibitors (EZH2is) remain unsatisfactory and limited to certain hematological malignancies. Two questions therefore need answers: how EZH2is modulate global epigenetic signatures and, more importantly, how those insights can be translated into a better therapeutic strategy for using EZH2is in a variety of solid tumors.

Sampling & Sequencing Strategy:

Sample Preparation
• U2932, SMMC-7721 and Pfeiffer cells

Sequencing Strategy
• Library preparation: mRNA library, NEBNext Ultra™ RNA Library Prep Kit for Illumina
• Sequencing: Illumina platform


Figure 2. Feedback H3K27 Acetylation Change Drives Oncogenic Transcriptional Reprogramming

(A) GSEA of H3K27ac ChIP-seq data, RNA-seq data, and proteome data affected by EPZ-6438. The global heatmap shows the enriched pathways in the oncogenic signatures from the Molecular Signatures Database (MSigDB) for EPZ-6438 treatment compared with DMSO treatment in U2932, SMMC-7721, and Pfeiffer cell lines. Color indicates the FDR q value, and the darkest blue represents q ≥ 0.1 or N/A. (B and C) Venn diagrams showing the overlap of the statistically enriched pathways (FDR q < 0.05) among the insensitive cell lines (U2932, SMMC-7721) based on RNA-seq data (B) and proteome data (C), respectively.


Together, the epigenetic interplay revealed in this study enabled the authors to expand the therapeutic potential of EZH2is from hematological malignancies to solid tumors. The insights reveal that EZH2i causes crosstalk between H3K27me and H3K27ac, which leads to oncogene activation. This suggests that targeting this crosstalk could provide therapeutic promise.

mRNA and Small RNA Transcriptomes Reveal Insights into Dynamic Homoeolog Regulation of Allopolyploid Heterosis in Nascent Hexaploid Wheat


Nascent allohexaploid wheat may represent the initial genetic state of common wheat (Triticum aestivum), which arose as a hybrid between Triticum turgidum (AABB) and Aegilops tauschii (DD) followed by chromosome doubling, and which outcompeted its parents in growth vigor and adaptability. The molecular basis for this success remains unclear.

Sampling & Sequencing Strategy:

Sample Preparation
• Tissues of hexaploid wheat, Chinese Spring, Triticum turgidum, and Aegilops tauschii

Sequencing Strategy
• Library preparation: mRNA-seq and sRNA-seq libraries
• Sequencing: Illumina HiSeq 2000


Figure 3. Nonadditively Expressed Genes in Young Spikes of Nascent Allohexaploid Wheat.

(A) Genes differentially expressed in S3 progeny and their tetraploid (AABB) and diploid (DD) progenitors. Numbers close to the species (colored) represent upregulated genes compared with the neighboring species. Percentages indicate those among all expressed genes in young spikes. The total number of genes differentially expressed between two species is given (black).

(B) GO enrichment analysis of nonadditively expressed genes. Shown are significantly enriched GO terms (Fisher test FDR < 0.05). BP, biological process; MF, molecular function; CC, cellular component.
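
The Fisher test mentioned in panel (B) asks whether a GO term contains more genes from the list of interest than chance would predict. A minimal sketch of the one-sided version (the hypergeometric upper tail) follows; the choice of a one-sided test, the function name, and the counts are assumptions for illustration, and the resulting p-values would still need multiple-testing correction to obtain an FDR.

```python
from math import comb


def enrichment_pvalue(hits_in_term, hits_total, term_size, background_size):
    """One-sided Fisher exact (hypergeometric) p-value, P(X >= hits_in_term).

    hits_in_term    : genes of interest annotated with the GO term
    hits_total      : all genes of interest
    term_size       : background genes annotated with the GO term
    background_size : all background genes
    """
    upper = min(hits_total, term_size)
    denom = comb(background_size, hits_total)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, hits_total - k)
        for k in range(hits_in_term, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 40 of 500 genes of interest fall in a term that
# covers 300 of 20,000 background genes.
p = enrichment_pvalue(40, 500, 300, 20_000)
print(f"enrichment p-value = {p:.3e}")
```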


Allohexaploid wheat combines the AB genomes from tetraploid wheat with the D genome from Ae. tauschii, resulting in the union of genomes from varieties previously adapted to different environments and thus providing the potential for further adaptation to a wider range of growth environments. Overall, the molecular underpinnings established during the early allopolyploidization events laid the groundwork for the successful advent of common wheat.

Error Rate Distribution

The x-axis shows the base position along each sequencing read and the y-axis shows the base error rate.
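
One common way to obtain a per-position error rate is to convert each base's Phred quality score Q into an error probability, e = 10^(-Q/10), and average across reads at each position. The sketch below assumes that approach and Phred+33 encoded quality strings; neither detail is confirmed by the text above, and the function name is illustrative.

```python
def mean_error_rate_by_position(quality_strings, offset=33):
    """Average Phred-derived error probability at each read position.

    Assumes Phred+33 encoding and the relation e = 10 ** (-Q / 10).
    Reads may have different lengths.
    """
    sums = []
    counts = []
    for qual in quality_strings:
        for pos, ch in enumerate(qual):
            if pos == len(sums):
                sums.append(0.0)
                counts.append(0)
            sums[pos] += 10 ** (-(ord(ch) - offset) / 10)
            counts[pos] += 1
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical quality strings: 'I' = Q40, '5' = Q20.
rates = mean_error_rate_by_position(["II5", "I5"])
for pos, rate in enumerate(rates, start=1):
    print(f"position {pos}: {rate:.4%}")
```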

GC Content Distribution

The horizontal axis shows the position along the read and the vertical axis shows the percentage of each base at that position. Each color represents a different base type.
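
The values behind such a plot are per-position base percentages. A minimal sketch, assuming reads are plain strings over A, C, G, T, and N and that at least one read is supplied (the function name and example reads are illustrative):

```python
from collections import Counter


def base_percent_by_position(reads, alphabet="ACGTN"):
    """Percentage of each base at every read position.

    Returns one dict per position, mapping base -> percentage.
    Assumes a non-empty list of reads; reads may differ in length.
    """
    length = max(len(r) for r in reads)
    table = []
    for pos in range(length):
        counts = Counter(r[pos] for r in reads if pos < len(r))
        total = sum(counts.values())
        table.append({b: 100.0 * counts.get(b, 0) / total for b in alphabet})
    return table


# Hypothetical reads.
example_reads = ["ACGT", "AGGT", "TCGA", "ACCT"]
for pos, row in enumerate(base_percent_by_position(example_reads), start=1):
    gc = row["G"] + row["C"]
    print(f"position {pos}: GC = {gc:.0f}%")
```

For well-randomized libraries the A/T and G/C curves are expected to track each other closely along the read, so large separations are worth a closer look.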

Classification of raw reads

Mapping region


The x-axis shows the sample name and the y-axis shows log10(FPKM+1). Each box plot indicates the maximum, upper quartile, median, lower quartile, and minimum.
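
FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes a gene's fragment count by gene length and sequencing depth, and the plotted quantity is log10(FPKM+1). A minimal sketch of both steps, using the standard FPKM definition; the function names and input numbers are hypothetical:

```python
import math


def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def log_expression(fpkm_value):
    """log10(FPKM + 1); the +1 keeps unexpressed genes at 0 instead of -inf."""
    return math.log10(fpkm_value + 1)


# Hypothetical gene: 500 fragments, 2,000 bp long, 20 million mapped fragments.
value = fpkm(500, 2_000, 20_000_000)
print(f"FPKM = {value:.2f}, log10(FPKM+1) = {log_expression(value):.3f}")
```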

Volcano Plot

The horizontal axis shows the fold change in gene expression between samples. The vertical axis shows the statistical significance of the change in expression: the smaller the corrected p-value, the larger -log10(corrected p-value) and the more significant the difference. Each point represents a gene. Blue dots indicate genes with no significant difference, red dots indicate upregulated differentially expressed genes, and green dots indicate downregulated differentially expressed genes.
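
The coloring rule described for the volcano plot can be sketched as a small function that maps each gene to its plot coordinates and color. The cutoffs used here (|log2 fold change| ≥ 1 and corrected p-value < 0.05) are illustrative defaults, since the text does not state the thresholds, and the corrected p-value is assumed to be greater than zero.

```python
import math


def volcano_point(log2_fold_change, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return (x, y, color) for one gene on a volcano plot.

    x is the log2 fold change and y is -log10(corrected p-value).
    The cutoffs are hypothetical defaults; padj must be > 0.
    """
    y = -math.log10(padj)
    if padj < padj_cutoff and log2_fold_change >= lfc_cutoff:
        color = "red"      # significantly upregulated
    elif padj < padj_cutoff and log2_fold_change <= -lfc_cutoff:
        color = "green"    # significantly downregulated
    else:
        color = "blue"     # no significant difference
    return log2_fold_change, y, color


# Hypothetical genes: (log2 fold change, corrected p-value).
for lfc, padj in [(2.0, 0.001), (-1.5, 0.01), (0.3, 0.001), (3.0, 0.5)]:
    print(volcano_point(lfc, padj))
```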

Hierarchical Clustering Heatmap

Overall results of the FPKM cluster analysis, clustered using log10(FPKM+1) values. Red denotes genes with high expression levels and blue denotes genes with low expression levels; the color gradient from red to blue corresponds to log10(FPKM+1) values from high to low.