For analysis of global gene expression, RNA-seq generates short reads from fragmented RNA molecules, and the number of reads is proportional to the abundance and length of the transcripts. Methods for normalization of RNA-sequencing gene expression data commonly assume equal total expression between compared samples. Here, the performance of three normalization methods (BSN, RPM and TMM) is compared during early zebrafish developmental stages, when polyA+ RNA content fluctuates significantly. The results suggest that reads per kilobase per million (RPKM) and trimmed mean of M-values (TMM) normalization systematically lead to biased gene expression estimates.
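The dependence of read counts on transcript length and sequencing depth can be illustrated with a minimal RPKM sketch. The function name, argument names, and toy numbers below are illustrative assumptions and are not taken from the study.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads.

    counts: dict mapping gene -> raw read count for one sample
    lengths_bp: dict mapping gene -> transcript length in base pairs
    """
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample has no mapped reads")
    # count / (length in kb) / (total reads in millions)
    return {
        gene: count * 1e9 / (total_reads * lengths_bp[gene])
        for gene, count in counts.items()
    }


# Toy data: geneB is three times longer and collects three times the reads,
# so both genes end up with the same RPKM value.
toy_counts = {"geneA": 500, "geneB": 1500}
toy_lengths = {"geneA": 1000, "geneB": 3000}
toy_rpkm = rpkm(toy_counts, toy_lengths)
```

Because each value is divided by the sample's own total read count, a global rise or fall in mRNA content between samples cancels out in this calculation, which is the assumption of equal total expression described above.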
In zebrafish embryos, about 70% of maternal transcripts undergo cytoplasmic polyadenylation prior to the onset of zygotic transcription, leading to a 50–70% increase in the amount of polyA+ RNA recovered.
The new method, called biological scaling normalization (BSN), uses experimentally measured polyA+ RNA amounts as scaling factors to normalize different developmental stages. BSN normalized gene expression data more accurately than RPM and TMM during developmental periods that display global increases and decreases in mRNA content. The approach is applicable in a range of settings, including comparisons of gene expression during development, disease, and tissue- and cell-type specification.
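One way to picture the scaling step is sketched below. It assumes that biological scaling amounts to multiplying depth-normalized expression values by the polyA+ RNA amount measured at each stage, relative to a chosen reference stage. The exact formula used for BSN is not given here, so the function, its arguments, and the numbers are hypothetical.

```python
def biological_scaling(normalized, polya_amount, reference_stage):
    """Rescale per-stage expression by relative polyA+ RNA content.

    normalized: dict mapping stage -> dict of gene -> depth-normalized value
    polya_amount: dict mapping stage -> measured polyA+ RNA amount
                  (any unit, as long as it is the same for all stages)
    reference_stage: stage whose scale factor is defined as 1.0
    """
    ref = polya_amount[reference_stage]
    if ref <= 0:
        raise ValueError("reference polyA+ amount must be positive")
    scaled = {}
    for stage, genes in normalized.items():
        factor = polya_amount[stage] / ref  # assumed scale factor
        scaled[stage] = {gene: value * factor for gene, value in genes.items()}
    return scaled


# Hypothetical input: identical depth-normalized values at two stages,
# but 60% more polyA+ RNA measured at the later stage.
expr = {"early": {"geneA": 100.0}, "late": {"geneA": 100.0}}
polya = {"early": 1.0, "late": 1.6}
scaled_expr = biological_scaling(expr, polya, "early")
```

In this toy case the later-stage value comes out about 1.6 times the early-stage value, reflecting the global increase in polyA+ RNA that depth-based normalization alone would hide.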