Great work led by Moran Cabili and Margaret Dunagin. A wonderful collaboration between the Rinn, Regev and Raj labs!
The advent of tiling microarrays and then deep sequencing has revealed that the cell contains many long transcripts that carry hallmarks of messenger RNA (splicing, polyadenylation, etc.) but have very low protein-coding potential. These long non-coding RNAs (lncRNAs) have many putative functions in the cell, including control of gene expression. However, the mechanisms underlying their behavior have often proven elusive.
One of the best-known long non-coding RNAs is Xist, which is involved in X chromosome dosage compensation in eutherians. Part of the reason we know something about how Xist works is direct imaging studies showing that Xist coats one copy of the X chromosome. Debates continue about how many Xist molecules are present in a cell, which has implications for mechanism (at how many sites is Xist active?), and this shows the value of absolute quantification and localization. Unfortunately, we do not have any such information for the majority of recently identified lncRNAs, and we have no idea how general the lessons learned from Xist are.
We wanted to develop a more systematic picture of lncRNA localization and abundance at the single cell level. Thus, we used our single molecule RNA FISH method, which enables detection of individual RNA molecules by fluorescence microscopy, to interrogate a panel of around 35 representative lncRNAs for localization and abundance in single cells.
For regular mRNA FISH, we have a very high success rate with our standard probe design software. However, lncRNAs often contain repetitive elements and other sequences that are not unique to the transcript. When an oligonucleotide targets such a region, it can bind off-target, creating spurious signals. We controlled for this by using two-color assays: if two differently colored probe sets targeting the same RNA colocalize, then they are most likely binding specifically. This eliminated a number of probe sets, and appears to be a critical validation step for RNA FISH on lncRNAs.
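To make the two-color check concrete, here is a minimal sketch of how one might score colocalization between spots detected in two channels. This is an illustration, not the analysis pipeline we actually used: the function name, the greedy nearest-neighbor matching, and the 2-pixel distance cutoff are all assumptions chosen for simplicity.

```python
import math

def colocalized_fraction(spots_a, spots_b, max_dist=2.0):
    """Fraction of channel-A spots that have a partner in channel B.

    spots_a, spots_b: lists of (x, y) spot centers in pixels.
    max_dist: hypothetical cutoff (pixels) for calling two spots colocalized.
    Each B spot can be matched at most once (simple greedy matching).
    """
    if not spots_a:
        return 0.0
    unused = list(spots_b)
    matched = 0
    for xa, ya in spots_a:
        best_i, best_d = None, max_dist
        for i, (xb, yb) in enumerate(unused):
            d = math.hypot(xa - xb, ya - yb)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is not None:
            unused.pop(best_i)  # consume the matched B spot
            matched += 1
    return matched / len(spots_a)

# Toy example: two of four A spots have a nearby B partner.
spots_a = [(10, 10), (20, 20), (30, 30), (50, 50)]
spots_b = [(10.5, 10), (20, 21), (80, 80)]
fraction = colocalized_fraction(spots_a, spots_b)
```

A probe set whose spots mostly lack a partner in the other color would be a candidate for off-target binding; where to draw that line is a judgment call that depends on detection efficiency in each channel.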
We performed RNA FISH with probe sets for our panel across 3 cell types: HeLa, human foreskin fibroblasts and human lung fibroblasts. We found a wide range of expression levels and patterns, from some lncRNAs expressed at many hundreds of molecules or more per cell (MALAT1) to a large number expressed at just a few molecules per cell. Of course, the power of our technique is that it showed us where these lncRNAs were, not just how many there were. We found that, overall, they were much more nuclear-biased than mRNAs.
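As a rough illustration of what "nuclear-biased" means quantitatively, one could compute the fraction of spots in a cell that fall inside the nucleus. The sketch below assumes spots are given as integer pixel coordinates and the nucleus as a boolean mask (for example from a DAPI segmentation); both the function name and the data layout are hypothetical.

```python
def nuclear_fraction(spots, nuclear_mask):
    """Fraction of RNA spots that lie inside the nucleus.

    spots: list of (row, col) integer pixel coordinates.
    nuclear_mask: 2D list of booleans, True where the pixel is nuclear
                  (assumed to come from a separate segmentation step).
    Returns NaN when the cell has no spots.
    """
    if not spots:
        return float("nan")
    in_nucleus = sum(1 for r, c in spots if nuclear_mask[r][c])
    return in_nucleus / len(spots)

# Toy 4x4 image with a 2x2 "nucleus" in the center.
mask = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
]
frac = nuclear_fraction([(1, 1), (2, 2), (0, 0), (3, 3)], mask)
```

Comparing this fraction between lncRNAs and mRNAs in the same cells is one simple way to express the nuclear bias described above.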
Overall, we found that the vast majority of our lncRNA localizations could be described as a combination of three underlying patterns: bright foci (most likely at the site of transcription), mono-disperse nuclear RNA, and mono-disperse cytoplasmic RNA. We found examples consisting of each of these independently, and examples in which they all appeared in combination. Our pictures suggest a model in which lncRNA can accumulate at the transcription site, then may diffuse away from there and sometimes make it to the cytoplasm. Note that one can find these patterns in mRNAs as well, but nuclear blobs for mRNAs are typically much less bright, especially when normalized for the total number of mRNAs floating around in the cell, which is typically much larger than for lncRNA.
The generally low abundance of most lncRNAs has led many researchers to wonder how they could exert their function at such low copy numbers. One hypothesis in the field is that while most cells may have few or zero copies of a given lncRNA, occasional rare cells might have very large numbers, thus allowing the lncRNA to play its functional role in those cells. We performed an extensive single cell analysis of lncRNA expression, and while we observed some variability, none of the lncRNAs showed variability beyond that of typical mRNAs, and we found no evidence for rare cells with unusually high abundance.
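One simple way to look for the "rare high-expressing cell" scenario is to summarize per-cell counts with a noise measure and count cells far above the median. The sketch below is only an illustration under assumed choices: the coefficient of variation and Fano factor as noise measures, and a 10-fold-over-median cutoff for a "jackpot" cell are not values taken from our analysis.

```python
import statistics

def variability_summary(counts, outlier_fold=10.0):
    """Summarize cell-to-cell variability in RNA counts.

    counts: list of per-cell molecule counts for one RNA.
    outlier_fold: hypothetical fold-over-median cutoff for a "jackpot" cell.
    """
    mean = statistics.fmean(counts)
    sd = statistics.pstdev(counts)  # population standard deviation
    cv = sd / mean if mean > 0 else float("nan")
    fano = sd ** 2 / mean if mean > 0 else float("nan")
    # Use the median (floored at 1) so a rare outlier cannot inflate its own cutoff.
    threshold = outlier_fold * max(statistics.median(counts), 1.0)
    jackpot_cells = sum(1 for c in counts if c > threshold)
    return {"mean": mean, "cv": cv, "fano": fano, "jackpot_cells": jackpot_cells}

# Low, fairly uniform expression: no jackpot cells.
uniform = variability_summary([2, 3, 1, 4, 2, 3, 0, 5])
# Mostly one molecule per cell, with one cell at 200: one jackpot cell.
skewed = variability_summary([1] * 9 + [200])
```

Under the rare-cell hypothesis one would expect distributions like the second toy example; running the same summary on mRNAs of similar abundance gives a baseline for what counts as ordinary variability.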
Many lncRNAs are located near coding genes but are transcribed in the opposite direction. We wondered whether this physical proximity would lead to any particular transcriptional associations. We found that a couple of the lncRNAs showed a correlation with their neighboring gene at the single cell level, but most did not, suggesting (though not proving) that there is no general relationship between the transcription of the lncRNA and the proximal coding gene.
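For a sense of what a single cell correlation looks like in practice, here is a minimal sketch that computes the Pearson correlation between lncRNA counts and neighboring mRNA counts measured in the same cells. The choice of Pearson correlation and the toy numbers are assumptions for illustration, not the specific statistic or data from our analysis.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of per-cell counts."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 cells")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined if either RNA is constant
    return sxy / math.sqrt(sxx * syy)

# Toy data: counts of a lncRNA and its neighboring mRNA in the same four cells.
lnc_counts = [1, 2, 3, 4]
mrna_counts = [2, 4, 6, 8]
r = pearson_r(lnc_counts, mrna_counts)
```

Because both RNAs are counted in the same cells, a value near zero across many cells argues against tight co-regulation, while a strongly positive or negative value would flag a pair worth a closer look.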