An integrative proteogenomics approach reveals peptides encoded by annotated lincRNA in the mouse kidney inner medulla.
Physiological Genomics (IF 3.107), Pub Date: 2020-08-31, DOI: 10.1152/physiolgenomics.00048.2020
Cameron T Flower, Lihe Chen, Hyun Jun Jung, Viswanathan Raghuram, Mark A Knepper, Chin-Rang Yang

Long noncoding RNAs (lncRNAs) are intracellular transcripts longer than 200 nucleotides that lack the capacity to encode protein. A subclass of lncRNAs known as long intergenic noncoding RNAs (lincRNAs) is transcribed from genomic regions that share no overlap with annotated protein-coding genes. Increasing evidence shows that some annotated lincRNA transcripts do in fact contain open reading frames (ORFs) encoding functional short peptides in the cell. Few robust methods for lincRNA-encoded peptide identification have been reported, and the tissue-specific expression of these peptides remains largely unexplored. Here we propose an integrative workflow for lincRNA-encoded peptide discovery and test it on the mouse kidney inner medulla (IM). In brief, low-molecular-weight protein fractions were enriched from IM homogenate and trypsinized into shorter peptides, which were characterized using high-resolution liquid chromatography-tandem mass spectrometry (LC-MS/MS). The challenge is to curate a hypothetical lincRNA-encoded peptide database for peptide-spectrum matching following LC-MS/MS. We performed RNA-Seq on IM, computationally filtered out reads overlapping annotated protein-coding genes, and re-mapped the remaining reads to a database of mouse noncoding transcripts. The mapped transcripts, which are likely to be lincRNAs, were further searched for ORFs using an existing rule-based algorithm, yielding the candidate sequences for peptide-spectrum matching. Peptides identified by LC-MS/MS were further evaluated using several quality control criteria and bioinformatics methods. We discovered three novel lincRNA-encoded peptides, which are conserved in mouse, rat, and human. The workflow can be adapted for discovery of small protein-coding genes in any species or tissue where noncoding transcriptome information is available.
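The database-curation step described above (transcripts searched for ORFs, ORFs translated, products digested with trypsin) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not name the rule-based ORF algorithm, so the rules here (forward-strand frames only, ATG start, in-frame stop, a minimum length) and the common "cleave after K or R unless followed by P" trypsin rule are assumptions, and the function names `find_orfs` and `tryptic_digest` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_orfs(transcript, min_aa=10):
    """Translate ATG-to-stop ORFs in the three forward frames.

    Assumed rules (not taken from the paper): the transcript is given
    as the sense strand, an ORF starts at the first ATG after the
    previous stop, must end at an in-frame stop codon, and must encode
    at least `min_aa` residues.
    """
    seq = transcript.upper().replace("U", "T")
    peptides = []
    for frame in range(3):
        current = None  # residues of the ORF being extended, if any
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X = ambiguous codon
            if current is None:
                if aa == "M":
                    current = ["M"]
            elif aa == "*":
                if len(current) >= min_aa:
                    peptides.append("".join(current))
                current = None
            else:
                current.append(aa)
    return peptides


def tryptic_digest(protein, min_len=6):
    """In silico trypsin digest: cleave after K or R unless followed by P.

    Missed cleavages are not modeled; fragments shorter than `min_len`
    are dropped.
    """
    fragments = re.split(r"(?<=[KR])(?!P)", protein)
    return [f for f in fragments if len(f) >= min_len]


if __name__ == "__main__":
    # Toy sequence for illustration only, not real lincRNA data.
    toy = "CCATGGCTAAACGTCCGTTGAAAGATGAATTTTAAGG"
    for orf in find_orfs(toy, min_aa=5):
        print(orf, tryptic_digest(orf, min_len=3))
```

In a real search the translated ORFs would be written to a FASTA file and passed to a peptide-spectrum matching engine, which typically performs its own digestion. The digest function is shown only to make clear which peptides LC-MS/MS could observe.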