Revealing the Grammar of Small RNA Secretion Using Interpretable Machine Learning
Abstract

Small non-coding RNAs can be secreted through a variety of mechanisms, including exosomal sorting, in small extracellular vesicles, and within lipoprotein complexes [1,2]. However, the mechanisms that govern their sorting and secretion are still not well understood. In this study, we present ExoGRU, a machine learning model that predicts small RNA secretion probabilities from primary RNA sequence. We experimentally validated the performance of this model through ExoGRU-guided mutagenesis and synthetic RNA sequence analysis, and confirmed that primary RNA sequence is a major determinant of small RNA secretion. Additionally, we used ExoGRU to reveal cis and trans factors that underlie small RNA secretion, including known and novel RNA-binding proteins (RBPs), e.g., YBX1, HNRNPA2B1, and RBM24. We also developed a novel technique called exoCLIP, which reveals the RNA interactome of RBPs within the cell-free space. We used exoCLIP to reveal the RNA interactome of HNRNPA2B1 and RBM24 in extracellular vesicles. Together, our results demonstrate the power of machine learning in revealing novel biological mechanisms. In addition to providing deeper insight into complex processes such as small RNA secretion, this knowledge can be leveraged in therapeutic and synthetic biology applications.