Tiny, transient loops of genetic material, detected and studied by the hundreds for the first time at Brown University, are providing new insights into how the body transcribes DNA and splices (or missplices) those transcripts into the instructions needed for making proteins.
The lasso-shaped genetic snippets, called lariats, are byproducts of gene transcription, and the Brown team reports its study of them in Nature Structural & Molecular Biology. Until now, scientists had found fewer than 100 lariats, mostly by poring over very small selections of introns, the sections of genetic code that do not directly code for proteins but contain important signals that direct the way protein-coding regions are assembled. In the new study, Brown biologists report that they found more than 800 lariats in a publicly available set of billions of RNA reads derived from human tissues.
“We used modern genomic methods, deep sequencers, to detect these rare intermediates of splicing,” said William Fairbrother, associate professor of biology and senior author of the study. “It’s the first ever report of these things being discovered at a genome scale in living cells, and it tells us a lot about this step of gene processing.”
That specific step is known as RNA splicing. Like film editors splicing together movie scenes, enzymes cut away the introns to assemble exons that instruct a cell’s ribosome to make proteins. The body often has a choice of ways and places to make those cuts. Most of what is known about splicing has come from studying these spliced instructions, said Allison Taggart, a graduate student who is lead author of the study. What’s been missing is the data hidden in the lariats, which fall apart shortly after being spliced out, but turn out to predict the body’s splicing choices.
The key information uncovered in the study, Taggart said, is the location of so-called “branchpoints” on the lariats. Physically, the branchpoint is where the lariat closes on itself to form a loop during the first step of splicing, but its position and proximity to possible splice sites, the researchers learned, reliably relate to where splicing will occur.
After studying the sites of these branchpoints and their relationship to splice sites, the researchers created an algorithmic model that could predict splice sites 95.6 percent of the time. The value of the model is not in identifying splice sites – those are already well known, Fairbrother said. Instead, the model’s accuracy shows that, with the new data from the lariats, scientists have gained a more general understanding of how the body chooses among alternative splicing sites.
“What it does tell us is sets of rules defining the relationship between branchpoints and the chosen splice sites, which gives clues about how the splicing machinery makes decisions,” Taggart said. “Certain branchpoint locations can enforce specific splicing isoforms.”
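To make the idea of a branchpoint constraining a splice site concrete, here is a minimal sketch in Python. It is not the study's model, which is not described in detail here. It assumes a common textbook simplification: that the splice site at the end of an intron tends to be the first "AG" pair of bases found a short distance downstream of the branchpoint. The function name, the example sequence and the min_gap value are all hypothetical and chosen only for illustration.

```python
from typing import Optional


def first_ag_downstream(seq: str, branchpoint: int, min_gap: int = 5) -> Optional[int]:
    """Return the index of the 'A' in the first AG dinucleotide that starts
    at least `min_gap` bases downstream of `branchpoint`, or None if absent.

    Assumption (not from the study): the first AG after the branchpoint is
    used as a stand-in for the chosen splice site.
    """
    start = branchpoint + min_gap
    pos = seq.upper().find("AG", start)
    return pos if pos != -1 else None


# Hypothetical toy sequence: the branchpoint is the second 'A' in "CTAAC" (index 4).
toy_seq = "CCTAACTTTTCTTTCAGGAAGT"
predicted = first_ag_downstream(toy_seq, branchpoint=4)
print(predicted)
```

In this toy sequence, moving the branchpoint further downstream changes which AG is reached first, which is one simple way a branchpoint location could favor one splicing outcome over another.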
Connections to disease
In addition to ferreting out the mechanisms of alternative splicing, the team also studied the connection between branchpoints and disease. They looked through the Human Gene Mutation Database for disease-causing mutations found in introns and compared their newly found branchpoint sequences to those mutations. They found that many of those mutations relate specifically to branchpoints.
“We saw a sequence motif that looked exactly like a branchpoint sequence motif,” Taggart said. “What this tells us is that these mutations are forming at branchpoints and are leading to disease, presumably through causing aberrant splicing by interfering with lariat formation.”
In other words, Fairbrother said, mutations in branchpoints could well result in disease.
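The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual analysis: it assumes both mutations and branchpoints are available as (chromosome, position) pairs, and it counts a mutation as a match when it lies within a small, arbitrarily chosen window of a branchpoint. The function name, the window size and the coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int]  # (chromosome, position); a hypothetical representation


def mutations_at_branchpoints(
    mutations: List[Site], branchpoints: List[Site], window: int = 2
) -> List[Site]:
    """Return the mutations lying within `window` bases of any branchpoint
    on the same chromosome. The window size is an illustrative assumption."""
    # Group branchpoint positions by chromosome for quick lookup.
    by_chrom: Dict[str, List[int]] = {}
    for chrom, pos in branchpoints:
        by_chrom.setdefault(chrom, []).append(pos)

    hits = []
    for chrom, pos in mutations:
        if any(abs(pos - bp) <= window for bp in by_chrom.get(chrom, [])):
            hits.append((chrom, pos))
    return hits


# Made-up coordinates, for illustration only.
example_branchpoints = [("chr1", 100), ("chr2", 500)]
example_mutations = [("chr1", 101), ("chr1", 250), ("chr2", 497), ("chr2", 500)]
example_hits = mutations_at_branchpoints(example_mutations, example_branchpoints)
print(example_hits)
```

A real analysis would also need to handle strand, genome build and the sequence motif itself, none of which this sketch attempts.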
In addition to Taggart and Fairbrother, other authors include Alec DeSimone, Janice Shih, and Madeleine Filloux.
The National Science Foundation and Brown University funded the research, which was performed in part on the OSCAR supercomputing cluster at the University’s Center for Computation and Visualization.