Powerful CRISPR/Cas9 screens via computational prediction of DNA repair profile
Wellcome Trust Sanger Institute
Dr Felicity Allen’s project will make novel large-scale measurements
of DNA mutations generated by CRISPR/Cas9 to build a predictive machine
learning model of gene editing outcomes. This will address two costly
problems in powerful genome-wide gene knockout experiments: redundant
design and inaccurate quantification.
CRISPR/Cas9, a recently discovered DNA editing system, is
revolutionising biological research across medicine, agriculture and
fundamental cell biology. Perhaps its simplest yet most exciting
application is to disable a gene and test whether this affects a chosen cell
function, such as correct development or cancerous growth. Large-scale
experimental designs allow this to be carried out for all genes in the
human genome simultaneously, in a comprehensive, unbiased fashion. While
this has transformed the way we answer a wide range of biological
questions, we still cannot efficiently measure or predict the exact
editing outcome within each gene. As this impacts whether the gene is
effectively disabled, it limits the power of the overall method. In this
project, Felicity will build models and tools to solve this problem.
The CRISPR/Cas9 system disables a gene by causing short insertions or
deletions in its DNA sequence. Only some of these possible mutations
will succeed in preventing the gene from functioning, and despite their
central role, it is too laborious and expensive to measure which editing
outcomes occurred within each experiment.
It is known that the generated mutations are not random, and depend on the
DNA sequence of the gene target, which motivates a systematic study of
the exact nature of this link, and the development of a predictive tool,
as proposed here.
Together with collaborators at the Wellcome Trust Sanger Institute,
Felicity has designed a novel approach to efficiently measure the
mutations for 90,000 CRISPR/Cas9 gene edits. With the data from this
experiment, Felicity will use statistical and machine learning methods
to develop a model that predicts the distribution of mutations for each
gene target. She will then produce a design tool that uses these
predictions to select targets with desirable mutations, as well as an
analysis tool that accounts for variability in editing outcomes when
determining causal genes from large-scale CRISPR/Cas9 experiments. These
will be the first public CRISPR/Cas9 tools to be informed by the
diversity of generated mutations, and will help clarify experimental
results for researchers using the transformative CRISPR/Cas9 technology
worldwide.
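
To illustrate how a design tool might use predicted mutation distributions to select targets, here is a minimal Python sketch. It is not the project's method. It assumes, purely for illustration, that a "desirable" mutation is an insertion or deletion whose length is not a multiple of three (a frameshift, which commonly disrupts a protein-coding gene). The function names, target names and probabilities are all hypothetical.

```python
from typing import Dict, List


def frameshift_fraction(indel_probs: Dict[int, float]) -> float:
    """Share of predicted probability mass that falls on frameshift indels.

    indel_probs maps a signed indel size (negative = deletion,
    positive = insertion, 0 = no change) to its predicted probability.
    Assumption: a size that is not a multiple of three is "desirable".
    """
    total = sum(indel_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Python's % returns a non-negative result for negative sizes,
    # so -3 % 3 == 0 (in frame) and -1 % 3 == 2 (frameshift).
    shifted = sum(p for size, p in indel_probs.items() if size % 3 != 0)
    return shifted / total


def rank_targets(predictions: Dict[str, Dict[int, float]]) -> List[str]:
    """Order target names from highest to lowest frameshift fraction."""
    return sorted(
        predictions,
        key=lambda name: frameshift_fraction(predictions[name]),
        reverse=True,
    )


# Hypothetical predicted distributions for two candidate targets.
predictions = {
    "target_A": {-1: 0.5, -3: 0.3, 1: 0.2},
    "target_B": {-3: 0.6, -6: 0.2, 1: 0.2},
}

for name in rank_targets(predictions):
    print(name, round(frameshift_fraction(predictions[name]), 2))
```

In this toy example, target_A would be preferred because more of its predicted outcomes shift the reading frame. A real tool would draw its distributions from the trained model and could use a richer definition of a desirable outcome.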