Selected Publications

Genome editing with the CRISPR–Cas9 system has enabled unprecedented efficacy for reverse genetics and gene correction approaches. While off-target effects have been successfully tackled, the effort to eliminate variability in sgRNA efficacies—which affect experimental sensitivity—is in its infancy. To address this issue, studies have analyzed the molecular features of highly active sgRNAs, but independent cross-validation is lacking. Utilizing fluorescent reporter knock-out assays with verification at selected endogenous loci, we experimentally quantified the target efficacies of 430 sgRNAs. Based on this dataset we tested the predictive value of five recently-established prediction algorithms. Our analysis revealed a moderate correlation (r = 0.04 to r = 0.20) between the predicted and measured activity of the sgRNAs, and modest concordance between the different algorithms. We uncovered a strong PAM-distal GC-content-dependent activity, which enabled the exclusion of inactive sgRNAs. By deriving nine additional predictive features we generated a linear model-based discrete system for the efficient selection (r = 0.4) of effective sgRNAs (CRISPRater). We proved our algorithms’ efficacy on small and large external datasets, and provide a versatile combined on- and off-target sgRNA scanning platform. Altogether, our study highlights current issues and efforts in sgRNA efficacy prediction, and provides an easily-applicable discrete system for selecting efficient sgRNAs.
Nucleic Acids Research 2017; doi: 10.1093/nar/gkx1268, 2017

DNA adenine methyltransferase identification (DamID) has emerged as an alternative method to profile protein-DNA interactions; however, critical issues limit its widespread applicability. Here, we present iDamIDseq, a protocol that improves specificity and sensitivity by inverting the steps DpnI-DpnII and adding steps that involve a phosphatase and exonuclease. To determine genome-wide protein-DNA interactions efficiently, we present the analysis tool iDEAR (iDamIDseq Enrichment Analysis with R). The combination of DamID and iDEAR permits the establishment of consistent profiles for transcription factors, even in transient assays, as we exemplify using the small teleost medaka (Oryzias latipes). We report that the bacterial Dam-coding sequence induces aberrant splicing when it is used with different promoters to drive tissue-specific expression. Here, we present an optimization of the sequence to avoid this problem. This and our other improvements will allow researchers to use DamID effectively in any organism, in a general or targeted manner.
Development 2016 143: 4272-4278; doi: 10.1242/dev.139261, 2016

Engineering of the CRISPR/Cas9 system has opened a plethora of new opportunities for site-directed mutagenesis and targeted genome modification. Fundamental to this is a stretch of twenty nucleotides at the 5’ end of a guide RNA that provides specificity to the bound Cas9 endonuclease. Since a sequence of twenty nucleotides can occur multiple times in a given genome and some mismatches seem to be accepted by the CRISPR/Cas9 complex, an efficient and reliable in silico selection and evaluation of the targeting site is a key prerequisite for experimental success. Here we present the CRISPR/Cas9 target online predictor (CCTop) to overcome limitations of already available tools. CCTop provides an intuitive user interface with reasonable default parameters that can easily be tuned by the user. From a given query sequence, CCTop identifies and ranks all candidate sgRNA target sites according to their off-target quality and displays full documentation. CCTop was experimentally validated for gene inactivation, non-homologous end-joining as well as homology directed repair. Thus, CCTop provides the bench biologist with a tool for the rapid and efficient identification of high quality target sites.
PLoS ONE 10(4): e0124633. doi:10.1371/journal.pone.0124633, 2015

The gene regulatory network (GRN) that supports neural stem cell (NS cell) self-renewal has so far been poorly characterized. Knowledge of the central transcription factors (TFs), the noncoding gene regulatory regions that they bind to, and the genes whose expression they modulate will be crucial in unlocking the full therapeutic potential of these cells. Here, we use DNase-seq in combination with analysis of histone modifications to identify multiple classes of epigenetically and functionally distinct cis-regulatory elements (CREs). Through motif analysis and ChIP-seq, we identify several of the crucial TF regulators of NS cells. At the core of the network are TFs of the basic helix-loop-helix (bHLH), nuclear factor I (NFI), SOX, and FOX families, with CREs often densely bound by several of these different TFs. We use machine learning to highlight several crucial regulatory features of the network that underpin NS cell self-renewal and multipotency. We validate our predictions by functional analysis of the bHLH TF OLIG2. This TF makes an important contribution to NS cell self-renewal by concurrently activating pro-proliferation genes and preventing the untimely activation of genes promoting neuronal differentiation and stem cell quiescence.
Genome Res. 2015. 25: 41-56, DOI: 10.1101/gr.173435.114, 2015

Recent Publications


TGFβ-facilitated optic fissure fusion and the role of bone morphogenetic protein antagonism. Open Biology 2018 8 170134; doi: 10.1098/rsob.170134, 2018.

Refined sgRNA efficacy prediction improves large- and small-scale CRISPR–Cas9 applications. Nucleic Acids Research 2017; doi: 10.1093/nar/gkx1268, 2017.

iDamIDseq and iDEAR: an improved method and computational pipeline to profile chromatin-binding proteins. Development 2016 143: 4272-4278; doi: 10.1242/dev.139261, 2016.

Depletion of Key Meiotic Genes and Transcriptome-Wide Abiotic Stress Reprogramming Mark Early Preparatory Events Ahead of Apomeiotic Transition. Front. Plant Sci. 7:1539 doi: 10.3389/fpls.2016.01539, 2016.

MEPD: medaka expression pattern database, genes and more. Nucl Acids Res (2015) 44 (D1): D819-D821. DOI:, 2016.

Handling Permutation in Sequence Comparison: Genome-Wide Enhancer Prediction in Vertebrates by a Novel Non-Linear Alignment Scoring Principle. PLOS ONE 10(11): e0143989. doi: 10.1371/journal.pone.0143989, 2015.

Loss of NFIX Transcription Factor Biases Postnatal Neural Stem/Progenitor Cells Toward Oligodendrogenesis. Stem Cells and Development 2015, 24(18):2114-2126, DOI: 10.1089/scd.2015.0136, 2015.

CCTop: An Intuitive, Flexible and Reliable CRISPR/Cas9 Target Prediction Tool. PLoS ONE 10(4): e0124633. doi:10.1371/journal.pone.0124633, 2015.

Characterization of the neural stem cell gene regulatory network identifies OLIG2 as a multifunctional regulator of self-renewal. Genome Res. 2015. 25: 41-56, DOI: 10.1101/gr.173435.114, 2015.

Epigenomic enhancer annotation reveals a key role for NFIX in neural stem cell quiescence. Genes & Dev. 2013. 27: 1769-1786 doi: 10.1101/gad.216804.113, 2013.



CCTop (CRISPR/Cas9 target online predictor) standalone

CCTop is a tool that determines suitable CRISPR target sites in one or more query sequences and predicts their potential off-target sites. The online version of CCTop is available at

The standalone version has a command-line interface designed for searching large volumes of sequences and for greater flexibility. It is also advisable when the target genome is not publicly available.

Access CCTop standalone from my Bitbucket site.
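
To give a feel for the first step of such a search, here is a minimal Python sketch that lists candidate target sites in a query sequence. It is not CCTop's code: the function names are made up for illustration, and it assumes a 20-nt target immediately followed by an NGG PAM (the usual SpCas9 case), with no ranking or off-target evaluation.

```python
import re

# Assumption: 20-nt target directly 5' of an NGG PAM (SpCas9-style).
# CCTop itself is configurable; this sketch hard-codes one setting.
PROTOSPACER_LEN = 20
PAM_LEN = 3
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_candidate_sites(query):
    """Return (strand, start, target, pam) for every target+NGG site.

    `start` is the 0-based forward-strand coordinate of the leftmost
    base covered by the site (target plus PAM).
    """
    query = query.upper()
    # Zero-width lookahead so that overlapping sites are all reported.
    pattern = re.compile("(?=([ACGT]{%d})([ACGT]GG))" % PROTOSPACER_LEN)
    site_len = PROTOSPACER_LEN + PAM_LEN
    sites = []
    for strand, seq in (("+", query), ("-", reverse_complement(query))):
        for match in pattern.finditer(seq):
            start = match.start()
            if strand == "-":
                # Map the reverse-strand offset back to forward coordinates.
                start = len(query) - (match.start() + site_len)
            sites.append((strand, start, match.group(1), match.group(2)))
    return sites
```

Calling `find_candidate_sites` on a query returns one tuple per site on either strand; a real tool would then score each candidate against the target genome.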

iDEAR (iDamID Enrichment Analysis with R)

iDEAR identifies the binding profile of a transcription factor (or any DNA-binding protein) from iDamIDseq data, as described in our publication in Development.

Access iDEAR from my Bitbucket site.
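
As background, DamID relies on the Dam methyltransferase marking adenines within GATC motifs, so analyses of this kind generally work on GATC-delimited fragments. The Python sketch below only illustrates that idea; it is not part of iDEAR (which is written in R), and both function names are hypothetical.

```python
def gatc_positions(seq):
    """0-based start positions of every GATC motif.

    GATC is its own reverse complement, so scanning one strand suffices.
    """
    seq = seq.upper()
    positions = []
    i = seq.find("GATC")
    while i != -1:
        positions.append(i)
        i = seq.find("GATC", i + 1)
    return positions

def gatc_fragments(seq):
    """(start, end) half-open intervals spanning consecutive GATC sites.

    Each interval runs from the start of one GATC motif to the end of
    the next one, i.e. both flanking motifs are included.
    """
    pos = gatc_positions(seq)
    return [(a, b + 4) for a, b in zip(pos, pos[1:])]
```

Read counts per fragment, compared between a Dam-fusion sample and a Dam-only control, are the kind of input an enrichment analysis would start from.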


I currently teach the following subjects at the University of Oviedo:

  • Bioinformática (Máster Universitario en Biotecnología Aplicada a la Conservación y Gestión Sostenible de Recursos Vegetales)
  • Repositorios de Información (Grado en Ingeniería Informática del Software)
  • Sistemas Operativos (Grado en Ingeniería Informática en Tecnologías de la Información)
  • Tecnologías y Paradigmas de la Programación (Grado en Ingeniería Informática en Tecnologías de la Información)
  • Arquitectura del Software (Grado en Ingeniería Informática del Software)

At the University of Heidelberg I taught these subjects (2014-2016):

  • Introduction to Statistical Analysis for Biologists with R (Bachelor in the Faculty of Biosciences)
  • Data Analysis with R (Bachelor in the Faculty of Biosciences)
  • Introduction to Genetics (Bachelor in the Faculty of Biosciences)
  • Statistics for Biology: design, analysis and visualization (Core course in the HBIGS International PhD Program in Molecular and Cellular Biology)

At the University of Castilla-La Mancha I taught these subjects (2006-2010):

  • Minería de Datos (Ingeniería Informática)
  • Bioinformática (Ingeniería Informática)
  • Metodología de la Programación (Ingeniería Informática)
  • Estructuras de Datos y de la Información (Ingeniería Informática)