A database of G-quadruplex sequences and clusters

G-quadruplexes (G4s) play crucial roles in key biological processes, including transcription, replication, and telomere maintenance. They also contribute to genomic instability and have been an important component of organismal evolution.

Quadrupia is a comprehensive database of G4 sequences for a large collection of reference genomes from GenBank and RefSeq, enabling their study across the tree of life. Each genome has been annotated with two distinct G4 detection methods: regular expressions and G4Hunter. In addition, the database contains a curated list of G4 sequence clusters, created using the linclust algorithm of MMseqs2.
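To make the two detection approaches concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Quadrupia's actual pipeline: the exact pattern and parameters used by the database are not given in this text. The regular expression is the commonly used consensus motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides). The scoring function follows the published G4Hunter idea, in which each G in a run of length n scores +min(n, 4), each C scores -min(n, 4), and scores are averaged over a sliding window. The function names, the window of 25 and the threshold of 1.5 are assumptions chosen for the example.

```python
import re

# Assumed consensus motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# Only the given strand is scanned; C-rich matches on the reverse strand
# would need the reverse complement or a mirrored C pattern.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_g4_regex(seq):
    """Return non-overlapping (start, end, motif) hits, 0-based, end exclusive."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def g4hunter_scores(seq):
    """Per-base scores: +min(run, 4) in G runs, -min(run, 4) in C runs, else 0."""
    scores = []
    for m in re.finditer(r"G+|C+|[^GC]+", seq.upper()):
        run = m.group()
        if run[0] == "G":
            value = min(len(run), 4)
        elif run[0] == "C":
            value = -min(len(run), 4)
        else:
            value = 0
        scores.extend([value] * len(run))
    return scores


def g4hunter_windows(seq, window=25, threshold=1.5):
    """Return (start, mean_score) for windows whose |mean score| >= threshold."""
    scores = g4hunter_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) >= threshold:
            hits.append((start, mean))
    return hits


if __name__ == "__main__":
    example = "AAGGGTGGGTGGGTGGGAA"
    print(find_g4_regex(example))
    print(g4hunter_windows(example, window=15))
```

The regex approach reports only sequences that fit the strict motif, while the score-based approach also flags G-rich windows with irregular runs or loops, which is why the two methods can yield different prediction sets for the same genome.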

Through Quadrupia, users can:

  • Browse the G4 sequences of more than 100,000 reference genomes from all organism groups (Bacteria, Archaea, Eukaryota and Viruses).
  • Explore a curated list of 319,784 sequence clusters.
  • Perform sequence-based, HMM-based and motif-based queries against the database's contents.

If you find Quadrupia useful in your work, please cite:

Chantzi, N.+, Nayak, A.+, Baltoumas, F.A.+, Aplakidou, E., Liew, S.W., Galuh, J.E., Montgomery, A., Moeckel, C., Mouratidis, I., Sazed, S.A., Guiblet, W., Karmiris-Obratański, P., Wang, G., Zaravinos, A., Vasquez, K.M., Kwok, C.K., Pavlopoulos, G.A., Georgakopoulos-Soares, I. (2024)
Quadrupia: Derivation of G-quadruplexes for organismal genomes across the tree of life

Database Contents

Genomes | Total G-quadruplexes | Predicted by regular expressions | Predicted with G4Hunter
