NMPFamsDB

A database of Novel Metagenome Protein Families

Metagenome Family F104154

Overview

Basic Information
Family ID F104154
Family Type Metagenome
Number of Sequences 100
Average Sequence Length 47 residues
Representative Sequence MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY
Number of Associated Samples 2
Number of Associated Scaffolds 100

Quality Assessment
Transcriptomic Evidence No
Most common taxonomic group Unclassified
% of genes with valid RBS motifs 0.00 %
% of genes near scaffold ends (potentially truncated) 0.00 %
% of genes from short scaffolds (< 2000 bps) 0.00 %
Associated GOLD sequencing projects 2
AlphaFold2 3D model prediction Yes
3D model pTM-score 0.18

Hidden Markov Model
[Figure: HMM logo of the family, rendered with Skylign]

Most Common Taxonomy
Group Unclassified (98.000 % of family members)
NCBI Taxonomy ID N/A
Taxonomy N/A

Most Common Ecosystem
GOLD Ecosystem Host-Associated → Plants → Roots → Unclassified → Unclassified → Root (99.000 % of family members)
Environment Ontology (ENVO) Unclassified (100.000 % of family members)
Earth Microbiome Project Ontology (EMPO) Host-associated → Plant → Plant rhizosphere (100.000 % of family members)





Structure & Topology

Predicted Secondary Structure and Topology

Classification: Globular
Signal Peptide: No
Secondary Structure distribution: α-helix: 5.19%, β-sheet: 0.00%, Coil/Unstructured: 94.81%

Predicted 3D Structure

[Figure: AlphaFold2 3D model shown in the PDBe Molstar viewer, colored by per-residue confidence (pLDDT) in four bands: 0-50, 51-70, 71-90, 91-100]
pTM-score: 0.18

Low Quality Model:

This family has a low confidence model (pTM < 0.7) and has not been screened against SCOPe or PDB.
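
The low-confidence rule stated above can be sketched in a few lines. This is only an illustration of the quoted cutoff (pTM < 0.7); the function and constant names are hypothetical and are not part of NMPFamsDB.

```python
# Cutoff quoted on this page: models with pTM < 0.7 are treated as low confidence.
# The names below are illustrative assumptions, not a database API.
LOW_CONFIDENCE_PTM = 0.7


def is_low_confidence(ptm: float, threshold: float = LOW_CONFIDENCE_PTM) -> bool:
    """Return True when a predicted model's pTM-score falls below the cutoff."""
    return ptm < threshold


# pTM-score reported for family F104154
print(is_low_confidence(0.18))  # True
```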



Gene Neighborhood

Neighboring Pfam domains

Pfam ID | Name | % Frequency in 100 Family Scaffolds
PF03732 | Retrotrans_gag | 6.00
PF00665 | rve | 4.00
PF00967 | Barwin | 1.00
PF03469 | XH | 1.00
PF00004 | AAA | 1.00
PF00385 | Chromo | 1.00
PF13960 | DUF4218 | 1.00
PF00078 | RVT_1 | 1.00

Neighboring Clusters of Orthologous Genes (COGs)

COG ID | Name | Functional Category | % Frequency in 100 Family Scaffolds
COG2801 | Transposase InsO and inactivated derivatives | Mobilome: prophages, transposons [X] | 4.00
COG2826 | Transposase and inactivated derivatives, IS30 family | Mobilome: prophages, transposons [X] | 4.00
COG3316 | Transposase (or an inactivated derivative), DDE domain | Mobilome: prophages, transposons [X] | 4.00
COG4584 | Transposase | Mobilome: prophages, transposons [X] | 4.00



Phylogeny

NCBI Taxonomy

Name | Rank | Taxonomy | Distribution
Unclassified | root | N/A | 98.00 %
All Organisms | root | All Organisms | 2.00 %


Associated Scaffolds


Scaffold 3300014486|Ga0182004_10005062
Taxonomy All Organisms → cellular organisms → Eukaryota → Viridiplantae → Streptophyta → Streptophytina → Embryophyta → Tracheophyta → Euphyllophyta → Spermatophyta → Magnoliopsida → Mesangiospermae → Liliopsida → Petrosaviidae → commelinids → Poales → Poaceae → PACMAD clade → Panicoideae → Andropogonodae → Andropogoneae → Sorghinae → Sorghum → Sorghum bicolor
Length 10708

Scaffold 3300014486|Ga0182004_10018431
Taxonomy All Organisms → cellular organisms → Eukaryota → Viridiplantae → Streptophyta → Streptophytina → Embryophyta → Tracheophyta → Euphyllophyta → Spermatophyta → Magnoliopsida → Mesangiospermae → Liliopsida → Petrosaviidae → commelinids → Poales → Poaceae
Length 5330
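
The taxonomy lineages above are arrow-delimited strings, so the most specific assigned taxon is simply the last element. A minimal sketch of splitting such a string follows; the helper name `split_lineage` is hypothetical, and the example lineage is shortened from the first scaffold for brevity.

```python
def split_lineage(lineage: str, sep: str = "→") -> list[str]:
    """Split an arrow-delimited lineage into a list of trimmed taxon names."""
    return [part.strip() for part in lineage.split(sep) if part.strip()]


# Shortened excerpt of a lineage from the scaffold listing (illustrative only).
lineage = "All Organisms → cellular organisms → Eukaryota → Viridiplantae"
taxa = split_lineage(lineage)

print(taxa[0])   # root of the lineage
print(taxa[-1])  # most specific taxon in this excerpt
```

The same split would apply to the GOLD ecosystem paths on this page, which use the same separator.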




Environmental Properties

Associated Habitat Types

Habitat | Taxonomy | Distribution
Root | Host-Associated → Plants → Roots → Unclassified → Unclassified → Root | 99.00%
Rhizosphere | Host-Associated → Plants → Rhizosphere → Unclassified → Unclassified → Rhizosphere | 1.00%




Associated Samples

Taxon OID | Sample Name | Habitat Type
3300014486 | Endophyte microbial communities from Sorghum bicolor roots, Mead, Nebraska, USA - 072115-40_1 MetaG | Host-Associated
3300015265 | Rhizosphere microbial communities from Sorghum bicolor, Mead, Nebraska, USA - 072115-103_1 MetaG | Host-Associated





Family Sequences

Protein ID | Sample Taxon ID | Habitat | Sequence
Ga0182004_1000073022 | 3300014486 | Root | PHEPLVILVEYDMCSLLQFPNTSGYDNEDDWNEDYRYEY*
Ga0182004_100024627 | 3300014486 | Root | MQLNQFQLFEPYESLVILFEYNMCSLLQFSHTLGYANDDDWNEDYRYEY*
Ga0182004_1000255411 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLHFPNTPGYDNDDAWNEDYRYEY*
Ga0182004_100050621 | 3300014486 | Root | MLSNRCQLFEPHEPLVILIEYDMCSLLQFPNTLGYAKDDDWIEDYCYEY*
Ga0182004_100092911 | 3300014486 | Root | HEPLVILVEYDMCSLLQFPNTSGYDNEDDWNEDYRYEY*
Ga0182004_100095572 | 3300014486 | Root | MQSNQCQLFEPHEPLVIFVEYEMCSLLQSPNTPGYASVDVWNEDYRYEY*
Ga0182004_1000999610 | 3300014486 | Root | SNQCQLFEPYEPLVILVEYDMCSLLQFPNTSGYDNEDDWNENYRYEY*
Ga0182004_100107141 | 3300014486 | Root | PHEPLVILVEYDMCSLLQFPNTPGYDNDDAWNEDYRYEY*
Ga0182004_100144217 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTLGYANDDDWYEDYRYEY*
Ga0182004_100149223 | 3300014486 | Root | MEQR*IMWLQSQLFEPHEPLVILVEYDMCSLLQFSNTPGYANDDDWNEDYRYEY*
Ga0182004_100149284 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDM*SLLQFPNTSGYSNEDDWNKDYRYEY*
Ga0182004_100151637 | 3300014486 | Root | MQSNLCQLFEPHEPLVILVEYDMCSLLQFSNTPGYANDDDWNEDYRYEY*
Ga0182004_100152558 | 3300014486 | Root | MQSNQYQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEHYRYEY*
Ga0182004_100153061 | 3300014486 | Root | MQLNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100166691 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTLGYAYDDDWNEDYRYES*
Ga0182004_1001722311 | 3300014486 | Root | MQSTQCQLFEPHEPLVILVEYDMCSLLQFPNTSGYDNEDVWNEDYRYEY*
Ga0182004_100177891 | 3300014486 | Root | CQLFEPHEPLVILVEYDMCSLLQFPNTPGYDNDDAWNEDYRYEY*
Ga0182004_100182854 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMCSLLLFDNTLGYSDEDDWNEDYRYEY*
Ga0182004_100184314 | 3300014486 | Root | MQSNQCQFIEPREPLVILVEYDLCSPSLFHNTSGYVNEDDWNENYRYEY*
Ga0182004_100194917 | 3300014486 | Root | MLYAYKSFLQSNKCQLFEPHEPLVILVEYDMCSLLQFPKTSGYVNEDDWNENYRYEY*
Ga0182004_100202857 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYNMCSLLQFPNTSGYDNEDDWNEDYRYEY*
Ga0182004_100204522 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFSNTPGYASDDVWNGVYRYEY*
Ga0182004_100215272 | 3300014486 | Root | MRLNQCQLFEPHEPLAILVEYDMCSLLQFFNTPGYANDDDWNEDYRYEY*
Ga0182004_100216264 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYAMDDDWIEDYRYEY*
Ga0182004_100218955 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMCSLLLFHNTLVYSNEDDWNEDYRYEY*
Ga0182004_100234093 | 3300014486 | Root | MQSNQCQLFEPHEPLIILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100244671 | 3300014486 | Root | MQSYQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNED
Ga0182004_100256043 | 3300014486 | Root | MQSNQCKLFEPHEPLVIHVEYDMCSLLQFPNTPGYAYDDDWNEDYRYEY*
Ga0182004_100267721 | 3300014486 | Root | FEPHKSLVILVEYDMCSLLQFPNTPGYANDDDSNEDYRYEY*
Ga0182004_100290731 | 3300014486 | Root | PHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100290944 | 3300014486 | Root | MQSNQCQLFELHEPLVILVEYDMCSLLQFPNTPGYASDDDWNEDYRYEY*
Ga0182004_100291651 | 3300014486 | Root | MKSNLQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYDNDVAWNEDYRYEY*
Ga0182004_100304601 | 3300014486 | Root | MQSNHCQLIEPHEPLVILVEYDMCSLLLFHNTSGYVNEGDWNEDYRYEY*
Ga0182004_100306266 | 3300014486 | Root | MQLNQCQLFEPHEHLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100324721 | 3300014486 | Root | MQSNLCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDLNEDYRYEY*
Ga0182004_100326331 | 3300014486 | Root | MQSNQCQLIQPHEPHVILVEYDMCSLLQFPNTPGYANDDGWNEDYRYEY*
Ga0182004_100328562 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLMQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100355302 | 3300014486 | Root | MQLNQCQLFEPHEHLVILVEYDMYSLLQILKTQGYANEDDWNEDY*
Ga0182004_100355802 | 3300014486 | Root | MQSNQCQLNEPHEPLVILVEYDMCSLLLFHNTSGYVNEDDWNEDYRYEY*
Ga0182004_100361353 | 3300014486 | Root | MQRNQCQLIKPHELLVILVEYNMCSLLLFHNTSGYSNEDDWNEDYRYEY*
Ga0182004_100362367 | 3300014486 | Root | MQLNQCQLFEPHEPLVILVEYDMCSLLQFPNTRGYDNDDDWNEDYCYEY*
Ga0182004_100407601 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSPLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100413265 | 3300014486 | Root | MQSNQCQLFEPREPLFILVEYNMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100416193 | 3300014486 | Root | MESNQCQLFEPHEPLVILVEYDMCLLLQFPNTPGYVNDDAWNEDYPYEY*
Ga0182004_100416253 | 3300014486 | Root | MQSNQCQLFEPHEPLVILIEYDMCSLLQFPNTLGYDNDDACNEDYRYEY*
Ga0182004_100425431 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMYSLLHFPNTRGYANYDGWNEDYRYEY*
Ga0182004_100456412 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVDYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100492632 | 3300014486 | Root | MQLNQCQLIESQELLIILIVYDMCSLLLFHNTSGYVNEDDWNWNENYRYEY*
Ga0182004_100504182 | 3300014486 | Root | MQSNQCQLFEPYEHLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100523701 | 3300014486 | Root | SNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNDNYRYEY*
Ga0182004_100562731 | 3300014486 | Root | MQSNQCQLIETHEPRPLLILVEYDMCSLLLFHNTLGCANEDDWDEG*
Ga0182004_100581901 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFSNTLGYANEDDWNEDYRYEY*
Ga0182004_100585771 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMFSLLQFPNTSGYDNKDDWNEDYRYEY*
Ga0182004_100588833 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCLLLQFPNTPGYANDDD*NEDYRYEY*
Ga0182004_100607541 | 3300014486 | Root | MQSNQWQLIEPHEALVILVEYDMCSLLLFHNTSGYVNEDDWNEDYRYEY*
Ga0182004_100610951 | 3300014486 | Root | EPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100626853 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFSNTLGYAKDDDWIEDYRYVY*
Ga0182004_100635141 | 3300014486 | Root | NQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100663072 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMCSLLQFSNTSGYDNKDDWNEDYRYEY*
Ga0182004_100679372 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFRNTSGYANDDAWNEDYRYEY*
Ga0182004_100684531 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100724662 | 3300014486 | Root | MQLNQCQLFEPDEPLVILVEYDMCLLLQFPNTSGYANDDDWIEDYRYEY*
Ga0182004_100767501 | 3300014486 | Root | CQLFEPHEPLVILVEYDMCSLLQFPNTSGYDNEDDWNEDYRYEY*
Ga0182004_100835941 | 3300014486 | Root | NQCQLFEPHEPLVILVEYDMCSLLQFSNTPGYANDDDWNEDYRYEY*
Ga0182004_100837591 | 3300014486 | Root | MQSNHCQLFEPHEPLVILAEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100918422 | 3300014486 | Root | MQSNQCQLFEPHEPLVILIVYDMCSLLQFPNTLGYANDDGWNKDYCYEY*
Ga0182004_100924082 | 3300014486 | Root | HEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_100934402 | 3300014486 | Root | MQSNRSQLFEPHEPLVILVEYDMCSLLQFPNTPGYAKDDDWIEDYRYEY*
Ga0182004_100962981 | 3300014486 | Root | SNQCQLFEPHEPLVILVEYDMCSLLQFPNTLGYANDDDWNEDYRYEY*
Ga0182004_100971861 | 3300014486 | Root | QCQLFEPHEPLVILVEYDMCSLLQFPNTLGYDNEDDWNEDYRYEY*
Ga0182004_100986821 | 3300014486 | Root | CQLFEPHEPLVILVEYDMCSLLQFPNTPGYANEDDWNEDYRYEY*
Ga0182004_101129681 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYAKDDDWIEDYRYEY*
Ga0182004_101205542 | 3300014486 | Root | MQSNQSQLFEPHEPLVILVEYDMCSLLQFPNTLGYANDDDWNEDYRYEY*
Ga0182004_101261842 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTSGYDNEDDWNE
Ga0182004_101267491 | 3300014486 | Root | MQSNQCQLIESHGPLVILVEYDIRSLLLFHNTSGYSNEDD
Ga0182004_101295201 | 3300014486 | Root | MQSNLCQLFEPHEPLVILVEYYMCSLLQFPITPGYAYDDDWNEDYRYEY*
Ga0182004_101402351 | 3300014486 | Root | QSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWSKDYRYEY*
Ga0182004_101431611 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_101450981 | 3300014486 | Root | MQLNQCQLFEPHEPLVILVEYDMCSLLQFCKTPGYANEDDWNEDYRYEY*
Ga0182004_101571091 | 3300014486 | Root | QCQLFEPREPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYCYEY*
Ga0182004_101579151 | 3300014486 | Root | SNQCQLFEPHEPLVILVEYDMCSLLQFSNTPGYANDDDWNENYRYEY*
Ga0182004_101695672 | 3300014486 | Root | MQLNQCQLFEPHEPLVILVEYDMCSPLQFPNTPGYDNEDDWNEDYRYEY*
Ga0182004_101721871 | 3300014486 | Root | FEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEF*
Ga0182004_101874161 | 3300014486 | Root | FEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_101908651 | 3300014486 | Root | NQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYDNEDDWNEDYRYEY*
Ga0182004_101961841 | 3300014486 | Root | MQSNQCQLFEPYEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*
Ga0182004_101977681 | 3300014486 | Root | MLLNQCQLFEPHEPLVILVEYDMYSPLQFPNTSGYANDDDWNEDYRYEYYAWS*
Ga0182004_101992741 | 3300014486 | Root | MQLNQCQLSEPHEPLVILVEYDMCSLLQFPNTPGYAKVDAWNEDYRYEY*
Ga0182004_102056491 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDD
Ga0182004_102057201 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYTNDDDW
Ga0182004_102206501 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYA
Ga0182004_102248671 | 3300014486 | Root | MKSNQCQLIEPHEPLVILVEYDMCSLLQFPNTSGYDNEDD
Ga0182004_102469052 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPDYANDDDWNEDYRYEY*
Ga0182004_102614242 | 3300014486 | Root | MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYDNDD
Ga0182004_102644071 | 3300014486 | Root | MQSNQCQLFEPYEPLVILVEYDMCSLLQFPNTSGYDN
Ga0182004_102739291 | 3300014486 | Root | CQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRIEY*
Ga0182004_102802611 | 3300014486 | Root | CQLFEPHEPLVILVEYDMCSLLQFPNTPGYANEDDWNEDYLYEY*
Ga0182004_102823191 | 3300014486 | Root | CQLIEPHEPLVILVEYDICSLLQFPNTSGYDNEDDWNEDYRYDY*
Ga0182004_102988021 | 3300014486 | Root | MQSNQCQLIEPHEPLVILVEYDMCSLLQFPNTLGYANDDDWNEDYRYEY*
Ga0182005_12804112 | 3300015265 | Rhizosphere | MQSNLCQLFEPHEPQVILVEYDMCSLLQFPNTSGNDNEDDWNEDYRYEY
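
Sequence lengths for members like those listed above can be tallied with a short script. The sketch below assumes that `*` marks a translated stop codon and should not count as a residue; that is a common convention but is not stated on this page, and the helper name `residue_length` is hypothetical. It illustrates the arithmetic only and is not the procedure the database uses to report its average length.

```python
from statistics import mean


def residue_length(seq: str) -> int:
    """Count residues, skipping '*' (assumed here to mark a stop codon)."""
    return len(seq.replace("*", ""))


# Two members copied from the table above: one full length, one truncated.
examples = [
    "MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYANDDDWNEDYRYEY*",
    "MQSNQCQLFEPHEPLVILVEYDMCSLLQFPNTPGYA",
]

lengths = [residue_length(s) for s in examples]
print(lengths)        # [49, 36]
print(mean(lengths))  # 42.5
```

Members without a trailing `*` or without a leading `M` appear to be cut short in the listing, which would pull a family-wide mean below the 49 residues of the representative sequence.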




© Pavlopoulos Lab, Bioinformatics & Integrative Biology | B.S.R.C. "Alexander Fleming" | Privacy Notice