Collagen protein (polypeptide) - C. hemisphaerica

Overview
Name: Collagen protein
Unique Name: TCONS_00014070-protein
Type: polypeptide
Organism: Clytia hemisphaerica (Jellyfish)
Sequence length: 1758

Sequence
The following sequences are available for this feature:

polypeptide sequence

>TCONS_00014070-protein ID=TCONS_00014070-protein|Name=Collagen protein|organism=Clytia hemisphaerica|type=polypeptide|length=1758bp
MDLISRSIIFAAIFLALYPGNASSNTQCGGCGACTSLCVGKKGERGAVGM
PGFSGPPGLPGFPGPEGPTGPPGSKGVAGRDGGKGMPGPRGRVGSPGLPG
QPGRHGQPGNEGRRGEAGIPGCNGTKGDRGPEGPRGRDGQQGNQGMEGPE
GPPGEPGDSTATRLKGQKGEPGKFGQDGPKGDGGEKGSKGESGRDGKQGL
RGLKGQKGARGDSNVTIVGQRGDKGLKGLQGPPGQACKGSGGEFAKGAAN
IIQGPKGDQGVQGDKGDAGQRGESGKPGERGDKGSQGSKGDKGDQGRDGP
SGPTGMKGTQGLQGPDGEKGEKGSTGRDGRDGLKGEPGKAGPNGVEGEKG
EQGENVQGLPGKDGEKGDRGKDGQPGEDGPAGPQGAAGPKGSAGPVGPKG
FKGEAGPIGPEGTAGKPGEPGMKGPIGPSGEKGDFGIKGEPGRPGTSVKG
PPGMDGEPGPIGEPGKQGNRGFPGQPGPPGKVTDSEGNEILVGDQGEPGA
QGEAGVQGPAGPPGTRGPKGEDCKTCPPGPQGEKGLPGQAGLNGADGEKG
EKGQLGEPGVAGSRGDPGARGDQGQQGEQGGKGERGVRGKDGPAGNDIVV
RGDRGEKGEAGKAGQDGAIGFQGEQGAAGEPGPRGPAGADGLQGSPGDRG
APGVSGIAGFDGDKGAKGEKGSGLRGFKGERGPQGAEGPQGPRGPFGEQG
EKGEPFPTMLKGDQGPKGNPGPRGPSGAAGPQGESGPQGFQGEIGPRGEP
GSGEKGERGEPGQRGEAGQVGEVGKDGIDGERGPKGEPGQDGPPGPASGT
PGKKGDKGDLGPAGRDGPPGNSAPKGEPGQDGFDGEKGEPGPKGEAGSPG
TSGGRGPTGNDGLRGEKGDAGESSIGQKGEPGPDGQKGERGDNGRDGAPG
EGIPGEQGVKGTKGDSIRGADGQKGDKGIEGIRGPAGKPGYKGESGNPGK
DGEKGERGEDGLGGSKGEPGKDGPEGPLGPLGDKGTKGDPGFIGQKGTKG
DSGPRGKPGATSNVVGPKGNRGPPGSKGEPGREGEKGNEGLAGPKGNPGV
GSEKGEKGGRGTGGEPGTKGVKGERGLQGRPGAIGPVGPKGDQGEEGPKG
SMGDPGRQGRDGKVGEPGPGGPKGDTGFAGEPGSKGEVGPRGEKGSRGLP
LKGERGPPGKPGPTGPAGPPGKDYIIKAGELPVGQLINKGDPGPAGSKGE
PGADGKKGTQGQRGLIGPRGESGVRGDAGGKGPRGDAGVDGEKGDIGSKG
SKGEPGRDGPVGAKGEVGSEGEKGERGPIGPQGQEGPAGIQGYKGQKGDV
GKEGPVGPPCPMGEGQKGDRGGKGNRGPKGFKGEPGPVGQSKEPGPKGAA
GEPGDEGQKGVKGDVGPIGPAGETGPRGLQGERGDIGQKGEPGKEGSKGD
RGPAGVGTGEKGQKGEKGLQGLTGLVGDQGEKGDKGSQGPKGVKGDQGPQ
GDIGPNGLNGDPGPIGKQGAQGAKGNLGPIGGDGPPGPPGKSGSSDFGFP
LVIHSQSTNVPDCPQNYNKLWDGYSFLYAQGNERAFGQDLGKPGSCLSRF
STMPFMFCDLQNQCTVASRNDYSFWLSTPEXMPFMFCDLQNQCTVASRND
YSFWLSTPEPASMMAPAKGPEVRPYISRCSVCQAPANVMAVHSQSSTIPS
CPAGWNGLWRGWSFLMYNSAGAEGAGQLLSSSGSCLQEFRVNPYIECHGR
GTCHYFGPTLSFWLSTIDEQSQFSVPQSETIKSGRLRERVSRCQVCVKDV
SQPDNNSP
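A minimal sketch of how the stated sequence length could be checked from a FASTA record like the one above. The helper names `read_fasta` and `glycine_fraction` are hypothetical and not part of this database; the example input holds only the first two sequence lines of the record, so it illustrates the method rather than reproducing the full 1758-residue check.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record (drop the leading '>').
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence line: append to the current record.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def glycine_fraction(seq):
    """Fraction of residues that are glycine (G); 0.0 for an empty sequence."""
    return seq.count("G") / len(seq) if seq else 0.0


# Illustrative input: only the first two sequence lines of the record above.
example = """>TCONS_00014070-protein (first two lines only)
MDLISRSIIFAAIFLALYPGNASSNTQCGGCGACTSLCVGKKGERGAVGM
PGFSGPPGLPGFPGPEGPTGPPGSKGVAGRDGGKGMPGPRGRVGSPGLPG
"""

seqs = read_fasta(example)
for name, seq in seqs.items():
    print(name, len(seq), round(glycine_fraction(seq), 2))
```

Run against the full record, `len(seq)` should match the sequence length given in the overview. A high glycine fraction is what one would expect of a collagen-like Gly-X-Y repeat region.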
Gene-mRNA-Prot
This polypeptide comes from the following gene feature:
Feature Name   Unique Name    Species                Type
XLOC_007871    XLOC_007871    Clytia hemisphaerica   gene
This polypeptide derives from the following transcript feature(s):
Feature Name     Unique Name      Species                Type
TCONS_00014070   TCONS_00014070   Clytia hemisphaerica   transcript
Annotated Terms
The following terms have been associated with this polypeptide:
Vocabulary: INTERPRO
Term        Definition
IPR001442   Collagen_VI_NC
IPR008160   Collagen
IPR016187   CTDL_fold
Vocabulary: Cellular Component
Term         Definition
GO:0005581   collagen trimer
Vocabulary: Molecular Function
Term         Definition
GO:0005201   extracellular matrix structural constituent
GO Annotation
GO Assignments
This polypeptide is annotated with the following GO terms.
Category             Term Accession   Term Name
cellular_component   GO:0005581       collagen trimer
molecular_function   GO:0005201       extracellular matrix structural constituent
InterPro
Analysis Name: InterPro Annotations of C. hemisphaerica v1.0
Date Performed: 2017-06-01
IPR Term   IPR Description               Source       Source Term   Source Description  Coord       E-value   Score
IPR001442  Collagen IV, non-collagenous  SMART        SM00111       C4_2                1636..1749  1.2E-61   220.8
IPR001442  Collagen IV, non-collagenous  SMART        SM00111       C4_2                1498..1635  7.9E-50   181.5
IPR001442  Collagen IV, non-collagenous  GENE3D       2.170.240.10  -                   1497..1581  4.0E-37   127.1
IPR001442  Collagen IV, non-collagenous  GENE3D       2.170.240.10  -                   1582..1749  1.9E-75   252.4
IPR001442  Collagen IV, non-collagenous  PFAM         PF01413       C4                  1638..1747  5.8E-38   129.6
IPR001442  Collagen IV, non-collagenous  PFAM         PF01413       C4                  1501..1578  1.4E-29   102.6
IPR001442  Collagen IV, non-collagenous  PFAM         PF01413       C4                  1581..1633  2.7E-10   40.5
IPR001442  Collagen IV, non-collagenous  PROSITE      PS51403       NC1_IV              1498..1750  -         102.435
None       No IPR available              GENE3D       2.160.20.50   -                   172..238    0.2       11.0
None       No IPR available              GENE3D       2.160.20.50   -                   1253..1314  0.013     14.9
None       No IPR available              GENE3D       2.160.20.50   -                   1428..1496  1.0E-4    21.6
None       No IPR available              GENE3D       2.160.20.50   -                   366..442    0.0025    17.1
None       No IPR available              GENE3D       2.160.20.50   -                   239..306    8.6E-4    18.6
None       No IPR available              GENE3D       2.160.20.50   -                   307..365    0.0027    17.0
None       No IPR available              GENE3D       2.160.20.50   -                   105..171    0.14      11.5
None       No IPR available              GENE3D       2.160.20.50   -                   1315..1390  9.2E-4    18.5
None       No IPR available              GENE3D       2.160.20.50   -                   516..599    0.014     14.7
None       No IPR available              GENE3D       2.160.20.50   -                   1187..1252  0.016     14.6
None       No IPR available              GENE3D       2.160.20.50   -                   600..707    0.0031    16.8
None       No IPR available              GENE3D       2.160.20.50   -                   708..769    0.0035    16.6
None       No IPR available              GENE3D       2.160.20.50   -                   24..104     0.035     13.4
None       No IPR available              GENE3D       2.160.20.50   -                   1053..1124  0.31      10.4
None       No IPR available              GENE3D       2.160.20.50   -                   946..1015   0.025     13.9
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            254..306    7.8E-8    31.9
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            955..1011   5.0E-7    29.3
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            357..413    9.4E-9    34.8
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            647..704    2.0E-6    27.4
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            388..446    3.4E-8    33.0
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            499..557    2.2E-8    33.7
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            902..960    8.3E-7    28.6
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            602..659    1.2E-8    34.4
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            538..596    1.3E-7    31.2
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            753..808    3.2E-7    29.9
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            299..355    3.7E-9    36.1
IPR008160  Collagen triple helix repeat  PFAM         PF01391       Collagen            1190..1244  1.2E-7    31.3
IPR016187  C-type lectin fold            SUPERFAMILY  56436         C-type lectin-like  1637..1748  -         -
IPR016187  C-type lectin fold            SUPERFAMILY  56436         C-type lectin-like  1581..1636  -         -
IPR016187  C-type lectin fold            SUPERFAMILY  56436         C-type lectin-like  1497..1581  -         -
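
The alignment coordinates above can be turned into span lengths and residue coverage with a few lines of code. This is a rough sketch that assumes the coordinates are 1-based and inclusive (an assumption, not stated on this page); `span_length` and `covered_residues` are hypothetical helper names.

```python
def parse_coord(coord):
    """Turn a 'start..end' string into a (start, end) tuple of ints."""
    start, end = (int(x) for x in coord.split(".."))
    return start, end


def span_length(coord):
    """Length in residues of one span, assuming inclusive coordinates."""
    start, end = parse_coord(coord)
    return end - start + 1


def covered_residues(coords):
    """Number of residues covered by at least one span (overlaps merged)."""
    if not coords:
        return 0
    spans = sorted(parse_coord(c) for c in coords)
    total = 0
    cur_start, cur_end = spans[0]
    for start, end in spans[1:]:
        if start <= cur_end + 1:
            # Overlapping or directly adjacent: extend the current span.
            cur_end = max(cur_end, end)
        else:
            # Gap: close the current span and start a new one.
            total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
    total += cur_end - cur_start + 1
    return total


# Coordinates copied from the SMART SM00111 alignments listed above.
for coord in ["1498..1635", "1636..1749"]:
    print(coord, span_length(coord))

# Two overlapping PF01391 hits from the listing above, merged before counting.
print(covered_residues(["357..413", "388..446"]))
```

Merging before counting matters for the collagen triple helix hits, several of which overlap, so summing individual span lengths would overstate coverage.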

Blast
BLAST of Collagen protein vs. Swiss-Prot (Human)
Match: CO4A6 (Collagen alpha-6(IV) chain OS=Homo sapiens GN=COL4A6 PE=1 SV=3)

HSP 1 Score: 560.451 bits (1443), Expect = 6.043e-170
Identity = 680/1752 (38.81%), Positives = 830/1752 (47.37%), Query Frame = 0
Query:  234 GQACKGSGGEFAKGAANIIQGPKGDQGVQGDKGDAGQRGESGKPGERGDKGSQGSKGDKGDQGRDGPSGPTGMKGTQGLQGPDGEKGEKGSTGRDGRDGLKGEPGKAGPNGVEG---------EKGEQGENVQGLPGKDGEKGDRGKDGQPGEDGPAGPQGAAGPKGSAGPVGPKGFKGEAGPIGPEGT-----AGKPGEPGMKGPIGPSGEKGDFGIKGEPGRPGTSVKGPPGMDGEPGPIGEPGKQGNRGFPGQPGPPGKVTDSEGNEILVGDQGEPGAQGEAGVQGPAGPPGTRGPKGEDCKTCPPGPQGEKGLPGQAGLNGADGEKGEKGQLGEPG-----------VAGSRGDPGARGDQGQQGEQGGKGERGVRGKDGPAGNDIVVRGDRGEKGEAGKAGQDGAIGFQGEQGAAGEPGPRGPAGADGLQGSPGDRGAPGVSGIAGFDGDKGAKGEKGSGLRGFKGERGPQGAEGPQGPRGPFGEQGEKGEPFPTM------------------LKGDQGPKGNPGPRGPSGA---AGPQGESGPQGFQGEIGPRGEPGSGEKGERGEPGQRGEAGQVGEVGKD-----------------GIDGERGPKGEPGQDGPPGPAS-------------GTPGKKGDKGDLGPAGRDGPPGNSAPKGEPGQ---DGFDGEKGEPGPKGEAGSPGTSGGRGPTGNDGLRGEKGDAGESSIGQK-GEPGPDGQKGERGDNG---RDGAPGE-GIPGEQGVKGTKGDSIRGADGQKGDKGIEGIRGP------AGKPGYKGESGNPGKDGEKGERGEDGLGGSKGEPGKDGPEGPLGPLGDKGTKGDPGFIGQKGTKGDSGPRGKPGATSNVV--------GPKGNRGPPGSKGEPGREGEKGNEGLAGPKGNPGVGSEKGEKGGRGTGGEPGTKGVKGERGLQGRPGAIGPVGPKGDQGEEGPKGSMGDPGRQGRDGKVGEPGPGGP------KGDTGFAGEPGSKGEVGPRGEKGS----------------------------------RGLP-LKGERGP---PGKPGPTGPAGPPGKDYIIKAGELP-----VGQLIN-KGDPGPAGSKGEPGADGKKGTQGQRGLIGPRGESGVRGDAGGKGPRGDAGVDGEKGDIGSKGSKGEPGRDGPVGAKGEVGSEGEKGERGPIGPQGQEGPAGIQGYKGQKGD--------VGKEGPVGPPCPMGE--------GQKGDRGGKGNRGPKGFKGEPGPVGQSKEPG---PKGAAGEPGDEGQKGV---------------------KGDVGPIGPAGETGPRGLQGERGDIGQKGEPGKEGSKGDRG-PAGVGT----------------GEKGQKGEKGLQG-------------------LTGLVGDQGEKGDKGSQGPKGVKGDQGPQGDIGPNGLN---------GDPGPIGKQGAQGAKGNLGPIGGDGPPGPPGKSGSSDFGFPLVIHSQSTNVPDCPQNYNKLWDGYSFLYAQGNERAFGQDLGKPGSCLSRFSTMPFMFCDLQNQCTVASRNDYSFWLSTPEXMPFMFCDLQNQCTVASRNDYSFWLSTPEPASMMAPAKGPEVRPYISRCSVCQAPANVMAVHSQSSTIPSCPAGWNGLWRGWSFLMYNSAGAEGAGQLLSSSGSCLQEFRVNPYIECHG-RGTCHYFGPTLSFWLSTIDEQSQF-SVPQSETIKSGRLRERVSRCQVCVKDV 1750
            GQ C GS   F +  A    GP G QG  G +G  G  G SG  GERG  G  G  G KGD+G   P G  G  G  G+ G  G+ G +G  G DG +G +G  G  GP+G  G         +KG +G+ V   PG    KG +G  G PG DG  GPQGA G  G+ GP GP G +G  GP GP G       G  GE G+KG +G  G  G     GE    G   KG  G  GEPGP G PG  G  GFPG          + G +   G++G PG             PG RGP G +    PPG QG+KG  G  GLNG  G +G+KG +G PG           ++G+ GDPG  G  G       KG+ G++G  GP+G    V G     G  G  G  G  G +G+QG  G    R   GA GL G  G  G PG  G    + +      K SG  G +GE+GP+G  G +G +G  G     G    T                   LKG +G +G+ G +GP+GA    GP G SGP+G +GE  P      G  G+RG+ G +G  G +GE GKD                 G  GE+G  G PG+ G PGP               G PG KG  G  G  G  G  G + P   PG     GF G  G PGPKG  G PGT G  G +G+   +GE G  G   + +  G PGP G+KG  G  G   +DG PG  G PG  G KG  GD     +G  G++G++G+ G       +G PG KG  G PG  G KGERG  G  G  G+PG  G  GP G  G  G  G PGF G  G  G  G RGK G   ++V        G  GN G  G KG PG  G  G   L+GPKG      EKG  G  G  G PG  G+ G RGL+G PG+ G +GP G  G  G KG  G+PG       VG P P  P      KGD G  G  GS G  GPRG+KG                                   RGLP LKG  G    PG PG +G  G  G   +  A  LP      GQ +   G PGP   KG+PG  G KGT+G+ GLIG  G  G +G+ G  G  GD G+ G  G  G  G +GEPG  G  G +G +G  G  G  GP G  G  G  G+ G  G KG          G  GP G P P GE        G  G  G +G +G +GF G  GP G    PG   P   AG+PGD G+ G+                     +GD G  G  G  GP+G +G++G  G  G PG+ G KG RG P  +GT                G+ G +G  GLQG                   L G+ G  G  GD G+QGP G++G +G  G  G +G +         GDPG  G QG  G +G  G  G  G PG PG+  S   G+ LV HSQS  VP CP   ++LW GYS L+ +G E+A  QDLG  GSCL RFSTMPF++C++   C  A RND S+WLST   +P M                              P    ++  YISRCSVC+AP+  +AVHSQ  TIP CP GW  LW G+SFLM+ +AGAEG GQ L S GSCL++FR  P+IEC G RGTCHYF    SFWL+T++E+ QF  +P SET+K+G+L  RVSRCQVC+K +
Sbjct:   33 GQDCSGSCQCFPEKGARGRPGPIGIQGPTGPQGFTGSTGLSGLKGERGFPGLLGPYGPKGDKG---PMGVPGFLGINGIPGHPGQPGPRGPPGLDGCNGTQGAVGFPGPDGYPGLLGPPGLPGQKGSKGDPVLA-PGS--FKGMKGDPGLPGLDGITGPQGAPGFPGAVGPAGPPGLQGPPGPPGPLGPDGNMGLGFQGEKGVKGDVGLPGPAGPPPSTGELEFMGFP-KGKKGSKGEPGPKGFPGISGPPGFPG--------LGTTGEKGEKGEKGIPGL------------PGPRGPMGSEGVQGPPGQQGKKGTLGFPGLNGFQGIEGQKGDIGLPGPDVFIDIDGAVISGNPGDPGVPGLPGL------KGDEGIQGLRGPSG----VPGLPALSGVPGALGPQGFPGLKGDQGNPG----RTTIGAAGLPGRDGLPGPPGPPGPPSPEFETETLHNKESGFPGLRGEQGPKGNLGLKGIKGDSGFCACDGGVPNTGPPGEPGPPGPWGLIGLPGLKGARGDRGSGGAQGPAGAPGLVGPLGPSGPKGKKGE--PILSTIQGMPGDRGDSGSQGFRGVIGEPGKDGVPGLPGLPGLPGDGGQGFPGEKGLPGLPGEKGHPGPPGLPGNGLPGLPGPRGLPGDKGKDGLPGQQGLPGSKGITLPCIIPGSYGPSGFPGTPGFPGPKGSRGLPGTPGQPGSSGS---KGEPGSPGLVHLPELPGFPGPRGEKGLPGFPGLPGKDGLPGMIGSPGLPGSKGATGDIFGAENGAPGEQGLQGLTGHKGFLGDSGLPGLKGVHGKPGLLGPKGERGSPGTPGQVGQPGTPGSSGPYGIKGKSGLPGAPGFPGISGHPGKKGTRGKKGPPGSIVKKGLPGLKGLPGNPGLVGLKGSPGSPGVAGLPALSGPKG------EKGSVGFVGFPGIPGLPGIPGTRGLKGIPGSTGKMGPSGRAGTPGEKGDRGNPG------PVGIPSPRRPMSNLWLKGDKGSQGSAGSNGFPGPRGDKGEAGRPGPPGLPGAPGLPGIIKGVSGKPGPPGFMGIRGLPGLKGSSGITGFPGMPGESGSQGIRGSPGLPGASGLPGLKGDNGQTVEISGSPGP---KGQPGESGFKGTKGRDGLIGNIGFPGNKGEDGKVGVSGDVGLPGAPGFPGVAGMRGEPGLPGSSGHQGAIGPLGSPGLIGPKGFPGFPGLHGLNGLPGTKGTHGTPGPSITGVPGPAGLPGPKGEKGYPGIGIGAPGKPGLRGQKGDRGFPGLQGPAGLPGAPGISLPSLIAGQPGDPGRPGLDGERGRPGPAGPPGPPGPSSNQGDTGDPGFPGIPGPKGPKGDQGIPGFSGLPGELGLKGMRGEPGFMGTPGKVGPPGDPGFPGMKGKAGPRGSSGLQGDPGQTPTAEAVQVPPGPLGLPGIDGIPGLTGDPGAQGPVGLQGSKGLPGIPGKDGPSGLPGPPGALGDPGLPGLQGPPGFEGAPGQQGPFGMPGMPGQ--SMRVGYTLVKHSQSEQVPPCPIGMSQLWVGYSLLFVEGQEKAHNQDLGFAGSCLPRFSTMPFIYCNINEVCHYARRNDKSYWLSTTAPIPMM------------------------------PVSQTQIPQYISRCSVCEAPSQAIAVHSQDITIPQCPLGWRSLWIGYSFLMHTAAGAEGGGQSLVSPGSCLEDFRATPFIECSGARGTCHYFANKYSFWLTTVEERQQFGELPVSETLKAGQLHTRVSRCQVCMKSL 1691          

HSP 2 Score: 187.193 bits (474), Expect = 2.394e-47
Identity = 277/758 (36.54%), Positives = 344/758 (45.38%), Query Frame = 0
Query:   40 GKKGERGAVGMPGFSGPPGLPGFPGPEGPTGPPGSKGVAGRDGGKGMPGPRGRVGSPGLPGQPGRHGQPGNEGRRGEAGIPGCNGTKGDRGPEGPRGRDGQQGNQGMEGPEGPPGEPGDSTATRLKGQKGEPGKFGQDGPKGDGGEKGSKGESGRDGKQGLRGLKGQKGARGDSNVTIVGQRGDKGLKGLQGPPGQACKGSGGEFAKGAANIIQGPKGDQGVQGDKGDAGQRGESGKPGERGDKGSQGSKGD-------------KGDQGRDGPSGPTGMKGTQGLQGPDGEKGEKGSTGRDGRDGLKGEPGKAGPNGVEGEKGEQGENVQ--------GLPGKDGEKGDRGKDGQPGEDGPAGPQGAAGPKGSAGPVGPKGFKGEAGPIGPEGTAGKPGEPGMKGPIGPSGEKGDFGIKGEPGRPGTS-VKGPPG-----------MDGEPGPIGEPGKQGNRGFPGQ-PGPPGKVTDSEGNEILVGDQGEPGAQGEAGVQGPAGPPGTRGPKGEDCKTCPPGPQGEKGLPGQAGLNG------------ADGEKGEKGQLGEPGVAGSRGDPGARGDQGQQGEQGGKGERGVRGKDGPAGN-----DIVVRGDRGEKGEAGKAGQDGAIGFQGEQG----AAGEPGPRGPAGADGLQGSPGDRGAPGVSGIAGFDGDKGAKGEKGS-------------GLRGFKGERGPQGAEGPQGPRGPFGEQGEKGEPF---PTMLKGDQGPKGNPGPRGPS 726
            G++G +G  G  GF G  GLPG  G  G        G+ G  G +G PG  G+VG PG PG  G +G  G  G  G  G PG +G  G +G  G +G  G    +G+ G +G PG PG      LKG  G PG  G     G  GEKGS G  G  G  GL G+ G +G +G     I G  G  G  G  G PG+  KG  G          + P  +  ++GDKG  G  G +G PG RGDKG  G  G              KG  G+ GP G  G++G  GL+G  G  G  G  G  G  G++G PG  G +G+ G KG+ G+ V+        G PG+ G KG +G+DG  G  G  G +G  G  G +G VG  G  G  G  G  G  G PG  G +G IGP G  G  G KG PG PG   + G PG           + G PGP G PG +G +G+PG   G PGK           G +G+ G +G  G+QGPAG PG  G       + P    G+ G PG+ GL+G              G    +G  G+PG  G  G  G +GDQG  G  G  GE G++G  G  G       +   GD G  G  GKAG  G+ G QG+ G    A     P GP G  G+ G PG  G PG  G  G  G KG  G  G              G  G  G +GP G EG  G +GPFG  G  G+      T++K  Q  +  P P G S
Sbjct:  762 GEQGLQGLTGHKGFLGDSGLPGLKGVHG------KPGLLGPKGERGSPGTPGQVGQPGTPGSSGPYGIKGKSGLPGAPGFPGISGHPGKKGTRGKKGPPGSIVKKGLPGLKGLPGNPG---LVGLKGSPGSPGVAGLPALSGPKGEKGSVGFVGFPGIPGLPGIPGTRGLKG-----IPGSTGKMGPSGRAGTPGE--KGDRGNPGPVGIPSPRRPMSNLWLKGDKGSQGSAGSNGFPGPRGDKGEAGRPGPPGLPGAPGLPGIIKGVSGKPGPPGFMGIRGLPGLKGSSGITGFPGMPGESGSQGIRGSPGLPGASGLPGLKGDNGQTVEISGSPGPKGQPGESGFKGTKGRDGLIGNIGFPGNKGEDGKVGVSGDVGLPGAPGFPGVAGMRGEPGLPGSSGHQGAIGPLGSPGLIGPKGFPGFPGLHGLNGLPGTKGTHGTPGPSITGVPGPAGLPGPKGEKGYPGIGIGAPGK----------PGLRGQKGDRGFPGLQGPAGLPGAPG------ISLPSLIAGQPGDPGRPGLDGERGRPGPAGPPGPPGPSSNQGDTGDPGFPGIPGPKGPKGDQGIPGFSGLPGELGLKGMRGEPGFMGTPGKVGPPGDPGFPGMKGKAGPRGSSGLQGDPGQTPTAEAVQVPPGPLGLPGIDGIPGLTGDPGAQGPVGLQGSKGLPGIPGKDGPSGLPGPPGALGDPGLPGLQGPPGFEGAPGQQGPFGMPGMPGQSMRVGYTLVKHSQSEQVPPCPIGMS 1487          
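
The percentages reported for each HSP follow directly from the match counts and the alignment length. A small sketch of that arithmetic, using the counts shown above (`percent` is a hypothetical helper name, and rounding to two decimals is an assumption about how the page formats its values):

```python
def percent(matches, aligned_length):
    """Percentage of aligned columns that match, rounded to two decimals."""
    return round(100.0 * matches / aligned_length, 2)


# (identical, positive-scoring, alignment length) for each HSP listed above.
hsps = {
    "HSP 1": (680, 830, 1752),
    "HSP 2": (277, 344, 758),
}

for name, (identical, positive, length) in hsps.items():
    print(name, "identity:", percent(identical, length),
          "positives:", percent(positive, length))
```

The denominator is the alignment length including gap columns, not the length of either sequence, which is why HSP 1 is reported over 1752 columns even though the query has 1758 residues.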
The following BLAST results are available for this feature:
BLAST of Collagen protein vs. Swiss-Prot (Human)
Analysis Date: 2018-01-31 (Blastp Clytia hemisphaerica v1.0 proteins vs SwissProt (Homo sapiens))
Total hits: 1
Match Name   E-value      Identity   Description
CO4A6        6.043e-170   38.81      Collagen alpha-6(IV) chain OS=Homo sapiens GN=COL4...