# STOCKHOLM 1.0
#=GF ID BCSC_C
#=GF AC PF05420.2
#=GF DE Cellulose synthase operon protein C C-terminus (BCSC_C)
#=GF PI BCSC_N;
#=GF AU Moxon SJ
#=GF SE Pfam-B_10335 (release 8.0)
#=GF GA 25.00 25.00; 25.00 25.00;
#=GF TC 162.20 162.20; 168.70 46.60;
#=GF NC -118.00 -118.00; 23.70 21.00;
#=GF TP Family
#=GF BM hmmbuild -F HMM_ls SEED
#=GF BM hmmcalibrate --seed 0 HMM_ls
#=GF BM hmmbuild -f -F HMM_fs SEED
#=GF BM hmmcalibrate --seed 0 HMM_fs
#=GF AM globalfirst
#=GF RN [1]
#=GF RM 11260463
#=GF RT The multicellular morphotypes of Salmonella typhimurium and
#=GF RT Escherichia coli produce cellulose as the second component
#=GF RT of the extracellular matrix.
#=GF RA Zogaj X, Nimtz M, Rohde M, Bokranz W, Romling U;
#=GF RL Mol Microbiol 2001;39:1452-1463.
#=GF DR INTERPRO; IPR008410;
#=GF CC This family contains the C-terminal regions of several bacterial cellulose
#=GF CC synthase operon C (BCSC) proteins. BCSC is involved in cellulose synthesis
#=GF CC although the exact function of this protein is unknown [1].
#=GF SQ 7 #=GS BCSC_PSEFL/933-1262 AC P58937.1 #=GS BCSC_XANAC/1108-1489 AC P58938.1 #=GS BCSC2_ACEXY/929-1326 AC O82861.1 #=GS ACSC_ACEXY/943-1302 AC P37718.1 #=GS BCSC4_ACEXY/968-1307 AC Q9WX71.1 #=GS BCSC_ECOLI/784-1123 AC P37650.2 #=GS O67403_AQUAE/1-313 AC O67403.1 BCSC_PSEFL/933-1262 LSKITDVEAPFEARMPV..G.....DNTVALRVTPVHLSAGSVKAES.....LSRFGKGGTEPAGSQS.........................DSGVGLAVAFENPDQGLKADVGVSPLGFLYNTLVGGVSVSRPFEANSNFRYGANISRRPVTDSVTSFAGSED...GA....................GNKWGGVTANGGRGELSYDNQKL.GVYGY..ASLHELLGNNVEDNTRLELGSGIYWYLRNNPRDT.LTLGISGSAMTFKENQDFYTYGNGGYFSPQRFFSLGVPIRWAQSFDR..FSYQVKSSVGLQHIAQDGADYFPGDSTLQA......................TKNNPKYDSTSKTGVGYSFNAAAEYRLSSRFYLGGEIGLDNAQDYRQYAGNAYLRYLFEDL BCSC_XANAC/1108-1489 NNNSTAAALAALANEPLPAYLLSINASNTPIGQLARDILGNAALSEQLGAGDAAALRALAGSTAAAQQTPTGLADTLNAMAASGNGSRRLSQDDTGVGVGVRYRN..GGFSAELGSTPIGFQEQNLIGGVGYRG..ELGDTVSWSAEASRRAVTDSVLSFAGAQD..ARS....................GRQWGGVTSTGLSLSATADNGLL.GGYAN..LAAHRLQGNNVADNDHRQVDLGFYVHALETEHQS.LTAGVNLTTMQYDKNLSGFTYGHGGYFSPQDYVDLGFPVHWSGRTAGQTVNWKVDASVGVQHFSTEASPYFPTDPTLQQ......AAYDAASLAALLGLVDRYTDPVYASESRTGVSYNLSGAAEWQVAPQLFLGGRLTFNNARDYNQFSSNLYLRFVMDRL BCSC2_ACEXY/929-1326 MGRLTEANIPIVGRLPLQAG.....ASALTFSITPTMIWSGQLNTGSVYD..VPRYGTFMATQAANQCAGHSSCGGLDFLSANHTQRIAAGAGEAGFAPDVQFGN..SWVRADVCASPIGFPITNVLGGVEFSP..RV.GPVTFRVSAERRSITNSVLSYGGLRD..PNYNSEVGRYARQVYGHDLTKQWGSEWGGVVTNHFHGQVEATLGNT.ILYGG..GGYAIQTGKNVQRNSEREAGIGANTLVWHNANML.VRIGVSLTYFGYAHNEDFYTYGQGGYFSPQSYYAATVPVRYAGQHKR..LDWDVTGSVGYQVFHEHAAPFFPTSSLLQSGANYVASNFVQNALPTDYLSQETVNSAYYPGDSIAGLTGGFNARVGYRFTRNVRLDLSGRYQKAGNWTESGAMISAHYLIMDQ ACSC_ACEXY/943-1302 
MGALTEASVPIVGRIPLQAG.....TSALTFTATPTFLTSGHL.PQTGYD..IPRFGTNLFALERNLQNQNN........SAEHRINTDTIGREAGVAPDVRFAN..NWVSADVGASPLGFTLPNVIGGVEFAP..RV.GPVTFRVSGERRSITNSVLSYGGMTD..ALT....................GKKWGGVVTNHFHGQVEATLGNT.IVYGG..GGYAIQTGHHVQSNTEVEGGLGANTLVYRNRKHE.VRVGVNLTYFGYKHNEDFYTYGQGGYFSPQSYFAATVPVRYSGHSGL..FDWDVTGSIGYQLFHEHSSAFFPTNPVYQA.........LANGLAGVSTAELSLESARYPGDDVGSLVGGFDGRVGYRVSHSLRLDLSGRFQKAGNWDEGGAMISAHYLIMDQ BCSC4_ACEXY/968-1307 LGQLTEFAVPITATLPFESW.....DHRLSFSVTPTLLFTGDP.LTNAVS..AHQFGTVAVNGARPWGYHHY....................YTQGVGLSLNYVN..RWFAADVGSSPLGFPITNVVGGLEFAP..RLTRNLGLRISGGRRMVTDSELSYAGERD..PGT....................GKLWGGVTRLFGHGALEWSARGW.NAYAG..GGFAYLGGTNVIGNTETEAGAGGSATVWQDHDRQWLRVGLDLMYFGYKRNAYFFTWGQGGYFSPRQYFGAMVPVEWSGHNRR..WTWFLRGEAGYQYYHSNAAPYFPTSAQLQG...................QADGSPPSYYGDSGASGLAGNMRGRLVYQLDHRLRIGLEGGYSRAGSWSETSGMWMAHYTLDGQ BCSC_ECOLI/784-1123 YSDLKAHTTMLQVDAPY..S.....DGRMFFRSDFVNMNVGSF.STNADGKWDDNWGTCTLQDCSGNR.SQS.....................DSGASVAVGWRN..DVWSWDIGTTPMGFNVVDVVGGISYS...DDIGPLGYTVNAHRRPISSSLLAFGGQKDSPSNT....................GKKWGGVRADGVGLSLSYDKGEANGVWAS..LSGDQLTGKNVEDNWRVRWMTGYYYKVINQNNRR.VTIGLNNMIWHYDKDLSGYSLGQGGYYSPQEYLSFAIPVMWRERTEN..WSWELGASGSWSHSRTKTMPRYPLMNLIPT..................DWQEEAARQSNDGGSSQGFGYTARALLERRVTSNWFVGTAIDIQQAKDYAPSHFLLYVRYSAAGW O67403_AQUAE/1-313 MDKLIHTSVYLNAEKKLARGFYLWTENDLIF.............LDSGGKPDYTHFGGINLPVIRDTDTSFI...................GIEPMVGVRVGRKS...YLKLGIGSTPFGNSKVRPTFRGIFEANYRE.EDILLRLTLKRDSVRDSLLSYVGAKD..PHA....................DREWGRVVEQGGEVKFQLGSGYR.ESFLSLKGGYYDIEGKNVNDNSRYFLEIYPSLYLGNLLIDE.NYLGLFFRYENYDKNENLFYFGNGGYFSPKNFVLLG..PKYTGYFYTDRLMFRLNLLLGFLRFETD...........................................GDTTNTLAADISGEFEYKLKEKISLIGGLGYRNSKDYDEVSLNMGVKYYFNGL #=GC seq_cons 
hupLTcsslsl.uchPl.tu.....ssslsFp.sss.l.sGph.spsshs..hspaGshshssstspp.tp......................-sGVulsVtatN..phhpADlGuoPlGFshssllGGlpaus..ch.sslsaplsupRRslTsSlLSauGtcD..sts....................G+cWGGVsssthcsplphstGth.tsYuu..uuht.lpGpNVpcNochchshGh.hhlhpstpcp.lplGlshphhsYc+NtshaTaGpGGYFSPQpYhuhslPV+auGpptp..hsWclsuSlGaQhacpcuusaFPssshlQs......................phssshYsusotsGluhshsuthtY+lspplhLshphsappAtsaspsuu.hhs+Yhhts. // # STOCKHOLM 1.0 #=GF ID Ependymin #=GF AC PF00811.9 #=GF DE Ependymin #=GF AU Bateman A #=GF SE Pfam-B_1391 (release 2.1) #=GF GA 25.00 25.00; 25.00 25.00; #=GF TC 119.70 119.70; 26.10 26.10; #=GF NC 22.20 22.20; 24.30 24.30; #=GF TP Family #=GF BM hmmbuild -F HMM_ls SEED #=GF BM hmmcalibrate --seed 0 HMM_ls #=GF BM hmmbuild -f -F HMM_fs SEED #=GF BM hmmcalibrate --seed 0 HMM_fs #=GF AM globalfirst #=GF DR PROSITE; PDOC00699; #=GF DC The following Pfam-B family may contain sequences that according to Prodom #=GF DC are members of this Pfam-A family. #=GF DR PFAMB; PB011734; #=GF DR INTERPRO; IPR001299; #=GF SQ 7 #=GS Q90492_9TELE/6-178 AC Q90492.1 #=GS Q91331_9TELE/6-179 AC Q91331.1 #=GS EPD1_CARAU/24-197 AC P13506.2 #=GS Q91464_9TELE/5-186 AC Q91464.1 #=GS Q91465_9TELE/6-184 AC Q91465.1 #=GS EPD_CLUHA/22-197 AC P32187.1 #=GS EPD1_ONCMY/26-200 AC P28770.1 Q90492_9TELE/6-178 RQPCQSPSMTSGTLTVCSTGGHTVASGDFSYDSTAKKLRFVEDN.HSNKTSHMD.VLIHFEEGLMYELDSKNESCKKHTLQSRKHPMELPADASHENEMYLGSPSISEQGLRLRLWSGKLPDLHA........QYSMWTTSCGCLTVSCAYHAEKND.LIFSFFKVETEVND.SQVFVPPAYCDG Q91331_9TELE/6-179 RVPCPSPSMISGTLTVRSSGGHTVATGEFNYDSTGKKLHFLEKNDDSNKTSHID.VLIHFEEGVIYELDSKNESCKKQTLKSRKHPMELPADASHDSEVYLGSLVVPEQGLRLRVWTGKLPDLHA........QYTMLTTSCGCLTVSCYYHSDKTD.LIFSFLDVETHVDD.PQVFVPRPTCDG EPD1_CARAU/24-197 RQPCHAPPLTSGTMKVVSTGGHDLESGEFSYDSKANKFRFVEDTAHANKTSHMD.VLIHFEEGVLYEIDSKNESCKKETLQFRKHLMEIPPDATHESEIYMGSPSITEQGLRVRVWNGKFPELHA........HYSMSTTSCGCLPVSGSYHGEKKD.LHFSFFGVETEVDD.LQVFVPPAYCEG Q91464_9TELE/5-186 
REPCRSPPKTSGTLCVSSSKDDAIAWGEFKYDSSQKHLRFVEDTGKSNKTSYLD.VLIDFDKGVLYELDTKNESCRKQMLPSHKHPLELPSDATHVEELYLGRLDKTEQGLRVRLWSGNLSDHDAHHADHVQAHYSMTTTSCGCIAVSYTYHSEKND.LVFSFYNVKARVDD.SQAFTPPRYCEG
Q91465_9TELE/6-184 REPCRSPSKTSGTVCVSSTKDDTIAWGEFKYDSTHKRLRFVEDCSKSNKTSHLD.VLIHFEEGVLYELDPKNESCKKEPLPSHKHPLEVPSDATHMDELYLGSPDKSDQGLRVRLWSGNISHHNP...DHTPDHYSITTTSCGCITVSCTYHGEKND.LIFSFYNVKTEVDD.MQVFNPPDYCDD
EPD_CLUHA/22-197 HQPCRPPPQTHGNLWVTAAKGAPASVGEFNYDSQARKLHFKDDALHVNKTDHLE.MLIFFEEGIFYDIDSHNQSCHKKTLQSTYHCLEVPPNATHVTEGYLGSEFIGDQGVRMRKWRKRVPELDG........VVTVATTSCGCVTLFATLFTDSNDVLVFNFLDVEMKVKNPLEVFVPPSYCDG
EPD1_ONCMY/26-200 PQHCTSPNMTGVLTVLALTGGEIKATGHYSYDSTDKKIRFTESEMHLNKTEHLEDYLMLFEEGVFYDIDMKNQSCKKMSLHSHAHALELPAGAAHQVELFLGSDTVQEEDIKVNIWTGSVPETKG........QYFLSTTVGECLPLSTFYSTDSIT.LLFSNSEVVTEVKA.PEVFNLPSFCEG
#=GC seq_cons RpPCpSPshTSGTlsVsSotGcslAsGEFsYDSosKKLRFlEDs.+uNKTSHlD.VLIaFEEGVhYElDoKNESCKKpoLpS+KHshElPuDAoH.sElYLGS.slsEQGLRlRlWoGplP-hcu........pYohsTTSCGClsVSssYHu-KsD.LlFSFhsVcTcVcD..QVFsPPsYC-G
//
# STOCKHOLM 1.0
#=GF ID Como_SCP
#=GF AC PF02248.7
#=GF DE Small coat protein
#=GF AU Bateman A, Mian N
#=GF SE Pfam-B_2294 (release 5.2)
#=GF GA -60.00 -60.00; 25.00 25.00;
#=GF TC 97.20 97.20; 171.90 171.90;
#=GF NC -85.00 -85.00; 13.10 13.10;
#=GF TP Domain
#=GF BM hmmbuild -F HMM_ls SEED
#=GF BM hmmcalibrate --seed 0 HMM_ls
#=GF BM hmmbuild -f -F HMM_fs SEED
#=GF BM hmmcalibrate --seed 0 HMM_fs
#=GF AM globalfirst
#=GF RN [1]
#=GF RM 1546463
#=GF RT Nucleotide sequence and genetic map of cowpea severe mosaic
#=GF RT virus RNA 2 and comparisons with RNA 2 of other
#=GF RT comoviruses.
#=GF RA Chen X, Bruening G;
#=GF RL Virology 1992;187:682-692.
#=GF DR SCOP; 1bmv; fa;
#=GF DR INTERPRO; IPR003182;
#=GF CC This family contains the small coat protein (SCP) [1] of the
#=GF CC comoviridae viral family.
#=GF SQ 10 #=GS Q9YWK0_9COMO/739-920 AC Q9YWK0.1 #=GS Q9YWJ9_9COMO/828-1009 AC Q9YWJ9.1 #=GS VGNM_SQMVM/622-803 AC P36341.2 #=GS VGNM_BPMV/821-1004 AC P23009.2 #=GS VGNM_BPMV/821-1004 DR PDB; 1pgw 1; 1-184; #=GS VGNM_BPMV/821-1004 DR PDB; 1pgl 1; 1-184; #=GS VGNM_BPMV/821-1004 DR PDB; 1bmv 1; 1001-1184; #=GS VGNM_RCMV/786-964 AC P13561.1 #=GS Q66170_CPMV/720-899 AC Q66170.1 #=GS VGNM_CPMV/837-1016 AC P03599.1 #=GS VGNM_CPMV/837-1016 DR PDB; 2bfu S; 4-183; #=GS VGNM_CPMV/837-1016 DR PDB; 1ny7 1; 4-183; #=GS VGNM_CPSMV/808-988 AC P31630.1 #=GS Q66179_APMV/2-185 AC Q66179.1 #=GS VGNM_APMV/802-985 AC P38485.1 Q9YWK0_9COMO/739-920 MVQQLGTYNPIWMVRTPLES.TAPQNFASFTADLMESTVSGDSTG.NWSITAYPSPISNLLKVAAWKKGTIRFQLICRG.AAVKQSDWAASARIDLVNNLSNKALPARSWYITKPRGGDIEFDLEIAGPNNGFEMANSSWAFQTTWYLEIAIDNPKQFTLFELNACLMEDFEVAGNTLNPPILLS Q9YWJ9_9COMO/828-1009 MVQQLGTYNPIWMVRTPLES.TAQQNFASFTADLMESTISGDSTG.NWNITVYPSPIANLLKVAAWKKGTIRFQLICRG.AAVKQSDWAASARIDLINNLSNKALPARSWYITKPRGGDIEFDLEIAGPNNGFEMANSSWAFQTTWYLEIAIDNPKQFTLFELNACLMEDFEVAGNTLNPPILLS VGNM_SQMVM/622-803 MVQQLGTYNPIWMVRTPLES.TAQQNFASFTADLMESTISGDSTG.NWNITVYPSPIANLLKVAAWKKGTIRFQLICRG.AAVKQSDWAASARIDLINNLSNKALPARSWYITKPRGGDIEFDLEIAGPNNGFEMANSSWAFQTTWYLEIAIDNPKQFTLFELNACLMEDFEVAGNTLNPPILLS VGNM_BPMV/821-1004 SISQQTVWNQMATVRTPLNFDSSKQSFCQFSVDLLGGGISVDKTG.DWITLVQNSPISNLLRVAAWKKGCLMVKVVMSGNAAVKRSDWASLVQVFLTNSNSTEHFDACRWTKSEPHSWELIFPIEVCGPNNGFEMWSSEWANQTSWHLSFLVDNPKQSTTFDVLLGISQNFEIAGNTLMPAFSVP #=GR VGNM_BPMV/821-1004 SS ----TT--EEEEEEE--TT--TTT-SEEEEEEETTTTEEEE-SSS.--EEEE---HHHHHHHHEESEEEEEEEEEEEE--TTS-GGG---EEEEEEESSSSTTS--SEEEEE--SSEEEEEEEEEE-BTTEE-B-TT-TTTTB---EEEEEEESTTTS-EEEEEEEE-TT-EEBSBEE---EE-- VGNM_RCMV/786-964 VRTTDGVYSTCFRVRTPLA....LKDSGSFTCDLIGGGITTDSNT.GWNLTALNTPVANLLRTAAWKRGTIHVQVAMFG.STVKRSDWTSTVQLFLRQSMNTSSYDARVWVISKPGAAILEFSFDVEGPNNGFEMWEANWASQTSWFLEFLISNVTQNTLFEVSMKLDSNFCVAGTTLMPPFSVT Q66170_CPMV/720-899 
CAEASDVYSPCMIASTPPAP...FSDVTAVTFDLINGKITPVGDD.NWNTHIYNPPIMNVLRTAAWKSGTIHVQLNVRG.AGVKRADWDGQVFVYLRQSMNPESYDARTFVISQPGSAMLNFSFDIIGPNSGFEFAESPWANQTTWYLECVATNPRQIQQFEVNMRFDPNFRVAGNILMPPFPLS VGNM_CPMV/837-1016 CAEASDVYSPCMIASTPPAP...FSDVTAVTFDLINGKITPVGDD.NWNTHIYNPPIMNVLRTAAWKSGTIHVQLNVRG.AGVKRADWDGQVFVYLRQSMNPESYDARTFVISQPGSAMLNFSFDIIGPNSGFEFAESPWANQTTWYLECVATNPRQIQQFEVNMRFDPNFRVAGNILMPPFPLS #=GR VGNM_CPMV/837-1016 SS ---B-S-BEEEEEEE---TT...--S-EEEEEETTTCCEEE-SS-.-SEEEE---HHHHHHHCECCEEEEEEEEEEEEE.-S--CCC--B-EEEEEESS-CTCC--SEEEEEBSSSSEEEEEEEEE-BTCCC-B-TTBTTTT-B--EEEEEES-TTTEEEEEEEEEE-TT-EEEEEEE---EE-- VGNM_CPSMV/808-988 SGQTQQVWNKIWRIGTPP...QATDGLFSFSIDLLGVELVTDGQE.GAVSVLSSSPVANLLRTAAWKCGTLHVKVVMTGRVTTTRANWASHTQMSLVNSDNAQHYEAQKWSVSTPHAWEKEFSIDICGPNRGFEMWRSSWSNQTTWILEFTVAGASQSAIFEIFYRLDNSWKSAGNVLMPPLLVG Q66179_APMV/2-185 CSPCINVWSEFCALDIPVVD.TTKVNFAQYSLDLVNPTVSANASGRNWRFVLIPSPMVYLLQTSDWKRGKLHFKLKILGKSNVKRSEWSSTSRIDVRRAPGTEYLNAITVFTAEPHADEINFEIEICGPNNGFEMWNADFGNQLSWMANVVIGNPDQAGIHQWYVRPGENFEVAGNRMVQPLALS VGNM_APMV/802-985 CSPCINVWSEFCALDIPVVD.TTKVNFAQYSLDLVNPTVSANASGRNWRFVLIPSPMVYLLQTSDWKRGKLHFKLKILGKSNVKRSEWSSTSRIDVRRAPGTEYLNAITVFTAEPHADEINFEIEICGPNNGFEMWNADFGNQLSWMANVVIGNPDQAGIHQWYVRPGENFEVAGNRMVQPLALS #=GC SS_cons ---BTT-BEEEEEEE--TTT.TTT-SEEEEEEETTTCCEEE-SSS.-SEEEE---HHHHHHHCECCEEEEEEEEEEEEE.TTS-CCC--BEEEEEEESSSCTCC--SEEEEEBSSSCEEEEEEEEE-BTCCC-B-TTBTTTTBB--EEEEEECSTTTCEEEEEEEEE-TT-EEECEEE---EE-- #=GC seq_cons ssps.sVYsshhhlcTPlss.osppsFuuFThDLlsusISsDuoG.NWshslhsSPIuNLL+TAAWK+GTIHhQLhhpG.AuVKRSDWuuosplsLppuhuscuhsARoWhIocP+uu-lpFslEIsGPNNGFEMhsSsWANQTTWaLEhlIsNP+QhslFElsh+lspNFEVAGNsLhPPlsLS // # STOCKHOLM 1.0 #=GF ID MHC2-interact #=GF AC PF09307.1 #=GF DE CLIP, MHC2 interacting #=GF AU Mistry J, Sammut SJ #=GF SE pdb_1muj #=GF GA 13.30 13.30; 25.00 25.00; #=GF TC 22.40 22.40; 25.30 25.30; #=GF NC 4.30 4.30; 16.30 16.30; #=GF TP Domain #=GF BM hmmbuild -F HMM_ls SEED #=GF BM hmmcalibrate --seed 0 HMM_ls #=GF BM hmmbuild -f -F HMM_fs SEED #=GF BM hmmcalibrate --seed 0 
HMM_fs
#=GF AM globalfirst
#=GF RN [1]
#=GF RM 12589760
#=GF RT Crystal structure of MHC class II I-Ab in complex with a
#=GF RT human CLIP peptide: prediction of an I-Ab peptide-binding
#=GF RT motif.
#=GF RA Zhu Y, Rudensky AY, Corper AL, Teyton L, Wilson IA;
#=GF RL J Mol Biol. 2003;326:1157-1174.
#=GF DC The following Pfam-B family may contain sequences that according to Prodom
#=GF DC are members of this Pfam-A family.
#=GF DR PFAMB; PB077178;
#=GF DR INTERPRO; IPR015386;
#=GF CC Members of this family are found in class II invariant
#=GF CC chain-associated peptide (CLIP), and are required for association
#=GF CC with class II major histocompatibility complex (MHC) in the MHC
#=GF CC class II processing pathway [1].
#=GF SQ 6
#=GS Q2KM24_ANAPL/2-116 AC Q2KM24.1
#=GS Q6R7Z7_CHICK/2-120 AC Q6R7Z7.1
#=GS Q307W8_CAICR/1-119 AC Q307W8.1
#=GS HG2A_RAT/1-113 AC P10247.2
#=GS Q764N1_PIG/1-111 AC Q764N1.1
#=GS Q52KV2_XENLA/43-159 AC Q52KV2.1
Q2KM24_ANAPL/2-116 AEEQR.DLISDRGS.GVVPMG...DSQRSAFGRRAALSTLSILVALLIAGQAVTIYFVYQQSGQISKLTRTSQNLQLEALQRKLPKSSKSAGNMKMSMVNTPLAMRVLPLAPS...LDDTPVK
Q6R7Z7_CHICK/2-120 AEEQR.DLISCPSSSGVLPIG...NSERSSLGRRTALSALSILVALLIAGQAVTIYYVYQQSGQISKLTKTSQTLKLESLQRKMPIGTQPANKMSMSTMNMPMAMKVLPLAPSVGDMPVEAMK
Q307W8_CAICR/1-119 MDEDRSDLISTPEQTTAPALTAHSGSRRIVCSRGAVYSALSILVALLVAGQAVTVYFVYQQGNHISKLTKTTQTLQLESMSRKLPQG.KSSNKMKMAMISMPMAMRELPLASK...MDAGPTD
HG2A_RAT/1-113 MDDQR.DLISNHEQ..LPILGQRARAPESNCNRGVLYTSVSVLVALLLAGQATTAYFLYQQQGRLDKLTVTSQNLQLENLRMKLPKSAKPVSPMRMAT...PLLMRPLSMDN....MLQAPVK
Q764N1_PIG/1-111 MEDQR.DLISNHEQ..LPMLGQRPGAPESKCSRGALYTGFSVLVALLLAGQATTAYFLYQQQGRLDKLTVTSQNLQLESLRMKLPKPSKPLSKMRVSA...PMLMQALPMEG......PEPMR
Q52KV2_XENLA/43-159 AEESQ.NLVPEHVPGQSVVDVGNRE.PRMSCNKGSLVTALTVLVVVLVAGQAVMAFFITQQNSRIQDLDRSTKNLQLKAMMKELPGSPPVPSKQKMRTFNIPMALKLYDGSE....MNMNDLE
#=GC seq_cons hEEQR.DLISs+pp.tlsslGtp.su.RSsCuRGulhouLSlLVALLlAGQAVTsYFlYQQsG+IsKLT+TSQNLQLEuLp+KLPpusKssuKM+MushshPMAM+sLPhus....MsssPh+
//
# STOCKHOLM 1.0
#=GF ID Caudal_act
#=GF AC PF04731.3
#=GF DE Caudal like protein activation region
#=GF AU Kerrison ND
#=GF SE DOMO:DM04892;
#=GF GA 25.00 25.00; 15.30 15.30;
#=GF TC 46.20 46.20; 21.30 21.30;
#=GF NC -6.30 -6.30; 13.20 11.50;
#=GF TP Family
#=GF BM hmmbuild -F HMM_ls SEED
#=GF BM hmmcalibrate --seed 0 HMM_ls
#=GF BM hmmbuild -f -F HMM_fs SEED
#=GF BM hmmcalibrate --seed 0 HMM_fs
#=GF AM globalfirst
#=GF RN [1]
#=GF RM 11729123
#=GF RT Phosphorylation of the serine 60 residue within the Cdx2
#=GF RT activation domain mediates its transactivation capacity.
#=GF RA Rings EH, Boudreau F, Taylor JK, Moffett J, Suh ER, Traber
#=GF RA PG;
#=GF RL Gastroenterology 2001;121:1437-1450.
#=GF DC The following Pfam-B family may contain sequences that according to Prodom
#=GF DC are members of this Pfam-A family.
#=GF DR PFAMB; PB179679;
#=GF DR INTERPRO; IPR006820;
#=GF CC This family consists of the amino termini of proteins belonging to the
#=GF CC caudal-related homeobox protein family. This region is thought to mediate
#=GF CC transcription activation. The level of activation caused by mouse Cdx2
#=GF CC (Swiss:P43241) is affected by phosphorylation at serine 60 via the
#=GF CC mitogen-activated protein kinase pathway [1].
#=GF CC Caudal family proteins are involved in the transcriptional regulation of
#=GF CC multiple genes expressed in the intestinal epithelium, and are important in
#=GF CC differentiation and maintenance of the intestinal epithelial lining.
#=GF CC Caudal proteins always have a homeobox DNA binding domain (Pfam:PF00046).
#=GF SQ 9 #=GS P79788_CHICK/13-172 AC P79788.1 #=GS CDX2_HUMAN/13-180 AC Q99626.2 #=GS HMD1_CHICK/1-132 AC P46692.1 #=GS CDX1_XENLA/13-142 AC Q91622.1 #=GS Q90262_BRARE/13-147 AC Q90262.1 #=GS CDX1_MOUSE/13-147 AC P18111.2 #=GS CDX4_HUMAN/13-171 AC O14627.1 #=GS CDX4_MOUSE/13-169 AC Q07424.1 #=GS Q91542_XENLA/16-147 AC Q91542.1 P79788_CHICK/13-172 MY.PGPVRHSGG.............LNLAAQNFV......GAAQYADYGGYH.........VNLDGA.QSPG....PVWAAPYAAPLRDDW.GAYGQGAPPPAAAAAVHGLNGGSPAAAMAYS.PVDFHHHHHHHPHAHHHVGPETHCSAGGMQTLGAAPAAAAASAAPEPLSPGGQRRGLCEWVRKPAQAPLGSQ CDX2_HUMAN/13-180 MY.PSSVRHSGG.............LNLAPQNFV......SPPQYPDYGGYH.VAAAAAAAANLDSA.QSPG....PSWPAAYGAPLREDW.NGYAPGGAAAAANAVAHGLNGGSPAAAMGYSSPADYHPHHHPHHHPHHPAAAPS.CASGLLQTLNPGPPGPAATAAAEQLSPGGQRRNLCEWMRKPAQQSLGSQ HMD1_CHICK/1-132 MY.PSPVRHPG..............LNLNPQNYV.....PGPPQYSDFASYHHVPGI.....NNDPHHGQPA....AAWGSPY.TPAKEDW.HSYGT...AAASAATNPGQFGF.........SPPDFNPMQPH.............AGSGLL........PPAISSSVPQLSPNAQRRTPYEWMRRSIPSTSSSG CDX1_XENLA/13-142 MY.PNPVRHPG..............LNLNPQNYV.....PAPPQYSDFPSYHHVPGI.....NSDPHHGQPG....GTWSS.Y.TPSREDW.HPYGPGPGASSANPTQIA............FSPSDYNPVQPP..............GSGLL........PPSINSSVPPLSPSAQRADPYEWMRRTGVPTTTTT Q90262_BRARE/13-147 MYHQGAVRRSG..............ISLPPQNFV......STPQYSDFTGYHHVP.......NMET.HAQSA....GAWGAPYGAP.REDW.GAYSLGPP....NSISAPMSNSSPGP.VSYCSP.DYNTMHGP..............GSAVL.......PPPPENIPVAQLSPERERRNSYQWMSKTVQSSSTGK CDX1_MOUSE/13-147 VY.PGPARPSS..............LGLGPPTYAPPGPAPAPPQYPDFAGYTHV..........EPA.PAPP....PTWAAPFPAP.KDDWAAAYGPGPTA......SAA....SPAP.LAFGPPPDFSPV..P........APPG.PGPGIL........AQSLGAPGAPSSPGAPRRTPYEWMRRSVAAAGGGG CDX4_HUMAN/13-171 MY.PGTLMSPGGDGTAGTGGTGGGGSPMPASNFA......AAPAFSHYMGYPHMP.......SMDP.HWPSL....GVWGSPYSPP.REDW.SVY.PGPSSTMGTVPVNDVTS.SPAA...FCS.TDYSNL.GPVGGGTSGSSLPGQAGGSLV.........PTDAGAAKASSPSRSRHSPYAWMRKTVQVTGKTR CDX4_MOUSE/13-169 
MY.PGTLRSPGGSSTAGVGTSGGSGSPLPASNFT......AAPVYPHYVGYPHM.......SNMDP.HGPSL....GAWSSPYSPP.REDW.STY.PGPPSTMGTVPMNDMT..SPV....FGSP.DYSTL.GPTSGASNGGSLPDAASESLVS.........LDSGTSGATSPSRSRHSPYAWMRKTVQVTGKTR Q91542_XENLA/16-147 MY.PLSVRSTS..............NNHTAQNYV......SNQPYFNYVAYHHVP.......PMDDQ.GQPMEFGDSVWITEGGMELL..W.SWT.L.......TQIWHKSSDLSPNQ.FAYNS.SGYSSP.HP.............SGTGILH........SVDLTHAAANSPSALSQNSYEWMGKTVQSTSTGK #=GC seq_cons MY.PusVRpsG..............lsLssQNaV......usPQYsDasGYHHVs.......NhDst.spPs....usWuuPYusP.REDW.usYusGssus.ussssts.ss.SPs....asSPsDYssh.tP.............sGuGlL........ssstsusssshSPuupR+ssYEWMRKolQsousop //