Ingvar Eidhammer, Inge Jonassen, William R. Taylor
PROTEIN BIOINFORMATICS: An Algorithmic Approach to Sequence and Structure Analysis
Ingvar Eidhammer and Inge Jonassen, Department of Informatics, University of Bergen, Norway
William R. Taylor, Division of Mathematical Biology, National Institute for Medical Research, London, UK
John Wiley & Sons, Ltd
Part I
1 Pairwise Global Alignment of Sequences
1.3 A Scoring Scheme for the Model
1.4 Finding Highest-Scoring Alignments with Dynamic Programming
1.4.1 Determine H_{i,j}
1.4.2 Use of matrices
1.6 Scoring Gaps: Gap Penalties
1.7 Dynamic Programming for General Gap Penalty
1.8 Dynamic Programming for Affine Gap Penalty
1.9 Alignment Score and Sequence Distance
2 Pairwise Local Alignment and Database Search
2.1 The Basic Operation: Comparing Two Sequences
2.2.2 Repeating segments
2.3.2 Finding the best local alignments
2.3.3 Scoring matrices and gap penalties
2.4 Database Search: BLAST
3.1 Hypothesis Testing for Sequence Homology
3.2.1 Poisson probability distribution
3.4 Probability Distributions for Gapped Alignments
3.5 Assessing and Comparing Programs for Database Search
MultiDlèGlobrl Alismenr and Phylogenetic T.€6 .1.1.2 a pBnìDgaìeon66 îor ùe Dp soLurion 4',Mul'ipìeA|igihdhmdPh'|oFreÌicTtds l,r.lTteDmhe'ofdiflÙù''Gbpoloei$ 4.r.2 Moìecutù.lockÌheory 4.r.,1
DifÍeEnr.pprùchestd lmmhdjns
4.3ó Roorincorlé t6t: bmMpping 4.3.7 Sbtisricsl 4.4.1 Aligrinst{o subsrùsnmnb '1.1'sseqEr@vè|glB
5.r s.ùrirg Múi€! ss€d onFùY 5.2 PAMscùns MdÙic$ slhsùtuiìornùn 5.2.2 calcuìare 5:.r
MàtricsrorAtn$ dol!úon!4 Ljme
5.2.6 ScorìigMties (ìos oddrms@t 5'2'7Estìrdinglheevo|Ùlionij'dtbl.Ò
Comparing BLOSUM and PAM Matrices
6 1.2 Rènovi'g Ns úd corms 6.1.3 Pdsidonvùshb 6.r.4 scqEnc wcrls T ó.1.5 rerùc grps seNhìtrs Dd,bses wirhPrÒ6les IdTSd BLAST:PSI-BLAST 6'3'lM*jnguì.mÙlljPlea]ìgrmeÚ 6.32 cdsùcddg rh. Èofile
ó..1.2 Cdnnrudiry, Èrrìle HMM rof a prcreùfmìly
?.r 7.?
Îe PROSìî!,ragù4Ò E\ac,/aprminaÈMÍrhins
7 4.?
CompùnÒDBaedMcrh0ls ?.7.1 Piwl rllsd nertrods
Pfrem Dfirn MdhodsrPruh
Part II
8 Structures and Structure Descriptions
8.1 Units of Structure Descriptions
3.5.1 srmdted sheb (roPs) 3.5.5 ropolosyorÈo'einsrrucruE 3.6 rdsrryins i\e ssEs 3 6.2
DènÉseÒr&4 shcM orPndis (Dss?l
Superposition and Dynamic Programming
9.2 93
9 r 3 U$nsRMSDÀsoriisofsdctuEsiDiìeìries rnd Alignmed AlEmtins Sùpcrsìsirion DoùblèDynmic Pner3lmine 9.r.1 Ldlevel soring naùies 9'3.3lbdEddohtdylani.P,lgrmriig
l0 ì0 I I îvd.dirúsionl seonètic Mshins rorshdùrc.onpùison 10.12 GamcùichÀsbiis
10.2.r Meaqjq rhesinìlùiry or dirù.è ($b)mries
Cluste.irg: CombiùineL@alSiùúlùiti6 aodCo*nhcy 111 compdibirirr rl.2 s.ùchirsrorS.ù1Mtuhà\ ll.3'2orefìappinccla..6
ll.5.l Compffins rtrtomaÌioDs ll.s.2 cd.ul ii! drercv hoslohdiòn rr ó Crlrdig by U.corRetarions 1r.62 Ceondfi.fchrion
12 Significance and Assessment of Structure Comparisons
coishritrg R.1.dom shdua ModÈlN 11.2.1 cÒnsndinenorEduidadrbsG r2.2.2 Den!úloD ti'r for.imì1 l,
13 Multiple Structure Comparison
rinding r conmon corc fior ! MuìriptcAtignncn!
135.2 Diso(nne PrcrinsPúem l3.5.3 lhe aPPúmh 135'4 scrirg ùs Prckns úods rots 13.7 Biblìo$.Phic
la Ptut€tnStnctúre Cla$ilcatiotr r1.2 An lsinsModeltof DoMi! rddúficfior 14.31 MdnìY{donl?B r4.r.2 Miinry-, domdns d , dorùis lari 14.5 Aúomtic APprcshs b Cl,ssi6cati0n rj.6 DibhMs lor Sh.M cì6sific'rion r4.7 FSsP-DdiDomhDidioúJY Ia3.1 Domains r.r.3.4 ToF,losy(roìdlibÍv) lmili6 143ó sequcdce ddsìrì.adonpD*dÙE Th. CAIII 14.3.? BNed m sùcLs r1.9 clssificarior
15 Srmctùrehealt tro!: Th@ding 15., PFb! Ssonddf, SÙtrctuEPtdicúoD 15.1.3 Ac.ú4v in $cond.4 shctuE predicÌior ts.r
MdhodsBucd d sequde ausmst rri.l Tr.3D rD ratchiic n.ìrod '5,:]'2]xcFUCU€ú.úod 15.4.1 PotúiolsofFd
Double Dynamic Programming
AppeDdixA Brsic in Mathematics,PrcbÀbitity..d
B Introduction to Molecular Biology
r probiobioiDfomaiìG,rcuiins or mdhÒds
!lms'gúnlberelbaisforch6inglbgtish.plogDmndtb{iglipaEn46od oprions,aÍd liniuy be&ne noÈ conÉ rhe onpúr rcicd (o! q,npúú sico@ rodcú). on 'he dhq hand,nly lìDd aid n or sh. mry edi úc údèóbdiis ics$ry b mmc iftú!ìng probrè'ns, ónnù
bsis for ólabon'ivc prcja 6 nodùe igbt balanemakinsùe
ùis by l$uiry on rheideÀ or rhemrhdrlprcgEft lhire sryiis mry r scnrd 'Ih. ider {bou(heir ùù of appri.ltior úcùons ùo .lso d$nb.n rorn{lly $
'ÀeonginilÉ*afthpapcsalwdlsÈv r.o
ahoú ùe abjcd i EoE daù.
etve.xene desdFio$ onheivanabbdrbbrss forDNA aodprccinsquEncs, Tr4e hool! tpìdlly 3oì! y.e lirlè d Md ii ú. pmgrumsor hk. îrrc a
my rì'd sonc sdios
hardro foìtor lwhi.h oi bc 'kipped). ri . simitrnEDq
s we Fdìdc biolo3nar ndidioi
ùd rÀ.
What is bioinformatics anyway?
.vùbsddodfmflEbokpìi.ed rndmofDNAbyw{sialdcfi
mjÒ! bE*rhrouehsitr rlÈ 1910s J r95
x n rhcdeveìopnst ii rres fó
i. rùvcr duì !ùh aos ùd od (bìr).DNA trles(rùdodid$) dìd comein foùr dijìcrú
ldopnm'olhighlhrcuchpuln.lh ih. appì'6ùù 0f 6mpúh
h h6Yd
pi., bu.rilì oreof rheho$ widclyusd :inilÙ6agiqqu4s.qu.i..'Ti.q ru be*prÈd rohapp.nby ch!n.è)si rero rhGcorrriòdaúbs. prorin.areoirhms dì*ú$n in dePlhìl chaP€n r-3. úd, anddudy rhcarìsmsd ro s,in infomrion abournherclarionshìps bdwes 6ùaìaldsfurudprcpefiesommonblhc h.E n\e sme dioo aid). 'ìe aino aid rir. sdc( îcspdivcry) 'bù nry bchifu úNr rhcdnrrc. or.cùdiry rhdtr€ riomfy felarionship beNen a s. or prcbns ror seret. sorh tor pmEin runriona mryss, i' is .ùciaì 'bd rhemùlÌipìealism !fles rc rrcúedin chlpÎeB,r 7. !ìmtl foreuùrd smd€s dd 'hemjtrwùkn's hù*s (pedomjJìe,rordddq tuúbdkm rndsignalìitre) ii livinscel d€*), b bdh ùdddshd rheevolùrimùrpóbìls Gìn.Òrhcsúùcrùie013Èdeir rhdg6 froÉ slowlyin cvolùtionúù d@si$ $qúrc4, aid.o id.idry rb.ommor da$i6.!rìoi or rho úivù*
of prc!ìr 3hdrca. A coúnon appfúh.o doirg t r . , q ,\ n 4 r G t o , l r o n , t r . . , r u . o n
oi prcied rhd(s.
rn Pld n (chlpÌen 3_r4) Ne defnb.
frudùr grcnrssqù.i!Ò('hcodtrorrmjnomidsrtÒrs
!c.rc .tìa i o.ù. rru4re diíion
No This Qn !s sonc
d rhùdine in pin n (chapref l5)
ind porh rrudres Mo{ofùci'lg rrpìicrbìe 'o iúcrùÒ.ide(DNA ù RNÀ) squcr.ù\ lrde e. bÒ\$€( a nùnbú
. o. ,. (, . . . dedÒksuD$ecifred úìno &ids. . a, ., D,. . . (úc oràr..Èf codd is us.dtur spdi8l Àùiro ùjds. . c, Ìd c" e boÌl ù*d fùr rhcblcrbonea{arhon arom. i e is a geHrr aiphabd,mÒsdyusd fd rì. id or mino dds.
. sr' r Nd ró Ir,.rr. ..., Í I. . { is ùsedror a (qùery)*queftq ud a for a dÍib.e seqúna. ! 4r..J;rìcruh6.qu.nccGuùsùing)or4fim4 rÒ4r. . . isùedfùrsèrcBrBidue. . Jsd S'ùc us.drore.orìnsffirìy b r I is usd for o $dir€ mrix, Rd,is rhcsoins beMeer. ud à.
r ur,4,ar,rj.4,4....
. , 4 r . A r ,A , , r / . & , r . . . . . f t N e d . ? is ùsd asa prh in dynmi. prcsGnnirg.
. ( 5 ,P )= r R ( r ,r . . . . ) . ùsìlgrhe$o is na|nxr?.s is rhe$oline,hd t
Acknowledgements 'ftis bmk is builronletoE mb for llm6 in bioirlmrùd aìsonìàms bughr r 'n. 0niv.6iryof Bà-ger.we rcknowredle en studedb whùhM torlow€d rhe drarr oysbìnrùfod Elsdsm ùd iiromaliE ardiúpinns.on!ffirions *iù ReinA,srúdiDdKjerrPtuen, Sone of rh. d.itur ov.rd buirdronjoinrwr únhavh Brumr md DrvidGtb.r eI ùd èpdùUy RurhNusimr ud omù Drcffor vrubìe he1p. rinally,rbinrsroou fmiliG fork cpirgupvirh usdufìns
Part I
Pairwise Global Alignment of Sequences
.h.."" '","
llll:11'":l''-) 6"' d.r r prrif we
dt dù1ùùaqtu@r
016 d \ r" &.",. i
nB^. -.N"i.r
drhÉ5 6uúr
rhcrclrìoNhìpby a/irrù8 'he qEEs. nre iliemù' strùtd ippencdinòccldlurioiof rhcnqìsLqucnces.
(ùoded by rrd,f) nEms delèrilji d iirÒrión (t/rn.
onc q rrsr
owi Grd, ir iÒrkióNn).rsilLnrì$mcnr (onryone hlLrpscd Hidtrechrqe ìn eùb muturioù. rheaùsnnenr b€rnenr rnd
:1.t1:!.11: :t oi.4db, : too. ftncon.
Rod I ooeoiúleBl eú$brliii6.
s à t r r d i r n . r , , ' or . o o ! . , n e m r
sd o- \ ne . hpr. moder.r r ù!d. q b m rÈc ù" ,, Àù ù mnmc
rbuns r*o ildeb. onè hiíoy cÒutdbe
m\ mùr .ì " f - e r h o r t dh , e o ú o @ú a r t Ò E h r i r s " n ; . , u r n 6 "^.o ro pcú rr s.rcrn q s ,\ oir' mer, sfu b, .!!roc_ d ,,4',.s;/" prce n
1.2 What is an Alignment?
mN \djsrythchrr'^linsoiqdúú (resdu6)inq andI h! . Aìl synbols
1.3 A Scoring Scheme for the Model
(additive scoring scheme).
. À , ,- l f u 4 = , , 0 r o | , + , i
1.4 Finding Highest-Scoring Alignments with Dynamic Programming
'*;ffiTliÈi':r-if ;,y,#.r'f.:;hBi*r
Using dynamic programming, find the highest possible score.
hish*r ff iltl*hidìns,rc sùcbv6 ns,,e To.xpìainrie m..hodee ùbdE
"r"Íi:tì,:tr&*J '
sorÈ rohior
Pffi :T::Tr1'1 ilfi#:H',Ì
r 4, rh i!\ synbolùf4,lj ì.1în syDboì orr. "
'nbnk orq rr era0Pìe.r, î =:th. ! rr , r rhesequenc! or 'nc jì6.I lmbok or l.
Ndclhir,,lvillbe'hÈbìgh*soc u* of dÉ Eîami. pfd!Òmmùs wa r,.j by siig oie or mof or rr.J. 0 < f < i 0
LA.2 Îì!arjenmenrforlar,.rr r ) an onlyend+hhoÈ orrhrc djlrcfcm dùm3:
xtsrfi|dsdDlcaionf$Ir;Jb], ddennìtre'becon.l!rL(fu':/l]. we us i = 3 ardr = 4ós sanptein'heeiplrmiE, (4r.I = vEr. /r r= PREî).A$me ve kmv r! r. Îìc i'rrgimciÌe|dssnh (4'. ) Fo.rhe*mdd il is (Ì, ) ftealjlinHÌ fol rr dhtc
2. rhe irilrtri'
eùr \nh ( .,J, c,r)
i mry be vi-
By lsi,rsdic rmc qprindnn r
t-l ri
PAIRMSECLOBALAI-]CNMENTOFS,IQUENCES r. fte argDmn'4ds vjrh (4,/r.rr.disinorisúù
di..,,l!t ';
! Etrl
PE b
., I + Riir, (r?4 =0h'hc4anpre).
F,.r=nylÉj) |
I ishs, .. oe ! . h ,tu, | ,a ,"rr b, e I tr. onÀ ,.4_ t E H, rt t+R..1). |h @
ri hcìph ùe ,lisn;s pNsq n is arurcprifc 6 úecc ú. eoÈs 4J in ! &oùe ùnd mrix orsize(à + l). (,r+ l) s sho*i in FisE 1.2.rhe aIWs shd! fhich úrief fiììedens m uen forcdcuta,ingùîevd@ofr elì, rr (i = r. = I I I N q d , ó n r r e 4 f r . n . o m nd o w rf e ù€boroî,iA\|tun er.HowN . !e r ar I qnnor be cd.dabd. lhmf@, *e bavero itrìriattu rhevaìus ú re atrd . o p r yú l r r . , ì r o / ,
. li,
;;; E":.h trorc-& lrDr iìÀ:
d o 1 îú c w d i n s v . r hI b t ù * , c h r td , "
_rs. F,suFr , .bow.h.
af,befiUed'bdl''tshq''Fig'l|\,' rhFo!'mp|cÚi4t|en.'ed'lrrcq'À|sw'iJ'flgln'd||'
'ì? mìshboùclls vit eiE rhcndimum v'ttre 6mùr yi n couìdh0pped rtìú au -"i^, -tv. Nd" ,hú k,h" nE. *'
E N .=t ú a x Í H r . 6c . H 1 i s , i r ? , 6 + t a d , l - h a t o rvÈsc 6a! rbeD*inur vrtue0) ìs
Prce(s* BibtiosúPhicnor,
1.4.3 Finding the alignments that give the highest score
To find the alignments, we can follow the arrows from H_{m,n} backwards to H_{0,0}. The arrows to follow for the example are shown in Figure 1.3(b).
med'o nrìrdc (4',4). !ùicb od6.hd rhe ! bìlir usirgrhosiùrès(ssriF r! r).!e [ad ù( lorlnn omsllondirs ro cen rr.r À\ nr|ors: . ifrhedorco!Ès riÒn 4 . irrbeùor.offrnùm
r, j j.rhecoìrìiùis(
); /r);
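The table filling of Section 1.4 and the backward tracing described here can be sketched together in Python. This is only an illustration under stated assumptions, not the book's own code: the function name `global_align`, the match and mismatch scores, and the linear gap penalty `g` (treated as a positive cost per blank) are hypothetical choices made for the example.

```python
def global_align(q, d, match=1, mismatch=0, g=1):
    """Global alignment with a linear gap penalty (illustrative sketch)."""
    m, n = len(q), len(d)
    # H[i][j] holds the best score for aligning q[:i] with d[:j].
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = -i * g          # q[:i] aligned to blanks only
    for j in range(1, n + 1):
        H[0][j] = -j * g          # d[:j] aligned to blanks only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            r = match if q[i - 1] == d[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + r,   # q_i aligned with d_j
                          H[i - 1][j] - g,       # q_i aligned with a blank
                          H[i][j - 1] - g)       # d_j aligned with a blank

    # Trace one optimal path back from H[m][n] to H[0][0].
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            r = match if q[i - 1] == d[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + r:
                top.append(q[i - 1])
                bottom.append(d[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and H[i][j] == H[i - 1][j] - g:
            top.append(q[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(d[j - 1])
            j -= 1
    return H[m][n], "".join(reversed(top)), "".join(reversed(bottom))
```

When several paths give the same score, this sketch reports only one of them (it prefers the diagonal step); collecting all optimal alignments would require branching at every tie.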
Itr;i fliÌ.;'":fli#Í;"",*i.,1fl"'il,TJ:;:^l**,* , r".d", j#ifi d " ".","".,4 ":;;;.;
: ;::-,:":Ì::l-;,î,,
ri,"nT.':"y*i,""ffi : $"s.'f*, i;1,s.r"ffi frlH.Hi *'*ffi;.:"í::'*'*'**oreposniorbÀdin4ùd,ùein4hùe
sq.'enì diemÈÍs otr ei\r úìdhich.s 'hrrrisnmù's!tujnu iNd
ldùrd rhciiilioL
6 /,r.*r .(r.,
r. r)
:Rzt-t):|t =t+):j
t\t =ai|ts|r =' '.bltkt@k(i t.i,t,k+t)t\d ifHl.j=HljL'3th.n R \= 1.[.k+1)tt] t B , \ = , 1 j : b a ú m . t : l ii R t = 4t: R2.,- tt , b4air' t(i
L. r.r + l) rnd
(f) tu 6rcdii h4ni ft
t" "' ". r f:'" I
s.ccncÉlu mof btúLs njgbr loùN chh neolnefo]lwiDgbLfksisoÌledlgap'
:n erùple or d rismeîr wirhhofc snpsis
1.5 Scoring Matrices
The scoring
ùstdìl s{rion L4 i. bo simptero be Nd Nbenilignins na p uKu oi
tu asidùs arufjns in rhc seqùeiB. For No esiducs {4r. ./r. w leed d liaqe or ùo pÌbbrbìriry (or LlÈrihood) n\d rtry ha€ , omnm ú;,br ú rharùc is a resùhoi orè Òlseveralmùbrirfs otúc orhd.11ìèpojtìoo ohh. rcsiduB n iflored. ry of rìì. cúir-! aminodds is trrd. This
scoRriccArs:caPPENALTIIS (ù m*diahlc c.nn). rhcsoring mùj
s ù th! nmcj ùd re erptaii.d ir chaÈr 5. digùrs4ù|rminoejJjÉDlalìÙi
1.6 Scoring Gaps: Gap Penalties
ÙgxPsú'bea|iginù'gaqto'nykd
! :
wd ù\sùú! rhd rhedeleriontù insnioi, ir úc.rolrúiotr lìr sÒD heodrerwr, of
.îTî-.î --1iiÌÌ. "-îi-"ÌÌÌ "Í".T?Ìiî. -iÌ.îÌ.iiT-î
'ÌiÌiiîl-.Ìîi '-.ìT.îîi."ÌÍÌ j-Ìi*iÌ.
Tîî:îiÌ.i"iT.Ì --Ìì-ifî-.ÌÌî "ÌT-îÌi-"ÌÌÌ ÍdîTioîoaìN1-
coRrNccat s c{ PEN^frEs .lDgclg.pNÙ!$qaìreiehboÙilg+Ùt FobNrtu 6f gl'PpÒ:lriess'n8rgErudft
(r r) arcsad rohcq,r{rqrq,
lÌned gap Èùr.y n,ndior (sr : sD, !ììi.h wd tìs! uscd pÉuus,f ,\ aioheiúLlyGidnihNin3or rnddcD.cncidireagrpsboddbep4iurlu fnulr for dÈ slP Fnaìq, dìoùìd b. !n dr,r. so.o..x!O lhe frrmridinr úc hnelrppei,lry s!|ìpeniìryruMiú(lhilhis .oron
r.1 r
{hth lonhiis ferù (rìJr) saPs chrqùg 6e srp p.Ntq ro | + I
d,aodlypkJ|ylirc]:10&{..J\yevi]i I dignhcns 1{c ch?pÈ2.3)
r nr^t
i4 ùi
d ù! !ú.nx
4 &id
HNeE, otutrmof thes{ùeoesÀ rsub*qrr..onhèorh4 andaL2 so,rldh* beùe ored onein 'nÀtcrè Gndfoutd beroundir eùdc!È *m nÒ'psaìized).
1.7 Dynamic Programming for General Gap Penalty
mepeo.]ly,ìonglheg'pis' Forfinding'hev ueG.of) in n ,j vhenseFnì gip p.iÀì'icsm Nd, w. nur . rrcrof
ú. subitignmormdswiù îte p'i 4r,d/, n hr g r pi i q o ft . n g 0r ì. t < r < r , rhI grPi! d ol ìeDgrb J. l<J
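Figure 1.4(a) shows which elements must be used to calculate H_{i,j}. The recurrence formula for this is
H_{i,j} = \max \{ H_{i-1,j-1} + R_{q_i d_j},\ \max_{1 \le l \le i} (H_{i-l,j} - g_l),\ \max_{1 \le l \le j} (H_{i,j-l} - g_l) \}
The time complexity of this recursion can be found by noting that the number of cells examined for finding H_{i,j} is 1 + i + j, hence the total number of cells examined is approximately
\sum_{i=1}^{m} \sum_{j=1}^{n} (1 + i + j) \approx mn + \frac{m^2 n}{2} + \frac{m n^2}{2} = O(m^2 n + m n^2).
Figure 1.4(b) shows some of the values in the dynamic programming (DP) table for an example of using affine gap penalty. 'Îiesolin!shef.isd.|ìn.din'iè6c$'s we s ùd ùè woft fd hrdingùc bcr îri8lmn whú r súfar slp pedry n I liDar sapp€ntq,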
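The cost of the general gap recursion is easiest to see in code. The sketch below is a hypothetical illustration (the name `align_general_gap` and the callables `score` and `gap` are assumptions, and gap penalties are taken as positive costs): each cell looks at its diagonal neighbour, at all i cells above it and at all j cells to its left, which is where the count 1 + i + j comes from.

```python
def align_general_gap(q, d, score, gap):
    """Best global score when a gap of length l costs gap(l) (any function)."""
    m, n = len(q), len(d)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = -gap(i)
    for j in range(1, n + 1):
        H[0][j] = -gap(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # One diagonal cell: q_i aligned with d_j.
            best = H[i - 1][j - 1] + score(q[i - 1], d[j - 1])
            # i cells in the same column: a gap of length l ending at q_i.
            for l in range(1, i + 1):
                best = max(best, H[i - l][j] - gap(l))
            # j cells in the same row: a gap of length l ending at d_j.
            for l in range(1, j + 1):
                best = max(best, H[i][j - l] - gap(l))
            H[i][j] = best
    return H[m][n]
```

With `gap = lambda l: l * g` this reduces to the linear case, so the extra inner loops are only worth their cubic cost when the penalty is not linear or affine in the gap length.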
nr dlnprc. noddrhrccnrriùs (ofre'lgrh l.2.l) rìJrI prndryd 4+ 7 + 7 = tB
sso rimcomprqiy): ùi\ qirb.'E
irgroúc0c grPpenatB | (i runniq rime(úh 'be ere $îs'
1.8 Dynamic Programming for Affine Gap Penalty
rhclir$
elp periLlty(sectioi LJ.1).
tu e dson'hnh im@g+ pdry
ir b boinst!, w. nur 6rd ir ir is rhesh or r e.p Gleo + sqqr, Ò!î dtrsìoi Gdd). Fordèdiùine ltij we lókd ,er rr ',4.j I úd H,-'.i llre romuh roruùs d 'hcd1@icisrboùrinc.clls H:.; = Htt.r_l+ Rt,tt. ad ùis tu ein bèùscd.sincèir iovorvam gapGÉ sErion r.4, ror rl?). Fù cdùra,rns ,.r)'i (,hc!ìcmriye yia Hr r.r. {c úur brè ji,o móuÍ how rhealiennm'foi (q r r,lr ,ceetrd.rùtrcsshawrob€oirileE{rG@ (r) t t,r Lj be'r.sorcaîi - I,j vhm comins rrcmt -2, j rher di;"=Er,.r_8qd O)rd4
'.j bèrhesorcati - 1../vhetronìng nomi
s {oùld beone,{irhoú, blÙ]r Bú 4Ì'='j-'ìr c) l:rcr
l. j whed.oniós fton i
"ljr"=c-,., si; - nurr, i.r - fd@d.,.r
NbbciEdú,qr rqudi{s Hú!ù.'h!rreorirùìisfillùr ùdù o(,r,).
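The affine case can be handled in quadratic time by remembering, for each cell, the best score of an alignment that ends inside a gap. The following sketch is one common way to do this (three tables in the style of Gotoh's method). It is not taken from the text: the function name, the table names `E` and `F`, and the convention that a gap of length l costs `g_open + (l - 1) * g_ext` are assumptions made for the illustration.

```python
NEG = float("-inf")

def align_affine(q, d, score, g_open, g_ext):
    """Best global score with affine gaps: cost g_open + (l - 1) * g_ext."""
    m, n = len(q), len(d)
    H = [[NEG] * (n + 1) for _ in range(m + 1)]  # best score for q[:i], d[:j]
    E = [[NEG] * (n + 1) for _ in range(m + 1)]  # ... ending with q_i against a blank
    F = [[NEG] * (n + 1) for _ in range(m + 1)]  # ... ending with d_j against a blank
    H[0][0] = 0
    for i in range(1, m + 1):
        E[i][0] = -(g_open + (i - 1) * g_ext)
        H[i][0] = E[i][0]
    for j in range(1, n + 1):
        F[0][j] = -(g_open + (j - 1) * g_ext)
        H[0][j] = F[0][j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Either open a new gap after the best alignment, or extend one.
            E[i][j] = max(H[i - 1][j] - g_open, E[i - 1][j] - g_ext)
            F[i][j] = max(H[i][j - 1] - g_open, F[i][j - 1] - g_ext)
            H[i][j] = max(H[i - 1][j - 1] + score(q[i - 1], d[j - 1]),
                          E[i][j], F[i][j])
    return H[m][n]
```

Each cell now needs only a constant number of neighbours, so the running time is O(mn) instead of the cubic time of the general recursion.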
1.9 Alignment Score and Sequence Distance
! r ! , = 0 r b r .= r r - 1 n r 4 + r . 3 n d 3= r ì
onÒ. did,r\hi!o\ hd\Dú o6j&rsco
1. 4, <1,:+lrrdúy.
mtrdrdion (ùe hsromaúon ùth rbenininu : prù of*ings i, io! rec$dììy uiiqù.. FdÓfpÙisdofbioloeiúlsquÙc
Nmbq of opúarioÀ)be'wen
lar *!uqes.
HNcvcr lho ìrc nurbr of
disimcnr Gorcspoidi'ìero 'bc hhrory)is, howev{.
For comparing distances between different pairs of sequences, it is common to divide the observed distance by the length of the longer sequence, resulting in a (relative) distance. Several models exist for correcting for multiple mutations at the same position; they replace the observed distance by a corrected value using a logarithmic function. Let D be the observed (relative) distance; then a common model for finding the corrected (relative) number of mutations (d) is
d = -a \ln(1 - f(D)),
where a is a constant and f(D) is a function of D. A function often used for proteins (when columns with blanks are ignored) is f(D) = D + \frac{1}{5} D^2 (Kimura 1983). (Note, however, that this cannot be used for large D (D greater than 0.85), since then f(D) becomes greater than 1.) usìlg 'h6eurusbra
.oiú,iÀ iavr dtriLní ùrjno rcidt,
ùìd /(r)
rL,if Lhsobfrcd v,lùr i5 o 3 Gishrot ren
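The distance correction can be tried numerically. The sketch below assumes the model d = -a ln(1 - f(D)) with f(D) = D + D^2/5 from this section; the function name and the default a = 1 are assumptions made for the example.

```python
import math

def corrected_distance(D, a=1.0):
    """Corrected number of mutations per position for an observed distance D."""
    f = D + D * D / 5.0
    if f >= 1.0:
        # Happens for D above roughly 0.85: the logarithm is undefined.
        raise ValueError("observed distance too large for this correction")
    return -a * math.log(1.0 - f)
```

For small D the correction is negligible, while for an observed distance of 0.3 the corrected value is about 0.38, reflecting positions that are assumed to have mutated more than once.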
1.10 Exercises
1.ll B
À -
(c) conpùc1tuiignmetrhroutrduMùG)úd(b).. rjndfúcrchofrhqu
\rts\!rsrcLoú\r {rr.NMrNroF s[0ur]Nc[s (r) h erìal .6s
À rris Gasombìe (
(b) îÉ gstml prcedurc ror dynîni inìh\i.eyaybùk9arcoflhis
(!) clxùg.Aìgoúhn
I 1 ro hke ido ac 1d) vi h!!c úc *qù.n!d\ ! = 4r u, rynìbols. t roi ùieqùal. ùìd a (lìner) elp pemlq or . Fiid rhc bsr
(r r) whil dtus ú. diskn foùrd neùl G) colLddhùherdismos($rù
l ll
\ =rÈnú rndwu6!i ( rero)arir. d,n rÌ$ùdh aodaqs (ree2)úd isbm ror 'nùl'ipleinuÌitiors (rorDNA q :- :iixrù!pftsftdinKinìum(1e80. le33).Li(lee3,leeT),LLorcu(ree6)
Pai Dal
Pairwise Local Alignment and Database Search
.ù. dishlÙ reraÈd pmbi$. i\e momurarion of mddios mi$r ÈaE Gad uneEnry rD$ ùe sqrner ìèdiis 10! sirilú scsrs Ènaìiiig in o'nÒflis Ìúghlir$hmdisàìmoEdiledyÌhmugh
idenrily a signilìcd mrch N Ìhe idúd ol 0eft (sùbseqùere, in dr rvo squenes. hiorogia! ùEbÀ*, 0d r[crc s r 1ÒrÒr rì DNA sqùen(s (EMBL (Eumpo, DDBj Oapn), acnBnk (usA), whùh aìì drxb*às Òlptu'rh *quofts (È.s.PIRúd s{issÀoo :ùrrin tbe smc $!u@t, rd dúrbri.s or prcÈin rruc'ùrs (PDB). o'her dfubiss, ror i6Ìme. rnnsFrc Tnmcnùion Fador Ddab*O, ùè m \tr enry in one ddab$e my conbitr intormdion Fratd b an eíry in ,nÒrhcr ibbir FÒrcx!'iplc, r EMDL @NA) ciL-y rm conaii rhercsion codingror one € drh $Èilirs
speidàr dftrtu. Fd dm
INC TWOSEOUINCES {ruduE'*istúefolprctiNùÓdcd
, q'fl rc\ lr i' id.Lb.rDih.h. jjity6notpHÙliÙnhonolÓgoul{qucnq.
ds lor $. squcrccs, fd ù.i o dlicicnr
2.1 The Basic Operation: Comparing Two Sequences
îc hrk.ú b! Ioúuhrcd asrorrowsciver Ì q!.ryrqlr'd rquocs krt. fiod rho$ l Nhi.h \hfc
4 aM r dd{b*
D ot
ftù úrk F ro hrd rhcsgmens Gùbk!rcnet or Ìh{e rwoqm.$ {irh bìsh4. ds8Eeof sÌmilairy. úd ro .rr.ùL{c d.lo.a/,/4',,f'Noblicl.singo''hc or \úiis. (vhùc suhsqEnft hs úolhq nqoitrs). nÈ remonror 'hi\, $bequcn* aÍd {birirg idùch.igÈrhL} rd spnfy if {. Nc ùb$quúc
a ssaqf
f 3 sùbscqwrceGubrù!)
or r of l'. a {gnon' do* no' conùm
en'rsn' c{h Òf4 6d d (h,y rccdno' a i,.rr ariqnD4, * ù dipned or a seenÉÍ $n
2.2 |
sh*p$''h'he.prcbabìein$ìinlikePegÌide r úoo ùÒ rdn 3y is (rÉ!.s d di. ciÀ iidi.!tr nÒÉr!rL6 ù hd s,pn ..5s
, !o
idu$ ù 'hc sgnen$ ) rhc kÈr arisnmn'
2.2 Dot Matrices
=ronmuy'dlefrnlt.hiiqmmedfol -rn€s (aì5o arrd dororos)Àor'! r03ònidÒ(!Òdi.rL),odtlìù*ùr./llorgtbÒotlrrsi'1.(bod,ùúl) Tbcvllucs dorìn'heeI (i. rmexnsrharl, = 'r(j) -;ufer
ì (r) \hors ! dd F!trjÌ, lhúc
I rÒù{r {b{rings (doft'l liî., . fn csy r, undc^knd. ivins r Lisrrld@ì
{oniìs...t g in F gure?.lG)ì
[email protected]
(a)4 dd mù,
lhowùacom 6x.q'hvùdowÈoeù5.Údfuùdd60%'
sùlheind?ah'idìl'oràmprq3ìl Fjsurc 2.r (rl aurd be qEnded h frve
ps. c aid d could, ror ùmplc, bè ombii.d
Figtrc 2.2 ir úc dot úiùix for rhoh .on{ (itrdicatdby rhesìÍshs)
2.2.1 Filtering
9. l !
1' ;
l'o/r}+r'e'c''lrEicon'iues'j'ìì'hemxt i::nionorq(rcw'Noìn'hedoÌmddx):4, 1+r to dì r. e'c.
rÒ md.h. Usù,lly, xn crd mrth is iÒr iùqtriÈd:
Fieun2l(b)showsrhedo"ndix*h.iawindowÒrsi?xfi vcisu*4rndrìcar c rh{ dr noisedisill{n, bú îhe $siriviq (rherbìliry rodiscoE **l simiìririe, ìs v.l}rDdl.
, t t ) j + ' t > c % r b e n D ij = ' r ù d
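The window filtering of Section 2.2.1 can be illustrated with a short sketch. The details here are assumptions rather than the text's exact definition: the window is taken to start at position (i, j) and run down the diagonal, and a dot is set when the fraction of identical pairs in the window reaches `min_fraction`.

```python
def dot_matrix(q, d, window=1, min_fraction=1.0):
    """Return the set of (i, j) positions (0-based) that receive a dot."""
    dots = set()
    for i in range(len(q) - window + 1):
        for j in range(len(d) - window + 1):
            # Count identical pairs q[i+k], d[j+k] along the diagonal.
            matches = sum(q[i + k] == d[j + k] for k in range(window))
            if matches / window >= min_fraction:
                dots.add((i, j))
    return dots
```

With `window=1` every identical pair gives a dot, which is the unfiltered matrix; a larger window with a threshold such as 60% removes isolated dots at the price of missing short or weak similarities.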
2.2.2 Repeating segments
dr{ r+r ftùè eirl bddos onù. .ubdirgonirriù (i. j) '0 (i + i, j + À)(r.c
2.3 Dynamic Programming
ììebc{ srob{r!|anmsr. Dyi'ri!
F+' ùi .
b$' 0 ghrd \conrs) lo(L rtis'ftnr
ilr /6. TrÈ ber slisDmenr. ftom ùreb.gintriis, rndiie ú rbrhe* ftsjdùàsthc bsi 6)n nMd itr Fisùr 2!k). Nrh i sor of 0 4:
sis d1''
TneexamprelhoNrúútprc6x6 Gonins 66tìn 'he alisinot)
mùlr(4ri /J ,) r!t
<, r
2.3,3 liidiig gùb,r drcuúÀ, w cm Ero iÒv |r,.j b€ rhehisles sm or my h iyrbols, or 0 ìr rhh sóf is < 0l
à , . j= m x l 0 . T i f t r ( frt..4 . . .:,r < f < i , r < r < n l .
'] I Ò!nmN bc úkuLxÈd for an r, j, dd r'ÒrI Lùo, eapp€nal'y(3r = rr) 'hc H,)-n!ta.H,
| 1+Rr)t,
The sÒr Òr ths b€r ìoù1ìiirgnmùús qcrend*j'heaps.sìi.ecrpleìrcDc4tit coirnbdions. l:fts bef iLgmd' ro úìc hishc{ hrtr. ù H )
= 0. 0 < , < ù, 0 < <' t0 r = /1r.0 or4.l is showtrii Figor 2.1(h)îc
(2.r) i@
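The local recursion differs from the global one only in the extra alternative 0 and in where the traceback starts and stops. A minimal sketch, assuming a linear gap penalty `g` (positive cost per blank) and hypothetical match and mismatch scores:

```python
def local_align(q, d, match=1, mismatch=-1, g=1):
    """Best local alignment: returns (score, segment of q, segment of d)."""
    m, n = len(q), len(d)
    H = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            r = match if q[i - 1] == d[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + r,
                          H[i - 1][j] - g,
                          H[i][j - 1] - g)
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j

    # Trace back from the highest cell until a cell with value 0 is reached.
    top, bottom = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and H[i][j] > 0:
        r = match if q[i - 1] == d[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + r:
            top.append(q[i - 1])
            bottom.append(d[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - g:
            top.append(q[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(d[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

Note that the mismatch score must be negative on average for this to be useful; with non-negative scores the best local alignment would simply grow into a global one.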
Finding the best local alignments
0 is rrched. roî 1rr cxlipr. oic .siry jì!d
L i\ nov shthtfoNrd
h .hdge úiÒx
irnilrrlridn.ùndiidùd.0irù.or1n scÒÉ(of on.r
one atismen. n enoùsb).
*r-fdiig toar disiFot (o' au 'he aìi!nbur ùr h rft i" flL 5!qù.ru, Húwcù,
reagìon'mdúemysm(lol*ample.b in onmon îbn righr b. do!. by tu
Scoring matrices and gap penalties
lhespecÈdsolitrgolî]ìgni'eNoúbùaq
lúìino rcidt be a, md úe soE beùeen lrldismotvill
bc(cD cn).R.dù.idsthe
Rcdd'g ùe p.ùdq lùrhef (whilek.rpiie & > 0) will fsùlr in (cDÉc cD-c) eÌhalinlhissinpìe*mpleÌheúmdq qEhonciscqlalro
Gho showirsasair ùat ùe !!p pciarq .hotrd deDrndon Nhi.h (onn! mùir is ùsd). ftogrun nù d.iry ,ienúcrl1 usrlry dlow úÒu$r b spkìry which$riig
2.4 Database Search: BLAST
sb]*dmdyimicplog8trìjng.mdal$
gFop or prcsÉmsqùEd FAsrr (LiDmaraodreuon r$5r Pemon 1ee0).ind BLAS',fr(B$ic Lmrarigtr'icu sclrli rÒùr)(aL*hùrd r. 1990.r9!7) îc pmgmmBLASTF.ùdùì*le.sionofilpiìl bcdqtrnb.dìi Chapt!6 rrsom.olùc1(h eì ddlilsBLASTTNùkr slieì,rl! ]rìe[linpinciplefolBLASTnbfi
. ^ /i4$n4, vsnarzo,
sùrcobhii.d|olìosìÙerppddienmúÙlqsd, . a h4h ,cùig,4ù!ùt
Paìr tHsP)i
mrximaìscsmcn( Ilair(LMSP).
MsPar'rons ovd a rhcshoìdv (ve 6sùN r mndr.d
soq *c chaptr r.r 2)
ùtit Ì,tit |ih d stÒtèÒJ'tut k! ttttÌ v rbodìr b rìndúse Nordpans(or rtrgú u)
rlhes$niviiyollbepmgmìwiìì.1 'o b. dn.ded) Dair milhr bùdv'roÒkcdGauft3 hÒnohso6 sLquLnccs
In rhc hr {sion of BLAst ùc 0r@dtr€ abor h4 6.ùi .hrigcJ ro rc'luirc tlohitsbÈlftdÈrdjle'fuqÙjrn'€l {s'mhierrssnng{odptùsndfafiÍon
b(DR, sE, ú).fteMolììb(DR, tr) DFsÉd), posibìy cx'etrdincrorn HsP.rte hit or sE rerd ol onshf. thorhÈsholdprlncrcf 7
rd rhc alphdr be (] vidr cùdtuLli9 r (i = 20 ror prcEin $quenes, nDr rù DNAs'rrcnccr), ù'd *old leuu , TF *qùerÈs/islypica||yequa|lo3'm N' /,Éptusr?4 wh!rc ft nnponio'do. eord$,r(Òrdlpo*ibl.$ordsorì.rgrhù, Ì e.r" md, whichrlnly r(,. oa)> r rÒrr le$r oie *od.q € q (r is 'hesoE) irc round,andarurcpÉscíedii 'hodd! Endr ol rh" .-,1' 'r (|nÒr fo.rd ir rhepiepressii€). heEe frrdir! hùs.Trc 1. scarchin d ror0..ùronss (4) or rhcwo.ds(,?) thd ùc itr rhedrrasrcre 2. E{erd (boùrid.rlly) ro r hish so ig scemcnrpri4 rbosepin or (îon. orúl,ppitr!) sùceedilghirsúich srily rhedhùne onirìinr lmitè HsP9
2.4.2 Preprocess the query: make the word list
L.arl,hibe'e = 1À,c.Dl 1vi.hqùd
ThefiN wordi! 4 (ùl d{ÀLy mÌche! s (s.ofc2) but 1.5.( rÀ,4) $orosI úr (a,D) $oa0 t we rrei s.
. rhewfd sditre arnsiduc3 ó{.h$ ac, Dcl
r'ùi plorir *qucrc$ (r = 20) md , =
$r ruaEhing rcr'is ir 'rcqúcry.ryei.rry.
of 6c (lùb)rùd of ú,t frt
atrdrhe bú (ede-) ls oprÒiùr !ì a MmE mrhiE \ih
on {r*)
Th. fr e{d.
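The preprocessing step can be sketched as follows: for every word of length w in the query, list all words over the alphabet whose score against it reaches the threshold T, and remember the query position. The names, the three-letter alphabet and the scoring function in the usage note are hypothetical; a real protein search would use the 20 amino acids and a scoring matrix.

```python
from itertools import product

def neighbourhood_words(query, w, T, alphabet, score):
    """Map each word scoring at least T against a query word to its positions."""
    table = {}
    for pos in range(len(query) - w + 1):
        query_word = query[pos:pos + w]
        # Enumerate all |alphabet|**w candidate words of length w.
        for candidate in product(alphabet, repeat=w):
            s = sum(score(a, b) for a, b in zip(query_word, candidate))
            if s >= T:
                table.setdefault("".join(candidate), []).append(pos)
    return table
```

For example, with `score = lambda a, b: 2 if a == b else -1`, the call `neighbourhood_words("AC", 2, 1, "ACD", score)` keeps the exact word AC and the four words that differ from it in one position. Scanning the database then amounts to looking up each database word in this table.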
2.4.3 Scanning the database sequences
c! sroù5rJ (rìndngnohirt. a|d r
h cid rnd è Iblloqrh! rdgd hidÈd D 1I rnd 2) oi ù. ldPe shoug thaÌ rhuD !t
r*e .D (inwimJy) hble (dot nrtix) rs lurmÈdbyFìeuE26,U=,mdm*rl r.r i bèrhejJìd*of4, sd j orr. rùù rr ositionof 'hc!ùrtdhir(hc u{roùid)is .:..Fdin i/(, (ai amy). as l js $inned. dìehrs ondìediaeoidsm roùid (byùs ::ùerúr.djagEn ofrheúbb).nìe fis' hi' n s ondi.somj-2 U - i = 3 - 5), :r 51úù indù ofq) !s sd.d ìn ùù ú!y i/. suchrhd i'Ì( 2) = 5. Trrcicf hn
'' :.;;;,:l'; '-;.
r,1^" 19r:t-l;.'..t""""..i
í.1 ,;'::r.",",,.'"'-:,'',.,'ri
(c,a),nrch. sinlt rhe soN has dromcd Nìrh \qÉ r j ro fouid (À4, DAJaîd hDlùv rhe rh8hord ( 1). r$o $glnent ùri6 (rcaLlv)roml Do\imús !^!l Nde. hslcrù rMl iì$! ùe nú 14ccD,DÙa) liiehù soro (2 5) iinhùerrunrior({ìth cÀ ca) soúd ralr ii i
2.4.5 Introducing gaps
',. 'h '' :;"".;ii;.,; ";,, "" ""
rr\5ed (io r pofpff*siic rcp) kr !h.rk ifn\ey ún be conbilÈd ro r prir (dr iìign'ùen') ùh gxps Howeki thir E
losxlctimc.tbet'oINitrgi5inPìemenèd. t ú. HSPÌrù{hi!Ò!bishrsmah! sion À p3{dtrd fù o,Jy úe pd Í dahhÀE squenes. Nbich cotr$ponds I r rypicd.Ì{isn qlq sLqucnca (s$ dùci! s!p: ro in mry,rd iisnner
(trHt) arrgn
(ù rhe DP bbìe). L€Ì 'he d
ihrcshordbdov rhebsr sor (or rherudr
p!n) aE @tridù€d.
,,,tú rmù 'h. IrsP(otrcrcsirucrro'i
.{b ùrrlr 5.-s'irùsii (4.4 fti\ pai
u rn. bcr slpp.n disrhcd tnuîd (unrirnoql hrf soÈ y rd 7 bedìe
ùÈ3hol4nd, b.'he DPratix'. in or lr,j 1,l/r L/ r,t, ,.r bve roiesleslhms,-Î'lhenHj'jwj|ìn
arcorithn2.1,BL{ST rppn'.h.
I qùùyroadahMs inaLAsfforcomPanrs
fo. ra1lrdlisdal Àdoc$rn(hÈo o fdrl:= r.o, u+rdo i := îìc Niitiar .fthè nd,hiB
wúd h q
ir sorela,-ú| ,+, r,rr r+ú r >rhrldìen daaú. vcttenîrin Qt-tt-t) t+ú t,ttt r+! tbarBSP ME thè Es? lb. F'ìhb dyúnì. tm s nninN
ll soft(HsP) > rhf2 tù€D peùLm lyan1i. rmsnnùùt
a,.tùt HsP
2.5 Exercises
Tryùisonlldsquen$d=DAEAD
l,l (n. i, bc 'he fln' rnd li$ posidon i' (b), ùd (r, jr) bc rhc úme fo, d. s
EXIRCrSÈS andr ror n\e sappàìiry (ììncù).rw Íqr@s
*c givcn:BRRîRî and
(a) Find'hc highcr ndn! lodt !ìilnmsnh (yotr$oùtd findroùt. (b) Youhxlr rh$ly d foundrheaùgmÈú
cfler?rize Equîrion ( r.2) ro bÈyiìd iir rìldine a l&aì:ù4mn
aIA (noft€ 'haÌ only 'lìe imino xcds
A . I .L . s c r l
u s ú = 2 ,? = 5
(r) Mr]re3hbleteiúali|osibrevords(=4r= 16Ìord!) (b) Exr ncq, úd roroeh woln' ir 4 (o lxdinc l, rtul ror.ùb wd. iìndm (d) Youwil iow iii,r 'hd 'hcr ir x *o lh*conhinssLf@m,atrdhNs$rcsi hichHsP(virìì $ùe) woùìdyotr rbùcùr.oflYlt!ù (Lro)! Bci\arc (a)Nov debilk,m.orùc sùbmùdri
{hl r,ìsc
irc rbrtu bib d rhcsùù di
par (Ìoe àroded roHsr)
2.6 Bibliographic notes
Dot matrices are also found in Argos (1987) and Vingron and Argos (1991).
MP/www.g.h,sdq'!rc'aJúùhi lurion!d DfotnsedGeesÈohenI qql).
, .'dMità rd3ò
r00t. FAsTxn dscnbedir pàMi ( 1990),,nd cippeJ BLASTjn Atrs.hùrerj
Statistical Analysis
compdns r qucD\cqu0ncc la) $rh É.h 5ùrudoct(d) in ! dú!b*. sNc! fúc ro k{d ,iFEcdl naiy G/,.r),e$h
(homorogous)tsqrqo, Í mui hc fÒh
3.1 Hypothesis Testing for Sequence Homology
drsist$i.eHypúhsjsqùgatrúidly
r giEn rhEshdd{qc o 0l (2s]), rher is cNon ror rcj€'iis H0 Gùhc 1* rcvrDùd r.cpriig ,r :n!ndù (4,/) sigfirì.aú.i.e.irrher -,v
rh. hìlher $or (hì-!h6' nsni6ene).
ùd rheiÈ!sld PUfsrìlr (4./) vrh
gùi.ollJ rquur!$ (\ccscdu :. t. )
ùe rjcdu
ÈEr for ,!
Fshm dÉ A|e im.ú.hosedù (:) fioddE leftnr
piir f,!ù (.r JJ \nf
n! rherî lr o' higlú, !iru, to G$ ,hc p'obrbiìiqdisribúur nuoduboo. rilr onrìnrc{iÌhrheq{úo. ted rof
3.1.1 Random generation of sequences
e rNro dd r^ sù
ù'Lphlbdútoùslmbots IÀ,.
oosiioi (or úe nndor
!, Et ù'l rrroirqus!$
l/, =0rì../i=0r /r=0r. r,=04t.Drnìbedm+rwnhr!tubrrrrnl'\i!!
ndù .he cure Fom ó b @ is rtu piÒb$illry
rcÌdsin rhemtualsqEie5 (4 d)i\ u*d or onc ({tr bÒrh)or úè slùedÈs h dor
Ior úe pnbabili$ dìstib'rior.
'' . , "*.' 't1,.'
: tti
r.:'i I ''
, ,i;....,
! I dL beiî ù.s Ddirios (bÙrùshut['d ofdr]
3.1.2 Use of Z values for estimating the statistical significance
3.2 Statistical Distributions
3.2.1 Poisson probability distribution
Poksoldistibutioniclbenoniqrnlir|4d',nìe (úe prcbibiliry Ìhat 'he rochÀ,io ujrble
=.r=i" " Prx Plx>.r=, i1" ;,,
{ vill hre
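The Poisson probabilities can be computed directly from the standard formula P(X = x) = e^{-mean} mean^x / x!. The sketch below uses that textbook definition; the function names and the parameter name `mean` are chosen for the example and are not the text's notation.

```python
import math

def poisson_pmf(x, mean):
    """P(X = x) for a Poisson distributed variable with the given mean."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

def poisson_tail(x, mean):
    """P(X >= x), obtained as one minus the sum of the smaller outcomes."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(x))
```

In particular `poisson_tail(1, mean)` equals 1 - e^{-mean}, the probability of observing at least one event, which is the form that reappears in the P value calculations later in the chapter.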
3.2.2 Extreme value distributions
LertrL.
irs . r, tbe iodeperLrtù,
nrùcrioiroi ri i\ ùen (:incedì! r uo indlrlfrenr of eiú orhù) ,l= Pt\i
qhn'io! ofr fr of (oven:ppiigrslneds ! (QDdon)qFcd
ofd. iid r1 rhoI
ì (d.ishyldn'nbúFnof f is
1bc 6mof tr(r) d.p:.d!.ì r urd(
ohen.., rr) htrrk,NlhcoccLhLre s
, 41
îùc t5 ! turfbi be's*ùù'.d; a'd,.i,tr (0.5rr di\aihùrior is Enìe1s coarinr)
3.3 Theoretical Analysis of Statistical Significance
Karlin and Altschul (1990) have developed the
tEq@.ùr lp,l.liJ
(orter, = t).
E= L
I ,""r"a"= l be)otrdrhr $ÒPoo. rhisbúk).
{ed rmn l,rd, ùd I oo{ rhjsis doir
t lrr ) nd {tr I rrc sùfi.ieiLry imikr
ed { ùe 8Nd?'1
o! ylnIl
t (.:,i3 urdbùiùscof 1h.lPrxjmde
By sritr! Ì = 1 iDEluim c.o *. 3 Lsttf ptubtt itúr elIùùùg 4t kai otu r , { s ! = P ( r M> s ) = P ( z r ,> r ) È r - e ! = r = I sp(_E(s) Noèrhdexp(r)is.,rùivrùrk, !,.
qp( r4,e /51 l1j)
. By dpmdiis Equdjoi (1.7) inb r p
P ( s r =L e \ p (E ( r ! r = - l r - = - + : + - +
wchlÉtgoseqkrùsrúd/oflère'l r. wc hndrìc b4 ìGaì (unslpFd) ar
The P value has an extreme value distribution
rr ùe bìgher$gmentprir KoE foundbycompÍie! olúo squctr€s bcs . Fón qùa'ior (3 ?),!c 3.r 'heprcbabiìiiyfq ?(J') = P(s$ > J')r r
è "= r
dp( ,(aze !3)
P(S_M \ge s) \approx 1 - \exp(-Kmn\, e^{-\lambda s}).
By setting u = (\ln Kmn)/\lambda, this can be written as
P(S_M \ge s) \approx 1 - \exp(-e^{-\lambda (s - u)}),
which is similar to Equation (3.4); hence we have the two parameters (\lambda and u)
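The extreme value form of the P value is easy to evaluate numerically. The sketch below assumes the Karlin-Altschul form P(S >= s) ≈ 1 - exp(-K m n e^{-λ s}); the function names are hypothetical, and the values of K and λ in the usage note are made-up illustrations, since real values depend on the scoring matrix and the residue frequencies.

```python
import math

def expected_hits(s, m, n, K, lam):
    """The exponent term K*m*n*exp(-lam*s) for sequences of lengths m and n."""
    return K * m * n * math.exp(-lam * s)

def p_value(s, m, n, K, lam):
    """P(S >= s) ~ 1 - exp(-K*m*n*exp(-lam*s))."""
    # expm1 keeps precision when the exponent term is very small.
    return -math.expm1(-expected_hits(s, m, n, K, lam))
```

With, say, `K = 0.1` and `lam = 0.3` for two sequences of length 100, a score of 50 gives a P value of about 3e-4. For such small values the P value and the exponent term are nearly identical, and they only separate when the exponent term approaches 1.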
3.3.2 Theoretical analysis for database search
The analysis for ungapped local alignments is explained in Altschul et al. (1997), and the following description is based on this article. For a score s', the E value (the expected number of sequences with scores of at least s') is given in Equation (3.5)
3.4 \hùe io honoìosousr,ìù.ncs di:Ù. NorerlDr dr , \.rue eG^ b Ìhe luruq sgmcih, bu ibrvory smrrr t {lùs úì
s n\Ìssor *hci . squeiccs (indep€ndùi !ìd or shr rcoBù14rc conìpmJ wi'r rhuqmfi n , mhjptied by ùn p eìue. asùr'ingin 'rìc sùc Equ{iù (r.e) Gin N = 7_,). 'hiscqullirysÈnslohddlollulsrs Ùb 0.0r mùn úc t vi'Luebesiosroinqcca fan4 úlltr rlr ? vdtr. usd.îÙdlheo,'.,pfob.biìirylolct|t riro î.id (brcksìuid prcb.biljri*) NheDdifr.ri' sonrs naùns GM brk!Òuid prebrbìtiÌi6) tueùsd. ,rhcEr{re, r s, ido r róiultizr
sm s,, sch dìarrhè
(r! Òarbc rùNi ún úis nÒhrntc'l k disribùrior *irh ! =0.1= I (se Exercis6).) T.e nomaljzedsoEs is deroredby Dri Fron Éqú'iÒ! G.r3) we tuìd s" = In P, whùh i. ú' iofrdizn rón rcqù FÒfcfhsúi'gnaùixúdlypicdsnino rhRrorobecrr.ùhcdby F4ùrdÒr(3.rr). $d 'hoÌ v,luc roùid byEqùa'ior(3 ì3).
3.4 Probability Distributions for Gapped Alignments
lìc soijsú.,r tnùy $orc is dcvclÒlolror merppedìocir dislmnb. No prur anpuuliom|qPcfìnqkfuiglys y rhcncrhoddcsribrd È!ùìÈ..aìÒulatiig Pdl1jgnmo*ùscNb.Ibhdmlyú.y
rorany$onrs núii. îiis ii ro' (yr) p
ror(crpped)BL{ST. A dnúek
Ílilgdúoms'!Ùacslrcnrly'ictrlniD rirs r.r..3ppìr Er.mi. pîrrun03o. isfibúion of rhc soú. n órdè, hd ùc rhd lill be usc!. Thisfl.eduÉ n u$n n ftí sEnsncar signi6aDc .md besiver if
hbissqÙeDce'rcnoEnANmilgúil 4 is honoìogousroa mùìmun ot fd quLry{id r *t ÒrdùdomionhÒ'nolo Pe6or (1993)ha invstiefted $venl Es (z = (Y - r)/'). sinirîdtymrcs ,qnM ùú ? 0 or r$ rlìb
3.5 Assessing and Comparing Programs for Database Search
dkr. no{!!ù
ùt Fo&rì mrgr cr rir),
!)2e ?
(ned i ùÀ!4in? \qrcn( a sequ.m. r4dr qBre irn n ouhndÈou (o 4J:dlhrei5cnr !rrùì r t ,J.4.s'df
• TP(T): the number of true positive sequences.
• FP(T): the number of false positive sequences.
• FN(T): the number of false negative sequences.
aMd..rùt FP1arn{ua ù F!ú.r.r(i)l
3.5,t I
3RmP2ú úc q@pr. O sols P ud h-.'6 t bù MN ú! utuc 1. (b)fu ssn\4 or ù.
i r 5 35 7 1 5 5 . 1 5 2 ! s5r € . 1 ? 1 6 , 1 5 1 1 1 r 1 2 1 0 1 9 * l ! l3Z5r ól ! n : : Ì1 2 :!re! !ìe6v3ó 35t9 73?37573?l6e 6e6665É1ó3636rrio!958r!5r49t
Pr : HrsnrqnMwnqnxHsnHnin P' ì HHúffiHmmffiHHlnrhrnHn.n..
fte l nù s Nhe€FP€) = FN(î) GÈ s{tioi 3.52) Fi3uE3 rG) \hoqr ho{ FP.nd FN de
ald spécilìcity
o'heF (N4r simiìlnúcn. r.ror@nmea hy prcponio rnd T/(rP + FN).'he
. tr. /: N/rrN + rìPl r s/4j
TP/('|P+ FP).drcPrrPùri
ùbí. trr d ùqu\ ns î 57(?lhsrs rherr o(ùr. r rùvi f hg!rc r th) Nor.rtì
,i / n i iord{€ri,ìg
'vr r ue rìì dor ro1.ikr srr i n
L(ioù!e) mdèdeFnd on rheòLùlbtd f iguu r rlh)
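The counts TP, FP and FN for a given threshold, and the proportions built from them, can be computed from a list of scored database sequences with known classification. This is an illustrative sketch: the data layout (pairs of score and a flag telling whether the sequence is truly homologous) and the function names are assumptions, and the proportions are labelled by their formulas rather than by name.

```python
def confusion_counts(scored, threshold):
    """Return (TP, FP, FN, TN) when scores >= threshold are reported as hits."""
    tp = fp = fn = tn = 0
    for score, is_homolog in scored:
        reported = score >= threshold
        if reported and is_homolog:
            tp += 1
        elif reported:
            fp += 1
        elif is_homolog:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def proportions(tp, fp, fn, tn):
    """Proportions used to assess a search; assumes non-zero denominators."""
    return {
        "TP/(TP+FN)": tp / (tp + fn),   # fraction of homologs that are found
        "TN/(TN+FP)": tn / (tn + fp),   # fraction of non-homologs rejected
        "TP/(TP+FP)": tp / (tp + fp),   # fraction of reported hits that are true
    }
```

Lowering the threshold can only move sequences from FN to TP and from TN to FP, which is why the first proportion rises while the other two tend to fall.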
3.5.2 Discrimination power
a aii.[ piogfri s /6dirùúria
(of dr)r/n(d,r
/o1q ir hN ldr i d\(nni
r^ (r ) (r r tilud oumhcFois.quarc- \tuns y lJriLld !ftrruc \quoì!$ n {ed). r,Lndfsieoodjobindiúimùrù:heneaì
ru{rutri i F,gK:.r(! ùFir(i).
$her rr = r^ rdúrJ: fE qoFdig!hfl({ùic(Rocl
ii rior @vq. la) ft NÀe!d
.r rEsìrrtr.
i . {pfopùÈ
EUI d 1v. *!4r prosm5
€?) s ùc ùrc$o
s i i Í q P Ls d 6 s 6 f
.rcs to r) f eqùaL'o 20. FÒf'h. splcjri!ìry. b se 'h.Jrlv rdirn? zk. wlìlch is
a \hich m dasirìed rs hondosout. Nore iùib fc dósified a bonorùgoN)Th .uidea|cmv!{ouidbulLafnliliyil]o
dìe Roc diaenoì(r'ora dÍrbre or j b or0.0r).1xùdoc,!d r!.'l ! n'Ìi {hi!h umbcfd homorosous f+qes enmpkl ard rP, rhenù'iber or ho,no
(10ii our
rslc rsnirc5. Roc, is d(rìncda\ nPI
j,,*""-,,1=ff = ,rfc*o* t t '.t,o -9
riÒs rÒ 'I. fùÍriùa í EqufiD c 14) Ndc rrrd RocÌ n ! sù! 10,r l, {idì 0 s wonr úd I îs b.r
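A ROC_n value can be computed from a ranked hit list. The sketch assumes the commonly used definition ROC_n = (1/(nT)) Σ t_i, where T is the total number of true homologs in the database and t_i is the number of true homologs ranked ahead of the i-th false positive; whether this matches the text's equation term by term is an assumption, as are the function name and the padding rule noted in the comments.

```python
def roc_n(ranked_is_positive, n, total_positives):
    """ROC_n for a hit list sorted from best to worst score.

    ranked_is_positive: booleans, True for a true homolog.
    """
    true_seen = 0
    false_seen = 0
    area = 0
    for is_positive in ranked_is_positive:
        if is_positive:
            true_seen += 1
        else:
            false_seen += 1
            area += true_seen        # t_i for this false positive
            if false_seen == n:
                break
    # Assumption: if the list holds fewer than n false positives, the
    # missing ones are counted as ranked after everything seen so far.
    area += true_seen * (n - false_seen)
    return area / (n * total_positives)
```

A ranking that places all homologs first gives 1, and one that places n false positives before any homolog gives 0, in line with the range from worst to best noted above.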
ù rbc urMt
turdir d P Fisro 3 f4 \hoqsrhi ùe kr (0,o. r) urdd Lhccúrc. $ irhrrud i{tr Pr ìl Fìsùor5G) tuf
ùc nmbe or ÙÒddd3otr\(d ù. qrcfy) $quoirc\
3.5.3 Using more sequences as queries
3.6 Exercises
{E !of!rFdù!.
b 'hc Roc {nc.
(b) r
.h 0rrh. rhumLdqlcicc\
pmnde^ ( (he modi. ù ch{adefnnc. {ìm) mqsufe or decry confùr)j by rhr .q
r found
ind r (ùe vùiamc
ing i! q sd 3 tudÒm sL!@rce, by uug Eqùdion (3 4)
h,honoloeùsb4,wÈù.scFq Ìotrrìo'ioldgous rÒ4 suppo{ 'hr .bc
G) Ld '.ek $bcto cr.h or rbÒprcs.J P': !!Fn!nsn I nnnHns....
Fild rhevihe a = FP(7)- FN(î) ro r$ù {iù whd Nis n{nd in (a)
if rsi'iyirr,Àp(.iLorr rlr(;4siiurr on,hch{toúd dit c[óo\clppf)p Dr!ù$ !hi!h progfn twil|] rricrN$n p{r]d. t )'ouretud roberhr b$r
riqùrior(r.1)ii suhsLdòn I 1.: tReùùúq dúrù(!') =r ) r r(J') = r qo{ r4re
rr ). Thh
{ = (n(,(uN)),// coNiLlùLrr(RrLlqD!d rnF4L!ùon c.r rr. slìN Lh! P(r ) \ | úr$ hr! r mrl izederkù rruc di!ù rrùiontr = u 4u
3.7 Bibliographic notes
,\xsììnì(r99i).ìl rdjlr$ roÈifrd
(nrir.0d akrdrurfreeo)trd (rú 0d bna r{hùl errl (ree6,ree+)^iny's
conùsda (re33)(rfiisolrlr3).ik5dìùrúa (ree6.lnr).r!,6on(j996.19931 i \r{ rrrri6e(r99t
Nebk lnd Bldor (1001Ieri,ù,tuI ql!r$ tu siùd
ri ir ro.aeo ú j (:000rr) îd r_indil údFonsoì(:1000)useorRoc!úr$n$or!ii|cribrho!d,rRol,i$o!(ree6)
Multiple Global Alignment and Phylogenetic Trees
1ru|'iplealigrn@'iltmtÌmlùrci5
o rir r whote rrdiy (n h5 bd srid rhd iùtrir e aìisnftùrhoúrùùdty). ri odÉr ùdJ rvo (oi r jc\9 d Lhsmir rhe xi rnry .rcc
(or supedùir,
I. ùis Na)l 'lF arlrsi'
nid Do?lF ard ptu /ó dc\d
4.1 Dynamic Programming
Is&dbmuìúpl.ljg'ìÌjÙ'l.PÍod
ú a nùìriptc s.,lùe!d djsinqr
qiuN, lyps k (fe'ne'nbe.'hf ùiuns
vnh ùdy bbiks r€ forbiddsl
hd! an bctn (l d'Ilùer lrs onLqus!ò gnenby Equinn (4 r ) ilì. ùùnìbùorcdis
riisr€ {.,
{n ùs!rc ihÈrfúg 'h{ ricono, h (b)
rqucrus (o(,,ii) roreqml scqncc lcr-ellN,r, ùid ior phúid \orudoisor such . T'IÌotdùeÌhe rumiu rimebyNirÌgpruniisr.hîiqùs (.ù!oft vrúchf ì
aid rhe be{ (tr coreo sohúioris n
4.1.1 SP score of multiple alignments
iNen.d itr n. L.'s(i',ir)
bc ùepai
(J, ;r ) * ùc so€ or3 rlfrubre Pri$ù. hrmlsìnrbeprct-'id re*ha€m rf *c tr\Ltiie{ lap cosb,rhcsp foa crn at$ b. calcuhreds a sùn .f (orum
ìr dì3nmm'.ii is'r. r,rr symrroì of r lrd ^
(riur 3!p |€mll],). úd (memher 0 ùsirgEq@'ioi(4.2)is (crlcurr'cdDs-Nìs) 0 + ( r)+( l)=-2,aMbyurirs r)= _. E u a ' i o n { 1 j )n i r ( ! o r ! m i * i s e )( 1 ) + ( r ) + 3 + ( , 1 ) + ( Nù'hdrofigivenscolilg{htme'l
Thisfolss ftún dd ri[r rhii J(r . Jr) i\ úr hishsnsùe rhieúbrc byalrgoinc
rk sùe or 'rreFrjsdiom or 'be 'yo iiÀl *quses ìn
vÈishd 4@ù! rheE m biotdgiùr (
elìtredii &ùadon (4 n a[ lcluercs
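The two ways of computing the SP (sum-of-pairs) score mentioned above, pair by pair and column by column, can be compared with a short Python sketch. The scoring scheme (match +1, mismatch −1, linear gap penalty −1, two aligned blanks 0) and all function names are assumptions chosen only for the illustration.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of one aligned symbol pair; two blanks score 0."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sp_score_by_pairs(alignment):
    """Sum, over all sequence pairs, of the induced pairwise alignment score."""
    return sum(
        sum(pair_score(a, b) for a, b in zip(row1, row2))
        for row1, row2 in combinations(alignment, 2)
    )


def sp_score_by_columns(alignment):
    """Sum, over all columns, of the pair scores within the column."""
    return sum(
        sum(pair_score(a, b) for a, b in combinations(column, 2))
        for column in zip(*alignment)
    )


alignment = ["AC-", "A-G", "ACG"]
by_pairs = sp_score_by_pairs(alignment)
by_columns = sp_score_by_columns(alignment)
```

With a linear gap penalty both functions return the same value, because every aligned symbol pair is counted exactly once either way. With affine gap penalties the column-wise sum is no longer sufficient, since the cost of a blank then depends on its neighbours in the row.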
4.1.2 A pruning algorithm for the DP solution
d úc,, ;eluenes (r r is ùf knNr i u d 6 ' ì d i 8 s Ì r u )c o n s i d ù r . eùr rÉ ( n . i l . , , r o r ù e D F ' ú r i x . l r d l d ù d $orcofrhebsr p.rhGLisinentiÌo'nthefú !Òrb ldr , bss. (sccFieùf4.:rG). < F îcn e ! kio! rtu I )e$orc or L mùf be< ú + .1 ú: 'heEfore,sr + ,., < ( , wckno{ I
K ùJ 4rR rlc ccttI = (r,I l){ìtlrher dÉ hjeh.r súto nÌ!isnns aRe, as. R (sG.Ì ). wc rho hreio ddùmiie rì ler uppùbdid (iì , ,) for rheirieiùh
t 4'!i7 r.usior j\ ùsd iirúd ÒfD,lrlad
'l G) r4ue 4.3 | shrhs rrìÈfo*d prudDs (b) îE s@N!
(b) d!
I i. Jr aa ri
eìls Gs in bek*ùd dtr$iÒo). a vrl D(r, ú). rher úc uùe r, + r(r, D) h sc b u: rhc$orcor drebsr pÍh rÒu
by sc ol a queue. wìÈi I .Òllis pllcesib rùsùd ncighbou6(o phich i' shdùrdind ujuet in dÉ qùnc, n! oc
j\nji! cùnft. i:) ìh rofld deiehb.ù!.d! s h o u l d b e p ù s h . d i ' ì r h c o d s ( irL) . (i :a++ r , i r l . ( , + r , i r + D . Algodúf 4 r $ovs rìretoruird f.ù\iùr
snh prunirg
Algorithm 4.1. Forward recursion with pruning.
ar :iconún rordoirg der muìripteati Fonùd lmaion is ùsd, {idì poDìtrsor.è[s t0'herúcrror'rrcDPnarú(rq0, .0) ù, rhefld cenor rrc DP mút {ar,r 4 . ,i,) rhelhole s(r) ?(!) D(!.,) 0
rheber slm of m atìsmetr' (pú) frcm/lo .o ! rhcsÒrcorrh.b6tirisnnqrrmnÀob!rÒu soiT 'he$oE ror dbdìns thealiEE a sbck of ùe ceìlsI for whi.h a yaìuero p(ú) rsfoù'd
Flr, /ix) ÀpM.duÉ rhÈh frndsm ùppú bÒud ol rhesG ofúè rlisntur tom r ceuI rorh. èid-.ètr, fl r = /'oì P(,) := 0i push(', 0) pushdresúÉ cf onrh. quce p o p r , . o rJ:r , ) : - P , i ,
h6sorr .J,-..!r
irr(r) +.(ù.À,) > ,( .hen
tor auhNod neìEhbÒú, u ofr ànù ú. 4ht otd.l Push(u, O):P(ù),= s(ù)+ D(u,ù) P(ú) i- nd(P(,), s(,) +D(,.u)
ainding upp.r linit ro. soru Forùy úgm{r
,{ or sqùdcs f,1, l, . . . , r'1, rheEqùúios (dr) 3i/ (a.a)c
s("{) < I I rr}.rr). ldùheúser (n.i:,...,t).îÉir ro.rherLisnmón!o.rh.suhÈquenc*r1+L ",' "i+1,..
.. rl-',..
nrs mb.
Jotrcby ùsinsEqud.n (46) $
.-I I ''r,,,,,1,., "," r f{ù!d ù rimeo (,1, úeÙ d.iors f mprcrity fof6ndiiEr is o(rr,1.
sw andaRlR. , = (3, 2, 2), aid I soirg
s r " l . ., . " í * ,
r = 0 . . . t r r - 1 .i , = 0 . . . a - r no ! ,ts.,. :. .'
u.i id \c,o, .-
conpL*jty ol rhendbod ror lindiner lLppùbouùd\is ú.rîoft oo:n1 spondiEronoviig lod ù rou (DG, u)) n ùbùraredby úe vùih' Éq@'in (4.3)or rhesP soF. Nor rhf rhctumbcror
rbodJd uscd,aÍd mrnyof rhemus (rcush efimaÈ o0 phyroseúic (orflorùrioisJr) b! riir heh wnhdìeaìignins.
4.2 Multiple Alignments and Phylogenetic Trees
rte lè*as {lemì!!Lnqret,
and rhe 'nEnor
ros. sù.h I r4 tr ciììed J phylolonúic (r .vorùúùrfy) ùco.ind sfiuy '11iib (bmnctr$) o$Lnr sqùems, $d ùe ed,ees e tn i rK cùmrurL\i prcrein{of rJN^)
co^ldsr s'orseqùeocsl^Rl_,ARrr.aRs'.ansl.awrl,a\yr] ùù4rlEl
b $irrudattud) prryk,!
o|Ío'nR'ov,h6ccuftdin' rdúryhrùn î nùùriotrfftn s bî
e^ trro 4ueNes. one*qkNcJd
cr)arLcmur, ofbds.ùr 1{o Gul\d
whor Dcihq u fru, rdirìe ililmÉnr.or i (lirc) ltìrogqrrd! a!u i! tN\( (L r ph![]genà ! re. (plr rr! orh{ mdhod1 tva) hy dhq mdhùL clmhùùg rh
.Ùlp[]ro:uldi!to$ud0d'ielcJ]sF rerhod!lo1ÓNndiigplìylogÙtù.ft4
.LL ri$ ld$cu, úrn trhorrl'3r), rq, u'!$kr0lnsrhd
cdg$ u Lhc'(o
ln ùyìosi€ric rudies.'heoL (6 tur uú,torLn h ou($! úr {,bjLd\!
a prryroa{eri. rÈ cdsrudd ùyòc Eisù@cjohiq Dr'hoduis ù N ù (ùr sùc 4\ red ro.Fie!€ LD. rhe Nnen ioss ù. Ésh orsirg bobhPpùg Gq .ieEr.5
ows,cìvúDdgìú|sqEo$'qlimde diphy|ocsdi.hc,wÒÓNidùoly . Theùe ha r l.trimr nÒdes oeavct. otrcfof flch ùiein.ì squence.
dnediotris dÉidcd.(rnùrf@Èd ei
úc difarion is undsided.)
rvc ryo chitdcn;rheùrenat mda ofm unrcord re haverbE comsrd edges(btr.h.9 ! Ai ircdd ioò.ù
r+ ofPddeÈùFgurc4'5
nús bM rwÒiDstin sqq. lrìis fte rùùefor eivesùr rlc Òepúniry roìlludraé rbeconcephor ,n ,r,s hd rdrdr4. dÙpucaliù'í wejitiptì
lh. fuìl
ìes,bÙlpmlÙgsirúcy{edEnvennlmgeoe FigE4j s tbeevoluliotr shNi in FictrE'1.6,
. rNsr(MoNe) d nrs2(Mose) m p8dogq . n{sl(Mose)a
nrsl(Rra ft onnobs.ì
. rNs r(MoNe)Ìd rNs2(tun ft odrybùúÒrÒsi . rNsl(MGe) :ld Ns(cliìo) ùc onhorocs.
speiès.í rhemw.ùpy f cÒpìde normrive(tudibed) rheyft calìedpr.trdd listlÙÒkrtúcdtrdbàofdìfi@íftebpoìogies.
4.3.1 The number of different tree topologies
An unrooted tree (of the type we consider) has n − 2 internal nodes, and a rooted tree has n − 1. The number of unrooted topologies for n ≥ 3 distinct sequences is

T_{unrooted}(n) = \prod_{i=3}^{n} (2i - 5).

For example, T_{unrooted}(10) = 2 027 025, so even for quite small n it would be a large number if all possible topologies had to be examined.
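The growth in the number of topologies can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the rooted count uses the standard fact that a root can be placed on any of the 2n − 3 edges of an unrooted tree.

```python
def unrooted_topologies(n):
    """Number of unrooted binary tree topologies for n >= 3 labelled leaves."""
    if n < 3:
        raise ValueError("n must be at least 3")
    count = 1
    for i in range(3, n + 1):
        count *= 2 * i - 5
    return count


def rooted_topologies(n):
    """Number of rooted topologies: one root position per edge (2n - 3 edges)."""
    return unrooted_topologies(n) * (2 * n - 3)


for n in (3, 4, 5, 10):
    print(n, unrooted_topologies(n), rooted_topologies(n))
```

For n = 10 the first function returns 2 027 025, the figure quoted in the text.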
"---i A :-;
/LA/L^^ Plyiry 7i,ihù(!) hy h
4.3.2 Molecular clock theory
frúdio crnb€sriiùrd (sedu 9)m ù60mdr r 42 oúfioi\ rrrEhNú
J.1.3 Addilire a
È edg* .ÒD!4'hg
rhe nods. Flgùe 4.31i)
ded iion rrì! dirù io Fisùc 4 3lb) rrc
wcchr!b!r rhurods!\rnFis!rc49(orndúl squenesshoMi.Fgùrc4.7(Ì) elrr5eúdEqÉfùe.r0)núbe (srìd$ing úúÈqú rú (4 ú) inpf$ {k|niq h beyoùd dìe$opeof rhisb.ùr )
G$unii! nrcù!\I)ac)rf dorrr-itrorerùyrnpbi.j r
rieoi i.e
(ù F+r. Íù idùe
'hr iddiiny d tu inkrc$ d toù objrds (br rÈ dtueobi4r fre ùùudd! 4qùiPnd, tr
Fisùrca.eo) iìrunr{* rhisrEqu ior (r.Il) is $úsfiedror aI rlÈi,ives or the ry rhatEqùÍion (4 11)impÙ* rhd ìÌ h Eùa'ior (a.l l) inìpliestr!ùrtion (1.r0),hei.e ùlknÈúi.ny iúprì* xddidvlr (*
4.3.4 Different approaches for reconstructing phylogenetic trees
r,mlÙmolinullipkarig'ùaÍ'Ù i:ì<Èd (or iu) or 'be olùms are ùre
(Th! hc h Figqc 4
î î-r- \
j"'' >-' ; .-;;*'
squencs.) lrre arigmeú h 6^r iispúd b 0ndiur'uliì? .,1 ,m. a cdùm PÒhgiesNqlheoùds'weiusht'hilby m .rùnprc. sùpposcrhf we hde tou
FigE4L0we$.'hdo|UnN:rmd5f 3, s md 7 'o d4ide rhehdt wiib rbe rùs (l +2+ D subrirutions ìnrhosdÌee .ohìmns, ree I h.s (2+ 2+2), úd frc ìù 6ast2 + I + 2).T.ùs@er h chosn
whn úe rumh€rol squenes Eori rhcrcac irc
pd$iblc h6, àrd ùè .aì-
lish6r pÒbabilnytu beîg coftd (bú( $n5 ùe onúii
is .1ìù$n
-D. Tten Pdt4 h rncpúbrbirny ù{ rhc e nfcnúlnúd.rùo kio\n (j,r.:. f). The r! (r' ) P!(lt Pr,('r) Pj: (r, P:.(r1+ rr) t\,16 pún) p@an
:D Prcbrrririls (orìiÌelihoods) rorc!!
4.3.5 Distance-based construction
fljns{Ìjol1'9.trbyÙsìlgEqcldqrced
:quene. ùd rtÒì rd. rsd rodcrnr. , :!.
ro.1' 'hÈùlhq !od* (r) ae mìc
cms wiìì o(ú
| = th. yt òtut
(*c rhsexanple belo!). An
t4 tlore Nrt) Jot eattt ÒnsMt rqr?ne.
tu.r):= ù, eús aJtuanè: nt u f ub'rhÈ theknltthqfthe cdq.:lt t)e"tt",u) rof qch nd ' ùJth?Ì6 it u ll.ttdo DA, r) :- utatkte ttp htu . bdrt.ù \ .t
u : = ( u - { J !' } ) ù t ù ' l
Ìo i roo' I ù úoùs
({b)rre is .dù
etrihe mw Nde u ({ùh .hirdcì ! 4d ,
Ftùr 4.13 (i) De d^kfs
ÈNer rì
jd ir ùe 16r G) E n grclFd vii
(c.D). r.) n. dcÈmÈci òe oEddd
tu ($c
ffi8rr.d PcM^ (UPcMA) D . . ,= j ( D , . , + D " . , ) . eighEd'TlìnnúeIEsn''oical|jicdns Ldx
, rhcrDr i' tuc rÒrèwy Gùb)fte wrh m' 'ftenrhelcnsd,of ùù.ds.s(r,ù)ùd(r.ú)cùersiìybecrìcuìaÈd btr, bc . :-r ofùe rÈ wirhfmr ,. ÎÈ hrllh ofrheedgc(', ùl ú ú,ùn
,,." = i,^, -,!,.. ivenin Figue a.l]G). :.:dces Ircn Ìhe newnodc(a,B) ro .-{inRsúr1 r3(b).coÍiNjngù' :i ù. dhtucs bervq rhc*qucnÈ. in Fi!ùr 4 r3(Ò).
stue rheodsin,r dnbnes (Fiem ,1t:ltr) m ror utFrmdnc,Ìhe crtcùbÈrl dh's* jr ú. .Òisdrcd re (FieùreI r3G) doìaÈ ú!m dreodgjnirdkhics
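The clustering procedure discussed above can be sketched in Python. This is a minimal illustration of UPGMA under stated assumptions: the distance between a new cluster and an old one is the average weighted by cluster size, ties are broken by input order, and the function name and input format (a dictionary keyed by pairs of names) are hypothetical. Replacing the size-weighted average by the plain mean of the two distances gives the weighted variant (WPGMA).

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree by UPGMA.

    names: list of leaf labels.
    dist: dict mapping (a, b) tuples of labels to distances.
    Returns (tree, merges): the tree as nested tuples, and one entry
    (u, v, length_to_u, length_to_v) per merge.
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    size = {name: 1 for name in names}        # leaves below each cluster
    height = {name: 0.0 for name in names}    # node height above the leaves
    active = list(names)
    merges = []

    while len(active) > 1:
        # Pick the closest pair of current clusters.
        u, v = min(combinations(active, 2), key=lambda p: d[frozenset(p)])
        w = (u, v)
        h = d[frozenset((u, v))] / 2.0
        merges.append((u, v, h - height[u], h - height[v]))
        # Distance from the new cluster to every other cluster.
        for x in active:
            if x != u and x != v:
                d[frozenset((w, x))] = (
                    size[u] * d[frozenset((u, x))]
                    + size[v] * d[frozenset((v, x))]
                ) / (size[u] + size[v])
        size[w] = size[u] + size[v]
        height[w] = h
        active = [x for x in active if x != u and x != v] + [w]

    return active[0], merges


tree, merges = upgma(
    ["A", "B", "C"], {("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0}
)
```

The toy distances here are ultrametric, so the path lengths in the resulting tree reproduce them exactly; with non-ultrametric input the tree distances deviate from the original ones, as noted in the text.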
rhe n€jgibDù-jorrros ndhod jÒiniic(N,r4hoddo$noÌ.suermsa rreneighboù
mes vilh rhcdrinlm 'unbú ofedses (r fù ùco. ftcn rhù@ is sucsn€ly cbùgedby in!6ing by I rbenùner ol FìsùÉ4.lj(r) Nev inremalDodse rtÉi sm$ivdy
curcd. md rhedesEeot
kr ùs.dr rhÈ(sdblr$ cootr(cd ro x for m!s. A ,giúr,!r/.r is a panot mur. li Figm 4 r5(b)((a B).c) h a nejehbor p!i. (a-R).(ÈF, in (c),erc.rn , rld I icw iod. .carc! (D, wnh edes ro x ùd dch or ,!c oTU olhe neighb retr8n\.Tris úeús rrrd ir is d n@ssiìy rh. plirunn ùè ìe*i nuruil dishn€
:i +--
-tr rùrÈ'(r !r1heNJ'ioh.d G) s n * q n m q ! e e i ! { . s i ! j l Ù h c d
i ùÒ n^r cydrcrrrcr ùc 4 (zi
i )/2 póribrc !hor6 ln' rh. icitrrbouf prfs ro
r!.'deiÌheEm (,, -i+ l)('r -,)/ :]Nrg'h.oligiruìdnhrcs'mdth f,din saìrouind Nei (1e37).
riúprr mdhodrir cÒirru{ìng r $òod .rcc ]Ú6.ddle'!ìrdpoin'fìdnsftró '- in FisùEa!(a). i morca beinsen ghrxndleft sbrÈs. Figurca.l5(d) sh*s n FisuE4 r5G)a moris pùced
1.1 |
[email protected] sstrr howfobùr tù *able) rhl rcsuì l0r0) of !*ùdo atignnenGis smÌù
ir is tor.ach sùbú (ìnèmrt ndlÒ)in I rh. sam.!ùbr€ ocùs (cooEinjngdì. !r6s rÌ oft4vct. rtrjsnumhùn q\wqNu ne, asihc rimprerÉeri! Fisufe4.r6 show r, ùe sr of k!v6 (hùe prcrìlr) or
G) rigufr4''7^ji!ìù90'IELqoùd'ìprc
4.4 Progressive Alignment
ùly lcps .m.' bc .oftc":d larr Gs rlÈy rd ii roch.s'ic mdhol, wronrry irclLded erps rc no' rcnorcd ( orcc a sp, rì\nys r llp ).lsone mdhd\ rry r, ri-qrtd re onoled ron úe aììsmeÍ. ad idded bro
r*hniqùs iÌ.m dNcnf,s ù phnos
P.forning xlismcrt lNirh !r$onablc Konng Khcmo in dili.Èd ordc* {il
, ,,"r) . . ( r . . r ) . ( , r , 1 ( , r , J 1G
tr4Er h cvdlùtiorry tint
ft :rigr
T']ìOCRESS'VE AIICNMENI @c65 if 6erc *isB (or ún becon{Ìùrd) a (phytolcreri.)R vhi.h cadsuìdc rsosrsxìv.arigincnrDitreEnrnerbods dtrúÒbimp]cnhddol)?ldingmh d ro u* ro. .hóosingùc rìisnmena.*e cd
Progressive alignment of the sequences {s^1, s^2, ..., s^m}
C := ∅
for i := 1 to m do C := C ∪ {s^i} end
while |C| > 1 do
  choose two alignments A^p, A^q from C
  C := C − {A^p, A^q}
  A^r := align(A^p, A^q); C := C ∪ {A^r}
end
C now holds the (single) final alignment
4.4.1 Aligning two subset alignments
vi'bÌhesqÙrcs{r,...,r})ed{x,r.....r, ).{hecsèquèrcÈrJ!è5}oEr ù . lR€arr 'hd ii is rr wiú rheblbks ù$,rd in in iìiem.i'.) 1|ignfr6t\:ewkkalismwúú,]pÙ;r34i1cdal@Ùfd'lnsmpìeÈ3|iginolL sqrcrcr ofhetu s!b$r arisnnÈnrm lrc súf bcMci úr corumrs(i , is sbom by &urion (4.l2), qhe€ n h ùe so€ hNed rwosyEbù. at is'heqmbol (ei'rùà'iiiÒ Èid ofbhn*) lor *rllcre P s o e ,t ( )nBthl4ro:
I belrùeqùd. r rùunEquar andbknk
i(r - r ) + (
ù cq mùe difiú
r + 0 )+ ( - r r ) + ( t ,
"= i.
Biolosirs orbr úy ssverl sapÉetrìs bdorc rhcygd an
rddÈ nnJ di!îìngisdotretonùvìry ù
!5'hE'lisnmqÌ othii.d iflhe úd squctrr lfomb.rh Gùbsortìgnmfls ra usd
by 'hc dùrtq
rhcrc$trs srúsry @FU
cd |Icedm. No cubstdisnneft nDr
hú (Pmbablr,) L
quq!6 (oE rom údì itignìùúl i5 [€her. m rhosn nx !]i3mìc (Tlri5qrre5Dorti\D i Nir ro.he pc\1a 'ne'hod rùr lq\rudins '(L ) rr se quaLioìI j. (r^ro!d d 'hc ùlhmdjc nÈrn r d*cdb.r r(rr +rr +. +r,)/,r. au úu \qúE mein
' n è x r ( , t ( r ^+r r / i r i Îhe rùiìaù
. r/!,J))
Lùb|c ùdhù
3h ù( n\î slrltrur.\ rubvri,gnnen') iÍe si,ìirù 0r5!hi3h 5q,e) Îhe ùùin@tqùvhie) rù,tdy )n4ro
(ù! fnn cub
$b baven!n. aÌsmù6: Àr = Iri , 11, "{: = Gr. i). ,{r = lr1, Niu, p!ry\o
r ì c { m , s b q q e n ú r " . ' B ' n ó ' a f o ' 1 . ó. .' n ' . . , 1 r n r !n - h o dm À r o l 0 " ,
s ( r 1 , r , = 0 + 5+ ó+ 4 ) / 4 = J ( r L , Í t = m d ( ? , 5 , 6 , 4- ) s(,{r, at =nji(?,5,6.4) =
r! (,{.,) mqn s(n, ,). and'heKoEsitrd€Hsiog orderc (,4,D). (r, D).
r.1.r), (/1,c),(r, c), (c,D).(J.r), (r.J), (?.(), (., r), (D.É),(F,c). t D r . ) , t At,, r E , c ) , | c ,] 1 ) . t 4F, ) , . . . . n Esùli in drete ìDFìsure4.r 3(4
'hc ùr &d rar q$ic6 m Nd rÒr q.le'P,ji8Ùidedllisinslis'henp€f'o rnlus ún alsobe$e4 soúr ùe du vducrlish'ly lc\s'nùùesore(t. 6).rwo
znA 1r ourn! D íL$
iidrqr (b)a +q!
tl n.c D)1rlL.t:.G.L,t r.K).t shomi,ìFism,1.13(b) fte ofd{ orrhedurùin! is (..(!
D) ')
lod (r (c (a H).t/,(J.rr)J
NoÈdn(r, r) dd (r. /) !a Dorus
.!qú!i!.r lr'. 1'.
.hu^, ^v !!q@w\ nr" ùar lhritÌr) 6. n trú u
r = a r i e i G . r r ,= u dW^| a!quù.!
eU: u:= U - 1:)
4.4.3 Sequence weights
rd o.sìnpres (ilrúes
otrlr d!$)
tu 6d rorkÈ d..iri{s ir a N! obj
Lnoú úsc ùÌ objeh xrc seqùÈ0!rs.dì
ft !sir! diflùùE 'o dc n,or('hedi{!n!! h rhc.middle*queF)! hoùfbood(sinìitù kqtrnces).
(Nnìbd of ,iúaront
! ir hc rhLrîrLre(ilnìbelof ùùhrio$) oi edseórr
È sùbr* \Íh rcor in rhe noderr (./i, il
"iro' (\rrB)
o. ,ne edge\ftur d,o rlL u L
Ii FierÌe4.r9(b)Ncjertrs !ro !i!di!
4.4.4 CLUSTAL
j cLUslnL (Îonpsotr r rt. 1991b) I seps(rvhj!hcú rreGpeareo r ciìctrrdcdr (sùrioPì^!i{ snih
r c n g rt ho 2 N . i . d ì e , G g | n o d i s o n a ì s erúd (cs rro dùlo.aìsonú.h \id9, a'd dynlor!pruàiDing is p.dbned
Lh!5mdrc{d 'hcsquenesG fore o
ru ú Fieuc115 Ìh. oílr.!ì l1.,).c)'(E.D.(D'(.ll),((l',).c).(,.(''.,)
ni! mùicesis ùsd GÈ s{ridD5.r) 0,n .$.cs P Nrorcfmr ) Formdì digùùùn iaÍce. (Th. $drs rfnG e ru*trned b rcnNs \?hEsorry.)Eqùdro! (1 12)n Ned rof lìndi1ìcth. s.orc! b.rsrùì rh!
. r$coP(grpùPcnnsFmtr!).yhi. t Lht GEP (!rr ùremior p:ùdr,
r orLúns!lhcFLdiì\.oroNù
r. adi'N' nr muLripresùbfnùriortr (,iùlilios ). ri@ rhenunhù of múúior tr lko sùdiu I q) ( rl,c PAMErn!$ ]nd LhLBl-osul\{ ndn$lrkelì sjntol.oÙil,)
4.5 Other Approaches
icrivesncr)slùtr|'l $N tutur \ùhaLsùtur
or ùblcdir. tukrioN r,f rh$o ùdh,i\
4.6 Exercises
a$ci,l'!F,rudiú (l.t hr ro" for
coilid.r clr r = (?.r. r) ùìd frtrdù endceìI,by usinsPAM250úhbìes 3) choor ù upmpiirc rrlr rù' dic r. Dn{ iÌì umored ftiffry) phjìose
Shw rh.r by uing U?GMArd cGlMbe
evalniiù"ry @,.ùh
'IYyrosprrin (intufmn, rhd n ìnllmetic @ is 'ddiri€ Giutr .3yMicrl
(t snd.h!!.ù.É *i!a h uhffiic Sryp.e w. m cìh 6vc*qùd@s rr = MD, s2 = Àccm, rr = c4q rcdi,g sysen hre ! rrneùgap son
tn rhismy rhcltlm 6 beEsÀde! s ditues bN.5 ihè$qrn4 ùd .o€r for rbobd D'isis di!ìrlmts
r!5673 rhsè 3@È. G) piirshw m dismrùrsiB (b) M,r. Àlsric) cùid. É bu€nonld* disbes by úing ùè upcMA plreduÉrùft'r.cúà!iE! ld joinif g(.{u'i !.4), choos G) Mrkéom iplé,rismrbl!.donrhe@in(b).usepaiNisesùid€d rrisrert uins thés.qùffi wirh l4t dyr4i. d&hcc a cuidé. h .ù. (dyMic) éIc@rt, bìank en c ri) tu tunr}andDs ii ú. ..4u.ú f nure4 Nsms0:
(ij) onebhnk qis'irs in oe or 'he seqù (iij) oneqù'iis blankn illled ro in (iv) a ne{ bìankÀ aìisnedrom amino (rry to findrheber aìisnmenhwlhour ùsinedymjc pberilming.) ca|$hbIhgsf$Ùeoflhenuìlipledjgrefu' (d) i{rkc andisnmr usingùc minimumGompkl linrasemdhod.use joininsyoucmue rh paiNisesuided
a, E, c *h.i 'nÒndhod in sùs.tión 44 3 ìJ u!r. sone5€!l@$ (ffomsvisrÈ.t rhf youvnl aììsn.
4.7 Bibliographic notes
seeedhgÙelhedylaiú.PfogÌllmi4
(1e33), (ree3). upmm.t3i.(r939). (2000) GErìcrd cupbd,r. 0sss),Rèiuderat. Dd.mì!ì!s equmexeiÈB is aebjn d ii At6.hurèrrr. 0ete) udrroùeson er3l. (1994d.vìnssn andsib,brìd(r99r) compft dififfilneùods for {eishring ft. MUTaALpfoEm is deqib.d in Trytú 0990). À neù\odbd on ùù_ hins hn ,lins is pÉ.Í.d ir Kin andPmùjr 0e94),ore 6ins hiddenMùkov nodelsinEddyogss),8ndm uirg serencdsoíhms in Nor.n tu rnd Hissì6 (1996)úd Noftdme d d. (2000.r 993).A heùod b8s.donrheFoùir dqrÒ,T i! p@iÈd ii rúbh.raì. (2c{2).Theu 0996).AÌn!ùs èr31.(2002)dscnbe I nedrodùsinspolyh.nril @mbinrdi6, 'nd le d al. (2002)dÀcnh. Àm.6od usirgpdiat odtr snpbs. of pss1:tr fór fùrripre aùpnúr is giEn ii rnomesord rt. oeeeb).rr@ h a benchffik dd,b4è in Tnoúpcord il. O999a),snd(lrprN,nd Hu emO daùib€ ompaisnolnxiehsljcimúcúdújc a5ssneir of sbrùricrrsicnifrcùc. is pleún ir srdrcp andcdsbin efiB). A goodbooklor.brudon ry tu is u 099?),*hidh hs mny EGru!6.
Scoring Matrices
imil
l], bd*c!
rhe query ind eacdor dìc
ùùe;inìafiryof 'lÈrcri,ruqocumngin'Le*qùerùs Fors!fesidms (4',4),
]]ds.lrìismÚsÙrccrnùc!bcliver rnhdofamimxcidr{20) r ( seien trinccs for pins or tunìiù Ìciù (400 Ì 400)
:rr of rre soa
cor bcrwrú (,. D)mjghÌ be lrrsú dm rhÈ tu (q, .J dod ( , à) (or sDrxtr ú$ úen diflance) s ilso u*d rìr a sconis mfdx we {i[,
ù a etuin Gvorùriotrùl) 1j,hd a $h
o'- 1
t, \.
/\ ,\\ \/\ D E(RHNAS]
5.1 Scoring Matrices Based on Physio-Chemical Properties
Us oîidúùrr
a sinpìemdhodn b
us or rhc gènedc.ode.rhe $ofe is b úc nu.rd. dds for chinsìnsh amino{id inroaorhù (r. 2 oi 3i 0 roi eqù,1). lor ftDslomlng Phe(codcsúo. úc) b astr Gods ùu, ú). NÒrcù!r rrrisd
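The genetic-code score described above, the minimum number of nucleotide changes needed to turn a codon for one amino acid into a codon for another, can be computed directly from the standard codon table. The sketch below is only an illustration: the names are hypothetical, and codons are written with T (DNA alphabet) where the text uses U.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_for(amino_acid):
    """All codons that code for the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def genetic_code_distance(a, b):
    """Minimum number of base changes between a codon for a and one for b."""
    return min(
        sum(x != y for x, y in zip(codon_a, codon_b))
        for codon_a in codons_for(a)
        for codon_b in codons_for(b)
    )


phe_to_asn = genetic_code_distance("F", "N")
```

For the example in the text, Phe (TTT, TTC) to Asn (AAT, AAC), the first two codon positions must both change, so the distance is 2.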
uf or dNifi.arion or amino,cj&. s.drs mfnes b ed moE dìF.dy on rhe ndsEh!Iopú'*qDbehydmphobh ity. pordiry.cbiae. mmficiry, ariph,.iry.ridi.aÀi., rizc.H.bonddonofi crc. trpropùrd.sdì1hÌd smnn( rq90)Pfo. psd 'hedNina'ioo aacH Gminorcid da$ lìirùchy), shNi in Ficùre5 r 'fryrof 093ó) mxdeI vqn dnsnm ù
(cH) ùt sidrú prctsds b s,hÈ riiÈ ùe id&d
òfn (.!,) 04 e ùùe 4u:eÈr úiir'crcs!úE
c R4drld
k dúEd byinprraúor ard
ftor rlyror (rqó)
5.2 PAM Scoring Matrices
iships o(ùrìng rq?3).Îr$mdfi.Òsft
ù oroe. fte 66. sconDg
rir {idcly used.
ìò (by muEtionsNe meaosubfiru úois i'ì rhislhrprt 34 $sfamiìiej !e urd, ous sejùeices ( > 359. id. ity tÒrd rc rhe 1''ú oBupnúpo*d huhho6) n
nùr hrvù rhùsm..r.d
(nú' i!'Ò.ion in I simìlfl vry) f rheord one,which ùsu
6u€d li Nnben of 0ùft trúi!r.!lr6tdù. Thei Dryhof\
prMduE canb€ desi
'ipkirigmenrroleichgnp' 2. coîrnd
pbyìogedic (evoì'rio.ù])
'us tu àch groùp, lid csindc úc
rjmc inbnal r sícs fof caú paìI (a. !') m erimde for n\e plobabitry or/ Ìo
The evolutionary model
xitì.ù tlo v|nùk
iig rsunprìors: /Er,.raririr i\ ù1!,hp
,Ja úador
Òufr!idú4 (i.dspgi&úùot rèishboùf flìjJìorcìdsofdfuÈi\emighboÙÀìe Fsidus (boihin squen e úd spE)), ?d
r!.sirion wcdn juf wir,
-ddtà,r (/àMr. rnd I PAMmeins,r
5.2.2 Calculate substitution matrix
itr'heofhcEd
phyloeene'ic fts (15?2inrhetur experiftno. lr'le ftecodrir
Faù. s.r Pú of{,trù'ipt d$m
r ft {vr úiiary rtc ù ù, qtuhroc c
\,,./ FiBud5,4 (!)^ sdrphybs4ctrr ftc o qw(B
(b) ft,i!ÈiioN
m on ù! d
p,à,r,rio 'hd a vill b. 4red by ó rbis by ,rr;, (bc.$ ,v h aìsocined ùe núdiD i" PAMi sîd frAl q€ kDk d . = l ' l,,]
prÒbùiliry mùìx). f
r ]ìe FÒbrbilg dúr . nuùB (run rLLo.curercs or a haE 'ht sme prcb.1bili'y).and
. '.,. ù. rumbc.dfd + t dl] + a. {hcrc + trDs nuktror (nor
. , = I,-. L,,,hcb'irnmberor . / = t! ,. sEe ùc búr iùnbe, or muh,ion:Gút crdì mÙhùon I d. Thi\ is rhcclaiiÉ @cncne ormino rid a in ùc eh m{ùtd, hme L p" = I
t /i,;4q rholrd i0cfae wi'n ii.tuiù8
tì a ooriic
Hcrc n. cai bedefiiedÀ D, = rrlr..
i nùy iìtrú'jÒ
,hec ,r is ù.òmrd RnslhepmbabiliryùrtúÌhilrury
rhc rhúirity
rhf anebiùay 'nùriÒi òdlirs d is i,^//2).
biriryúfn hr,,' d nGìre r, = ó,) tt L z ftz f amoùe r00asidusúùeft r00a,o
rùe pnbl-
L t c o Pr " = L t a a p . . # - = , L h = l 4'FordeÌemini|gMdbwe.únowsgthehch'ha' ! rhÒptubrbinyrh4cmukres(rrnne1PN)ìs4i,lod ivciúìur!nùbo\nt;/t
. Mòb = ,4 Lh[ú îóf o + b: . Md=r a, (ùc prcbrrriry rhd. dos tur i'uúr). a np]e.lhepobabilitylofhAbhcrcplacenbyaLi\o,r'rx)]'
5.2.3 Matrices for general evolutionary time
úsion pq 100rcsidtresA rttr G$luriofufy) eù50(duL\'hcmllcú|eeisbrcplaa r00hyj0 jn F4ùriÒ! (5 r) Nd! rhi(hìsdes nor.oftrpdDdb z p M (2nuùúon
':ù4hdkE(be'ohdbyn.úix íidcp!ùdcù pmpeniesof rh. mo.rer(Mdkov ùìodd). ve wiìr srìo{ n for ,r.rr. tpìrcedl)ym
im m'd /, rfÈf r{o m
Èabbiury Jor rlìis h ,v!j,Ì4r.
tco|dî].pmbrbi]j'}ìs,ì1dl]ìr ùdù.IjrrlpÓbabilityibi(tobelep
M^2_{ab} = M_{aa} M_{ab} + M_{ab} M_{bb} + \sum_{c \neq a,b} M_{ac} M_{cb} = \sum_{c} M_{ac} M_{cb}
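Because the model is a Markov model, the matrix for z PAM is obtained by multiplying the 1 PAM matrix by itself z times. The following Python sketch shows this on a toy two-letter alphabet. The matrix values are invented for the illustration, and the convention that entry [a][b] is the probability of a being replaced by b (so that each row sums to 1) is an assumption about the notation.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def pam(m1, z):
    """Matrix for z PAM: the 1 PAM matrix m1 raised to the power z."""
    n = len(m1)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(z):
        result = mat_mul(result, m1)
    return result


# Toy 1 PAM matrix for a two-letter alphabet (illustrative values only).
M1 = [[0.99, 0.01],
      [0.02, 0.98]]

M2 = pam(M1, 2)
M250 = pam(M1, 250)
```

The entry M2[0][1] equals M1[0][0]·M1[0][1] + M1[0][1]·M1[1][1], the sum over the two possible intermediate residues, and every row of every power still sums to 1.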
5.2.4 Measuring sequence similarity by use of M^z
rl,; múss rh, e!úry ora &d, À : llhjlh n4 b, dlrc'd rfÒm',;, ù . úd , = aLàr (ii 'ine r) h úù ll, ,v;ú cr ,!r;. HùrÙ. 'fù ùlNr of qurrily isro' syùftú, n dos norhkeóhaÍce ) n: (saq,neqcn!! d1hù unduryùs do rcoùtr( ind irh.s ro Ne rnuxiDLi!{ior
ù ru' ù, vnhoú +rrtùg
aùÈdi3qnù' juibydúr.r. rr ùùs ..iú a i"orcorrlrsqrenes (r). nre (/) onryhy lha : r rheorrìss
Remmbs 'rìd LÉ{ pi = r îd :,€r hqù.r.ics. Thc rrcqtretrc'in ùe nùme
Md, = r i henceo i5 úc n'i or 'ùo
Fofr Eqúrion(5.:r)n bììowsdúr .o.,>|:,'EPhce\
Theodd6naùir.,issymmdfkd,!bi.bisshownb} -
5.2.6 Scoring matrices (log-odds matrices)
nl\ini|ÙìtybeNe0msqufl(Ú4
II îE ocÀrÉ orsinilùity beNeer*qùetrcesdes no' ieel ro be(o is noÙabso rùr, burrchrive(Neo*d ro k!N. lor trnpÌq rìli q n nore sinìlù rÒd' úù ro dr.rhis rchfvnyn keptiftr tr$ úc locùj'h'iof.(4,r) inr.ÀdoIr(4,d).as iìil,byaddinglhesoasfÙlheaìign.d Ésidus. i6Ead ol nurriPryins:
L'A,tt =t cLt4,ò =Lto."o4t,, Mdi.asolìogod},14,//l-dri.'t
R'h= \as!t! f 250PAMmurripricdby r0.
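The step from a probability matrix to a log-odds scoring matrix can be sketched as follows. The sketch assumes that the odds are O_ab = M_ab / p_b, with p_b the background frequency of b, and that scores are 10·log10 of the odds rounded to integers, as is conventional for PAM matrices. The toy matrix, the frequencies and the function name are invented for the illustration.

```python
from math import log10


def log_odds_matrix(m, p, scale=10):
    """Integer log-odds scores round(scale * log10(m[a][b] / p[b]))."""
    n = len(m)
    return [
        [round(scale * log10(m[a][b] / p[b])) for b in range(n)]
        for a in range(n)
    ]


# Toy two-letter example; p satisfies p[a] * M[a][b] == p[b] * M[b][a],
# which is what makes the odds matrix symmetric.
M = [[0.99, 0.01],
     [0.02, 0.98]]
p = [2 / 3, 1 / 3]

R = log_odds_matrix(M, p)
```

Taking logarithms lets the score of an alignment be the sum of the scores of its aligned residue pairs instead of a product of odds. In practice the function would be applied to the matrix for the chosen distance, for example the 250 PAM matrix, rather than to a 1 PAM matrix.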
5.2.7 Estimating the evolutionary distance
oltr'joiùr diihú ,. rhisnatu rbarr mbrioB ptr 100 *(,-"à*')
i (ir 4 is u riftror or
rc5.5 (RcFcnb{'hratu'irú!ù,ighf iúàr iiú rheo'isùi lnùo !!id ) tr \ ronid (by ur oI lqurion (5.ó)) 'h
5.3 BLOSUM Scoring Matrices
h òc DrtDf moderrhesùiis uìue on ( leel) h!!r 'hsdn( ddvcrDcdiotue
Bros!MscoRrNc MA]RrcEs eachcorumiii c{b bro.krh.yconnr d dr tr ,0) = ,10 dirìccdP,i ,rignmq. ofn sequ$es mlkèi ún(u -t)pairoramiiÒnidsFofrhdriBrbìock rherminorid prr (úr) (ior ùd r./, =
'=IXL-..e,. vheE > ù inrerpErdrs r ronrod.nngoverrhemi'o eids (fù qoptq . t. = ,,.,/r Ge fre'rucncy ÒrobsLryc! prn9.
s r0pdfs.bùc ri
= rìr.
5.3.1 Log-odds matrix
îÉ ob*Ìvcd furrfics ncl bc adju chmc.. For eeh (.r) rhe*p.cbd plo ! tii
> a.,, rhe obsfled rftqù.ncy is bisl{ rbù ùptur
! t?Ì < .i!. 'he obsftd
by .hb.q
t .4/' = lir,. úich indicfs biolosi.àr Dcqùriry brrÈn
rnd à " ''th''Ir1n'.$d'|fmÚ,lp3i|'''. ùù ùe ahvryed lEquqrcies arc cqtut b thcf.qùqrì.s iI the .dudt pqrtúiòn. Frcmlhistheexp(tdprchabi]jlylhr
r rnrm n.rd a o.dd 1i!, + lÉi
,i! rrtrsi
,h", 1.L"4, h."
h y tudn{iieÌhePrn (.ò) ir . !,,t, - p,,pt,+ tur,, = 1paù,, tó' t + h.
=# ùrdd
ùd tbe dped.d
= i!.
tÒ! c&b lnìno eid prn civcn 6c lbow ùr
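The construction of a BLOSUM-style log-odds score from counted pairs can be sketched in Python. The sketch assumes the usual procedure: count all residue pairs in each block column, normalise to observed pair frequencies q_ab, derive the residue frequencies as p_a = q_aa + Σ q_ab / 2 over b ≠ a, take expected frequencies p_a² for equal residues and 2·p_a·p_b for different ones, and score 2·log2(q_ab / e_ab). The function names and the one-column example block are hypothetical, and the clustering and weighting of similar segments is left out.

```python
from collections import Counter
from itertools import combinations
from math import log2


def count_pairs(block):
    """Count unordered residue pairs over all columns of an ungapped block."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds(counts):
    """Return (q, p, scores) derived from unordered pair counts."""
    total = sum(counts.values())
    q = {pair: c / total for pair, c in counts.items()}
    p = Counter()
    for (a, b), value in q.items():
        if a == b:
            p[a] += value
        else:
            p[a] += value / 2
            p[b] += value / 2
    scores = {}
    for (a, b), value in q.items():
        expected = p[a] * p[a] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = 2 * log2(value / expected)  # half-bit units
    return q, dict(p), scores


# One column with nine A and one S: 36 A-A pairs and 9 A-S pairs.
block = ["A"] * 9 + ["S"]
counts = count_pairs(block)
q, p, scores = log_odds(counts)
```

Pairs that are never observed (here S-S) get no score in this sketch, since the logarithm of zero is undefined; published matrices are built from enough data for this not to arise. A positive score means the pair is observed more often than expected by chance, a negative score less often.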
5.3.2 Developing scoring matrices for different evolutionary distances
wiú ú (eroluriùiùr)disúi.e x, ùe lNìd6esgmfl'psaf*pondi.e
ùc *srìrd\
cod5PÒndb (D %):
rheorumi.Ò\ Òtan amiio ùìd gairrmm 1s?,s3 ) shoùld&rí mudì fs. îis is dom by tdkFi.E rhesgnùN jefufP4ceÎbgeiddiyrcgrcuFrinbone
eÍri9 ro oru of rhersnc
gfouped(idic ùdc a tm% idediry), aîd riotrsGù sgmetrh hxlc Equrì{eiehr). br .nh (4. 6). rhÈ E$ftitrg bìock bùÒne\ (i,:r.5) ( F
'r 6r
sp{aEry, ad rgn.ns
EsR (.1.6) roserhèr
úe oDrydìrcc symboL. lid rhepair f,qErcies
bùonù r{r
3 ùd r{
= i.
2. Findrhebr$ks (wirhoúgrrt. a Couí úc e.ùrt!.*
or all plin or snino Íids
5.4 Comparing BLOSUM and PAM Matrices
s r]y ùs or chdre en'mpy(rrìm inrima ,ir thcory).ir !$ b! niùiJ rlú ?AM ]trrfnompdjng5elEissnnrlwx
ù. ld!r'i'y $oriîg mdrix). rnd 'heq.'
dd ùè risrìmm' ùd is jùdcedb b.
5.5 Optimal Scoring Matrices
hydsphobi. Egmqs lssndr
an ovùrcptserhdÒ! of hytrrcpbobic
mìm ridt. rn rh. úeoryby r(artinaid arM[ut 09e0) (sÈ sdioi r.!, rn @.i ch$smúr pdru(wiùourgaps)is dfletoped. Ldse lvo squùeshùeúc beksDurddisdbùrio's 1p,Ìtrd 1p.t.Espsrively (r, a ùè lièquencyrorrmho acida). x d sqúcncei(usine{r,t,1i,)) úc o'no rids d, , re irigmd hy r tuqErcy apprcrbrds
hù = P'nÚrRú sÒFùsEuriotr (5.1 ror'nèsonrs mrrix, *e ser
6rlùwiDsrhecomftinb oi I (sc.tion3.3). ÈlyPs ot segmeúvc m s.mhìig for ùix is fomd Eirs Equlrion(5 3)
rhed |A, n I L, M, q vt shourdsc@ bsr we 'lìefrG dÈtìrchther îlil'trd .id! rtraotoron\e6.
5.5.1 Analysis for one sequence
n! ntuù ù E{ùarion(5 !) .a detrú:
( 5e )
5.6 Exercises
(r) crcdc r rcord Phyrolenericrtoc h
úbbor/d,(6cnlnberorhoúlotrs bd$eidardr) (b) assù,ic rhd Ìhe ddn€
jbf drd,, thcn
(.) usc ú ir 'o cd.ùlar rhevdùs dr rhs {b$iturion maùix ,vi,. h n n{h ondi'ìg1ùa Guhfìrùioi r!Òtue andrhediùon sftspoiújie .o E ({hdnúion Ìo E) ld) txffofm rh. rrox ,ir!! ro xtr oddsnarjr o",,. rf voù hrc 0 r dÈ M.i,rlEjr.lhetrúùecors!{DJjn e enedr úe rhote ri,l> 0) (d) TÉ6fom ó,, b a Ìos oddsrì!ùr /lùr. ard nuìripty by ro
() corsidd úè úluès ró! h.!e ù r,,, (a I ,). Explrii by trsiilg'bc ! uls of 2. ùìd ,r 'rìù i' h msù$rc
(a) où dÉ {eb yoù {ìn fitrd dre}LosuÀl ronùg natices (lor exùì ple,r h'a r i.ì n sÒs 0ig r.jpmdr ùloey^wkùch el.ìp htnr). cir ù ft'heso neror(a.,rr) iÌìd(r dr)rorsl-osuMa5rìd BLosr-M
(b) Elli nìat rheevoì'rio.a.ydisramein FAI' bdwÈr ({. ,r ) andlr ,/, NhcnEqrdion(rlissed. G) Explainùe shxpeor 'he.ufle in Fl (b) coisider t00 PAM.rd sùppordì plobóili'yo|chmei.e^SÙme
(i) dÈ Nnbù of fsidùs úr rìan mr .lrùeedl (ii) ú. iÚiml nùrbù or iesidùs rh (iii) ú. diiiùùn dùdbr ùrrlidùes ú
5.7 Bibliographic notes
a liq rGmaúle Ìo dr aacH d&$iridiotr nacc s'.irhfJ snih I r992). a nerhodrù
Gmì\ !o!cnrs) io
( re!,ó)andHeiikorad }rè[i[oÍ ( 1991. l9c2). A Gpid nè'hod for senedi'E núúrion daÌaù,ùj.ds ìr r loneser L (19924. .ùs{d i conner cr d (r992)ind BelEer csir inatrcbut(teel) Maùj( rórpùn of rmino dds ir in conner(1994).au evlluarioror s.ÒnngnerhodoÌosìes n jn rohnlonind olerinc'on ( ree, andso
Profiles oî a ptoreìrhúiìy hs h*i routrd,onccrtr
dlhbÀc ÒrÍqbfts Thelanù Ms beD sho*nÌo bc supcnorin d.@.ìrg wsl p4ific pmpediesin rhesarch.A nurbèr of
ilrotfs.Îiòsindùdc.,,1",![sa@ PosrioniDeinc *orlng mtics (PssM, o. wight mrdc6, rhesearc mÈ rruc'èd tiod mùr,iprear4nnmb dd ncd .oìumN. cips de not dscibed a pad or rhcPSSMxndionúny no eaps
rorm rhedecE ot onsÉlior ir r m 'ion ro poiúor*paidc rors (p.sirlon{pÈifrc) goppendtÈs'o b0 urd wher .oúpÙiiglbèprc|ììclolsqueDce mu\(preflsTlrsgaeún54cdtg] Prot {irh a pmlìleúD be fepgenÉn s r tvÒ.difrÈrsior,lmy, lea denor€d $.2ìlyirglheposidm:pecificgrpp€naì.ùs' FmnInuì'ipledignmflcùeblo*o r alum a (Àor-) h rhe$oÈ or ali! nsiminorda (ton ! *!rùcnc)rorhe p.sitionr or rbepmfile,6 sh(,u ù Fisut 6.t
6.1 Constructing a Profile
c membds. bur rhùr jrù (Drcb nbry)exir uúnÒv! nembss, hov sh makea bd posibh d.sfiplion of rhe filesNnhbasÀinprcfi tcwcisbr{úompson 4, r994i),$hich ìs ù ùrNion or i uedìodhy cnbsror sr r . (19371. î.! wc d.rìhc u eirenson or BLAST (PSLBLAST). atrdaìsosìrc r bnd ioftomfl b
6,1.3 l *. io' frmirùr ,ih úc uef\inc
1. Tric bekgîotrid (lPrtu4
Eddpb itDÈù
, F3uÉ 4.r
5. The divùsiry ,rd rìmir{ùy of úe seqù,ffutriig veieht oremh sqmr m. knoùi
in rhc inpodb.è (or
difrÒcit ndnddr ro! pójììo connrucùoi.
a posnioniD ùe sliemen, md I b h mioo aìd 4 4,
aninoeid ìn posiÌion/ iisqùùrJ' úè rudbù ÒfEuftnes ol ìds, dd r (a sco;Dsmfnt
ùc wojghrof squcn.c t Gequeice*eigho
úè !unh{ orrcsidus (notsupt in po
6.1.2 Removing rows and columns
lhenb€.fÙtrdioiÒlv,,ardRh'fora .orums tueindependèddfcarh orhfl. A rcrsoiabte fuicior n ! ìin*r comr,inarion
{bùe 4b dÈpeidson î/, the numberofù.ufti.Òs
of à) rro conrfaitu nighr
t v., = 0 rof Tr, = 0. md
o us psùdo{oùnts(G llb ;trùrn $bsÈ rr mur ùen b. d{i,rcd hùe y,,, rhouldincExs widìimasiis tjr. f,d rj ,nd rr bc'!o imirc a.ids.]trd î.,1 < : lve r. 4!i'ErcrsesprcpdiÒrdry{nh7.r,Nming
v., = !:9
2 v,,, i'c@ùr by norc rhan?;r. meaÍi.s
''r,ú' ílrcÙed,'oleeqMdonsdslyij'gdús
wì,icrìù iìrusrfcd by curu 2 i,ì Figúd 6 l 3. v,,, inljlsÀ les rh,i ?; mcaoirs
d! lùn 1$ chapÈ t. r sinph ùrMiE e'o Hcùik.ffúd HsrLorf(ree6r.
$'irniied d ];d/ar, bú'hi
do$ trorI
fMr speriNns dn , = J-,t
Norcrh*fo.Eqùarioi(6D,rhoprchlemof noio..ùfir'glmioÒ!cid5rrl'tcr.aE
6.1.4 Sequence weights
rh. seqùeùes .ù & sio !eish6.tDJt. rscìptaiodù suhfdirìiI .1.3. ad d qPfesÌon for rheposirioi M ghr y., c
.-=-E; Íl'.'r,
fr tri =r. l0 irr +ó.
= r . rhù. rorRlrt, v,r, = îr,/flr. vììi
3 a gapis rov.fd (50).sift n|.ùÒgaPdtnìÒnFla|ryis'ollo*ùed csdiru Ìbe p.ùìry l'o! ilfrùcins
rhr slp Èosri
6.2 Searching Databases with Profiles
veni.aledge.u\\hò!.jnFigÙfc6!'l.rù prcÀìeenditrgin posniont (tuorj) rùd I sub{qucn.cGub\ùiry)o./ endinsrn posiriotri (4). we herc pt*ùr r vri úÈ lai .olùm is (dr. ) (DÒr! Eap)
a:= o
I4 '., tul
l,{ir,(4 ,". d k tu p.ùrry io!I glP in a orb
6.3 Iterated BLAST: PSI-BLAST
itrord BL^sr (PslBLAsl) (aì's.huL c, ùr r9e7)ftc nlii jdcri\ro fiRÌtrs," rhen.ìare RprchÈor 'hc (irrs (6c rouid
?,= ÈLAslfa.a ,ù = Mùldptà{ignmq(
0 is 'hes' of {qucncsrdùd ii 'hc$dch 0r )
q ì- Pó6rusc{ch(r., (Redù.e(q) - 0 ' ) o' nùnnnr nnn)bet unrir.d,rrl,'r ol ttd{
* x Éusd !c6!D or 3lPned BLAst ploiìh Thc PfÒrìrcP h* r 0! fù qc
Ndc 'hd A ù fte ij6' MeDar
or 'l'e
6.3.1 Making the multiple alignment
selnss in 0 vhkh ft Gxedyr.!
(, aidr{ennk6EteErcaroÌbeexinìpìe ) . ku!!rrbifdilxrp$i{iùNotrLsidu
6.3.2 Constructing the profile
1leprcfikìsoshded':]dig'ì.p.ì Ltl'ì0]eignmùr'Thrsismri,.dúJùgh|ìè
ù'MliscoisrtúcJ rcsidueiIcolÙnÙ.hrchjnedin]ìr.
ffcqucicic\. {t I (for eaclì.olùno), aE úìcùìxrd b} ù,c or ùc obsc^qj dlhesqunlcloghh.&qreNeweiehlsft
bucqF\n'ingorfiltjtrdcpcdÙdy úe dismenÌ i.4 is 'hmn'! mdcd arin rd mjno!!jùs (ú! udirseapq,fdq{,
rhe rinarrmfiie Y,rù6G.ont {L or 'he rom be úalP,), qús. pii i hr d h be iù ù\e .oftidùcd polrion (rììe p.sirior I is j prico N4unll} ft coùtdN drcleshÈd rreqÈmy /d b òiìîc r!, Ho{cxr rheiùnìbù oi obssyr onsGeqùì..r ii th! ,isoncit ma} be y@r ùu I rado
hùer'ùr inp eneDcd Geesùsùr Òi ó | r) ùìm' & is cal.ùrr'cd r'or erìì ùìitro riú. sd É Àsagcd {ih , h
,.',,=n,n'è&, (seEquriotr (i.7)itrs{riotrs.5). Nc ol,+p1). ,
6.4 HMM Profile
fiidden Mekov modeh(m{Mt ììa! pÎoÈinlamiìyrmlysis,llem6'om.
vù rìighad hicher'hm 'r. pbbabiìi'yof i' ÌtMM eddtirg
a (qùerylseqùetrÈ!ìd pÉdicrI À r úenber ir rhepiobsbilry
6.4.1 Definitions for an HMM
sú pú ol ff$ 16 N ar
lud:fe r srn {dc 7òmd r rop lrc î,,. For (4, Î, Ìhe€ is i probrbìli'yof mori'ìgtrcn fíe 4 b ffe Îr:
ùis P(i, j)
rhc pDúbini6
rc FEFd6
t P(i.,= r.
6d Èn b€ újud
ìl irc
wb.Érr = îùatrd4, É 1; we $srm lmbabiùyúdlheD.nicdffpfh'is.ho*i .rtt=
f \tt
É4h$bcaremirrsymbol, jroùa
{e ds paladen
ùf .s bc adj'ft
0 < , , ( . l i ) < t n t r r r dm di ,
P \ q . D= P t . n ) p Q n t) = f ] r , ( " ,r . r f J p t a r t a )
6.4.2 Constructing a profile HMM for a protein family
usxrl
oft rrÈ wnìì r 'ìulridÒ rìjsimen' of rheseror s.qrcrcs (or a rrmiìy)
reumìred in oì IiMM 'crin3 ,rfdrrr' ù\d rúcs a mpuri cr or (idmrry risn.d) mcnbs rqù h Ò sro berbìe roco'ipd Gri!tr) I sLquei€ wirh
q1 {3 qlq5
rú Èrud,
! wc4 !ùNù.c
uo rrertrh(ri.kù*).
rid úc llw
!tuì ù! &ftsùrù4
moE lnùo r ds (moE dún oft is p j,\.rud jdc k, n\do, rhccnrsiotr p
6.4.3 Comparing a sequence with an HMM
nh im t.tMM(bnrd lhdúrhe quence B Hl!r\O. one mnirrLy iìndr onuÈù ,hr hó ]ùcllsLrîr'hcnÙmh(',]lhÈhc'he cfúfdeeldi'jgdslb7]edgm{di.e D-'./ =
L a . L P L i , i ) P ( 4 r L1 , , e pmbabiliriès , r.r JhE rtrsiioG "(i.
6.4.4 Protein family databases
PFimrhesqùÒn is nÌìpdrd Grisiod)lrh crchIIMNI d tanily nenbùship inùy (ù rhef imrhmenhriona sor i5 .rL!ùhrcd) Ds pRosrrE dùhr1e Ns i(tri pfónlr dd lcqùqr ndrh accchiprerr) thdi!xloydscrcLriooship M: Gs rheyùe d.5dibdl hùO, Fisúc ó ó
6.5 Exercises
y,./ v,c ror úie dficur
.qariotr inf ciìcrìariig r,!
(a) Mik a pbrìh bs.d òn ùi1lhcn rheq€ieh'irs À siditr b úc qc sd
(b) Do rre.d.ur
otr r'orrhe*cond
'ioi (6 2' fÒf rhuporirion weishr. ùedùfN!ofrhealiemÚtdom'
vnh 3 = 20 for sips in dÉ $qù.ncd !
s ii q ù, no. neesdily 0 fb
havc ro bc r
rhc nq' .yde. L\plrin vhy ,,''..;|
ÀFr pmbxb'y;, aúdv$r i. mdDad Eee,i wh",,.,hr,u...
b m{ù sb@i(tr rr. i b
d.'.1, shÈ.ird i b rhe'rdroù ú'!
G) conrhr in HMMm.rcltu 'hca srìowù (rù$!!hrd P{hrhmùsh dr n
6.6 Bibliographic notes
nhskovd al (re3?).cfirlkov (1990) d .;nbiLoyedvcfcrijk(1ee6).È.fi rèwci8ùnde$dbediiîìùùrplnd, (ree4!). urryd.l (199.1)rdpSr.BnsTrajrchul
al.(leea).Ditclirer[irres ùe dsnibed ir renùofr ald ue,ìikofl (res6),id sjpLndser aì (lee6). ú6tc\ n siri ii Duóir c' al (tee3)rnd aidy ( 1993). PramisdcKnbcd in sonnhùmereral.( I997.1sssl.
Sequence Patterns
rnonúúìyric)i0rend'otrs,suchI bin ù Grd urrLJ, I n'Ìrd rnn Jnhd prdl iùin ùc dorù Gome'ms ared a Do'il).
i\rrìdìrmyhdenLmormsiir irir_l). nhileI rcrr alìg,ùndtr Ìe'hodnig|l ndd Édibi'ìA !ì'niltuìriesanoi€ sqoeics (or 3lilmiy),$chbF6ÈudPrcireHM
s t F ( l x ( 2 , 3 ) - D .v h i c hi s i ù r r d b y a !b)sèqderù hesituìirssnh Rdr( rhr
pfup.nis'lì!q!mP|e,PRoslT!gi! rdhdfics (r erc$ or eúyùs
shì.h Ìdifc
biÒ$dhc\ir) rf e ney quene mf d rc !nd/o. fundion catrbe anrtlsed frd Òic
prMs (.rc nr och rGi,a tinìiìy) h mor cm.iú' nd ar$ pnÌidcs a morc {isrilc 5J $rdfi c ri Òî1!mrlyme P dN de$ritìitrs bioroeìdr\ mùliEtul similiriries rrc úLhd ndi^. Mdii L0fshdtrdDlolÈnjcsoflhgoÌei|'orlù dÒs.nb. (nÒikjrirr) f.rù$ $mnon ro bioloei.rlìy Ghdúlly ù roodiona y) .ore!a6loeviìmtgh.'lftitillfrÒ'if.
7.2 | Dir'rùcn, r-34,34 lNde ùd ir is 6suy
(or roml|nns) fo ùrerch rhf pùr or r *qùù!c
mdcher I l,drem )
Írens(Hofmrnn errr.rrrq) obcPRosrrE yofltuprerjnfanirj$.)
7.1 The PROSITE Language
pù.i'hqe\ 1 1 . ForexmPìq raclr
Fordample, lcHl da.ds r$ hy rdi
asin posrion e ìiscdbdN*n {f
j(r) cotresponds 'o x x x rtrd r(1,r)
l R ( ì x ( 2 , 3) t D E lx 1 2 , 3 ) - r . n-x-rDNsr-rrLFrù tDENsfGttDNacÈF0-{èpl-rLrwc) -tDE) t.rr,mt rDENasîaccr-x{2)
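The PROSITE notation introduced in this section (components separated by "-", "x" as a wildcard, "[...]" for allowed residues, "{...}" for forbidden residues, and "(n)" or "(n,m)" for repetition) maps closely onto regular expressions. The following is a minimal sketch of that mapping, not the PROSITE reference implementation; the function names `prosite_to_regex` and `find_matches` are hypothetical, and anchors such as "<" and ">" are deliberately not handled.

```python
import re

# One PROSITE element: x, a residue, [ABC] or {ABC}, optionally followed by (n) or (n,m).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern, e.g. 'C-x(2,3)-[DE]', into a regex string."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.match(element)
        if m is None:
            raise ValueError(f"cannot parse element {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                      # wildcard: any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # forbidden residues
        if low is not None:                 # x(2) -> .{2}, x(2,3) -> .{2,3}
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)

def find_matches(sequence, pattern):
    """Return (start, matched segment) for each start position where the pattern matches."""
    # A lookahead lets overlapping matches be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]

print(prosite_to_regex("C-x(2,3)-[DE]"))
print(find_matches("AACKLDAA", "C-x(2,3)-[DE]"))
```

Because the regex quantifier is greedy, a flexible wildcard region such as x(2,3) is reported with the longest spacing that still gives a match at that start position.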
7.2 Exact/Approximate Matching
icr mrchiis, rh. wilrio!
lrrów.d ndù(;b.
). oi Ììe o'htr hdd. ve also e dd dry :pprioÉery (o ù ednnishrù or r) nfó rhemoe 'p€cùìied
[email protected]È,
xpÈfxinmely (lo.dn disbrce r), andcxE mdch$ r&ee ofrhem exidty
7.3 Defining Pattern Classes by Imposing Constraints
The pfrcms.b
irk, (ovedappitrs)pxtr{n d6sr
depeding oi rbd
virdlid rcgion h/dd iJ ir h,s rid lmgÌh (n nfdìlr rúino lcidr in a sequerc), o'hîwi\c n i5/rsiói?
by r iìxd dobq
(vhùc vidcadEsiors.anbeorìererh0). TbcrmgúlscÒrPRosrrE rms ate iroNs rb! fcPrùtoNot componenb, c.g. lDEl1r,r) Thh u úr hecoosidqed húLer
rR(l I lr,2 )- lDrl -at1, r) r c. (r). îrì. rfr 'eo aE anìbsuous.'he ta{ 'so
lqibÈ leeio!nlP|$ùcdltr{enebdpel rhr r (2,1) hA ndihjr4 3 (Nc r4hd
parems T.e/rdi,hj
of r *it,l.d
i fisd vitdcùd rgion ha seribÍÍy I )
7.4.1 I
. {hf l]peorconpoiefs(6rd. mbisuorl, . rlì!rùnbùor.orìponens(1. lof mofe), ! vhf ryp.orRild.ùdrcions (6red.nerjble).
a t s r l x 1 1 , 3r c n 6 . Mdimunì (dd minimum)iunb{ o. lonpndl . [ldnnum (andminjmum) lcqrh otcrlhoìnFoc
! Mlrifùn (rord)nenbilnyof. par
7.4 Pattern Scoring: Information Theory
Information theory
rhrhir4 1r,l.A\ùd!rsynboltd Ljor
k t4. {rft eid, sr-'ìborhis r hrlsNd
1rr ,r,o/trd,h, / (d) ó$úkd
t! = L.'hcn
1.1.2 |
ftc obpy
io, ! ùù4
I crch.
I(1) = 0 (no new knowledge), and I(p) should increase for decreasing p. This is obtained by the following equation:
wcrcún/G) = É Nheìr. = 0 w be ds{r
ard 'ins
(he ìiw or kfg. lumbert. MiddÈ ;romdion
$$H(p) = -\sum_{i} p_i \log_2 p_i$$
r){ synboì
= - Lo. rstu
Suppose we have a binary alphabet:
. p = [1/2, 1/2], then H(p) = 1 bit.
. p = [1/4, 3/4], then H(p) = 0.81 bits.
. p = [1, 0], then H(p) = 0 bits.
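The entropy values for a binary alphabet can be checked numerically. The sketch below assumes the standard Shannon entropy in bits, with the usual convention that a symbol of probability 0 contributes nothing; the function name `entropy` is ours.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; terms with p = 0 are skipped (0 * log 0 is taken as 0)."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

for p in ([0.5, 0.5], [0.25, 0.75], [1.0, 0.0]):
    print(p, round(entropy(p), 2))

# A uniform distribution over the 20 amino acids gives the maximum, log2(20) bits.
print(round(entropy([1 / 20] * 20), 2))
```

The more skewed the distribution, the lower the entropy, which is why a conserved pattern position carries more information than a variable one.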
ntm ìn rhenuEhù inbsrl I0, ìocrt.
(dr'nhúnn)otrhcàrùodrs.s.Òrùsr Lt lp,l b. rhebRckgmudpLobibilii$ TÒsÒcrposnioi',i's',Jb]?tr',,,d ''ìtt È|ita
ltuù . lt ùJanrìraa.idr 5ddùcd ù E!ùúòn (7.21.
riri'yìs2; = p,/r&.rree rr n'bcÌduúurddrbùketuMdpmbnbiri,icsor'hc
t tr ; (dÍHse
ii ui.!nin1y) c,i h. nrÌr
L.rd by !$oiEq@'iù5 (7 2) ùùd(? :r)
nx,) =
\ *r-.\;
we se rhd i1 & = r, ùÒo / {r,) - 0 (N fcdldior itr utrefai0ry).andtrtr r., = ,L G ú or oie anìodi.ìd), /1f, b{oNs eluxlÌo Eqùdùn (72) wi'n rcgion' (ìc\\ rpt.iiì.), $'1om My ordoì'E d,isn b $ore awirdcad rqion r (whea 'hevirdù,r 4ioi is spùìîed is r(n. rì )) ó
coi nt
Pú'iig €qùriom l7.a)md (7 5) k,!cù
/(Pr= Ir'((r) -.Iú
o'npo.enE. rnd I olrf iL ([cib!e)
Ld (ii rhi\ sùipro ,
= lac.Dl. aùdpn = p. =
tl?)= tle
q^.t+ ttto,1, + H!
= :r(jbej) + r(j bsj) - ,.' 3(jhsl, =-,,oc,+6sj-Ì
7.5 Generalization and Specialization
drcoy n ! surdsr or Lr (i.u andv m tuiÒi |n,r rhestunerpúsr.
è x1r,r)
lcDlbaseftnrtariù.ia :(2,r)
t c D l i s a s e i e f l t d i o n o r a - r ( r , 3 tc . i(r,r)_c-x
a pÍrm
mdr Í 1tucndora sqùùìlc (hDufl enì'zedpdrm l,u' rel 'rs orhd).ano' ( p a r d ) ! v i r d c i l r c g i o n :xa( 1 , r ] - c D c ù h ! s e n e n ì i r d r Ò À , ( : , 3 -) D {'!urrd,,,tr'heinrAcdfsùflìrj72rioD.h.!cÀ xi2,r),c x {aRDlis
7.6 Pattern Discovery: Introduction
or L (bú' ùo 'hc hktk nry ror bc biolog.illy iìrÒfmdlc)
$Li,óDryù c c.tPÌ.n).ctsDEc! nn L1!r1Fsì.r(t c hswss PRorùse
D,al0,2_ r rorl.
a r 0 , , ) - t o l r . i n d r h eo ( u f e N s
siiìrìiritìe\ b ùe ol úc sgreft
d hish{ rha! I gi'àì 'hPshold(or 'hí 'he u. hrgbd) A$oush we rc m'idlv irrm{cd in
'-. a-"'"r p-br-, 6"d mmnd simirùl,ieshd{Èn i s' or ÒbjÈb. :"., "r,h" Ds.rìbing m.úodsfor pú.r Ginihnlv) dis.overycar hcdoreions diflÙcnr
o d
'*g',*'o'""" eetreritniotrs or ùem)úo o(ur i! oL\ersqucnrs
F"e-" *'d, *",.m.c
n jt cioughrhd rìec areodrtrcs
' reisÌ,t-ùnre dord$
in d Lcd r (t bcj!8 r coNd) oi úe
*r pxtreris(rcquinrs'hd rhesodress ot rhepa.tns cro bemdsDrd)? ó Is exrctor lppmnmdc ratlììng ulddl
7.7 Comparison-Based Methods
r. of objeb. ?situne úmparisù is inc basÈopenriotr,citherFúNùe bcrwrn objers,b€'vei ú obj.d md a sjfjranry ds.riprion (q!. alisinent. or berw4n rqî simirúigd$lnÈìonsa pr:Ni$ rI (plYd). ure me objd' r\ pid. i obrds î. rarrs arcdÈi .orpaed sodú' Òncfiidihc I,rrG) orrhepìvú ùar ha sftìrlri'ib in anÌhedber 0bjès. Lr Gnsr procnssna). srd siú Òncobjed,ùd mmpue$rssivety rhcóùs
TP (tr. pm8.cs d. armpre (HùL'n ùriis i (po$ibtyihpricir) fte woùe 0c .l1|(rll-byru). Alì,i(, r)/2paìsisconplnsnsrep. omdlúrnolommon priíis( i mjhn ries.conplns.5 (e.g j'rc^dotr) ire doie onrhÈscrcslb ro fird simiùriesorùriig ii,lir sinpk. r orft objersG diqreofsihihdry ú Èsue 7.3,vh.n rb obj.clt rc Eqw!$ Ap:]n*isendhÒdsúnbedtr'ìdèd Ndc. how4ù, L'ìr'r d6cnpúon(pf'cn) òr ùc rjmitaìrjesis iol !{$sadìy givenj tùirì* Nnhrìì (orr , 1)oúù Òbjcdx,a|d 5i'iiLùid$À\!pdrrn.sldirúcrpan
pfs eîh onhe(oneor sercfrt)bN roúr
'heuÙl'jetc6gbyNiigùeLPleefueh' llÈùmscala|sobe.onpfèdwithpal hdfteD rx obje'5 (orsonÈmi mùmiùnbù or dÉ objsc) n i 6ed in \rricroù 3ld a4os (r q9r), wheE rheseqùem rrcflcúineBidùe sininiet ind 'h. drf N' sjrìiìri'ics iÒ' sh@dhy x ì obieds. led i3hî8i.iw (Tmsiriviry dxnl\rfs . npri4 rrd ,4 is simira ro c.) Hoq4
7.7.2 riFETlllLrdiÀúbqPnociPd!
7.7.1 Pivot-based methods
simild'r*'hlnajEimplúÈúfjo' s ùr rhùcróto !$d. is itr Royrbfs os92) @round(sijnitù'yúc6nrdbyúrdif an.e)
tqsl r(0,r)
H s Lor rnsl xr0,2) c H s-L
7.7.2 Tree progressive methods
memde4.h.ÈúEF{4lilgisilgrc d jÒinedin eidì cych. ftrj.iijns\dùc L.belledRiln pxrtro(t sen{ùziic
1lì f fops dìen thse n only onere, cokri,ìs
R|iPLcJ'rcmanelhodt,ysi|ljlidsnilh reeo).The so,ii!$rretu ir $ de6ied s I sbig o!.f úc iúrbr 'ii ! 14.r,!.r. i,i. È.r., r.!Ì. ùd irs std( i5 ùe \ ditrùùf nùÒ s Grp), nnus xrinc -!l| Ì
DFÎnicPlefuútrg;5[dfo.Pd'} ùrù j.! slD Fnrry Gùirg rlic Dp) whù rhcE rre srps ir orÈ Òrbdh of rhe
(!ì ,lrsEdnr pÍrni
. o E( 3 + ( 2 ) + 3 + 0 + 3 + 3 )- r 0 wirh $oE t0 .a rl$ be fou.d).
! arisoinscr,Jlciv*pLr =prnqFH,rlnh$oft(2+r+r+(_l)+r+r)
7.4.\ 1
, ariciing(rr.rlgiv$ p1r = psErs, wi,bscorc7. :).úd pr I
oory oneprúÌi (úd hishef s.orirg) : PROSTTE rir! fpJ!è
. tacl
7.8 Pattern-Driven Methods: Pratt
f |gf r) of ltu *qucn!$.
de if rhec m o
r dr sddqnr rcgioB.ú bÈrìqible (ù orhuù 0.otrsrivdy gìyjosùbihdy
x {1, 1 ) L. ii srìoriomÀ c-r{1, 2) I x L wildld felionsor|oso u aÍe
7.8,1 Thehlin
sú..hiDs, scfch for nexiblep cF sp€ciarizN.ioDcenbèm€'o, sp.lirÌi id in 'hù$fth Ge tr'ù), hd.he shndrr
r ! ù - l r i . . . . . r , t . ú ù ú ee q m n (+).Nor ùùÈroE rhr in I *ene'nc s. (Ttìcrhdlrd lilùe ibr / n 50.) ranrrdk,iiaprc orhns.h<,riú/i(i < r < /).omponcn ( ohf rcrsrhl). Reflt ùn 'hc terElìor, P
. ,t 'sùÈs,"faursc,m{súr.h , r; 's úe slÒll.*snÈns mÍóinc
hr rhDsqueres beMGRsraR andLTSRRU4 riq onnn M(;RS.CRSÎ S'IAÌ. TAR+.AR++,R
.,rriìaEprcprúasd ore{tùeroiris.hqer(whùhkdf
,i *hrch 's s.d rn 6e smh. rd! ,hai Ri is khld lot ottt oE I lllrcîaî.
7.8.3 The pattern space
,.pftm (
< /).!!drheenp
! Nodeìsiddliùirs thefde,
. lhe iool.oíais
n\e empl} pdlqo (
ud oie.ompore
haEempry siìr.ùd qioot:
(ùc Fof: chtdrtr
rqrb (rl andmùimumrorarflerìbiìiq. .ai
r t
! rcduE iof hndi.s rìl o(ùrcm$ of rliÒpfuú c tRo) r pnirdù! rìirjn3 dr sq@ne, ú'h ooftnces or q (/o) (i: rnd t is ii rheinpldnerùdon ! úmnon pfcdre)
.si rori :-irhdtù,',i+ln)do o = /, x(iì) i ,o:= F(Rr.0) 6nddreonrtm$or t a : = G l R o .H P ) n o h h bc ?aù'tt't rhen . b ù'de(Nodì, P.0,ro, d!) DFsn{odl, eì$rf 0 ,.J r.r risl, 4o@ rhèn
Q = o l r l a : = î \ 3 n . Q ) :H o : = c l q a , t z . , r t ) iî O is t0 b-'.jkh.La ù.n qùr.,roò(Nodr. 0, r!, ro)
DrsNodl. 0. ,p. rtu)
rindinsra,0(P) ! child-prrtnorr siîce0(P)isaexrisiorof/,'hcnBoi5rsùb*,or3p.r(ifr 0(p)bca/Èed ù,r. PC;rl {rìe'lìer arsonúclìs ?j,("). rhi n dorcby$iq rhe{rurcnce of p ù r (atqlys aPE6n.andss f the(i + r)ù rynb
7,8.S ,
,P = tArÈwÀ rrERjm. A..l.,a..{rrRmt. lhen rheocd,Eice\ trrchre, of 2 ù€ naked. ùìd
rhcn ro:ro &tur* t/r.fvw, .fiìcìeÍlr
Jbmd by kt oririoN
r c E L E v H^. E E R v v R l . ù (strú$ ol) ,p rid {lùbu5 où ., i .,."
r; rcq a\y r, $e rhf ,. in. alh,ór' ex'e.sio!o; , (r) = F-x 1i, j )-a j), andrhc (r < ,.ú b! rom,r by ùrir'gth. *c rn rh! 6rcdc\ù$ùo\ (i
R a r ù = B o " , - \ l t . JBaa t o , . hccilcuLftd by dyÌiÌnÈ plo8mnmins(*e
r i r ; ' r ' i > r i 5 n d n 6 ! d r ' o ! a n y t i . i () P a n n d b q r c i d c d b ! s q ! c n ! ! rr.bì'ì! î lear i sqùeie, dd ir "
7.8.5 Ambiguous components
tdmd*, c.3 P{(ir-b.Nhùbisr r(i.j)
0 , t ,| " " p i
r eireiding úe pàreFs b 6e fishÌ (bu' nono 're h1ì).
l r , 2 ) c x { 1 ) D a l d r r= l a ( c o D a o . H,a-\cHis usd (s* Ficu€5.1),'n a i xl0,r) c x1r) D-j. whe€i=trrKRl $d j:tNQl
r ! Pf'cm P ú \PmiaÌized 'o 0, ir i
7.9 Exercises
nenúityis2.sidlhe!Ùjrùn$}o ln) How mriy pifteùs dos rhk d*s coùhinl (bl cD r 11,2I DÀisrprrdiiùè
ri = rD = i. r,. = l. d 'h. *rcb (a)Findn\escoeorrhepfrm c-r 1r, r I tÀ.1. (b)Thep.ù.bìsc-x1r, r)-rÀct riJa x(1, r) tact vÌ.oqtrrry.
rhepaiNise lcil xlignmeft rouM ror Gr. Ji) ft
a panenr G) wfne fof sch aùsnmeDr = r ) 3 q r ! J r 5 +o5i ftc 5 tFisù! (b) chm!È thr hishè(*.on!g p m
(o BsLdor rhc$ùcs. d6dcvhrlh (d) Dù 6 L{ irsnùs,nd lúd 'hcfc
lo arie rhspdrcm Lhdyouhdc foùd h ruh of úc on!ìnar$qucn!c\ (0 vrè {heprbn Ì! x PRosr'fEprtn
n rPPfÒkh(Pnt). we haw ùc ùLphibd = (rhc : rdidm nùmber of Èrch in dìerÈ). th( , 5, 2 {4, c, D), 4,ú e P i v e ù : J ! M c D A c . r : c d c D À J ' : A DC . ' (a) DR rhefaÍch EÒ ùd rìid 6r lo ùq c*,À-x10,1) pxtrenì À-x 10,r ) t.Dl (.) rxo p!trds (4. ,l) cin bc .odbiicd ir / sgmen! mr.hins À hrt
ecuren.a nÙs bc < 2 ir ùc 5. EipldtrwlìylquúiÒno.7) is .orE dd É4irP). srcùfirs ,e!.,LrP)
7.10 Bibliographic notes
(r99qlMorc.boutÈccitr ùd nmuÈ ( r99?Júd stPhen0991) s de h cnshsìd mdd (!qs8),ÙdÙrmPhsc'i h'ìùlid Jonas*od,l.(1995)Scqtir aidrj@i (r991), io LNfctreeri (r991),Neùq?ld !! 0S96). dd Durbù(11)95)$dBuzmad is dscnbeJì! lomsn d ar (r9q5)ùìd Jona$oi(1997).Asuacyorddùmi.i (rgcs)ExamrrcJ oî.ompdkonbsd Oeeo),vùsorùda4o:Oeer)ùdRoyrbùs(lee2)lji'iDlesort4trúldlcù (or codbii.d) mdhod' !a in Neùwild a ath ( reea),romFD d ,r. (leet. AftrhodNù3 sd wolr.nÈrù d d {res6)(rofnndci!lcidr) rùissi ( 199?) i,itritrúl deraiprio! lasurgr $ ù Bflùx d rì (lqg6). Oùd meúoù fÒrnrding mdil] {r iì BìiDchrb d d. (2000)(for orholosoN *quencs) andPevaÙ ind szc Omo) .nd PaGi Ò',r. C00l) (}or DNAsqùenet. *rY n ùe Cibbsnotil srnPlq: hnP,1,{dtb'ffdsvorh'qg/gjbbs/eibbs'h'm] ds.nbedinLrwÒ 'r. (leer) dLiud.1.(lees). dhcove4isin vilo (1993).
Part II
Structures and Structure Descriptions
rn! tbr pDnii (mobiìr) md caÌirysisl vhile dhc6 ae mofe pisne shrcruml
snìilì ndo,. no\enrenb ósociitd siÙ\ fuidioi o rmaúÒn d /ò4,3a r,'dr (H bonds)
8udúdi{E\ b llfsc
1'oi|y2oleside.lrdi!cùfomhydrg.! ièldd ri !s /ú/!rrri. G poì.r nìoìecùìe kp rdr.][co.búaFlÙodcUlclrn dir, f l)lniatLy dì?4edl. The ùúìo x.ids
I $,i'tr {ús (4ueour eùimme . i.. or f ùìe hldrcgli boidíis i5 rhd 'bc (srob!tr) :îÈinsoBis'oflhy']í,phirj.ùn!!
(dq{ 'ó ùdividùl rcsiducsùr ro'ù,
(nesa!i\r) H-bÒrdscan b.lúned
p (N H) (posilik) aid .arboxyì sroup (c=o) be
i or dkùlpNde bndles (bords) hdveD ùc inrgml pús of rhc (ructuE (cdlxdÒrs),ich * ùe uinc rom ii Fieure3 :r. bal úiviry, ii rhr ùrevhole pmEú 1q d bsl a rxfgepM or n) h !ffs$fr. Fo *m'laJyrructuaelement(ssE q
rhc compd$i Òrsqucnrs h al.ìor shdùas is eìtheron ar& r.x/ (morly rcîduo or otr a @^! r4d (rìorìy ssE). c dìyidod idÒ rso l]ps, dep.idine on rhc findùgimpdùr.omìor
localGpÍial) sìmihn'ics.$ch asr!L!csù.1dbiìdhs
npoddr ùnirsitr(rclxÈd)rructuE' a NdeihriashduÉsnÈf]$n'ilì
rhopiÒreiB iÍo shdùml
Èon ÈdMontr à d (2@o
'lrì. PDB(Prcriù Dtu B$k, hlll://*e{ $b oqTindcxh.nDd Resarchco at ordoryfor stuúurar Bioidoflfrics (Rcsr) n ùc nlh pubrr ùúbÀselor pfoÈiD n& ùhireolqP.nmeÍraìlyddemi hdnsed'lefoxÙwirg.h,ptf.dcs
8.1 Units of Structure Descriptions
èús (ssE, A rrsncd n úc rn.ord or a
roPorogyn úe demeDÈorderrìotrg ftDp€ni6 ofùeeìemetrh.e s. physic
a/ ,Ìd?4 ncmiig ún 'bc dù{ript Òr b!5 scnPrion caù be oi .là? (r.') t\d or a
seenlè$ipliois*nilofspdfyinerherchitcfuftmdbpologyofpmtiis' For ùc ,ùc lod n n dooghy :PÙllii
8.2 Coordinates
'fhc turdams'.r ùreedi'ncnsiomì(3D)shdurc d.ùiptior cùr\À's .f 'nc , i$ enei tu 'be PDB GeeFisu€ 3 2) rhe Bon,ncc(NMR) NÒr rbdft 5hd b.Ùitr'jrlyinlledeÈ[jrrdo$'Fi
($! Fjsufc3.4).rlc ridcchain s sidc.hains n,rkúfivelyÉpftsúedbyllÈÙ|sideclìjjiron
8.3 Distance Matrices
.!rir (Eiro! ìmrgcs) (s
Fisúo 3.6)
x rnb.Ò nftir ùflriB (morc lhu).nough iilno,. rn oirù b indicf.lhí rior 1Ònlo ùud rlìc rîu.rLr ( fùr bxrd!d.d\), thiirr or d st or poii6. hde illp.i,wise (Eudidein)dis'aies iE nr{orerchpoLnr.e.s(ar.,,).F @srdú s.s ({o rúr ù.h pÒi ). aid ùnc rquÍiÒa cb be derìied on rh*e (otreror eachdishrel. uo*evd. rìy tfecùce coodÌre)
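A distance matrix of the kind discussed in this section holds all pairwise Euclidean distances within one structure. The sketch below is a minimal illustration using plain coordinate tuples; the function name `distance_matrix` and the three sample points are assumptions made for the example. It also shows the property, noted in this section, that a structure and its mirror image have the same distance matrix.

```python
import math

def distance_matrix(points):
    """Symmetric matrix of pairwise Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in points] for p in points]

# Three hypothetical atom positions (coordinates in angstroms).
structure = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
mirror = [(-x, y, z) for x, y, z in structure]  # reflection in the yz-plane

d = distance_matrix(structure)
for row in d:
    print([round(v, 2) for v in row])

# The distances are unchanged by the reflection, so the matrix alone
# cannot distinguish a structure from its mirror image.
print(d == distance_matrix(mirror))
```

The matrix is independent of the coordinate system, which is what makes it useful for comparing structures without superposing them first.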
. ,r,krf,n!e!cqù,
sm(!f h!
::rìr.doìrrrscunkiowNGri,i3rhcdi5hnlcsdisQ r)anddis(r..) sohitrsrhe nÈ {ay. rìd by tr\. or rtu dirù,cc dii(c. ,) i{htudn(a.c) \ sunùiuousÌiloì :r!8hk,h$!(l +?)+r(,' -3) =rr
6disbiù! nbtrordnbccrcquncd(4Ì-r0(Err
... r) wboDeoùs ù 6e oPPosÈ d
(c) !È. rts tùlEs
irc zhc dons. (b) ftc ibc
d numheror coofdinds (Ememba dú' sÌr 'he dkhùce bd**n Mo poi'ìs).
dìc I doms lpoií9 ùrc kiown. k !ù 4rculdc rhc Edùs from m orisin
r. o lrich fe simrrrt!h!n 6b rb cd fo
t) =itti+óir
d i ) t a ft t r r i
. rd rheongin(vhi.h m beroúd by useofl-igmrye srh.ofcn).'Ìì.i dc6iè ùù r x N) mfdr c of 'hc dcmù$ kd inrdm4ioi n NMR cucnnens, whch tudmì $nshims ind úc m6súd \rìucs. :oÈ drùi.cs dr b. r.ùrd (r'c!i, fof : pmbrerscaled rre oahií d úhon plobten(csP).)'rhc con{fliil1 ùar ìn nndure prcdidion.somedhtucs ft rr|osd. ard by usìnsÌhecoisMift (s lù NMRqFnmenÀ), funbfl dirmes Èd whcrbffrbcrusúins $( of disúnÉsun
rryiig a < r < 3 [ehd. dc.
8.4 Torsion Angles
PmÈiiis a sus*sionof poin6Gromt n '''cr r=Nr{i{r=N,+r{i|{J,r.. .hr lhey ft sp{i'icd by sddz trd schnnìer( 1e7e)6 La7 ror \{ . 1 jt fù' r{ ùìd 1.3?ror c N (i'ì xtrs{oùs). îltr rngrc5bdy lcn ù! r!, b]Ed: of ach :r
(in rhcblckbnnernd:ide dúint. ceîdrìly, ùÈ poinr hr fùùld úe bÒod5
abm cr in F4ùF 3.7.nÈ m!ìe r ùìd rh. ; rme q,{, ft oshúi bcùcc'h. ::: poùr (cr r. Nr.q) i\'h! aqLcrj, íÍond rhebotrd (N,.q) (rhe sb orL\e i: (cÍ,cJ) rothepldeor(c, r. N,,ci,
n nkè tùnd
thebaú (N,.q). îhj
rnislisrc isdqo'cd'r,.
nlb^ ùoàd.
[email protected] FAn iLnB4
ùe 'omi {ers.
! rhe fa.dùn ctfr hasrcrarrve h ns rhebotrd(cJ N,+rl Ìn\ù, .i ed Gpplr,mdeìy) 130,.r[.d ,r
ùúú.e ot a prcreinis rt.rcfoÉ comprerry
tr ptor (mmedafr{ ùìe rìdiú btóphyrjcjsr. (j N Rlmrch dmî) is r plor yheE rheatrelcs(+. !,) aÌc rrored. tn fieù& 3 3 ùe reddry !rLoqrd {ìùs or tr. /) rÌc rlored rtors wirhrhedisriburo. oi ù.
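The torsion angle around a bond is determined by four consecutive backbone atoms: it is the angle between the plane through the first three and the plane through the last three. The sketch below uses the common atan2 formulation on plain coordinate tuples; the helper names are ours, and the sign convention (positive for a clockwise rotation when looking along the central bond) is an assumption that should be checked against the convention used for the figures in this chapter. With it, φ of residue i would be computed from (C of residue i-1, N, Cα, C of residue i) and ψ from (N, Cα, C of residue i, N of residue i+1).

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees around the bond p1-p2, in the range (-180, 180]."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Four points with a quarter turn around the z-axis bond.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

A list of (φ, ψ) pairs computed this way is exactly the input needed for a Ramachandran plot.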
8.5 Coarse Level Description
llrcdwru||3Dshrpeolapfutin.ùc]r,d' 9s1ì[maresftú*rÙQd&$l|yProÌ'Aiis cùr bc ò{fibd
by rhcoreaizarioi (oriir..r'r
8.5.1 Line segments (sticks)
xtrdrrnrl,sr) d. th! ssrs. dìùs
I "i
r (!n ordde ù!a5 tu !{roù
3òe sÈrtd y ìÎosed uhs or (ú, t) (darrd
:!\idúc or rrtr ssE (orof r rdg cro hy r r$È|qùu$ ndùod ($ Fi3ur $.9).
8.5.2 Ellipsoid
s'o desdbe 'he ssEs (or iry orh{ rngmeD6)
.c sedoi 3.61,d'ea rhelorgsr Jj +ocí (ii.k) rpc$nudor Îidcllir
0ùc) ri ! $' ú rc5id6 ft!
fuid n osúè
roned syhydmger botrdsLdHbond(i., mùnlhf'herci5iby&oembondbell ercùpor Esiduej. Hbondis thusa ìosúr tundioi whnh ù tu! oh.n rhct i\ ù
8.5.3 Helices
. An α-helix is made by successive hydrogen bonds: Hbond(i, i + 4), Hbond(i + 1, i + 5), ...
. A 3_10-helix is made by successive hydrogen bonds: Hbond(i, i + 3), Hbond(i + 1, i + 4), ...
. A π-helix is made by successive hydrogen bonds: Hbond(i, i + 5), Hbond(i + 1, i + 6), ...
dúgruF (Fìgúe I 3).
8.5.4 Strands and sheets
groùps(N H úd Gc) ùe i! rhepbi jdur t in loolhcr rlcn 'hc budjns of 'hc tsidEfoi'lehydrccenbo.dsbl*o $a$iv.hydrcgeDboids: i+2)1,tfr ìid((i+?.j-2). Hbond(0+2. ì+a)1.... ttsbond((i. l), Èbond((i. wiú ! rùsL. ir\jdtu or dic úIcr 'ú l H b i d ( / , r ) . H h Ò n r ( r . i ) t . t H h ò ^ d2( (. i + 2 ) , H b o n d ( ( , +j r . 2 ) 1 . ;cd ($ iú bdh qild ftc idúrtrd
rÍ8od $ditDs
ù$c c dfrPPioriE,Ldyd=
8.5.5 Topology of Protein Structure (TOPS)
:1oBr i,r.ree{). oir! rhcssEs rcil
r?0údú - +r20
Fi4G3.r0 'úPs cdùd rù ùr irudlr rdc. AToPs,iasrouil!foma|jàlionol
8.6 Identifying the SSEs
f rheidenritìcaÌion of rhessEsh for m sy qmdy Ìdsdry {h*
ùr úe rdenriúcaÌjon is prcbkndiq i e. rÒ m ssE rd\ ùd sbp. ìs uniEÉr delìniÌioDror ssEs! rherew mft.Ecùod6ideÚifyssEs'md$me 5crìh c trighxúd dirù.óúìd!úbly
8.6.1 Use of distance matrices
rlrr cú bcSSÈs.Forìd@ìteda-hcìi u l a r d r o b3. . 3 , 5 4 , i . ró.. 33. ? , 9 . 9r,0 6 . y 6ing d ide izedaEe p:úrror d helB c abns (ssúon 3..1).Realheìics usùfly s,x\ údvr in rarrrc3.t(a) whw 'h, hdix À (R\idues3r $).
Forarid.alized,innd 'hesusssie k j\ gùdiiìll
t d\lib c 1Ùj Ùognj,c adj $úcem!
HÒwer. r f or.i
ùúiducs r3-22(lr),2129(r2)ùì'152 :6 rrr), Ni'nsdpdùlèlcoùrdioisbd*ed /r rd p2rndb€seeiFr iid ,4:r. rcsing rhc n!ú di!,!oi!ì, !( ìooÉ (*c FiEùt lj 3)
8.6.2 Define Secondary Structure of Proteins (DSSP)
dodiry(ddior) ssEs1ìrr shcturcsis pmb rinsIDSSP) byKib$h atrdSandù(!93t,
a Q! 9)
. an i{L s bd h{sd = : rÀ., : o. {d 05rornd G.ùdrie).
psùqd fón (ùsh
ùd sndq ( Les) by ùc
pNpse.DSSPaìlovslolan'bdl,Ig bìidirs d.iÈiÀ ollès rìh 0.5kùrmor rde6re!bond. BurinsEadol urine
refgy rodd !rc ihown in Figurc 3 r l
A minimal helix of type n (n = 3, 4, 5) from residue i to residue i + n - 1 is defined by Hbond(i - 1, i + n - 1) and Hbond(i, i + n), as illustrated in Figure 8.12. (se Fis 3 12) NoÈ s orfiniElrhdi$
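The minimal-helix rule can be applied directly to a set of hydrogen bonds. The sketch below assumes the DSSP convention that two consecutive n-turns, Hbond(i - 1, i + n - 1) and Hbond(i, i + n), mark residues i to i + n - 1 as helical; the function name `helix_residues` and the representation of hydrogen bonds as a set of (i, j) pairs are choices made for this example, not part of DSSP itself.

```python
def helix_residues(hbonds, n):
    """Residues covered by minimal helices of type n (n = 3, 4 or 5).

    hbonds is a set of (i, j) pairs, each meaning Hbond(i, j).
    """
    in_helix = set()
    for (i, j) in hbonds:
        # Hbond(i, i + n) preceded by Hbond(i - 1, i + n - 1): two consecutive n-turns.
        if j == i + n and (i - 1, j - 1) in hbonds:
            in_helix.update(range(i, i + n))
    return sorted(in_helix)

# Three consecutive i -> i + 4 hydrogen bonds, as in an alpha-helix.
bonds = {(1, 5), (2, 6), (3, 7)}
print(helix_residues(bonds, 4))
print(helix_residues(bonds, 3))
```

Overlapping minimal helices merge into one longer helix simply because their residue ranges are collected in the same set.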
,iE hydoF
6od\ 6. Esid@i {i, , ee rrovù
of 'he,ril8. is deiìmd:
Parallel bridge(i, j) = [Hbond(i - 1, j) and Hbond(j, i + 1)] or [Hbond(j - 1, i) and Hbond(i, j + 1)]
Antiparallel bridge(i, j) = [Hbond(i, j) and Hbond(j, i)] or [Hbond(i - 1, j + 1) and Hbond(j - 1, i + 1)]
idlc i !tur úc iiyorrins r5idùù t sbids 'm|dsrcal|o{edby*pli.ildefid'júo|
ryùe ^n'ipanììeìjùde.(i - 2,t + 5) ad rìprdrcr brid-ac(r., siLl,ro,dxdipr.rÒma hdsLinjnnd.j..
ù,"_,r<: trii:;:t!,^
idr$cr 'urcrtr4rh!de.tu0)ùdidù
3.7.1 I ùioù
[email protected])
n md ús lnÈd 0D d$ c) or bc !*d 'o
8.7 Structure Comparison
moi rbsùrc'urct. in 'his cire f,.,,.
rcilon' andfor rhedNoveryornìorifs(comDdiri. conpditrs sh.tua aD fsd
rofsutuc rio|/paner) înd hnds(localor global)
dscdpúof (d i prù ofdsriÈ
prir{iserrì!ùrd b úùr'iprè rtnmm' 6i
8.7.1 Structure descriptions for comparison
irÈs x€ 3où!lìr 1. c. ron crùup, asidq
Écondary srucnEsi ^òq. !K
or iÌchircctuE (gcomcty). ropoìosy
ri o dfl ro prcvide rhe coiDùhon (p .n di5qrycry) igtriùms viù a so.d ú'i,j||'\'.'|"''P'p'h
irèrcùpturcii5(o.lhii r DUr,f',
rrucùFs dìouìd gd difleH' dscripdùù.
). mis poúr
SIRUC]I'RL COMTARISON dcgiPliotrofnrurÚswi'hsimih$crine ^njhú|Nrybdgscib!emPl*orì picres (ùins) d'd 'o d.sùibc Òr.h úir $pm@r' lid (mon or'en) 'he retarioDship ,rsid!.i rlBaafr
.rrEr re ùsd ú dre prioi !ùics: rbo (so!p). fcsduo. balrbonc
rncÙa q?^irù€ rnd ros1otr. 'ho rbriÒns fc hin,ry, sulh $ gcomcr
Ìhe shdr 5ibLyÒ!{ùPPù$ gehdnary
n bùrd
n ri!ì'lcd Nb (Ps-
de6n s (ii dfte dinersìois n .ould b! rubci
aidsP.cebi$dde$dplÌoNml}'sema rlri.i húwt.n dsncns ({jnbne. dr4rion, y ljrcn in crcmc
andrhrn iìnd 0o.!D Pfu$
conmon I
8.7.2 Structure representation
:i.b nfu (crc'idú or (n) ir fcp($Íred ampl.'{c|'rs]dutaiboÈFcsddbvi's \ ! ù{ (qdùqi) or si (modùed) or
orrcsjduei-2,; - I,i+ | siolsrhcr4r^iirheFsiriors )is de .ppropnfe orly rù nnchùs (conscurivo subfrires. rd I *irg rcPA poririoisÒf6cd
dùelhecoon]iir*ofùcci!fo'n'l prùdo fotrj oLuùl vcdo* (h! vcdo6 hdqeer ssÈdiis mùùùdtJPc.physó.chemicrlpmp
cd ront
h !ddi-
'he dù
(1) î!
@uodc orarubiddnhodcmD
ftis cú b. srn rnr borh spae RndcrùìùLb!$d
unr, hLdd6 too rrc u*r (úo oily s*
rlc qrefml rcpft*trh'ion). lfumr
cd aid sors assieièdb be Ned ù a (ìú{ str? Geechaprd 9.:r). o cri. bèshinsì! lùr'l coodinare syf.r rr|yingomti.h,sùgùecFdindsol romcd. rheD lbr r prir or rcsidB (or. Iem ú.n \Fiaì neiebbou
1.d) dyraarc @gnmn rg dieine
úch shctur),
dc$.iprior fr exmptc. in
irh si'nittuNliriom nn LhLr{ìed'le ro.d .mldi,He ryrcmt specùìììashùbrÀ !( 6d rospeedùp rhe.Ònpddions lsr tr rM SÈú ncrhod(ctì.pd Ù 5), n b r$du6 (h{iie lraril| disrù.o sroN sonÈrhrcnì('d) ftc ùi!\
do mÌ .oidr
odtr Grorerh. hekbonL)of 'lE ùe,e
iscnb€d h aiorrìù (dboar) tum
8.8 Framework for Pairwise Structure Comparison
o Òbje.À.eachor wrridrir rPùsLcd by Ìo MoF fonniìì}, rù. sb obj@r\,1 rJ,. rrr), (n,:, r,!)...
/.,.. À d.rìicJ * ! u oi Pn\ r(/ , /J) = (/r.3r,) rio cquili'lcnft is cdled,i 4risùnlr ir rhe r ' t u p a n ri n r ( 1 . r ) ù c . ù r r i ù Ò l r , c , f
-ns (* ù$d ror$qucnet riior b. , ekft, aùginù' of (r, r). rdk' n
of roriisDns Ghrmììy)
, c@) ù! Èmrsioi ot Nlry aii LiEr.
a.+-'.\ '!-/
G) h Nr dioùy ú sh dua riem{'
l*c 'tu rcr).
rlieùrnen'bca\ m G). ed ùd hAr rbc rb).we se úr rbdprtud flìdùs (oreno'n eelì súúurct rc Í1r ùs smc ù (h)
h. dilnd
(rvhl!h, nr cxìipìe, is doE in 1lÈ 'nerhodsú chapù 9)
8.9 Exercises
Lerdr ro$ionan,sìcsa rddùr b. ( ?0, 20),(-72 60).(-70.r20). ( - ó 0 l r 0 ) . ( - 6 5 . r 2 5 )r.0( 0 . 4 j ) , (1 0 0 .6 t , ( r 0 5 . - 6 ó ) , r( 0 0 . f) 3 3 b rìnd posibre cìdils rpoin$rcgivlnùwodi,ic
dd É
ioN,(r'. rù...., (a".Ìtwevaî,k,lnfr&
(a) wie a ùfrdx ror rhesDnol rheqùadLricemr rn
(b) lnrerd or mùihriry rh! cmn i ($o Fj-{ur 3.91.wdrc a ronnularor rhisflor rrc $c&.ding (ii stqucr() , $tuì
C) Drw a iìem ilìurntiie rhis o) rhenunhf Òrdìfiènrsh*.ropologies(howrsnmd\dànb..Ònbùd) s is ,r,ru, r/2 shÒvù\i3. dhmhúdÙe8d'opoloeyùm|y piÒ'cj'ronenhù€sinilallopd'acsiqPùirwh,
Èr úe srp sor is 0 (|ry Nnhoú
8.10 Bibliographic notes
s iîd ropolqa is Le\k (200r) (200r) de*ibe sered or rheropr$ dislu$cd
er il
lh.ture ddrbnrs, 5@B!$r!ni\ (2001). Eirhùì'idr d ir e000). sonEEYis Dnhnre $Òdeù, dethodsae dsc. bedin Avódi aùi Tarror( ! 9qa).r'hcRmi dEod d lrq6l).,nd nsùs forjudgirsrhc
trsis rho'm ( 1ee.1). Ellti,id EPn5su (1e3r).roPs Ns rìót r*ribed ì'ì Florcs (reel).ad Nedrorruhmdic'NduE amprilÒ.s aì.(leeer) dr 0931. ùd r ndhod rorddLmùù!
iiBasuudArhu(ree5).ndl.6renFuú d d. (l ets)\.mpìes orsubnrudúrds.;p'ìoi ii E$rier d xl.( lee3) rd Baeley ind Ahmrn(r9e5)Dirc{ Jomssnd !r (r9ee),ady,niuker d iì. (r997)(s6qti î3ylorùd ofiso (1939). i (rec4).ch.{úrr.(1999)lìis6)r^tqaidrcvdi ossz)Gd9,cd,ìdÈrer!r reer),Kodìern(Leeó)($aùr):BasluyÍdak'iri(lees)(leai'E. À mdh.n rorrlildie flexiblcrmtms (o,bìnins ! hiig.) is'lsùibed ù Shabky
Superposition and Dynamic Programming
9.1 Superposition
t is cremd rì $ ,isir-r,/,
Jqr.4 d.v,r'rr.,i(RNlSD)rùs RMSD
sr$fc (src) qù' for kccsdy
9.1.1 Coordinate RMSD
Superposition
can be dom by r ,!do
L ú ( o r . p r ) . . . . ( d r . , j . )h c' h c@ o d :hè eqùvdeoe . (4 IÍùf ! srd p, iÌ. rairtirs dfthÉe viìùs). The pmblem
I )
lnùnù sodiiúri06. k)Dr snrtuGr o$ov' o erb prjr (c,, ,9,)(ud oftens. Ìo l). For x ,r{/,?i,,
(hfte dhùtrcs),
ùd I tu14.
/io, (M rsÈs, múd srh of dìe*. l a|d 1rÒt oìe rcùnonai itsobe pedornedin ùe opùfioi !6hd ù ìin!, dìediHrioi ofùe lineh$ rohecalculard fof.ich foraÌjon:cr Eùrersú.orm (cdhefter ij 1977).) rìî{ shiftiis úc ùiùojd: lgcons$rr .$rs) conìnoncmdìmte syrÈm,dd lhgnfi
oI .$h írucùe 6 Ìhe oigio or I
s.nh.d hy an orr4dnl mxùir ,?r.r (lD sprct !iú dúcmDur equaìÌo l. (Ther bdreiúìcngrcs(3)hdrhùvltucsofrhem.ùix(9)(cÈrb.dddr977.pp.530 5r5).)^ m!'nx ìs onlìogomr irùe s ú rhedkrùcs bereco úe poinboiÌhe sme stucùE de nor.haeed (cf. ;sìd-bodysupe+osiriotr) 'rhe romuìi catrrbercronbc dA.rih. vùrof r, lid !e *aÍcìr for a pah (R. /) vhi.h mùimjzs ùe erpEsion (asunine
ti i
Lr = '6î
+:-u+ -L
I -
-G "-
I(,?.'-ai. *ca(ar.Éì)
'o r commooùish a!3Òn'
lnEualon (e I) rftishrk sprified.R
D (RNrsDol lllclirts
:re Sirccrhùeis m ieed ro.rlcd io$cvr( ir ba . Goftrimes $nour) v
ùe needld findù3
ageorf rucbc /,tbea B\tsDD{! r) = 0
jr RMsDD(c,,) = RìrsD! (c. ,) rorarishduÉ c
hù,i rÒhryci dos rotineù chriÒ! (!hei arrNsghÈùe eqmì'o r)dRMSDD pr n bd{Èn 0 ind 0 2 (cohetrrd s'cmb{g leto).
9.1.3 Using RMSD as scoring of structure similarities
wi'n b{ RMSD value(s) flow
I !tc!!.ù mdot úùpar^où, th.6T.krt I b tÈ 4'aE 'Nt ol ttE nunbú ot qu,4-
sD(E(11.r))/v4ù, whùc,F n ùr n runb{ otdemeDr rh .ùn b! 5ùPs-
b ifPfovc dd{rioù of ÉgìoÍ 01 linjtr bp.ìog},gc|trdiigrruduilrymrc|atdrcejùs
9.2 Alternating Superposition and Alignment
(avcr) ùrùdrnsrrohr,!h {ruturc is tussivù. r0 = (a,j,rrj).(/r. r, (rr,, ój,). ùre ÌhnrÉ r1sùpe4.!:rìo! (Eridus) ulm rh qo fn.nres (dlLÈfsFoosnion). c
rhci bc tr*d b ddjic a
a n$ cquricncc ,! oithe I equiliìe (ùrhgR0)!ÙùÒobÒi.ùnd Ffonlhrsn conve4eDe(r,r , = rr) or son. nìdLnu,i .r dù.dbn ùd ncùod andFìsuEe.7(a)
Ji5(4i.rr) úcdir!n( bdrsònrcsidus {or(rl .alolate a $ore nor x dnbcc
î := th. Ìan{annatùn !ù RúsDc(E t)
a' :=rla)
rùdrpiò (,,j) do .Rrr=scorc(dh(di.àrrmd (r, P) := ri(n, r) DPo! 1^. n) úi.e f (lìndpÍbr qihsor, Ep , = l t a t ,b, i , ) , . l a i . b r . ) l ,rì,il.rr+E/,, 2
ro= l(2t,, (a: tr).(,6.h).!!./nl.
ror bef $pe4ori,on of (ar, .r. ar, d,) on (h. ,r. ll. ,r) is
dish.ès (lnr rhesùpeFosidon) is ùl
b|b,b1b4h5b.b1b' by this ari$mcn( {n'r raNjnsùc los t) L =|4
ht,i35, h4),t4t, h). tq, b!)t.
ry mosq
(RMSD $rru ruEbrÒr 3liFìèn
drenumb{ ofpaiu (aJ',,,j,) oo P tor vhich .hedisbmes ,r,j aE berov a3iycn [mn k:" !E ùc rosniÒB i,r ùc 4odr or a dar sp.eósìùon). rr 'rì1 way Ònly plid
rre $orÌng of (ar. ,,
fo. m*ùg rhe
$qÉÚeúùp.iennrghltnelthesinlaitybeMetrtheÙiioaidlypsof 'lr wo ftsidues G 3. usine . ?l\M maùjx) rnd rheìod shcnìr comp.ned nisht nly bdwLcolr'r r.4J.4r+j) aod (rj ,rr, br+') lxc sprLi,l sinilriry cú, ror d
rhs ro!1r! rù
JNd F:TT: lniFosÉofue4c
. (i). (b) rrrc d òe r
.frjb!si]llÈen(DP)ú'hlúu!' . DÈdrE!!
(c) rh bÈ cwL fqs
bf re reo
9.3 Double Dynamic Programming
\ úc iìdÒpciòi.t
n vóhrrd
miì\ Niiìg DP (scdiotr e.2). Hoscvc( idc :ìú.dependiiembNleììaìiem dùa sqùÈic dtnmÈd PrÒs3E (ssaP) :ìgiu'i (dcvclopdrby Tfyro. uid orrso ( | 93e)) T]rn is b!s!d on ! merhodcarkd :lbrc dynui! Prusrumn4 (rDP). T' ':ring rhe(bsd rìisnmeÍ bdveer dì iesidù. piir (4,, ,j), hd rù! sch f tns k,
N.rinr ùÌ (,r. ,r ) n ])rr oî úe alis.menr. :r b ddnc, 00*.rqd) r,trri.,i.,tere nn! rdd.
eìDP.andúresofincmùìx[s]ld,lg'tl pú (i. , rheF *jr be defrr.d . spare (lN level) mrrìe {!ft $ilL-qd!nuù,b$Gùrcl5hù"iishÒe
tuiìrccrhc(ov rc€DrLignmenrbeorùoùehkJ.'rr),rhcoduq DPisofnhm o Òpriorrplrh tum (dr,àr) 'o (.1,rr) ùìd hon (r,. rr Io (,.. rr, or by súns 'heso'c of kJ t) rÌ higr! vllr 'hf 'hc o f ' s r f o r( 1 < r < , , r < / < i ) , n d (r < r < D. t < r <,) (lrìrir,lhc krr rdrùdbóhnnshror,isir. ,i3,r.tù rs, íÍd hnir(hishlevelrd fmÙniigi!'ro[cbylú'iùelh.qnri6ú {ch 'hf Gr,.,?) Li5 o! rheoprimal0o{ revel)padìwhei I s i\ ùscd!s 'rtr sd n3
93.1 Lor
,s:=tot",(. P):=DIìJ(,1.3) LowlevelDF.roftd dmùgh(4. rr ) rbntr (ar,.,,,) É P do n sN\P):=n s\lP) ri s?lr) fu;]ÀeDP!-'c,,s,bclpthhP
a\ Ns (nhior s v,i or viuc\ ton
9.3.1 Low-level scoring matrices
\rrs anmdhodsforcilcutirireus! Gho* lhoiccorjardt.oie*a),ordeiirins'hc nemsf.j mdarrj. !.s. br ùsirgth. c,,o0 :,rd1del
sysÈd (rs [Ds a rheydo nor ìie or r sàigh' ljnr) îtr oofdùdcr of
s showri^ r;tue simpre sorrng
ea (rr.,, GÈFlgùF e.r). br4
.c i'ì thc d;rdiE d rhsYcckF 1a,.dr)
aid rr G![s rhcrilhbouiie rsidù$, À ùÒ!c). ilo
'bc rrucùEs
.e ftunes f 4 anúrr, fld n meóucd
^' j +lr
so i . îrìisconponsr
neighbous ii rh! s.qr.!c
(èi rbc odchmg of tocat secotrdi4 shdùct
hòe eqùdseqùsdarrisnrìamd (hr
9.13 I
wcac ù!i úc dqrarc of sG. r) ù higÈr ú bù l ldlltèd s x dùrsins runc'ìÒ!Òrd/l + rhcqEdbúion Íiom onbìned targedistues in spre. hghcè:grofd'.vbbÙ.'rÙD sprcebùrno' nu ìn sequem(i.c .ùdird.s rÒ!bci.s ii i si,oeùnd c,rr.
lùrdior GI *luès beinsir ùe inNli
(d rdt
t0,rt fof nonnegriveJ atrdr) À rhecaùsjú
s b bù s.d n I fùins fundionandr M .odù( rd cÍh conponerr.&rìningrhesìopeor I GeeFigùfee.5).fte
9.3.2 High-level scoring matrix
Alì'tÙglPPeiallyÀNeJT|trylndrcnof
r h,Yceapsil rh. -Èqrd,Ìesions.
9.3.3 Iterated double dynamic programming
(sar rlisDmenr pmcra'ù
úc 6ur ì powerolrì!fqErcelere'h (iorwopmcinsoreqùrronsh)6n p.rrorA rcipklhdÒmFing|hcmvi[mdidJl n!DUN.loeeúÙtfuredloasdÈvd,m
- r !\ boturc.ind . ,ù, di6r
G P):=DP;r(!.rr
0. nì
L!\.teEì Dp foÍ$,r$rNsh (dr.rr)
nldd! a ,\itc d,l a úù ù s l:. f):=DPal^. R) E = rhd n4, rut, b^!d ùr a últ boúútù\ | ùùian k sd^n à
r tsovjs 0 iinúl'ed ard'he*ed srededr . Hos shourd c bcqdded?
!v ,.
eorj'nù,] nsidr prii5 (iid oó{) 4n bc on s$ndrq rmtur rre (or voùld lor mmnÍy !a' Ìo onpae m srphr.herir wnh a Éh.nEnd and blrirì (hos wirh niìù) bur d..fpoicn' bs{d on rhc lnmo eúd{ú'ys!j9&Acd.siYir!r pmrur (o ensùft dìa' aìr 'rrcc hrvc I rdsùbìc uìuc) eNing a mai.ix 0. vN.h n (0 is rhùsdermùd rry us of ùe n\fe componeft) reiened 'o îs rhe r,i.r ,",^ No \@úc {eigbr rE inrodEed. Ìîsrad 'be t úúsfùh is tr{d ro 3ir a rcushry hc p!i^ s,iù hjlhd
\!tuA m 0. or for
ser-riDcp'i6 rùd ùpd''rng 0 Îchishe\vi'lu$ir 0 rorerch.y.le tu ìysnúììtrtrmbcrdrpùn\is{r.rd(r0 r0). rhcÒùry(p!sc orcìc\(hopcfulyhwaÍdrrh. truc rq!ird,n4) 0 is !*d as i bNe roL iir'€ùèùFr Évnioi (úÈ br6 tubir) 0, n úe bì$ úffìr ù qr e ,. 'hs i\e
aP+\= ot t2 +tast1+l st+ /zo), 'hchigh reveimxrixfr s) relemnry 1rg. lQ)
bc bhs hish ÈEr sNd q Dld
t :tuód) f d$ $G..) 'hr &i {[su{r
u (br _5.r=r)
trrj ik Ly qtrùÈ r{Ìhù,
nbuionflonrhei, 'i,ìr0 rùrù!h ibdiù rLgtr'ù.ù.r|pîù!.hrns Gf íkùiry) d prcdonùedt r. È!ùs
ùd rr ù 'he d4ms'
(or compìerìy) non !ri$
' ù or rhcst! or rhcrvo pdeìm ki ùd ,, $d úe cyce nunbf (p):
À= r0++.r; . = o f{ rhcmiriar.rde). we s* L{d
\sc I js rhcqlrc rumbqad É kih bis frdr h* x lur conhhurion(9(0, l) = 1).whichdúcù\c\ ùi'ir J|rcf'nÒjìrù1 .ydcrbÒris.rldirÒryiÒ.oiEjbuionfrenrhebìîsmnix(t(5. L)= 0 032).Thh inb'hefil|gìohlìvi.wpfdidcJbytlf
(hish-rqd) |lrh h crde p + 1 Thcr ! ncw q n .ùddrLd, nd Lhcb
r=1 cdinrisùr9 ó f bcomcst.rhcneiúìon ded pài6 n nqc rho rìe reigrhol dreiG
9.4 Similarity of the Methods
r equivalence. sar dos ior ned 'o daidc (of Asú'nc)oi.cxfrricnm.itrù.dr.ùÌ3
9.5 Exercises
r. Re3-irdrwo\mdùrer (,{. ,) or dE
G) ouúhed dsodúD*r
ùtrr! c!{cr
ETERCISES crrcùhrèRMsDc(rrgh' l) ror 'hè by rchlirg oDeol ùe \ÚudÙs''
Ld ùe Dori.ion of (aÌons Eptsr'iig)
G) c8rc fu rbÒsúo.ùjal
3=(t'l.6.b the Bidres
ccnrB. a
(coífor: thecoodinds ror( I shÒùld bc (0, t.414).)
(i) shN rnd rhc,nfnÀis orholonal (iì) RdfE !l rhemrdinres of À. ,i (codÒr:rhciù{ coo irdsofd,shoùldbè( r. r)) (c) Dentrea dh'bù ù[kii D ror!u p|in t': 4), wber 4i n 'rìcqNfdi (d) Defiie i $oritrc ùbii
Ì by dlvi
G) No* 6id úf hths'.$ofi0s equiviìdu (subrignfcí) oj rùnsrhr. rurci / = (,r. d1.,r,4)
d, - (/,r,b!.,r./)4.rrl ii 2Dsprcefo de6ie
lhoose Ìhel)ri. (ar.,t loi lÒersd dy
1 - 1 , 2 r , r 0 , 0 ), ( 2 , Ò ) , 1 r , 1 ) 12,1),
( 1,i),
10,0), 11,0),
(al Fin h r disùnc-ndnx , rorallp rúiig mdù( 4rJ. bydilidiig 1byrhcdnunq ii D (roor dùiiú,. gipp.ù|ty.and]cllhegippÈúlyh les inr r.!.\rhs la: ,, kiol.ùtr (b) choose(/r, rt id úc ncwror rev diiúr\ s rn (r). ro fird the n*, ordj'jú$
rór (^). yor c.r ùc rhc
ardr Gtrdr !ìi: :rong.t. . c bc6e.nsìebc'sùr x{d xi, . R = arr (ùeÌ.60fdinar or!t, . 3r =dr) lhe ) .oo.dùneordr.
. srr=.ùrx. r) =coto),
. 3'r =colx.I ) = c.(eo+ d). . lr = cùs{yrt = coleo d), t !, = colf. f') = Òló) tr(r,)) xÈdrcodjnd* or.poir'i úo ùecoo iiars (Ì,. !,) in rhcne{
t r =sr (r 3r)+310 rr, r r ' - 3 , n r R i )+ s r r o ! r . eordiilè\$tm'foltdlyiig drmm. pócr.mmù3as*pìaii.d i,i k). G) rhre ùc hìgh.teeìsÒritrg,ido (d) conp!rc rh. ùec r ìgnmeùr,,n
ùs ltu Psiu (ooeHidùc i' i .Òtri! u! r '{o of moE prì6) Disrmi wherhd j Eq@'ú (e.6)shors lÌs rbcbú\ m
r Ll is updn.d
rJuc À'.schúr x = rE(L + dJr+r/10)
(r) shor rhd whenI affrcmhesinh 2( Hinriùft úe eqùsrion s 4,;r = 4r'l2+ I en or C appmmhò (qh* sp is rh. v,lE Òrrhècorespondi4cdl in cde p). rha lìndatr t r0( i. )2fa d ( i ì ) l = 0 . 3 . O ) Ì . r 4 0 = 0 . 5 . F i n d r h c ú Ì t r È o l 4 r w h è=
9.6 Bibliographic notes
FùndmeD.ddicl$ ror epcrpÒiirìoift Mclehrm ( 1e72.leTe),xabsh ( le73) rd cohenands@robùiÈ 0930). MorcahnúlhèftùÙdsofaltndin! inRlof,rRosm(1e73).Rosmutr dafss0975.rs76),sdoreral0e36). (lqq4),noìmindsodù(r995), cohÒì(reeT),RuscrrddBdon(r992),Dined c{reìnmdLcyn'(l'r931 zu-KrDgadsìpplllss6lP.drjem(lee3)atrd ordso (le3e) an eanynsrivc 6ior ú ò{nbd ii orcisó $d rxylor (1ee0), rhoyr n dÈrnbedii rsylor ( l eeh. leeea). vhiìe rheaìgorìrhm
10.1 |
10.1.1 Two-dimensional geometric hashing
dior (rgid body Fsslnmxtìo!. -{iììy
gilen if, the imc $rLc) ooe,pprùch is ro L-y ro phe rhe query oi k,p oirt coiù.jde lignomg the eù$) rhn n
ìlrsjruùlùhddGfuiÚi!|ìshi''eisacdmiqùÙfdfof'hnorc'EnThl ii Mrmdc qn!m5. ardú.J, isúr poii6 rbi .ach rtuÈ (in 'he rD as).
rw dslEs,Dúd !i qw
p.iÍs G enpuÈd in 'he irame, oîriiriie r ftrtz&?llr!j!,tr. Hmù r oldmd$dgfiigdbyfFnic!|aÍb&\js.\{cs4 frnnaEd. (f rhe diúrce beùen rhe dd {iìns.) A framcsydcn on f can rho b. ú rhe iùúù ol point pais (or to'n .r.b ieuE) h{ùs Eq@rq rmon equ! co
xr\ò iftrcvrn, Fi!trc r0 2 i[ùshrs ,bc rrLrncc tfamc sy{m\ uJins Lre lìguE pojds lrcn Fisùrc 10.r. h Fieùe 10 ,(a) rhèpoifu ( r,r) ( r xrd l) m .hosr s r basÀfor Ìhe rheof4in is d Ìhe squre ino {hich ft r|rb (ùe sir ol rhè mDaÍiioi) we se rhd Ìhe poi'6, ùcrudiis
(0.0)(ó2)(n,0)(e.41ó. 10)(3.3)l-1.ó). (r,2).id(ó,2),, irrurHts!cÒdplì.dioB Gbplrepoil6ùtrnedlheboldd'In Ìi (b) (r.5) h sele'ed6 'he basìs,.nd
(r c)(r r)(0,rt4 r)|n.nx8 r)(3r).
c) rhe|oinc (x,.)G sleded rs rlÈ b
(0,0)(3.2)(3,0xó,2)(r0,4)(r,o(0.6). i( coincidebdN*n (a) ad (c) (ùdudùg
- oneirr,rmdy (1,r)(2.d)(3,O(6,0 . :trk 10l.nimeìr(.r,tnd(?,s)rhi Òri'ì Ùr i'ùpkmslalion or 'hè me'hodOr i rlmplÒ, dso .oisidrìu neighboùingsqúRt. .or ò. rruÈ syrems(b)aid G) it i oD
f"irsdarcy. Forùnmple.k' (.r. d}) be $c b3n5in r. lnd tàj. r, rhebisisfoi ,, úd l.r bornk,. ,") and(a3,,r) corDùde r ùar Gormor) rde syrb rheon s' ir (,., dJ ùd (r,, ,J rc usd * b4q\ Nor, hoveverdDrsinitú'] ad ior I
bc milc mcúion(,(,t
rr2xr(,i - t)/2) fram$ ae corpmd, 'hf.
ieùìevin8d!ù ìn I (hÀsh)hbre a rdr, na.,ia n u$d to nap rhedlra (or nor co'nmonìyrhci€rl ù (lc!!l) indi.ùsiîro ìl€scJìbempp.dblièshcbu(
FÒfsinprhiIx ht N rof Òur2D lìgufe r sy#n. Ge rabìe l0.r). HeF rhehash pùsi'i tumim n ur lmde i' h,re iù rhesqùùe(p, 4) id úÒ lìfhc syùn lnh b6n (ar.dr). rhendìebish (1,, ,ù is n .boNî in rabrc r0.3.Thchlss ( r.3)înd (3.5)ùr borhpracnin ir bucrebor t rhùedishiirrobrhe,(,! i)/2r4t, DFfeHcernmesysBns,depdìding
GEOMETRIC TECHNIQUES
ltunÒ\ fof crch prir (usiùgborh (/,, .r) r):/2 rd (a, /j) asbrsÀ) rhe îurber or pai6 ìD r NiI rrrftfft h. ei'rìf n(' ua(D ll':.ceE€llribrckdlill
gúr r0 ?c).$i'ig'r$h 10.:1, ftee vots re (1J. poiob poùr Í 'hE on3ì is !d ìrdùded). rh* for rrcm ùe c.d.r !f.ì,e -sjrù !È !o voreslor (3.5). Hencerhde aE fouf coinlidirg 0oiir\ bdw{! noòr (r) lid ,hc qu.'y (irdudiie oriein), oìd o.l), rhepoinr at rhe orÌgin besen nodel (b) nd Labeh (e.g..oìoùrs ú(lor rÒriJ) Ni
'hcúoFìdÒnphA.'folrquÙypoinl rbr equiny (q sìmilùily) of lrbek nù
e lab.(9 ii îddiúonrorhccoorur Òi u
hr ùc rq!aÈ\ h€iunbeftd r i qu{c (r. )) lirh coroùcishÍhcdro.hebuckeri,(! D +,c or tr 3D ùbrcl the sme s h imPbmen6noD
r) +o - l)
Í n nowsfrrghrfo$at'r roeílnd rheh
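The preprocessing and recognition steps of two-dimensional geometric hashing can be sketched in code. This is only a minimal illustration under stated assumptions, not the implementation behind the figures and tables of this section: we assume that every ordered pair of points defines a reference frame with the first point as origin and the x-axis pointing towards the second point, and that bucket (i, j) collects the points whose frame coordinates round to (i, j). The names `frame_coords`, `bucket`, `preprocess` and `recognise` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations


def frame_coords(p, a, b):
    """Coordinates of point p in the frame with origin a and x-axis from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(dx, dy)
    ux, uy = dx / norm, dy / norm            # unit vector along the x-axis
    vx, vy = p[0] - a[0], p[1] - a[1]
    return (vx * ux + vy * uy, -vx * uy + vy * ux)


def bucket(xy, size=1.0):
    """Index (i, j) of the square of side `size` centred on (i, j) that contains xy."""
    return (math.floor(xy[0] / size + 0.5), math.floor(xy[1] / size + 0.5))


def preprocess(model):
    """Hash table: bucket -> list of model frames (i, j) that place a point there."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model)), 2):
        for k, p in enumerate(model):
            if k not in (i, j):
                table[bucket(frame_coords(p, model[i], model[j]))].append((i, j))
    return table


def recognise(table, query):
    """Return (votes, model_frame, query_frame) for the best-supported frame pair."""
    best = (0, None, None)
    for i, j in permutations(range(len(query)), 2):
        votes = defaultdict(int)
        for k, p in enumerate(query):
            if k in (i, j):
                continue
            key = bucket(frame_coords(p, query[i], query[j]))
            for frame in set(table.get(key, ())):
                votes[frame] += 1            # one vote per coinciding bucket
        for frame, v in votes.items():
            if v > best[0]:
                best = (v, frame, (i, j))
    return best
```

The number of votes counts the coinciding points other than the two basis points. Points that fall close to a bucket border may be missed, which is why neighbouring squares are often inspected as well.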
10.1.2 Geometric hashing for structure comparison
peris. addidoi,r\ seomdnchshi'e r.
0 ad seofdnc hahins crn be u*d i
Òfruitu i'ì 'lù*ìirúsiÒDd Npee(e.g. úins ú. C" orech L\iduc.orwemldods r'ómach residud.Fofusinggùim.d. detìmd(bÒ tus $odd ofsiss or mree F'ùs) any'&epoi.b(r,,4,!?) oorfatìinsirislrìehrlide(nÒ.ùilinEu)car ar berheonsii, ú úìci.ris lie itmg rhererof (4i. !r). rhcr ùh onrheptDe:nd rhe .oofdìúus or rL 6idù{ Gro'nt h G 3. îoìi.o fid rype.sond4 fruc 'n rlemcn(srusurc),ad N *pl$r.d i sedion10.1Ì rlìh.rn bÒj'iprùdcí.d dpfun|yqù6ch6hùenìepÈpfu sd: r(a,.dr./r,(,,2r, whm 2' ù lnc iùds of 4r in Lhcri'mc(dr.,L. "rJ.
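A reference frame defined by three noncollinear points can be sketched as follows. The convention is an assumption made for illustration: the first point is the origin, the x-axis points towards the second point, the third point lies in the xy-plane, and the z-axis is the normal of that plane. The helper names are hypothetical.

```python
import math


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    n = math.sqrt(dot(u, u))
    if n < 1e-9:
        raise ValueError("degenerate (coincident or collinear) points")
    return (u[0] / n, u[1] / n, u[2] / n)


def make_frame(a, b, c):
    """Right-handed orthonormal frame from three noncollinear points."""
    x = unit(sub(b, a))                  # x-axis along a -> b
    z = unit(cross(x, sub(c, a)))        # normal of the plane through a, b, c
    y = cross(z, x)                      # completes the right-handed system
    return a, (x, y, z)


def to_frame(p, frame):
    """Coordinates of p relative to the frame; invariant under rigid motion."""
    origin, axes = frame
    d = sub(p, origin)
    return tuple(dot(d, axis) for axis in axes)
```

Quantizing the three coordinates returned by `to_frame` gives a bucket index in the same way as in the two-dimensional case.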
Algorithm 10.1. Preprocessing phase
î.r ladì ta,ttú.d, nurcuttùìan lnptc lai. o'. a,) . M, dr .nturdb ùè t ltqtt ,rht Rlt cal.úak F = FlRk.ap,pL): HIF):= HtF)rlM,
^5 rhcfdcolc f nc vnrr ba5n(oaa,) ù diFcrcdfon oncvnh. nn cramrrc. 6r\r GLúrr. rltrrcm Ìlai r)iia, 2) diflmm it|rri.. rtu'ic5rorùti heprcpmc*sing k o(r,) permorer(wherc .). lr À posible 'o due 'he compbxnyby b,shhble fof 3 nodd n , (,) 'ne (ùsi.g t|Iee ro'nt i5 delitredlbr ùre (rr,. r,r) i! rheì'ìd*d bù.kers L\.nrîtd rhèeid dl $ù pai6Ghctuft. cr.fle frame)nidì hislì voÈsrc inve tr sdAl c$. n h or o (,]), wlìec r i
Algorithm 10.2. Recognition phase
.tuoE J aîo$ th j, bt, b.) in tht (Ea at hasìt, d.iniÌE the ftIètà1ù t ùìt Rjt" .ahutuk F= ttlRtt,,b! qL)
ror ebhtuir lM, R) . H(F) dovlM. R):= untitsainaaar'eaìncideresú n tuad ar all qu.4
rhc rcsul'Òrúir$p h rrnlÒ'(rf, n*r. nj,J. sho*i'g ùr dftftoircìdencs Rír (tur M) atrd RjL (for 'rè qrry) fc rL5 knowi, ('bc dk shd!rc: usell a sPe,posi'ion usirs ' r*driÒ!) Nd. ùrjoiiiis
by $ir3rih
ùd bc'orc, !c ún (orHsois
r 6prc. (c,.,!.a, rob.a fefercnce fmm dreafr' hc a rdprer(4 rr rJ ùÌbe núf h. hthiy 'mi|Ù md 'he orespmdiJ'8 rry oiry rheÍoùs ryús !i'hù a de6red tuDt
wnh ceok d a chos
I sphùe (of
10.1.3 Geometric hashing for SSE representation
Geometric hashing can also be used when the elements are SSEs. Typicr y, rbe cd otrÌ núhod hy H.lfr aid sliòf ( r9s5) kDs oneor6e rick'. DependLng on hoq
úan rÀ apú. ll beinsr conrmn. Fo \nb s(riúion orr'À (À'boîs a^ùq(niiÍo
Thcslls of rhcFo ù,w r[
. rhsDr. orssE(a hdix., fmnd)r (i.e.'ftnidpoiÍi'ìòercíe'€ncftamc),
$ irc derìn.droreach(rlsat f ùcqu!ry(r),anJrhch$hbbroiu
prir or ssEs
ìcxùr, licùr +!.à1 (".i-ÈhhÒúirg) bu!kc'\
ilr) rî nich' tu'olìrn i/ ii erlssutuundinsftcoien{idbr 3, (s i2). ii2) marh ollutr ir linilf
mìdl)oii6 b ks Ìhm ," À, le$ L\ùì ti4 qp. Gomesequediaìcomniib
.13) sivire rù lùrnd inYèldgfiÒD.
NasÌorreierrboÙfin!bmÌds tor tt.t.,aú) ldit aJssEstRt, Èt) . E do dejùenE EJ.EÌf îrne Rr tú .a.h ssEI R4) aJR do N := bqr tr kahbùt t'1ckè^ b hr ú Hn, kî ^ | heltE ssEaJth. tisl x tht \ztt 4 q R4ùe !úita, b th! a,s ot a I rrcr-' irnt?dseth?úe lò. th! .4t q'! | at. ar) oJthct^\ aùrwE tat. At.\j, Bsf'
10.1.4 Clustering
10.2 Distance Matrices
cùndfic l*lims is Nedrù rìndiryI rquenùd odù j5rhesdc (hutrhisdó rs ncp).co4pù$D rins di$rce n mrics É3m L05 is|Ln ofrhdnúe nfri\ forpDB4fy r.h..;d Fjsurcr 6
rè,idtrrs-(25?0)ofrmd.Bùùshowroù{asrmùndmridhgomrshossbrccon iúrd6 bd*{i lr.ipanuel fnndr, m 'pdcdiathelir)úrnighrùdjcaÌeúrrhessEsiòundin rchc(s{risù( t0,
(spri.r) darions*)bdsÈ rlÍioN donorìnpLyúd úè .ortspo
'!És orcrcnoú\a, = Id.r..l 4d ^i ìn Fieù J, Espslnelf À shoM 1,, l. l, re,wnh$brtuc'ùrs ,r = {d . , , .rJ dd = ,r lur. r', r'. z'1.suppos runhq ${ 'nc nhiN púcn (disrdca$bEúik)
10.2.1 Measuring the similarity of distance (sub)matrices
onc4ù. srmcrine$ob6firdóm
rrr\ aodrhcdiilùro'.c bdw@nrhcco ú HoLnùd Ssndr ( rs9lb) (n]r sE io rheDAII pnef'n) cotrsidù (wftotrt rùss rl genertÙiry) '*o sùb'nfiG lor.qùd $i2) r1(n ,,n.r ,,,ttof{rudrc,4. n\rr. i- ir ../ni ; D^tit
i2,\ ' kù,Diln = )
:hered ìsrsiùìil. q 'nesùre. id rht simirùiryof úÒ sù6shduÉ\ dd ùc m4ns a€ synmùìú| simdjaiig rvo Bnm! orc rc de$rÌbed,'ùerif
D ó ì \ D ^ ( . ì , k ) , D i t i . D = o t D ^ r . ì , t )D î \ i . n , y,dÈni.db b. r 5 À rhis i]ds dìr
' direrenes È$ rh$ i 5 À FYc i po ! dilisfúes !c!b. úiù r.5Àcirr
submxtì.èsrùey m ('o ore d-inrl)
5 0 . 5 - 0 ) + 2 ( r . 50 ) + ( r 5 0 3 ) + 0 . 5 0 . r ) + . . . + ( r 5 o ) l = 7 . r 2 l r r5 l = 3 0 . 5 . , ì ! G 5 . r 9 , 7 2 ? 5 )w i ' nD ' " ì " ( 1 3 . . . 2 2 , j0.. sr) meioscomprif,e'heFÌaiion (r5. . . re) rìd (72.. . 7t rrh 'he Fìa1rcnin I chcbdNccn$bshdùr( ( I 3. . 22)
mPle'Ìhat'hesf*poidù8dfuces]úd5 Mn.ú ró uìd r3. Îlic dù{k ronn for ó .lso
a'@^ti.t)D . úi.n= (of
\hùe D1i.r.l./) trdìeddxceorD)(r.r)lidr,(i.4 Mcncompannsnrb r .uclDì{r.t)odtr(j.,me,so.sortr isserboriÌtrdrd'rdÒturccrL. o 4 is'h. errsdcsirilùì ryúE+o rdchosn eqmìro0.r:0,i.e.20r dcri!'im) rlpìci for ssE situh. adjiÈú rRrdsinir shedrypicrìlyb$edisúiùsdr45^,aDdsbouìdmdchrosirhin ror.ft diiìc.kÒ5Òf2 3 À FqdamP nusd.defined as $w(r) = \exp(-r^2/\alpha^2)$, where $\alpha = 20$ Å.
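The elastic similarity score for two equally sized distance submatrices can be sketched as below. The threshold of 0.20 and the envelope function $w(r) = \exp(-r^2/\alpha^2)$ with $\alpha = 20$ Å are the values quoted in the text. Two details are assumptions taken from the usual description of this score: the relative deviation is measured against the mean of the two distances, and a diagonal term (i = j) contributes the threshold value itself. Function names are hypothetical.

```python
import math


def elastic_term(d_a, d_b, theta=0.2, alpha=20.0):
    """Contribution of one pair of corresponding distances (in Angstrom)."""
    mean = 0.5 * (d_a + d_b)
    if mean == 0.0:
        return theta                      # assumed convention for i == j
    deviation = abs(d_a - d_b) / mean     # relative deviation of the two distances
    envelope = math.exp(-(mean / alpha) ** 2)   # down-weights long distances
    return (theta - deviation) * envelope


def submatrix_score(dist_a, dist_b, theta=0.2, alpha=20.0):
    """Sum of the elastic terms over all cells of two n x n distance submatrices."""
    n = len(dist_a)
    return sum(elastic_term(dist_a[i][j], dist_b[i][j], theta, alpha)
               for i in range(n) for j in range(n))
```

A pair of distances deviating by less than 20% adds to the score, a larger deviation subtracts from it, and the envelope makes mismatches between distant residues count less.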
10.3 Exercises
Nhìng'Folsinpli.ity\re.ofsidr\frù
10.3).rr ( ór:(3.3.10.s). b r i( r r , ó ) , , r : ( 1 3 . 2 ) ,. , ( e . ? ) . G) Pìorrh.poirùir ! dirgúm(lmis 'lìe l,oii6. Ìo Évears.lÙfts úd úìÙgLcs) (b) fu rcnov roderjnefetèEneiix'n bùpÙ 9. DÒrìr.drcric rtu* ùriis úc pan!orpoùN( (.r..t ind (r,.ó|)(ri rn(r,.rr. j.Iis'lùùerr'hesstrd' (c) Defrnea No dmensioniìììish hb o bùckd(i j) ir ìrscd,irs (Ì.r)sdisryt 05 < r < i+0 5 andr-0.5 < r < r +0.5.Fiììonly (dl Faf.rh of 'hr turrcn!! rnms o 2 rr sedionr 0 r r onp.2 14(2Dhóhid À Nri'rn rr kJ, q, bcrh. bf( ro!i. ad tr,r.,, rhebai\ fof,, îìd ìer borh(... D ùd (",. ,J coir.idè in Lù (q,nmoi) rarc 5yieF eú ir(!r, 4) $d (ri., ) ù. usd èr b,ss. Nob.ho\Ereritu limLllrirrîjld
shw (Erpbiciììy) 'hf ir 'h. útcuhr meoincid.mcsr sùowùru$oos trorr.d robc ùe ae rben ùe ompùi.or is noÌ exd.
.,ld súb'nùdc6)andrhcsurùìaaixr
Clu Sin
(har ubrrùrurs,1ì úd r, bEequaì
ùb!ruqures(,11. . 15)or rmd and(26. . 23)or r d,., whci rìc skfic fom or ùe siEìLùq mqsr 4 h ùsd. mlìdforEdÙ.iie'heflmiry'ìmewhen n 'hf fof I ripìer (r. /r d iÒb. r rcfcrcrceframerhft ú*' b, r ùiprd (4, àr,à) ir ùc q!4. $ch 'ha re
10.4 Bibliographic notes
ù bcroundin tvotfsr (1997)anjcl6 for ( !99r),Fischdd al.oss4). ova|dwórfson brìon is lmn Holm rnd sùìdd os95) a merhodfor co'npùirg rìoribrcrmrein (ùoùnd, hùsdl is dtribed itr shf\ry d ar.(2002d. i rblrì srd sDdú 0ee3b).
Clustering: Combining Local Similarities
sìmihf (ìoaì) sùbshúur\ bcr{eeùso imtl,r $bNnd €s itrrolùsq simiLTsùb sddÙBcnndùebydÙjhngme b. I bn ninÒdirs, sir€ rcr 3u bbjec bo{ùqíicnnudeìyùsdinrhcI
11.1 Compatibility and Consistency
rhc \i. Òbj.41in 'he expbdiE
ohhe ctsrfins
iE FjJs or.flqd,,r.
hcra. is exptajiat itr ChapÈ 10 Finshion, |d i;b, empx,rb'e eed.
jf Ììesùbsruc. nslnsu(n,, rtóunbecotrsisri'Nnh (nI,r),*hj.hú.yùc tufù.oAisrìnror (!r. r, is comparibt. wirh drssbfrucùr (rr. rr, i.e.snnihr ,o x .&u n desFe (rrd .rd.onfRinr bejig dÈpsndenr on rtì! 'icrljod) Fisua I L I
Fts!j arsndi
@ !!J. ,j) ec @Ep!dbb, @d s úe (rL, dr. tu pi! (1j.,t eiù (/r. 3r),nqnhs úr {r,. r, k 3joú. (snF,ùh) vi6 (rr, À,
Nob 'hf -,r/'r./r')
isù biilry rcrniù bcwei crmrds.f.r4ÈzÍ
(lir or clcmcnrt rbnì erh or ù. tvo 3?Éd m'.À4 (Lo.slsimilù subrflduret bc'wei 'rè shcturq ud dH joiù ,h A *.d 'mrchGho!.lusù).onsk6 or tG', ,r(u r ,t(ar
,t(36 ,óll. roùn
sed 0rcbs re joi|ej itrboNis€n
in ,lP"t= t(,r.,,r,)t.aneremnc
djlvrhÀr,ùú(4 rr) n ró, atrnèÌÌ {'ù (rr rr sùe(1,.il) L^orsjmrtrolr,,3r. Gmmcùicb6hins n oie mdhodror
(t subfrudrG) wirhhth soe. oprionarFfinciffL îìis n oncndone
a n Ja r d, = I A , . . . . r ! t . a d
(a6,rr,(as, àt (zr,rro)t, {(a5,bt, (bxfù $anpre,seomeùic hNhnìr.rbc ddeni3 (@ul)ir3)nor depends otrúe !n6'ù.oisifcn.}sumoslhÍ{e (t,-lo-.'ì.ro
," rr.ra.r,..ó.ìk",.) a, bú, (or sed mbbdt car hesrouFd.
HN I dlùsbin-Èpcrfonìcd (lLùsbrii-s ùlgorirbm9l HN n
'nÒcltr{ùs yÒÍd (rhr sÒr
Searching for Seed Matches
10 15a ammd crch cd. andsc.! n oie flom ach rructue. (rhis teps . h!$iis
[email protected] b{ ) G0ondh. h!
tqùiriie equallerFh\. iÍd usiîg disù
11.3 Consistency
T{o dNtn cr c1 (a chrd consistineor one or *wLl seedmdchct can bc joiffd ir 'hey re smisÈd L€ he eìemenh (slbrrocture) rmm / ù dNtr I hc ci. thc dho( eúlcd ri úc sr. wy îÒ ùc dùstrd ù. hcjohùrtrúc
úmisFn *irh (cî. cî). ro dùidc cóisis'cncf {nhù E/adDi bdrvftr dcncn,\ orÌhe eme studurc !rc $ql d raitt
11.3.1 Test for consistency
Ldúc'wodurùsbe c' = lrrl md c: = lo"j,vbúe r. ind 0, aepai6or
Bl rúùirm'ioo
1r). Rdfb!
i! búvc{
È. \'-, ì
oi.rr (4.c)3(rÈ$úrrudú.(rr
ùri & ùo(n,
L. 4 rr)^sinùb$úrtuuÈ(r,.&).6)rnqf D qndhdr Íq ad asii,, 1iN ùr{orùliù
3,)À$\nbrvdi(!i.rr) î81tuft;dùNùs,È,qr nFiù@nùrd cqjo bypù0n56 úìr4
dotre oi 'hc pai6 (rno*i
* ro.or .&r,als
on rd). \rheúù
pùnneed d
ifan [btuùy Fn P. rnd or n (Dsi
andcr= (&.rrL).(1{'6r)',,t
11t.1 . od dsd ., dÌ Pù!. r!€y c@bjded ú rÈ$hbdrcorcdBorl/i,/ù, . ,ir./r,ili, . r L l t s i n r u( @ N K L í ) @Di1i4ú(rr, rj, .. Bj^.Bt, 81......4.). ro'hesuÈ1rudurc r trsitirity hoid\, ir is cioùsh ro conpm, ror dampÌ., (1, . rj,) vidì (!l . 3r ). Forr sinpleillusFnonNc c 1 7 , 5 , 4 , 3 , 3 Ì r{n5d, 3e, , a l , a d ' h e 7 rmm ùre6ni setand5 rrcm rhescond úrìsil ùis. bú ru' 3 rsr rhÈfi^r rnd 9 noó ùÒ$.ond Hcnr ro croùgbrocr Ro pails(ho* ndy depeidson'hecon'e* (se EreErc 2) a 2 cr îid Cr !rc looldl uponrs slo doie on 'hec 'No unrs (kno{n ó rroaar.turentr cnÈta) wiù turÒF narioo.globaìcriEnr arcorb q€n Subrhdue\ (Òft rlm eeh shdt)
11.3.2 Overlapping clusters
rr c := {(A:,,r) (ii, 3L))rid c, := {(r,, ,t, (,.{r.,t1. rtes iNocì6re6 virh r, m cr.burqirì R, ù cr. ^ dú'bejo']'ed.sirc 11ise,ruiyar4ccd
11.4 Clustering Algorithms
11.4.1 Linear clustering
e := ate sehúed s.ú ttutch it lx t uti:nú
úìth e ft.n join sm h e
nncb $hi.h ,b€r 16, Gors hshst
ro oie or úc .lsr^
r lÒùed ro
ù4rè a .ar4t
ttTkt Òtatn rcà na
tììtt !t?d nìarh t t)||hhh4Ecùri:;knt Md.h\tllt:: to aR oî thè .hrùf tc ) jdì tù n, c: rehÒ'! yù f,ù r\tít no ùù. |èet Daú6
.5 (
ttt wr a.î}eld Dar:tgs
aE jÒiLà ta eltetth
11.4.2 Hierarchical clustering
ln aggìonùúivc hi.Efchicrl dùlÙi hd rc no{ \imjtù cqne tusìÉr) is jojncd ùe dùús
beirg i,ì84
bur rccofdiig
11.5,1 c
Let S(c) be the score of a cluster c, and S(c_i, c_j) the score of grouping c_i and c_j. Define
join(c_i, c_j) := cons(c_i, c_j) and (S(c_i, c_j) > max(S(c_i), S(c_j))),
i.e. join(c_i, c_j) is true if c_i and c_j are consistent and the score increases by joining them.
C := set of all clusters, each seed match being a cluster
P := {(c_i, c_j) | c_i ∈ C and c_j ∈ C and join(c_i, c_j)}
...
(c_i, c_j) := the highest scoring pair in P
...
C := (C - {c_i, c_j}) ∪ {c_k}
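The agglomerative procedure can be sketched in a few lines. This is a simplified illustration and not the algorithm as printed: each seed match starts as its own cluster, and in every round the highest scoring pair of clusters that is consistent and whose union scores higher than either member is joined. The consistency test and the score are passed in as functions, since they depend on the method; all names are hypothetical.

```python
from itertools import combinations


def hierarchical_clustering(seeds, consistent, score):
    """Join clusters greedily until no joinable pair is left."""
    clusters = [frozenset([s]) for s in seeds]      # each seed match is a cluster
    while True:
        best = None
        for ci, cj in combinations(clusters, 2):
            joined = ci | cj
            # join condition: consistent, and the score increases by joining
            if consistent(ci, cj) and score(joined) > max(score(ci), score(cj)):
                if best is None or score(joined) > score(best[2]):
                    best = (ci, cj, joined)
        if best is None:
            return clusters
        ci, cj, joined = best
        clusters = [c for c in clusters if c not in (ci, cj)] + [joined]


# Toy usage: a seed match is a pair (i, j) of positions in structures A and B;
# two clusters are called consistent here if all their pairs share one offset.
def same_offset(a, b):
    return len({i - j for i, j in a | b}) == 1


result = hierarchical_clustering([(1, 1), (2, 2), (3, 3), (5, 9)], same_offset, len)
```

Recomputing all pairs in every round keeps the sketch short; a practical implementation would update the set of joinable pairs incrementally.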
11.5 Clustering by Use of Transformations
cúfcnD3 hy us or tunsfomdionsc (@Fjgre ll.7). {.d mdchcs ì sed nn.h consisbor a koishrctro lir orcomparibre ssEs.c.e.l(nJ 4xÀr, 3rì..yp., y rolndby !sin! rhefchrions spoidirgrohs (cd) r drMincd (Lvhth 3n6 úc \opqsriú). îhe r66ibm*ion is rhei foùid Òm{r\ rhcme giviigmìrinumRMSD,
= (rl. sl) úd .: = (s4.q) ùù b.erctrped. !c mur 11.5.1 Compàring transtormtioB decideìf 'rreyd siùrlf mdgh b jùs'i't b.comprcd.auodncrm,nvricnd$dsùiò.(r9qr),,;r,wopforii. fdp.nd y
o rù Es ù5 qd b erù! afdG rÉ,$ o R\uùA {hri {! r@ù b hrc aùFLi Ftumf.r
rE irue Íi rdÈ ds1{i !rÈ\
fú{ooidioqÍ!f $È@\hg(a
lie rìoqi (ir. rr ) tu (,l,. ,r.
dùcnmlfomrioijseAútdb'vifld aid saodù(199r).^s iated itr cnip
at iìarics (n, ) qnh rhEinEEe ot rheorhd
d quaúiryiren\edepf'ùr fon rÈeodrnúrdxu) i'r rfi\ drúrcn\ùuùg
sorlr Òrc,iÒilrwÒ*c úr , = 0 iI À and Ahsr.rc' rL (ree6) d.ribc r íu'iÒ (lìc rhrcJìold úb a rlreslìoìd ùrd n 0i).
lllir.rr)(,4 , r0(.1:.rr(.1,. rj)l :nnlnr. ri). ri: = h
s,rEarbe'4r = rtrr/r.3r.7iL = ,t, î; = M(ar. 3r. h,r ÒnÈrc ùc* 31qtuf'hcsanPreft rertu berbedste
ftunxr o(rr,nrruft4odúE!Èio^e'{qúcuGGloopd 'b)('t ,t i rerd. @dú€ Ecbs k.{{tr rh..orè e ampd forrhebsr sùpeQosirioi of (rL. a L), 4r .hcmgtefor (rr, ,n, úd 4i1 sd ól' bcanarogoùsry de6ned.we iiid úr ó Lisinì b ó:, sd 0l L is simirù e c.. rhfefoE. ir mjghrberhf (1r,r') crn hc dunc((| k, (/r. rr. ùìd (a,.rl) 'Ò (Àr. at vc rhorcrnrc nccdro conpm ónpeddn4drssh6îinFisurc |.3.(À,RL)ir.onsife wirht,4:,rr.bùr dic 'Ùù* òd{m! ùèi' enEs (dNhedliis) ft difcfor rheEroa, we pùróh úe rcúrions(routrdro be simro oi rh'.obhned sub d in Figue ll.9 TheombircJ sùbrhcrtrÉ (31. ,r) is foÌaEdrir G). rhe bcr a É r o m d ' o b c l i m, a n d ( . { r . , ' ) c a i b € d 6 6 d b ( n , . , , ( ( ! { r , I t i s s i n i t ù îiÒ smc n donùfÒf(3i. 3î (se Fietrrc 11qb))ìi' n óbd ójÌ. ed úe rbs djr ad ril úÌpdd. rh.r xrcd round'o besiniìù, 'bercfm (r r. 31)ùmor b. crNÌeadro(,r:. 4) ((À'. Àr isiÒ' siEiù h (rr. ,t} Gs rhesumordres!ùGsoiihedifldcnùdoi3ochàis),andr'hrcsholddefrned
compad'erheÙulonm'ioisby'i.; ibdwiyÒrr {200r)md$rc ùLcqr4<
11.5,3 6 0reJo!r. Elchdufùc, dcrìi.s ùhi.Iofnùon I forrhesourcerfÌ /1, bedre trisîpdnd b 'he€r odoms fÍom'hesouE otrefbm rrd o'bcf)ir PP shi.h @ s
i i ! r i f , r r . 1 r e d h ' ù e b d \ e n L m d î L i s r h e n d e n n e d a t+r Jt r-lr-rr r i
asnnÈùo ofrhedusesaÍecj = l(7,r) (e,5)(lr, 7)(la,l0)(r7,r:r)| od c1 = (14.rD(17,E)(22,17)(25,2r1.trhc 'bdlh..r = (7.3)(e,tO3.7)(la.loxl7. l3)(21.L6)(22. Ì?X25.2DQ6.22) = rl) Ld thc cm$ndins hr tnr î* bc ar 119.:3)(30.2a)1. IO.5)(rr.6)(r7. (2r.r6)(2u, r7)(25,2r)06,22)(ll,27l(r5.23)). TiìerthedirùeF beù.en?; 'hnqnnu|yúGúI'ÒrbeinedNtfd uì'i'g fmmFiìiog qcb paúofds' dar (zujr). whcr rhcrwùrlNIomúúotrs sjvetrrhsùold (D rheinGBr r-r À), {hr
tlì€ rc* traEfomatiotr
ftn ú considkdd ú b. &ri'iÒ.onsumins.
Dpsi'iÓd,*peiyvhei'leaú€dúe loùdindc syr.r is rd ùwry rromrlìcnr$ ftNrc TaUie ùe denle mighrrc$x
0 Ìbe corcspoìdiig rom ,, herce ,Dad 0 fc ùefed s6 is xnodùed sr or'cuftnf cìufùs GrctìdRàbsd by cr4) ?.lonsuìlylhetJxfomrÙÒnlù\upeDosilionof dr crc'iÒrr n P or 'he eteneft ir c
rhebìghs' sùf ofjóini'!
my r*o dN'd
ir c
jo,i(cr.c, ioinso cìrrùs e - 4n Ícd Pots ks nnNbtÒhd!a!^) r := tht î@ttr,tutkxtù ltQ dntu^ ú e 3t := th! paîs aJaai:knt ú^rè^ ú e, tu rre ro?s iîj.úhs th!' tc 4, c B) := tt ! q'úì:îeú pan h a úL hish5t rok s,tua = r/ú !o,( ol joìn(cro. crr) únc (crc, cRs)+ rand sorc(ioin(cro,.Èir > s!,ù do r,,.r = $oft (join(cra,c{r)) c,,:=join(c/ ú. tined d6Er îú = 'he totr.fÒntuhulJo cÌ) e:= e - ctoúcRs)úcir f =r lrPorrkt)vrif Òrnrèú rùh cll liùauc.e,,hkrtan oú ùhút.ú. 'he rùa iîjokìÌ3 k41 dlaDs! k b] Eù'ovitg al dato îù c P
qúttkrcr d.,tua,t rùi,tt ht cú (c,a. cRs) := th" caúbqú pdit iÌ 3t túh hish1t r&
crr ì,corr.cfrHl....t.wrìeÉcrLF reans rheprr(,4r,rr.
,""b,ù+";bL.h*111".*" d*t"
11.6 Clustering by Use of Relations
('1r' 3) xod (an 31)r
ir p (a' '1*) ;t",1"" p ls nsd rhe pii^ Ùc consùterÌ
!'iemll'l0AiqrnPÈsshBÙ{ duiù (À,.rr(ir.3rì úq 1nùrhc rini Èc dd pt,1,n,) : 2(rj.rr) 6!r ùùr 'hr (^,.À) isoq\nborpii (rr. rr. h{ùr }rù r(rr a.) l prrr4lh5 ll.6,l
How many relations to compare?
AsìiE wchrvcrso.rtrrd! cr = (4 ta. .., p_t'\'i q = @ t. 01. .., e, t, vhùe each(r,,. 0r) js a .ÒúPÍjbÈ P ìdr oi rhuiòm of rhc€lariotrtomùh p (ur
11.6.2 Geometric relation
gi€nl*oskks,bolúsyPÙam4fs bc.ver rhcm?crh s ,neiN crìùìeìig r
an qampre is rir.i
rron Ahuùrc!
and Fislhr I t996), *bùe oie .igrr aod
rh! urstcd b'wùcn LhcA$ (Nhcnl,m.leded oDa pùrlld plrm) uid rbr nioimum (d'b) rtrd núxinùn (r.q) dúnc{ ronì ÌLedirion (*! Flgúrs rr I rG) For No piin orco,nFribr. ssE\ tu b
. IrrI,r,ril < r,ùuryr=50.: . , i ' i , r i ù < É , s h Ò tru. .f t x q m r c . r 5 À ì
G)ftc FBnds or2.!n
ù{ rrej{cd q ! 6ùd pe!ùd
[email protected]
(Ùs isd{]y\ Dcsrhh)fhe ssÈ ee
0ùfdhL rid > rjiri + .. rlli < ,iù + É rre6)byFÀnsiorof$icyL .
qNhid n inùshrd in Fiem 1l ll(b).
11.6.3 Distance relation
dilgonak(idm sidùe dn'ancs). ro d
rdineúcdùncnh(iitrrcsiduedisbo*t. ftsgaEù!rc|ljos'InfieÙf1t,1, lnb ,j. ro sc ir (r,, ,) h cotriscDrwirh(,11.tD, wemur .oúpft 're dfba ro chrk whdhcf(ar. r, is coNiscor{nh (rt,4). we nNt @npf úÈ trnh ',. ln rùeligurerh$e e d$ E
CLUSITRINCtrY USEOFRELAT1ONS cr = l(,4,,'rt1r, &)ì,ndc: = ((an rr(n,.4)l.canrhe* bc.o'ibin.dlve coNkbr. Hd n i, sfriqenrrochc.kco6insncy beNeen {n,. r, od (,1r.4t ufedÉtÙcdrownNlhcmpl]pltnslf Fisufet t.r2(b) showspÒ{brc rrudrrcs fton r[c dnbne mun.s ir (x).NÒrùrh4 rructureshr* djncrcd 'opoìocies.(F or Hornandsrder099lb))
sd.rirudinEquaiioi(r0.2)(ch!p6r0.2l)coìbÒtr{d.NdcrhdrÉdifrdi.ds ohheduù (ni.3j). (A1.3d h rhft rùciih dishnes(11,. !, 1rj. B, and(ir. a{), andrhedirief'* dìeidrefdi$nes1, (Àj.3,
D^Lr (Hormrtrd sddtr rs93b)dilids rhesh.tuÉs ine,(o\erìrppitrg)ù,i. ùc{ir|b, (,, ! + D: oed,ppinsdiro* (10.2). ùsig rhc{ono!in Equarion sì îc urd loi rdùcirgir simitirdkra ofnsidrj somcb.hnic'l opemriois or morc'h.n sir Bìdùè p!ì6 rhe , (- 100)bc{ *ed, !E rhenchoaDror oid ÈipxDir (cr6ErÌDgcydo. afiú ún .(", rhe,i (, = 10)b*r dsus re dìosen addis doie bya Monc crkr Érch GimùìÍcd rnncrjns).tn rhisrep dÉ se.d10bc ÒclNrcr Theovesn,pprÈch is ilusnrd ii
11.6.4 Use of graph theory
úorcjs aÍ edseb€ser rod* (ir.1r) if 'lÈ rtdioi,{rr./1})
ft simìh (graphnomorytusm)aodha\e the hishe{ (!rÍanry)
$oÈ rb.h cú e
q*| The(ioddpodur gmph oNruqLdìryq{uus! rúich$\'eii aleùfirnn5
h$mcmdhodsGÈ.r'Òrc&mpìe.Kocbúdtee6)m ?4! piodu.rgEphisusd.
Grd b FrefalÌze iioiì cor\n'!r?ior
foF - t,_,/(n,.4). trr. rr). whùe / n ! masr of dissinianry. rhàì s.hh rÒ,rhciìichsr fo;na cru:86À dùq
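The graph formulation can be illustrated with a small sketch. We assume, for illustration only, that the relation between two elements of a structure is a single number (for example a distance), that a node (i, j) pairs element i of A with element j of B, and that two nodes are connected when the corresponding relations agree within a tolerance. A largest set of mutually consistent pairs is then a maximum clique. The clique search below is the basic Bron-Kerbosch recursion, chosen here for brevity; it is not claimed to be the procedure used by any of the methods cited, and all names are hypothetical.

```python
def product_graph(rel_a, rel_b, tol):
    """Adjacency sets of the graph whose nodes are pairs (i, j)."""
    nodes = [(i, j) for i in range(len(rel_a)) for j in range(len(rel_b))]
    adj = {n: set() for n in nodes}
    for (i, j) in nodes:
        for (k, l) in nodes:
            # edge: distinct elements on both sides and similar relations
            if i != k and j != l and abs(rel_a[i][k] - rel_b[j][l]) <= tol:
                adj[(i, j)].add((k, l))
    return adj


def max_clique(adj):
    """Largest clique found by Bron-Kerbosch (exponential in the worst case)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:               # r is a maximal clique
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Toy usage: three elements of A and four of B, with pairwise distances.
rel_a = [[0, 3, 7], [3, 0, 4], [7, 4, 0]]
rel_b = [[0, 3, 7, 20], [3, 0, 4, 17], [7, 4, 0, 13], [20, 17, 13, 0]]
clique = max_clique(product_graph(rel_a, rel_b, tol=0.5))
```

In the toy data the first three elements of B have the same mutual distances as the elements of A, so the clique pairs them in order.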
11.7 Refinement
. I iìfur djurmeú
or rcnidh.d
milhr he
11.8 Exercises
limgeo|,j,ÚdihÍ1}issiiìjlf b&
f{ 1./ a 1 1 . 1 r . ( 1 r . ! r . (aat .: ( a r À 1 . ( a r , À n . ( r 1 . r r . ( i r . r r ( 3 : , r r , ( 3 r , r r $ 1 6rt rrc @Ntuhr Ù bdngde.s r 'tu . b:úrrn..
( i : 3 r , d : ( l r a L ) , ri ( ! r r t . / i ( h r L ) . 3 : ( ^ 1 r r , r : { , 1 r r J ,v h { e
rh5nu'!'hí(^rtj) \coRtrèd!,ù
^a\ aj\ - pta\.R ),pta!,a3t- p\RL 3n,r('r(r1 \ùrb.daAd .,, /..+,r1i,hrd iIi:/i,(ll3]3,
,:) îEebE, ùee
o *amine iÍhe dusb (rr, ,r) .$ be stuùpedwirh (1,. ,,
(a€ on$srnr)
l n d c r= 1 0 , . . O , r , 14.... P,.a
G .oNanr runbs) or ùc pri( ii .r. rhcnr, is .oiskteù niÌh all pai6 ù
G) rah p3ii or sàis shoddnw bed
usen\is llb\,a5Jlb1,o)tbt,4Jtbh,à6lth t.lrt ainor bc eroup€d. (b) NN r rùrfonn iÒnrof och ot to crlmjne whichor úe *d prìÀ (ffon (a))rhd e consisrn. Forrnc so seds(,,1J. 'j) òd (ar. &)!ocon aE apprc:dnire\ eqrr ir 'hc ,ri'rrri (c) compùc ,nÒdùs6 vlh Lhedìrgum youdrtw itr Ermì$ I Òrchap
spm.,andcoNi'lcfssEs 1.. rhcmh e diEdion.dd bè\ùrls. a: ,'(c): (4.3) (4,3).nfr)r (6 3) - 00,?), nr(a): (3,t) - (7,7.1),n5(É):( - (5.5.s), ,{nz): (5.s,4.5)
01. 7'
B:Rns):(5.5, 4J t5,e),tjltp):t1.,r.r) (rr,o, rlr): (e.2)- (12. 5), (3.r,4.5) ,jlo)i 0 5.9).rnl): (0.5, ù - G.?). (a) ùaw úressEsitr a di€rrn. (b) rro ssEsre conìpribb itui0y rd rc) as rcLúons. shrr tr$ rìe dnare (d , dt. (r'e ieÌorc aeks. romk n c4y ) FindrherclarioGbe'v.etr Èrh prn or ssEsfor erh frucrùr. 10oic d&imrr prÌc'. (rf yoù{ish. (d) w. $rlì sy 'hú 'so rcÌ iónsh or drecoùìÉdblePiifs nu in (b).Yo G) Exrnire ir ay pRin or úÈ fir q
11.9 Bibliographic notes
@dnve' aì.(leer),arexiDdmv ardFiscbù(rqeó), Fi$hr !r rr. (tees.reel).
Àeù€r-er^rl. 1lee6).Ptuù mdaylcho( teeù, veòibkycr d. oeeer,cnrdtey yl''gL.|j'dcore)'K.yqrglJ RNser(r 993).DALÌ h dcKnbedjn r{oìn ùd saider(1993b).orhern,qnoose
ibitity n b dchnerhec;rch s@a u | $. or
merhods aE io Dicdon.hs( lgsJ). Falicovxrd cohetro 99ó).nd HÒlnad sùdù h q E d Ò ] , . , 3 ' r ù n, ù , , d ; ! u ú q src xìt{,rifjcîl (chrpbfs.7.1). ^ úmmm y r|t er of tuns vnhin rhe.ùùi, atorg
berorrÒnsn {.h rù. andîre ssqdú s.db.! byBsgtcy ad aìr,iri ( t995)ùd
Significance and Assessment of Structure Comparisons
. Diftd rìignmcnbiE rchi.vcddcpendins qNhichbio|o3ia|FPeniesedEla'ioNreenphsizdnìúèÙnp'jsúThìs n n eÀoiabl. rosqurc 6d lnìqucaì TteskúghrfoNaldapmehloru*à\
rnbúìon. nndin addnion.orelÌ shap€(ìnets &r d,r, s.oidaJy rddm .or'Èr' xndÈckùs derlig. a5 a qreFe qamptq imúE 'b hepr'eiJ's b€i's .onpùe Éldom *t ofshrùRs collainsonly, búakoinc|udeEpEsenhlivesofiypj iu6 (noeL) shoddbecaÌculared for qch onpdì$n b nd.h rhenomdom
Constructing Random Structure Models
risuEì2.r îD 'odùi ùiu!, {d d ' doi di.É h\ ÉhLis h ù! ùN ùind by(ar.dr+r.dr-,
s ur 6cd. Tnc rimpk! pMedurck . sí si'l8. px.kinsir ravóuEd(o rrarrhcsructuE rcnlirso,mptrùîìcìqEh irar$prfned.
12.1.1 Use of distance geometry
,,!{ar
(4,ódi ind Ta}ror199.1)k a pbsrn fof.orxh4ii!
pfÒrciod.!dì: in
.Ii.hfomnnodelìed^xspherc e dons is sr b 3i À. . Thed'rind dnbìce bdrft! nvo(iri vrc$sve) aetu À drn, = 2,iùv by rm (vinurì) rtr8ìesror erchresdù. òc bon,r dgrcp didrh.b*ioo ssr( e ù! c4c (oi+r,or+' isrhean,sre rhd
(+04)i / = 2d1úr,/2r
(sondims crÈd .d(oys.) cf, r,€senenÈd oflhcmrìnPmPcnÈsoflherilive![.fut.Dbercbireddudigújs.hhgirg,
12.2 Use of Structure Databases
súucors (q) riú I tundrDdad
ddrbr*. rtrd ny ù Fmow (of ictroE)
12.2.1 Constructing nonredundant subsets
lunns úd no rvo rDcùÉ\
havc sequeie
elrucror ? is (sbJú $d s.hncid.r r99r)
r ( , , . , )= 2 e o . r ( l ( + , ,))',56r, úea n ad ! m ùeleùernorrreshcrùrs, !D,rsquene i&iriq n $mpud rrcm
dBss ror (.oish.dtrs) ronrtdùdùr sùb Jd$men.globa|]ricms'b'rlow
12.2.2 Demarcation line for similarity
\fr ùi RMSDlrùcisù$d*arofc.
Es (N) atrd i cùfle drawnîo spdre he' (ronhÒ,iolÒgÒut &o,n trdoN xiórù1i6 Typi.rùx ùc RMSD ù.moLogÒut onhomologotrsshcMs (or rnetrib) tue f, rì,r cÀamrr. 999i of ùc fruc\ frrr aho\c
r rom( + r(N + t'/'. whùcr is1.2ù3.
12.3 Reversing the Protein Chain
ù (ìnpbud16 i- ,,
r ùlde ùe robc n1rùùe i$f(lr Esidwr {úarRNrsDd0r dhù 3rd À (r rcr40 Éldùr Nd 'h! ph}@cyuùÍinry ocri5
(rhc L s !rc dìfriùL' b nahhin i Gndonnylenflrd Lofismysp$iflcovenllsì'ni|aìtyl
shdorcs ) wrrd wil be
iy odrnabLtùlLnearingÉ
coînkroqs bc'wn rhnÀ in,I
I lne d'h.ln (rnrl $@'( !c \(Le).
and lhc hard {,f
12.4 Randomized Alignment Models
ùniqucrssol rbex$w.i an advrnùs
u\onlb ealigúìèr|sfo.a Fúro|enicns an b€sdpld
P dli'elh$c j,|Ùdùs by
rp boùidàryonib ìoser.dgÒ(rayb, r999a). Thh!\redfÒetpi$'helimil,i|IlAile
i sndis RMSDamd b. rÒ
renr (rùr 's, oic *iù a to*, if Nr rìlimrl
RMSD) thùc n .hq i eood.hà.c úr
of ir onî m;Ds iircrion ( \uùìiiF rhd rhero{ kÒa b$n
12.5 Assessing Comparison and Scoring Methods
To b{ ddduÉl (r seqùúrid) corgei d!ùbè* (rc! chxPd 11) h!! bmome dd tuidiònar sintirìry). ìPDBì
Thcn erch shduc
ù PDB ùcrsiig rmr
(€qùid.g Lrìdùe sofe msr
be codrfe rd eeh 'lùèfy), xnd ùE a beNed'fol*mplgbystrrili'y/spÙ|n.i'ydirFams (Bieúf
r àl t9s3).'ahc pxi6 Ér th dsl,.hB:hoìdTforx$umedhomology poinÌ 'o b€ 'he *hse rhe !ùrnbùor útrhoEdosou\ pans(ds pos.iB) oosi'iv*) posilirù\pùqDy(.rrhdcm^pùquùry€PQr)*0.0t
ùÒùrd bc.qúr'ÒrPQ.(0.0r) d r. rnu urctrrab 'hc sn\i'iù'y (hc ftldio! oa ùd honoìoss rhd halr , vitE nbo€ dÈ dìrcshord)ad 'he dìtrftme beMÈi rhÈ, úruc md 6e EPa. one en or coune do thi, ror difieEn' uìus ofEPo.
600 ifuufcs $oine syrenú) 'o p€rromì .ll by sll o,
1ve us ùo pmgrims (ad
Plhabsrspe.ificityalúìispoirl,lir. scoPi5 dso 6cd À a sbf,dd for ass
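The assessment described in this section can be sketched as follows. We assume that every compared pair carries a score (higher meaning more similar) and a flag telling whether the pair is regarded as homologous by the chosen standard, and that the errors per query (EPQ) is the number of nonhomologous pairs scoring above the threshold divided by the number of queries. The sensitivity is then the fraction of homologous pairs found before the allowed number of errors is exceeded. The function name is hypothetical and ties between scores are ignored.

```python
def sensitivity_at_epq(pairs, n_queries, epq):
    """pairs: iterable of (score, is_homologous); returns the sensitivity."""
    ranked = sorted(pairs, key=lambda p: p[0], reverse=True)
    total_true = sum(1 for _, hom in ranked if hom)
    allowed_errors = epq * n_queries       # false positives tolerated in total
    true_found = false_found = 0
    for _, hom in ranked:                  # lower the threshold pair by pair
        if hom:
            true_found += 1
        else:
            if false_found + 1 > allowed_errors:
                break                      # next error would exceed the EPQ
            false_found += 1
    return true_found / total_true if total_true else 0.0


# Toy usage with six scored pairs from ten queries.
pairs = [(9, True), (8, True), (7, False), (6, True), (5, False), (4, True)]
at_strict = sensitivity_at_epq(pairs, n_queries=10, epq=0.01)
at_loose = sensitivity_at_epq(pairs, n_queries=10, epq=0.1)
```

Repeating the calculation for a range of EPQ values gives a curve from which scoring methods can be compared.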
12.6 Is RMSD Suitable for Scoring?
rxúryì'lolÙmpìqgidEsmlbe h rh,i 'ìof m rìc corc (bú Noghb nigh' be uscdin rhecilcdarion). n dos noI (for rtcoÈnt)
peniìize ci
showtrrtúr ihc óúù sonig ryrèns suajty
12.7 Scoring and Biological Significance
othd nrocture (or vith a rpacibri. sledìotr), 'bei sors vin rrl1
of rhe we
which soB
rdr obviotrdy lorrftd
pain. Brwen
ftÀ ÈobÈn
ffre'nci* Òftqùd$ (è lbovc). olhss, $ch s 'he DALr nÉdìod(chapd r t), hrve doptd , siùi|ù,Pptrfh bNd o
12.8 Exercises
uk thispmgún (orùo'bcr) rd.oi shdstr.h.ùb*s r'ormlnmunl€rce.
G) uE 'hn b edimderbenunìbdor (t) cln yourropN n eÌplrndìonofr biÒrÒsì.rrD.Nredsetom jr, or n rbeE dabasofs000sddÙsAnirlby xrr.ompùiiÒnh dorc, 'nd ùè p!4 or srudús mdsed in a tin $ftd m iicFaine $oE Gsùn're dú 'r. lÒs rms bcr) cc lir ìs w,*ed doNn
12.9 Bibliographic notes
rdiscù$ediicodrikO996) ailryrisofúd (|eea) d BeÌamou n rnd skolnck c00 r ) Rudom ruùnì mo.elsft disusod by rayror (Lee7a).Dnson is dsdib.n ii Avódi rryrù ttee4). asssmeft "ld d.$ribedin Brùù ersl (tse8),cc^bù ald Levin (les3). Levin 3ndc.n'cì! 09e3) od srrk d ii (200r).compùirc ùic ir L.vtu andccrrcii 0993) aid fnytor
some*eb sddrts* rd (rÒlshdins) hrpJ/lffi.rccc..duÍ*àrcb/Lùbs/dùihÍllJ!0irft v j]' VAST,trpdb,h'nl{dda h'rpr v"w..bcjprp$irhNbse/rìo
Multiple Structure Comparison
'lìc
ror doiis ùuuidc rruduÈ conìpùisoù is ùe $mc * 'n! doins motiipìe
r. coshrc' nùr'ùrrcùrigí,rcm\ dorùllfsqpúoithefrucùainrbeinr
r. Dncowc'mnon rúirpaftns Gddm norìft - a smdìefpartofn\e$nrc
a lùnì poinicourdaìsob€itrdrded:Endinsrupù*.o
ù] nÒúli Gonriruc! or
Thegrppmahsftstmelyrc|xl{w n .od al$. blviq founda conmonI)drcm. 'qffrudinexnd'iple.lìgnftr \ihen dislusi,r (xnddevelopins) ùeúod
j.r $quo@ qrmPùisoisre eirended.
dìmr ore rnduF is bisÀ (pivoo.i d 1..t,{',....,.1-tbcrhciNdutu\.andA' (lfhout thc ba5n Fi*, rheceniEsd maf of arLrhcsúu.b( be ro$ orssrc.niry) sùp.Do*4 só 'nd dÒ rdr.\ m I aÍù fhcn br aí dúde 'he;rhúord'de sd oish.turc ar 4d 're ùdsrrdon (denned hy rhrcèvùlù.r.îh.irorerh rrudù
II,',ot;r: -.,t' nr ol coqdùab s' i Òrrhdu. i sÚÚm lciehr (rr){idrposirion sciebr(1,).
T:i-\ at)R;^1
ri= q,
vhde .R; h rheop'tmaì Fùrion mrn
p:=at^, lrp'pov
fof drcctuE ,1/
= oE aJtrt rnanftî rr ntuc'uE' ù ^'t'. nndbs tt'. nùtb'
@rùù' lù |.' u|ú+^ ùntir RMSD(/,'.
Ll=t ùi R:Pai
' . K or n6iùút\ )
ttDhù aJqcts pefanìd
ii I siddbms
13.2 Progressive Structure Alignment
e 6 nùì,iprefnctuE aitimm' n b Ne om bedecmìjnedbyfu!qìculdiJ'edìb a{xnstùelyhiRdbw,rsL.ÌPi nerholsm *cnde'l Geech|'p'err).r
uiaìe or a coDsenus squence(of prchre), rorBc sh.n riieriig (súbsollisrn.nt ssa?(sa?cmalsobeùsd)(cbipde),itrd dnMUufAL(Tayìùdar 1994). FjA! ùrh p.]l oishdurcs iscompftd (ùsiE sAPorssAr) toass sI pÀinisesidkiitis 4d rhènùù p,irui$ sìúÎlfìriA b4v hd rtu ndd siril,r oI úd* r *ch shsÈ(re prcersiotr .ricmeú). Noc úii 'h. oI mÒrsinìlù Gùbsèo,ligrdhb àligicd rifd. AlsÒnlim r 3 2 'nÒpfe(w.
P r c s 6 s n e i i s ù s Ò r ù c s h d ù s 1 1 i . r.2, r. ' 1
rori := I ro,, doLr:= n! I/,t cnd tof ?uthpÌn e . c\ e u doatiet lc) ,c^) îonn,ltheîae úd t.tlct , c, ) hc thepf
n1Ù \ih
ù3h4t !ar!
r/ =(t-./ - c)uc,
lì,à ttE rÒE ,t tlienih{ lc. ct)
ù€DtìrÈleb^ (di\lù.Ò
li ordù ro dain
bdNcc! irre É.crbois
usqrs Òísù$s. rd !;i be'her&n rriemd úo'$ bmiH
(,' is ùreNmberor
r:a t* =; Ltîi
,_, nclns b|xik)]
rl rj
o r lcngrh ol livc Dosiriois Ld dìc sùond v<'os (vedo6 rmm Ìhe scond P$niú b
( 07 , 23 , 2 r )
-6.D (5.r. (r.6.2.5. lj.:1,3.7,5.6)
!l ij
( r . ó . 2r , 5 ? ) ( 4 9 , r . 7 .5 . 4 )
sS^Pi 6.d rÒrxlcllxbr sìniLnty lrrec m'on (h.n.c ! prirwi{ s.onnes.hcmcn usù]) rhÒs.Òringdcp.ids on FÒfú. rrw rdd s.Òfingfrdfir u s (stuargofiúm 9.2),rìc sofing bcrson ù.
Nbccd ùd,m.oishtrrs,14!
= (h
rr: ùd ù is h òlcnlL.onsùd {cighr. ehiehlad$orinem' x (se chapÌefe).
13.2.2 Construction of consensus
({hich c
ouBidedìe sope or ùis bNk)
i 'hc dis'rner ofihe con*sN 'o tìe(ìstrofiic rhcdi€dioÌ d is dkhnce),/(.,. q) = rl, àG,, a) = 16,,r1.r.., = re. Tnisis = t2.5, 'trfo'is!Òf ighrÌchi4r(r.., rGÌ..r) - r5,r(cj.!!) = r?,lhrchnposible.
13.3 Finding a Common Core from a Multiple Alignment
grps itr dÈ iìigmHr
(Ior.raùpìe, Lr.sùib.d í s..liÒn rI |). rlEn @h iodufc bevÍiabilìt}hlheposidorsollhÈíorsfc
rhèmrùÙu wiùù, gip h,he murripìedjErem
tbÈhold îor !ùtubility or ù. dÒ'i pÒsiriom p := a: G| )= nE cotun[ af ti aL := ttt dd$ ú !tueÈ at eumc ú c' c ! ,= ttE @q4. !rctur qfa| \aít fo|echsùb:fucfure^],dr 7ì sùrut'di1e ta,p,c ?) na,t rh! bnio.tutiù tu6km At Bùsrl Ior eaehúhnn ù M do ú|1'kie th. w.ùbitit, Òfth?pditk^ Òlthedans o | 1- rhè.Òtttn in ti th.rt thèvtidbitit, < v aJq-d6 ptom.d
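The selection of core columns can be sketched as below. The sketch assumes that the structures have already been superposed onto a common frame and that the alignment is given as one list of coordinates per structure, with None marking a gap. As a stand-in for the variability measure we use the root mean square distance of the aligned atoms to their centroid; this choice and the names are assumptions made for illustration.

```python
import math


def core_columns(aligned, v_max):
    """Columns without gaps whose positional variability is below v_max."""
    core = []
    for c in range(len(aligned[0])):
        column = [structure[c] for structure in aligned]
        if any(p is None for p in column):
            continue                              # a gap excludes the column
        n = len(column)
        centroid = tuple(sum(p[d] for p in column) / n for d in range(3))
        msd = sum(sum((p[d] - centroid[d]) ** 2 for d in range(3))
                  for p in column) / n
        if math.sqrt(msd) < v_max:                # variability below threshold
            core.append(c)
    return core


# Toy usage: three superposed structures, three alignment columns.
aligned = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.0, 0.1, 0.0), None],
    [(0.0, 0.1, 0.0), (4.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
core = core_columns(aligned, v_max=1.0)
```

In practice the superposition and the selection can be iterated, since removing variable columns changes the optimal superposition.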
13.4 Discovering Common Cores
An in.ui!!. wàyor èr@ndìrgúc pri 'o laqer nulriple c16k. Nine r senùil .lDsÈidg pmcdùre (s chap'erI l). rn DgnÙì'iplecìsÈnig,sP*idlyúcbc roton.ilmighlb.wonhv|nleEpkciJE'àe dscribe lor nndinsonmon coa MÙS'ra (kibÒeiu d ,1.2001).îìe geomdnc chosnasa ÈtsE .g. 'heorh&sùù .ar
ìpleFd nr.h c l nDlde lrigdúúl ol (seomdfnal 6suEs, om ftom ea.h sdda) r qium^. îc 4 $brrudù6 dcrìn.J by ùe doN ii rrìcdigrmd ùc lppÒifrrtrry \inrxf (.oisruùnr.
so!rcc n ùÒ Pójcctron ot a nlrriDr .nce. The rc! î. ror ercL Go'me,rr*nce). is r ct or piir{is d úrchÈs. prisis clusedie pRÌdue (rsi',c d.'nùdscxpllii.d ii chlpb rr)b rî 3. Mlte nolriple sùbdigmer6 (piopoirs tn! órcr) Èon múu, q$ircí ÈúNise$biìigmfltfronaìl'he$úG'
13.4.1 Finding the multiple seed matches
ror rhe pMmrer i (dìe lenlth or the Fd mfchlt rhis trirl he à .orpfùEnc bLrlEtr sc lnoud or wo* (ircENrs !Íh i',cFriîc r), atrdÌhe nrmbero|nor Ndtrl Red na'.hcs (deu\irg *i'lì ircuslr-a r).Lsboqi'zdal. sct = 5 in rhenc fon! (d!è rftr Èmhresidùt .h be.dmpr&ry sFlirìcd Gp 6 tas úiDj rchtion d 'ninor inúgt br 10 ims dishîcs (sc chapù 3.r). Howsci by usins tri'tr jnrefdisatuq lod dGriitins rhcrypeorsymmc'ry ('hùc m four+ccErcEjss r),
thùt cq{.on!
od rft ùch dÒh (c) ,ded. cjedi,ìg tuprs o$isiinc ofonry rou! c ùornrolrh. riupiùs úd oròrd inùidry
(úir ùdcnrs lùid bc rdr oo! 4d ù *ould not ic.cstrily 5c a etr}|rfupìescoNdrúine
trìis tuPlcs rrucbE' ÙPLcrì:
L r r , G , 5 , 3 , 9r 3 , ) : ! r , ( 2 3 , 2 6 . 2r r7. 3 3 ) ì r r ,( 3 r , 1 5 . 3 ó . 3 c . 4 1 r , a r : { 7 ,e .l ] . 1 4!,7 ) , , r 1G: 7 . 3 e . . r 3 . s , 1 3 ,
Fmhù'iorc. lci rljcI rùp[ a | : G, 5. 7. ro. 1] rrcmùe Erefr.e belrshedroùis
'ios or 'h*e in ùder b frdd lmd nùf iplè
13.4.2 Pairwise clustering
an mùftipreFd nr.b* fe póje'rn oniosúh (of rhe,r - l) paiBor (crercmq soùe) sivins rof eachsùchpaù a sr or pd i* rld r{ch6. FÒfrhccrampte !bd{, rrc p'ÌNi$ scd mfchs for (,{'. n,) nod ú. cÒtr5idscd húbr ùe rherc ft oily Mo difffil Fdjdjms fof /r 6om ùe tor mùripìe$.d narhet
Ther for erch (crerctre. $ùre), clùsbi
!o r.
i Jrùo g .ì Be;os n pejù
13.4.3 Determining common cores
\Drcc a sr ofclufùs (rlismeis) oodrcinglhehighs!$Ùbgnul'ipìe bc $npúrr wiù .sh orhc( a pm.Ldurc being of od{
IIT=: ,r. wrìee ,j h 'rÈ
bdigrgjrcrcdinÌhebucresfDmwhkh d6br c . @trhcbd fton (amoDgodrs) Pr, ùetr . is rghrrcd in 'bc búkcr whùc 4 is son,r rhis .rn b. Nd 'o dursn. a corc (md'ìple ilisnrnàr) shoùldcoorsir (pd of, dep.ldirs Ò! now ùc
c: Pi Pj .i..i cr*tnrs
,].,: c1.c:
/'t Pj ct.4 ct "j /i.4.i..i. 4.t: .i
,t P;
is bcìrg dÒodar.ìupÈ rinn rheEftrence. úich ììd simitd sbfmdùfs Òiry b snPmc Gd gfouP)durcn qhi.h
Fnr, !1u(ù\ ú,8
ieP reo) whrh c
suruL thd \c Ìralc rhfccsourccs(,14. rìd 'hr roft bù.k r gd iìùl'rpb ced nnlhc\hr4r.dùdlnbyPJ]jiTa
rb.dsrîs cl, afd $ùrLdusrc*f
cr6Èf for,4r. rh d6rF mmeaEdibrÌnrM'ÌoD e. rrcmbùckd, (cj. ci, c)
ùid{ cj)
rlre ùusdion
ofi - I durds (otrelionerch suro.ù di@
Ld rhc cllica
is fofd
trov b. dotrcby ùsiis ùc À a candidre fo. a comon
b0 (f m soure! 1:. nr. ,1a. .,p-ù"Iy.
Ìh" fr^i -.b.^
d ú"
= { ( t , l ) ( 9 , 5 ) ( r 3 , 1 (1104X, r ir,l ) ( 2 rr, ó x r 2r,7 ) ( 2 5 , 2 r ) ) . cl = 1C.r3X9.r5Xll, r3)05,20)(2r,26Je2,?7)125, 1ZJl27, 15\. ci = {(r.e)(e.l l)(13.1rxl.1.1ó)(17. ?0)(r:r:. 2r)c5.27)(r:7. 30)l
13.4.4 Scoring clusters
RMSD vtrìu$, ùd rn cvdtrfioi oî rh Lcb drd
coFs shiftd by sùbs6 (no' ,rì) of
13.5 Local Structure Patterns
prb$
n rharlmy of rbenor in€rring
ùe îàturs orrcnrakerherm orrmdw
be$ed {hm rheyre rosd ii nóf úù I Danol prcrcins.somcpfuìn
tulmcmb$ rhe aidEs lomin! a s
.r píbrbirisi.) .hior €osùr
cal,sqmnce ptrtrenN(d.rmini{ic $rh frucuB.
n.ÒstrchpÍbms aÈ.!ììd t .?, P.ij4 nolfs dsrlopedby Jomsn erd. 0 999,20{où) lrc tu,hdjr crprai.d ùpb noN
13.5.1 Local packing patterns
rorso,ic , > r ùd cIh u (rpie
ii! a biduo cir
vhrturr.r.:r !efti rumbsslcoo inart.,u, h ù diio kid mrh sr (M, q ,l ùd aJ ds.nbes .oishirc Òn ú Hpsriverr,d heÙx,/ sìÈdrìd rmp (o ! 1!,b,4).rnd.fiinrcnanÈc{end cd 10jndtrdc orherdnorcÈ prcpenies(e.g.expo*drbùncd)
apftki"ep!'bn F - (q.rL.:1.ùr.q'),,(r,,ì.:i.iri.di)nmÍchedbyi ido$ (o . . ,,) in drepmEin suchrhat | úcrsidrs!.
. !r d urjquc(ceh o(m
2. úc rniùo !.id of a, n ii ú. ndch sl ,ìr,:
,di deino
er rmm rriclninÒ (Nl ro rhc lubóry (c) ì::
il ? {]thii ]tr RMSD of al ddr f ( ( & j , , : r ) r o ri = l . . . . . r ) . c sd orqìodDrs (r.ì, rof aardue: rrìcsidc rhdi lro'ns Òfrhc Íridùc ('iù rhù qep'ion of gì, rof Nhjch 'hc d.rbon is ùren).
p - l 1 a.3zo.t4 1,F. ax 4.4.5..1. r5.r.c.d) ( 6 . 5 , 2 6 ú. 6. 4, , 1 r r . / ) ( e e 3 50 .Ì r . 2r. , 0 j .
ap4*4'dif is x p{rìDs prfcn P rhí hó mdipÈ marhs (@uftncs) n n$. or (or lnbìn oDe)prc'eìn fnctoEG). PtrckiE pdrms (ù nÒ'ifs) dc\.nbc .rùstr*
13.5.2 Discovering packing patterns
5 1 r r . . , , 4 4 tú. e ce di\lu.c\ bdrcd ,ll p,tu orEridu$ iosìdca srudure r A,risrr,lrrdd
NrL or i rcsidre
! a l4t3tàoúrool !t4
Ns.,(of n.i-!hbnùrnns r'o'rhú) * ùe Èsiduo, and drn rquiremd Tte reiùbN
t o ìs rhe @r/ior d n\ rcìghboùh@d f.ìng Ns/.
'be stuctufs (is mrchedby ai ìed r rmcùa,
rte genenìizÍions.rc pickins
3úrrÌed (fewer fesdus) shdurcs rrc
s D î +.rb.varrilp)
rlakèrtishbò(. !.i4! tú aI Esìàu4 iÌ an !rud&:
P:=rnhed.jE,lbrn6ù Pf := DF probesdlì(p)
E acE G4GEA Ir cin becenùRrizdlro tr t]ig{ijnbfilhereilhboÙhoodÙi4.s
c$rdngh ù, Nistrry sùìr3acsT9Rrw h I simprcdcpri jìsr s.hh ùcd ù findatìgcncnlizrioff ofù.pDb.
r a p en P (eqml ro dÈ rtrchor)s alì ncighboùnnngr wirh anchorcqui sb cdby lppddirs a e\id@ a 6nmùe plobe,fomìtrg Pd (noÈrbd a dos nor P ii úc doidiboùhood).Îc naÈh.s 6 P (M.) e aDalysdrosÈ ìrn\eycm be {hedrerF a hr 6utrcieÍ suppon(Gcm ù enoughsúucùrt Í n do*, ir ù aeRii
Lerhe neighhoùs€lEne oi thepmh€beÀcEîcqEEA Ld dne odrdneigbbou *qù.rftr bc DELWgRTÈ,lL@RqîrEc. aERmglrRla.-Its rhcsqucnc. ot x pekirg panencm beB!GrE if ùe dyìng'hemNtnseivdbyÌheNtri€
13.5.4 Scoring the packing motifs
RMSDlarus. rhesorc &kodlpcnd5 oo'rù si1joi\]ùrids|hcsNtiùts.lheugltrf
s P ot=
rlhùo Ì(P) is 'h! inn,nnfior corGd ùr ùe fquence patr.nì (chQrcf 6) ,
13.6 Exercises
rd3 posrior'Gbove orberoN dìebîs.)uiiqùdydriìrc'hu4ru0tc^s rryte
s Ga[ ir MS{) h ùn eEEne ee
MSuDSan iiMSAnekrysúrùÉh
(a) Èapìrinbowyor trilLnpr*í
(b) EÌphù wbenandhowylu*in gedùÍe rheconsèNus shdùrc in Msup (c) Dicuss hÒwsupsm .ù bedpandedb Msupsm in | {ay sinilf b Thèrirb iiÒn r mdhodror 6ndineì tufc. etr bcù rog!ìdea paisi* p !n úè shdtrG) is. rof oraúpìe,ued e iobr{€d ro a mhiple.lìctrnHr nbeusedbgljdesAlùdMSAPb 1r) hPoE a vry ro $e sPnh pinfln (b) How.m (è.gsP!,f) PrM\ be (Esds2){ ftyingsDd.?
e shciB
b rheEkn RMSD
wb.n I prÈú ? is dÈidùl ù\e 6N a n ,lMrr
kr Pa (or a P). e€q narh ro P G inv6{ch Pa (ora D. rn 'hisamlysis, ù$d GliFcd b úe o i! !\e pa6m) Érphin {hy
(ii Ìlè positiùdsin rbepdrd,
úc ú fc sho!ìd bo .hars.d ro lvoid rhe
wehfl e rhepmbeacDrcr16
(dìedchorn rùted). údn\eneierboùr
13.7 Bibliographic notes
| !1.(r93r)rhd scs |Î$,is $Faosiriotr iu€ fúadey oeeo).shrúo d d (ree2) ,id Dbnord 0e921, d*qìbe mdhod
(nbrn inrlyktrd j (r994)ceErratrd Ldir(rge$)N.rsinirùbdiimpìef merhod ThcinutriphàLignnc óf sdiand Bìrtrdcr(1eeo),Russcrlnd Barùi(rcal Diigdi'1.I eel).id Miy iid rorsor ( 19e5) !r forù! rhcsliìe gaÉmr.Leo
borb$ rs rh.pivÒrh,1d.lcnion (E\cilicrd at t9931l(mtì d rt 1996) (lee4)ùsanrl aodzirud skoìnrck (ildftdùiùrùs)appfolù HoÌridaymdlviLler(199?)hsdevetop.dImdhÒd ii (!!ya ùdThomkÌ ( 999).\IrlLxce pdhsdls.nbiDsdrlesirs:
xrdrvouionresr).YdaÀfhssimia. ronls (FFFs)is pÒpù\dt by rìúnN lod s|órtrick( tees) nÒr ù6 n hibolirz eraì (1001)T$r (srrú5rycr rr. 2002h1 ùid MAss (Dfq a !ì 2003).rhelùer bisedon ssEs.Borh sParnd$cohediironas$nerd(reer,?000r).firdigpútrmsffodmukir cùfcir md Alrnai (lqql) ad deRiialdn rh (ir ddnioo b sPn() {c ù \yaÌo lnd van@(1993)Sudd 0999)lidCo,*di,r.(r996)rcpfsÒtrrheshctursAlirer er.phs([era sèqùejd), ù!c lcÀnD o et rì. (leeeb)havededoped a p'!em roPs srùfs (Fro$ r ar. eel).
14 Protein Structure Classification
09ó9r in úc PDB or Jatr@ry2003),md b be e€snble ùìs ùJle Nnbù ot dssiiid'úoi d donomy or 'hc objec or úèir evohrron(a shctur is roft .Òn*^cn rhlDsquence).lî addirion,{hen a úc luncúonof ùc pÉ1ein.Ift n disco iI h. .*i.f (rìnncs{ch)
r\ dssificrioi unir.îiÒn daùbÀ6 ,
14.1 Protein Domains
or r piobin ùÈ Òrì6 x\wùd
virh diíeEí
rùlor rùn. úùghry. !î rdsidtr6 (ùdrurcis trda domrin m:i$
ormc hydrophobE
.sidùe\rg 160Íd290 165,donai!2o1 r6r-239,lnddomrinIof 366i3.fte
coF, wirh Jierìdl úiiìcrd fsidud tdonlin n 2er-3ó3). r r8-i65and s@Fj!ù€ r
ior (nùnìbù or coùch)
búwcli thc
t s{ondrry sh.rúrcs (includiîgp{hÈt) \holrd.mlv oos hdweÈìdinÙdd
14.2 An Ising Model for Domain Identification
rlylù (leeeb)prsdts r bohm'ùplpprcrhford.rab
ù Ged.Crhcmoddr Àsa. lsircmoderìs i ì ad h;s úodd úN's6 0r ndr*, {
n" *n, *u. -.r' i"a.,' ;t,t"u*o.
h oùr iDDlic!.ìoi. ùe mdes rcpÉrrr *t,"s. Duriig dreirrÍio!, n is hop'd rrnr Esidus *"" r. w *..i. "lo.*n.a u.r"^"t" ,J 't'" .." a"."h ,tura c" úe emc sút lrLueorc Girìpre){av of
rcsidus.qhiú a"1""nl, ;, r-- *a .,'a* o t""r "t ib (sprùtlv)neighbounns -ì d*.".d
ifuh" *".a
r, ì*.
. ùe jnnid suk valu$ (he $l$
srd ùrchanerir cqmr.Îo
r rmrìns ro
ol .hc sùbs ar ran):
n. ' ebb",, "
r is rhc úc
tp1€sentdss-{! 5:,' ,ql qi4 o. tsjdue i. rhen a Ginùùhloùsìv) chnsine ndd'oi Òù tr rdeú
=.j + is re$ úf
"ri \;
0, .nd 0 if rhea!ùncil
ir 0
l]re rumriotr I iscdhd ùÒÒùerù!tu
wheF 4j n dE inMronìic disùic b..$èi n ùù inlersd dnkne r/4j. md r rhe
the ú crboN or rsidùes i ùd j, p|
islesrlo r0.,orif rhenumbsof cycl6 . chÀii.îtris gils sutì.ìo! opporun y ror
ei havins refonwins uìues in rhE sùc * s i w c y d $ :1 . . . , 3 , 3 , s5,,. . . ì ,t . . . , 3 . 5 . 5 , 5 , . . . Ì . 1 . .r..,53,,5 . . . . tr.h e n' h e r,5 \ )Td r$ic.r.\bmnef ..'.1 ..( . )4ll " h B ó , b . r i o.rvcruled rhc[i.kns iN ouf. Ttelsi4nodeldesnbed,ùùghsinpl
14.3 Domain Classes
The core of the protein is made by packing of the secondary structure elements. Since there are only two types of SSE taking part in the packing of the proteins, there are only three types of pairwise combinations: α-α, α-β and β-β.
ro dred.frDiùonof 'rnc (h,j n) d rss of dodain.: maiDìy{, mrìnly , andc ,. 3rì{. ùd hdopusry fof /r. rÈ s-, cl$ (sc bcrow)Ttc bofdùèi tu rrsÈs rlwysú'rghfdwlt'l'F1gÙÈ14',shds
14.3.1 Mainly-α
dìd adjcú c heÙcsG om*d by !naìì4rhù dlèn.irìy-, rndc, dom Ò1/-srddsG.s. owiq uPb 5%/r4radt.
a/t ([v) !Ìdo +t {id1). (DDM bi scot
14.3.2 Mainly-β domains
rlt rlrjiry,p donìirs orisi ofom d noa p $eE, úd pssibry , snd uuú of c herices(es. ì6s rhan5*). ftcy ùd ric mmbù md difdion of 'hc p luftherclasily rÀe/ doraif thu thec donaiN
Ìi sh. cla$ificfrons 'hs c-, domii\ f" dividedjib lvo: 4/t domms hE a minry ,lrcÍidjng atrmeem df c-hùcs md p.súnds arois rhEr€wene. Thc ,r{heb m nìainìypúuel. d + p dùnai6 e morc$sresard ùio Mo (ornorc) pds. ùd úe ,{heb m mrrìy anripannd.
14.4 Folds
ci6 e rrcked (msed
iD iD ùchihrùe).
allnù$dofditdenr rotdr (,n4 i!rca y is rìrirè numbùord ìLrft pIoFiB). rhar orly $DedúÒeo$ibìgl)rljngsdbp.logic.rrrógcmmbreobsddprcbibìy .Òn4 6on ùe ,nd chenicsl .ÒNrains on rbc .h!ìn. scwnl pe4rc nNe ùedlopldEllhgnunbÚÓ|diflÙeÍilol (rr 2002) 'hea de hund 300 dincfi. fotds. ned ro havc I sÈ{d lllobrbilig ofbr!ìrg i comùn arcsror (bdng hoúolosour. bùr ùcy mishr aìs hrk rhc smc fold dùe ir'lft ancsron (heìngulosoùr. shdùrs
14.5 Automatic Approaches to Classification
samphi by.urúinr rprhenunb6 ofssEsdhryiiE$me prcp.nis, ùd tookins ]do$bcdinpEvioÙchaPe6' lte dibrir ù.d lbrchsificarior wirLc dopredr.ìnilar fold b(rN iì fep,sqN J cvolurion).Fd rúomric clNifr.nion. ir depeods konre. ù' rh. cú.offs usd aofpìaci4 sdrùc ìi ftc sme erup. Îhse cù.dfs hoc bèn demìned enpir,
14.6 Databases for Structure Classification
rKrur rssP.D.ìiDD,
FSSPh a ford crr*ficdion
!e Gn resibte
bÀed or sùuc'ma@dufc
4 su, q(
,i rrisnmcí z > 2 ae issn.d
dÒ diEèeÍ lbìds (rwo {Íùdm\ ro bcLou b rìs vme roìd rypo.
Dri Doniin DicrnìnaryGÈ hltprwl2
ebim uk/dirrdo'ilii) d$siiìado
har/sùú úm e.urhop/.
14.7 FSSP-Dali Domain Dictionary
(2 = 2) @6 6rd
Diri Oded puEìy ondE rD coordiD!'6or prcÈint. rhe Aìignmenstrè cotuhctd rù sch pairandsorcd by lnc z vlìuo{u1c (rlcn€c linkqc) ispcrom.d bNd Ò larue of ùe ariFmcí
ofb\c Pfobìn$ N
I < z < 5.cùtringùchcdrhcz w e D,riFssP i5ahoúPr€ù\ of Prc dominsm d n óÒùrdÒmains.'lÍ. fiv.b'hco'ldsha.bstd.hìghljIm. sionaìshapcspec (by a m.'rod rcìaÈdb pn.cipaì compón.ntaralyli'. Fir do$|y PdPul,kdÉgiús m idù'i|ied, wlùh 'hey.,r| dhb6, m'ìdy. d/p'
dr.l. rr.d. oripuùet , bùcL {d d t, meindei'Ìì*e ddtu hE dases.
wi* z vdùG (ùc oLi'gc or
atr prin ir {hc dsref) above2.
14.8 CATH
lis. 'hor rorshi.h 'bcscigtrihms asF 'o
1t,8.4 d'ornricrly Àsigrcd Gsùg the mdhod by MrhE d rr. ( 996' iaonii'g ro ihe I {iùh ùè (ruduE ^ $an nùmbtr (utr'lù r0+) hó r, b! mrNally dasihed. 1ìÈ rr d!tunrs aP fcFsnred ror).Ior rìidin! ú. b$r rilk rcllrcs
by srick tv-,
conp{silion. nìe sond.ry rrudm conpo.nìon rcprcsflÈd by dìe.o'r'rj úc Èríilc numbs or residùsrarììdgir crb or rherhe sam&Jy rrudùs:
'heryps (heìr helix,beìix rrund xndsú!rd-súard).rrh viìì Fned vbef ú. ssEs í mcNEd À rhenmbtr or cd rd di.à cs rc$ ùan a givencú on e gps of rle irhrins ssÈi
ard ìc$ rhin 5* , composiÌiorb ddirioi. úae prcrì$ nu* bive not rhm í% a{ ud È$ ùm 5ri / pmtr 'so mdtr crr\rd\. îù a-l (hdlRr qn be frùi\er sbdivid.d itu o ùd, and d/É caGcodes ùsirg À sotrdq tu eerenÈeeor puÙeì p shds
'hesienhrjdN. indepe'dd' onÒpÒrolfar n\emomed(odobeL2m2)'r.rc rc a3 diÍi{ed dhiÈcúes. thes m.uftnd tior of ùc scondaryrrudÈ dmed Ìì kìown rchihrms (è.súù beú.prcpcltcr
14.8.4 Topology(fold fmily) sh4ufos ù gÉup.d inro rordFanìiììes arrris reverdepÒdìrgÒnboú ùe ovqax shae.údotlddvi'yol'hs|@odaly compùisotrsigonrln ssaP (sc chlpÙ 9.r. Pli'mdds ror duf{ine donains dr,hùk ShduEs wbìchhaE aossAP sóf or?0 sd wb.€ d r@r 60s ol lhc opular4 particùlùìytrùhin rbenìainlyp rhc d p ù@ras sndwich aÍchiÈùÈs
! hisrrersr or on rbèssAp 5.Òrc(75rq smc nairy., rnda l liniìies,30ror
14.8.5 Homologous superfamily
upeff.nily if rhererisfy ore ofrbefonoritrg
14.8.6 Sequence families
r5tó (sirh ú lcN 60ti Òr rhr r34ù domiii eqdv.len b úe shrlq),
14.8.7 The CATH classification procedure
14.1 js rof, cbù doic b),úknlnrinsrhedigtr iúr (úrry ssaP) bcrweqrhcchiìr.nd rheiepÉ*írjlc orjr (s4Eie)
ó A$j,$ r chs ro rhedomaiis(90t4o'omfid
3. a$tn dcE etor, ùùù,nt Noreúf ù poùr 0) chainsc ,$uped. md in poii' (5)don,ìtrsre gbup.d.
14.9 Classification Based on Sticks
(iick\), n i5rlEu 'hÍ m{y 'o lire *sor\ scodry sddùrc ({idr o'ry oie 'ypcor SSÈii dch l!ycr). ar
[email protected]úl a ,{he. {iÌh hercesabo\r iDd bdow coNi'uÈs úru r{yùs (BAr) xndir ,ncr ed s 2-s-3. (as rhctuis tu ful dir;icrìor ìer Nmbtr or hèli.dsis silcn iìN.) wiihdilÌse .orbjrdio$orr!y.s,p (bmh), n\issyren cù csptor .lro
od i{x visui;arion. a fóng EpEfnúnon tr
ìibnry or po$ibk dmceme
14.10 Exercises
desìÉid G tu*nù d37 údcL, ddmined by NMR).úd uq ror erimple,RNmolb vismìize'he sdctuE. sft,defii.dbyscoPÙdbycr'îg, CaDyougiveanqpbîíion of NhyràedominsrE è6ned sodifeEDù?
i/ ùe. GÌPP c rBnc ud b qrer a nnu &lndir
Í ú rhr cÉii n r ùi Fr,iÈr!r{d ( B){ s[ior ú/, ftrc
iE ndr Èrh rcic(rcc 6 e
Findwltrrrhrdd,rains aroor.h|in a. trnngssaP lyo! vrrr rinda búoi for rhf jn dìecrLrH vindÒw)How
(d) conplE Ihedonaindefid'iotrs{irh úolc siwr ù c^rH, úd conmenr
14.11 Bibliographic notes
iscN*d ir lxyrÒ!(r1)99b), andndhoc {e dd.nb.n ii ron$ d r ( r993), swùdeÌts ( 1995).Hoìm dd sxtrd$ ( 199rr). Hùr'n
atrdsùdù(1ee3a). kramdrr (ree5) !tursiddjquiaîd Brnon(tee5).Horm aÍd slnds(1993r)ds.nb$!merhodwh
scoP n desdibÈdio Mwjn * aì.0995),c{rTr in onso cr,l ( r99r), o,enso eral. (1ee7)?ld Peùl d d. (2000).aid FssF/DaliDDin Holmúd sùdù oseó). Hoìn ind sandq0994, Horn úd ssndù( r993h)andDieÌm iDdHoìm(200t). Thedifffil crusìncdionsc conped in Illdtùy 4d Jor6 ( 1999)rnd Dier mùn md Holn (200r). EloÀsonand somrzlmù 0999).onpms S(OP atrd bcdji ràytù (2002).
Part III
Structure Prediction: Threadins rùdyon. wouLJlikc rÒób1rininroú{úon on e shdue qPeindany Gy x ny ..r,far rocdphyÒrNMR)ù cxpensive andrinÈcoosùùitrs,rhùe hd h*r nu.b woÌ* in ryi'ìglodereìopeoonm.úÒJ.fofpfu rcrìadei b se homolo!*4rercs w dopedbtheftws'ìÙela,inaaÒnlrn.c h *qudce.deriveddabbrs{ sucbN Inr* cixÈdvith pntid (ù doúdr) fhiiics.
drùbrsof sùucrms.rfIsigriici
nùnìbùof trdtrfl[y Ò..uriis p@iî lird. úe úat the Dewpmreinba a rold which ÉaledbyoùeÙiigùcsqKrcaeaùra tu nc nigh. adoptbÀ*donrhcnruduic orrhe
,r. squme b bc@icÈd bu'*n am m.ù br{d on sfl.ùro (Éùs rhin a squene/squence,rjgmeùù is poreodsny nùe bioloEclnymqoùetuì .nd mon {cuàb *hùr .odpùod wirhrì. Lquivatar
ommmly cilìed ,rrari4.
sone of draÌ us úe scondry 6hdúe ol rbenev onot *.ondary rrucrurccremnb iron! rhe
15.1 Protein Secondary Structure Prediction
ds:bclhcsqìndî}rnrfuttypelo$ br!. or roodruF (or?/4) nìe i,,pù
15.1.1 Artificial neural networks
^d'hciaìneunlndForts(.s-,
rc rto
spe.i.llon orANNerLcdtudt nL?d crrkdkrMt
orÈurhd jn I oùhrrù
I hr4 ,.r/r
.onieded) a ,?i/,'
fe tr)
cotrered Gs opposd ro
n AsmiR
Let $w_{ij}$ be the weight between nodes $i$ and $j$, and let $a_i$ be the activation level of node $i$. For node $j$ the value $s_j = \sum_{i=0}^{n} w_{ij} a_i$ is calculated ($i = 0$ corresponds to an extra node with a fixed activation $a_0$, used as a threshold). The activation level of node $j$ is then calculated as a soft threshold function of $s_j$, and a usual function is $f(s_j) = 1/(1 + e^{-s_j})$.
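The calculation for a single node can be sketched as follows. This is only a minimal illustration, assuming the logistic function as the soft threshold and a fixed activation of 1 for the extra threshold node; the names `logistic` and `node_activation` and the example weights are hypothetical and not taken from any particular program.

```python
import math


def logistic(s):
    """Soft threshold function f(s) = 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def node_activation(weights, inputs):
    """Activation level of one node.

    weights[0] is the weight of the extra threshold node, whose
    activation is assumed here to be fixed at 1 (an assumption).
    weights[1:] pair up with the activations in `inputs`.
    """
    activations = [1.0] + list(inputs)
    s = sum(w * a for w, a in zip(weights, activations))
    return logistic(s)


# Hypothetical weights and inputs: s = -0.5 + 1.0*0.25 + 2.0*0.125 = 0
activation = node_activation([-0.5, 1.0, 2.0], [0.25, 0.125])
print(activation)
```

With these values the weighted sum is zero, which is the point where the logistic function switches from favouring low to favouring high activation.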
d[ir reím
nfances. i e. a sr ol ièsidu* Gn.odcd by v-ros) rlons vi{h {hln rored hb€ls (hercúdì lrbd beiiìg d, , of rJ.
-o -o +o +o
id ùd qr. dssE,!nunú d{dbiq
Duing drefuìtriJ,ephG of ùè aNN. úc wcighb*sùred
virh erh of rhelim
'(pmvidiic iipùr ro rhcoùrpú rryk) 'rd b îdjuf 'heweish'sof,he rjns n'b 'r c enoBb au iayen úd rdjù$ i\e {èighb úrir a[ Neighbin rùeaxN baveb*tr rdjùsrd arr inreùs ir ù. ùairirc sc!aÉ
è weishbb@ms lqy we djured b rhe Ìharn independen' or ù\ehiditre $r (do
15.1.2 The PHD program
ifwoù i;ÒÌj HrjDdb4) dLydoll.d by Ro$ 3id srdr o 99, lisbrd of di. sìigle scqucncc,rimjl) (xìism$r) irfùfmúioi ir trlrd !5 ùpd. I mùftiplearismfd is.ùrshdcd o
siidow fom,r fdidúc r or !ù'!rh u (= tt. îs shovn ii rhe figrr rhcr
ù = 5
15.1.3 Accuracy in secondary structure prediction
(rrrsl)osrr, rLvcn{ ovcF0redidrons
ld*PEdidioD\ (raE ncsn\rt
ù Ro( a^d slDdq(r99r)
qùq,!r dìpùjrù
h ir hìv$€r
\hs!n ùr
ry ssome prcjers. r undr ot {rudu(/ 'idruÀ (liru{'rirgmdl'oJs)[r!. h!ù' d!!.lopednù.ú
Nú rndoerùù (cdrdr/r-lD Fdhodn,ùfou8hrho$ùr (Drìdqpì! menÈd rnc(rDsÉi!l)ifufúriù\$'wuuìNriru6(tru!'/ù&tll/{ddl'oJn.o'lr\!thr rionror dÉ qnoìemodd (,rd.//r,! mrhodt. ltu nnse Ìbu5rorru ! hndseron
15.3 Methods Based on Sequence Alignment
mdqrDdlcidcu herix.pdfdo !Òr (Nar fon soùenr) $r|!!N fca (s^s^ or sas). as erb o le exnìPtc, wc úr
onty 5n c x l) dhriú
15.3.1 The 3D-1D matching method
Ir rhhùdhorrofBNi. crar ( rc$, rsst). t3dìtreÈùfars (6bunlt/ I scondrry shctur úh) vù! usd Ld ÌhepLobrbitrJ, ornidiis!o lmrnorcjd(d) in in rirfrri.d(r)hdqÈc d Í P.r k) .i}*hèrc & Pi rhey.henrkùk
,r=*(?) (sìrdinsúneddx rrriùievr.ft cbÒuidlnsto'sasa dîr$îrÒL róprulcin\ !r u!s) búwr.i diÒsr hund {!rcs {erc adjùsed$ úid 'hu {m or tKoE5 $rf r p|n of dÈ nìiirg $r ('bc 'r .rr, ,he iocx (rnEqufiú ( r5.ì r.'fte Esùìri'rs ùt. orlrrp.6ni6 |.r cachposrioiin drol P@/r(,b.iD-È rìmiLf b eurú LF.N orrùriPlrquocPrìhrs(chrpd1j. I
15.3.2 The FUGUE method
-nìisapploelìuseaEefdkìelyol Fmrd
spdlìc subsúùionrabre(ovdiielor d aL ree2ishie'xr.200r) rhe redbyi'shydrcgenboiditrgpinem.i6sc íNd é n!ii.!hù
lNon {st
y doùoN lù ù Go gtuup) Ùd rcccprorsCd N H ,!rcu0). solvcî' crpo{É. ho{ britr3 rhesehÀ ,esuÌredm a sr ol ó4
15.4 Methods Using 3D Interactions
sinpldl;trdioísfur.úkj]ìolprded i1ìgotrly '*o ftsidr.s (pai$ne in'flction,
ùcadi']rcdigíIneísftgeiertdpnùbeú|uo^, r'dsrde ctar$a€ flexrbre. $rturcsrdB. sy iÍ alliiine(ì1)dìdim 0anlrc(D).
ber{.enrcsidus (ar Mì aoms in rhcÈir mu{ li! \irhi! 5A) aod 'bci dord
rhsèr d $rudùet. AJÌ{ rnrbh io
siduesioi crchof ùe .!lo (20 : 20)lr)s br! rheprorh rrudùfr dilblok (or I fcdrced
. ú or rniio sonrs mÍries
a:= f,NciccrrieùEúor(4 a) 6ùs R .dtdtde
ùì! w't
.,J tF aliPììùr:
r o re & hr s i d ù ea r j r n € ! ; l + r d o tr (rj. ir) ù'sads ùù
lntit sat^!ù1Ìt4M4t n'ad
:= 5+a(.r.b) €nd
15.4.1 Potentials of mean force
ileìihoolorapitrore\idu$(ùr g ùrrypd) ipsù.h dmlj!{ bfùùc nuik.Îú usds.Lopldbysjppì(19q0)kdeln he |ollpcr'iiL lhrn ) h Í!di!t.
d,! urEr
sipplùrlbó^iq!$ù6I]l(ùnr olidstrfuNidllì}'dmp|rcbjcrc\jiuc\.Ju. iiie (a) ùd vi rnc (v). bdh of \i, ! 54,Ún\pol]ri|g'oìelÌ.qÙeyor] i(rj.rl)
lr slrouLd Lìendùr Lhr 'hcrclirtu
. \Llình hre a f€qùeì.r' dn'nhdim /"'itr l (ok rì! di{un!! ircruln
Tlì\. ri.
qùorc/(r) i!'berreqren.yorrr0ui6
orni )ft$! Nurcr bftikpÌriú apnaiùD (o rlihù disriirgfùl ririq
n (ìcuìr'iig
1rir6) uld i Nd'ufund
ùc tulolirls is flc (r) Ùd th!
m'hcfructorcdúbùkis/(rr,rhe /d(J)
//(J)!, *--rlG)+affu!'!u. lhqc /(5) is Ìhe farùercy lror dúrame inrí
r) fù îr amiio rid pain ad 4
nd 3id 'h. sFcific più (1 ,) ob{rudon, al8e(big,r. rhcnùc lsr bm oi rhesùn
prùishst. sivins/,('
= s,c), wtuìe.
bam'ionq (,ì = 0), rh /ó(.) (nrc fdor d rrsoplaysr tut h 6e rrnsforiì ni PdMF)
= sd(.).
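As an illustration of how observed frequencies are turned into a potential of mean force, the following sketch uses the usual inverse Boltzmann form, in which the energy for a residue pair at separation $s$ is $-kT \ln$ of the ratio between the pair-specific frequency and a reference frequency. The exact form and normalisation used in this section are not reproduced here, so the formula, the function name `potential_of_mean_force` and the example frequencies should all be read as assumptions.

```python
import math


def potential_of_mean_force(pair_freq, reference_freq, kT=1.0):
    """Inverse Boltzmann energy: -kT * ln(pair_freq / reference_freq).

    pair_freq      : observed frequency of a specific residue pair
                     at a given distance (hypothetical input)
    reference_freq : frequency at the same distance over all pairs
    kT             : energy scale; set to 1 since it only rescales
    """
    if pair_freq <= 0 or reference_freq <= 0:
        raise ValueError("frequencies must be positive")
    return -kT * math.log(pair_freq / reference_freq)


# A pair seen twice as often as the reference gets a negative
# (favourable) energy; one seen half as often gets a positive one.
favourable = potential_of_mean_force(0.2, 0.1)
unfavourable = potential_of_mean_force(0.05, 0.1)
print(favourable, unfavourable)
```

Pairs with sparse observations give unreliable ratios, which is why a weighting of the pair-specific frequency against the reference is normally applied before taking the logarithm.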
chrù ron lypcf jlcrùdùE (cd{d. cr{r. N-o. N,o. N N ùd o o. {orh4 pliings rch s cd{p {ùe,r$vgsered ) Thc* s6 ofdnkills providei ferM ro rhenq
i i,rs
,ù .14 h1
cd se,mrioo rh pnncipalidvùrú$ or ùc N aid o dkÌaas jr rù
.o. '. 1l
rc (of 'he qu@@ of dìe f0ctore nstD jj
me r novel (sry ro'i I s6omc) rhe e '|yi']d'jj'yùgfruNe.Hol€t(dÙnrgùl
hbrùslrtu trlining set. Mof wo*s
qninic !d
15.4.2 Towards modelling methods
nn dr! vrrL. rqid)
h ir rh*e lp$ s Gqrjr ùú rì ùo iuohqofrnndds
15.5 Alignment Methods
ArNft li|e mrir)
i.d dic n.0'lùd drMnic
15.5.1 Frozen approximation
b$.11rfoì tri\1!nroi r.rv ìotr
D i ù r c ù l o - 5 1< j - ? l < ? - e l < n r r K l
q=MT E RV'L ' i otùÈ ladrg d x 6., = 1lrdj = N 4 (F)
r (
\vih 'hD alucmc 4 (of ririo*n
shúùÉ) aloig orc rimetrioi and rhe sqùen.e
posi'ion/itr,.lTjrep:jJliseirtrodi ubr. r.I Leh lofhrcd) îmiro rcid rrpè pRn úd c!(h bbÈ h6 a0 $rqr tara dishîce
hr,!h$' sconrr pú lor ìoqd eEsy rbf PoMF meÀufen n exhd.d by ordinùy rc ùÒ?Ìnxprrmdion arlonrhm (FAA) reMs ilrlrvdrb ùsd by skolritrt dd rbla ir0oD. Th
nedioi aidn\i {as
nK ncÌlrior. 5J Íuf'nù tua1iÒtr nrg rhcture de cìosìy ftrard (bomologout,
\1: f,
15.5.2 Double Dynamic Programming
ù,h" '.,.. ",.r"*Í
,ituj;clrr,\,...nb.Ni!."d\L',1ìdìe! ;:
r"d" -h."--"
, . , i a i . ir *
' <,.1r
!r orrhopùpaú1(/r'i,
P.MD n-"-i
> i.! > r rcdh.d!udld
( rhe
" " hqPded " n - i 'rob! i ' hc('!ourire ore thr\ÒÙldbe [Nrì {'r ùe rr,.j !$ùìPrio0 pnr.iN krìoLln)rhÈùirciÍ hohoPud(ifr|tsorc rúùÙiofnd dl nru{ùd ofhdh (úL') tulridic),hf ùt issisnms\ \ rl ìtriirì a hish(gool) sÙE !hih spÙrious (|îr) I lov ssisnndL siìì or\ sd ù 0 , ; r h e l ù c hL - J ' . , d r g m i t r n ( " ' d d c f i b e d n c b l o L . e r ' u d i r o i f F A A d
rfc andobs^3d bu1ìir (ii hdh prc,cmnr howeve(n' $rc!dù!, .he ìo(ì rNcur
quiv enrfor$nÈ fuiúionatFason lpeòxps
15.6 Multiple Sequence/Structure Threading
t'hod\rtitdsmbcdfo|noìlipkaqubc!
For mdhods ùsine 3D (Èi.vnel rcn
(5umrfl)lift or sP sore). rn rhrcldù!, hoqrE. a
or.pù(vnd nly norbe
15.6.1 Simple multiple sequence threading
Ls! \dr drnblrs 1ìrlo1(r9trbrdù ndl'oJ (MsTrbisedor rhe!r3re D!
15.7 Combined Sequence/Threading Methods
Nr rir. {,1' ú!È i,sîi,ui,)
A\ n$
on$ (rerehì krù!d.L idúLnd
15.8 Assessment of Threading Methods
ro fec
r$sqlÈcùdy by Mccrdìùcr rl. (200r).
15.8.1 Fold recognition
hemmirù.r by rbcrunb€. of on-' mfth6 (ru! È\nd!t p ored sgrimr ùe ùù,nbùoÍmis* (fitr Il*nnrn orfumriois onhes quftli1j(, jln ó ssriÌivir], 4d srediviry (k cb!p€.3). oùìctuoùFi$ns irc.ird ii ùe bibìiographi!ùfcs
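The two measures mentioned here can be computed from counts of correct matches and misses. The sketch below assumes the common definitions sensitivity = TP/(TP + FN) and selectivity = TP/(TP + FP); the chapter referred to may define selectivity differently, and the function names and the counts are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of the true folds that the method recognises."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases")
    return true_pos / total


def selectivity(true_pos, false_pos):
    """Fraction of the reported matches that are correct
    (assumed definition: TP / (TP + FP))."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no reported matches")
    return true_pos / total


# Hypothetical counts: 8 correct matches, 2 wrong matches, 8 missed.
sens = sensitivity(true_pos=8, false_neg=8)
sel = selectivity(true_pos=8, false_pos=2)
print(sens, sel)
```

Plotting these values while the score cut-off is varied gives the kind of curve described above, with correct matches set against misses.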
15.8.2 Alignment accuracy
Gù.h A ndr').h
e .onú.! +,1 (or +ù \htnj ù o hdics or i2 (o: +:1)sbú5 ù tj.rnnds 1orliiI di5pLrd k, rhcFsi'ior orùc ldjae auù$rnisbrbe orÀahdix(ùrErd)
15.8.3 CASP and CAFASP
/t f, k r," rrri,
llro ebNrk).aì
ù(n!hòhlcn rhr
ri'r lc^sP). \ $ h.!m D
\d { (!
ri! rù\ù\ (!dr!d 0ù&\cnùn
15.9 Bibliographicnotes reer) oLhùDdhod\n'ruqrlLtrt rr!dút $!.bcd ù aú^ ud srjlorslj (ressrmd
dcslnbeditrBovi.dr (r990.1991) d REeandEisàìbùe orBrunddr 0997).àndrhcrndhod d d lFlJclrr) il oùng!ú dd.(r992)iÍdshidir.e00r) poÈDrùì orrìlri rù( j\ dcscnb.d iDsippt( r 990) ru|'jgmd'helMMdg|isinkJpìNdù1, FLrlNîqcr i'1.(1995)s,l Skinrk r|d Khúa (200r),andDDPiù r4b ed orcrlo(tqe) lnd Jones dat. (1992h). u* of mur'ielc e,ìùd!$ i\ m,faylor hd orci30( leeT),dmhin.d{q!crcdrh€adiis in Joies(lr99bl, hd Qmn{isr or ptug.Jm\fof rord€.osniriodrc nùnd in sìù dr 1200 r) andRie ùìdEi$hú! ( reeT).
Ba Prr
oprdhy codTtkd .1.( 1992) hú vnhoù rh. .kÍd (ihrn (2001)iskoLrt.k d d (200t)
deErop.dby Ffid;ds md woìyìls a 9391;Frcdnchs.r Ít. oser ) ji yhich rhe
(avH) TtÀ is ! miùixbsedona H
ctrtutrd(aNN! bortì r3D lD md6d (Lù d ar.2002i) aìd iNùfrdúù8 rD irrormrioi (Lin d r. 2drb). nÈ ANN
Basics in Mathematics, Probability and Algorithms
A.1 Mathematical Formulae and Notation
n*.ru,i, = 1,r^* ,r" *"*, Ilî=,r, ms
rhcPrcdrtlÙ1 úe tr vù (/t r)r NùLerlft0kdebrcdò l
l" ì,,r ur--r.*m.*'. *'*J * -1:
"""''u=,,.,,='' -
L , . h m -re)
tiE Id|. Ìk
ti|c Ìk tùrÈ
kk. tart! tqkt
îhh. tnE rw
uR tE Jak.
A.2 Boolean Algebra
Boolean algebra makes use of logical variables, which may take on the values true or false. The basic logical operators are and, or and not.
ìrs P = r!Àe, Q =,xr. R = r@ ftci'h. P) a"r R) = (lake a)d îtue) at (hp ùrd n!è) =fuL!
{P dzl Q) or ({,
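Expressions of this kind can be checked mechanically. The sketch below evaluates one expression for a chosen assignment and then verifies, over all eight assignments, that (P and Q) or (P and R) equals P and (Q or R). The particular values of P, Q and R are chosen for illustration only and are not necessarily those of the example in the text.

```python
from itertools import product

# Illustrative assignment (an assumption, not taken from the text).
P, Q, R = False, True, True
result = (P and Q) or (P and R)
print(result)

# Check the distributive law for every combination of truth values.
law_holds = all(
    ((p and q) or (p and r)) == (p and (q or r))
    for p, q, r in product([False, True], repeat=3)
)
print(law_holds)
```

Enumerating all assignments is feasible here because three logical variables give only 2^3 = 8 cases.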
y rype:nmbd, ÈreB. Gùb)seqù, (strbxhduc\ s,FFurd ùh, 'ype. rlìe elerc$ u ùìdù.d bl 0.
E t u n r lre2; , 7 , 1 , n , f a!, t.,. h t . t . r . . r ! rr. ( n rr.r . ( r r . 3 r l
is,u itr s no.rn clcnsno
v i5 thc /ir.i.,.r. a E d d i yu v + v
thor. ot rI c u
u = t c K . LR . , s . Yvt .= { a . c . R . w r .t
! v = ta.c K,r..R,s.r.w.Yl.
uu(v tA,R.r,- Ic.K,r-R.s.uYt.
{ c D . E . n . K . R , Y 1t =a , rG , ,r ,L M , Np. , a , sy. . v . w l .
A.4.1 Permutation and combination
16 samples: AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT
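Ordered and not replaced. 12 samples: AC, AG, AT, CA, CG, CT, GA, GC, GT, TA, TC, TG. Unordered and replaced. 10 samples: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT. Unordered and not replaced. 6 samples: AC, AG, AT, CG, CT, GT.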
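The four ways of drawing two symbols from the alphabet {A, C, G, T} can be enumerated directly. The sketch below uses Python's standard `itertools` module; the variable names are chosen for illustration.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)

alphabet = "ACGT"

# Ordered, with replacement: 4^2 = 16 samples.
ordered_replaced = ["".join(s) for s in product(alphabet, repeat=2)]
# Ordered, without replacement: 4 * 3 = 12 samples.
ordered_not_replaced = ["".join(s) for s in permutations(alphabet, 2)]
# Unordered, with replacement: 10 samples.
unordered_replaced = [
    "".join(s) for s in combinations_with_replacement(alphabet, 2)
]
# Unordered, without replacement: 6 samples.
unordered_not_replaced = ["".join(s) for s in combinations(alphabet, 2)]

for name, samples in [
    ("ordered, replaced", ordered_replaced),
    ("ordered, not replaced", ordered_not_replaced),
    ("unordered, replaced", unordered_replaced),
    ("unordered, not replaced", unordered_not_replaced),
]:
    print(name, len(samples), samples)
```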
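A.4.2 Probability distributions
A random variable $X$ takes values from a value space $\{a_1, a_2, \ldots, a_n\}$, and the probabilities of these values together form the probability distribution (or simply the distribution) of $X$. Let $p_i = P(X = a_i)$ be the probability that $X$ has the value $a_i$. Note that $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$.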
Let $X$ be the result of throwing a (true) die. Then the value space is {1, 2, 3, 4, 5, 6}, and the probability distribution is {1/6, 1/6, 1/6, 1/6, 1/6, 1/6}.
The expected value (mean) of $X$ is the average of the values, weighted by their probabilities:
$E = \sum_{i=1}^{n} p_i a_i$.
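For the die, the weighted average can be computed exactly. The sketch below uses exact fractions to avoid rounding; the variable names are chosen for illustration.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# A distribution must sum to one.
assert sum(probabilities) == 1

# Expected value: sum of each value times its probability.
expected = sum(p * a for p, a in zip(probabilities, values))
print(expected)
```

The result, 7/2, is not itself a possible outcome of a throw, which illustrates that the expected value is an average over many trials and not a prediction of a single one.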
A.5 Tables, Vectors and Matrices
m,r hrt dli.ùùrn one (. Lad4 ). u sp!! rld br$. úer. /i
A.6 Algorithmic Language
hù A |.\\lti1úuÌ
ìt alxÒnr? 2
îm {ix serúeHru,ri=r&.
-ùm\,rsrúe \iEt:=,
\ ù m : =0 ; r r i é ( 2 . 73 . l t d 0 $ n = $ m + { c n d i r I I v i h . nv : = y ! { r l . r d
A.7 Complexity
dda size(i'ìporsi*) Fùrridpricny\
r[c. (/r r)r. î.
'iiù ùfeas$ rrparlr {irh ù.xrd v! sy rhd.henmecomphxiry
Int Bic
Introduction to Molecular Biology
B.1 The Cell and the Molecules of Life: DNA-RNA-Proteins
ar ! Geììsofliisrìq orgìÌisnn lrhin a u,r4A sfùidd
(Led DN^ rdeoryribqùdo! ú ) lr hsNror. ( i Frk,ryoÈt of !s?aÌ trtr!úrrú$)
h\ úecybplòù rhc
xid rlr ene DNA , DLc!ùL\
fl &-{
DNAisÙifÎudc'rbylòùdiflGí idúy,iiic(dcÍd.d bya.c,crndT).îÈ {ruduroofDN jr ! donbhherix(whichwfsoi hd cork disco\rcd in l9s3). 11È
oNi\linsd qc! tA ora) rndoÌe Dîmidiic bi$ (c or r) paied rccord c pria vnh c rhi\ me!6 rhd irone
r 5.qù.ica acrTccr
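The base-pairing rule (A with T, C with G) means that one strand determines the other. The following sketch computes the complementary strand and its reverse; the function names and the example sequence are hypothetical, chosen only to illustrate the rule.

```python
# Watson-Crick pairing for DNA.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base complement, read in the same direction."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand):
    """Complementary strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


example = "ACGTCCT"  # hypothetical sequence
print(complement(example))
print(reverse_complement(example))
```

The reversal is needed because the two strands of the double helix run in opposite directions.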
fsrùsn(orr ceìì a'nuhi..llùlfofgrniii)
I t 1 lf
ò"q[ .,i
ri8urB l(r),rd(b).Tlìemirìoridsaloì dc!hînùcdrcDrùcrd
,.i l,l
(r) aù ,0 mùo !iù
hirc ,
onsotifrc]idlgsisc.lledlltz.Plid.,,,/, aldlh.chiinarodg'hsa6msNcd{
.ùbÒ\ (nc) 'cm rh ("hlE 'hec I úr prkc of rhFine. $ úc brs ìn RNA ú. dsnoreda, c. c .nd u. wììjtc DNA
(hq$ensùRlra), úi.h camcsgeieri
B.2 Chromosomes and Genes
TheÙdÙsd'ùes|irydictrr5@ €rns(mi.ty hnbn* aid roeeúùùc conchrcmrii) Elrry soiìrj. {hddy)cell ùsùx[y ùduda.*o.opic lld is Òarrcd (exdùdir! 'hc sex x. y .bDmosoDet. rh. nùmbq or ch oI sch .hfonorm
j.rqnirudi'r! rrdrj"
(ruì RNA.qrlùrif rrr !ùLL.JI gNproù ú). h r.N (i priùirive s'Niryoric or,lrntrm).
beencrlsd.lÚkDNA!n!cnnu]dùfufrhd.i\ ùp ol s.xller piefts (exoN inc+|gsd bt $ (lllqiftEqiiD!
ÈsiEs (ùLrùn
ir! ofr ofrhegcicÉ tr Drlnbcd(c s (opidk,nRNA) r!nrhci'rurfo{)li!.11 ouraìd rbeEnrùù3 {Fru()nRNA n hDikd ro r pturir GÈ fiem r.5)
{iys (ndudifs direEí
Òr thc orB
in , hxÙr fs;a)
-!iriù,! sero i
B.3 The Central Dogma of Molecular Biology
,.'sdrg* RNA)comaininc'herme i0lorú hm rh piots is.ikd 1a,3.,?/, oehnery (ohsomt Nhilhrer A irpùr Nhirhn goirgro be (pú ot l pdeir sÈ Fietre 8.5 tu iù' a porypeorre Dùrbúof rifcred p.odods hm.eint.
B.4 The Genetic Code
o acidro be in 'heprobii rhc fdtislnn' .'hc'hrc! d'sofrcs vùrd hc (r) rs,sd.ì,.. . (2) r!-!, .r4.
. aod G) sc
Fdd, oF!D
There are 64 (= $4^3$) different triplets, more than the number of amino acids, so several triplets can code for the same amino acid (the code is said to be redundant). Tr nq,piig Íor Lnptr\ó h^\
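The count of 64 triplets and the idea of reading a sequence in three different frames can be illustrated as follows. This is a minimal sketch: the RNA alphabet ACGU, the function name `reading_frames` and the example sequence are assumptions made for illustration.

```python
from itertools import product

# All triplets over a four-letter alphabet: 4^3 = 64.
codons = ["".join(c) for c in product("ACGU", repeat=3)]
print(len(codons))


def reading_frames(sequence):
    """Split a sequence into triplets starting at offsets 0, 1 and 2.

    Incomplete triplets at the end of a frame are dropped.
    """
    frames = []
    for offset in range(3):
        triplets = [
            sequence[i:i + 3]
            for i in range(offset, len(sequence) - 2, 3)
        ]
        frames.append(triplets)
    return frames


# Hypothetical sequence of nine bases.
frames = reading_frames("AUGGCCAUU")
for frame in frames:
    print(frame)
```

Since 64 triplets map onto far fewer amino acids, different triplets in the list must share an amino acid, which is the redundancy described above.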
(hùnrn) gcncn 'bo!n ù Fuúo B 1
B.5 Protein Function
u! NLd {rr 'li|!' wr.L (h}dropìùìic an o a.idi). oùc6m rìo{ hrppj'qhetr trorii difú coÌrld Nfh rd{(hydrcphob,! lmuo a!i'(). HldùphÒbiraùi',o i.
í lompq,otr! ((nFrdutrt
or llr Gìl
arr kP
rhis i imFdbr
B.5.1 The gene ontology
údioi
ru! qos-
B.6 Protein Structure
l,lÌ .r.-. _.'.
1 1..".
"' i,"r'-'
, .l-
r .tl
ron'nÒr). i' NaN dìal NÍh
nrd\ 'h
snrtue Òr'G rur $m4 FoÈh' b! ùlirs pórn dot.ù.
dÈ Èfabúhld (F!t (\{ucnfl
ùb hy] nÎn lìùÈns'ú,
ìúir iùren.rlois
Thc/3fNd\l€romedbylì],drceeùb iD3, ed iq sd jud! qDnfúc I , tfd. bÈpùdÈr (gÒùg ù rhÈ$Fc dn4riotr
rro îrj&rd
n ! $@' mrshr
I :
!rudure or a snrn prcbio (instin). fte pdr of I imtur mr beinchori
:". i ,. ^.
,'ltrt.;' i-; .\ t I i"'l ,-'.tQF* i
hllPr !,r rcs of/ù/mdq!
Prnmry This
óhù ! rù
simrr! Lhcùùf Glqúùtù
|qriJù,sùcbúnbesiroby(r., oùrtmrry.
or mino úds lLons ùè porypepùde
cNiúaL !ùf,ùrice or ùe omroù (Òrepory J.ooidiir$incrlhrq!i,tunotùre
f rhcPro'Èi'ì.oùsist or
uÈ wlúchmr bc gtrcnhYd,o ( r. r. ,
Bindi.s!Èi dic!úlystona,tur(Lh.sùbúrceuoD{hi(hrholn/ynoldt @!o''icÙ.ùis'in]cdbylstdfc\ùu$' nsuquult!FoL.rùÙph.thco4neL-ypln
tu{rns (xodsmnúnhe súqed ido/dir.r. vhicììcoùnLtì' pftrns (or
6 I ke) mon tiliuly kr g
bfih ù'h! (tuDgodio roirs.rìlerqrenes of r€n.qnwld qr 3eF\),oi rorÈ{ npi..ù\iig gciÈ
gÙ€sG.g I hGomi
B.8 Insulin Example
(i ìo) c|mrro id{ìized oriimprijirdextutrples
sqrcrq. sltr
posible.i red
(oLdìonr poìyDeÍids) ThedÙa
r joNr ú di! rÙcÈ$ Go úr Ì Én rt \F!ìrì.Jly tor sll !úir!. Frci
(tr,cNLir €.epro').rhrcush\ùtb dnnesrle
^!n\ dùlt!ùmdrùebNo aDddùdphidecmslii$(hd\cs!y{c m {orcdft,Ò\irisPrled $pfucpoìypeprid.chah(!r!èJA,idÙ.ìxii,1 xs r ùn or n r ndcar$ (itÀÌDrt) s
ql prpmi'sùìil. pPr)h!! rr0 rcliles ir ÈrofrPÌ n mbdrúobir beexporedfmmrh!ldrnù\ î roi:g. rùidc k, $úr rchò9 Thi i'Nir
rorLcdL (!
À Lrt,sD) Î,nqÍriukd
{fnh sf j'Nlitr.i'ro
Re Fi4FÌ.e
rùsù4ior des.tuftLh
rodermiDed.n u\ hund dBri6ulin wù] r.e..
1. , r tmed'.
'i .....^,1"' ,'",,"
? \:r
r\ nrli
B.9 Bibliographic notes
lesnedùBmdciand foor(r999)ùJLcsk(200D
, _. úr.,s î! rPfl b,
References úk
t 1@2 Mdda.
bìdap atb
cat rercè ù ct@rúLd D|LBn,z {rd hsruoîlÌd
arMùt st c8rcùRud Liptu D rs39weùhstue
GùdrD, m ssr6.
eL,ú byI rE,
^ ùd Èr-" d.úo
rd ,br ú,
. bri Nrljrur
&!ús s{ s+r \
s ordr !úrqr
.rz rr ,ù/ 6. sr ro
r r 3, ù0 b ù{r rj sùaoi ri!. uùiituE!
tùtl 4, Eiútd
RqìLù! r
tu .on 4 hkÌzùt
XùÙn ùl Fs ,( ú/a/,/,4
r. ciiJr r dr Rrd
s reri î'D
s(r0). ú5...ri?.
.ttr\ 4rl{h4ù"-
! iso{L
Úbrlo,ù/ir !l'
msbl J. Di ne |! D.lùDor N6' s
\6dr Púrir
! Lic
I ó'F|(J'A4cPI
D Prubi s. ,r'rtqzdd
Pdiùlao M r4r hrtuic
h rh.
at ù.rú
14, nG5,
ht cùI
ùt Lttthat !ri414 ttt Mù!ùù BbL9 @ L$gWT,sldFiryl'EJctfi0dùt$4
Nrc 4d&q
v ú6r sr@drÌLry
P re76 rixrhntrg
úros itrùml rsrcoes .rrnr
!(r), 14.
s$d, (, I r{ nì qq RH!D Pr{.itri! .l r,r 4 r, r(nr. spnos
K. B,{ , \'. ùlstrlr-
or .rr?osr00) re le
Int qhs4@rcè{ci!lú4.Nliinvsú!sll
ondm ii er nùÌ{Nrc
r+nmsr }i[ 4pri.
PNlù' uùilanùúr, an Akú,ìiùi app trmrj5r ù idary uisr Io, lrEnù,r r 6ren( soúa ù ihe sbied 6r