Ingvar Eidhammer | Inge Jonassen | William R. Taylor
PROTEIN BIOINFORMATICS
An Algorithmic Approach to Sequence and Structure Analysis
Ingvar Eidhammer and Inge Jonassen, Department of Informatics, University of Bergen, Norway
William R. Taylor, Division of Mathematical Biology, National Institute for Medical Research, London, UK
John Wiley & Sons, Ltd
Contents
Part I
SEQUENCE ANALYSIS
1 Pairwise Global Alignment of Sequences
1.3 A Scoring Scheme for the Model
1.4 Finding Highest-Scoring Alignments with Dynamic Programming
1.4.1 Determine $H_{i,j}$
1.4.2 Use of matrices
1.6 Scoring Gaps: Gap Penalties
1.7 Dynamic Programming for General Gap Penalty
1.8 Dynamic Programming for Affine Gap Penalty
1.9 Alignment Score and Sequence Distance
2 Pairwise Local Alignment and Database Search
2.1 The Basic Operation: Comparing Two Sequences
2.2.2 Repeating segments
2.3.2 Finding the best local alignments
2.3.4 Scoring matrices and gap penalties
2.4 Database Search: BLAST
3 Statistical Analysis
3.1 Hypothesis Testing for Sequence Homology
3.2.1 Poisson probability distribution
3.4 Probability Distributions for Gapped Alignments
3.5 Assessing and Comparing Programs for Database Search
MultiDlèGlobrl Alismenr and Phylogenetic T.€6 .1.1.2 a pBnìDgaìeon66 îor ùe Dp soLurion 4',Mul'ipìeA|igihdhmdPh'|oFreÌicTtds l,r.lTteDmhe'ofdiflÙù''Gbpoloei$ 4.r.2 Moìecutù.lockÌheory 4.r.,1
DifÍeEnr.pprùchestd lmmhdjns
4.3ó Roorincorlé t6t: bmMpping 4.3.7 Sbtisricsl 4.4.1 Aligrinst{o subsrùsnmnb '1.1'sseqEr@vè|glB
5.r s.ùrirg Múi€! ss€d onFùY 5.2 PAMscùns MdÙic$ slhsùtuiìornùn 5.2.2 calcuìare 5:.r
MàtricsrorAtn$ dol!úon!4 Ljme
5.2.6 ScorìigMties (ìos oddrms@t 5'2'7Estìrdinglheevo|Ùlionij'dtbl.Ò
Comparing BLOSUM and PAM Matrices
62
6 1.2 Rènovi'g Ns úd corms 6.1.3 Pdsidonvùshb 6.r.4 scqEnc wcrls T ó.1.5 rerùc grps seNhìtrs Dd,bses wirhPrÒ6les IdTSd BLAST:PSI-BLAST 6'3'lM*jnguì.mÙlljPlea]ìgrmeÚ 6.32 cdsùcddg rh. Èofile
,
ó..1.2 Cdnnrudiry, Èrrìle HMM rof a prcreùfmìly
?.r 7.?
Îe PROSìî!,ragù4Ò E\ac,/aprminaÈMÍrhins
7 4.?
S.ónnsFlftefr
77
CompùnÒDBaedMcrh0ls ?.7.1 Piwl rllsd nertrods
r3
Pfrem Dfirn MdhodsrPruh
Part II
STRUCTURE ANALYSIS
StùcturesandStùctù.€Dcsc.iptioÌs 3 r unns ofslncrun Der;P.ions 1l 85.r
Linè$smen6(ficks)
3.5.1 srmdted sheb (roPs) 3.5.5 ropolosyorÈo'einsrrucruE 3.6 rdsrryins i\e ssEs 3 6.2
DènÉseÒr&4 shcM orPndis (Dss?l
ssFnn.wùkfolPaiÚjssn4uftconpùjsor
Superposition and Dynamic Programming
9.2 93
9 r 3 U$nsRMSDÀsoriisofsdctuEsiDiìeìries rnd Alignmed AlEmtins Sùpcrsìsirion DoùblèDynmic Pner3lmine 9.r.1 Ldlevel soring naùies 9'3.3lbdEddohtdylani.P,lgrmriig
l0 ì0 I I îvd.dirúsionl seonètic Mshins rorshdùrc.onpùison 10.12 GamcùichÀsbiis
10.2.r Meaqjq rhesinìlùiry or dirù.è ($b)mries
Il
Cluste.irg: CombiùineL@alSiùúlùiti6 aodCo*nhcy 111 compdibirirr rl.2 s.ùchirsrorS.ù1Mtuhà\ ll.3'2orefìappinccla..6
ll.5.l Compffins rtrtomaÌioDs ll.s.2 cd.ul ii! drercv hoslohdiòn rr ó Crlrdig by U.corRetarions 1r.62 Ceondfi.fchrion
12 Significance and Assessment of Structure Comparisons
coishritrg R.1.dom shdua ModÈlN 11.2.1 cÒnsndinenorEduidadrbsG r2.2.2 Den!úloD ti'r for.imì1 l,
13 Multiple Structure Comparison
rinding r conmon corc fior ! MuìriptcAtignncn!
Part
135.2 Diso(nne PrcrinsPúem l3.5.3 lhe aPPúmh 135'4 scrirg ùs Prckns úods rots 13.7 Biblìo$.Phic
la Ptut€tnStnctúre Cla$ilcatiotr r1.2 An lsinsModeltof DoMi! rddúficfior 14.31 MdnìY{donl?B r4.r.2 Miinry-, domdns d , dorùis lari 14.5 Aúomtic APprcshs b Cl,ssi6cati0n rj.6 DibhMs lor Sh.M cì6sific'rion r4.7 FSsP-DdiDomhDidioúJY Ia3.1 Domains r.r.3.4 ToF,losy(roìdlibÍv) lmili6 143ó sequcdce ddsìrì.adonpD*dÙE Th. CAIII 14.3.? BNed m sùcLs r1.9 clssificarior
Part III
SEQUENCE-STRUCTURE ANALYSIS
15 Srmctùrehealt tro!: Th@ding 15., PFb! Ssonddf, SÙtrctuEPtdicúoD 15.1.3 Ac.ú4v in $cond.4 shctuE predicÌior ts.r
MdhodsBucd d sequde ausmst rri.l Tr.3D rD ratchiic n.ìrod '5,:]'2]xcFUCU€ú.úod 15.4.1 PotúiolsofFd
rorc
15j2
DoubÈDyimi Pmsn'rming
Pre
AppeDdixA Brsic in Mathematics,PrcbÀbitity..d
A42Prcbabilil}di|dbulioN
B Introduction to Molecular Biology
Preface
r probiobioiDfomaiìG,rcuiins or mdhÒds
!lms'gúnlberelbaisforch6inglbgtish.plogDmndtb{iglipaEn46od oprions,aÍd liniuy be&ne noÈ conÉ rhe onpúr rcicd (o! q,npúú sico@ rodcú). on 'he dhq hand,nly lìDd aid n or sh. mry edi úc údèóbdiis ics$ry b mmc iftú!ìng probrè'ns, ónnù
bsis for ólabon'ivc prcja 6 nodùe igbt balanemakinsùe
ùis by l$uiry on rheideÀ or rhemrhdrlprcgEft lhire sryiis mry r scnrd 'Ih. ider {bou(heir ùù of appri.ltior úcùons ùo .lso d$nb.n rorn{lly $
'ÀeonginilÉ*afthpapcsalwdlsÈv r.o
ahoú ùe abjcd i EoE daù.
etve.xene desdFio$ onheivanabbdrbbrss forDNA aodprccinsquEncs, Tr4e hool! tpìdlly 3oì! y.e lirlè d Md ii ú. pmgrumsor hk. îrrc a
my rì'd sonc sdios
hardro foìtor lwhi.h oi bc 'kipped). ri . simitrnEDq
s we Fdìdc biolo3nar ndidioi
er
ùd rÀ.
What is bioinformatics anyway?
.vùbsddodfmflEbokpìi.ed rndmofDNAbyw{sialdcfi
mjÒ! bE*rhrouehsitr rlÈ 1910s J r95
x n rhcdeveìopnst ii rres fó
iigmaÍyrldentrndcsÒdhcÀrìlmr]oùfieúmruùtiÒn,lsd'h9bù]o!ia|
i. rùvcr duì !ùh aos ùd od (bìr).DNA trles(rùdodid$) dìd comein foùr dijìcrú
ldopnm'olhighlhrcuchpuln.lh ih. appì'6ùù 0f 6mpúh
h h6Yd
rhis
pi., bu.rilì oreof rheho$ widclyusd :inilÙ6agiqqu4s.qu.i..'Ti.q ru be*prÈd rohapp.nby ch!n.è)si rero rhGcorrriòdaúbs. prorin.areoirhms dì*ú$n in dePlhìl chaP€n r-3. úd, anddudy rhcarìsmsd ro s,in infomrion abournherclarionshìps bdwes 6ùaìaldsfurudprcpefiesommonblhc h.E n\e sme dioo aid). 'ìe aino aid rir. sdc( îcspdivcry) 'bù nry bchifu úNr rhcdnrrc. or.cùdiry rhdtr€ riomfy felarionship beNen a s. or prcbns ror seret. sorh tor pmEin runriona mryss, i' is .ùciaì 'bd rhemùlÌipìealism !fles rc rrcúedin chlpÎeB,r 7. !ìmtl foreuùrd smd€s dd 'hemjtrwùkn's hù*s (pedomjJìe,rordddq tuúbdkm rndsignalìitre) ii livinscel d€*), b bdh ùdddshd rheevolùrimùrpóbìls Gìn.Òrhcsúùcrùie013Èdeir rhdg6 froÉ slowlyin cvolùtionúù d@si$ $qúrc4, aid.o id.idry rb.ommor da$i6.!rìoi or rho úivù*
of prc!ìr 3hdrca. A coúnon appfúh.o doirg t r . , q ,\ n 4 r G t o , l r o n , t r . . , r u . o n
oi prcied rhd(s.
In Part II (Chapters 8-14) we describe
frudùr grcnrssqù.i!Ò('hcodtrorrmjnomidsrtÒrs
ctuinG).rìoEqmos
!c.rc .tìa i o.ù. rru4re diíion
No This Qn !s sonc
d rhùdine in pin n (chapref l5)
ind porh rrudres Mo{ofùci'lg rrpìicrbìe 'o iúcrùÒ.ide(DNA ù RNÀ) squcr.ù\ lrde e. bÒ\$€( a nùnbú
Íl,/pjdeiJìbìÙfonficr
Notation
• a, b, c, ... denote unspecified amino acids.
• A, C, D, ... (the one-letter code) is used for specific amino acids.
• Cα and CA are both used for the backbone α-carbon atom.
• M is a general alphabet, mostly used for the set of amino acids.
tsr.r,,...,r'tfo.!!c!ors.qùo.aj.
. sr' r Nd rórrh.sr Ir,.rr. ..., Í I. . { is ùsedror a (qùery)*queftq ud a for a dÍib.e seqúna. ! 4r..J;rìcruh6.qu.nccGuùsùing)or4fim4 rÒ4r. . . isùedfùrsèrcBrBidue. . Jsd S'ùc us.drore.orìnsffirìy b r I is usd for o $dir€ mrix, Rd,is rhcsoins beMeer. ud à.
r ur,4,ar,rj.4,4....
GN.drùnsìdusvh.nffiyrdnsstururcs.
. , 4 r . A r ,A , , r / . & , r . . . . . f t N e d . ? is ùsd asa prh in dynmi. prcsGnnirg.
. ( 5 ,P )= r R ( r ,r . . . . ) . ùsìlgrhe$o is na|nxr?.s is rhe$oline,hd t
AC
Acknowledgements 'ftis bmk is builronletoE mb for llm6 in bioirlmrùd aìsonìàms bughr r 'n. 0niv.6iryof Bà-ger.we rcknowredle en studedb whùhM torlow€d rhe drarr oysbìnrùfod Elsdsm ùd iiromaliE ardiúpinns.on!ffirions *iù ReinA,srúdiDdKjerrPtuen, Sone of rh. d.itur ov.rd buirdronjoinrwr únhavh Brumr md DrvidGtb.r eI ùd èpdùUy RurhNusimr ud omù Drcffor vrubìe he1p. rinally,rbinrsroou fmiliG fork cpirgupvirh usdufìns
Part I
SEQUENCE ANALYSIS
1
Pairwise Global Alignment of Sequences
.h.."" '","
llll:11'":l''-) 6"' d.r r prrif we
eqrRÙ'ch!fo5jI4|^;\ràels.ongJ!'1
*"p'"
ùèotú-cD6k
bhtr'Ùehmdrdtrid-m'
"lgollmbÒJ'n!T!r.em'.xed
Pù'h
bbúJqùqk'
dt dù1ùùaqtu@r
016 d \ r" &.",. i
nB^. -.N"i.r
1.1 Alignment and Evolution
drhÉ5 6uúr
Nhsc
rhcrclrìoNhìpby a/irrù8 'he qEEs. nre iliemù' strùtd ippencdinòccldlurioiof rhcnqìsLqucnces.
(ùoded by rrd,f) nEms delèrilji d iirÒrión (t/rn.
onc q rrsr
owi Grd, ir iÒrkióNn).rsilLnrì$mcnr (onryone hlLrpscd Hidtrechrqe ìn eùb muturioù. rheaùsnnenr b€rnenr rnd
:1.t1:!.11: :t oi.4db, : too. ftncon.
Ì1..d,
Rod I ooeoiúleBl eú$brliii6.
l..ql6.-*-*.,.
s à t r r d i r n . r , , ' or . o o ! . , n e m r
sd o- \ ne . hpr. moder.r r ù!d. q b m rÈc ù" ,, Àù ù mnmc
rbuns r*o ildeb. onè hiíoy cÒutdbe
qf\ù.h.erc|JdmJryIil'ol)hîjel|'gnmÙ'ùL'D\okh\ò"!bùt'
m\ mùr .ì " f - e r h o r t dh , e o ú o @ú a r t Ò E h r i r s " n ; . , u r n 6 "^.o ro pcú rr s.rcrn q s ,\ oir' mer, sfu b, .!!roc_ d ,,4',.s;/" prce n
n'Ù""noluuonfJ"ub1'"."'.hji€g'e1'
1.2 What is an Alignment?
mN \djsrythchrr'^linsoiqdúú (resdu6)inq andI h! . Aìl synbols
1.3 A Scoring Scheme for the Model
Gddili\e somg $heme).
. À , ,- l f u 4 = , , 0 r o | , + , i
1.4 Finding Highest-Scoring Alignments with Dynamic Programming
'*;ffiTliÈi':r-if ;,y,#.r'f.:;hBi*r
*ffiurfr ,**àtfr'".ry#L Usiùsdpùlc pmgmnins, frd ù. hishèt po$ible kE.
hish*r ff iltl*hidìns,rc sùcbv6 ns,,e To.xpìainrie m..hodee ùbdE
"r"Íi:tì,:tr&*J '
sorÈ rohior
Pffi :T::Tr1'1 ilfi#:H',Ì
r 4, rh i!\ synbolùf4,lj ì.1în syDboì orr. "
'nbnk orq rr era0Pìe.r, î =:th. ! rr , r rhesequenc! or 'nc jì6.I lmbok or l.
.,9'JndÈhigh.s]uft{hthlinbe
Ndclhir,,lvillbe'hÈbìgh*soc u* of dÉ Eîami. pfd!Òmmùs wa r,.j by siig oie or mof or rr.J. 0 < f < i 0
LA.2 Îì!arjenmenrforlar,.rr r ) an onlyend+hhoÈ orrhrc djlrcfcm dùm3:
xtsrfi|dsdDlcaionf$Ir;Jb], ddennìtre'becon.l!rL(fu':/l]. we us i = 3 ardr = 4ós sanptein'heeiplrmiE, (4r.I = vEr. /r r= PREî).A$me ve kmv r! r. Îìc i'rrgimciÌe|dssnh (4'. ) Fo.rhe*mdd il is (Ì, ) ftealjlinHÌ fol rr dhtc
2. rhe irilrtri'
4;t
eùr \nh ( .,J, c,r)
i mry be vi-
fil
By lsi,rsdic rmc qprindnn r
t-l ri
PAIRMSECLOBALAI-]CNMENTOFS,IQUENCES r. fte argDmn'4ds vjrh (4,/r.rr.disinorisúù
di..,,l!t ';
=rij
'fùeermphrmùe
! Etrl
PE b
., I + Riir, (r?4 =0h'hc4anpre).
F,.r=nylÉj) |
I ishs, .. oe ! . h ,tu, | ,a ,"rr b, e I tr. onÀ ,.4_ t E H, rt t+R..1). |h @
nL$-bl';;
ri hcìph ùe ,lisn;s pNsq n is arurcprifc 6 úecc ú. eoÈs 4J in ! &oùe ùnd mrix orsize(à + l). (,r+ l) s sho*i in FisE 1.2.rhe aIWs shd! fhich úrief fiììedens m uen forcdcuta,ingùîevd@ofr elì, rr (i = r. = I I I N q d , ó n r r e 4 f r . n . o m nd o w rf e ù€boroî,iA\|tun er.HowN . !e r ar I qnnor be cd.dabd. lhmf@, *e bavero itrìriattu rhevaìus ú re atrd . o p r yú l r r . , ì r o / ,
. li,
;;; E":.h trorc-& lrDr iìÀ:
d o 1 îú c w d i n s v . r hI b t ù * , c h r td , "
ru,og,hrq0.
_rs. F,suFr , .bow.h.
af,befiUed'bdl''tshq''Fig'l|\,' rhFo!'mp|cÚi4t|en.'ed'lrrcq'À|sw'iJ'flgln'd||'
'ì? mìshboùclls vit eiE rhcndimum v'ttre 6mùr yi n couìdh0pped rtìú au -"i^, -tv. Nd" ,hú k,h" nE. *'
E N .=t ú a x Í H r . 6c . H 1 i s , i r ? , 6 + t a d , l - h a t o rvÈsc 6a! rbeD*inur vrtue0) ìs
i.2
r.l+01=l
lbrJ'_lb'dol9]0:=4md
Prce(s* BibtiosúPhicnor,
1.4.3 Finding the alignments that give the highest score
To find the alignments, we can follow the paths from $H_{m,n}$ backwards to $H_{0,0}$. The arrows to follow for the example are shown in Figure 1.3(b).
med'o nrìrdc (4',4). !ùicb od6.hd rhe ! bìlir usirgrhosiùrès(ssriF r! r).!e [ad ù( lorlnn omsllondirs ro cen rr.r À\ nr|ors: . ifrhedorco!Ès riÒn 4 . irrbeùor.offrnùm
r.r.rììecoìr'nri\k,
r, j j.rhecoìrìiùis(
); /r);
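A minimal Python sketch of the table filling and the backward trace described above, assuming a linear gap penalty. The names `global_align`, `score` and `gap` are illustrative assumptions rather than the book's notation, ties between arrows are broken in a fixed order, and only one highest-scoring alignment is returned.

```python
def global_align(q, d, score, gap):
    """Return (best score, aligned q, aligned d) for a global alignment.

    score(a, b) gives the score of aligning symbols a and b;
    gap is the (positive) penalty for aligning a symbol with a blank.
    """
    m, n = len(q), len(d)
    # H[i][j] holds the best score for aligning q[:i] with d[:j].
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = -i * gap
    for j in range(1, n + 1):
        H[0][j] = -j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(
                H[i - 1][j - 1] + score(q[i - 1], d[j - 1]),  # symbol against symbol
                H[i - 1][j] - gap,                             # q symbol against blank
                H[i][j - 1] - gap,                             # blank against d symbol
            )
    # Trace back from H[m][n] to H[0][0], building the columns in reverse.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(q[i - 1], d[j - 1]):
            top.append(q[i - 1]); bottom.append(d[j - 1]); i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] - gap:
            top.append(q[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(d[j - 1]); j -= 1
    return H[m][n], "".join(reversed(top)), "".join(reversed(bottom))
```

With a match score of 1, a mismatch score of -1 and a gap penalty of 1, aligning `"ACGT"` with `"AGT"` gives a score of 2, which corresponds to three matches and one blank.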
Itr;i fliÌ.;'":fli#Í;"",*i.,1fl"'il,TJ:;:^l**,* , r".d", j#ifi d " ".","".,4 ":;;;.;
: ;::-,:":Ì::l-;,î,,
ri,"nT.':"y*i,""ffi : $"s.'f*, i;1,s.r"ffi frlH.Hi *'*ffi;.:"í::'*'*'**oreposniorbÀdin4ùd,ùein4hùe
sq.'enì diemÈÍs otr ei\r úìdhich.s 'hrrrisnmù's!tujnu iNd
;l
b6lglobrlrìignnmb'
ldùrd rhciiilioL
6 /,r.*r .(r.,
r. r)
Dmir'yoImchlmk'ljndeippll'r
Rtt-
:Rzt-t):|t =t+):j
:=j
|úd
t\t =ai|ts|r =' '.bltkt@k(i t.i,t,k+t)t\d ifHl.j=HljL'3th.n R \= 1.[.k+1)tt] t B , \ = , 1 j : b a ú m . t : l ii R t = 4t: R2.,- tt , b4air' t(i
r,j
L. r.r + l) rnd
rsF
rr
(f) tu 6rcdii h4ni ft
t" "' ". r f:'" I
a
s.ccncÉlu mof btúLs njgbr loùN chh neolnefo]lwiDgbLfksisoÌledlgap'
:n erùple or d rismeîr wirhhofc snpsis
1.5 Scoring Matrices
The scoring
ùstdìl s{rion L4 i. bo simptero be Nd Nbenilignins na p uKu oi
tu asidùs arufjns in rhc seqùeiB. For No esiducs {4r. ./r. w leed d liaqe or ùo pÌbbrbìriry (or LlÈrihood) n\d rtry ha€ , omnm ú;,br ú rharùc is a resùhoi orè Òlseveralmùbrirfs otúc orhd.11ìèpojtìoo ohh. rcsiduB n iflored. ry of rìì. cúir-! aminodds is trrd. This
scoRriccArs:caPPENALTIIS (ù m*diahlc c.nn). rhcsoring mùj
s ù th! nmcj ùd re erptaii.d ir chaÈr 5. digùrs4ù|rminoejJjÉDlalìÙi
q
1.6 Scoring Gaps: Gap Penalties
ÙgxPsú'bea|iginù'gaqto'nykd
wd ù\sùú! rhd rhedeleriontù insnioi, ir úc.rolrúiotr lìr sÒD heodrerwr, of
coRrNccat s c{ PEN^frEs .lDgclg.pNÙ!$qaìreiehboÙilg+Ùt FobNrtu 6f gl'PpÒ:lriess'n8rgErudft
(r r) arcsad rohcq,r{rqrq,
'Ihe
lÌned gap Èùr.y n,ndior (sr : sD, !ììi.h wd tìs! uscd pÉuus,f ,\ aioheiúLlyGidnihNin3or rnddcD.cncidireagrpsboddbep4iurlu fnulr for dÈ slP Fnaìq, dìoùìd b. !n dr,r. so.o..x!O lhe frrmridinr úc hnelrppei,lry s!|ìpeniìryruMiú(lhilhis .oron
r.1 r
lappemìq!5+(
r)05ecgddÈdiFmeî|
{hth lonhiis ferù (rìJr) saPs chrqùg 6e srp p.Ntq ro | + I
)0.1rt$1r5ìrrherìign,..d:
d,aodlypkJ|ylirc]:10&{..J\yevi]i I dignhcns 1{c ch?pÈ2.3)
r nr^t
i4 ùi
)iid
d ù! !ú.nx
$
ftlc@,
4 &id
However, if one of the sequences is a subsequence of the other, AL2 would have been the correct one in that case (and would be found if end gaps were not penalized).
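The effect of the two kinds of gap penalty discussed in this section can be shown with a few lines of Python. This is only a sketch: the function names are assumptions, and the affine form $g_{open} + (l-1)\,g_{ext}$ with the values 5 and 0.5 follows the example in the text as far as it can be read.

```python
def linear_gap(length, g=1.0):
    """Linear penalty: every blank costs the same amount g."""
    return g * length


def affine_gap(length, g_open=5.0, g_ext=0.5):
    """Affine penalty: g_open for the first blank, g_ext for each further blank."""
    return g_open + (length - 1) * g_ext


# One gap of length 3 compared with three separate gaps of length 1.
one_long_gap = affine_gap(3)          # 5 + 2 * 0.5
three_short_gaps = 3 * affine_gap(1)  # 3 * 5
```

Under the linear penalty the two situations cost the same, whereas the affine penalty makes one long gap much cheaper than several short ones.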
1.7 Dynamic Programming for General Gap Penalty
mepeo.]ly,ind.p.id.dofhdìonglheg'pis' Forfinding'hev ueG.of) in n ,j vhenseFnì gip p.iÀì'icsm Nd, w. nur . rrcrof
ú. subitignmormdswiù îte p'i 4r,d/, n hr g r pi i q o ft . n g 0r ì. t < r < r , rhI grPi! d ol ìeDgrb J. l<J
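Figure 1.4(a) shows which elements must be used to calculate $H_{i,j}$. The recurrence formula for this is
$$H_{i,j} = \max\Big\{ H_{i-1,j-1} + R_{q_i d_j},\ \max_{1 \le l \le j}\big(H_{i,j-l} - g_l\big),\ \max_{1 \le l \le i}\big(H_{i-l,j} - g_l\big) \Big\}.$$
The time complexity of this recursion can be found by noting that the number of cells examined for finding $H_{i,j}$ is $1 + i + j$; hence the total number of cells examined is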
II,r+r +/,=* +I., +ttl = mn+ abù2 + 'nnzJ= oanz + nn2). Fiem L1(b)show!lonè of rìc viruesitrúe dyimi. pósruniDs (DP)Èbb ror aennpleofusilgffitregappdl'y'Îiesolin!shef.isd.|ìn.din'iè6c$'s we s ùd ùè woft fd hrdingùc bcr îri8lmn whú r súfar slp pedry n I liDar sapp€ntq,
nr dlnprc. noddrhrccnrriùs (ofre'lgrh l.2.l) rìJrI prndryd 4+ 7 + 7 = tB
sso rimcomprqiy): ùi\ qirb.'E
irgroúc0c grPpenatB | (i runniq rime(úh 'be ere $îs'
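The recursion for a general gap penalty can be sketched as follows. The code is an illustration under assumptions: `gap(l)` is any user-supplied penalty for a gap of length `l`, the names are not taken from the book, and only the score is computed. The two inner loops over `l` are what make the running time cubic.

```python
def global_score_general_gap(q, d, score, gap):
    """Best global alignment score when a gap of length l costs gap(l)."""
    m, n = len(q), len(d)
    neg_inf = float("-inf")
    H = [[neg_inf] * (n + 1) for _ in range(m + 1)]
    H[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if i > 0 and j > 0:
                # Last column aligns q[i-1] with d[j-1].
                best = H[i - 1][j - 1] + score(q[i - 1], d[j - 1])
            for l in range(1, j + 1):
                # Alignment ends with l symbols of d against blanks.
                best = max(best, H[i][j - l] - gap(l))
            for l in range(1, i + 1):
                # Alignment ends with l symbols of q against blanks.
                best = max(best, H[i - l][j] - gap(l))
            H[i][j] = best
    return H[m][n]
```

Passing `gap=lambda l: l` reproduces the linear case, while a length-independent penalty such as `gap=lambda l: 2` rewards collecting blanks into a single gap.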
1.8 Dynamic Programming for Affine Gap Penalty
rhclir$
elp periLlty(sectioi LJ.1).
risrnu
liMúor
tu e dson'hnh im@g+ pdry
ir b boinst!, w. nur 6rd ir ir is rhesh or r e.p Gleo + sqqr, Ò!î dtrsìoi Gdd). Fordèdiùine ltij we lókd ,er rr ',4.j I úd H,-'.i llre romuh roruùs d 'hcd1@icisrboùrinc.clls H:.; = Htt.r_l+ Rt,tt. ad ùis tu ein bèùscd.sincèir iovorvam gapGÉ sErion r.4, ror rl?). Fù cdùra,rns ,.r)'i (,hc!ìcmriye yia Hr r.r. {c úur brè ji,o móuÍ how rhealiennm'foi (q r r,lr ,ceetrd.rùtrcsshawrob€oirileE{rG@ (r) t t,r Lj be'r.sorcaîi - I,j vhm comins rrcmt -2, j rher di;"=Er,.r_8qd O)rd4
'.j bèrhesorcati - 1../vhetronìng nomi
r,j
l.ftisir
s {oùld beone,{irhoú, blÙ]r Bú 4Ì'='j-'ìr c) l:rcr
1,rhe$esoÉx'i
l. j whed.oniós fton i
"ljr"=c-,., si; - nurr, i.r - fd@d.,.r
&@,'kd
e"*_s"_.
2.j-l.rr.n
NbbciEdú,qr rqudi{s Hú!ù.'h!rreorirùìisfillùr ùdù o(,r,).
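A sketch of dynamic programming with an affine gap penalty, written in the widely used three-table form. This is an assumption about presentation, not a transcription of the book's equations: the tables `E` and `F` and the parameters `g_open` and `g_ext` are names chosen here, with a gap of length $l$ costing $g_{open} + (l-1)\,g_{ext}$. Each cell is computed from a constant number of neighbours, so the work stays proportional to $mn$.

```python
def global_score_affine(q, d, score, g_open, g_ext):
    """Best global alignment score with gap cost g_open + (l - 1) * g_ext."""
    neg_inf = float("-inf")
    m, n = len(q), len(d)
    H = [[neg_inf] * (n + 1) for _ in range(m + 1)]  # best score overall
    E = [[neg_inf] * (n + 1) for _ in range(m + 1)]  # best score ending with a blank in q
    F = [[neg_inf] * (n + 1) for _ in range(m + 1)]  # best score ending with a blank in d
    H[0][0] = 0
    for j in range(1, n + 1):
        E[0][j] = -(g_open + (j - 1) * g_ext)
        H[0][j] = E[0][j]
    for i in range(1, m + 1):
        F[i][0] = -(g_open + (i - 1) * g_ext)
        H[i][0] = F[i][0]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Either open a new gap from H or extend the gap already open.
            E[i][j] = max(H[i][j - 1] - g_open, E[i][j - 1] - g_ext)
            F[i][j] = max(H[i - 1][j] - g_open, F[i - 1][j] - g_ext)
            H[i][j] = max(
                H[i - 1][j - 1] + score(q[i - 1], d[j - 1]),
                E[i][j],
                F[i][j],
            )
    return H[m][n]
```

Setting `g_open` equal to `g_ext` gives back the linear gap penalty, which is a convenient consistency check.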
1.9 Alignment Score and Sequence Distance
! r ! , = 0 r b r .= r r - 1 n r 4 + r . 3 n d 3= r ì
onÒ. did,r\hi!o\ hd\Dú o6j&rsco
1. 4, <1,:+lrrdúy.
Éx(úretiaaleineqùaliry).
mtrdrdion (ùe hsromaúon ùth rbenininu : prù of*ings i, io! rec$dììy uiiqù.. FdÓfpÙisdofbioloeiúlsquÙc
Nmbq of opúarioÀ)be'wen
lar *!uqes.
HNcvcr lho ìrc nurbr of
disimcnr Gorcspoidi'ìero 'bc hhrory)is, howev{.
Foronprnngdnur@r b.!vèi diffee pairofs.qu.rms,ir il@mmonrodivid. rits obsfl€d dirb.c by rlr leigù ol úè lorser cqùei.e, eìIting in (ElaÌive) severalnodek for coreriDs aq mulrì ryú of ù. rùdioi I$ rh. .dftcrdl romuh {nh a ìosin.hmi. furc.ioi. Ld , É rheobsfl€d Geìa'iw)dùhncarrhen! conmonmondfor lìndìlg úe .offited . rehriE) nùúbù of rM'io$ (
$$K = -a \ln(1 - f(D)),$$
where $a$ is a constant and $f(D)$ is a polynomial that for proteins (when columns with blanks are ignored) is $f(D) = D + \frac{1}{5}D^2$ (Kimura 1983). (Note, however, that this cannot be used for large $D$ ($D$ greater than 0.85), since then $f(D)$ becomes greater than 1.)
usìlg 'h6eurusbra
.oiú,iÀ iavr dtriLní ùrjno rcidt,
ùìd /(r)
rL,if Lhsobfrcd v,lùr i5 o 3 Gishrot ren
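The logarithmic correction of an observed relative distance can be tried out with a short Python sketch. It assumes the form $-a \ln(1 - f(D))$ with $f(D) = D + 0.2D^2$ for proteins and, as an assumption made here, a default of $a = 1$. The function name is hypothetical.

```python
import math


def corrected_distance(D, a=1.0):
    """Correct an observed relative distance D for multiple substitutions.

    Uses -a * ln(1 - f(D)) with f(D) = D + 0.2 * D**2.
    """
    f = D + 0.2 * D * D
    if not 0.0 <= f < 1.0:
        # The logarithm is undefined once f(D) reaches 1 (D a little above 0.85).
        raise ValueError("observed distance too large for this correction")
    return -a * math.log(1.0 - f)
```

For small `D` the corrected value is close to `D` itself, and it grows faster than `D` as the observed distance increases, reflecting positions that have changed more than once.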
1.10 Exercises
1.ll B
À -
06
4!r19
(c) conpùc1tuiignmetrhroutrduMùG)úd(b).. rjndfúcrchofrhqu
\rts\!rsrcLoú\r {rr.NMrNroF s[0ur]Nc[s (r) h erìal .6s
À rris Gasombìe (
(b) îÉ gstml prcedurc ror dynîni inìh\i.eyaybùk9arcoflhis
(!) clxùg.Aìgoúhn
I 1 ro hke ido ac 1d) vi h!!c úc *qù.n!d\ ! = 4r u, rynìbols. t roi ùieqùal. ùìd a (lìner) elp pemlq or . Fiid rhc bsr
trccbdstùùÙì[r.qE4Nti.l
(r r) whil dtus ú. diskn foùrd neùl G) colLddhùherdismos($rù
1.11 Bibliographic notes
\ =rÈnú rndwu6!i ( rero)arir. d,n rÌ$ùdh aodaqs (ree2)úd isbm ror 'nùl'ipleinuÌitiors (rorDNA q :- :iixrù!pftsftdinKinìum(1e80. le33).Li(lee3,leeT),LLorcu(ree6)
2
Pairwise Local Alignment and Database Search
.ù. dishlÙ reraÈd pmbi$. i\e momurarion of mddios mi$r ÈaE Gad uneEnry rD$ ùe sqrner ìèdiis 10! sirilú scsrs Ènaìiiig in o'nÒflis Ìúghlir$hmdisàìmoEdiledyÌhmugh
idenrily a signilìcd mrch N Ìhe idúd ol 0eft (sùbseqùere, in dr rvo squenes. hiorogia! ùEbÀ*, 0d r[crc s r 1ÒrÒr rì DNA sqùen(s (EMBL (Eumpo, DDBj Oapn), acnBnk (usA), whùh aìì drxb*às Òlptu'rh *quofts (È.s.PIRúd s{issÀoo :ùrrin tbe smc $!u@t, rd dúrbri.s or prcÈin rruc'ùrs (PDB). o'her dfubiss, ror i6Ìme. rnnsFrc Tnmcnùion Fador Ddab*O, ùè m \tr enry in one ddab$e my conbitr intormdion Fratd b an eíry in ,nÒrhcr ibbir FÒrcx!'iplc, r EMDL @NA) ciL-y rm conaii rhercsion codingror one € drh $Èilirs
speidàr dftrtu. Fd dm
iÙjs'iúbùtbcitrurùd]fncrllAùes'
INC TWOSEOUINCES {ruduE'*istúefolprctiNùÓdcd
, q'fl rc\ lr i' id.Lb.rDih.h. jjity6notpHÙliÙnhonolÓgoul{qucnq.
ds lor $. squcrccs, fd ù.i o dlicicnr
2.1 The Basic Operation: Comparing Two Sequences
îc hrk.ú b! Ioúuhrcd asrorrowsciver Ì q!.ryrqlr'd rquocs krt. fiod rho$ l Nhi.h \hfc
4 aM r dd{b*
D ot
ftù úrk F ro hrd rhcsgmens Gùbk!rcnet or Ìh{e rwoqm.$ {irh bìsh4. ds8Eeof sÌmilairy. úd ro .rr.ùL{c d.lo.a/,/4',,f'Noblicl.singo''hc or \úiis. (vhùc suhsqEnft hs úolhq nqoitrs). nÈ remonror 'hi\, p.n.ps $bequcn* aÍd {birirg idùch.igÈrhL} rd spnfy if {. Nc ùb$quúc
a ssaqf
f 3 sùbscqwrceGubrù!)
più
or r of l'. a {gnon' do* no' conùm
en'rsn' c{h Òf4 6d d (h,y rccdno' a i,.rr ariqnD4, * ù dipned or a seenÉÍ $n
2.2 |
sh*p$''h'he.prcbabìein$ìinlikePegÌide r úoo ùÒ rdn 3y is (rÉ!.s d di. ciÀ iidi.!tr nÒÉr!rL6 ù hd s,pn ..5s
, !o
YÙFV:PH(N!GÀD5DLDALNPLA9VAEFEÈEDNSlSÉ'LRSALFPCsI ì.61--------s!poÈG ro.cacvcsLiaLNycN ]E iNSLÀMRRTRORAGI'ERC'
35 GIEOCCAGITSLYOLENCN
idu$ ù 'hc sgnen$ ) rhc kÈr arisnmn'
2.2 Dot Matrices
=ronmuy'dlefrnlt.hiiqmmedfol -rn€s (aì5o arrd dororos)Àor'! r03ònidÒ(!Òdi.rL),odtlìù*ùr./llorgtbÒotlrrsi'1.(bod,ùúl) Tbcvllucs dorìn'heeI (i. rmexnsrharl, = 'r(j) -;ufer
ì (r) \hors ! dd F!trjÌ, lhúc
I rÒù{r {b{rings (doft'l liî., . fn csy r, undc^knd. ivins r Lisrrld@ì
{oniìs...t g in F gure?.lG)ì
a'
:
[email protected]
(a)4 dd mù,
lhowùacom 6x.q'hvùdowÈoeù5.Údfuùdd60%'
sùlheind?ah'idìl'oràmprq3ìl Fjsurc 2.r (rl aurd be qEnded h frve
ps. c aid d could, ror ùmplc, bè ombii.d
Figtrc 2.2 ir úc dot úiùix for rhoh .on{ (itrdicatdby rhesìÍshs)
2.2.1 Filtering
l'o/r}+r'e'c''lrEicon'iues'j'ìì'hemxt i::nionorq(rcw'Noìn'hedoÌmddx):4, 1+r to dì r. e'c.
rÒ md.h. Usù,lly, xn crd mrth is iÒr iùqtriÈd:
Figure 2.1(b) shows the dot matrix when a window of size five is used, and we can see that the noise disappears, but the sensitivity (the ability to discover weak similarities) is reduced.
, t t ) j + ' t > c % r b e n D ij = ' r ù d
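Filtering a dot matrix with a window can be sketched in Python as below. The function name and parameters are assumptions: a dot is placed at cell (i, j) when at least `min_matches` of the `w` symbol pairs starting at q[i] and d[j] are identical. Placing the dot at the start of the window rather than its centre is a simplification made here.

```python
def dot_matrix(q, d, w=1, min_matches=1):
    """Return a len(q) x len(d) matrix of 0/1 dots, filtered with a window of size w."""
    m, n = len(q), len(d)
    dots = [[0] * n for _ in range(m)]
    for i in range(m - w + 1):
        for j in range(n - w + 1):
            # Count identical pairs along the diagonal inside the window.
            matches = sum(q[i + k] == d[j + k] for k in range(w))
            if matches >= min_matches:
                dots[i][j] = 1
    return dots
```

With `w=1` every identical pair gives a dot, including chance matches. Requiring, for example, three matches in a window of three removes isolated dots and keeps only diagonal runs.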
2.2.2 Repeating segments
dr{ r+r ftùè eirl bddos onù. .ubdirgonirriù (i. j) '0 (i + i, j + À)(r.c
s\
.:
2.3 Dynamic Programming
ììebc{ srob{r!|anmsr. Dyi'ri!
F+' ùi .
b$' 0 ghrd \conrs) lo(L rtis'ftnr
Òodi
ilr /6. TrÈ ber slisDmenr. ftom ùreb.gintriis, rndiie ú rbrhe* ftsjdùàsthc bsi 6)n nMd itr Fisùr 2!k). Nrh i sor of 0 4:
dt.r
2,3.
sis d1''
2.32
TneexamprelhoNrúútprc6x6 Gonins 66tìn 'he alisinot)
mùlr(4ri /J ,) r!t
{
ch(uc
nlg!
<, r
2.3,3 liidiig gùb,r drcuúÀ, w cm Ero iÒv |r,.j b€ rhehisles sm or my h iyrbols, or 0 ìr rhh sóf is < 0l
à , . j= m x l 0 . T i f t r ( frt..4 . . .:,r < f < i , r < r < n l .
'] I Ò!nmN bc úkuLxÈd for an r, j, dd r'ÒrI Lùo, eapp€nal'y(3r = rr) 'hc H,)-n!ta.H,
t1
\
'F
3
'
| 1+Rr)t,
The sÒr Òr ths b€r ìoù1ìiirgnmùús qcrend*j'heaps.sìi.ecrpleìrcDc4tit coirnbdions. l:fts bef ro.al iLgmd' ro úìc hishc{ hrtr. ù H )
= 0. 0 < , < ù, 0 < <' t0 r = /1r.0 or4.l is showtrii Figor 2.1(h)îc
2.3.2
(2.r) i@
Finding lhe bcst local alietrmenls 0 is rrched. roî 1rr cxlipr. oic .siry jì!d
L i\ nov shthtfoNrd
4,
cú
46
cED
h .hdge úiÒx
irnilrrlridn.ùndiidùd.0irù.or1n scÒÉ(of on.r
one atismen. n enoùsb).
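The procedure for local alignment (never letting a cell drop below 0, starting the trace at the highest cell and stopping when 0 is reached) can be sketched in Python. The names are illustrative assumptions, a linear gap penalty is assumed, and only one best local alignment is returned.

```python
def local_align(q, d, score, gap):
    """Return (best score, aligned segment of q, aligned segment of d)."""
    m, n = len(q), len(d)
    H = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(
                0,                                             # start a new alignment here
                H[i - 1][j - 1] + score(q[i - 1], d[j - 1]),
                H[i - 1][j] - gap,
                H[i][j - 1] - gap,
            )
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    # Trace back from the highest cell until a cell with value 0 is reached.
    top, bottom = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(q[i - 1], d[j - 1]):
            top.append(q[i - 1]); bottom.append(d[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] - gap:
            top.append(q[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(d[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

Finding further, non-overlapping local alignments would require masking the cells already used and repeating the search, which this sketch does not attempt.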
*r-fdiig toar disiFot (o' au 'he aìi!nbur ùr h rft i" flL 5!qù.ru, Húwcù,
reagìon'mdúemysm(lol*ample.b in onmon îbn righr b. do!. by tu
2.3.4 Scoring matrices and gap penalties
lhespecÈdsolitrgolî]ìgni'eNoúbùaq
ofbnhc]'oudÀúeberlo*|s]ieríÈd
hi.ssùfusd.rbomúryilcms*din'o
lúìino rcidt be a, md úe soE beùeen lrldismotvill
bc(cD cn).R.dù.idsthe
Rcdd'g ùe p.ùdq lùrhef (whilek.rpiie & > 0) will fsùlr in (cDÉc cD-c) eÌhalinlhissinpìe*mpleÌheúmdq qEhonciscqlalro
Gho showirsasair ùat ùe !!p pciarq .hotrd deDrndon Nhi.h (onn! mùir is ùsd). ftogrun nù d.iry ,ienúcrl1 usrlry dlow úÒu$r b spkìry which$riig
2.4 Database Search: BLAST
sb]*dmdyimicplog8trìjng.mdal$
trLgppcfIJqú.'"'d''Ù.p'ogilmjtrg
pfoglaúsÙ.irgh.Ùirj$hù[hccidc
gFop or prcsÉmsqùEd FAsrr (LiDmaraodreuon r$5r Pemon 1ee0).ind BLAS',fr(B$ic Lmrarigtr'icu sclrli rÒùr)(aL*hùrd r. 1990.r9!7) îc pmgmmBLASTF.ùdùì*le.sionofilpiìl bcdqtrnb.dìi Chapt!6 rrsom.olùc1(h eì ddlilsBLASTTNùkr slieì,rl! ]rìe[linpinciplefolBLASTnbfi
. ^ /i4$n4, vsnarzo,
(MSPe/)
sùrcobhii.d|olìosìÙerppddienmúÙlqsd, . a h4h ,cùig,4ù!ùt
Paìr tHsP)i
mrximaìscsmcn( Ilair(LMSP).
MsPar'rons ovd a rhcshoìdv (ve 6sùN r mndr.d
soq *c chaptr r.r 2)
ùtit Ì,tit |ih d stÒtèÒJ'tut k! ttttÌ v rbodìr b rìndúse Nordpans(or rtrgú u)
rlhes$niviiyollbepmgmìwiìì.1 'o b. dn.ded) Dair milhr bùdv'roÒkcdGauft3 hÒnohso6 sLquLnccs
24.2
In rhc hr {sion of BLAst ùc 0r@dtr€ abor h4 6.ùi .hrigcJ ro rc'luirc tlohitsbÈlftdÈrdjle'fuqÙjrn'€l {s'mhierrssnng{odptùsndfafiÍon
erbe.onbii.'r(DRrsEv,
b(DR, sE, ú).fteMolììb(DR, tr) DFsÉd), posibìy cx'etrdincrorn HsP.rte hit or sE rerd ol onshf. thorhÈsholdprlncrcf 7
rd rhc alphdr be (] vidr cùdtuLli9 r (i = 20 ror prcEin $quenes, nDr rù DNAs'rrcnccr), ù'd *old leuu , TF *qùerÈs/islypica||yequa|lo3'm N' /,Éptusr?4 wh!rc ft nnponio'do. eord$,r(Òrdlpo*ibl.$ordsorì.rgrhù, Ì e.r" md, whichrlnly r(,. oa)> r rÒrr le$r oie *od.q € q (r is 'hesoE) irc round,andarurcpÉscíedii 'hodd! Endr ol rh" .-,1' 'r (|nÒr fo.rd ir rhepiepressii€). heEe frrdir! hùs.Trc 1. scarchin d ror0..ùronss (4) or rhcwo.ds(,?) thd ùc itr rhedrrasrcre 2. E{erd (boùrid.rlly) ro r hish so ig scemcnrpri4 rbosepin or (îon. orúl,ppitr!) sùceedilghirsúich srily rhedhùne onirìinr lmitè HsP9
':
2.4.2 Preprocess the query: make the word list
L.arl,hibe'e = 1À,c.Dl 1vi.hqùd
ThefiN wordi! 4 (ùl d{ÀLy mÌche! s (s.ofc2) but 1.5.( rÀ,4) $orosI úr (a,D) $oa0 t we rrei s.
. rhewfd sditre arnsiduc3 ó{.h$ ac, Dcl
Ahb|Ùo||',dntiqlaEused.otreerù}ln!crchPossibìe{od'AtrctrL'y.onbjnslùc
r'ùi plorir *qucrc$ (r = 20) md , =
$r ruaEhing rcr'is ir 'rcqúcry.ryei.rry.
j.
ongw4dfusingrJjiib{latnùchin
of 6c (lùb)rùd of ú,t frt
atrdrhe bú (ede-) ls oprÒiùr !ì a MmE mrhiE \ih
r!(rl$G
on {r*)
Th. fr e{d.
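Building the word list for a query can be sketched as follows. The code is an illustration under assumptions: every word of length `w` over the alphabet is compared with every query word, and it is kept when the score reaches the threshold `T` (whether the comparison is strict is an assumption here). The function and parameter names are hypothetical. Enumerating all words is only feasible for small `w`.

```python
from itertools import product


def neighbourhood_words(query, alphabet, score, w, T):
    """Map each word scoring at least T against a query word to the query positions."""
    table = {}
    for i in range(len(query) - w + 1):
        qword = query[i:i + w]
        for candidate in product(alphabet, repeat=w):
            s = sum(score(a, b) for a, b in zip(qword, candidate))
            if s >= T:
                table.setdefault("".join(candidate), []).append(i)
    return table
```

A dictionary is used here for simplicity. The text describes a table with one entry for each possible word, which serves the same purpose of fast lookup while the database is scanned.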
2.4.3 Scanning the database sequences
c! sroù5rJ (rìndngnohirt. a|d r
h cid rnd è Iblloqrh! rdgd hidÈd D 1I rnd 2) oi ù. ldPe shoug thaÌ rhuD !t
r*e .D (inwimJy) hble (dot nrtix) rs lurmÈdbyFìeuE26,U=,mdm*rl r.r i bèrhejJìd*of4, sd j orr. rùù rr ositionof 'hc!ùrtdhir(hc u{roùid)is .:..Fdin i/(, (ai amy). as l js $inned. dìehrs ondìediaeoidsm roùid (byùs ::ùerúr.djagEn ofrheúbb).nìe fis' hi' n s ondi.somj-2 U - i = 3 - 5), :r 51úù indù ofq) !s sd.d ìn ùù ú!y i/. suchrhd i'Ì( 2) = 5. Trrcicf hn
'' :.;;;,:l'; '-;.
r,1^" 19r:t-l;.'..t""""..i
í.1 ,;'::r.",",,.'"'-:,'',.,'ri
"'!
r"
"
(c,a),nrch. sinlt rhe soN has dromcd Nìrh \qÉ r j ro fouid (À4, DAJaîd hDlùv rhe rh8hord ( 1). r$o $glnent ùri6 (rcaLlv)roml Do\imús !^!l Nde. hslcrù rMl iì$! ùe nú 14ccD,DÙa) liiehù soro (2 5) iinhùerrunrior({ìth cÀ ca) soúd ralr ii i
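Scanning a database sequence for pairs of hits on the same diagonal can be sketched in Python. This is a simplified illustration, not the BLAST implementation: `word_table` is assumed to map words to query positions, the diagonal of a hit is `j - i`, the last hit on each diagonal is remembered in a dictionary (standing in for the array described in the text), and `A` is the assumed maximum distance between two hits. All names are hypothetical.

```python
def two_hit_pairs(word_table, d, w, A):
    """Return (i1, j1, i2, j2) for non-overlapping hit pairs on one diagonal."""
    last_hit = {}   # diagonal (j - i) -> database position of the last hit kept
    pairs = []
    for j in range(len(d) - w + 1):
        for i in word_table.get(d[j:j + w], ()):
            diag = j - i
            prev = last_hit.get(diag)
            if prev is not None:
                if j - prev < w:
                    continue              # overlaps the previous hit: ignore it
                if j - prev <= A:
                    # Two non-overlapping hits close enough on the same diagonal.
                    pairs.append((prev - diag, prev, i, j))
            last_hit[diag] = j
    return pairs
```

Each reported pair would then be a candidate for extension into a high-scoring segment pair. The extension step with its drop-off threshold is not shown here.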
2.4.5 Introducing gaps
',. 'h '' :;"".;ii;.,; ";,, "" ""
".'"'
'r"
d:
..nYbvrjschElÀvaR....
AGARL
rr\5ed (io r pofpff*siic rcp) kr !h.rk ifn\ey ún be conbilÈd ro r prir (dr iìign'ùen') ùh gxps Howeki thir E
losxlctimc.tbet'oINitrgi5inPìemenèd. t ú. HSPÌrù{hi!Ò!bishrsmah! sion À p3{dtrd fù o,Jy úe pd Í dahhÀE squenes. Nbich cotr$ponds I r rypicd.Ì{isn qlq sLqucnca (s$ dùci! s!p: ro in mry,rd iisnner
(trHt) arrgn
(ù rhe DP bbìe). L€Ì 'he d
ihrcshordbdov rhebsr sor (or rherudr
$pa
p!n) aE @tridù€d.
,,,tú rmù 'h. IrsP(otrcrcsirucrro'i
.{b ùrrlr 5.-s'irùsii (4.4 fti\ pai
u rn. bcr slpp.n disrhcd tnuîd (unrirnoql hrf soÈ y rd 7 bedìe
ù482'7fu6!!EilbrnBù.qa
ùÈ3hol4nd, b.'he DPratix'. in or lr,j 1,l/r L/ r,t, ,.r bve roiesleslhms,-Î'lhenHj'jwj|ìn
Algorithm 2.1. BLAST approach.
I qùùyroadahMs inaLAsfforcomPanrs
rhrcsholdfoldei(ùrsdyrmicpóeru
j
fo. ra1lrdlisdal Àdoc$rn(hÈo o fdrl:= r.o, u+rdo i := îìc Niitiar .fthè nd,hiB
wúd h q
ir sorela,-ú| ,+, r,rr r+ú r >rhrldìen daaú. vcttenîrin Qt-tt-t) t+ú t,ttt r+! tbarBSP ME thè Es? lb. F'ìhb dyúnì. tm s nninN
ll soft(HsP) > rhf2 tù€D peùLm lyan1i. rmsnnùùt
a,.tùt HsP
2.5 Exercises Tryùisonlldsquen$d=DAEAD
l,l (n. i, bc 'he fln' rnd li$ posidon i' (b), ùd (r, jr) bc rhc úme fo, d. s
!,1
EXIRCrSÈS andr ror n\e sappàìiry (ììncù).rw Íqr@s
*c givcn:BRRîRî and
(a) Find'hc highcr ndn! lodt !ìilnmsnh (yotr$oùtd findroùt. (b) Youhxlr rh$ly d foundrheaùgmÈú
cfler?rize Equîrion ( r.2) ro bÈyiìd iir rìldine a l&aì:ù4mn
{idì e.i.nr
aIA (noft€ 'haÌ only 'lìe imino xcds
A . I .L . s c r l
u s ú = 2 ,? = 5
(r) Mr]re3hbleteiúali|osibrevords(=4r= 16Ìord!) (b) Exr ncq, úd roroeh woln' ir 4 (o lxdinc l, rtul ror.ùb wd. iìndm (d) Youwil iow iii,r 'hd 'hcr ir x *o lh*conhinssLf@m,atrdhNs$rcsi hichHsP(virìì $ùe) woùìdyotr rbùcùr.oflYlt!ù (Lro)! Bci\arc (a)Nov debilk,m.orùc sùbmùdri
2.6
{hl r,ìsc
irc rbrtu bib d rhcsùù di
par (Ìoe àroded roHsr)
2.6 Bibliographic notes
Dot matrices are also found in Argos (1987) and Vingron and Argos (1991).
sqf
MP/www.g.h,sdq'!rc'aJúùhi lurion!d DfotnsedGeesÈohenI qql).
\'
, .'dMità rd3ò
r00t. FAsTxn dscnbedir pàMi ( 1990),,nd cippeJ BLASTjn Atrs.hùrerj
3
Statistical Analysis
compdns r qucD\cqu0ncc la) $rh É.h 5ùrudoct(d) in ! dú!b*. sNc! fúc ro k{d ,iFEcdl naiy G/,.r),e$h
(homorogous)tsqrqo, Í mui hc fÒh
ÈrfolsiEJfrlmeofîa|ienmm'o||qo
3.1 Hypothesis Testing for Sequence Homology
drsist$i.eHypúhsjsqùgatrúidly
hypodÉsis.Él'À.1$'qiH.
r giEn rhEshdd{qc o 0l (2s]), rher is cNon ror rcj€'iis H0 Gùhc 1* rcvrDùd r.cpriig ,r :n!ndù (4,/) sigfirì.aú.i.e.irrher -,v
ùrrttúútt
údpdhrn
::ihn@ql!i!núdlNlRì!}Lú
qqitur
rh. hìlher $or (hì-!h6' nsni6ene).
ùd rheiÈ!sld PUfsrìlr (4./) vrh
gùi.ollJ rquur!$ (\ccscdu :. t. )
Dder!.e
ùe rjcdu
ÈEr for ,!ro.fo
Fshm dÉ A|e im.ú.hosedù (:) fioddE leftnr
piir f,!ù (.r JJ \nf
n! rherî lr o' higlú, !iru, to G$ ,hc p'obrbiìiqdisribúur nuoduboo. rilr onrìnrc{iÌhrheq{úo. ted rof
3.1.1 Random generation of sequences
e rNro dd r^ sù
a$ùf!
ù'Lphlbdútoùslmbots IÀ,.
oosiioi (or úe nndor
!, Et ù'l rrroirqus!$
l/, =0rì../i=0r /r=0r. r,=04t.Drnìbedm+rwnhr!tubrrrrnl'\i!!
ndù .he cure Fom ó b @ is rtu piÒb$illry
rcÌdsin rhemtualsqEie5 (4 d)i\ u*d or onc ({tr bÒrh)or úè slùedÈs h dor
pr&dcÀ
Ior úe pnbabili$ dìstib'rior.
jlù8,ú.{q[mcisdìvidedùbcej
'' . , "*.' 't1,.'
;:
r'
: tti
r.:'i I ''
, ,i;....,
! I dL beiî ù.s Ddirios (bÙrùshut['d ofdr]
GÍ|gl5iiglesfNscÙafl
3.1.2 Use of Z values for estimating the statistical significance
ú.7skfds'hh.]llThj\lsudlesúd].
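A Z value can be estimated by comparing the score of the real pair with scores obtained after shuffling one of the sequences, which keeps its composition. The sketch below assumes $Z = (S - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of the shuffled scores. The function `align_score` is any scoring routine supplied by the caller, and all names are assumptions.

```python
import random
import statistics


def z_score(q, d, align_score, n_shuffles=100, seed=0):
    """Estimate the Z value of align_score(q, d) by shuffling d."""
    rng = random.Random(seed)          # fixed seed so the estimate is reproducible
    real = align_score(q, d)
    letters = list(d)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)           # same composition, random order
        shuffled_scores.append(align_score(q, "".join(letters)))
    mu = statistics.mean(shuffled_scores)
    sigma = statistics.stdev(shuffled_scores)
    if sigma == 0:
        raise ValueError("shuffled scores do not vary; Z value is undefined")
    return (real - mu) / sigma
```

The number of shuffles controls how well $\mu$ and $\sigma$ are estimated. The default of 100 is an arbitrary choice for the example.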
3.2 Statistical Distributions
3.2.1 Poisson probability distribution
Poksoldistibutioniclbenoniqrnlir|4d',nìe (úe prcbibiliry Ìhat 'he rochÀ,io ujrble
=.r=i" " Prx Plx>.r=, i1" ;,,
.
{ vill hre
3.2.2 Extreme value distributions
LertrL.
irs . r, tbe iodeperLrtù,
nrùcrioiroi ri i\ ùen (:incedì! r uo indlrlfrenr of eiú orhù) ,l= Pt\i
I
=t.
I I
qhn'io! ofr fr of (oven:ppiigrslneds ! (QDdon)qFcd
ofd. iid r1 rhoI
'bcf,@IlliiicpÒtÙliiLh|úje
\ÍrúI+!drioro$us(úrrestìoq.(ri
'
rr
î!F)brhiirr
'
f):
l.l
ì (d.ishyldn'nbúFnof f is
1bc 6mof tr(r) d.p:.d!.ì r urd(
PtY>rl=t
r'ìr)=Ì-erpt.
ohen.., rr) htrrk,NlhcoccLhLre s
, 41
6r)
îùc t5 ! turfbi be's*ùù'.d; a'd,.i,tr (0.5rr di\aihùrior is Enìe1s coarinr)
rhe
r|.|'
3.3 Theoretical Analysis of Statistical Significance
Karlin and Altschul (1990) have done the
r.{-k.r.
.Iis'heiìphibdor'hermi0om,ds
tEq@.ùr lp,l.liJ
(orter, = t).
E= L
P,,ir,Rt,.
I ,""r"a"= l be)otrdrhr $ÒPoo. rhisbúk).
{ed rmn l,rd, ùd I oo{ rhjsis doir
t lrr ) nd {tr I rrc sùfi.ieiLry imikr
ed { ùe 8Nd?'1
ùùhtt
o! ylnIl
3J.1
t (.:,i3 urdbùiùscof 1h.lPrxjmde
By sritr! Ì = 1 iDEluim c.o *. 3 Lsttf ptubtt itúr elIùùùg 4t kai otu r , { s ! = P ( r M> s ) = P ( z r ,> r ) È r - e ! = r = I sp(_E(s) Noèrhdexp(r)is.,rùivrùrk, !,.
qp( r4,e /51 l1j)
. By dpmdiis Equdjoi (1.7) inb r p
P ( s r =L e \ p (E ( r ! r = - l r - = - + : + - +
)=
wchlÉtgoseqkrùsrúd/oflère'l r. wc hndrìc b4 ìGaì (unslpFd) ar
3.3.1 The P value has an extreme value distribution
rr ùe bìgher$gmentprir KoE foundbycompÍie! olúo squctr€s bcs . Fón qùa'ior (3 ?),!c 3.r 'heprcbabiìiiyfq ?(J') = P(s$ > J')r r
è "= r
dp( ,(aze !3)
$$P(S_M \ge s) \approx 1 - \exp\left(-Kmn\,e^{-\lambda s}\right).$$
Using $u = (\ln Kmn)/\lambda$, this can be written as
$$P(S_M \ge s) \approx 1 - \exp\left(-e^{-\lambda(s-u)}\right),$$
(r..1)rbencewe hale rbeso punebB (I and d rli.h is similu ro EtìuaÌÌoD
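The probability above is easy to evaluate numerically. The sketch below assumes the form $P(S_M \ge s) \approx 1 - \exp(-Kmn\,e^{-\lambda s})$, where the exponent $Kmn\,e^{-\lambda s}$ is the expected number of segment pairs scoring at least $s$. The function names and the example values of `K` and `lam` are assumptions chosen for illustration only.

```python
import math


def expected_count(s, m, n, K, lam):
    """Expected number of segment pairs with score at least s."""
    return K * m * n * math.exp(-lam * s)


def p_value(s, m, n, K, lam):
    """Probability of at least one segment pair with score at least s."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-expected_count(s, m, n, K, lam))


e = expected_count(50, 100, 1000, K=0.1, lam=0.3)
p = p_value(50, 100, 1000, K=0.1, lam=0.3)
```

For small values the two quantities nearly coincide, since $1 - e^{-x} \approx x$ when $x$ is small.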
3.3.2 Theoretical analysis for database search
Analysis for ungapped local alignment is explained in Altschul et al. (1997), and the following description is based on this article.
Fora {orc s . 'bÒ, erìuc(hc expÈred *qu!n!6 wirhsco€sor d ràr 5') ii gtun ji rqùdiÒnc 5)
3.4 \hùe io honoìosousr,ìù.ncs di:Ù. NorerlDr dr , \.rue eG^ b Ìhe luruq sgmcih, bu ibrvory smrrr t {lùs úì
o,
s n\Ìssor *hci . squeiccs (indep€ndùi !ìd or shr rcoBù14rc conìpmJ wi'r rhuqmfi n , mhjptied by ùn p eìue. asùr'ingin 'rìc sùc Equ{iù (r.e) Gin N = 7_,). 'hiscqullirysÈnslohddlollulsrs Ùb 0.0r mùn úc t vi'Luebesiosroinqcca fan4 úlltr rlr ? vdtr. usd.îÙdlheo,'.,pfob.biìirylolct|t riro î.id (brcksìuid prcb.biljri*) NheDdifr.ri' sonrs naùns GM brk!Òuid prebrbìtiÌi6) tueùsd. ,rhcEr{re, r s, ido r róiultizr
sm s,, sch dìarrhè
(r! Òarbc rùNi ún úis nÒhrntc'l k disribùrior *irh ! =0.1= I (se Exercis6).) T.e nomaljzedsoEs is deroredby Dri Fron Éqú'iÒ! G.r3) we tuìd s" = In P, whùh i. ú' iofrdizn rón rcqù FÒfcfhsúi'gnaùixúdlypicdsnino rhRrorobecrr.ùhcdby F4ùrdÒr(3.rr). $d 'hoÌ v,luc roùid byEqùa'ior(3 ì3).
3.4 Probability Distributions for Gapped Alignments
lìc soijsú.,r tnùy $orc is dcvclÒlolror merppedìocir dislmnb. No prur anpuuliom|qPcfìnqkfuiglys y rhcncrhoddcsribrd È!ùìÈ..aìÒulatiig Pdl1jgnmo*ùscNb.Ibhdmlyú.y
rorany$onrs núii. îiis ii ro' (yr) p
ÍJ'
ror(crpped)BL{ST. A dnúek
Ílilgdúoms'!Ùacslrcnrly'ictrlniD rirs r.r..3ppìr Er.mi. pîrrun03o. isfibúion of rhc soú. n órdè, hd ùc rhd lill be usc!. Thisfl.eduÉ n u$n n ftí sEnsncar signi6aDc .md besiver if
hbissqÙeDce'rcnoEnANmilgúil 4 is honoìogousroa mùìmun ot fd quLry{id r *t ÒrdùdomionhÒ'nolo Pe6or (1993)ha invstiefted $venl Es (z = (Y - r)/'). sinirîdtymrcs ,qnM ùú ? 0 or r$ rlìb
tt
3.5 Assessing and Comparing Programs for Database Search
dkr. no{!!ù
ùt Fo&rì mrgr cr rir),
!)2e ?
(ned i ùÀ!4in? \qrcn( a sequ.m. r4dr qBre irn n ouhndÈou (o 4J:dlhrei5cnr !rrùì r t ,J.4.s'df
• TP(T): the number of true positive sequences.
• FP(T): the number of false positive sequences.
• FN(T): the number of false negative sequences.
aMd..rùt FP1arn{ua ù F!ú.r.r(i)l
[Figure: ranked score lists for two programs P1 and P2, with each database sequence marked H (homologous) or n (non-homologous).]
fte l nù s Nhe€FP€) = FN(î) GÈ s{tioi 3.52) Fi3uE3 rG) \hoqr ho{ FP.nd FN de
3.5.1 Sensitivity and specificity
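Sensitivity is measured by the proportion TP/(TP + FN), the fraction of the homologous sequences that are found. Specificity is often defined as TN/(TN + FP), where TN is the number of true negatives; here it is measured by TP/(TP + FP), the proportion of the reported sequences that are true positives.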
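The counts and the two ratios above can be illustrated with a small sketch. The function names, the toy scores and the threshold are assumptions made for illustration only; specificity follows the ratio TP/(TP + FP) used in this section.

```python
def classify(scores, is_homolog, threshold):
    """Count TP, FP and FN when every sequence scoring >= threshold is reported."""
    tp = fp = fn = 0
    for score, homolog in zip(scores, is_homolog):
        reported = score >= threshold
        if reported and homolog:
            tp += 1          # homologous and reported
        elif reported:
            fp += 1          # reported but not homologous
        elif homolog:
            fn += 1          # homologous but missed
    return tp, fp, fn


def sensitivity(tp, fn):
    """Proportion of the homologous sequences that are found."""
    return tp / (tp + fn) if tp + fn else 0.0


def specificity(tp, fp):
    """Proportion of the reported sequences that are homologous."""
    return tp / (tp + fp) if tp + fp else 0.0


# Hypothetical scores of eight database sequences against one query.
scores = [95, 90, 84, 80, 77, 60, 55, 40]
labels = [True, True, False, True, False, True, False, False]
tp, fp, fn = classify(scores, labels, threshold=70)
```

Raising the threshold T typically lowers FP and raises FN, so the two measures pull in opposite directions.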
ùbí. trr d ùqu\ ns î 57(?lhsrs rherr o(ùr. r rùvi f hg!rc r th) Nor.rtì
,i / n i iord{€ri,ìg
iiDdùn
'vr r ue rìì dor ro1.ikr srr i n
+r(oorrlc)kì
L(ioù!e) mdèdeFnd on rheòLùlbtd f iguu r rlh)
3.5.2 Discrimination power
a aii.[ piogfri s /6dirùúria
(of dr)r/n(d,r
/o1q ir hN ldr i d\(nni
r^ (r ) (r r tilud oumhcFois.quarc- \tuns y lJriLld !ftrruc \quoì!$ n {ed). r,Lndfsieoodjobindiúimùrù:heneaì
ru{rutri i F,gK:.r(! ùFir(i).
$her rr = r^ rdúrJ: fE qoFdig!hfl({ùic(Rocl
a^Rocancn!$ls
ii!ú
p..irì.i'
l-
=7
ii rior @vq. la) ft NÀe!d
trLÈ'oN.s
.r rEsìrrtr.
i . {pfopùÈ
EUI d 1v. *!4r prosm5
€?) s ùc ùrc$o
s i i Í q P Ls d 6 s 6 f
.rcs to r) f eqùaL'o 20. FÒf'h. splcjri!ìry. b se 'h.Jrlv rdirn? zk. wlìlch is
a \hich m dasirìed rs hondosout. Nore iùib fc dósified a bonorùgoN)Th .uidea|cmv!{ouidbulLafnliliyil]o
.Icù.ùnbqofmnhonoloeùss
dìe Roc diaenoì(r'ora dÍrbre or j b or0.0r).1xùdoc,!d r!.'l ! n'Ìi {hi!h umbcfd homorosous f+qes enmpkl ard rP, rhenù'iber or ho,no
(10ii our
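false positives. ROC_n is defined as

\[ \mathrm{ROC}_n = \frac{1}{nP} \sum_{i=1}^{n} TP_i \]

where P is the number of homologous sequences and TP_i is the number of homologous sequences ranked above the i-th false positive. Note that ROC_n is in [0, 1], with 0 as worst and 1 as best.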
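A minimal sketch of the ROC_n calculation, assuming the usual definition: for each of the first n false positives in the ranked hit list, count the homologous sequences ranked above it, then divide the sum by nP. The function name and the handling of lists with fewer than n false positives are assumptions.

```python
def roc_n(ranked_is_homolog, n, num_homologs):
    """ROC_n for a hit list ordered from best to worst score.

    ranked_is_homolog: booleans, True where the hit is homologous to the query.
    num_homologs: P, the number of homologous sequences in the database.
    """
    true_positives = 0
    false_positives = 0
    total = 0
    for is_homolog in ranked_is_homolog:
        if is_homolog:
            true_positives += 1
        else:
            false_positives += 1
            total += true_positives      # TP_i for the i-th false positive
            if false_positives == n:
                break
    # Assumption: if the list holds fewer than n false positives, the missing
    # ones are treated as ranked below every hit seen so far.
    total += (n - false_positives) * true_positives
    return total / (n * num_homologs)
```

A perfect ranking (all homologous sequences first) gives 1, and a ranking with n false positives on top gives 0.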
ù rbc urMt
turdir d P Fisro 3 f4 \hoqsrhi ùe kr (0 ..tr,o. r) urdd Lhccúrc. $ irhrrud i{tr Pr ìl Fìsùor5G) tuf
ùc nmbe or ÙÒddd3otr\(d ù. qrcfy) $quoirc\
3.5.3 Using more sequences as queries
3.6 Exercises
t-
{E !of!rFdù!.
b 'hc Roc {nc.
'.
(b) r
.h 0rrh. rhumLdqlcicc\
pmnde^ ( (he modi. ù ch{adefnnc. {ìm) mqsufe or decry confùr)j by rhr .q
r found
ind r (ùe vùiamc
ing i! q sd 3 tudÒm sL!@rce, by uug Eqùdion (3 4)
h,honoloeùsb4,wÈù.scFq Ìotrrìo'ioldgous rÒ4 suppo{ 'hr .bc
G) Ld '.ek $bcto cr.h or rbÒprcs.J P': !!Fn!nsn I nnnHns....
Fild rhevihe a = FP(7)- FN(î) ro r$ù {iù whd Nis n{nd in (a)
]\lhaì\i['n.vùcdj'lR'ld.t
4
(r'solr59)ùìdas.qr.ì!ci,,swn\prl
if rsi'iyirr,Àp(.iLorr rlr(;4siiurr on,hch{toúd dit c[óo\clppf)p Dr!ù$ !hi!h progfn twil|] rricrN$n p{r]d. t )'ouretud roberhr b$r
riqùrior(r.1)ii suhsLdòn I 1.: tReùùúq dúrù(!') =r ) r r(J') = r qo{ r4re
P}
rr ). Thh
{ = (n(,(uN)),// coNiLlùLrr(RrLlqD!d rnF4L!ùon c.r rr. slìN Lh! P(r ) \ | úr$ hr! r mrl izederkù rruc di!ù rrùiontr = u 4u
3.7 Bibliographic notes
,\xsììnì(r99i).ìl rdjlr$ roÈifrd
(nrir.0d akrdrurfreeo)trd (rú 0d bna r{hùl errl (ree6,ree+)^iny's
conùsda (re33)(rfiisolrlr3).ik5dìùrúa (ree6.lnr).r!,6on(j996.19931 i \r{ rrrri6e(r99t
Nebk lnd Bldor (1001Ieri,ù,tuI ql!r$ tu siùd
ri ir ro.aeo ú j (:000rr) îd r_indil údFonsoì(:1000)useorRoc!úr$n$or!ii|cribrho!d,rRol,i$o!(ree6)
4
Multiple Global Alignment and Phylogenetic Trees
1ru|'iplealigrn@'iltmtÌmlùrci5
o rir r whote rrdiy (n h5 bd srid rhd iùtrir e aìisnftùrhoúrùùdty). ri odÉr ùdJ rvo (oi r jc\9 d Lhsmir rhe xi rnry .rcc
(or supedùir,
I. ùis Na)l 'lF arlrsi'
nid Do?lF ard ptu /ó dc\d
4.1 Dynamic Programming
Is&dbmuìúpl.ljg'ìÌjÙ'l.PÍod
ú a nùìriptc s.,lùe!d djsinqr
qiuN, lyps k (fe'ne'nbe.'hf ùiuns
vnh ùdy bbiks r€ forbiddsl
hd! an bctn (l d'Ilùer lrs onLqus!ò gnenby Equinn (4 r ) ilì. ùùnìbùorcdis
riisr€ {.,
{n ùs!rc ihÈrfúg 'h{ ricono, h (b)
rqucrus (o(,,ii) roreqml scqncc lcr-ellN,r, ùid ior phúid \orudoisor such . T'IÌotdùeÌhe rumiu rimebyNirÌgpruniisr.hîiqùs (.ù!oft vrúchf ì
aid rhe be{ (tr coreo sohúioris n
4.1.1 SP score of multiple alignments
e:Ùenr.i|ldsi'nplysnlhembobhi|'he
iNen.d itr n. L.'s(i',ir)
bc ùepai
ise
(J, ;r ) * ùc so€ or3 rlfrubre Pri$ù. hrmlsìnrbeprct-'id re*ha€m rf *c tr\Ltiie{ lap cosb,rhcsp foa crn at$ b. calcuhreds a sùn .f (orum
ìr dì3nmm'.ii is'r. r,rr symrroì of r lrd ^
1.1.2
(riur 3!p |€mll],). úd (memher 0 ùsirgEq@'ioi(4.2)is (crlcurr'cdDs-Nìs) 0 + ( r)+( l)=-2,aMbyurirs r)= _. E u a ' i o n { 1 j )n i r ( ! o r ! m i * i s e )( 1 ) + ( r ) + 3 + ( , 1 ) + ( Nù'hdrofigivenscolilg{htme'l
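The sum-of-pairs (SP) score can be sketched as follows. The scoring scheme (match 1, mismatch -1, blank against residue -1, blank against blank 0) and the function names are assumptions for illustration. With such a linear gap cost, summing column by column and summing over the pairwise projections give the same value.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of two aligned symbols; '-' denotes a blank."""
    if a == '-' and b == '-':
        return 0
    if a == '-' or b == '-':
        return gap
    return match if a == b else mismatch


def sp_score_by_columns(alignment, score=pair_score):
    """Sum, over all columns, of the scores of all symbol pairs in the column."""
    return sum(score(a, b)
               for column in zip(*alignment)
               for a, b in combinations(column, 2))


def sp_score_by_pairs(alignment, score=pair_score):
    """Sum, over all pairs of rows, of the score of their pairwise projection."""
    return sum(sum(score(a, b) for a, b in zip(row1, row2))
               for row1, row2 in combinations(alignment, 2))


alignment = ["AC-T",
             "A-GT",
             "ACGT"]
```

For the toy alignment the four columns score 3, -1, -1 and 3.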
Thisfolss ftún dd ri[r rhii J(r . Jr) i\ úr hishsnsùe rhieúbrc byalrgoinc
rk sùe or 'rreFrjsdiom or 'be 'yo iiÀl *quses ìn
vÈishd 4@ù! rheE m biotdgiùr (
elìtredii &ùadon (4 n a[ lcluercs
ùe
rulf$du[ÚoLdbeÚmtdb'gìYù3ùehigher!cj!lùÙùtùd'nohcd
4.1.2 A pruning algorithm for the DP solution
d úc,, ;eluenes (r r is ùf knNr i u d 6 ' ì d i 8 s Ì r u )c o n s i d ù r . eùr rÉ ( n . i l . , , r o r ù e D F ' ú r i x . l r d l d ù d $orcofrhebsr p.rhGLisinentiÌo'nthefú !Òrb ldr , bss. (sccFieùf4.:rG). < F îcn e ! kio! rtu I )e$orc or L mùf be< ú + .1 ú: 'heEfore,sr + ,., < ( , wckno{ I
K ùJ 4rR rlc ccttI = (r,I l){ìtlrher dÉ hjeh.r súto nÌ!isnns aRe, as. R (sG.Ì ). wc rho hreio ddùmiie rì ler uppùbdid (iì , ,) for rheirieiùh
t 4'!i7 r.usior j\ ùsd iirúd ÒfD,lrlad
j
'l G) r4ue 4.3 | shrhs rrìÈfo*d prudDs (b) îE s@N!
(b) d!
I i. Jr aa ri
eìls Gs in bek*ùd dtr$iÒo). a vrl D(r, ú). rher úc uùe r, + r(r, D) h sc b u: rhc$orcor drebsr pÍh rÒu
by sc ol a queue. wìÈi I .Òllis vnncd.ir pllcesib rùsùd ncighbou6(o phich i' shdùrdind ujuet in dÉ qùnc, n! oc
j\nji! cùnft. i:) ìh rofld deiehb.ù!.d! s h o u l d b e p ù s h . d i ' ì r h c o d s ( irL) . (i :a++ r , i r l . ( , + r , i r + D . Algodúf 4 r $ovs rìretoruird f.ù\iùr
with pruning.

Algorithm 4.1. Forward recursion with pruning
ar :iconún rordoirg der muìripteati Fonùd lmaion is ùsd, {idì poDìtrsor.è[s t0'herúcrror'rrcDPnarú(rq0, .0) ù, rhefld cenor rrc DP mút {ar,r 4 . ,i,) rhelhole alism.nr s(r) ?(!) D(!.,) 0
rheber slm of m atìsmetr' (pú) frcm/lo .o ! rhcsÒrcorrh.b6tirisnnqrrmnÀob!rÒu soiT 'he$oE ror dbdìns thealiEE a sbck of ùe ceìlsI for whi.h a yaìuero p(ú) rsfoù'd
Flr, /ix) ÀpM.duÉ rhÈh frndsm ùppú bÒud ol rhesG ofúè rlisntur tom r ceuI rorh. èid-.ètr, fl r = /'oì P(,) := 0i push(', 0) pushdresúÉ cf onrh. quce p o p r , . o rJ:r , ) : - P , i ,
h6sorr .J,-..!r
irr(r) +.(ù.À,) > ,( .hen
6úis.@L!
tor auhNod neìEhbÒú, u ofr ànù ú. 4ht otd.l Push(u, O):P(ù),= s(ù)+ D(u,ù) P(ú) i- nd(P(,), s(,) +D(,.u)
Finding upper limit for scores
For any alignment
,{ or sqùdcs f,1, l, . . . , r'1, rheEqùúios (dr) 3i/ (a.a)c
s("{) < I I rr}.rr). ldùheúser (n.i:,...,t).îÉir ro.rherLisnmón!o.rh.suhÈquenc*r1+L ",' "i+1,..
.. rl-',..
nrs mb.
Jotrcby ùsinsEqud.n (46) $
.-I I ''r,,,,,1,., "," r f{ù!d ù rimeo (,1, úeÙ d.iors f mprcrity fof6ndiiEr is o(rr,1.
sw andaRlR. , = (3, 2, 2), aid I soirg
i.vjsgali'!ú]enÌsorcs
s r " l . ., . " í * ,
",r.
r = 0 . . . t r r - 1 .i , = 0 . . . a - r no ! ,ts.,. :. .'
u.i id \c,o, .-
r)
conpL*jty ol rhendbod ror lindiner lLppùbouùd\is ú.rîoft oo:n1 spondiEronoviig lod ù rou (DG, u)) n ùbùraredby úe vùih' Éq@'in (4.3)or rhesP soF. Nor rhf rhctumbcror
rbodJd uscd,aÍd mrnyof rhemus (rcush efimaÈ o0 phyroseúic (orflorùrioisJr) b! riir heh wnhdìeaìignins.
4.2 Multiple Alignments and Phylogenetic Trees
rte lè*as {lemì!!Lnqret,
and rhe 'nEnor
ros. sù.h I r4 tr ciììed J phylolonúic (r .vorùúùrfy) ùco.ind sfiuy '11iib (bmnctr$) o$Lnr sqùems, $d ùe ed,ees e tn i rK cùmrurL\i f.om prcrein{of rJN^)
lioiÒlamuìlipklìj!Ù[4sd\N}nsJ
co^ldsr s'orseqùeocsl^Rl_,ARrr.aRs'.ansl.awrl,a\yr] ùù4rlEl
oùùrL\
b $irrudattud) prryk,!
o|Ío'nR'ov,h6ccuftdin'h.pa rdúryhrùn î nùùriotrfftn s bî
e^ trro 4ueNes. one*qkNcJd
rGd)
cr)arLcmur, ofbds.ùr 1{o Gul\d
whor Dcihq u fru, rdirìe ililmÉnr.or i (lirc) ltìrogqrrd! a!u i! tN\( (L r ph![]genà ! re. (plr rr! orh{ mdhod1 tva) hy dhq mdhùL clmhùùg rh
.Ùlp[]ro:uldi!to$ud0d'ielcJ]sF rerhod!lo1ÓNndiigplìylogÙtù.ft4
.LL ri$ ld$cu, úrn trhorrl'3r), rq, u'!$kr0lnsrhd
cdg$ u Lhc'(o
ln ùyìosi€ric rudies.'heoL (6 tur uú,torLn h ou($! úr {,bjLd\!
ud
a prryroa{eri. rÈ cdsrudd ùyòc Eisù@cjohiq Dr'hoduis ù N ù (ùr sùc 4\ red ro.Fie!€ LD. rhe Nnen ioss ù. Ésh orsirg bobhPpùg Gq .ieEr.5
ows,cìvúDdgìú|sqEo$'qlimde diphy|ocsdi.hc,wÒÓNidùoly . Theùe ha r l.trimr nÒdes oeavct. otrcfof flch ùiein.ì squence.
dnediotris dÉidcd.(rnùrf@Èd ei
the direction is undecided.)
have two children; the internal nodes of an unrooted tree have three connected edges (branches). ! Ai ircdd ioò.ù
('nsomc@t
r+
r+ ofPddeÈùFgurc4'5
nús bM rwÒiDstin sqq. lrìis fte rùùefor eivesùr rlc Òepúniry roìlludraé rbeconcephor ,n ,r,s hd rdrdr4. dÙpucaliù'í wejitiptì
lh. fuìl
ìes,bÙlpmlÙgsirúcy{edEnvennlmgeoe FigE4j s tbeevoluliotr shNi in FictrE'1.6,
• INS1(Mouse) and INS2(Mouse) are paralogs.
• INS1(Mouse) and INS1(Rat) are orthologs.
• INS1(Mouse) and INS2(Rat) are only homologs.
• INS1(Mouse) and INS(Chimp) are orthologs.
4.3.2 l
speiès.í rhemw.ùpy f cÒpìde normrive(tudibed) rheyft calìedpr.trdd listlÙÒkrtúcdtrdbàofdìfi@íftebpoìogies.
4.3.1 The number of different tree topologies
An unrooted tree (of the type we consider) has m - 2 internal nodes, and a rooted tree has m - 1. The number of unrooted topologies for m >= 3 distinct sequences is

\[ T_{\mathrm{unrooted}}(m) = \prod_{i=3}^{m} (2i - 5) \]

For example, T_unrooted(10) = 2 027 025. So, even for quite small m, it would be a large number if all possible topologies had to be examined.
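The growth in the number of unrooted topologies is easy to check numerically. This is a small sketch assuming the product formula T(m) = (2·3 - 5)(2·4 - 5)···(2m - 5) for m >= 3; the function name is hypothetical.

```python
def unrooted_topologies(m):
    """Number of unrooted tree topologies for m >= 3 distinct sequences."""
    if m < 3:
        raise ValueError("the formula is defined for m >= 3")
    count = 1
    for i in range(3, m + 1):
        count *= 2 * i - 5
    return count


# Number of topologies for a few small values of m.
table = {m: unrooted_topologies(m) for m in (3, 4, 5, 10)}
```

Each added sequence can be attached to any of the 2i - 5 edges of the tree built so far, which is why the count grows so quickly.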
4,3,3 )
"---i A :-;
:A
/LA/L^^ Plyiry 7i,ihù(!) hy h
r.rcsùù8ù
4.3.2 Molecular clock theory
frúdio crnb€sriiùrd (sedu 9)m ù60mdr r 42 oúfioi\ rrrEhNú
4.3.3 Additive and ultrametric trees
Fiqc{.3
GrÀ!ldd
k,eca^!
addrqùc
È edg* .ÒD!4'hg
rhe nods. Flgùe 4.31i)
ded iion rrì! dirù..rs io Fisùc 4 3lb) rrc
o\ùrìir!
rk.qr!!)irlidodyir
ir{y6uror'lÈnwecarhbel'tunr.
j.r.
4i.4
wcchr!b!r rhurods!\rnFis!rc49(orndúl squenesshoMi.Fgùrc4.7(Ì) elrr5eúdEqÉfùe.r0)núbe (srìd$ing úúÈqú rú (4 ú) inpf$ {k|niq h beyoùd dìe$opeof rhisb.ùr )
G$unii! nrcù!\I)ac)rf dorrr-itrorerùyrnpbi.j r
j
rieoi i.e
(ù F+r. Íù idùe
'hr iddiiny d tu inkrc$ d toù objrds (br rÈ dtueobi4r fre ùùudd! 4qùiPnd, tr
Fisùrca.eo) iìrunr{* rhisrEqu ior (r.Il) is $úsfiedror aI rlÈi,ives or the ry rhatEqùÍion (4 11)impÙ* rhd ìÌ h Eùa'ior (a.l l) inìpliestr!ùrtion (1.r0),hei.e ùlknÈúi.ny iúprì* xddidvlr (*
4.3.4 Different approaches for reconstructing phylogenetic trees
r,mlÙmolinullipkarig'ùaÍ'Ù i:ì<Èd (or iu) or 'be olùms are ùre
rùc$quoB
(Th! hc h Figqc 4
squencs.) lrre arigmeú h 6^r iispúd b 0ndiur'uliì? .,1 ,m. a cdùm PÒhgiesNqlheoùds'weiusht'hilby m .rùnprc. sùpposcrhf we hde tou
{3,5
FigE4L0we$.'hdo|UnN:rmd5f 3, s md 7 'o d4ide rhehdt wiib rbe rùs (l +2+ D subrirutions ìnrhosdÌee .ohìmns, ree I h.s (2+ 2+2), úd frc ìù 6ast2 + I + 2).T.ùs@er h chosn
whn úe rumh€rol squenes Eori rhcrcac irc
pd$iblc h6, àrd ùè .aì-
lish6r pÒbabilnytu beîg coftd (bú( $n5 ùe onúii
sqmrc$
is .1ìù$n
-D. Tten Pdt4 h rncpúbrbirny ù{ rhc e nfcnúlnúd.rùo kio\n (j,r.:. f). The r! (r' ) P!(lt Pr,('r) Pj: (r, P:.(r1+ rr) t\,16 pún) p@an
:D Prcbrrririls (orìiÌelihoods) rorc!!
4.3.5 Distance-based construction
fljns{Ìjol1'9.trbyÙsìlgEqcldqrced
:quene. ùd rtÒì rd. rsd rodcrnr. , :!.
ro.1' 'hÈùlhq !od* (r) ae mìc
Fie!rt1'r1FilWiÙ5hú4úÈ3p!pi
cms wiìì o(ú
| = th. yt òtut
(*c rhsexanple belo!). An
t4 k..tt tlore Nrt) Jot eattt ÒnsMt rqr?ne.
tu.r):= ù, eús aJtuanè: nt u f ub'rhÈ theknltthqfthe cdq.:lt t)e"tt",u) rof qch nd ' ùJth?Ì6 it u ll.ttdo DA, r) :- utatkte ttp htu . bdrt.ù \ .t
u : = ( u - { J !' } ) ù t ù ' l
Ìo i roo' I ù úoùs
({b)rre is .dù
etrihe mw Nde u ({ùh .hirdcì ! 4d ,
Ftùr 4.13 (i) De d^kfs
ÈNer rì
jd ir ùe 16r G) E n grclFd vii
(c.D). r.) n. dcÈmÈci òe oEddd
tu ($c
ffi8rr.d PcM^ (UPcMA) D . . ,= j ( D , . , + D " . , ) . eighEd'TlìnnúeIEsn''oical|jicdns Ldx
rr4r'd
PaMA $TaMA)
, rhcrDr i' tuc rÒrèwy Gùb)fte wrh m' 'ftenrhelcnsd,of ùù.ds.s(r,ù)ùd(r.ú)cùersiìybecrìcuìaÈd btr, bc . :-r ofùe rÈ wirhfmr ,. ÎÈ hrllh ofrheedgc(', ùl ú ú,ùn
,,." = i,^, -,!,.. ivenin Figue a.l]G). :.:dces Ircn Ìhe newnodc(a,B) ro .-{inRsúr1 r3(b).coÍiNjngù' :i ù. dhtucs bervq rhc*qucnÈ. in Fi!ùr 4 r3(Ò).
stue rheodsin,r dnbnes (Fiem ,1t:ltr) m ror utFrmdnc,Ìhe crtcùbÈrl dh's* jr ú. .Òisdrcd re (FieùreI r3G) doìaÈ ú!m dreodgjnirdkhics
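The clustering procedure can be sketched in a few lines. This is a minimal illustration of UPGMA under its standard definition (the distance from a merged cluster to any other cluster is the size-weighted average of the two old distances, and a new node is placed at half the distance between the clusters it joins). The function name, the data layout and the toy distances are assumptions.

```python
def upgma(names, dist):
    """Cluster `names` by UPGMA; dist[a][b] is the distance between a and b.

    Returns the tree as nested tuples and the height of each new internal node.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (subtree, size)
    d = {(i, j): dist[names[i]][names[j]]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    heights = []
    while len(clusters) > 1:
        (u, v), d_uv = min(d.items(), key=lambda item: item[1])  # closest pair
        (tree_u, n_u), (tree_v, n_v) = clusters.pop(u), clusters.pop(v)
        heights.append(d_uv / 2)
        merged = {}
        for z in clusters:
            d_uz = d[(min(u, z), max(u, z))]
            d_vz = d[(min(v, z), max(v, z))]
            # Size-weighted average: every original sequence counts equally.
            merged[(z, next_id)] = (n_u * d_uz + n_v * d_vz) / (n_u + n_v)
        d = {pair: x for pair, x in d.items() if u not in pair and v not in pair}
        d.update(merged)
        clusters[next_id] = ((tree_u, tree_v), n_u + n_v)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree, heights


# Hypothetical ultrametric distances between four sequences.
names = ["A", "B", "C", "D"]
pairs = {("A", "B"): 2, ("C", "D"): 4, ("A", "C"): 6,
         ("A", "D"): 6, ("B", "C"): 6, ("B", "D"): 6}
dist = {name: {} for name in names}
for (a, b), x in pairs.items():
    dist[a][b] = dist[b][a] = x
tree, heights = upgma(names, dist)
```

Because the toy distances are ultrametric, the distances in the constructed tree reproduce them exactly; with non-ultrametric input they would deviate.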
The neighbour-joining method
jÒiniic(N,r4hoddo$noÌ.suermsa rreneighboù
cvorurìonrynrerb
mes vilh rhcdrinlm 'unbú ofedses (r fù ùco. ftcn rhù@ is sucsn€ly cbùgedby in!6ing by I rbenùner ol FìsùÉ4.lj(r) Nev inremalDodse rtÉi sm$ivdy
curcd. md rhedesEeot
kr ùs.dr rhÈ(sdblr$ cootr(cd ro x for m!s. A ,giúr,!r/.r is a panot mur. li Figm 4 r5(b)((a B).c) h a nejehbor p!i. (a-R).(ÈF, in (c),erc.rn , rld I icw iod. .carc! (D, wnh edes ro x ùd dch or ,!c oTU olhe neighb retr8n\.Tris úeús rrrd ir is d n@ssiìy rh. plirunn ùè ìe*i nuruil dishn€
aJ.6
:i +--
-tr
r4uE4.rs rùrÈ'(r !r1heNJ'ioh.d G) s n * q n m q ! e e i ! { . s i ! j l Ù h c d
i ùÒ n^r cydrcrrrcr ùc 4 (zi
i )/2 póribrc !hor6 ln' rh. icitrrbouf prfs ro
r!.'deiÌheEm (,, -i+ l)('r -,)/ :]Nrg'h.oligiruìdnhrcs'mdth f,din saìrouind Nei (1e37).
riúprr mdhodrir cÒirru{ìng r $òod .rcc ]Ú6.ddle'!ìrdpoin'fìdnsftró '- in FisùEa!(a). i morca beinsen ghrxndleft sbrÈs. Figurca.l5(d) sh*s n FisuE4 r5G)a moris pùced
i-i.
1.1 |
4.3J
[email protected] sstrr howfobùr tù *able) rhl rcsuì l0r0) of !*ùdo atignnenGis smÌù
ir is tor.ach sùbú (ìnèmrt ndlÒ)in I rh. sam.!ùbr€ ocùs (cooEinjngdì. !r6s rÌ oft4vct. rtrjsnumhùn q\wqNu ne, asihc rimprerÉeri! Fisufe4.r6 show r, ùe sr of k!v6 (hùe prcrìlr) or
(b)
G) rigufr4''7^ji!ìù90'IELqoùd'ìprc
4.4 Progressive Alignment
ùly lcps .m.' bc .oftc":d larr Gs rlÈy rd ii roch.s'ic mdhol, wronrry irclLded erps rc no' rcnorcd ( orcc a sp, rì\nys r llp ).lsone mdhd\ rry r, ri-qrtd re onoled ron úe aììsmeÍ. ad idded bro
r*hniqùs iÌ.m dNcnf,s ù phnos
P.forning xlismcrt lNirh !r$onablc Konng Khcmo in dili.Èd ordc* {il
,."1.(,r.i)
, ,,"r) . . ( r . . r ) . ( , r , 1 ( , r , J 1G
tr4Er h cvdlùtiorry tint
ft :rigr
T']ìOCRESS'VE AIICNMENI @c65 if 6erc *isB (or ún becon{Ìùrd) a (phytolcreri.)R vhi.h cadsuìdc rsosrsxìv.arigincnrDitreEnrnerbods dtrúÒbimp]cnhddol)?ldingmh d ro u* ro. .hóosingùc rìisnmena.*e cd
Progressive alignment of the sequences {s^1, s^2, ..., s^m}
C := ∅
for i := 1 to m do C := C ∪ {s^i} end
while |C| > 1 do
  choose two alignments A_p, A_q from C
  C := C - {A_p, A_q}
  A_r := align(A_p, A_q); C := C ∪ {A_r}
end
ìydie|elì^ÒfshúaÙ3nftob'veiB'
4.4.1 Aligning two subset alignments
vi'bÌhesqÙrcs{r,...,r})ed{x,r.....r, ).{hecsèquèrcÈrJ!è5}oEr ù . lR€arr 'hd ii is rr wiú rheblbks ù$,rd in in iìiem.i'.) 1|ignfr6t\:ewkkalismwúú,]pÙ;r34i1cdal@Ùfd'lnsmpìeÈ3|iginolL sqrcrcr ofhetu s!b$r arisnnÈnrm lrc súf bcMci úr corumrs(i , is sbom by &urion (4.l2), qhe€ n h ùe so€ hNed rwosyEbù. at is'heqmbol (ei'rùà'iiiÒ Èid ofbhn*) lor *rllcre P s o e ,t ( )nBthl4ro:
I belrùeqùd. r rùunEquar andbknk
i(r - r ) + (
ù cq mùe difiú
r + 0 )+ ( - r r ) + ( t ,
"= i.
Biolosirs orbr úy ssverl sapÉetrìs bdorc rhcygd an
rddÈ nnJ di!îìngisdotretonùvìry ùsitisnn.ir
!5'hE'lisnmqÌ othii.d iflhe úd squctrr lfomb.rh Gùbsortìgnmfls ra usd
by 'hc dùrtq
rhcrc$trs srúsry @FU
cd |Icedm. No cubstdisnneft nDr
rÌDrúno.qr!nù!.
hú (Pmbablr,) L
quq!6 (oE rom údì itignìùúl i5 [€her. m rhosn nx !]i3mìc (Tlri5qrre5Dorti\D i Nir ro.he pc\1a 'ne'hod rùr lq\rudins '(L ) rr se quaLioìI j. (r^ro!d d 'hc ùlhmdjc nÈrn r d*cdb.r r(rr +rr +. +r,)/,r. oN.tr. au úu \qúE mein
' n è x r ( , t ( r ^+r r / i r i Îhe rùiìaù
,lù*k)
. r/!,J))
Lùb|c ùdhù
3h ù( n\î slrltrur.\ rubvri,gnnen') iÍe si,ìirù 0r5!hi3h 5q,e) Îhe ùùin@tqùvhie) rù,tdy )n4ro
(ù! fnn cub
$b baven!n. aÌsmù6: Àr = Iri , 11, "{: = Gr. i). ,{r = lr1, Niu, p!ry\o
r ì c { m , s b q q e n ú r " . ' B ' n ó ' a f o ' 1 . ó. .' n ' . . , 1 r n r !n - h o dm À r o l 0 " ,
S(A1, A2) = (7 + 5 + 6 + 4)/4 = 5.5
S(A1, A2) = max(7, 5, 6, 4) = 7
S(A1, A2) = min(7, 5, 6, 4) = 4
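The three ways of scoring a pair of subset alignments from the pairwise sequence scores can be sketched as below. The function and method names are assumptions; the four pairwise scores 7, 5, 6 and 4 are those of the example above.

```python
from statistics import mean


def alignment_pair_score(pairwise_scores, method="average"):
    """Score between two subset alignments, from the scores of all pairs of
    sequences taken one from each alignment."""
    if method == "average":
        return mean(pairwise_scores)
    if method == "maximum":
        return max(pairwise_scores)
    if method == "minimum":
        return min(pairwise_scores)
    raise ValueError(f"unknown method: {method}")


scores = [7, 5, 6, 4]
results = {m: alignment_pair_score(scores, m)
           for m in ("average", "maximum", "minimum")}
```

The choice of method changes the order in which subset alignments are joined, and so can change the final multiple alignment.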
Thèbe$ÙelhodispbbablybseNmeesmsfoldecidjne*lììcha|ienmÍs
r! (,{.,) mqn s(n, ,). and'heKoEsitrd€Hsiog orderc (,4,D). (r, D).
r.1.r), (/1,c),(r, c), (c,D).(J.r), (r.J), (?.(), (., r), (D.É),(F,c). t D r . ) , t At,, r E , c ) , | c ,] 1 ) . t 4F, ) , . . . . n Esùli in drete ìDFìsure4.r 3(4
'hc ùr &d rar q$ic6 m Nd rÒr q.le'P,ji8Ùidedllisinslis'henp€f'o rnlus ún alsobe$e4 soúr ùe du vducrlish'ly lc\s'nùùesore(t. 6).rwo
znA 1r ourn! D íL$
iidrqr (b)a +q!
tl n.c D)1rlL.t:.G.L,t r.K).t shomi,ìFism,1.13(b) fte ofd{ orrhedurùin! is (..(!
D) ')
lod (r (c (a H).t/,(J.rr)J
NoÈdn(r, r) dd (r. /) !a Dorus
lÉ^b€Ndlnrissimpìi.i'yFdlhebJsi!
.!qú!i!.r lr'. 1'.
.hu^, ^v !!q@w\ nr" ùar lhritÌr) 6. n trú u
r = a r i e i G . r r ,= u dW^| a!quù.!
IJ'l
eU: u:= U - 1:)
.r'l
4.4.3 Sequence weights
rd o.sìnpres (ilrúes
otrlr d!$)
tu 6d rorkÈ d..iri{s ir a N! obj
ejsrÌdìidfuo!(kÎosî'hd!etnopd[
Lnoú úsc ùÌ objeh xrc seqùÈ0!rs.dì
ft !sir! diflùùE 'o dc n,or('hedi{!n!! h rhc.middle*queF)! hoùfbood(sinìitù kqtrnces).
(Nnìbd of ,iúaront
I
! ir hc rhLrîrLre(ilnìbelof ùùhrio$) oi edseórr
È sùbr* \Íh rcor in rhe noderr (./i, il
"iro' (\rrB)
o. ,ne edge\ftur d,o rlL u L
Ii FierÌe4.r9(b)Ncjertrs !ro !i!di!
4.4.4 CLUSTAL
CLUSTAL (Thompson et al. 1994b) I seps(rvhj!hcú rreGpeareo r ciìctrrdcdr (sùrioPì^!i{ snih
r c n g rt ho 2 N . i . d ì e , G g | n o d i s o n a ì s erúd (cs rro dùlo.aìsonú.h \id9, a'd dynlor!pruàiDing is p.dbned
Lh!5mdrc{d 'hcsquenesG fore o
ru ú Fieuc115 Ìh. oílr.!ì l1.,).c)'(E.D.(D'(.ll),((l',).c).(,.(''.,)
bÈ(!.r)
ieh'sdÈo|ahtd6sphiftdi|]se'
ni! mùicesis ùsd GÈ s{ridD5.r) 0,n .$.cs P Nrorcfmr ) Formdì digùùùn iaÍce. (Th. $drs rfnG e ru*trned b rcnNs \?hEsorry.)Eqùdro! (1 12)n Ned rof lìndi1ìcth. s.orc! b.rsrùì rh!
. r$coP(grpùPcnnsFmtr!).yhi. t Lht GEP (!rr ùremior p:ùdr,
r orLúns!lhcFLdiì\.oroNù
r. adi'N' nr muLripresùbfnùriortr (,iùlilios ). ri@ rhenunhù of múúior tr lko sùdiu I q) ( rl,c PAMErn!$ ]nd LhLBl-osul\{ ndn$lrkelì sjntol.oÙil,)
4.5 Other Approaches
icrivesncr)slùtr|'l $N tutur \ùhaLsùtur
or ùblcdir. tukrioN r,f rh$o ùdh,i\
'
4.6 Exercises
a$ci,l'!F,rudiú (l.t hr ro" for
coilid.r clr r = (?.r. r) ùìd frtrdù endceìI,by usinsPAM250úhbìes 3) choor ù upmpiirc rrlr rù' dic r. Dn{ iÌì umored ftiffry) phjìose
Shw rh.r by uing U?GMArd cGlMbe
evalniiù"ry @,.ùh
orisiml
'IYyrosprrin (intufmn, rhd n ìnllmetic @ is 'ddiri€ Giutr .3yMicrl
dsb6
mrbt:
(t snd.h!!.ù.É *i!a h uhffiic Sryp.e w. m cìh 6vc*qùd@s rr = MD, s2 = Àccm, rr = c4q rcdi,g sysen hre ! rrneùgap son
tn rhismy rhcltlm 6 beEsÀde! s ditues bN.5 ihè$qrn4 ùd .o€r for rbobd D'isis di!ìrlmts
r!5673 rhsè 3@È. G) Ior..ch piirshw m dismrùrsiB (b) M,r. Àlsric) cùid. É bu€nonld* disbes by úing ùè upcMA plreduÉrùft esv.nr'r.cúà!iE! ld joinif g(.{u'i !.4), choos G) Mrkéom iplé,rismrbl!.donrhe@in(b).usepaiNisesùid€d rrisrert uins thés.qùffi wirh l4t dyr4i. d&hcc a cuidé. h .ù. (dyMic) éIc@rt, bìank en c ri) tu tunr}andDs ii ú. ..4u.ú f nure4 Nsms0:
(ij) onebhnk qis'irs in oe or 'he seqù (iij) oneqù'iis blankn illled ro in (iv) a ne{ bìankÀ aìisnedrom amino (rry to findrheber aìisnmenhwlhour ùsinedymjc pberilming.) ca|$hbIhgsf$Ùeoflhenuìlipledjgrefu' (d) i{rkc andisnmr usingùc minimumGompkl linrasemdhod.use joininsyoucmue rh paiNisesuided
a, E, c *h.i 'nÒndhod in sùs.tión 44 3 ìJ u!r. sone5€!l@$ (ffomsvisrÈ.t rhf youvnl aììsn.
cÚÙdeÀiewphylocÙe'r.hnÒd
angi'g'hcaoPb|.conpeit
4.7 Bibliographic notes
seeedhgÙelhedylaiú.PfogÌllmi4
(1e33), (ree3). upmm.t3i.(r939). (2000) GErìcrd cupbd,r. 0sss),Rèiuderat. Dd.mì!ì!s equmexeiÈB is aebjn d ii At6.hurèrrr. 0ete) udrroùeson er3l. (1994d.vìnssn andsib,brìd(r99r) compft dififfilneùods for {eishring ft. MUTaALpfoEm is deqib.d in Trytú 0990). À neù\odbd on ùù_ hins hn ,lins is pÉ.Í.d ir Kin andPmùjr 0e94),ore 6ins hiddenMùkov nodelsinEddyogss),8ndm uirg serencdsoíhms in Nor.n tu rnd Hissì6 (1996)úd Noftdme d d. (2000.r 993).A heùod b8s.donrheFoùir dqrÒ,T i! p@iÈd ii rúbh.raì. (2c{2).Theu 0996).AÌn!ùs èr31.(2002)dscnbe I nedrodùsinspolyh.nril @mbinrdi6, 'nd le d al. (2002)dÀcnh. Àm.6od usirgpdiat odtr snpbs. of pss1:tr fór fùrripre aùpnúr is giEn ii rnomesord rt. oeeeb).rr@ h a benchffik dd,b4è in Tnoúpcord il. O999a),snd(lrprN,nd Hu emO daùib€ ompaisnolnxiehsljcimúcúdújc a5ssneir of sbrùricrrsicnifrcùc. is pleún ir srdrcp andcdsbin efiB). A goodbooklor.brudon ry tu is u 099?),*hidh hs mny EGru!6.
5
Scoring Matrices
l], bd*c!
rhe query ind eacdor dìc
ùùe;inìafiryof 'lÈrcri,ruqocumngin'Le*qùerùs Fors!fesidms (4',4),
]]ds.lrìismÚsÙrccrnùc!bcliver rnhdofamimxcidr{20) r ( seien trinccs for pins or tunìiù Ìciù (400 Ì 400)
:rr of rre soa
cor bcrwrú (,. D)mjghÌ be lrrsú dm rhÈ tu (q, .J dod ( , à) (or sDrxtr ú$ úen diflance) s ilso u*d rìr a sconis mfdx we {i[,
ù a etuin Gvorùriotrùl) 1j,hd a $h
ilìdBLosUMRd.Úqtìolherl]psol
5.1 Scoring Matrices Based on Physio-Chemical Properties
Use of identity
a sinpìemdhodn b
Use of the genetic code
The score is based on the number of nucleotide changes needed for changing an amino acid into another (1, 2 or 3; 0 for equal), for example for transforming Phe (codes UUU, UUC) to Asn (codes AAU, AAC).
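The genetic-code score can be computed as the smallest number of nucleotide differences between any codon of one amino acid and any codon of the other. The sketch below uses only a small excerpt of the standard codon table, and the names are assumptions for illustration.

```python
# Excerpt of the standard genetic code (RNA codons); not the full table.
CODONS = {
    "Phe": ["UUU", "UUC"],
    "Tyr": ["UAU", "UAC"],
    "Asn": ["AAU", "AAC"],
    "Lys": ["AAA", "AAG"],
    "Met": ["AUG"],
}


def codon_distance(codon1, codon2):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon1, codon2))


def genetic_code_distance(aa1, aa2, codons=CODONS):
    """Minimum number of nucleotide changes turning a codon of aa1 into a codon of aa2."""
    return min(codon_distance(c1, c2)
               for c1 in codons[aa1]
               for c2 in codons[aa2])


phe_to_asn = genetic_code_distance("Phe", "Asn")
```

For Phe and Asn the closest codon pairs (for example UUU and AAU) differ in the first two positions, so two changes are needed.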
Use of classification of amino acids
Scoring matrices can be based more directly on the physio-chemical properties, such as hydrophobicity, polarity, charge, aromaticity, aliphaticity, size, H-bond donor, etc. Smith and Smith (1990) proposed the classification AACH (amino acid class hierarchy), shown in Figure 5.1. Taylor (1986) made a Venn diagram
(cH) ùt sidrú prctsds b s,hÈ riiÈ ùe id&d
òfn (.!,) 04 e ùùe 4u:eÈr úiir'crcs!úE
r{ùr
c R4drld
k dúEd byinprraúor ard
ftor rlyror (rqó)
5.2 PAM Scoring Matrices
iships o(ùrìng rq?3).Îr$mdfi.Òsft
ù oroe. fte 66. sconDg
rir {idcly used.
ìò (by muEtionsNe meaosubfiru úois i'ì rhislhrprt 34 $sfamiìiej !e urd, ous sejùeices ( > 359. id. ity tÒrd rc rhe 1''ú oBupnúpo*d huhho6) n
nùr hrvù rhùsm..r.d
(nú' i!'Ò.ion in I simìlfl vry) f rheord one,which ùsu
6u€d li Nnben of 0ùft trúi!r.!lr6tdù. Thei Dryhof\
prMduE canb€ desi
'ipkirigmenrroleichgnp' 2. coîrnd
pbyìogedic (evoì'rio.ù])
'us tu àch groùp, lid csindc úc
rjmc inbnal r sícs fof caú paìI (a. !') m erimde for n\e plobabitry or/ Ìo
5.2.1 The evolutionary model
ùù?t
xitì.ù tlo v|nùk
iig rsunprìors: /Er,.raririr i\ ù1!,hp
,Ja úador
Òufr!idú4 (i.dspgi&úùot rèishboùf flìjJìorcìdsofdfuÈi\emighboÙÀìe Fsidus (boihin squen e úd spE)), ?d
r!.sirion wcdn juf wir,
-ddtà,r (/àMr. rnd I PAMmeins,r
5.2.2 Calculate substitution matrix
itr'heofhcEd
phyloeene'ic fts (15?2inrhetur experiftno. lr'le ftecodrir
Faù. s.r Pú of{,trù'ipt d$m
r ft {vr úiiary rtc ù ù, qtuhroc c
\,,./ FiBud5,4 (!)^ sdrphybs4ctrr ftc o qw(B
(b) ft,i!ÈiioN
m on ù! d
ambigÙityinùcmusljús,Ne'hefridnìircÙe$ÍinglbcpmblbiìitymonglhE
p,à,r,rio 'hd a vill b. 4red by ó rbis by ,rr;, (bc.$ ,v h aìsocined ùe núdiD i" PAMi sîd frAl q€ kDk d . = l ' l,,]
prÒbùiliry mùìx). f
n€sud
r ]ìe FÒbrbilg dúr . nuùB (run rLLo.curercs or a haE 'ht sme prcb.1bili'y).and
. '.,. ù. rumbc.dfd + t dl] + a. {hcrc + trDs nuktror (nor
. , = I,-. L,,,hcb'irnmberor . / = t! ,. sEe ùc búr iùnbe, or muh,ion:Gút crdì mÙhùon I d. Thi\ is rhcclaiiÉ @cncne ormino rid a in ùc eh m{ùtd, hme L p" = I
t /i,;4q rholrd i0cfae wi'n ii.tuiù8
tì a ooriic
Hcrc n. cai bedefiiedÀ D, = rrlr..
i nùy iìtrú'jÒ
,hec ,r is ù.òmrd RnslhepmbabiliryùrtúÌhilrury
rhc rhúirity
rhf anebiùay 'nùriÒi òdlirs d is i,^//2).
biriryúfn hr,,' d nGìre r, = ó,) tt L z ftz f amoùe r00asidusúùeft r00a,o
rùe pnbl-
,,,-=_LL
(5.t)
L t c o Pr " = L t a a p . . # - = , L h = l 4'FordeÌemini|gMdbwe.únowsgthehch'ha' ! rhÒptubrbinyrh4cmukres(rrnne1PN)ìs4i,lod ivciúìur!nùbo\nt;/t
. Mòb = ,4 Lh[ú îóf o + b: . Md=r a, (ùc prcbrrriry rhd. dos tur i'uúr). a np]e.lhepobabilitylofhAbhcrcplacenbyaLi\o,r'rx)]'
5.2.3 Matrices for general evolutionary time
úsion pq 100rcsidtresA rttr G$luriofufy) eù50(duL\'hcmllcú|eeisbrcplaa r00hyj0 jn F4ùriÒ! (5 r) Nd! rhi(hìsdes nor.oftrpdDdb z p M (2nuùúon
':ù4hdkE(be'ohdbyn.úix íidcp!ùdcù pmpeniesof rh. mo.rer(Mdkov ùìodd). ve wiìr srìo{ n for ,r.rr. tpìrcedl)ym
im m'd /, rfÈf r{o m
Èabbiury Jor rlìis h ,v!j,Ì4r.
tco|dî].pmbrbi]j'}ìs,ì1dl]ìr ùdù.IjrrlpÓbabilityibi(tobelep
\[ M^2_{ab} = M_{aa}M_{ab} + M_{ab}M_{bb} + \sum_{c \ne a,b} M_{ac}M_{cb} = \sum_{c} M_{ac}M_{cb} \]   (5.2)

This is exactly the definition of the matrix product of M with itself.
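The step from one time unit to t time units can be sketched as repeated matrix multiplication. The three-letter alphabet and the numbers in M are invented for illustration; a real PAM matrix is 20 x 20. Here M[a][b] is taken to be the probability that a is replaced by b in one time unit, so each row sums to 1.

```python
def mat_mul(A, B):
    """Matrix product: (AB)[i][j] = sum over k of A[i][k] * B[k][j]."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(M, t):
    """M multiplied by itself t times (t = 0 gives the identity matrix)."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, M)
    return result


# Hypothetical one-step replacement probabilities for a 3-letter alphabet.
M = [[0.98, 0.01, 0.01],
     [0.02, 0.97, 0.01],
     [0.01, 0.02, 0.97]]
M2 = mat_pow(M, 2)
```

The entry M2[0][1] collects the three routes from the first to the second letter in two steps: stay then change, change then stay, or go via the third letter.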
5.2.4 Measuring sequence similarity by use of M^t
rl,; múss rh, e!úry ora &d, À : llhjlh n4 b, dlrc'd rfÒm',;, ù . úd , = aLàr (ii 'ine r) h úù ll, ,v;ú cr ,!r;. HùrÙ. 'fù ùlNr of qurrily isro' syùftú, n dos norhkeóhaÍce ) n: (saq,neqcn!! d1hù unduryùs do rcoùtr( ind irh.s ro Ne rnuxiDLi!{ior
ù ru' ù, vnhoú +rrtùg
r
aùÈdi3qnù' juibydúr.r. rr ùùs ..iú a i"orcorrlrsqrenes (r). nre (/) onryhy lha : r rheorrìss
atÉ
5.2.6
aE1z lle?
! F;F
4.2.7 2zLrbÈr>>
Remmbs 'rìd LÉ{ pi = r îd :,€r hqù.r.ics. Thc rrcqtretrc'in ùe nùme
Md, = r i henceo i5 úc n'i or 'ùo
Fofr Eqúrion(5.:r)n bììowsdúr .o.,>|:,'EPhce\omoEon.ni
The odds matrix O is symmetric, which is shown by
M,b
î"rn"
Lrî,,
Mrt
l,r
5.2.6 Scoring matrices (log-odds matrices)
nl\ini|ÙìtybeNe0msqufl(Ú4
The measure of similarity between sequences does not need to be absolute, but relative (we only need to know, for example, that q is more similar to d' than to d''). This relativity is kept if we use the logarithm of O(q, d) instead of O(q, d). The similarity is then found by adding the scores for the aligned residues, instead of multiplying:

\[ \log O(q, d) = \log \prod_i O_{q_i d_i} = \sum_i \log O_{q_i d_i} \]

Matrices of log-odds, R_{ab} = \log O_{ab}, are therefore used as scoring matrices. For 250 PAM the values are multiplied by 10.
5.2.7 Estimating the evolutionary distance
oltr'joiùr diihú ,. rhisnatu rbarr mbrioB ptr 100 *(,-"à*')
i (ir 4 is u riftror or
rc5.5 (RcFcnb{'hratu'irú!ù,ighf iúàr iiú rheo'isùi lnùo !!id ) tr \ ronid (by ur oI lqurion (5.ó)) 'h
5.3 BLOSUM Scoring Matrices
h òc DrtDf moderrhesùiis uìue on ( leel) h!!r 'hsdn( ddvcrDcdiotue
Bros!MscoRrNc MA]RrcEs eachcorumiii c{b bro.krh.yconnr d dr tr ,0) = ,10 dirìccdP,i ,rignmq. ofn sequ$es mlkèi ún(u -t)pairoramiiÒnidsFofrhdriBrbìock rherminorid prr (úr) (ior ùd r./, =
'=IXL-..e,. vheE > ù inrerpErdrs r ronrod.nngoverrhemi'o eids (fù qoptq . t. = ,,.,/r Ge fre'rucncy ÒrobsLryc! prn9.
s r0pdfs.bùc ri
= rìr.
5.3.1 Log-odds matrix
The observed frequencies need to be adjusted for chance. For each pair (a, b) the expected probability
> a.,, rhe obsfled rftqù.ncy is bisl{ rbù ùptur
! t?Ì < .i!. 'he obsftd
by .hb.q
whùh
iieqrcocy
t .4/' = lir,. úich indicfs biolosi.àr Dcqùriry brrÈn
rnd à " ''th''Ir1n'.$d'|fmÚ,lp3i|'''. ùù ùe ahvryed lEquqrcies arc cqtut b thcf.qùqrì.s iI the .dudt pqrtúiòn. Frcmlhistheexp(tdprchabi]jlylhr
r rnrm n.rd a o.dd 1i!, + lÉi
,i! rrtrsi
,h", 1.L"4, h."
!:r
h y tudn{iieÌhePrn (.ò) ir . !,,t, - p,,pt,+ tur,, = 1paù,, tó' t + h.
=;,
=# ùrdd
ùd tbe dped.d
iÉlùúcìes
= i!.
tÒ! c&b lnìno eid prn civcn 6c lbow ùr
^
l'-
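The steps from a block to log-odds values can be sketched as follows: count the amino acid pairs in each column, turn the counts into observed pair frequencies q_ab, derive the residue frequencies p_a from them, compute the expected pair frequencies (p_a p_a for equal residues, 2 p_a p_b otherwise), and take the logarithm of observed over expected. The base-2 logarithm without further scaling or rounding is an assumption, as are the function names and the toy block.

```python
import math
from collections import Counter
from itertools import combinations


def pair_frequencies(block):
    """Observed frequency q_ab of each unordered residue pair in the columns."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def residue_frequencies(q):
    """p_a = q_aa + half the frequency of every mixed pair containing a."""
    p = Counter()
    for (a, b), f in q.items():
        if a == b:
            p[a] += f
        else:
            p[a] += f / 2
            p[b] += f / 2
    return p


def log_odds(q):
    """log2(observed / expected) for every observed pair."""
    p = residue_frequencies(q)
    scores = {}
    for (a, b), f in q.items():
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = math.log2(f / expected)
    return scores


# A one-column block with nine A and one S: 36 AA pairs and 9 AS pairs.
block = ["A"] * 9 + ["S"]
q = pair_frequencies(block)
p = residue_frequencies(q)
s = log_odds(q)
```

Pairs that never occur in the blocks get no entry here; a full implementation would need a rule for them.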
5.3.2 Developing scoring matrices for different evolutionary distances
d / wiú ú (eroluriùiùr)disúi.e x, ùe lNìd6esgmfl'psaf*pondi.e
:r!
ùc *srìrd\
cod5PÒndb (D %):
crdx
rheorumi.Ò\ Òtan amiio ùìd gairrmm 1s?,s3 ) shoùld&rí mudì fs. îis is dom by tdkFi.E rhesgnùN jefufP4ceÎbgeiddiyrcgrcuFrinbone
eÍri9 ro oru of rhersnc
ùúrc!ruuP.D
gfouped(idic ùdc a tm% idediry), aîd riotrsGù sgmetrh hxlc Equrì{eiehr). br .nh (4. 6). rhÈ E$ftitrg bìock bùÒne\ (i,:r.5) ( F
'r 6r
K;;
sp{aEry, ad rgn.ns
(îr
EsR (.1.6) roserhèr
úe oDrydìrcc symboL. lid rhepair f,qErcies
bùonù r{r
=
3 ùd r{
= i.
lìgua5'óc!ru{qJslcbd9uoúcPsI
2. Findrhebr$ks (wirhoúgrrt. a Couí úc e.ùrt!.*
or all plin or snino Íids
5.4 Comparing BLOSUM and PAM Matrices
s r]y ùs or chdre en'mpy(rrìm inrima ,ir thcory).ir !$ b! niùiJ rlú ?AM ]trrfnompdjng5elEissnnrlwx
ù. ld!r'i'y $oriîg mdrix). rnd 'heq.'
dd ùè risrìmm' ùd is jùdcedb b.
5.5 Optimal Scoring Matrices
hydsphobi. Egmqs lssndr
súdriis
an ovùrcptserhdÒ! of hytrrcpbobic
mìm ridt. rn rh. úeoryby r(artinaid arM[ut 09e0) (sÈ sdioi r.!, rn @.i ch$smúr pdru(wiùourgaps)is dfletoped. Ldse lvo squùeshùeúc beksDurddisdbùrio's 1p,Ìtrd 1p.t.Espsrively (r, a ùè lièquencyrorrmho acida). x d sqúcncei(usine{r,t,1i,)) úc o'no rids d, , re irigmd hy r tuqErcy apprcrbrds
hù = P'nÚrRú sÒFùsEuriotr (5.1 ror'nèsonrs mrrix, *e ser
L
hù-
),
r'fÉ
-"=l
6rlùwiDsrhecomftinb oi I (sc.tion3.3). ÈlyPs ot segmeúvc m s.mhìig for ùix is fomd Eirs Equlrion(5 3)
rhed |A, n I L, M, q vt shourdsc@ bsr we 'lìefrG dÈtìrchther îlil'trd .id! rtraotoron\e6.
5.5.1 Analysis for one sequence
n! ntuù ù E{ùarion(5 !) .a detrú:
4=i;
( 5e )
5.6 Exercises
(r) crcdc r rcord Phyrolenericrtoc h
úbbor/d,(6cnlnberorhoúlotrs bd$eidardr) (b) assù,ic rhd Ìhe ddn€
jbf drd,, thcn
fequcù
(.) usc ú ir 'o cd.ùlar rhevdùs dr rhs {b$iturion maùix ,vi,. h n n{h ondi'ìg1ùa Guhfìrùioi r!Òtue andrhediùon sftspoiújie .o E ({hdnúion Ìo E) ld) txffofm rh. rrox ,ir!! ro xtr oddsnarjr o",,. rf voù hrc 0 r dÈ M.i,rlEjr.lhetrúùecors!{DJjn e enedr úe rhote ri,l> 0) (d) TÉ6fom ó,, b a Ìos oddsrì!ùr /lùr. ard nuìripty by ro
(
() corsidd úè úluès ró! h.!e ù r,,, (a I ,). Explrii by trsiilg'bc ! uls of 2. ùìd ,r 'rìù i' h msù$rc
(a) où dÉ {eb yoù {ìn fitrd dre}LosuÀl ronùg natices (lor exùì ple,r h'a r i.ì n sÒs 0ig r.jpmdr ùloey^wkùch el.ìp htnr). cir ù ft'heso neror(a.,rr) iÌìd(r dr)rorsl-osuMa5rìd BLosr-M
(b) Elli nìat rheevoì'rio.a.ydisramein FAI' bdwÈr ({. ,r ) andlr ,/, NhcnEqrdion(rlissed. G) Explainùe shxpeor 'he.ufle in Fl (b) coisider t00 PAM.rd sùppordì plobóili'yo|chmei.e^SÙme
(i) dÈ Nnbù of fsidùs úr rìan mr .lrùeedl (ii) ú. iÚiml nùrbù or iesidùs rh (iii) ú. diiiùùn dùdbr ùrrlidùes ú
5.7 Bibliographic notes
a liq rGmaúle Ìo dr aacH d&$iridiotr nacc s'.irhfJ snih I r992). a nerhodrù
Gmìmacid.ls\ !o!cnrs) io
( re!,ó)andHeiikorad }rè[i[oÍ ( 1991. l9c2). A Gpid nè'hod for senedi'E núúrion daÌaù,ùj.ds ìr r loneser L (19924. .ùs{d i conner cr d (r992)ind BelEer csir inatrcbut(teel) Maùj( rórpùn of rmino dds ir in conner(1994).au evlluarioror s.ÒnngnerhodoÌosìes n jn rohnlonind olerinc'on ( ree, andso
6
6
Profiles
oî a ptoreìrhúiìy hs h*i routrd,onccrtr
dlhbÀc ÒrÍqbfts Thelanù Ms beD sho*nÌo bc supcnorin d.@.ìrg wsl p4ific pmpediesin rhesarch.A nurbèr of
ilrotfs.Îiòsindùdc.,,1",![sa@ PosrioniDeinc *orlng mtics (PssM, o. wight mrdc6, rhesearc mÈ rruc'èd tiod mùr,iprear4nnmb dd ncd .oìumN. cips de not dscibed a pad or rhcPSSMxndionúny no eaps
rorm rhedecE ot onsÉlior ir r m 'ion ro poiúor*paidc rors (p.sirlon{pÈifrc) goppendtÈs'o b0 urd wher .oúpÙiiglbèprc|ììclolsqueDce mu\(preflsTlrsgaeún54cdtg] Prot {irh a pmlìleúD be fepgenÉn s r tvÒ.difrÈrsior,lmy, lea denor€d $.2ìlyirglheposidm:pecificgrpp€naì.ùs' FmnInuì'ipledignmflcùeblo*o r alum a (Àor-) h rhe$oÈ or ali! nsiminorda (ton ! *!rùcnc)rorhe p.sitionr or rbepmfile,6 sh(,u ù Fisut 6.t
6.1 Constructing a Profile
c membds. bur rhùr jrù (Drcb nbry)exir uúnÒv! nembss, hov sh makea bd posibh d.sfiplion of rhe filesNnhbasÀinprcfi tcwcisbr{úompson 4, r994i),$hich ìs ù ùrNion or i uedìodhy cnbsror sr r . (19371. î.! wc d.rìhc u eirenson or BLAST (PSLBLAST). atrdaìsosìrc r bnd ioftomfl b
6,1.3 l *. io' frmirùr ,ih úc uef\inc
1. Tric bekgîotrid (lPrtu4
Eddpb itDÈù
, F3uÉ 4.r
dnhbùi
5. The divùsiry ,rd rìmir{ùy of úe seqùei.es,ffutriig veieht oremh sqmr m. knoùi
in rhc inpodb.è (or
difrÒcit ndnddr ro! pójììo connrucùoi.
a posnioniD ùe sliemen, md I b h mioo aìd 4 4,
aninoeid ìn posiÌion/ iisqùùrJ' úè rudbù ÒfEuftnes ol ìds, dd r (a sco;Dsmfnt
úr
ùc wojghrof squcn.c t Gequeice*eigho
úè !unh{ orrcsidus (notsupt in po
6.1.2 Removing rows and columns
lhenb€.fÙtrdioiÒlv,,ardRh'fora .orums tueindependèddfcarh orhfl. A rcrsoiabte fuicior n ! ìin*r comr,inarion
hd.-
Ltrvt
{bùe 4b dÈpeidson î/, the numberofù.ufti.Òs
of à) rro conrfaitu nighr
t v., = 0 rof Tr, = 0. md
o us psùdo{oùnts(G llb ;trùrn $bsÈ rr mur ùen b. d{i,rcd hùe y,,, rhouldincExs widìimasiis tjr. f,d rj ,nd rr bc'!o imirc a.ids.]trd î.,1 < : lve r. 4!i'ErcrsesprcpdiÒrdry{nh7.r,Nming
"d,r
v., = !:9
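A linear combination of this kind can be sketched as follows, assuming the profile value for amino acid a in position r is the sum over b of W_rb R_ab, with position weight W_rb = n_rb / N_r, where n_rb is the number of occurrences of b in column r and N_r the number of residues (not gaps) in that column. The two-letter alphabet, the scoring matrix R and the function name are invented for illustration.

```python
def build_profile(alignment, alphabet, R):
    """Average-score profile: profile[r][a] = sum_b (n_rb / N_r) * R[a][b]."""
    profile = []
    for column in zip(*alignment):
        residues = [c for c in column if c != '-']      # gaps are not counted
        n = len(residues)
        weights = {b: (residues.count(b) / n if n else 0.0) for b in alphabet}
        profile.append({a: sum(weights[b] * R[a][b] for b in alphabet)
                        for a in alphabet})
    return profile


# Hypothetical two-letter alphabet and scoring matrix.
alphabet = "AS"
R = {"A": {"A": 2, "S": -1},
     "S": {"A": -1, "S": 3}}
alignment = ["AS",
             "AA",
             "A-"]
profile = build_profile(alignment, alphabet, R)
```

With these weights, an amino acid that never occurs in a column gets weight 0, but it can still receive a non-zero profile value through its scores against the residues that do occur.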
9m
2 v,,, i'c@ùr by norc rhan?;r. meaÍi.s
''r,ú' ílrcÙed,'oleeqMdonsdslyij'gdús
wì,icrìù iìrusrfcd by curu 2 i,ì Figúd 6 l 3. v,,, inljlsÀ les rh,i ?; mcaoirs
vÀ,<_v,r,|'
''1+|i?.,,
6,1.5
d! lùn 1$ chapÈ t. r sinph ùrMiE e'o Hcùik.ffúd HsrLorf(ree6r.
$'irniied d ];d/ar, bú'hi
(sd
do$ trorI
fMr speriNns dn , = J-,t
qus'ùldhydhefùEfìcfio|l'A|t
Norcrh*fo.Eqùarioi(6D,rhoprchlemof noio..ùfir'glmioÒ!cid5rrl'tcr.aE
6.1.4 Sequence weights
rh. seqùeùes .ù & sio !eish6.tDJt. rscìptaiodù suhfdirìiI .1.3. ad d qPfesÌon for rheposirioi M ghr y., c
.-=-E; Íl'.'r,
"
fr tri =r. l0 irr +ó.
= r . rhù. rorRlrt, v,r, = îr,/flr. vììi
ùdudldù!\lFlftlc'arddÈpeia]tyl'd
3 a gapis rov.fd (50).sift n|.ùÒgaPdtnìÒnFla|ryis'ollo*ùed csdiru Ìbe p.ùìry l'o! ilfrùcins
rhr slp Èosri
6.2 Searching Databases with Profiles
veni.aledge.u\\hò!.jnFigÙfc6!'l.rù prcÀìeenditrgin posniont (tuorj) rùd I sub{qucn.cGub\ùiry)o./ endinsrn posiriotri (4). we herc pt*ùr r vri úÈ lai .olùm is (dr. Pi.tr ) (DÒr! Eap)
a:= o
I4 '., tul
ÌiiÌ
'-
-t
l,{ir,(4 ,". d k tu p.ùrry io!I glP in a orb
6.3 Iterated BLAST: PSI-BLAST
Iterated BLAST (PSI-BLAST) (Altschul et al., 1997) ftc nlii jdcri\ro fiRÌtrs," rhen.ìare RprchÈor 'hc (irrs (6c rouid
?,= ÈLAslfa.a ,ù = Mùldptà{ignmq(
0 is 'hes' of {qucncsrdùd ii 'hc$dch 0r )
q ì- Pó6rusc{ch(r., (Redù.e(q) - 0 ' ) o' nùnnnr nnn)bet unrir.d,rrl,'r ol ttd{
* x Éusd !c6!D or 3lPned BLAst ploiìh Thc PfÒrìrcP h* r 0! fù qc
Ndc 'hd A ù fte ij6' MeDar
or 'l'e
6.3.1 Making the multiple alignment
selnss in 0 vhkh ft Gxedyr.!
(, aidr{ennk6EteErcaroÌbeexinìpìe ) . ku!!rrbifdilxrp$i{iùNotrLsidu
6.3.2 Constructing the profile
1leprcfikìsoshded':]dig'ì.p.ì Ltl'ì0]eignmùr'Thrsismri,.dúJùgh|ìè
ù'MliscoisrtúcJ rcsidueiIcolÙnÙ.hrchjnedin]ìr.
rÙf'|ìtqoP]clbot'hedÙ.e!JjgmMbM'
ltlisrkl
ffcqucicic\. {t I (for eaclì.olùno), aE úìcùìxrd b} ù,c or ùc obsc^qj dlhesqunlcloghh.&qreNeweiehlsft
bucqF\n'ingorfiltjtrdcpcdÙdy úe dismenÌ i.4 is 'hmn'! mdcd arin rd mjno!!jùs (ú! udirseapq,fdq{,
rhe rinarrmfiie Y,rù6G.ont {L or 'he rom be úalP,), qús. pii i hr d h be iù ù\e .oftidùcd polrion (rììe p.sirior I is j prico N4unll} ft coùtdN drcleshÈd rreqÈmy /d b òiìîc r!, Ho{cxr rheiùnìbù oi obssyr onsGeqùì..r ii th! ,isoncit ma} be y@r ùu I rado
hùer'ùr inp eneDcd Geesùsùr Òi ó | r) ùìm' & is cal.ùrr'cd r'or erìì ùìitro riú. sd É Àsagcd {ih , h
a-f,,L,"^
hemlnìeúúiÌx',dèijr.dby
,.',,=n,n'è&, (seEquriotr (i.7)itrs{riotrs.5). Nc ol,+p1). ,
6.4 HMM Profile
Hidden Markov models (HMMs) ììa! pÎoÈinlamiìyrmlysis,llem6'om.
vù rìighad hicher'hm 'r. pbbabiìi'yof i' ÌtMM eddtirg
a (qùerylseqùetrÈ!ìd pÉdicrI À r úenber ir rhepiobsbilry
*iruùtbèiregertsÈdbydìeùod
6.4.1 Definitions for an HMM
sú pú ol ff$ 16 N ar
lud:fe r srn {dc 7òmd r rop lrc î,,. For (4, Î, Ìhe€ is i probrbìli'yof mori'ìgtrcn fíe 4 b ffe Îr:
ùis P(i, j)
rhc pDúbini6
rc FEFd6
t P(i.,= r.
6d Èn b€ újud
ìl irc
wb.Érr = îùatrd4, É 1; we $srm lmbabiùyúdlheD.nicdffpfh'is.ho*i .rtt=
tt
f \tt
a.
É4h$bcaremirrsymbol, jroùa
{e ds paladen
ùf .s bc adj'ft
0 < , , ( . l i ) < t n t r r r dm di ,
,,.,,"",-"","",.,".;:",,,..,ins.,",h,'.s.h*È.nni!ly'q
P \ q . D= P t . n ) p Q n t) = f ] r , ( " ,r . r f J p t a r t a )
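The factorization of the joint probability of a sequence q and a path through the model can be checked numerically. The following is a minimal sketch, assuming a toy two-state model with silent `start` and `end` states; the state names, symbols and all probabilities are invented for illustration and do not come from the text.

```python
# Toy HMM (hypothetical values): transition[s][t] is the probability of
# moving from state s to state t, emission[s][a] that state s emits symbol a.
TRANSITION = {
    "start": {"A": 1.0},
    "A": {"A": 0.5, "B": 0.5},
    "B": {"B": 0.5, "end": 0.5},
}
EMISSION = {
    "A": {"x": 0.9, "y": 0.1},
    "B": {"x": 0.2, "y": 0.8},
}


def joint_probability(sequence, path, transition, emission,
                      start="start", end="end"):
    """Probability that the model follows `path` and emits `sequence`."""
    if len(sequence) != len(path):
        raise ValueError("one state is needed per emitted symbol")
    probability = 1.0
    previous = start
    for symbol, state in zip(sequence, path):
        # One transition into the state, then one emission from it.
        probability *= transition[previous].get(state, 0.0)
        probability *= emission[state].get(symbol, 0.0)
        previous = state
    # The path must finally reach the silent end state.
    return probability * transition[previous].get(end, 0.0)


print(joint_probability("xy", ["A", "B"], TRANSITION, EMISSION))
```

Summing this quantity over all paths gives the probability that the model generates the sequence; missing dictionary entries are treated as probability zero.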
6.4.2 Constructing a profile HMM for a protein family
usxrl
oft rrÈ wnìì r 'ìulridÒ rìjsimen' of rheseror s.qrcrcs (or a rrmiìy)
reumìred in oì IiMM 'crin3 ,rfdrrr' ù\d rúcs a mpuri cr or (idmrry risn.d) mcnbs rqùei.es. h Ò sro berbìe roco'ipd Gri!tr) I sLquei€ wirh
a!4f@Eqql
q1 {3 qlq5
q6
rú Èrud,
! wc4 !ùNù.c
uo rrertrh(ri.kù*).
rid úc llw
!tuì ù! &ftsùrù4
(!q.
moE lnùo r ds (moE dún oft is p j,\.rud jdc k, n\do, rhccnrsiotr p
6.4.3 Comparing a sequence with an HMM
nh im t.tMM(bnrd lhdúrhe quence B Hl!r\O. one mnirrLy iìndr onuÈù ,hr hó ]ùcllsLrîr'hcnÙmh(',]lhÈhc'he cfúfdeeldi'jgdslb7]edgm{di.e D-'./ =
mù
L a . L P L i , i ) P ( 4 r L1 , , e pmbabiliriès , r.r JhE rtrsiioG "(i.
m
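Finding the most probable path for a query sequence is usually done with dynamic programming over the states (the Viterbi algorithm). The following is a minimal sketch of that standard technique, assuming the same kind of toy model as a plain dictionary; the model, the function name and the probabilities are hypothetical and are not taken from the text.

```python
# Toy HMM (hypothetical values) with silent "start" and "end" states.
STATES = ["A", "B"]
TRANSITION = {
    "start": {"A": 1.0},
    "A": {"A": 0.5, "B": 0.5},
    "B": {"B": 0.5, "end": 0.5},
}
EMISSION = {
    "A": {"x": 0.9, "y": 0.1},
    "B": {"x": 0.2, "y": 0.8},
}


def viterbi(sequence, states, transition, emission, start="start", end="end"):
    """Return (probability, path) of the most probable state path."""
    # best[s]: probability of the best path that emits the prefix read so
    # far and ends in state s.
    best = {s: transition[start].get(s, 0.0) * emission[s].get(sequence[0], 0.0)
            for s in states}
    back = []  # one dict of back-pointers per symbol after the first
    for symbol in sequence[1:]:
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda k: best[k] * transition[k].get(s, 0.0))
            new_best[s] = (best[prev] * transition[prev].get(s, 0.0)
                           * emission[s].get(symbol, 0.0))
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    # Close the path with the transition into the end state.
    last = max(states, key=lambda k: best[k] * transition[k].get(end, 0.0))
    probability = best[last] * transition[last].get(end, 0.0)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return probability, path


print(viterbi("xxy", STATES, TRANSITION, EMISSION))
```

For real profile HMMs the products are normally replaced by sums of logarithms to avoid numerical underflow on long sequences.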
6.4.4 Protein family databases
PFimrhesqùÒn is nÌìpdrd Grisiod)lrh crchIIMNI d tanily nenbùship inùy (ù rhef imrhmenhriona sor i5 .rL!ùhrcd) Ds pRosrrE dùhr1e Ns i(tri pfónlr dd lcqùqr ndrh accchiprerr) thdi!xloydscrcLriooship M: Gs rheyùe d.5dibdl hùO, Fisúc ó ó
6.5 Exercises
y,./ v,c ror úie dficur
.qariotr inf ciìcrìariig r,!
sbÈhsftFqdùgly(htdbd€F@[,o|
(a) Mik a pbrìh bs.d òn ùi1lhcn rheq€ieh'irs À siditr b úc qc sd
(b) Do rre.d.ur
otr r'orrhe*cond
'ioi (6 2' fÒf rhuporirion weishr. ùedùfN!ofrhealiemÚtdom'
vnh 3 = 20 for sips in dÉ $qù.ncd !
s ii q ù, no. neesdily 0 fb
havc ro bc r
rhc nq' .yde. L\plrin vhy ,,''..;|
,
fo]lolitrgmldeL!rjsons'neik|:
ÀFr pmbxb'y;, aúdv$r i. mdDad Eee,i wh",,.,hr,u...
b m{ù sb@i(tr rr. i b
"È
d.'.1, shÈ.ird i b rhe'rdroù ú'!
G) conrhr in HMMm.rcltu 'hca srìowù (rù$!!hrd P{hrhmùsh dr n
6.6 Bibliographic notes
Profiles are described in Gribskov et al. (1987), Gribskov (1990) and Gribskov and Veretnik (1996). Profile weights are described in Thompson et al. (1994b), Lüthy et al. (1994), and PSI-BLAST in Altschul et al. (1997). Dirichlet mixtures are described in Henikoff and Henikoff (1996), and Sjölander et al. (1996). HMMs are described in Durbin et al. (1998) and Eddy (1998). Pfam is described in Sonnhammer et al. (1997, 1998).
7
Sequence Patterns
rnonúúìyric)i0rend'otrs,suchI bin ù Grd urrLJ, I n'Ìrd rnn Jnhd prdl iùin ùc dorù Gome'ms ared a Do'il).
i\rrìdìrmyhdenLmormsiir irir_l). nhileI rcrr alìg,ùndtr Ìe'hodnig|l ndd Édibi'ìA !ì'niltuìriesanoi€ sqoeics (or 3lilmiy),$chbF6ÈudPrcireHM
s t F ( l x ( 2 , 3 ) - D .v h i c hi s i ù r r d b y a !b)sèqderù hesituìirssnh Rdr( rhr
pfup.nis'lì!q!mP|e,PRoslT!gi! rdhdfics (r erc$ or eúyùs
shì.h Ìdifc
biÒ$dhc\ir) rf e ney quene mf d rc !nd/o. fundion catrbe anrtlsed frd Òic
prMs (.rc nr och rGi,a tinìiìy) h mor cm.iú' nd ar$ pnÌidcs a morc {isrilc 5J $rdfi c ri Òî1!mrlyme P dN de$ritìitrs bioroeìdr\ mùliEtul similiriries rrc úLhd ndi^. Mdii L0fshdtrdDlolÈnjcsoflhgoÌei|'orlù dÒs.nb. (nÒikjrirr) f.rù$ $mnon ro bioloei.rlìy Ghdúlly ù roodiona y) .ore!a6loeviìmtgh.'lftitillfrÒ'if.
7.2 | Dir'rùcn, r-34,34 lNde ùd ir is 6suy
(or roml|nns) fo ùrerch rhf pùr or r *qùù!c
duuy
mdcher I l,drem )
Írens(Hofmrnn errr.rrrq) obcPRosrrE yofltuprerjnfanirj$.)
7.1 The PROSITE Language
pù.i'hqe\ 1 1 . ForexmPìq raclr
Fordample, lcHl da.ds r$ hy rdi
asin posrion e ìiscdbdN*n {f
j(r) cotresponds 'o x x x rtrd r(1,r)
l R ( ì x ( 2 , 3) t D E lx 1 2 , 3 ) - r . n-x-rDNsr-rrLFrù tDENsfGttDNacÈF0-{èpl-rLrwc) -tDE) t.rr,mt rDENasîaccr-x{2)
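PROSITE patterns map naturally onto regular expressions, which makes exact matching easy to try out. The following is a minimal sketch, assuming only the core elements of the language (amino acid letters, `x`, `[..]` sets, `{..}` exclusions, repetition counts such as `x(2,3)` and the `<` and `>` anchors); the function name `prosite_to_regex` is hypothetical, and the pattern `[RK]-x(2,3)-[DE]-x(2,3)-Y` and the test sequences are used purely as examples.

```python
import re

# One pattern element: optional '<', a component, optional (n) or (n,m), optional '>'.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":                 # wildcard: any amino acid
            core = "."
        elif core.startswith("{"):      # {AB}: any amino acid except A or B
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:             # x(2) -> .{2}, x(2,3) -> .{2,3}
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


regex = prosite_to_regex("[RK]-x(2,3)-[DE]-x(2,3)-Y")
print(regex)
print(re.search(regex, "AAKLLDMMMYA"))
```

A flexible wildcard region such as `x(2,3)` becomes a bounded repetition, so the regular expression engine tries both admissible lengths when searching a sequence.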
7.2 Exact/Approximate Matching
icr mrchiis, rh. wilrio!
lrrów.d ndù(;b.
). oi Ììe o'htr hdd. ve also e dd dry :pprioÉery (o ù ednnishrù or r) nfó rhemoe 'p€cùìied
[email protected]È,
$ù.ics
xpÈfxinmely (lo.dn disbrce r), andcxE mdch$ r&ee ofrhem exidty
7.3 Defining Pattern Classes by Imposing Constraints
The pfrcms.b
b.8otrpd
irk, (ovedappitrs)pxtr{n d6sr
depeding oi rbd
']',tg'1,4_rhùolci{lsflonìaeivc
virdlid rcgion h/dd iJ ir h,s rid lmgÌh (n nfdìlr rúino lcidr in a sequerc), o'hîwi\c n i5/rsiói?
by r iìxd dobq
of
(vhùc vidcadEsiors.anbeorìererh0). TbcrmgúlscÒrPRosrrE rms ate iroNs rb! fcPrùtoNot componenb, c.g. lDEl1r,r) Thh u úr hecoosidqed húLer
rR(l I lr,2 )- lDrl -at1, r) r c. (r). îrì. rfr 'eo aE anìbsuous.'he ta{ 'so
lqibÈ leeio!nlP|$ùcdltr{enebdpel rhr r (2,1) hA ndihjr4 3 (Nc r4hd
parems T.e/rdi,hj
of r *it,l.d
i fisd vitdcùd rgion ha seribÍÍy I )
• the type of components (fixed, ambiguous),
• the number of components (1, 2 or more),
• the type of wildcard regions (fixed, flexible).
a t s r l x 1 1 , 3r c n 6 . Mdimunì (dd minimum)iunb{ o. lonpndl . [ldnnum (andminjmum) lcqrh otcrlhoìnFoc
! Mlrifùn (rord)nenbilnyof. par
7.4 Pattern Scoring: Information Theory
7.4.1 Information theory
rhrhir4 1r,l.A\ùd!rsynboltd Ljor
k t4. {rft eid, sr-'ìborhis r hrlsNd
1rr ,r,o/trd,h, / (d) ó$úkd
orlmiitor/(d)rerhdr
t! = L.'hcn
1.1.2 |
ftc obpy
io, ! ùù4
dpiú(
{ú
@hhhy
I crch.
/(,) = 0(mmwkiowrcdec),andrh l(")shoddiicaóeford.crcÀjisr/ rrn isobùined'ylhefoxo'iqcqalion:
wcrcún/G) = É Nheìr. = 0 w be ds{r
ard 'ins
(he ìiw or kfg. lumbert. MiddÈ ;romdion
Htù= -- Lht,6tt,
r){ synboì
= - Lo. rstu
Suppose we have a binary alphabet:
• p = [1/2, 1/2], then H(p) = 1 bit,
• p = [1/4, 3/4], then H(p) = 0.81 bits,
• p = [1, 0], then H(p) = 0 bits.
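The three entropy values for the binary alphabet can be reproduced directly. The following is a minimal sketch of the Shannon entropy in bits, assuming the convention that symbols with probability zero contribute nothing to the sum; the function name `entropy` is our own.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    # Terms with p = 0 are skipped: p * log2(p) tends to 0 as p tends to 0.
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


for distribution in ([0.5, 0.5], [0.25, 0.75], [1.0, 0.0]):
    print(distribution, round(entropy(distribution), 2))
```

The uniform distribution gives the maximum of one bit for two symbols, and the entropy falls to zero as soon as one symbol is certain.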
ntm ìn rhenuEhù inbsrl I0, ìocrt.
(dr'nhúnn)otrhcàrùodrs.s.Òrùsr Lt lp,l b. rhebRckgmudpLobibilii$ TÒsÒcrposnioi',i's',Jb]?tr',,,d ''ìtt È|ita
ltuù . lt ùJanrìraa.idr 5ddùcd ù E!ùúòn (7.21.
riri'yìs2; = p,/r&.rree rr n'bcÌduúurddrbùketuMdpmbnbiri,icsor'hc
t tr ; (dÍHse
ii ui.!nin1y) c,i h. nrÌr
L.rd by !$oiEq@'iù5 (7 2) ùùd(? :r)
nx,) =
Ltu|Jlla
f)
\ *r-.\;
we se rhd i1 & = r, ùÒo / {r,) - 0 (N fcdldior itr utrefai0ry).andtrtr r., = ,L G ú or oie anìodi.ìd), /1f, b{oNs eluxlÌo Eqùdùn (72) wi'n rcgion' (ìc\\ rpt.iiì.), $'1om My ordoì'E d,isn b $ore awirdcad rqion r (whea 'hevirdù,r 4ioi is spùìîed is r(n. rì )) ó
coi nt
f75)
Pú'iig €qùriom l7.a)md (7 5) k,!cù
/(Pr= Ir'((r) -.Iú
4).
o'npo.enE. rnd I olrf iL ([cib!e)
Ld (ii rhi\ sùipro ,
= lac.Dl. aùdpn = p. =
ldcùd
tl?)= tle
q^.t+ ttto,1, + H!
nD
= :r(jbej) + r(j bsj) - ,.' 3(jhsl, =-,,oc,+6sj-Ì
7.5 Generalization and Specialization
drcoy n ! surdsr or Lr (i.u andv m tuiÒi |n,r rhestunerpúsr.
drcoy n ! surdsr or Lr (i.u andv m tuiÒi |n,r rhestunerpúsr.
è x1r,r)
lcDlbaseftnrtariù.ia :(2,r)
a-rlt,r)
t c D l i s a s e i e f l t d i o n o r a - r ( r , 3 tc . i(r,r)_c-x
a pÍrm
.
taqDì.
mdr Í 1tucndora sqùùìlc (hDufl enì'zedpdrm l,u' rel 'rs orhd).ano' ( p a r d ) ! v i r d c i l r c g i o n :xa( 1 , r ] - c D c ù h ! s e n e n ì i r d r Ò À , ( : , 3 -) D {'!urrd,,,tr'heinrAcdfsùflìrj72rioD.h.!cÀ xi2,r),c x {aRDlis
7.6 Pattern Discovery: Introduction
or L (bú' ùo 'hc hktk nry ror bc biolog.illy iìrÒfmdlc)
$Li,óDryù c c.tPÌ.n).ctsDEc! nn L1!r1Fsì.r(t c hswss PRorùse
D,al0,2_ r rorl.
LrardrÍihcD
a r 0 , , ) - t o l r . i n d r h eo ( u f e N s
7l(
siiìrìiritìe\ b ùe ol úc sgreft
d hish{ rha! I gi'àì 'hPshold(or 'hí 'he u. hrgbd) A$oush we rc m'idlv irrm{cd in
'-. a-"'"r p-br-, 6"d mmnd simirùl,ieshd{Èn i s' or ÒbjÈb. :"., "r,h" Ds.rìbing m.úodsfor pú.r Ginihnlv) dis.overycar hcdoreions diflÙcnr
*,"i"u.,
"
p'".t'"'*"t."
d
fo
'i"tr!
'
o d
'*g',*'o'""" eetreritniotrs or ùem)úo o(ur i! oL\ersqucnrs
,h.;.-.
F"e-" *'d, *",.m.c
n jt cioughrhd rìec areodrtrcs
' reisÌ,t-ùnre dord$
nùghrbc
in d Lcd r (t bcj!8 r coNd) oi úe
*r pxtreris(rcquinrs'hd rhesodress ot rhepa.tns cro bemdsDrd)? ó Is exrctor lppmnmdc ratlììng ulddl
7.7 Compariso.-Based Methods r. of objeb. ?situne úmparisù is inc basÈopenriotr,citherFúNùe bcrwrn objers,b€'vei ú obj.d md a sjfjranry ds.riprion (q!. alisinent. or berw4n rqî simirúigd$lnÈìonsa pr:Ni$ rI (plYd). ure me objd' r\ pid. i obrds î. rarrs arcdÈi .orpaed sodú' Òncfiidihc I,rrG) orrhepìvú ùar ha sftìrlri'ib in anÌhedber 0bjès. Lr Gnsr procnssna). srd siú Òncobjed,ùd mmpue$rssivety rhcóùs
TP (tr. pm8.cs d. armpre (HùL'n ùriis i (po$ibtyihpricir) fte woùe 0c .l1|(rll-byru). Alì,i(, r)/2paìsisconplnsnsrep. omdlúrnolommon priíis( i mjhn ries.conplns.5 (e.g j'rc^dotr) ire doie onrhÈscrcslb ro fird simiùriesorùriig ii,lir sinpk. r orft objersG diqreofsihihdry ú Èsue 7.3,vh.n rb obj.clt rc Eqw!$ Ap:]n*isendhÒdsúnbedtr'ìdèd Ndc. how4ù, L'ìr'r d6cnpúon(pf'cn) òr ùc rjmitaìrjesis iol !{$sadìy givenj tùirì* Nnhrìì (orr , 1)oúù Òbjcdx,a|d 5i'iiLùid$À\!pdrrn.sldirúcrpan
pfs eîh onhe(oneor sercfrt)bN roúr
'heuÙl'jetc6gbyNiigùeLPleefueh' llÈùmscala|sobe.onpfèdwithpal hdfteD rx obje'5 (orsonÈmi mùmiùnbù or dÉ objsc) n i 6ed in \rricroù 3ld a4os (r q9r), wheE rheseqùem rrcflcúineBidùe sininiet ind 'h. drf N' sjrìiìri'ics iÒ' sh@dhy x ì obieds. led i3hî8i.iw (Tmsiriviry dxnl\rfs . npri4 rrd ,4 is simira ro c.) Hoq4
7.7.2 riFETlllLrdiÀúbqPnociPd!
7.7.1 Pivot-based methods
simild'r*'hlnajEimplúÈúfjo' s ùr rhùcróto !$d. is itr Royrbfs os92) @round(sijnitù'yúc6nrdbyúrdif an.e)
tqsl r(0,r)
c-i10,1)
H s Lor rnsl xr0,2) c H s-L
nufually$lisly'ljcsihjrliLyqÌsni
7.7.2 Tree progressive methods
memde4.h.ÈúEF{4lilgisilgrc d jÒinedin eidì cych. ftrj.iijns\dùc L.belledRiln pxrtro(t sen{ùziic
1lì f fops dìen thse n only onere, cokri,ìs
R|iPLcJ'rcmanelhodt,ysi|ljlidsnilh reeo).The so,ii!$rretu ir $ de6ied s I sbig o!.f úc iúrbr 'ii ! 14.r,!.r. i,i. È.r., r.!Ì. ùd irs std( i5 ùe \ ditrùùf nùÒ s Grp), nnus xrinc -!l| Ì
DFÎnicPlefuútrg;5[dfo.Pd'} ùrù j.! slD Fnrry Gùirg rlic Dp) whù rhcE rre srps ir orÈ Òrbdh of rhe
PATTERN-DRIVEN METHODS: PRATT
(!ì ,lrsEdnr pÍrni
. o E( 3 + ( 2 ) + 3 + 0 + 3 + 3 )- r 0 wirh $oE t0 .a rl$ be fou.d).
! arisoinscr,Jlciv*pLr =prnqFH,rlnh$oft(2+r+r+(_l)+r+r)
=
7.4.\ 1
, ariciing(rr.rlgiv$ p1r = psErs, wi,bscorc7. :).úd pr I ns.be
Anoderor?uiúetrccard(trrcóu
oory oneprúÌi (úd hishef s.orirg) : PROSTTE rir! fpJ!è
. tacl
7.8 Pattern-Driven Methods: Pratt
f |gf r) of ltu *qucn!$.
de if rhec m o
r dr sddqnr rcgioB.ú bÈrìqible (ù orhuù 0.otrsrivdy gìyjosùbihdy
x {1, 1 ) L. ii srìoriomÀ c-r{1, 2) I x L wildld felionsor|oso u aÍe
7.8,1 Thehlin
pmeduro
sú..hiDs, scfch for nexiblep cF sp€ciarizN.ioDcenbèm€'o, sp.lirÌi id in 'hù$fth Ge tr'ù), hd.he shndrr
r ! ù - l r i . . . . . r , t . ú ù ú ee q m n (+).Nor ùùÈroE rhr in I *ene'nc s. (Ttìcrhdlrd lilùe ibr / n 50.) ranrrdk,iiaprc orhns.h<,riú/i(i < r < /).omponcn ( ohf rcrsrhl). Reflt ùn 'hc terElìor, P
. ,t 'sùÈs,"faursc,m{súr.h , r; 's úe slÒll.*snÈns mÍóinc
hr rhDsqueres beMGRsraR andLTSRRU4 riq onnn M(;RS.CRSÎ S'IAÌ. TAR+.AR++,R
P:
rhesqueDesll!,.
.,rriìaEprcprúasd ore{tùeroiris.hqer(whùhkdf
,i *hrch 's s.d rn 6e smh. rd! ,hai Ri is khld lot ottt oE I lllrcîaî.
vc
7.8.3 Tùe p.tten spoce ,.pftm (
<,
< /).!!drheenp
! Nodeìsiddliùirs thefde,
. lhe iool.oíais
n\e empl} pdlqo (
ud oie.ompore
haEempry siìr.ùd qioot:
(ùc Fof: chtdrtr
rqrb (rl andmùimumrorarflerìbiìiq. .ai
ftrj'iùinùibjliyÒrxvi]dgrcgiol
r t
! rcduE iof hndi.s rìl o(ùrcm$ of rliÒpfuú c tRo) r pnirdù! rìirjn3 dr sq@ne, ú'h ooftnces or q (/o) (i: rnd t is ii rheinpldnerùdon ! úmnon pfcdre)
.si rori :-irhdtù,',i+ln)do o = /, x(iì) i ,o:= F(Rr.0) 6nddreonrtm$or t a : = G l R o .H P ) n o h h bc ?aù'tt't rhen . b ù'de(Nodì, P.0,ro, d!) DFsn{odl,q.3o.tc) eì$rf 0 ,.J r.r risl, 4o@ rhèn
q
Q = o l r l a : = î \ 3 n . Q ) :H o : = c l q a , t z . , r t ) iî O is t0 b-'.jkh.La ù.n qùr.,roò(Nodr. 0, r!, ro)
DrsNodl. 0. ,p. rtu)
rindinsra,0(P) ! child-prrtnorr siîce0(P)isaexrisiorof/,'hcnBoi5rsùb*,or3p.r(ifr 0(p)bca/Èed ù,r. PC;rl {rìe'lìer arsonúclìs ?j,("). rhi n dorcby$iq rhe{rurcnce of p ù r (atqlys aPE6n.andss f the(i + r)ù rynb
7,8.S ,
,P = tArÈwÀ rrERjm. A..l.,a..{rrRmt. lhen rheocd,Eice\ trrchre, of 2 ù€ naked. ùìd
rhcn ro:ro &tur* t/r.fvw, .fiìcìeÍlr
Jbmd by kt oririoN
r c E L E v H^. E E R v v R l . ù (strú$ ol) ,p rid {lùbu5 où ., i .,."
r; rcq a\y r, $e rhf ,. in. alh,ór' ex'e.sio!o; , (r) = F-x 1i, j )-a j), andrhc (r < ,.ú b! rom,r by ùrir'gth. *c rn rh! 6rcdc\ù$ùo\ (i
Ba!\b=
l)
R a r ù = B o " , - \ l t . JBaa t o , . hccilcuLftd by dyÌiÌnÈ plo8mnmins(*e
r i r ; ' r ' i > r i 5 n d n 6 ! d r ' o ! a n y t i . i () P a n n d b q r c i d c d b ! s q ! c n ! ! rr.bì'ì! î lear i sqùeie, dd ir "
7.8.5 Ambiguous components
tdmd*, c.3 P{(ir-b.Nhùbisr r(i.j)
'ri'e
-'rl'r'
0 , t ,| " " p i
ù
T{omexof|p!.irL'diÙndÙjcd:
r eireiding úe pàreFs b 6e fishÌ (bu' nono 're h1ì).
l r , 2 ) c x { 1 ) D a l d r r= l a ( c o D a o . H,a-\cHis usd (s* Ficu€5.1),'n a i xl0,r) c x1r) D-j. whe€i=trrKRl $d j:tNQl
r ! Pf'cm P ú \PmiaÌized 'o 0, ir i
7.9 Exercises
nenúityis2.sidlhe!Ùjrùn$}o ln) How mriy pifteùs dos rhk d*s coùhinl (bl cD r 11,2I DÀisrprrdiiùè
ri = rD = i. r,. = l. d 'h. *rcb (a)Findn\escoeorrhepfrm c-r 1r, r I tÀ.1. (b)Thep.ù.bìsc-x1r, r)-rÀct riJa x(1, r) tact vÌ.oqtrrry.
rhepaiNise lcil xlignmeft rouM ror Gr. Ji) ft
a panenr G) wfne fof sch aùsnmeDr = r ) 3 q r ! J r 5 +o5i ftc 5 tFisù! (b) chm!È thr hishè(*.on!g p m
(o BsLdor rhc$ùcs. d6dcvhrlh (d) Dù 6 L{ irsnùs,nd lúd 'hcfc
lo arie rhspdrcm Lhdyouhdc foùd h ruh of úc on!ìnar$qucn!c\ (0 vrè {heprbn Ì! x PRosr'fEprtn
n rPPfÒkh(Pnt). we haw ùc ùLphibd = (rhc : rdidm nùmber of Èrch in dìerÈ). th( , 5, 2 {4, c, D), 4,ú e P i v e ù : J ! M c D A c . r : c d c D À J ' : A DC . ' (a) DR rhefaÍch EÒ ùd rìid 6r lo ùq c*,À-x10,1) pxtrenì À-x 10,r ) t.Dl (.) rxo p!trds (4. ,l) cin bc .odbiicd ir / sgmen! mr.hins À hrt
ecuren.a nÙs bc < 2 ir ùc 5. EipldtrwlìylquúiÒno.7) is .orE dd É4irP). srcùfirs ,e!.,LrP)
7.10 Bibliographicnotes (r99qlMorc.boutÈccitr ùd nmuÈ ( r99?Júd stPhen0991) s de h cnshsìd mdd (!qs8),ÙdÙrmPhsc'i h'ìùlid Jonas*od,l.(1995)Scqtir aidrj@i (r991), io LNfctreeri (r991),Neùq?ld !! 0S96). dd Durbù(11)95)$dBuzmad is dscnbeJì! lomsn d ar (r9q5)ùìd Jona$oi(1997).Asuacyorddùmi.i (rgcs)ExamrrcJ oî.ompdkonbsd Oeeo),vùsorùda4o:Oeer)ùdRoyrbùs(lee2)lji'iDlesort4trúldlcù (or codbii.d) mdhod' !a in Neùwild a ath ( reea),romFD d ,r. (leet. AftrhodNù3 sd wolr.nÈrù d d {res6)(rofnndci!lcidr) rùissi ( 199?) i,itritrúl deraiprio! lasurgr $ ù Bflùx d rì (lqg6). Oùd meúoù fÒrnrding mdil] {r iì BìiDchrb d d. (2000)(for orholosoN *quencs) andPevaÙ ind szc Omo) .nd PaGi Ò',r. C00l) (}or DNAsqùenet. *rY n ùe Cibbsnotil srnPlq: hnP,1,{dtb'ffdsvorh'qg/gjbbs/eibbs'h'm] ds.nbedinLrwÒi.cd 'r. (leer) dLiud.1.(lees). dhcove4isin vilo (1993).
Part II
STRUCTURE ANALYSIS
8
Structures and Structure Descriptions
rn! tbr pDnii (mobiìr) md caÌirysisl vhile dhc6 ae mofe pisne shrcruml
snìilì ndo,. no\enrenb ósociitd siÙ\ fuidioi o rmaúÒn d /ò4,3a r,'dr (H bonds)
thùÌrr)
8udúdi{E\ b llfsc
1'oi|y2oleside.lrdi!cùfomhydrg.! ièldd ri !s /ú/!rrri. G poì.r nìoìecùìe kp rdr.][co.búaFlÙodcUlclrn dir, f l)lniatLy dì?4edl. The ùúìo x.ids
I $,i'tr {ús (4ueour eùimme . i.. or f ùìe hldrcgli boidíis i5 rhd 'bc (srob!tr) :îÈinsoBis'oflhy']í,phirj.ùn!!
(dq{ 'ó ùdividùl rcsiducsùr ro'ù,
(nesa!i\r) H-bÒrdscan b.lúned
i
p (N H) (posilik) aid .arboxyì sroup (c=o) be
.aneddÈdlìe|ixa|dÌher,{n'nd..oPlbq
t
i or dkùlpNde bndles (bords) hdveD ùc inrgml pús of rhc (ructuE (cdlxdÒrs),ich * ùe uinc rom ii Fieure3 :r. bal úiviry, ii rhr ùrevhole pmEú 1q d bsl a rxfgepM or n) h !ffs$fr. Fo *m'laJyrructuaelement(ssE q
rhc compd$i Òrsqucnrs h al.ìor shdùas is eìtheron ar& r.x/ (morly rcîduo or otr a @^! r4d (rìorìy ssE). c dìyidod idÒ rso l]ps, dep.idine on rhc findùgimpdùr.omìor
localGpÍial) sìmihn'ics.$ch asr!L!csù.1dbiìdhs
sedinrd!$ìryùs
npoddr ùnirsitr(rclxÈd)rructuE' a NdeihriashduÉsnÈf]$n'ilì
rhopiÒreiB iÍo shdùml
tupdwd
Èon ÈdMontr à d (2@o
'lrì. PDB(Prcriù Dtu B$k, hlll://*e{ $b oqTindcxh.nDd Resarchco at ordoryfor stuúurar Bioidoflfrics (Rcsr) n ùc nlh pubrr ùúbÀselor pfoÈiD n& ùhireolqP.nmeÍraìlyddemi hdnsed'lefoxÙwirg.h,ptf.dcs
8.1 Units of Structure Descriptions
èús (ssE, A rrsncd n úc rn.ord or a
roPorogyn úe demeDÈorderrìotrg ftDp€ni6 ofùeeìemetrh.e s. physic
a/ ,Ìd?4 ncmiig ún 'bc dù{ript Òr b!5 scnPrion caù be oi .là? (r.') t\d or a
seenlè$ipliois*nilofspdfyinerherchitcfuftmdbpologyofpmtiis' For ùc ,ùc lod n n dooghy :PÙllii
8.2 Coordinates
'fhc turdams'.r ùreedi'ncnsiomì(3D)shdurc d.ùiptior cùr\À's .f 'nc sp.ci , i$ enei tu 'be PDB GeeFisu€ 3 2) rhe Bon,ncc(NMR) NÒr rbdft 5hd b.Ùitr'jrlyinlledeÈ[jrrdo$'Fi
($! Fjsufc3.4).rlc ridcchain s sidc.hains n,rkúfivelyÉpftsúedbyllÈÙ|sideclìjjiron
8.3 Distance Matrices
.!rir (Eiro! ìmrgcs) (s
Fisúo 3.6)
x rnb.Ò nftir ùflriB (morc lhu).nough iilno,. rn oirù b indicf.lhí rior 1Ònlo ùud rlìc rîu.rLr (.r..pl fùr bxrd!d.d\), thiirr or d st or poii6. hde illp.i,wise (Eudidein)dis'aies iE nr{orerchpoLnr.e.s(ar.,,).F @srdú s.s ({o rúr ù.h pÒi ). aid ùnc rquÍiÒa cb be derìied on rh*e (otreror eachdishrel. uo*evd. rìy tfecùce coodÌre)
,
. ,r,krf,n!e!cqù,
@
sm(!f h!
::rìr.doìrrrscunkiowNGri,i3rhcdi5hnlcsdisQ r)anddis(r..) sohitrsrhe nÈ {ay. rìd by tr\. or rtu dirù,cc dii(c. ,) i{htudn(a.c) \ sunùiuousÌiloì :r!8hk,h$!(l +?)+r(,' -3) =rr
6disbiù! nbtrordnbccrcquncd(4Ì-r0(Err
... r) wboDeoùs ù 6e oPPosÈ d
(c) !È. rts tùlEs
irc zhc dons. (b) ftc ibc
d numheror coofdinds (Ememba dú' sÌr 'he dkhùce bd**n Mo poi'ìs).
bdw(i
dìc I doms lpoií9 ùrc kiown. k !ù 4rculdc rhc Edùs from m orisin
r. o lrich fe simrrrt!h!n 6b rb cd fo
t) =itti+óir
d i ) t a ft t r r i
. rd rheongin(vhi.h m beroúd by useofl-igmrye srh.ofcn).'Ìì.i dc6iè ùù r x N) mfdr c of 'hc dcmù$ kd inrdm4ioi n NMR cucnnens, whch tudmì $nshims ind úc m6súd \rìucs. :oÈ drùi.cs dr b. r.ùrd (r'c!i, fof : pmbrerscaled rre oahií d úhon plobten(csP).)'rhc con{fliil1 ùar ìn nndure prcdidion.somedhtucs ft rr|osd. ard by usìnsÌhecoisMift (s lù NMRqFnmenÀ), funbfl dirmes Èd whcrbffrbcrusúins $( of disúnÉsun
rryiig a < r < 3 [ehd. dc.
),
;
WcúcPhredoÈ,9@sùolgiPojtr\.,.r
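A distance matrix is easy to compute from coordinates, and doing so shows why it cannot tell a structure from its mirror image. The following is a minimal sketch using only the Python standard library; the four example points are arbitrary and merely stand in for atom positions.

```python
import math


def distance_matrix(points):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


# Four arbitrary 3D points standing in for atom positions.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0), (1.0, 1.0, 1.0)]
# Reflecting every point in the xy-plane gives the mirror image.
mirror = [(x, y, -z) for (x, y, z) in points]

matrix = distance_matrix(points)
print(matrix[0][1], matrix[0][2], matrix[1][2])
print(matrix == distance_matrix(mirror))
```

The matrix is unchanged by translation, rotation and reflection of the points, so a handedness check has to be made separately when coordinates are rebuilt from distances.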
8.4 Torsion Angles
PmÈiiis a sus*sionof poin6Gromt n '''cr r=Nr{i{r=N,+r{i|{J,r.. .hr lhey ft sp{i'icd by sddz trd schnnìer( 1e7e)6 La7 ror \{ . 1 jt fù' r{ ùìd 1.3?ror c N (i'ì xtrs{oùs). îltr rngrc5bdy lcn ù! r!, b]Ed: of ach :r
(in rhcblckbnnernd:ide dúint. ceîdrìly, ùÈ poinr hr fùùld úe bÒod5
abm cr in F4ùF 3.7.nÈ m!ìe r ùìd rh. ; rme q,{, ft oshúi bcùcc'h. ::: poùr (cr r. Nr.q) i\'h! aqLcrj, íÍond rhebotrd (N,.q) (rhe sb orL\e i: (cÍ,cJ) rothepldeor(c, r. N,,ci,
n nkè tùnd
thebaú (N,.q). îhj
.IhefaedoDNJ!'ihftli'i}cfu'|jn
rnd(9.c,)
rnislisrc isdqo'cd'r,.
nlb^ ùoàd.
[email protected] FAn iLnB4
ùe 'omi {ers.
! rhe fa.dùn ctfr hasrcrarrve h ns rhebotrd(cJ N,+rl Ìn\ùep.pr, .i ed Gpplr,mdeìy) 130,.r[.d ,r
sibiliri$!tu(appuimrdy)0..jltn
ùúú.e ot a prcreinis rt.rcfoÉ comprerry
tr ptor (mmedafr{ ùìe rìdiú btóphyrjcjsr. (j N Rlmrch dmî) is r plor yheE rheatrelcs(+. !,) aÌc rrored. tn fieù& 3 3 ùe reddry !rLoqrd {ìùs or tr. /) rÌc rlored rtors wirhrhedisriburo. oi ù.
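A torsion angle is determined by four consecutive points, for example C(i-1), N(i), Cα(i), C(i) for φ and N(i), Cα(i), C(i), N(i+1) for ψ. The following is a minimal sketch of the usual atan2 formula in plain Python; the helper names and the test coordinates are our own, and the sign convention is simply the one this formula produces (0° for a cis and 180° for a trans arrangement).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees around the bond p1-p2."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


cis = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
trans = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 1.0, 0.0))
print(cis, abs(trans))
```

Applying the function to every residue of a backbone gives the (φ, ψ) pairs that are plotted in a Ramachandran plot.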
8.5 Coarse Level Description
llrcdwru||3Dshrpeolapfutin.ùc]r,d' 9s1ì[maresftú*rÙQd&$l|yProÌ'Aiis cùr bc ò{fibd
by rhcoreaizarioi (oriir..r'r
8.5.1 Line segments (sticks)
xtrdrrnrl,sr) d. th! ssrs. dìùs
4,5.2 E
I "i
r (!n ordde ù!a5 tu !{roù
3òe sÈrtd y ìÎosed uhs or (ú, t) (darrd
rtb,rar
:!\idúc or rrtr ssE (orof r rdg cro hy r r$È|qùu$ ndùod ($ Fi3ur $.9).
8.5.2 Ellipsoids
s'o desdbe 'he ssEs (or iry orh{ rngmeD6)
.c sedoi 3.61,d'ea rhelorgsr Jj +ocí (ii.k) rpc$nudor Îidcllir
0ùc) ri ! $' ú rc5id6 ft!
fuid n osúè
roned syhydmger botrdsLdHbond(i., mùnlhf'herci5iby&oembondbell ercùpor Esiduej. Hbondis thusa ìosúr tundioi whnh ù tu! oh.n rhct i\ ù
8.5.3
• An α-helix is made by successive hydrogen bonds: Hbond(i, i+4), Hbond(i+1, i+5), ....
• A 3₁₀-helix is made by successive hydrogen bonds: Hbond(i, i+3), Hbond(i+1, i+4), ....
• A π-helix is made by successive hydrogen bonds: Hbond(i, i+5), Hbond(i+1, i+6), ....
5rsrid!r=
úRnachrndrun
dúgruF (Fìgúe I 3).
8.5.4 Strands and sheets
groùps(N H úd Gc) ùe i! rhepbi jdur t in loolhcr rlcn 'hc budjns of 'hc tsidEfoi'lehydrccenbo.dsbl*o $a$iv.hydrcgeDboids: i+2)1,tfr ìid((i+?.j-2). Hbond(0+2. ì+a)1.... ttsbond((i. l), Èbond((i. wiú ! rùsL. ir\jdtu or dic úIcr 'ú l H b i d ( / , r ) . H h Ò n r ( r . i ) t . t H h ò ^ d2( (. i + 2 ) , H b o n d ( ( , +j r . 2 ) 1 . ;cd ($ iú bdh qild ftc idúrtrd
rÍ8od $ditDs
ù$c c dfrPPioriE,Ldyd=
(TOPS) 3i,5 1bpolosy ofProteinStrùc1ùr€ :1oBr i,r.ree{). oir! rhcssEs rcil
andùrjPdld
bÒtrd.
r?0údú - +r20
Fi4G3.r0 'úPs cdùd rù ùr irudlr rdc. AToPs,iasrouil!foma|jàlionol
8.6 Identifying the SSEs
f rheidenritìcaÌion of rhessEsh for m sy qmdy Ìdsdry {h*
ùr úe rdenriúcaÌjon is prcbkndiq i e. rÒ m ssE rd\ ùd sbp. ìs uniEÉr delìniÌioDror ssEs! rherew mft.Ecùod6ideÚifyssEs'md$me 5crìh c trighxúd dirù.óúìd!úbly
8.6.1 Use of distance matrices
rlrr cú bcSSÈs.Forìd@ìteda-hcìi u l a r d r o b3. . 3 , 5 4 , i . ró.. 33. ? , 9 . 9r,0 6 . y 6ing d ide izedaEe p:úrror d helB c abns (ssúon 3..1).Realheìics usùfly s,x\ údvr in rarrrc3.t(a) whw 'h, hdix À (R\idues3r $).
Forarid.alized,innd 'hesusssie k j\ gùdiiìll
t d\lib c 1Ùj Ùognj,c adj $úcem!
(Fsùc85)
HÒwer. r f or.i
ùúiducs r3-22(lr),2129(r2)ùì'152 :6 rrr), Ni'nsdpdùlèlcoùrdioisbd*ed /r rd p2rndb€seeiFr iid ,4:r. rcsing rhc n!ú di!,!oi!ì, !( ìooÉ (*c FiEùt lj 3)
8.6.2 Define Secondary Structure of Proteins (DSSP)
dodiry(ddior) ssEs1ìrr shcturcsis pmb rinsIDSSP) by Kabsch and Sander (1983),
a Q! 9)
. an i{L s bd h{sd = : rÀ., : o. {d 05rornd G.ùdrie).
psùqd fón (ùsh
ùd sndq ( Les) by ùc
pNpse.DSSPaìlovslolan'bdl,Ig bìidirs d.iÈiÀ ollès rìh 0.5kùrmor rde6re!bond. BurinsEadol urine
refgy rodd !rc ihown in Figurc 3 r l
A minimal helix of type n (n = 3, 4, 5) from residue i to residue i+n−1 is defined by Hbond(i−1, i+n−1) and Hbond(i, i+n), as illustrated in Figure 8.12. (se Fis 3 12) NoÈ s orfiniElrhdi$
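The definition of a minimal helix translates directly into a scan over a set of hydrogen bonds. The following is a minimal sketch, assuming the bonds are already known and given as pairs (i, j) standing for Hbond(i, j); the function name and the example bond sets are hypothetical, and the energy-based detection of the bonds themselves is not covered.

```python
def helix_residues(hbonds, n=4):
    """Residues covered by minimal helices of type n (3, 4 or 5).

    `hbonds` is a set of pairs (i, j), each standing for Hbond(i, j).
    A minimal helix from residue i to i+n-1 needs both Hbond(i-1, i+n-1)
    and Hbond(i, i+n); overlapping minimal helices merge into longer ones.
    """
    helical = set()
    for (i, j) in hbonds:
        if j == i + n and (i - 1, i - 1 + n) in hbonds:
            helical.update(range(i, i + n))
    return sorted(helical)


# Three consecutive i -> i+4 bonds give two overlapping minimal helices.
print(helix_residues({(1, 5), (2, 6), (3, 7)}))
# A single bond is only a turn, not a helix.
print(helix_residues({(1, 5)}))
```

With n = 3 or n = 5 the same scan reports the residues of minimal helices built from i to i+3 or i to i+5 bonds.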
!
,iE hydoF
LhL$ct
6od\ 6. Esid@i {i, , ee rrovù
of 'he,ril8. is deiìmd:
- ì.r) andHbond0. = tHhotrd(i ì + 1)l Ò f l H b o o d ( jt . i ) d H b o r d (j i+ t ) l = tHbÒid('.1) d Hbondu.i)l r Hbood(i Lj+ L)ind rlboid(j l.i+ lr. idlc i !tur úc iiyorrins r5idùù t sbids 'm|dsrcal|o{edby*pli.ildefid'júo|
ryùe ^n'ipanììeìjùde.(i - 2,t + 5) ad rìprdrcr brid-ac(r., siLl,ro,dxdipr.rÒma hdsLinjnnd.j..
STRUCTURE COMPARISON
ù,"_,r<: trii:;:t!,^
idr$cr 'urcrtr4rh!de.tu0)ùdidù
3.7.1 I ùioù
[email protected])
n md ús lnÈd 0D d$ c) or bc !*d 'o
\ih
{dÉ
rhdù!
(or
8.7 Structure Comparison
moi rbsùrc'urct. in 'his cire f,.,,.
rcilon' andfor rhedNoveryornìorifs(comDdiri. conpditrs sh.tua aD fsd
]ìlmeùodsFrromì'hiss'epdpìici'ly,
rofsutuc rio|/paner) înd hnds(localor global)
dscdpúof (d i prù ofdsriÈ
prir{iserrì!ùrd b úùr'iprè rtnmm' 6i
8.7.1 Structure descriptions for comparison
iirìil
irÈs x€ 3où!lìr 1. c. ron crùup, asidq
Écondary srucnEsi ^òq. !K
MaÙÍrÙll}'1x.lhdútùldipliÓnbbc
los(.
or iÌchircctuE (gcomcty). ropoìosy
ri o dfl ro prcvide rhe coiDùhon (p .n di5qrycry) igtriùms viù a so.d ú'i,j||J.tr.J'\'.'|"''P'p'h
irèrcùpturcii5(o.lhii r DUr,f',
rrucùFs dìouìd gd difleH' dscripdùù.
). mis poúr
SIRUC]I'RL COMTARISON dcgiPliotrofnrurÚswi'hsimih$crine ^njhú|Nrybdgscib!emPl*orì picres (ùins) d'd 'o d.sùibc Òr.h úir $pm@r' lid (mon or'en) 'he retarioDship ,rsid!.i rlBaafr
.rrEr re ùsd ú dre prioi !ùics: rbo (so!p). fcsduo. balrbonc
rncÙa q?e.cu^irù€ rnd ros1otr. 'ho rbriÒns fc hin,ry, sulh $ gcomcr
Ìhe shdr 5ibLyÒ!{ùPPù$ gehdnary
n bùrd
n ri!ì'lcd Nb (Ps-
de6n s (ii dfte dinersìois n .ould b! rubci
r!)n,rì$crPrs
Fq
aidsP.cebi$dde$dplÌoNml}'sema rlri.i húwt.n dsncns ({jnbne. dr4rion, y ljrcn in crcmc
andrhrn iìnd 0o.!D Pfu$
conmon I
8.7.2 Structure representation
:i.b nfu (crc'idú or (n) ir fcp($Íred ampl.'{c|'rs]dutaiboÈFcsddbvi's \ ! ù{ (qdùqi) or si (modùed) or
orrcsjduei-2,; - I,i+ | siolsrhcr4r^iirheFsiriors )is de .ppropnfe orly rù nnchùs (conscurivo subfrires. rd I *irg rcPA poririoisÒf6cd
dùelhecoon]iir*ofùcci!fo'n'l prùdo fotrj oLuùl vcdo* (h! vcdo6 hdqeer ssÈdiis mùùùdtJPc.physó.chemicrlpmp
cd ront
h !ddi-
flfls,
'he dù
ÈpÉ*
dnqN.
(1) î!
@uodc orarubiddnhodcmD
ftis cú b. srn rnr borh spae RndcrùìùLb!$d
unr, hLdd6 too rrc u*r (úo oily s*
rlc qrefml rcpft*trh'ion). lfumr
ftp.
cd aid sors assieièdb be Ned ù a (ìú{ str? Geechaprd 9.:r). o cri. bèshinsì! lùr'l coodinare syf.r rr|yingomti.h,sùgùecFdindsol romcd. rheD lbr r prir or rcsidB (or. Iem ú.n \Fiaì neiebbou
1.d) dyraarc @gnmn rg dieine
úch shctur),
dc$.iprior fr exmptc. in
irh si'nittuNliriom nn LhLr{ìed'le ro.d .mldi,He ryrcmt specùìììashùbrÀ !( 6d rospeedùp rhe.Ònpddions lsr tr rM SÈú ncrhod(ctì.pd Ù 5), n b r$du6 (h{iie lraril| disrù.o sroN sonÈrhrcnì('d) ftc ùi!\
do mÌ .oidr
odtr Grorerh. hekbonL)of 'lE ùe,e
iscnb€d h aiorrìù (dboar) tum
8.8 Framework for Pairwise Structure Comparison
o Òbje.À.eachor wrridrir rPùsLcd by Ìo MoF fonniìì}, rù. sb obj@r\,1 rJ,. rrr), (n,:, r,!)...
/.,.. À d.rìicJ * ! u oi Pn\ r(/ , /J) = (/r.3r,) rio cquili'lcnft is cdled,i 4risùnlr ir rhe r ' t u p a n ri n r ( 1 . r ) ù c . ù r r i ù Ò l r , c , f
-ns (* ù$d ror$qucnet riior b. , ekft, aùginù' of (r, r). rdk' n
-ìffe|.mei',Iilio'ùd{sdedo
of roriisDns Ghrmììy)
dùndar
ì
, c@) ù! Èmrsioi ot Nlry aii LiEr.
a.+-'.\ '!-/
''
(b)
G) h Nr dioùy ú sh dua riem{'
l*c 'tu rcr).
rlieùrnen'bca\ m G). ed ùd hAr rbc rb).we se úr rbdprtud flìdùs (oreno'n eelì súúurct rc Í1r ùs smc ù (h)
11fÒ||oMfrcmlrreibow*gla0don'
h. dilnd
(rvhl!h, nr cxìipìe, is doE in 1lÈ 'nerhodsú chapù 9)
slolÎúnlra]$imdhod|oNmsnqùt
8.9 Exercises
cdÙlaÈúe@qdì!'bol!\eeojnsj
Lerdr ro$ionan,sìcsa rddùr b. ( ?0, 20),(-72 60).(-70.r20). ( - ó 0 l r 0 ) . ( - 6 5 . r 2 5 )r.0( 0 . 4 j ) , (1 0 0 .6 t , ( r 0 5 . - 6 ó ) , r( 0 0 . f) 3 3 b rìnd posibre cìdils rpoin$rcgivlnùwodi,ic
dd É
ioN,(r'. rù...., (a".Ìtwevaî,k,lnfr&
(a) wie a ùfrdx ror rhesDnol rheqùadLricemr rn
(b) lnrerd or mùihriry rh! cmn i ($o Fj-{ur 3.91.wdrc a ronnularor rhisflor rrc $c&.ding (ii stqucr() , $tuì
C) Drw a iìem ilìurntiie rhis o) rhenunhf Òrdìfiènrsh*.ropologies(howrsnmd\dànb..Ònbùd) s is ,r,ru, r/2 shÒvù\i3. dhmhúdÙe8d'opoloeyùm|y piÒ'cj'ronenhù€sinilallopd'acsiqPùirwh,
Èr úe srp sor is 0 (|ry Nnhoú
8.10 Bibliographic notes
s iîd ropolqa is Le\k (200r) (200r) de*ibe sered or rheropr$ dislu$cd
-rryloi
er il
lh.ture ddrbnrs, 5@B!$r!ni\ (2001). Eirhùì'idr d ir e000). sonEEYis Dnhnre $Òdeù, dethodsae dsc. bedin Avódi aùi Tarror( ! 9qa).r'hcRmi dEod d lrq6l).,nd nsùs forjudgirsrhc
dr
trsis rho'm ( 1ee.1). Ellti,id EPn5su (1e3r).roPs Ns rìót r*ribed ì'ì Florcs (reel).ad Nedrorruhmdic'NduE amprilÒ.s iicilb.nr aì.(leeer) dr 0931. ùd r ndhod rorddLmùù!
iiBasuudArhu(ree5).ndl.6renFuú d d. (l ets)a.de\.mpìes orsubnrudúrds.;p'ìoi ii E$rier d xl.( lee3) rd Baeley ind Ahmrn(r9e5)Dirc{ Jomssnd !r (r9ee),ady,niuker d iì. (r997)(s6qti î3ylorùd ofiso (1939). i (rec4).ch.{úrr.(1999)lìis6)r^tqaidrcvdi ossz)Gd9,cd,ìdÈrer!r reer),Kodìern(Leeó)($aùr):BasluyÍdak'iri(lees)(leai'E. À mdh.n rorrlildie flexiblcrmtms (o,bìnins ! hiig.) is'lsùibed ù Shabky
9
Superposition and Dynamic Programming
9.1 Superposition
t is cremd rì $ ,isir-r,/,
{pl4ol
Jqr.4 d.v,r'rr.,i(RNlSD)rùs RMSD
sr$fc (src) qù'vdcr.si for kccsdy
9.1.1 Coordinate RMSD
íp.aosi'ion
can be dom by r ,!do
L ú ( o r . p r ) . . . . ( d r . , j . )h c' h c@ o d :hè eqùvdeoe . (4 IÍùf ! srd p, iÌ. rairtirs dfthÉe viìùs). The pmblem
I )
lnùnù sodiiúri06. k)Dr snrtuGr o$ov' o erb prjr (c,, ,9,)(ud oftens. Ìo l). For x ,r{/,?i,,
(hfte dhùtrcs),
ùd I tu14.
/io, (M rsÈs, múd srh of dìe*. l a|d 1rÒt oìe rcùnonai itsobe pedornedin ùe opùfioi !6hd ù ìin!, dìediHrioi ofùe lineh$ rohecalculard fof.ich foraÌjon:cr Eùrersú.orm (cdhefter ij 1977).) rìî{ shiftiis úc ùiùojd: lgcons$rr .$rs) conìnoncmdìmte syrÈm,dd lhgnfi
oI .$h írucùe 6 Ìhe oigio or I
s.nh.d hy an orr4dnl mxùir ,?r.r (lD sprct !iú dúcmDur equaìÌo l. (Ther bdreiúìcngrcs(3)hdrhùvltucsofrhem.ùix(9)(cÈrb.dddr977.pp.530 5r5).)^ m!'nx ìs onlìogomr irùe s ú rhedkrùcs bereco úe poinboiÌhe sme stucùE de nor.haeed (cf. ;sìd-bodysupe+osiriotr) 'rhe romuìi catrrbercronbc dA.rih. vùrof r, lid !e *aÍcìr for a pah (R. /) vhi.h mùimjzs ùe erpEsion (asunine
ti i
-
9,1.2
L
Lr = '6î
+:-u+ -L
=r
=".+-++r++=
I -
,]:
"+,,+,
rt
0;
-G "-
t,l
I(,?.'-ai. *ca(ar.Éì)
b\Lftr)
.(d..4)ft
nN1
'o r commooùish a!3Òn'
lnEualon (e I) rftishrk sprified.R
D (RNrsDol lllclirts
:re Sirccrhùeis m ieed ro.rlcd io$cvr( ir ba . Goftrimes $nour) v
ùe needld findù3
ageorf rucbc /,tbea B\tsDD{! r) = 0
jr RMsDD(c,,) = RìrsD! (c. ,) rorarishduÉ c
hù,i rÒhryci dos rotineù chriÒ! (!hei arrNsghÈùe eqmì'o r)dRMSDD pr n bd{Èn 0 ind 0 2 (cohetrrd s'cmb{g leto).
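A distance-based RMSD compares the internal distances of two structures and therefore needs no superposition. The following is a minimal sketch, assuming the two structures are given as equally long lists of equivalent 3D points and that the squared differences are averaged over all point pairs; this normalization is our assumption and may differ from the one used in the text.

```python
import math
from itertools import combinations


def distance_rmsd(a, b):
    """Root mean square difference between corresponding internal distances."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long lists of at least two points")
    pairs = list(combinations(range(len(a)), 2))
    total = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
                for i, j in pairs)
    return math.sqrt(total / len(pairs))


structure = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirror = [(x, y, -z) for (x, y, z) in structure]
doubled = [(2 * x, 2 * y, 2 * z) for (x, y, z) in structure]

print(distance_rmsd(structure, mirror))   # a mirror image is not detected
print(distance_rmsd(structure, doubled))
```

Because only internal distances enter the sum, a structure and its mirror image score as identical, which is the main limitation of this measure compared with the coordinate RMSD.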
9.1.3 Using RMSD as scoring of structure similarities
rlùrs
wi'n b{ RMSD value(s) flow
I !tc!!.ù mdot úùpar^où, th.6T.krt I b tÈ 4'aE 'Nt ol ttE nunbú ot qu,4-
sD(E(11.r))/v4ù, whùc,F n ùr n runb{ otdemeDr rh .ùn b! 5ùPs-
b ifPfovc dd{rioù of ÉgìoÍ 01 linjtr bp.ìog},gc|trdiigrruduilrymrc|atdrcejùs
9.2 Alternating Superposition and Alignment
(avcr) ùrùdrnsrrohr,!h {ruturc is tussivù. r0 = (a,j,rrj).(/r. r, (rr,, ój,). ùre ÌhnrÉ r1sùpe4.!:rìo! (Eridus) ulm rh qo fn.nres (dlLÈfsFoosnion). c
rhci bc tr*d b ddjic a
a n$ cquricncc ,! oithe I equiliìe (ùrhgR0)!ÙùÒobÒi.ùnd Ffonlhrsn conve4eDe(r,r , = rr) or son. nìdLnu,i .r dù.dbn ùd ncùod andFìsuEe.7(a)
r
ùc(mrîimm)fimsromÍìonrorir
Ji5(4i.rr) úcdir!n( bdrsònrcsidus {or(rl .alolate a $ore nor x dnbcc
î := th. Ìan{annatùn !ù RúsDc(E t)
a' :=rla)
rùdrpiò (,,j) do .Rrr=scorc(dh(di.àrrmd (r, P) := ri(n, r) DPo! 1^. n) úi.e f (lìndpÍbr qihsor, Ep , = l t a t ,b, i , ) , . l a i . b r . ) l ,rì,il.rr+E/,, 2
ro= l(2t,, (a: tr).(,6.h).!!./nl.
1\
-Ihe
hsromdion
ror bef $pe4ori,on of (ar, .r. ar, d,) on (h. ,r. ll. ,r) is
dish.ès (lnr rhesùpeFosidon) is ùl
b|b,b1b4h5b.b1b' by this ari$mcn( {n'r raNjnsùc los t) L =|4
ht,i35, h4),t4t, h). tq, b!)t.
ry mosq
(RMSD $rru ruEbrÒr 3liFìèn
drenumb{ ofpaiu (aJ',,,j,) oo P tor vhich .hedisbmes ,r,j aE berov a3iycn [mn k:" !E ùc rosniÒB i,r ùc 4odr or a dar sp.eósìùon). rr 'rì1 way Ònly plid
rre $orÌng of (ar. ,,
fo. m*ùg rhe
$qÉÚeúùp.iennrghltnelthesinlaitybeMetrtheÙiioaidlypsof 'lr wo ftsidues G 3. usine . ?l\M maùjx) rnd rheìod shcnìr comp.ned nisht nly bdwLcolr'r r.4J.4r+j) aod (rj ,rr, br+') lxc sprLi,l sinilriry cú, ror d
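The alternation between superposition and alignment can be sketched in a few lines. This is a minimal two-dimensional illustration, not the book's implementation: the closed-form 2D rotation replaces a full 3D least-squares superposition, and the score `2.0 - distance`, the linear gap penalty and all function names are assumptions made for the example.

```python
import math

def superpose(A, B, pairs):
    """Least-squares rigid transformation (2D) moving the paired points of A onto those of B."""
    n = len(pairs)
    ca = [sum(A[i][k] for i, _ in pairs) / n for k in (0, 1)]
    cb = [sum(B[j][k] for _, j in pairs) / n for k in (0, 1)]
    dot = cross = 0.0
    for i, j in pairs:
        ax, ay = A[i][0] - ca[0], A[i][1] - ca[1]
        bx, by = B[j][0] - cb[0], B[j][1] - cb[1]
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)           # optimal rotation angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    def T(p):
        x, y = p[0] - ca[0], p[1] - ca[1]
        return (c * x - s * y + cb[0], s * x + c * y + cb[1])
    return T

def align(score, m, n, gap):
    """Global DP with linear gap penalty; returns (total score, list of aligned index pairs)."""
    F = [[0.0] * (n + 1) for _ in range(m + 1)]
    for k in range(1, m + 1):
        F[k][0] = -gap * k
    for l in range(1, n + 1):
        F[0][l] = -gap * l
    for k in range(1, m + 1):
        for l in range(1, n + 1):
            F[k][l] = max(F[k - 1][l - 1] + score(k - 1, l - 1),
                          F[k - 1][l] - gap, F[k][l - 1] - gap)
    pairs, k, l = [], m, n
    while k > 0 and l > 0:                   # traceback
        if F[k][l] == F[k - 1][l - 1] + score(k - 1, l - 1):
            pairs.append((k - 1, l - 1)); k -= 1; l -= 1
        elif F[k][l] == F[k - 1][l] - gap:
            k -= 1
        else:
            l -= 1
    return F[m][n], pairs[::-1]

def alternate(A, B, pairs, gap=0.5, max_iter=20):
    """Alternate superposition and alignment until the equivalence stops changing."""
    for _ in range(max_iter):
        T = superpose(A, B, pairs)
        moved = [T(p) for p in A]
        # Assumed scoring: close residues score positively, distant ones negatively.
        score = lambda k, l: 2.0 - math.hypot(moved[k][0] - B[l][0], moved[k][1] - B[l][1])
        _, new_pairs = align(score, len(A), len(B), gap)
        if new_pairs == pairs or len(new_pairs) < 2:
            break
        pairs = new_pairs
    T = superpose(A, B, pairs)
    sq = sum((T(A[i])[0] - B[j][0]) ** 2 + (T(A[i])[1] - B[j][1]) ** 2 for i, j in pairs)
    return pairs, math.sqrt(sq / len(pairs))

A = [(0.0, 0.0), (2.0, 0.0), (4.0, 1.0), (6.0, 0.0), (8.0, 2.0)]
rotated = [(-y + 5.0, x + 1.0) for x, y in A]          # A rotated 90 degrees and shifted
B = rotated[:2] + [(50.0, 50.0)] + rotated[2:]         # one inserted residue, far away
pairs, rmsd = alternate(A, B, [(0, 0), (1, 1)])
print(pairs, rmsd)
```

Starting from two paired residues, the first superposition already places the remaining residues close to their partners, so the dynamic programming step extends the equivalence and skips the inserted residue.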
E6iir!
(porlnicc$iry)
rhs ro!1r! rù
9.3 Double Dynamic Programming
\ úc iìdÒpciòi.t
fuqtrirhd
n vóhrrd
using DP (Section 9.2). However, a structure-dependent alignment is found by the sequential structure alignment program (SSAP) algorithm (developed by Taylor and Orengo (1989)). This is based on a method called double dynamic programming (DDP). The idea is to find the best alignment between the two structures for each residue pair (a_i, b_j), under the assumption that (a_i, b_j) is part of the alignment.
roqch
eìDP.andúresofincmùìx[s]ld,lg'tl pú (i. , rheF *jr be defrr.d . spare (lN level) mrrìe {!ft $ilL-qd!nuù,b$Gùrcl5hù"iishÒe
tuiìrccrhc(ov rc€DrLignmenrbeorùoùehkJ.'rr),rhcoduq DPisofnhm o Òpriorrplrh tum (dr,àr) 'o (.1,rr) ùìd hon (r,. rr Io (,.. rr, or by súns 'heso'c of kJ t) rÌ higr! vllr 'hf 'hc o f ' s r f o r( 1 < r < , , r < / < i ) , n d (r < r < D. t < r <,) (lrìrir,lhc krr rdrùdbóhnnshror,isir. ,i3,r.tù rs, íÍd hnir(hishlevelrd fmÙniigi!'ro[cbylú'iùelh.qnri6ú {ch 'hf Gr,.,?) Li5 o! rheoprimal0o{ revel)padìwhei I s i\ ùscd!s 'rtr sd n3
S := 0
for all pairs (a_i, b_j) do
   (s, P) := DP_ij(A, B)   (low-level DP, forced through (a_i, b_j))
   for all (a_k, b_l) ∈ P do S[k, l] := S[k, l] + R[k, l] end
end
(s, P) := DP(A, B, S)   (high-level DP, giving the final path P)
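The two levels of dynamic programming can be sketched as follows. This is a simplified illustration under stated assumptions: the low-level score a/(d² + b) compares raw difference vectors a_k - a_i and b_l - b_j (so both structures are assumed to be expressed in comparable local frames, whose construction is omitted), the gap penalty is linear, and the names `align`, `low_level` and `ddp` are hypothetical.

```python
def align(score, m, n, gap):
    """Global DP with linear gap penalty; returns (total score, list of aligned index pairs)."""
    F = [[0.0] * (n + 1) for _ in range(m + 1)]
    for k in range(1, m + 1):
        F[k][0] = -gap * k
    for l in range(1, n + 1):
        F[0][l] = -gap * l
    for k in range(1, m + 1):
        for l in range(1, n + 1):
            F[k][l] = max(F[k - 1][l - 1] + score(k - 1, l - 1),
                          F[k - 1][l] - gap, F[k][l - 1] - gap)
    pairs, k, l = [], m, n
    while k > 0 and l > 0:
        if F[k][l] == F[k - 1][l - 1] + score(k - 1, l - 1):
            pairs.append((k - 1, l - 1)); k -= 1; l -= 1
        elif F[k][l] == F[k - 1][l] - gap:
            k -= 1
        else:
            l -= 1
    return F[m][n], pairs[::-1]

def low_level(A, B, i, j, a=10.0, b=1.0):
    """Low-level score of pairing a_k with b_l, seen from the assumed equivalence (a_i, b_j)."""
    def score(k, l):
        d2 = sum(((A[k][c] - A[i][c]) - (B[l][c] - B[j][c])) ** 2 for c in range(3))
        return a / (d2 + b)
    return score

def ddp(A, B, gap=1.0):
    m, n = len(A), len(B)
    S = [[0.0] * n for _ in range(m)]            # high-level scoring matrix
    for i in range(m):
        for j in range(n):
            r = low_level(A, B, i, j)
            # Low-level DP forced through (i, j): align what precedes and what follows.
            _, head = align(r, i, j, gap)
            _, tail = align(lambda k, l: r(k + i + 1, l + j + 1), m - i - 1, n - j - 1, gap)
            path = head + [(i, j)] + [(k + i + 1, l + j + 1) for k, l in tail]
            for k, l in path:                    # accumulate the path scores
                S[k][l] += r(k, l)
    _, pairs = align(lambda k, l: S[k][l], m, n, gap)   # high-level DP
    return S, pairs

A = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.5, 0.0), (8.0, 4.0, 2.0), (9.0, 7.0, 3.0)]
S, pairs = ddp(A, A)
print([round(S[k][k], 1) for k in range(len(A))])
print(pairs)
```

Each low-level run costs O(mn), so the sketch is O(m²n²) overall; practical programs restrict the low-level runs to promising residue pairs.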
a\ Ns (nhior s v,i or viuc\ ton
mì.ricrcú'Îtti\x3c[Ù!|bù|4ÒÙì'1
roll€d.dependi|e'foldùnp|e.mbvde
:\
9.3.1 Low-level scoring matrices
\rrs anmdhodsforcilcutirireus! Gho* lhoiccorjardt.oie*a),ordeiirins'hc nemsf.j mdarrj. !.s. br ùsirgth. c,,o0 :,rd1del
sysÈd (rs [Ds a rheydo nor ìie or r sàigh' ljnr) îtr oofdùdcr of
s showri^ r;tue e.4.om simpre sorrng
ea (rr.,, GÈFlgùF e.r). br4
.c i'ì thc d;rdiE d rhsYcckF 1a,.dr)
aid rr G![s rhcrilhbouiie rsidù$, À ùÒ!c). ilo
'bc rrucùEs
.e ftunes f 4 anúrr, fld n meóucd
^' j +lr
so i . îrìisconponsr
neighbous ii rh! s.qr.!c
(èi rbc odchmg of tocat secotrdi4 shdùct
bbehightrgh.|thcyÙtnc{4i0d,
hòe eqùdseqùsdarrisnrìamd (hr
wcac ù!i úc dqrarc of sG. r) ù higÈr ú bù l ldlltèd s x dùrsins runc'ìÒ!Òrd/l + rhcqEdbúion Íiom onbìned targedistues in spre. hghcè:grofd'.vbbÙ.'rÙD sprcebùrno' nu ìn sequem(i.c .ùdird.s rÒ!bci.s ii i si,oeùnd c,rr.
lùrdior GI *luès beinsir ùe inNli
(d rdt
t0,rt fof nonnegriveJ atrdr) À rhecaùsjú
s b bù s.d n I fùins fundionandr M .odù( rd cÍh conponerr.&rìningrhesìopeor I GeeFigùfee.5).fte
9.3.2 High-level scoring matrix
Alì'tÙglPPeiallyÀNeJT|trylndrcnof r h,Yceapsil rh. -Èqrd,Ìesions.
9.3.3 Iterated double dynamic programming
rudr
(sar rlisDmenr pmcra'ù
úc 6ur ì powerolrì!fqErcelere'h (iorwopmcinsoreqùrronsh)6n p.rrorA rcipklhdÒmFing|hcmvi[mdidJl n!DUN.loeeúÙtfuredloasdÈvd,m
- r !\ boturc.ind . ,ù, di6r
G P):=DP;r(!.rr
0. nì
L!\.teEì Dp foÍ$,r$rNsh (dr.rr)
nldd! a ,\itc d,l a úù ù s l:. f):=DPal^. R) E = rhd n4, rut, b^!d ùr a últ boúútù\ | ùùian k sd^n à
r tsovjs 0 iinúl'ed ard'he*ed srededr . Hos shourd c bcqdded?
!v ,.
eorj'nù,] nsidr prii5 (iid oó{) 4n bc on s$ndrq rmtur rre (or voùld lor mmnÍy !a' Ìo onpae m srphr.herir wnh a Éh.nEnd and blrirì (hos wirh niìù) bur d..fpoicn' bs{d on rhc lnmo eúd{ú'ys!j9&Acd.siYir!r pmrur (o ensùft dìa' aìr 'rrcc hrvc I rdsùbìc uìuc) eNing a mai.ix 0. vN.h n (0 is rhùsdermùd rry us of ùe n\fe componeft) reiened 'o îs rhe r,i.r ,",^ No \@úc {eigbr rE inrodEed. Ìîsrad 'be t úúsfùh is tr{d ro 3ir a rcushry hc p!i^ s,iù hjlhd
\!tuA m 0. or for
ser-riDcp'i6 rùd ùpd''rng 0 Îchishe\vi'lu$ir 0 rorerch.y.le tu ìysnúììtrtrmbcrdrpùn\is{r.rd(r0 r0). rhcÒùry(p!sc orcìc\(hopcfulyhwaÍdrrh. truc rq!ird,n4) 0 is !*d as i bNe roL iir'€ùèùFr Évnioi (úÈ br6 tubir) 0, n úe bì$ úffìr ù qr e ,. 'hs i\e
r
aP+\= ot t2 +tast1+l st+ /zo), 'hchigh reveimxrixfr s) relemnry 1rg. lQ)
n
bc bhs hish ÈEr sNd q Dld
t :tuód) f d$ $G..) 'hr &i {[su{r
a
u (br _5.r=r)
trrj ik Ly qtrùÈ r{Ìhù,
,tu
nbuionflonrhei, 'i,ìr0 rùrù!h ibdiù rLgtr'ù.ù.r|pîù!.hrns Gf íkùiry) d prcdonùedt r. È!ùs
ùd rr ù 'he d4ms'
(or compìerìy) non !ri$
' ù or rhcst! or rhcrvo pdeìm ki ùd ,, $d úe cyce nunbf (p):
À= r0++.r; . = o f{ rhcmiriar.rde). we s* L{d
\sc I js rhcqlrc rumbqad É kih bis frdr h* x lur conhhurion(9(0, l) = 1).whichdúcù\c\ ùi'ir J|rcf'nÒjìrù1 .ydcrbÒris.rldirÒryiÒ.oiEjbuionfrenrhebìîsmnix(t(5. L)= 0 032).Thh inb'hefil|gìohlìvi.wpfdidcJbytlf
(hish-rqd) |lrh h crde p + 1 Thcr ! ncw q n .ùddrLd, nd Lhcb
r=1 cdinrisùr9 ó f bcomcst.rhcneiúìon ded pài6 n nqc rho rìe reigrhol dreiG
9.4 Similarity of the Methods
r equivalence. sar dos ior ned 'o daidc (of Asú'nc)oi.cxfrricnm.itrù.dr.ùÌ3
9.5 Exercises
1. Regard two structures (A, B) of the
I I I
I J
ise.r
G) ouúhed dsodúD*r
ùtrr! c!{cr
ETERCISES crrcùhrèRMsDc(rrgh' l) ror 'hè by rchlirg oDeol ùe \ÚudÙs''
^=t!t,d1.4
!1)
hd
Ld ùe Dori.ion of (aÌons Eptsr'iig)
G) c8rc fu rbÒsúo.ùjal
3=(t'l.6.b the Bidres
rj,l,5).
be
ccnrB. a
(coífor: thecoodinds ror( I shÒùld bc (0, t.414).)
(i) shN rnd rhc,nfnÀis orholonal (iì) RdfE !l rhemrdinres of À. ,i (codÒr:rhciù{ coo irdsofd,shoùldbè( r. r)) (c) Dentrea dh'bù ù[kii D ror!u p|in t': 4), wber 4i n 'rìcqNfdi (d) Defiie i $oritrc ùbii
Ì by dlvi
G) No* 6id úf hths'.$ofi0s equiviìdu (subrignfcí) oj rùnsrhr. rurci / = (,r. d1.,r,4)
d, - (/,r,b!.,r./)4.rrl ii 2Dsprcefo de6ie
lhoose Ìhel)ri. (ar.,t loi lÒersd dy
1 - 1 , 2 r , r 0 , 0 ), ( 2 , Ò ) , 1 r , 1 ) 12,1),
( 1,i),
10,0), 11,0),
(al Fin h r disùnc-ndnx , rorallp rúiig mdù( 4rJ. bydilidiig 1byrhcdnunq ii D (roor dùiiú,. gipp.ù|ty.and]cllhegippÈúlyh les inr r.!.\rhs la: ,, kiol.ùtr (b) choose(/r, rt id úc ncwror rev diiúr\ s rn (r). ro fird the n*, ordj'jú$
rór (^). yor c.r ùc rhc
olìgilJcoofdmfesyf.h.
ardr Gtrdr !ìi: :rong.t. . c bc6e.nsìebc'sùr x{d xi, . R = arr (ùeÌ.60fdinar or!t, . 3r =dr) lhe ) .oo.dùneordr.
. srr=.ùrx. r) =coto),
. 3'r =colx.I ) = c.(eo+ d). . lr = cùs{yrt = coleo d), t !, = colf. f') = Òló) tr(r,)) xÈdrcodjnd* or.poir'i úo ùecoo iiars (Ì,. !,) in rhcne{
t r =sr (r 3r)+310 rr, r r ' - 3 , n r R i )+ s r r o ! r . eordiilè\$tm'foltdlyiig drmm. pócr.mmù3as*pìaii.d i,i k). G) rhre ùc hìgh.teeìsÒritrg,ido (d) conp!rc rh. ùec r ìgnmeùr,,n
ùs ltu Psiu (ooeHidùc i' i .Òtri! u! r '{o of moE prì6) Disrmi wherhd j Eq@'ú (e.6)shors lÌs rbcbú\ m
r Ll is updn.d
rJuc À'.schúr x = rE(L + dJr+r/10)
(r) shor rhd whenI affrcmhesinh 2( Hinriùft úe eqùsrion s 4,;r = 4r'l2+ I en or C appmmhò (qh* sp is rh. v,lE Òrrhècorespondi4cdl in cde p). rha lìndatr t r0( i. )2fa d ( i ì ) l = 0 . 3 . O ) Ì . r 4 0 = 0 . 5 . F i n d r h c ú Ì t r È o l 4 r w h è=
9.6 Bibliographic notes
FùndmeD.ddicl$ ror epcrpÒiirìoift Mclehrm ( 1e72.leTe),xabsh ( le73) rd cohenands@robùiÈ 0930). MorcahnúlhèftùÙdsofaltndin! inRlof,rRosm(1e73).Rosmutr dafss0975.rs76),sdoreral0e36). (lqq4),noìmindsodù(r995), cohÒì(reeT),RuscrrddBdon(r992),Dined c{reìnmdLcyn'(l'r931 zu-KrDgadsìpplllss6lP.drjem(lee3)atrd ordso (le3e) an eanynsrivc 6ior ú ò{nbd ii orcisó $d rxylor (1ee0), rhoyr n dÈrnbedii rsylor ( l eeh. leeea). vhiìe rheaìgorìrhm
10
Geometric Techniques
10.1
Geometric Hashing
10.1.1 Two-dimensional geometric hashing
dior (rgid body Fsslnmxtìo!. -{iììy
scalÈcoùld
gilen if, the imc $rLc) ooe,pprùch is ro L-y ro phe rhe query oi k,p oirt coiù.jde lignomg the eù$) rhn n
ìlrsjruùlùhddGfuiÚi!|ìshi''eisacdmiqùÙfdfof'hnorc'EnThl ii Mrmdc qn!m5. ard r.tz&/roú.J, isúr poii6 rbi .ach rtuÈ (in 'he rD as).
!;
: ,
I "i.-
Fi4c1ù1
rw dslEs,Dúd !i qw
p.iÍs G enpuÈd in 'he irame, oîriiriie r ftrtz&?llr!j!,tr. Hmù r oldmd$dgfiigdbyfFnic!|aÍb&\js.\{cs4 frnnaEd. (f rhe diúrce beùen rhe dd {iìns.) A framcsydcn on f can rho b. ú rhe iùúù ol point pais (or to'n .r.b ieuE) h{ùs Eq@rq rmon equ! co
xr\ò iftrcvrn, Fi!trc r0 2 i[ùshrs ,bc rrLrncc tfamc sy{m\ uJins Lre lìguE pojds lrcn Fisùrc 10.r. h Fieùe 10 ,(a) rhèpoifu ( r,r) ( r xrd l) m .hosr s r basÀfor Ìhe moder.md rheof4in is d Ìhe squre ino {hich ft r|rb (ùe sir ol rhè mDaÍiioi) we se rhd Ìhe poi'6, ùcrudiis
(0.0)(ó2)(n,0)(e.41ó. 10)(3.3)l-1.ó). (r,2).id(ó,2),, irrurHts!cÒdplì.dioB Gbplrepoil6ùtrnedlheboldd'In Ìi (b) (r.5) h sele'ed6 'he basìs,.nd
(r c)(r r)(0,rt4 r)|n.nx8 r)(3r).
c) rhe|oinc (x,.)G sleded rs rlÈ b
(0,0)(3.2)(3,0xó,2)(r0,4)(r,o(0.6). i( coincidebdN*n (a) ad (c) (ùdudùg
- oneirr,rmdy (1,r)(2.d)(3,O(6,0 . :trk 10l.nimeìr(.r,tnd(?,s)rhi Òri'ì Ùr i'ùpkmslalion or 'hè me'hodOr i rlmplÒ, dso .oisidrìu neighboùingsqúRt. .or ò. rruÈ syrems(b)aid G) it i oD
f"irsdarcy. Forùnmple.k' (.r. d}) be $c b3n5in r. lnd tàj. r, rhebisisfoi ,, úd l.r bornk,. ,") and(a3,,r) corDùde r ùar Gormor) rde syrb rheon s' ir (,., dJ ùd (r,, ,J rc usd * b4q\ Nor, hoveverdDrsinitú'] ad ior I
bc milc mcúion(,(,t
rr2xr(,i - t)/2) fram$ ae corpmd, 'hf.
u
ieùìevin8d!ù ìn I (hÀsh)hbre a rdr, na.,ia n u$d to nap rhedlra (or nor co'nmonìyrhci€rl ù (lc!!l) indi.ùsiîro ìl€scJìbempp.dblièshcbu(
FÒfsinprhiIx ht N rof Òur2D lìgufe r sy#n. Ge rabìe l0.r). HeF rhehash pùsi'i tumim n ur lmde i' h,re iù rhesqùùe(p, 4) id úÒ lìfhc syùn lnh b6n (ar.dr). rhendìebish (1,, ,ù is n .boNî in rabrc r0.3.Thchlss ( r.3)înd (3.5)ùr borhpracnin ir bucrebor t rhùedishiirrobrhe,(,! i)/2r4t, DFfeHcernmesysBns,depdìding
ltunÒ\ fof crch prir (usiùgborh (/,, .r) r):/2 rd (a, /j) asbrsÀ) rhe îurber or pai6 ìD r NiI rrrftfft h. ei'rìf n(' ua(D ll':.ceE€llribrckdlill
UldfolÙclìofdreoùrqpois'he@]n
gúr r0 ?c).$i'ig'r$h 10.:1, ftee vots re (1J. poiob poùr Í 'hE on3ì is !d ìrdùded). rh* for rrcm ùe c.d.r !f.ì,e -sjrù !È !o voreslor (3.5). Hencerhde aE fouf coinlidirg 0oiir\ bdw{! noòr (r) lid ,hc qu.'y (irdudiie oriein), oìd o.l), rhepoinr at rhe orÌgin besen nodel (b) nd Labeh (e.g..oìoùrs ú(lor rÒriJ) Ni
'hcúoFìdÒnphA.'folrquÙypoinl rbr equiny (q sìmilùily) of lrbek nù
e lab.(9 ii îddiúonrorhccoorur Òi u
hr ùc rq!aÈ\ h€iunbeftd r i qu{c (r. )) lirh coroùcishÍhcdro.hebuckeri,(! D +,c or tr 3D ùbrcl the sme s h imPbmen6noD
r) +o - l)
Í n nowsfrrghrfo$at'r roeílnd rheh
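The preprocessing and voting steps of two-dimensional geometric hashing can be sketched as below. The details are assumptions made for illustration: each ordered pair of points defines a frame (origin at the first point, x-axis towards the second, no scaling), frame coordinates are rounded to unit squares, and the names `frame_coords`, `bucket`, `preprocess` and `recognise` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations

def frame_coords(points, o, b):
    """Coordinates of the non-basis points in the frame with origin points[o] and x-axis towards points[b]."""
    ox, oy = points[o]
    ux, uy = points[b][0] - ox, points[b][1] - oy
    length = math.sqrt(ux * ux + uy * uy)
    out = []
    for idx, (px, py) in enumerate(points):
        if idx in (o, b):
            continue
        dx, dy = px - ox, py - oy
        out.append(((dx * ux + dy * uy) / length, (dy * ux - dx * uy) / length))
    return out

def bucket(p, cell=1.0):
    """Index of the square (hash bucket) that a frame coordinate falls into."""
    return (round(p[0] / cell), round(p[1] / cell))

def preprocess(model):
    """Hash table: bucket -> set of model bases (o, b) under which some model point falls there."""
    table = defaultdict(set)
    for o, b in permutations(range(len(model)), 2):
        for p in frame_coords(model, o, b):
            table[bucket(p)].add((o, b))
    return table

def recognise(table, query, o, b):
    """Votes per model basis for one choice of query basis (o, b)."""
    votes = defaultdict(int)
    for p in frame_coords(query, o, b):
        for basis in table.get(bucket(p), ()):
            votes[basis] += 1
    return votes

model = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0), (-2.0, 2.0)]
query = [(-y + 3.0, x - 2.0) for x, y in model]      # model rotated 90 degrees and shifted
votes = recognise(preprocess(model), query, 0, 1)
best = max(votes, key=votes.get)
print(best, votes[best])
```

A query basis that corresponds to a model basis collects one vote from every remaining point, so the number of votes equals the number of coinciding points (excluding the two basis points). A practical version would also look in neighbouring squares to tolerate small coordinate errors.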
10.1.2 Geometric hashing for structure comparison
peris. addidoi,r\ seomdnchshi'e r.
0 ad seofdnc hahins crn be u*d i
rl
Òfruitu i'ì 'lù*ìirúsiÒDd Npee(e.g. úins ú. C" orech L\iduc.orwemldods r'ómach residud.Fofusinggùim.d. detìmd(bÒ tus $odd ofsiss or mree F'ùs) any'&epoi.b(r,,4,!?) oorfatìinsirislrìehrlide(nÒ.ùilinEu)car ar berheonsii, ú úìci.ris lie itmg rhererof (4i. !r). rhcr ùh onrheptDe:nd rhe .oofdìúus or rL 6idù{ Gro'nt h G 3. îoìi.o fid rype.sond4 fruc 'n rlemcn(srusurc),ad N *pl$r.d i sedion10.1Ì rlìh.rn bÒj'iprùdcí.d dpfun|yqù6ch6hùenìepÈpfu sd: r(a,.dr./r,(,,2r, whm 2' ù lnc iùds of 4r in Lhcri'mc(dr.,L. "rJ.
Algorithm 10.1. Preprocessing phase
for each structure M do
   for each noncollinear triplet (a_i, a_j, a_k) in M do
      define the reference frame R_ijk
      for each remaining element a_p in M do
         calculate F = F(R_ijk, a_p, p_L)
         H(F) := H(F) ∪ {(M, R_ijk)}
      end
   end
end
^5 rhcfdcolc f nc vnrr ba5n(oaa,) ù diFcrcdfon oncvnh. nn cramrrc. 6r\r GLúrr. rltrrcm Ìlai r)iia, 2) diflmm it|rri.. rtu'ic5rorùti heprcpmc*sing k o(r,) permorer(wherc .). lr À posible 'o due 'he compbxnyby b,shhble fof 3 nodd n , (,) 'ne (ùsi.g t|Iee ro'nt i5 delitredlbr ùre (rr,. r,r) i! rheì'ìd*d bù.kers L\.nrîtd lo.at rhèeid dl $ù pai6Ghctuft. cr.fle frame)nidì hislì voÈsrc inve tr sdAl c$. n h or o (,]), wlìec r i
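Algorithm 10.2. Recognition phase
repeat
   choose a triplet (b_i, b_j, b_k) in the query as basis, defining the reference frame R_ijk
   for each remaining element b_q in the query do
      calculate F = F(R_ijk, b_q, q_L)
      for each pair (M, R) ∈ H(F) do V(M, R) := V(M, R) + 1 end
   end
until satisfactory coincidences are found or all query triplets are tried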
rhc rcsul'Òrúir$p h rrnlÒ'(rf, n*r. nj,J. sho*i'g ùr dftftoircìdencs Rír (tur M) atrd RjL (for 'rè qrry) fc rL5 knowi, ('bc dk shd!rc: usell a sPe,posi'ion usirs ' r*driÒ!) Nd. ùrjoiiiis
by $ir3rih
dhn
ùd bc'orc, !c ún (orHsois
orem.ier.y)
r 6prc. (c,.,!.a, rob.a fefercnce fmm dreafr' hc a rdprer(4 rr rJ ùÌbe núf h. hthiy 'mi|Ù md 'he orespmdiJ'8 rry oiry rheÍoùs ryús !i'hù a de6red tuDt
wnh ceok d a chos
I sphùe (of
10.1.3 Geometric hashing for SSE representation
Geometric hashing can also be used when the elements are SSEs. Typically, the SSEs are then represented as sticks, as in the method by Holm and Sander (1995). Depending on how
úan rÀ apú. ll beinsr conrmn. Fo \nb s(riúion orr'À (À'boîs a^ùq(niiÍo
Thcslls of rhcFo ù,w r[
. rhsDr. orssE(a hdix., fmnd)r (i.e.'ftnidpoiÍi'ìòercíe'€ncftamc),
$ irc derìn.droreach(rlsat f ùcqu!ry(r),anJrhch$hbbroiu
prir or ssEs
ìcxùr, licùr +!.à1 (".i-ÈhhÒúirg) bu!kc'\
ilr) rî nich' tu'olìrn i/ ii erlssutuundinsftcoien{idbr 3, (s i2). ii2) marh ollutr ir linilf
mìdl)oii6 b ks Ìhm ," À, le$ L\ùì ti4 qp. Gomesequediaìcomniib
.13) sivire rù lùrnd inYèldgfiÒD.
NasÌorreierrboÙfin!bmÌds tor ...tt tt.t.,aú) ldit aJssEstRt, Èt) . E do dejùenE EJ.EÌf îrne Rr tú .a.h ssEI R4) aJR do N := bqr tr kahbùt t'1ckè^ b hr ú Hn, kî ^ | heltE ssEaJth. tisl x tht \ztt 4 q R4ùe !úita, b th! a,s ot a I rrcr-' irnt?dseth?úe lò. th! .4t q'! | at. ar) oJthct^\ aùrwE tat. At.\j, Bsf'
10.1.4 Clustering
10.2 Distance Matrices
cùndfic l*lims is Nedrù rìndiryI rquenùd odù j5rhesdc (hutrhisdó rs ncp).co4pù$D rins di$rce n mrics É3m L05 is|Ln ofrhdnúe nfri\ forpDB4fy r.h..;d Fjsurcr 6
rè,idtrrs-(25?0)ofrmd.Bùùshowroù{asrmùndmridhgomrshossbrccon iúrd6 bd*{i lr.ipanuel fnndr, m 'pdcdiathelir)úrnighrùdjcaÌeúrrhessEsiòundin rchc(s{risù( t0,
(spri.r) darions GonnLc.io*)bdsÈ rlÍioN donorìnpLyúd úè .ortspo
'!És orcrcnoú\a, = Id.r..l 4d ^i ìn Fieù J, Espslnelf À shoM 1,, l. l, re,wnh$brtuc'ùrs ,r = {d . , , .rJ dd = ,r lur. r', r'. z'1.suppos runhq ${ 'nc nhiN púcn (disrdca$bEúik)
10.2.1 Measuring the similarity of distance (sub)matrices
onc4ù. srmcrine$ob6firdóm
Holm and Sander (1993b) (as used in the DALI program) consider (without loss of generality) two submatrices of equal size L, D^A for structure A and D^B for structure B, and define the similarity as
\[ S = \sum_{i=1}^{L} \sum_{j=1}^{L} \phi(i, j), \]
where \phi is a similarity measure for the distances D^A(i, j) and D^B(i, j), and S is the similarity of the substructures. The matrices are symmetric. Two forms of \phi are described, the rigid and the elastic.
\[ \phi^R(i, j) = \theta^R - |D^A(i, j) - D^B(i, j)| \]
where \theta^R is defined to be 1.5 Å. This means that differences less than 1.5 Å give a positive contribution to the score, and differences larger than 1.5 Å give a negative one.
submxtì.èsrùey m ('o ore d-inrl)
5 0 . 5 - 0 ) + 2 ( r . 50 ) + ( r 5 0 3 ) + 0 . 5 0 . r ) + . . . + ( r 5 o ) l = 7 . r 2 l r r5 l = 3 0 . 5 . , ì ! G 5 . r 9 , 7 2 ? 5 )w i ' nD ' " ì " ( 1 3 . . . 2 2 , j0.. sr) meioscomprif,e'heFÌaiion (r5. . . re) rìd (72.. . 7t rrh 'he Fìa1rcnin I chcbdNccn$bshdùr( ( I 3. . 22)
mPle'Ìhat'hesf*poidù8dfuces]úd5 Mn.ú ró uìd r3. Îlic dù{k ronn for ó .lso
\[ \phi^E(i, j) = \left( \theta^E - \frac{|D^A(i, j) - D^B(i, j)|}{D^*(i, j)} \right) w(D^*(i, j)) \]
where D^*(i, j) is the average of D^A(i, j) and D^B(i, j), and \theta^E is the elastic similarity threshold, chosen equal to 0.20 (i.e. 20% deviation). The relative deviation reflects typical distances between SSEs: adjacent strands in a sheet are typically at distances of 4-5 Å, and the tolerated difference should depend on the distance itself. The envelope function w down-weights the contribution from pairs at long distances, and is defined as w(r) = \exp(-r^2/\alpha^2), where \alpha = 20 Å.
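The rigid and elastic measures can be compared on a small example. The sketch uses the constants quoted above (1.5 Å, 0.20 and 20 Å); the function names and the convention that a zero average distance (i = j) scores \theta^E are assumptions for the illustration.

```python
import math

THETA_R = 1.5    # rigid threshold in Angstrom
THETA_E = 0.20   # elastic threshold (20% relative deviation)
ALPHA = 20.0     # width of the envelope function in Angstrom

def rigid(da, db):
    """Rigid similarity of two corresponding distances."""
    return THETA_R - abs(da - db)

def elastic(da, db):
    """Elastic similarity: relative deviation, down-weighted for long distances."""
    mean = (da + db) / 2.0
    if mean == 0.0:                       # assumed convention for the diagonal (i == j)
        return THETA_E
    return (THETA_E - abs(da - db) / mean) * math.exp(-(mean / ALPHA) ** 2)

def similarity(DA, DB, phi):
    """Sum of phi over all cells of two equally sized distance (sub)matrices."""
    size = len(DA)
    return sum(phi(DA[i][j], DB[i][j]) for i in range(size) for j in range(size))

DA = [[0.0, 4.0, 8.0], [4.0, 0.0, 5.0], [8.0, 5.0, 0.0]]
DB = [[0.0, 4.2, 9.5], [4.2, 0.0, 5.0], [9.5, 5.0, 0.0]]

print(similarity(DA, DA, rigid))          # identical matrices: every cell contributes 1.5
print(similarity(DA, DB, rigid), similarity(DA, DB, elastic))
```

With the elastic form a deviation of 1 Å is penalized at a distance of 4 Å but would be tolerated at 10 Å, which the rigid form cannot express.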
10.3 Exercises
Nhìng'Folsinpli.ity\re.ofsidr\frù
t(brr
41Ò
)
4(4
)j
10.3).rr (6.lr.6). ór:(3.3.10.s). b r i( r r , ó ) , , r : ( 1 3 . 2 ) ,. , ( e . ? ) . G) Pìorrh.poirùir ! dirgúm(lmis 'lìe l,oii6. Ìo Évears.lÙfts úd úìÙgLcs) (b) fu rcnov roderjnefetèEneiix'n bùpÙ 9. DÒrìr.drcric rtu* ùriis úc pan!orpoùN(a:.ar. (.r..t ind (r,.ó|)(ri rn(r,.rr. j.Iis'lùùerr'hesstrd' (c) Defrnea No dmensioniìììish hb o bùckd(i j) ir ìrscd,irs (Ì.r)sdisryt 05 < r < i+0 5 andr-0.5 < r < r +0.5.Fiììonly (dl Faf.rh of 'hr turrcn!! rnms o 2 rr sedionr 0 r r onp.2 14(2Dhóhid À Nri'rn rr kJ, q, bcrh. bf( ro!i. ad tr,r.,, rhebai\ fof,, îìd ìer borh(... D ùd (",. ,J coir.idè in Lù (q,nmoi) rarc 5yieF e coii.id.r.cú ir(!r, 4) $d (ri., ) ù. usd èr b,ss. Nob.ho\Ereritu limLllrirrîjld
shw (Erpbiciììy) 'hf ir 'h. útcuhr meoincid.mcsr sùowùru$oos trorr.d robc ùe ae rben ùe ompùi.or is noÌ exd.
.,ld súb'nùdc6)andrhcsurùìaaixr
(har ubrrùrurs,1ì úd r, bEequaì
ùb!ruqures(,11. . 15)or rmd and(26. . 23)or r d,., whci rìc skfic fom or ùe siEìLùq mqsr 4 h ùsd. mlìdforEdÙ.iie'heflmiry'ìmewhen n 'hf fof I ripìer (r. /r d iÒb. r rcfcrcrceframerhft ú*' b, r ùiprd (4, àr,à) ir ùc q!4. $ch 'ha re
10.4 Bibliographic notes
ù bcroundin tvotfsr (1997)anjcl6 for ( !99r),Fischdd al.oss4). ova|dwórfson brìon is lmn Holm rnd sùìdd os95) a merhodfor co'npùirg rìoribrcrmrein (ùoùnd, hùsdl is dtribed itr shf\ry d ar.(2002d. i rblrì srd sDdú 0ee3b).
11
Clustering: Combining Local Similarities
sìmihf (ìoaì) sùbshúur\ bcr{eeùso imtl,r $bNnd €s itrrolùsq simiLTsùb sddÙBcnndùebydÙjhngme b. I bn ninÒdirs, sir€ rcr 3u bbjec bo{ùqíicnnudeìyùsdinrhcI
11.1
Compatibility and Consistency
\i. Òbj.41in 'he expbdiE
ohhe ctsrfins
àDreh
iE FjJs or.flqd,,r.
|ùjlydepedsùlhcùlÙho|thedNtiing
{Ùì,r.dd,{t*ilh,J,lJsdll
hcra. is exptajiat itr ChapÈ 10 Finshion, |d i;b, empx,rb'e eed.
jf Ììesùbsruc. nslnsu(n,, rtóunbecotrsisri'Nnh (nI,r),*hj.hú.yùc tufù.oAisrìnror (!r. r, is comparibt. wirh drssbfrucùr (rr. rr, i.e.snnihr ,o x .&u n desFe (rrd .rd.onfRinr bejig dÈpsndenr on rtì! 'icrljod) Fisua I L I
Fts!j arsndi
@ !!J. ,j) ec @Ep!dbb, @d s úe (rL, dr. tu pi! (1j.,t eiù (/r. 3r),nqnhs úr {r,. r, k 3joú. (snF,ùh) vi6 (rr, À,
Nob 'hf -,r/'r./r')
isù biilry rcrniù bcwei crmrds.f.r4ÈzÍ
ffldùs.
(lir or clcmcnrt rbnì erh or ù. tvo 3?Éd m'.À4 (Lo.slsimilù subrflduret bc'wei 'rè shcturq ud dH joiù ,h A *.d 'mrchGho!.lusù).onsk6 or tG', ,r(u r ,t(ar
,t(36 ,óll. roùn
sed 0rcbs re joi|ej itrboNis€n
in ,lP"t= t(,r.,,r,)t.aneremnc
djlvrhÀr,ùú(4 rr) n ró, atrnèÌÌ {'ù (rr rr sùe(1,.il) L^orsjmrtrolr,,3r. Gmmcùicb6hins n oie mdhodror
(t subfrudrG) wirhhth soe. oprionarFfinciffL îìis n oncndone
a n Ja r d, = I A , . . . . r ! t . a d
(a6,rr,(as, àt (zr,rro)t, {(a5,bt, (bxfù $anpre,seomeùic hNhnìr.rbc ddeni3 (@ul)ir3)nor depends otrúe !n6'ù.oisifcn.}sumoslhÍ{e (t,-lo-.'ì.ro
," rr.ra.r,..ó.ìk",.) a, bú, (or sed mbbdt car hesrouFd.
À
HN I dlùsbin-Èpcrfonìcd (lLùsbrii-s ùlgorirbm9l HN n
'nÒcltr{ùs yÒÍd (rhr sÒr
11.2
Searching for Seed Matches
10 15a ammd crch cd. andsc.! n oie flom ach rructue. (rhis teps . h!$iis
[email protected] b{ ) G0ondh. h!
tqùiriie equallerFh\. iÍd usiîg disù
11.3 Consistency
Two clusters C1 and C2 (a cluster consisting of one or several seed matches) can be joined if they are consistent, i.e. the elements (substructures) from A in cluster C1 are related to the elements from A in cluster C2 in the same way as the corresponding elements from B are related.
úmisFn *irh (cî. cî). ro dùidc cóisis'cncf {nhù E/adDi bdrvftr dcncn,\ orÌhe eme studurc !rc $ql d raitt
tmsfdmfiotrgilitrgagood$perys
11.3.1 Test for consistency
Let the two clusters be C1 = {P_i} and C2 = {Q_j}, where P_i and Q_j are pairs of
L. 4 rr)^sinùb$úrtuuÈ(r,.&).6)rnqf D qndhdr Íq ad asii,, 1iN ùr{orùliù
3,)À$\nbrvdi(!i.rr) î81tuft;dùNùs,È,qr nFiù@nùrd cqjo bypù0n56 úìr4
dotre oi 'hc pai6 (rno*i
* ro.or .&r,als
on rd). \rheúù
pùnneed d
ifan [btuùy Fn P. rnd or n (Dsi
andcr= (&.rrL).(1{'6r)',,t
11t.1 . od dsd ., dÌ Pù!. r!€y c@bjded ú rÈ$hbdrcorcdBorl/i,/ù, . ,ir./r,ili, . r L l t s i n r u( @ N K L í ) @Di1i4ú(rr, rj, .. Bj^.Bt, 81......4.). ro'hesuÈ1rudurc r trsitirity hoid\, ir is cioùsh ro conpm, ror dampÌ., (1, . rj,) vidì (!l . 3r ). Forr sinpleillusFnonNc c 1 7 , 5 , 4 , 3 , 3 Ì r{n5d, 3e, , a l , a d ' h e 7 rmm ùre6ni setand5 rrcm rhescond úrìsil ùis. bú ru' 3 rsr rhÈfi^r rnd 9 noó ùÒ$.ond Hcnr ro dc.id croùgbrocr Ro pails(ho* ndy depeidson'hecon'e* (se EreErc 2) a 2 cr îid Cr !rc looldl uponrs slo doie on 'hec 'No unrs (kno{n ó rroaar.turentr cnÈta) wiù turÒF narioo.globaìcriEnr arcorb q€n Subrhdue\ (Òft rlm eeh shdt)
11.3.2 Overlapping clusters
rr c := {(A:,,r) (ii, 3L))rid c, := {(r,, ,t, (,.{r.,t1. rtes iNocì6re6 virh r, m cr.burqirì R, ù cr. ^ dú'bejo']'ed.sirc 11ise,ruiyar4ccd
11.4 Clustering Algorithms
11.4.1 Linear clustering
l: .lt.j,*, 11
e := ate sehúed s.ú ttutch it lx t uti:nú
rhíèd
úìth e ft.n join sm h e
nncb $hi.h ,b€r 16, Gors hshst
ro oie or úc .lsr^
r lÒùed ro
;ì,...,,.,.. :., ;;. ;; , ",,,.,",..,,, " "'.*,
ù4rè a .ar4t
-
ttTkt Òtatn rcà na
tììtt !t?d nìarh t t)||hhh4Ecùri:;knt Md.h\tllt:: to aR oî thè .hrùf tc ) jdì tù n, c: rehÒ'! yù f,ù r\tít no ùù. |èet Daú6
.5 (
ttt wr a.î}eld Dar:tgs
aE jÒiLà ta eltetth
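Linear clustering can be sketched as a single greedy pass over the seed matches. The details below are assumptions for illustration: a seed match is a pair (i, j) of element indices with a score, the cluster starts from the highest-scoring seed, and two pairs are taken to be consistent when the distance between their elements in A differs by at most `eps` from the corresponding distance in B. The names `consistent` and `linear_cluster` are hypothetical.

```python
import math

def consistent(p, q, A, B, eps=1.0):
    """Two matched pairs are consistent if they use different elements and preserve the distance."""
    (i, j), (k, l) = p, q
    if i == k or j == l:
        return False
    return abs(math.dist(A[i], A[k]) - math.dist(B[j], B[l])) <= eps

def linear_cluster(seeds, A, B, eps=1.0):
    """seeds: list of ((i, j), score). Grow one cluster from the best seed, in score order."""
    order = sorted(seeds, key=lambda s: -s[1])
    cluster = [order[0][0]]
    for pair, _ in order[1:]:
        # Join the seed only if it is consistent with every member already in the cluster.
        if all(consistent(pair, member, A, B, eps) for member in cluster):
            cluster.append(pair)
    return cluster

A = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
B = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
seeds = [((0, 0), 5.0), ((1, 1), 4.0), ((2, 3), 3.5), ((2, 2), 3.0)]
cluster = linear_cluster(seeds, A, B)
print(cluster)
```

The result depends on the order in which seeds are tried, which is the main weakness of the linear scheme compared with hierarchical clustering.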
11.4.2 Hierarchical clustering
In agglomerative hierarchical clustering, the two most similar (consistent) clusters are joined in each step.
beirg i,ì84
bur rccofdiig
S(C) is the score of cluster C, and S(C_i, C_j) is the score of grouping C_i and C_j.
join(C_i, C_j) := cons(C_i, C_j) and (S(C_i, C_j) > max(S(C_i), S(C_j))), i.e. C_i and C_j are consistent, and the score increases by joining them.

C := set of all clusters, each seed match being a cluster
CP := {(C_i, C_j) | C_i ∈ C and C_j ∈ C and join(C_i, C_j)}
while CP ≠ ∅ do
   (C_i, C_j) := the highest scoring pair in CP
   C_k := C_i ∪ C_j
   C := (C − {C_i, C_j}) ∪ {C_k}
   recompute CP
end
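The agglomerative procedure can be sketched as follows. The consistency test and the score are passed in as functions; the toy choices in the example (two seed matches are consistent when they have the same index offset, and the score of a cluster is its size) are assumptions made only to keep the example small.

```python
def hierarchical(seeds, cons, score):
    """Repeatedly join the consistent pair of clusters whose union scores highest."""
    clusters = [frozenset([s]) for s in seeds]        # each seed match starts as a cluster
    while True:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                ci, cj = clusters[x], clusters[y]
                if not all(cons(p, q) for p in ci for q in cj):
                    continue
                s = score(ci | cj)
                # Join only if the score increases, and keep the best such pair.
                if s > max(score(ci), score(cj)) and (best is None or s > best[0]):
                    best = (s, ci, cj)
        if best is None:
            return clusters
        _, ci, cj = best
        clusters = [c for c in clusters if c not in (ci, cj)] + [ci | cj]

# Toy example (assumed): seed matches are index pairs, consistent if they share the offset j - i.
seeds = [(0, 0), (1, 1), (2, 2), (0, 3), (1, 4)]
same_offset = lambda p, q: p[1] - p[0] == q[1] - q[0]
clusters = hierarchical(seeds, same_offset, len)
print(sorted(sorted(c) for c in clusters))
```

Every iteration re-examines all cluster pairs, which is simple but quadratic per step; an implementation for many seed matches would maintain the candidate pairs incrementally.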
11.5 Clustering by Use of Transformations
cúfcnD3 hy us or tunsfomdionsc (@Fjgre ll.7). {.d mdchcs ì sed nn.h consisbor a koishrctro lir orcomparibre ssEs.c.e.l(nJ 4xÀr, 3rì..yp., y rolndby !sin! rhefchrions spoidirgrohs (cd) r drMincd (Lvhth 3n6 úc \opqsriú). îhe r66ibm*ion is rhei foùid Òm{r\ rhcme giviigmìrinumRMSD,
11.5.1 Comparing transformations
= (rl. sl) úd .: = (s4.q) ùù b.erctrped. !c mur decideìf 'rreyd siùrlf mdgh b jùs'i't b.comprcd.auodncrm,nvricnd$dsùiò.(r9qr),,;r,wopforii. fdp.nd y
'oe-'' '. --a
o
o rù Es ù5 qd b erù! afdG rÉ,$ o R\uùA {hri {! r@ù b hrc aùFLi Ftumf.r
rE irue Íi rdÈ ds1{i !rÈ\
fú{ooidioqÍ!f $È@\hg(a
lie rìoqi (ir. rr ) tu (,l,. ,r.
vtrE
l,odo(nr,rr.
dùcnmlfomrioijseAútdb'vifld aid saodù(199r).^s iated itr cnip
(.tr,
at iìarics (n, ) qnh rhEinEEe ot rheorhd
d quaúiryiren\edepf'ùr fon rÈeodrnúrdxu) i'r rfi\ drúrcn\ùuùg
sorlr Òrc,iÒilrwÒ*c úr , = 0 iI À and Ahsr.rc' rL (ree6) d.ribc r íu'iÒ (lìc rhrcJìold úb a rlreslìoìd ùrd n 0i).
lllir.rr)(,4 , r0(.1:.rr(.1,. rj)l :nnlnr. ri). ri: = h
s,rEarbe'4r = rtrr/r.3r.7iL = ,t, î; = M(ar. 3r. h,r ÒnÈrc ùc* 31qtuf'hcsanPreft rertu berbedste
L-.1 È.
ftunxr o(rr,nrruft4odúE!Èio^e'{qúcuGGloopd 'b)('t ,t i rerd. @dú€ Ecbs k.{{tr rh..orè e ampd forrhebsr sùpeQosirioi of (rL. a L), 4r .hcmgtefor (rr, ,n, úd 4i1 sd ól' bcanarogoùsry de6ned.we iiid úr ó Lisinì b ó:, sd 0l L is simirù e c.. rhfefoE. ir mjghrberhf (1r,r') crn hc dunc((| k, (/r. rr. ùìd (a,.rl) 'Ò (Àr. at vc rhorcrnrc nccdro conpm ónpeddn4drssh6îinFisurc |.3.(À,RL)ir.onsife wirht,4:,rr.bùr dic 'Ùù* òd{m! ùèi' enEs (dNhedliis) ft difcfor rheEroa, we pùróh úe rcúrions(routrdro be simro oi rh'.obhned sub d in Figue ll.9 TheombircJ sùbrhcrtrÉ (31. ,r) is foÌaEdrir G). rhe bcr a É r o m d ' o b c l i m, a n d ( . { r . , ' ) c a i b € d 6 6 d b ( n , . , , ( ( ! { r , I t i s s i n i t ù îiÒ smc n donùfÒf(3i. 3î (se Fietrrc 11qb))ìi' n óbd ójÌ. ed úe rbs djr ad ril úÌpdd. rh.r xrcd round'o besiniìù, 'bercfm (r r. 31)ùmor b. crNÌeadro(,r:. 4) ((À'. Àr isiÒ' siEiù h (rr. ,t} Gs rhesumordres!ùGsoiihedifldcnùdoi3ochàis),andr'hrcsholddefrned
compad'erheÙulonm'ioisby'i.; ibdwiyÒrr {200r)md$rc ùLcqr4<
11.5,3 6 0reJo!r. Elchdufùc, dcrìi.s ùhi.Iofnùon I forrhesourcerfÌ /1, bedre trisîpdnd b 'he€r odoms fÍom'hesouE otrefbm rrd o'bcf)ir PP shi.h @ s
i i ! r i f , r r . 1 r e d h ' ù e b d \ e n L m d î L i s r h e n d e n n e d a t+r Jt r-lr-rr r i
asnnÈùo ofrhedusesaÍecj = l(7,r) (e,5)(lr, 7)(la,l0)(r7,r:r)| od c1 = (14.rD(17,E)(22,17)(25,2r1.trhc 'bdlh..r = (7.3)(e,tO3.7)(la.loxl7. l3)(21.L6)(22. Ì?X25.2DQ6.22) = rl) Ld thc cm$ndins hr tnr î* bc ar 119.:3)(30.2a)1. IO.5)(rr.6)(r7. (2r.r6)(2u, r7)(25,2r)06,22)(ll,27l(r5.23)). TiìerthedirùeF beù.en?; 'hnqnnu|yúGúI'ÒrbeinedNtfd uì'i'g fmmFiìiog qcb paúofds' dar (zujr). whcr rhcrwùrlNIomúúotrs sjvetrrhsùold (D rheinGBr r-r À), {hr
11.5.2 Calculating the new transformation
ftn ú considkdd ú b. &ri'iÒ.onsumins.
Dpsi'iÓd,*peiyvhei'leaú€dúe loùdindc syr.r is rd ùwry rromrlìcnr$ ftNrc TaUie ùe denle mighrrc$x
0 Ìbe corcspoìdiig rom ,, herce ,Dad 0 fc ùefed s6 is xnodùed sr or'cuftnf cìufùs GrctìdRàbsd by cr4) ?.lonsuìlylhetJxfomrÙÒnlù\upeDosilionof dr crc'iÒrr n P or 'he eteneft ir c
rhebìghs' sùf ofjóini'!
my r*o dN'd
ir c
jo,i(cr.c, ioinso cìrrùs e - 4n Ícd Pots ks nnNbtÒhd!a!^) r := tht î@ttr,tutkxtù ltQ dntu^ ú e 3t := th! paîs aJaai:knt ú^rè^ ú e, tu rre ro?s iîj.úhs th!' tc 4, c B) := tt ! q'úì:îeú pan h a úL hish5t rok s,tua = r/ú !o,( ol joìn(cro. crr) únc (crc, cRs)+ rand sorc(ioin(cro,.Èir > s!,ù do r,,.r = $oft (join(cra,c{r)) c,,:=join(c/o..it ú. tined d6Er îú = 'he totr.fÒntuhulJo cÌ) e:= e - ctoúcRs)úcir f =r lrPorrkt)vrif Òrnrèú rùh cll liùauc.e,,hkrtan oú ùhút.ú. 'he rùa iîjokìÌ3 k41 dlaDs! k b] Eù'ovitg al dato îù c P
qúttkrcr d.,tua,t rùi,tt ht cú (c,a. cRs) := th" caúbqú pdit iÌ 3t túh hish1t r&
crr ì,corr.cfrHl....t.wrìeÉcrLF reans rheprr(,4r,rr.
11.6 Clustering by Use of Relations
('1r' 3) xod (an 31)r
r",*,r n" .-n *,**r
Jl|(,''.3,]ÉJp|foìmJet.qul .",,''""''''','
dsiftd
aÍ
ir p (a' '1*) ;t",1"" p ls nsd rhe pii^ Ùc consùterÌ
'i.,.:"..--r.,o",'" -*r. I"J,Linvso*u" "m""..
!'iemll'l0AiqrnPÈsshBÙ{ duiù (À,.rr(ir.3rì úq 1nùrhc rini Èc dd pt,1,n,) : 2(rj.rr) 6!r ùùr 'hr (^,.À) isoq\nborpii (rr. rr. h{ùr }rù r(rr a.) l prrr4lh5 ll.6,l
How many relations to compare?
AsìiE wchrvcrso.rtrrd! cr = (4 ta. .., p_t'\'i q = @ t. 01. .., e, t, vhùe each(r,,. 0r) js a .ÒúPÍjbÈ P ìdr oi rhuiòm of rhc€lariotrtomùh p (ur
11.6.2 Geometric relation
gi€nl*oskks,bolúsyPÙam4fs bc.ver rhcm?crh s ,neiN crìùìeìig r
an qampre is rir.i
rron Ahuùrc!
and Fislhr I t996), *bùe oie .igrr aod
rh! urstcd b'wùcn LhcA$ (Nhcnl,m.leded oDa pùrlld plrm) uid rbr nioimum (d'b) rtrd núxinùn (r.q) dúnc{ ronì ÌLedirion (*! Flgúrs rr I rG) For No piin orco,nFribr. ssE\ tu b
. IrrI,r,ril < r,ùuryr=50.: . , i ' i , r i ù < É , s h Ò tru. .f t x q m r c . r 5 À ì
Fis11.n
G)ftc FBnds or2.!n
ù{ rrej{cd q ! 6ùd pe!ùd
[email protected]
rdd
(Ùs isd{]y\ Dcsrhh)fhe ssÈ ee
0ùfdhL rid > rjiri + .. rlli < ,iù + É rre6)byFÀnsiorof$icyL .
qNhid n inùshrd in Fiem 1l ll(b).
11.6.3 Distance relation
dilgonak(idm sidùe dn'ancs). ro d
rdineúcdùncnh(iitrrcsiduedisbo*t. ftsgaEù!rc|ljos'InfieÙf1t,1, lnb ,j. ro sc ir (r,, ,) h cotriscDrwirh(,11.tD, wemur .oúpft 're dfba ro chrk whdhcf(ar. r, is coNiscor{nh (rt,4). we nNt @npf úÈ trnh ',. ln rùeligurerh$e e d$ E
CLUSITRINCtrY USEOFRELAT1ONS cr = l(,4,,'rt1r, &)ì,ndc: = ((an rr(n,.4)l.canrhe* bc.o'ibin.dlve coNkbr. Hd n i, sfriqenrrochc.kco6insncy beNeen {n,. r, od (,1r.4t ufedÉtÙcdrownNlhcmpl]pltnslf Fisufet t.r2(b) showspÒ{brc rrudrrcs fton r[c dnbne mun.s ir (x).NÒrùrh4 rructureshr* djncrcd 'opoìocies.(F or Hornandsrder099lb))
sd.rirudinEquaiioi(r0.2)(ch!p6r0.2l)coìbÒtr{d.NdcrhdrÉdifrdi.ds ohheduù (ni.3j). (A1.3d h rhft rùciih dishnes(11,. !, 1rj. B, and(ir. a{), 1ar.at. andrhedirief'* dìeidrefdi$nes1,1r.tl) (Àj.3,
in
D^Lr (Hormrtrd sddtr rs93b)dilids rhesh.tuÉs ine,(o\erìrppitrg)ùi.kh,i. ùc{ir|b, (,, ! + D: oed,ppinsdiro* (10.2). ùsig rhc{ono!in Equarion sì îc urd loi rdùcirgir simitirdkra ofnsidrj somcb.hnic'l opemriois or morc'h.n sir Bìdùè p!ì6 rhe , (- 100)bc{ *ed, !E rhenchoaDror oid ÈipxDir (cr6ErÌDgcydo. afiú ún .(", rhe,i (, = 10)b*r dsus re dìosen addis doie bya Monc crkr Érch GimùìÍcd rnncrjns).tn rhisrep dÉ se.d10bc ÒclNrcr Theovesn,pprÈch is ilusnrd ii
11.6.4 Use of graph theory
úorcjs aÍ edseb€ser rod* (ir.1r) if 'lÈ rtdioi,{rr./1})
srijfiesso'ne
mFfl E
fZ
n
mffi E
F.i
H
ft simìh (graphnomorytusm)aodha\e the hishe{ (!rÍanry)
$oÈ rb.h cú e
ùorc.FÈ('odstÙThútàlighè
4 .;
L
q*| The(ioddpodur gmph oNruqLdìryq{uus! rúich$\'eii aleùfirnn5
In some methods (see, for example, Koch et al. (1996)) a product graph is used.
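A product graph and its largest clique can be sketched for a small case. The construction below is an assumption-laden illustration: every pair (i, j) of elements is treated as compatible and becomes a node, two nodes are connected when the distance relation between their elements agrees within `eps`, and the clique search is a plain exhaustive enumeration that is only practical for small graphs.

```python
import math
from itertools import combinations

def product_graph(A, B, eps=0.5):
    """Nodes are pairs (i, j); an edge joins two nodes whose elements have matching distances."""
    nodes = [(i, j) for i in range(len(A)) for j in range(len(B))]
    edges = {v: set() for v in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i != k and j != l and abs(math.dist(A[i], A[k]) - math.dist(B[j], B[l])) <= eps:
            edges[(i, j)].add((k, l))
            edges[(k, l)].add((i, j))
    return edges

def max_clique(edges):
    """Largest clique by exhaustive search, extending cliques in increasing node order."""
    best = []
    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique
        for v in sorted(candidates):
            # Keep only later nodes adjacent to v (and hence to the whole clique).
            expand(clique + [v], {u for u in candidates & edges[v] if u > v})
    expand([], set(edges))
    return best

A = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
B = [(5.0, 5.0), (0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
clique = max_clique(product_graph(A, B))
print(clique)
```

A clique in the product graph is a set of element pairs that are mutually consistent, so the largest clique corresponds to the largest common substructure under the chosen relation.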
Grd b FrefalÌze iioiì cor\n'!r?ior
c
foF - t,_,/(n,.4). trr. rr). whùe / n ! masr of dissinianry. rhàì s.hh rÒ,rhciìichsr fo;na cru:86À dùq
11.7 Refinement
. I iìfur djurmeú
or rcnidh.d
milhr he
11.8 Exercises
limgeo|,j,ÚdihÍ1}issiiìjlf b&
f{ 1./ a 1 1 . 1 r . ( 1 r . ! r . (aat .: ( a r À 1 . ( a r , À n . ( r 1 . r r . ( i r . r r ( 3 : , r r , ( 3 r , r r $ 1 6rt rrc @Ntuhr Ù bdngde.s r 'tu . b:úrrn..
( i : 3 r , d : ( l r a L ) , ri ( ! r r t . / i ( h r L ) . 3 : ( ^ 1 r r , r : { , 1 r r J ,v h { e
ùi(hefhrb!r(ir,irnriml{br(3,rt
rh5nu'!'hí(^rtj) \coRtrèd!,ù
^a\ aj\ - pta\.R ),pta!,a3t- p\RL 3n,r(ir.ir'r(r1 \ùrb.daAd .,, /..+,r1i,hrd iIi:/i,(ll3]3,
,:) îEebE, ùee
o *amine iÍhe dusb (rr, ,r) .$ be stuùpedwirh (1,. ,,
(a€ on$srnr)
l n d c r= 1 0 , . . O , r , 14.... P,.a
w.hawaiheofemlolafccìljltlriÒ
G .oNanr runbs) or ùc pri( ii .r. rhcnr, is .oiskteù niÌh all pai6 ù
G) rah p3ii or sàis shoddnw bed
usen\is llb\,a5Jlb1,o)tbt,4Jtbh,à6lth t.lrt ainor bc eroup€d. (b) NN r rùrfonn iÒnrof och ot to crlmjne whichor úe *d prìÀ (ffon (a))rhd e consisrn. Forrnc so seds(,,1J. 'j) òd (ar. &)!ocon aE apprc:dnire\ eqrr ir 'hc ,ri'rrri (c) compùc ,nÒdùs6 vlh Lhedìrgum youdrtw itr Ermì$ I Òrchap
spm.,andcoNi'lcfssEs 1.. rhcmh e diEdion.dd bèrpr.sm.da\ùrls. a: ,'(c): (4.3) (4,3).nfr)r (6 3) - 00,?), nr(a): (3,t) - (7,7.1),n5(É):(8.5.7.5) - (5.5.s), ,{nz): (5.s,4.5)
01. 7'
B:Rns):(5.5, 4J t5,e),tjltp):t1.,r.r) (rr,o, rlr): (e.2)- (12. 5), (3.r,4.5) ,jlo)i 0 5.9).rnl): (0.5, ù - G.?). (a) ùaw úressEsitr a di€rrn. (b) rro ssEsre conìpribb itui0y rd rc) as rcLúons. shrr tr$ rìe dnare (d , dt. (r'e ieÌorc aeks. romk n c4y ) FindrherclarioGbe'v.etr Èrh prn or ssEsfor erh frucrùr. 10oic d&imrr prÌc'. (rf yoù{ish. (d) w. $rlì sy 'hú 'so rcÌ iónsh or drecoùìÉdblePiifs nu in (b).Yo G) Exrnire ir ay pRin or úÈ fir q
11.9 Bibliographic notes
@dnve' aì.(leer),arexiDdmv ardFiscbù(rqeó), Fi$hr !r rr. (tees.reel).
Àeù€r-er^rl. 1lee6).Ptuù mdaylcho( teeù, veòibkycr d. oeeer,cnrdtey yl''gL.|j'dcore)'K.yqrglJ RNser(r 993).DALÌ h dcKnbedjn r{oìn ùd saider(1993b).orhern,qnoose
ibitity n b dchnerhec;rch s@a u | $. or
merhods aE io Dicdon.hs( lgsJ). Falicovxrd cohetro 99ó).nd HÒlnad sùdù h q E d Ò ] , . , 3 ' r ù n, ù , , d ; ! u ú q src xìt{,rifjcîl (chrpbfs.7.1). ^ úmmm y r|t er of tuns vnhin rhe.ùùi, atorg
berorrÒnsn {.h rù. andîre ssqdú s.db.! byBsgtcy ad aìr,iri ( t995)ùd
12
Significance and Assessment of Structure Comparisons
. Diftd rìignmcnbiE rchi.vcddcpendins qNhichbio|o3ia|FPeniesedEla'ioNreenphsizdnìúèÙnp'jsúThìs n n eÀoiabl. rosqurc 6d lnìqucaì TteskúghrfoNaldapmehloru*à\
rnbúìon. nndin addnion.orelÌ shap€(ìnets &r d,r, s.oidaJy rddm .or'Èr' xndÈckùs derlig. a5 a qreFe qamptq imúE 'b hepr'eiJ's b€i's .onpùe Éldom *t ofshrùRs collainsonly, búakoinc|udeEpEsenhlivesofiypj iu6 (noeL) shoddbecaÌculared for qch onpdì$n b nd.h rhenomdom
12.1
Constructing Random Structure Models
risuEì2.r îD 'odùi ùiu!, {d d ' doi di.É h\ ÉhLis h ù! ùN ùind by(ar.dr+r.dr-,
s ur 6cd. Tnc rimpk! pMedurck . sí si'l8. px.kinsir ravóuEd(o rrarrhcsructuE rcnlirso,mptrùîìcìqEh irar$prfned.
12.1.1 Use of distance geometry
(4,ódi ind Ta}ror199.1)k a pbsrn fof.orxh4ii!
pfÒrciod.!dì: in
.Ii.hfomnnodelìed^xspherc e dons is sr b 3i À. . Thed'rind dnbìce bdrft! nvo(iri vrc$sve) aetu À drn, = 2,iùv by rm (vinurì) rtr8ìesror erchresdù. òc bon,r dgrcp didrh.b*ioo ssr( e ù! c4c (oi+r,or+' isrhean,sre rhd
12,.2.
(+04)i h.nr / = 2d1úr,/2r
(sondims crÈd .d(oys.) cf, r,€senenÈd oflhcmrìnPmPcnÈsoflherilive![.fut.Dbercbireddudigújs.hhgirg,
12.2 Use of Structure Databases
súucors (q) riú I tundrDdad
shcùe
ddrbr*. rtrd ny ù Fmow (of ictroE)
12.2.1 Constructing nonredundant subsets
lunns úd no rvo rDcùÉ\
havc sequeie
elrucror ? is (sbJú $d s.hncid.r r99r)
r ( , , . , )= 2 e o . r ( l ( + , ,))',56r, úea n ad ! m ùeleùernorrreshcrùrs, !D,rsquene i&iriq n $mpud rrcm
dBss ror (.oish.dtrs) ronrtdùdùr sùb Jd$men.globa|]ricms'b'rlow
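The length-dependent identity cutoff cited above (Sander and Schneider 1991) can be illustrated with a small sketch. It assumes the commonly quoted form of that curve, t(L) = 290.15 L^(-0.562) for 10 to 80 aligned residues and a flat 24.8% beyond that; the function names and the `margin` parameter are hypothetical and are not taken from the text.

```python
def identity_threshold(length: int) -> float:
    """Percent-identity cutoff for an alignment of `length` residues.

    Assumed form of the Sander-Schneider curve: 290.15 * L**-0.562
    for 10 <= L <= 80, and a constant 24.8 for longer alignments.
    """
    if length < 10:
        raise ValueError("curve is not defined below 10 aligned residues")
    if length > 80:
        return 24.8
    return 290.15 * length ** -0.562


def is_redundant(percent_identity: float, length: int, margin: float = 0.0) -> bool:
    """True if two sequences are too similar to keep both in the subset."""
    return percent_identity >= identity_threshold(length) + margin
```

A nonredundant subset could then be built greedily: keep a structure only if `is_redundant` is false against every structure already kept.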
12.2.2 Demarcation line for similarity
\fr ùi RMSDlrùcisù$d*arofc.
rh
Es (N) atrd i cùfle drawnîo spdre he' (ronhÒ,iolÒgÒut &o,n trdoN xiórù1i6 Typi.rùx ùc RMSD ù.moLogÒut onhomologotrsshcMs (or rnetrib) tue f, rì,r cÀamrr. 999i of ùc fruc\ frrr aho\c
r rom( + r(N + t'/'. whùcr is1.2ù3.
12.3 Reversing the Protein Chain
l'*u-ULns;#,t'
dfr.
rcrP
ù (ìnpbud16 i- ,,
r ùlde ùe robc n1rùùe i$f(lr Esidwr {úarRNrsDd0r dhù 3rd À (r rcr40 Éldùr Nd 'h! ph}@cyuùÍinry ocri5
(rhc L s !rc dìfriùL' b nahhin i Gndonnylenflrd Lofismysp$iflcovenllsì'ni|aìtyl
shdorcs ) wrrd wil be
iy odrnabLtùlLnearingÉ
coînkroqs bc'wn rhnÀ in,I
I lne d'h.ln (rnrl $@'( !c \(Le).
iab)
ommodds
and lhc hard {,f
12.4 Randomized Alignment Models
ùniqucrssol rbex$w.i an advrnùs
u\onlb ealigúìèr|sfo.a Fúro|enicns an b€sdpld
P dli'elh$c j,|Ùdùs by
rp boùidàryonib ìoser.dgÒ(rayb, r999a). Thh!\redfÒetpi$'helimil,i|IlAile
i sndis RMSDamd b. rÒud.as.l
renr (rùr 's, oic *iù a to*, if Nr rìlimrl
RMSD) thùc n .hq i eood.hà.c úr
of ir onî m;Ds iircrion ( \uùìiiF rhd rhero{ kÒa b$n
12.5 Assessing Comparison and Scoring Methods
To b{ ddduÉl (r seqùúrid) corgei d!ùbè* (rc! chxPd 11) h!! bmome dd tuidiònar sintirìry). ìPDBì
Thcn erch shduc
ù PDB ùcrsiig rmr
(€qùid.g Lrìdùe sofe msr
be codrfe rd eeh 'lùèfy), xnd ùE a beNed'fol*mplgbystrrili'y/spÙ|n.i'ydirFams (Bieúf
r àl t9s3).'ahc pxi6 Ér th dsl,.hB:hoìdTforx$umedhomology poinÌ 'o b€ 'he *hse rhe !ùrnbùor útrhoEdosou\ pans(ds pos.iB) oosi'iv*) posilirù\pùqDy(.rrhdcm^pùquùry€PQr)*0.0t
dî
rheretùef
12.7
rìs
ùÒùrd bc.qúr'ÒrPQ.(0.0r) d r. rnu urctrrab 'hc sn\i'iù'y (hc ftldio! oa ùd honoìoss rhd halr , vitE nbo€ dÈ dìrcshord)ad 'he dìtrftme beMÈi rhÈ, úruc md 6e EPa. one en or coune do thi, ror difieEn' uìus ofEPo.
600 ifuufcs $oine syrenú) 'o p€rromì .ll by sll o,
1ve us ùo pmgrims (ad
Plhabsrspe.ificityalúìispoirl,lir. scoPi5 dso 6cd À a sbf,dd for ass
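The assessment described in this section ranks all compared pairs by score and reports how many true homologous pairs are found before a chosen errors-per-query (EPQ) level is exceeded. A minimal sketch of that bookkeeping follows, assuming each hit is given as a (score, is_homolog) pair with higher scores meaning greater similarity; the function name and argument layout are illustrative only, and tied scores are not treated specially.

```python
def coverage_at_epq(hits, n_queries, n_true_pairs, epq=0.01):
    """Fraction of true homologous pairs found at a given errors-per-query level.

    hits         -- iterable of (score, is_homolog) tuples
    n_queries    -- number of query structures in the all-by-all run
    n_true_pairs -- total number of homologous pairs in the reference set
    """
    allowed_errors = epq * n_queries
    true_pos = 0
    false_pos = 0
    # Walk down the ranked list until the error budget is exceeded.
    for _score, is_homolog in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_homolog:
            true_pos += 1
        else:
            false_pos += 1
            if false_pos > allowed_errors:
                break
    return true_pos / n_true_pairs
```

Repeating the call for several values of `epq` gives the points of a coverage-versus-error curve for one method; comparing curves compares methods.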
12.6 Is RMSD Suitable for Scoring?
.sienihcarccofRMSDvadeslilhlber'peolpfttiisì
rxúryì'lolÙmpìqgidEsmlbe h rh,i 'ìof m rìc corc (bú Noghb nigh' be uscdin rhecilcdarion). n dos noI (for rtcoÈnt)
peniìize ci
showtrrtúr ihc óúù sonig ryrèns suajty
12.7 Scoring and Biological Significance
othd nrocture (or vith a rpacibri. sledìotr), 'bei sors vin rrl1
ìFìrìercc
of rhe we
which soB
rdr obviotrdy lorrftd
pain. Brwen
ftÀ ÈobÈn
ffre'nci* Òftqùd r.rs$ (è lbovc). olhss, $ch s 'he DALr nÉdìod(chapd r t), hrve doptd , siùi|ù,Pptrfh bNd o
npirciple,i'shù|dnifktrwcùÙmÒùd
12.8 Exercises
uk thispmgún (orùo'bcr) rd.oi shdstr.h.ùb*s r'ormlnmunl€rce.
G) uE 'hn b edimderbenunìbdor (t) cln yourropN n eÌplrndìonofr biÒrÒsì.rrD.Nredsetom jr, or n rbeE dabasofs000sddÙsAnirlby xrr.ompùiiÒnh dorc, 'nd ùè p!4 or srudús mdsed in a tin $ftd m iicFaine $oE Gsùn're dú 'r. lÒs rms bcr) cc lir ìs w,*ed doNn
12.9 Bibliographic notes
rdiscù$ediicodrikO996) ailryrisofúd (|eea) d BeÌamou n rnd skolnck c00 r ) Rudom ruùnì mo.elsft disusod by rayror (Lee7a).Dnson is dsdib.n ii Avódi rryrù ttee4). asssmeft "ld d.$ribedin Brùù ersl (tse8),cc^bù ald Levin (les3). Levin 3ndc.n'cì! 09e3) od srrk d ii (200r).compùirc ùic ir L.vtu andccrrcii 0993) aid fnytor
some*eb sddrts* rd (rÒlshdins) hrpJ/lffi.rccc..duÍ*àrcb/Lùbs/dùihÍllJ!0irft v j]' VAST,trpdb,h'nl{dda h'rpr v"w..bcjprp$irhNbse/rìo
13
Multiple Structure Comparison
The
n{ei
ror doiis ùuuidc rruduÈ conìpùisoù is ùe $mc * 'n! doins motiipìe
nodnqeùsishtlhmgaiNisesjiìjìÙ
r. coshrc' nùr'ùrrcùrigí,rcm\ dorùllfsqpúoithefrucùainrbeinr
orof
r. Dncowc'mnon rúirpaftns Gddm norìft - a smdìefpartofn\e$nrc
a lùnì poinicourdaìsob€itrdrded:Endinsrupù*.o
ù] nÒúli Gonriruc! or
Thegrppmahsftstmelyrc|xl{w n .od al$. blviq founda conmonI)drcm. 'qffrudinexnd'iple.lìgnftr \ihen dislusi,r (xnddevelopins) ùeúod
j.r $quo@ qrmPùisoisre eirended.
13.1 Multiple Superposition
|ÈJb@rlri6$ÚdvRl!}|ú
dìmr ore rnduF is bisÀ (pivoo.i d 1..t,{',....,.1-tbcrhciNdutu\.andA' (lfhout thc ba5n Fi*, rheceniEsd maf of arLrhcsúu.b( be ro$ orssrc.niry) sùp.Do*4 só 'nd dÒ rdr.\ m I aÍù fhcn br aí dúde 'he;rhúord'de sd oish.turc ar 4d 're ùdsrrdon (denned hy rhrcèvùlù.r.îh.irorerh rrudù
II,',ot;r: -.,t' nr ol coqdùab s' i Òrrhdu. i sÚÚm lciehr (rr){idrposirion sciebr(1,).
13.2
T:i-\ at)R;^1
ri= q,
vhde .R; h rheop'tmaì Fùrion mrn
p:=at^, lrp'pov
fof drcctuE ,1/
= oE aJtrt rnanftî rr ntuc'uE' ù ^'t'. nndbs tt'. nùtb'
@rùù' lù |.' u|ú+^ ùntir RMSD(/,'.
na
\i
ùd.ì6
tRj'l
Ll=t ùi R:Pai
' . K or n6iùút\ )
l=,.,j
ttDhù aJqcts pefanìd
ÀfUIÌIPTE STRUC'IURICOMPARISON Àtiù nÍi..s
t I
ii I siddbms
optifn,.
L",
{heE,li5nqeilrrtfoliheshcfuIep
13.2 Progressive Structure Alignment
e 6 nùì,iprefnctuE aitimm' n b Ne om bedecmìjnedbyfu!qìculdiJ'edìb a{xnstùelyhiRdbw,rsL.ÌPi nerholsm *cnde'l Geech|'p'err).r
uiaìe or a coDsenus squence(of prchre), rorBc sh.n riieriig (súbsollisrn.nt ssa?(sa?cmalsobeùsd)(cbipde),itrd dnMUufAL(Tayìùdar 1994). FjA! ùrh p.]l oishdurcs iscompftd (ùsiE sAPorssAr) toass sI pÀinisesidkiitis 4d rhènùù p,irui$ sìúÎlfìriA b4v hd rtu ndd siril,r oI úd* r *ch shsÈ(re prcersiotr .ricmeú). Noc úii 'h. oI mÒrsinìlù Gùbsèo,ligrdhb àligicd rifd. AlsÒnlim r 3 2 rh.ws 'nÒpfe(w.
P r c s 6 s n e i i s ù s Ò r ù c s h d ù s 1 1 i . r.2, r. ' 1
rori := I ro,, doLr:= n! I/,t cnd tof ?uthpÌn e . c\ e u doatiet lc) ,c^) îonn,ltheîae úd t.tlct , c, ) hc thepf
n1Ù \ih
ù3h4t !ar!
c':=.otrenr^lcr,c\
r/ =(t-./ - c)uc,
ford.rc+c'eudo
lì,à ttE rÒE ,t tlienih{ lc. ct)
ù€DtìrÈleb^ (di\lù.Ò
mùix).
li ordù ro dain
bdNcc! irre É.crbois
a
usqrs Òísù$s. rd !;i be'her&n rriemd úo'$ bmiH
(,' is ùreNmberor
r:a t* =; Ltîi
13.2.1
aÙdreoth{s:
,_, nclns b|xik)]
rl rj
Ai
o r lcngrh ol livc Dosiriois Ld dìc sùond v<'os (vedo6 rmm Ìhe scond P$niú b
( 07 , 23 , 2 r )
AI
^1
-6.D (5.r.3.4.5.3) (r.6.2.5. lj.:1,3.7,5.6)
!l ij
( r . ó . 2r , 5 ? ) ( 4 9 , r . 7 .5 . 4 )
ai
at
ndkvhrch.byclìúe.njgltrkkidÙ'r.,r
I
sS^Pi 6.d rÒrxlcllxbr sìniLnty lrrec m'on (h.n.c ! prirwi{ s.onnes.hcmcn usù]) rhÒs.Òringdcp.ids on FÒfú. rrw rdd s.Òfingfrdfir u s (stuargofiúm 9.2),rìc sofing bcrson ù.
Nbccd ùd,m.oishtrrs,14!
= (h
rr: ùd ù is h òlcnlL.onsùd {cighr. ehiehlad$orinem' x (se chapÌefe).
13.2.2 Construction of consensus
ltdsùit!
hdtuú,
útt^tqt
tuttdls
({hich c
ouBidedìe sope or ùis bNk)
i 'hc dis'rner ofihe con*sN 'o tìe(ìstrofiic rhcdi€dioÌ d is dkhnce),/(.,. q) = rl, àG,, a) = 16,,r1.r.., = re. Tnisis = t2.5, 'trfo'is!Òf ighrÌchi4r(r.., rGÌ..r) - r5,r(cj.!!) = r?,lhrchnposible.
13.3 Finding a Common Core from a Multiple Alignment
grps itr dÈ iìigmHr
(Ior.raùpìe, Lr.sùib.d í s..liÒn rI |). rlEn @h iodufc bevÍiabilìt}hlheposidorsollhÈíorsfc
13,4
ú
rhèmrùÙu wiùù, gip h,he murripìedjErem
Ar"ú€'bnsi.lhdúÉ,JúcLlfolú..miidde,
t
tbÈhold îor !ùtubility or ù. dÒ'i pÒsiriom p := a: G| )= nE cotun[ af ti aL := ttt dd$ ú !tueÈ at eumc ú c' c ! ,= ttE @q4. !rctur qfa| \aít fo|echsùb:fucfure^],dr 7ì sùrut'di1e ta,p,c ?) na,t rh! bnio.tutiù tu6km At Bùsrl Ior eaehúhnn ù M do ú|1'kie th. w.ùbitit, Òfth?pditk^ Òlthedans o | 1- rhè.Òtttn in ti th.rt thèvtidbitit, < v aJq-d6 ptom.d
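The column test in the procedure above (no gaps, and positional variability below a threshold) can be illustrated with a short sketch. It assumes the structures are already superposed, that each alignment cell holds either `None` for a gap or an (x, y, z) coordinate, and that variability is measured as the mean distance to the column centroid; that measure and all names here are assumptions made for the example.

```python
import math


def core_columns(alignment, max_variability):
    """Indices of gap-free columns whose spread is below max_variability.

    alignment -- list of rows, one per structure; each cell is None (gap)
                 or a 3D coordinate tuple in a common frame
    """
    cores = []
    for col in range(len(alignment[0])):
        points = [row[col] for row in alignment]
        if any(p is None for p in points):
            continue  # a gap in any structure excludes the column
        centroid = tuple(sum(c) / len(points) for c in zip(*points))
        spread = sum(math.dist(p, centroid) for p in points) / len(points)
        if spread < max_variability:
            cores.append(col)
    return cores
```

Runs of consecutive indices in the result correspond to contiguous core segments.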
13.4 Discovering Common Cores
An in.ui!!. wàyor èr@ndìrgúc pri 'o laqer nulriple c16k. Nine r senùil .lDsÈidg pmcdùre (s chap'erI l). rn DgnÙì'iplecìsÈnig,sP*idlyúcbc roton.ilmighlb.wonhv|nleEpkciJE'àe dscribe lor nndinsonmon coa MÙS'ra (kibÒeiu d ,1.2001).îìe geomdnc chosnasa ÈtsE .g. 'heorh&sùù .ar
ìpleFd nr.h c l nDlde lrigdúúl ol (seomdfnal 6suEs, om ftom ea.h sdda) r qium^. îc 4 $brrudù6 dcrìn.J by ùe doN ii rrìcdigrmd ùc lppÒifrrtrry \inrxf (.oisruùnr.
so!rcc n ùÒ Pójcctron ot a nlrriDr .nce. The rc! î. ror ercL Go'me,rr*nce). is r ct or piir{is d úrchÈs. prisis clusedie pRÌdue (rsi',c d.'nùdscxpllii.d ii chlpb rr)b rî 3. Mlte nolriple sùbdigmer6 (piopoirs tn! órcr) Èon múu, q$ircí ÈúNise$biìigmfltfronaìl'he$úG'
13.4.1 Finding the multiple seed matches
ror rhe pMmrer i (dìe lenlth or the Fd mfchlt rhis trirl he à .orpfùEnc bLrlEtr sc lnoud or wo* (ircENrs !Íh i',cFriîc r), atrdÌhe nrmbero|nor Ndtrl Red na'.hcs (deu\irg *i'lì ircuslr-a r).Lsboqi'zdal. sct = 5 in rhenc fon! (d!è rftr Èmhresidùt .h be.dmpr&ry sFlirìcd Gp 6 tas úiDj rchtion d 'ninor inúgt br 10 ims dishîcs (sc chapù 3.r). Howsci by usins tri'tr jnrefdisatuq lod dGriitins rhcrypeorsymmc'ry ('hùc m four+ccErcEjss r),
thùt cq{.on!
\34.2
od rft ùch dÒh (c) ,ded. cjedi,ìg tuprs o$isiinc ofonry rou! c ùornrolrh. riupiùs úd oròrd inùidry
(úir ùdcnrs lùid bc rdr oo! 4d ù *ould not ic.cstrily 5c dierm.it). a etr}lypeThdefof.il|rfupìescoNdrúine
trìis tuPlcs rrucbE' ÙPLcrì:
L r r , G , 5 , 3 , 9r 3 , ) : ! r , ( 2 3 , 2 6 . 2r r7. 3 3 ) ì r r ,( 3 r , 1 5 . 3 ó . 3 c . 4 1 r , a r : { 7 ,e .l ] . 1 4!,7 ) , , r 1G: 7 . 3 e . . r 3 . s , 1 3 ,
13.4,3
Fmhù'iorc. lci rljcI rùp[ a | : G, 5. 7. ro. 1] rrcmùe Erefr.e belrshedroùis
'ios or 'h*e in ùder b frdd lmd nùf iplè
13.4.2 Pairwise clustering
an mùftipreFd nr.b* fe póje'rn oniosúh (of rhe,r - l) paiBor (crercmq soùe) sivins rof eachsùchpaù a sr or pd i* rld r{ch6. FÒfrhccrampte !bd{, rrc p'ÌNi$ scd mfchs for (,{'. n,) nod ú. cÒtr5idscd húbr ùe rherc ft oily Mo difffil Fdjdjms fof /r 6om ùe tor mùripìe$.d narhet
Ther for erch (crerctre. $ùre), clùsbi
!o r.
i Jrùo g .ì Be;os n pejù
13.4.3 Determining common cores
\Drcc a sr ofclufùs (rlismeis) oodrcinglhehighs!$Ùbgnul'ipìe bc $npúrr wiù .sh orhc( a pm.Ldurc being of od{
bdween
IIT=: ,r. wrìee ,j h 'rÈ
bdigrgjrcrcdinÌhebucresfDmwhkh d6br c . @trhcbd fton (amoDgodrs) Pr, ùetr . is rghrrcd in 'bc búkcr whùc 4 is son,r rhis .rn b. Nd 'o dursn. a corc (md'ìple ilisnrnàr) shoùldcoorsir (pd of, dep.ldirs Ò! now ùc
c: Pi Pj .i..i cr*tnrs
,].,: c1.c:
/'t Pj ct.4 ct "j /i.4.i..i. 4.t: .i
,t P; ci.cl
is bcìrg dÒodar.ìupÈ rinn rheEftrence. úich ììd simitd sbfmdùfs Òiry b snPmc Gd gfouP)durcn qhi.h
Fnr, !1u(ù\ ú,8
ieP reo) whrh c
suruL thd \c Ìralc rhfccsourccs(,14. rìd 'hr roft bù.k r gd iìùl'rpb ced nnlhc\hr4r.dùdlnbyPJ]jiTa
rb.dsrîs cl, afd $ùrLdusrc*f
rcdùd
cr6Èf for,4r. rh d6rF mmeaEdibrÌnrM'ÌoD e. rrcmbùckd, (cj. ci, c)
ùid{ci.cl. cj)
rlre ùusdion
ofi - I durds (otrelionerch suro.ù di@
Ld rhc cllica
is fofd
trov b. dotrcby ùsiis ùc À a candidre fo. a comon
b0 (f m soure! 1:. nr. ,1a. .,p-ù"Iy.
Ìh" fr^i -.b.^
d ú"
= { ( t , l ) ( 9 , 5 ) ( r 3 , 1 (1104X, r ir,l ) ( 2 rr, ó x r 2r,7 ) ( 2 5 , 2 r ) ) . cl = 1C.r3X9.r5Xll, r3)05,20)(2r,26Je2,?7)125, 1ZJl27, 15\. ci = {(r.e)(e.l l)(13.1rxl.1.1ó)(17. ?0)(r:r:. 2r)c5.27)(r:7. 30)l
13.4.4 Scoring clusters
RMSD vtrìu$, ùd rn cvdtrfioi oî rh Lcb drd
coFs shiftd by sùbs6 (no' ,rì) of
13.5 Local Structure Patterns
prb$
n rharlmy of rbenor in€rring
ùe îàturs orrcnrakerherm orrmdw
ssÙ
be$ed {hm rheyre rosd ii nóf úù I Danol prcrcins.somcpfuìn
tulmcmb$ rhe aidEs lomin! a s
.r píbrbirisi.) .hior €osùr
cal,sqmnce ptrtrenN(d.rmini{ic $rh frucuB.
!scqÙci..pr'onldono.gjlclomp
n.ÒstrchpÍbms aÈ.!ììd t .?, P.ij4 nolfs dsrlopedby Jomsn erd. 0 999,20{où) lrc tu,hdjr crprai.d ùpb noN
13.5.1 Local packing pattern
rorso,ic , > r ùd cIh u (rpie
ii! a biduo cir
vhrturr.r.:r !efti rumbsslcoo inart.,u, h ù diio kid mrh sr (M, q ,l ùd aJ ds.nbes .oishirc Òn ú Hpsriverr,d heÙx,/ sìÈdrìd rmp (o ! 1!,b,4).rnd.fiinrcnanÈc{end cd 10jndtrdc orherdnorcÈ prcpenies(e.g.expo*drbùncd)
apftki"ep!'bn F - (q.rL.:1.ùr.q'),,(r,,ì.:i.iri.di)nmÍchedbyi ido$ (o . . ,,) in drepmEin suchrhat | úcrsidrs!.
. !r d urjquc(ceh o(m
ontyotrce)l
2. úc rniùo !.id of a, n ii ú. ndch sl ,ìr,:
,di deino
er rmm rriclninÒ (Nl ro rhc lubóry (c) ì::
il ? {]thii ]tr RMSD of al ddr f ( ( & j , , : r ) r o ri = l . . . . . r ) . c sd orqìodDrs (r.ì, rof aardue: rrìcsidc rhdi lro'ns Òfrhc Íridùc ('iù rhù qep'ion of gìy.inc, rof Nhjch 'hc d.rbon is ùren).
p - l 1 a.3zo.t4 1,F. ax 4.4.5..1. r5.r.c.d) ( 6 . 5 , 2 6 ú. 6. 4, , 1 r r . / ) ( e e 3 50 .Ì r . 2r. , 0 j .
ap4*4'dif is x p{rìDs prfcn P rhí hó mdipÈ marhs (@uftncs) n n$. or (or lnbìn oDe)prc'eìn fnctoEG). PtrckiE pdrms (ù nÒ'ifs) dc\.nbc .rùstr*
13.5.2 Discovering packing patterns
5 1 r r . . , , 4 4 tú. e ce di\lu.c\ bdrcd ,ll p,tu orEridu$ iosìdca srudure r A,risrr,lrrdd
NrL or i rcsidre
! a l4t3tàoúrool !t4
Ns.,(of n.i-!hbnùrnns r'o'rhú) * ùe Èsiduo, and drn rquiremd Tte reiùbN
md
t o ìs rhe @r/ior d n\ rcìghboùh@d f.ìng Ns/.
'be stuctufs (is mrchedby ai ìed r rmcùa,
rte genenìizÍions.rc pickins
3úrrÌed (fewer fesdus) shdurcs rrc
s D î +.rb.varrilp)
rlakèrtishbò(. !.i4! tú aI Esìàu4 iÌ an !rud&:
P:=rnhed.jE,lbrn6ù Pf := DF probesdlì(p)
u
E acE G4GEA Ir cin becenùRrizdlro tr t]ig{ijnbfilhereilhboÙhoodÙi4.s
c$rdngh ù, Nistrry sùìr3acsT9Rrw Tbcs.mh h I simprcdcpri jìsr s.hh ùcd ù findatìgcncnlizrioff ofù.pDb.
r a p en P (eqml ro dÈ rtrchor)s alì ncighboùnnngr wirh anchorcqui sb cdby lppddirs a e\id@ a 6nmùe plobe,fomìtrg Pd (noÈrbd a dos nor P ii úc doidiboùhood).Îc naÈh.s 6 P (M.) e aDalysdrosÈ ìrn\eycm be {hedrerF a hr 6utrcieÍ suppon(Gcm ù enoughsúucùrt Í n do*, ir ù aeRii
Lerhe neighhoùs€lEne oi thepmh€beÀcEîcqEEA Ld dne odrdneigbbou *qù.rftr bc DELWgRTÈ,lL@RqîrEc. aERmglrRla.-Its rhcsqucnc. ot x pekirg panencm beB!GrE if ùe dyìng'hemNtnseivdbyÌheNtri€
13.5.4 Scoring the packing motifs
RMSDlarus. rhesorc &kodlpcnd5 oo'rù si1joi\]ùrids|hcsNtiùts.lheugltrf
s P ot=
lL
rlhùo Ì(P) is 'h! inn,nnfior corGd ùr ùe fquence patr.nì (chQrcf 6) ,
13.6 Exercises
rd3 posrior'Gbove orberoN dìebîs.)uiiqùdydriìrc'hu4ru0tc^s rryte
s Ga[ ir MS{) h ùn eEEne ee
MSuDSan iiMSAnekrysúrùÉh
(a) Èapìrinbowyor trilLnpr*í
r
(b) EÌphù wbenandhowylu*in gedùÍe rheconsèNus shdùrc in Msup (c) Dicuss hÒwsupsm .ù bedpandedb Msupsm in | {ay sinilf b Thèrirb iiÒn r mdhodror 6ndineì tufc. etr bcùx.li rog!ìdea paisi* p !n úè shdtrG) is. rof oraúpìe,ued e iobr{€d ro a mhiple.lìctrnHr nbeusedbgljdesAlùdMSAPb 1r) hPoE a vry ro $e sPnh pinfln (b) How.m (è.gsP!,f) PrM\ be (Esds2){ ftyingsDd.?
e shciB
b rheEkn RMSD
wb.n I prÈú ? is dÈidùl ù\e 6N a n ,lMrr
kr Pa (or a P). e€q narh ro P G inv6{ch Pa (ora D. rn 'hisamlysis, ù$d GliFcd b úe o i! !\e pa6m) Érphin {hy
(ii Ìlè positiùdsin rbepdrd,
úc ú fc sho!ìd bo .hars.d ro lvoid rhe
wehfl e rhepmbeacDrcr16
(dìedchorn rùted). údn\eneierboùr
13.7 Bibliographic notes
| !1.(r93r)rhd scs |Î$,is $Faosiriotr iu€ fúadey oeeo).shrúo d d (ree2) ,id Dbnord 0e921, d*qìbe mdhod
t4
(nbrn inrlyktrd j (r994)ceErratrd Ldir(rge$)N.rsinirùbdiimpìef merhod ThcinutriphàLignnc óf sdiand Bìrtrdcr(1eeo),Russcrlnd Barùi(rcal Diigdi'1.I eel).id Miy iid rorsor ( 19e5) !r forù! rhcsliìe gaÉmr.Leo
Pro
borb$ rs rh.pivÒrh,1d.lcnion (E\cilicrd at t9931l(mtì d rt 1996) (lee4)ùsanrl aodzirud skoìnrck (ildftdùiùrùs)appfolù HoÌridaymdlviLler(199?)hsdevetop.dImdhÒd ii (!!ya ùdThomkÌ ( 999).\IrlLxce pdhsdls.nbiDsdrlesirs:
xrdrvouionresr).YdaÀfhssimia. ronls (FFFs)is pÒpù\dt by rìúnN lod s|órtrick( tees) nÒr ù6 n hibolirz eraì (1001)T$r (srrú5rycr rr. 2002h1 ùid MAss (Dfq a !ì 2003).rhelùer bisedon ssEs.Borh sParnd$cohediironas$nerd(reer,?000r).firdigpútrmsffodmukir cùfcir md Alrnai (lqql) ad deRiialdn rh (ir ddnioo b sPn() {c ù \yaÌo lnd van@(1993)Sudd 0999)lidCo,*di,r.(r996)rcpfsÒtrrheshctursAlirer er.phs([era sèqùejd), ù!c lcÀnD o et rì. (leeeb)havededoped a p'!em roPs srùfs (Fro$ r ar. eel).
14
Protein Structure Classification
09ó9r in úc PDB or Jatr@ry2003),md b be e€snble ùìs ùJle Nnbù ot dssiiid'úoi d donomy or 'hc objec or úèir evohrron(a shctur is roft .Òn*^cn rhlDsquence).lî addirion,{hen a úc luncúonof ùc pÉ1ein.Ift n disco iI h. .*i.f (rìnncs{ch)
r\ dssificrioi unir.îiÒn daùbÀ6 ,
14.1 Protein Domains
(dilìihr
doíùis
or r piobin ùÈ Òrì6 x\wùd
virh diíeEí
rùlor rùn. úùghry. !î rdsidtr6 (ùdrurcis trda domrin m:i$
fundiDt
rhi\
ormc hydrophobE
.sidùe\rg 160Íd290 165,donai!2o1 r6r-239,lnddomrinIof 366i3.fte
14.2
coF, wirh Jierìdl úiiìcrd fsidud tdonlin n 2er-3ó3). r r8-i65and s@Fj!ù€ r
ior (nùnìbù or coùch)
búwcli thc
t s{ondrry sh.rúrcs (includiîgp{hÈt) \holrd.mlv oos hdweÈìdinÙdd
14.2 An Ising Model for Domain Identification
rlylù (leeeb)prsdts r bohm'ùplpprcrhford.rab
idenrih.rtidn,rlsousingrhe
ù Ged.Crhcmoddr Àsa. lsircmoderìs i ì ad h;s úodd úN's6 0r ndr*, {
n" *n, *u. -.r' i"a.,' ;t,t"u*o.
h oùr iDDlic!.ìoi. ùe mdes rcpÉrrr *t,"s. Duriig dreirrÍio!, n is hop'd rrnr Esidus *"" r. w *..i. "lo.*n.a u.r"^"t" ,J 't'" .." a"."h ,tura c" úe emc sút lrLueorc Girìpre){av of
rcsidus.qhiú a"1""nl, ;, r-- *a .,'a* o t""r "t ib (sprùtlv)neighbounns -ì d*.".d
ifuh" *".a
r, ì*.
. ùe jnnid suk valu$ (he $l$
ò;,,
srd ùrchanerir cqmr.Îo
r rmrìns ro
ol .hc sùbs ar ran):
n. ' ebb",, "
r is rhc úc
tp1€sentdss-{! 5:,' ,ql qi4 o. tsjdue i. rhen a Ginùùhloùsìv) chnsine ndd'oi Òù tr rdeú
=.j + is re$ úf
,lì.
"ri \;
0, .nd 0 if rhea!ùncil
ir 0
(14r)
l]re rumriotr I iscdhd ùÒÒùerù!tu
F,
wheF 4j n dE inMronìic disùic b..$èi n ùù inlersd dnkne r/4j. md r rhe
ii
0..
the ú crboN or rsidùes i ùd j, p|
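The iteration described in this section can be sketched as a label-propagation loop. The sketch assumes that each residue starts with its sequence number as its state, that neighbours are the residues whose Cα atoms lie within a cutoff radius, that neighbours are coupled through the inverse distance 1/d_ij, and that every state moves by one unit towards its weighted neighbourhood (or stays put when the pull is zero), with all states updated simultaneously. This is one reading of the update rule, not a definitive statement of the published method; the names `relabel`, `radius` and `cycles` are illustrative.

```python
import math


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def relabel(coords, radius, cycles):
    """Iterate residue states so that spatial neighbours converge on shared values.

    coords -- list of C-alpha coordinates, one (x, y, z) tuple per residue
    radius -- neighbour cutoff distance
    cycles -- number of simultaneous update rounds
    """
    labels = list(range(1, len(coords) + 1))
    for _ in range(cycles):
        updated = []
        for i, ci in enumerate(coords):
            pull = 0.0
            for j, cj in enumerate(coords):
                d = math.dist(ci, cj)
                if i != j and 0 < d < radius:
                    pull += (labels[j] - labels[i]) / d  # inverse-distance coupling
            updated.append(labels[i] + sign(pull))
        labels = updated
    return labels
```

Residues that end with the same (or nearly the same) state would then be grouped into one domain.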
PROIEINSTRUC'TURE CLA5SIFICÀNON islesrlo r0.,orif rhenumbsof cycl6 . chÀii.îtris gils sutì.ìo! opporun y ror
ei havins refonwins uìues in rhE sùc * s i w c y d $ :1 . . . , 3 , 3 , s5,,. . . ì ,t . . . , 3 . 5 . 5 , 5 , . . . Ì . 1 . .r..,53,,5 . . . . tr.h e n' h e r,5 \ )Td r$ic.r.\bmnef ..'.1 ..( . )4ll " h B ó , b . r i o.rvcruled rhc[i.kns iN ouf. Ttelsi4nodeldesnbed,ùùghsinpl
14.3 Domain Classes
The core of the protein is made by packing of the secondary structure elements. Since there are only two types of SSE taking part in the proteins, there are only three types of pairwise combination:
ro dred.frDiùonof 'rnc (h,j n) d rss of dodain.: maiDìy{, mrìnly , andc ,. 3rì{. ùd hdopusry fof /r. rÈ s-, cl$ (sc bcrow)Ttc bofdù b.rs.èi tu rrsÈs rlwysú'rghfdwlt'l'F1gÙÈ14',shds
14.3.1 Mainly-α
donaitr
dìd adjcú c heÙcsG om*d by !naìì4rhù dlèn.irìy-, rndc, dom Ò1/-srddsG.s. owiq uPb 5%/r4radt.
naùry.p(Leb,cishÍod,ù
'rcùdn,
a/t ([v) !Ìdo +t {id1). (DDM bi scot
14.3.2 Mainly-β domains
The mainly-β domains consist of one or more β-sheets, and possibly a small amount of α-helices (e.g. less than 5%). ftcy ùd ric mmbù md difdion of 'hc p luftherclasily rÀe/ doraif thu thec donaiN
In some classifications the α-β domains are divided into two: α/β domains have a mainly alternating arrangement of α-helices and β-strands along the sequence. The β-sheets are mainly parallel. α+β domains are more segregated into two (or more) parts, and the β-sheets are mainly antiparallel.
14.4 Folds
ci6 e rrcked (msed
iD iD ùchihrùe).
allnù$dofditdenr rotdr (,n4 i!rca y is rìrirè numbùord ìLrft pIoFiB). rhar orly $DedúÒeo$ibìgl)rljngsdbp.logic.rrrógcmmbreobsddprcbibìy .Òn4 6on ùe physi.al ,nd chenicsl .ÒNrains on rbc .h!ìn. scwnl pe4rc nNe ùedlopldEllhgnunbÚÓ|diflÙeÍilol (rr 2002) 'hea de hund 300 dincfi. fotds. ned ro havc I sÈ{d lllobrbilig ofbr!ìrg i comùn arcsror (bdng hoúolosour. bùr ùcy mishr aìs hrk rhc smc fold dùe ir'lft ancsron (heìngulosoùr. shdùrs
14.5 Automatic Approaches to Classification
samphi by.urúinr rprhenunb6 ofssEsdhryiiE$me prcp.nis, ùd tookins ]do$bcdinpEvioÙchaPe6' lte dibrir ù.d lbrchsificarior wirLc dopredr.ìnilar fold b(rN iì fep,sqN J cvolurion).Fd rúomric clNifr.nion. ir depeods konre. ù' rh. cú.offs usd aofpìaci4 sdrùc ìi ftc sme erup. Îhse cù.dfs hoc bèn demìned enpir,
14.6 Databases for Structure Classification
rKrur rssP.D.ìiDD,
FSSPh a ford crr*ficdion
dÀliiì.xri.6
!e Gn resibte
bÀed or sùuc'ma@dufc
vir
4 su, q(
f.'
,i rrisnmcí z > 2 ae issn.d
dÒ diEèeÍ lbìds (rwo {Íùdm\ ro bcLou b rìs vme roìd rypo.
Dri Doniin DicrnìnaryGÈ hltprwl2
ùbtuioiî!
ebim uk/dirrdo'ilii) d$siiìado
hon',|oeoNgpedmìì|y'*+oGhmilr
har/sùp.nc.rú úm e.urhop/.
14.7 FSSP-Dali Domain Dictionary
(2 = 2) @6 6rd
Diri Oded puEìy ondE rD coordiD!'6or prcÈint. rhe Aìignmenstrè cotuhctd rù sch pairandsorcd by lnc z vlìuo{u1c (rlcn€c linkqc) ispcrom.d bNd Ò larue of ùe ariFmcí
ofb\c Pfobìn$ N
I < z < 5.cùtringùchcdrhcz w e D,riFssP i5ahoúPr€ù\ of Prc dominsm d n óÒùrdÒmains.'lÍ. fiv.b'hco'ldsha.bstd.hìghljIm. sionaìshapcspec (by a m.'rod rcìaÈdb pn.cipaì compón.ntaralyli'. Fir do$|y PdPul,kdÉgiús m idù'i|ied, wlùh 'hey.,r| dhb6, m'ìdy. d/p'
dr.l. rr.d. oripuùet , bùcL {d d t, meindei'Ìì*e ddtu hE dases.
wi* z vdùG (ùc oLi'gc or
atr prin ir {hc dsref) above2.
14.8 CATH
Fndyanorsovù50;àÒnh.pf,b
lis. 'hor rorshi.h 'bcscigtrihms asF 'o
1t,8.4 d'ornricrly Àsigrcd Gsùg the mdhod by MrhE d rr. ( 996' iaonii'g ro ihe I {iùh ùè (ruduE ^ $an nùmbtr (utr'lù r0+) hó r, b! mrNally dasihed. 1ìÈ rr d!tunrs aP fcFsnred ror).Ior rìidin! ú. b$r rilk rcllrcs
by srick tv-,
conp{silion. nìe sond.ry rrudm conpo.nìon rcprcsflÈd by dìe.o'r'rj úc Èríilc numbs or residùsrarììdgir crb or rherhe sam&Jy rrudùs:
'heryps (heìr helix,beìix rrund xndsú!rd-súard).rrh viìì Fned vbef ú. ssEs í mcNEd À rhenmbtr or cd rd di.à cs rc$ ùan a givencú on e gps of rle irhrins ssÈi
ard ìc$ rhin 5* , composiÌiorb ddirioi. úae prcrì$ nu* bive not rhm í% a{ ud È$ ùm 5ri / pmtr 'so mdtr crr\rd\. îù a-l (hdlRr qn be frùi\er sbdivid.d itu o ùd, and d/É caGcodes ùsirg À sotrdq tu eerenÈeeor puÙeì p shds
'hesienhrjdN. indepe'dd' onÒpÒrolfar n\emomed(odobeL2m2)'r.rc rc a3 diÍi{ed dhiÈcúes. thes m.uftnd tior of ùc scondaryrrudÈ dmed Ìì kìown rchihrms (è.súù beú.prcpcltcr
14.8.4 Topology (fold family)
sh4ufos ù gÉup.d inro rordFanìiììes arrris reverdepÒdìrgÒnboú ùe ovqax shae.údotlddvi'yol'hs|@odaly compùisotrsigonrln ssaP (sc chlpÙ 9.r. Pli'mdds ror duf{ine donains dr,hùk ShduEs wbìchhaE aossAP sóf or?0 sd wb.€ d r@r 60s ol lhc opular4 particùlùìytrùhin rbenìainlyp rhc d p ù@ras sndwich aÍchiÈùÈs
! hisrrersr or on rbèssAp 5.Òrc(75rq smc nairy., rnda l liniìies,30ror
14.8.5 Homologous superfamily
upeff.nily if rhererisfy ore ofrbefonoritrg
oflagfsddurcqun!ìentlosmdlq
14.8.6 Sequence families
r5tó (sirh ú lcN 60ti Òr rhr r34ù domiii eqdv.len b úe shrlq),
ùdrciBs
14.8.7 The CATH classification procedure
14.1 js rof, cbù doic b),úknlnrinsrhedigtr iúr (úrry ssaP) bcrweqrhcchiìr.nd rheiepÉ*írjlc orjr (s4Eie)
ó A$j,$ r chs ro rhedomaiis(90t4o'omfid
3. a$tn dcE etor, ùùù,nt Noreúf ù poùr 0) chainsc ,$uped. md in poii' (5)don,ìtrsre gbup.d.
14.9 Classification Based on Sticks
(iick\), n i5rlEu 'hÍ m{y 'o lire *sor\ scodry sddùrc ({idr o'ry oie 'ypcor SSÈii dch l!ycr). ar
[email protected]úl a ,{he. {iÌh hercesabo\r iDd bdow coNi'uÈs úru r{yùs (BAr) xndir ,ncr ed s 2-s-3. (as rhctuis tu ful dir;icrìor ìer Nmbtr or hèli.dsis silcn iìN.) wiihdilÌse .orbjrdio$orr!y.s,p (bmh), n\issyren cù csptor .lro
od i{x visui;arion. a fóng EpEfnúnon tr
Mdre
ìibnry or po$ibk dmceme
14.10 Exercises
desìÉid G tu*nù d37 údcL, ddmined by NMR).úd uq ror erimple,RNmolb vismìize'he sdctuE. sft,defii.dbyscoPÙdbycr'îg, CaDyougiveanqpbîíion of NhyràedominsrE è6ned sodifeEDù?
ngunr4,r
ni{t4qbN60
i/ ùe. GÌPP c rBnc ud b qrer a nnu &lndir
o{}rq
Í ú rhr cÉii n r ùi Fr,iÈr!r{d ( B){ s[ior ú/, ftrc
iE ndr Èrh rcic(rcc 6 e
Findwltrrrhrdd,rains aroor.h|in a. trnngssaP lyo! vrrr rinda búoi for rhf jn dìecrLrH vindÒw)How
(d) conplE Ihedonaindefid'iotrs{irh úolc siwr ù c^rH, úd conmenr
14.11 Bibliographic notes
iscN*d ir lxyrÒ!(r1)99b), andndhoc {e dd.nb.n ii ron$ d r ( r993), swùdeÌts ( 1995).Hoìm dd sxtrd$ ( 199rr). Hùr'n
atrdsùdù(1ee3a). kramdrr (ree5) !tursiddjquiaîd Brnon(tee5).Horm aÍd slnds(1993r)ds.nb$!merhodwh
PRO'IEIN5TRUCTU]IECLASSIFICr'jION scoP n desdibÈdio Mwjn * aì.0995),c{rTr in onso cr,l ( r99r), o,enso eral. (1ee7)?ld Peùl d d. (2000).aid FssF/DaliDDin Holmúd sùdù oseó). Hoìn ind sandq0994, Horn úd ssndù( r993h)andDieÌm iDdHoìm(200t). Thedifffil crusìncdionsc conped in Illdtùy 4d Jor6 ( 1999)rnd Dier mùn md Holn (200r). EloÀsonand somrzlmù 0999).onpms S(OP atrd bcdji ràytù (2002).
Part III
SEQUENCE-STRUCTURE ANALYSIS
15
Structure Prediction: Threading
rùdyon. wouLJlikc rÒób1rininroú{úon on e shdue qPeindany Gy x ny ..r,far rocdphyÒrNMR)ù cxpensive andrinÈcoosùùitrs,rhùe hd h*r nu.b woÌ* in ryi'ìglodereìopeoonm.úÒJ.fofpfu rcrìadei b se homolo!*4rercs w dopedbtheftws'ìÙela,inaaÒnlrn.c h *qudce.deriveddabbrs{ sucbN Inr* cixÈdvith pntid (ù doúdr) fhiiics.
drùbrsof sùucrms.rfIsigriici
nùnìbùof trdtrfl[y Ò..uriis p@iî lird. úe úat the Dewpmreinba a rold which ÉaledbyoùeÙiigùcsqKrcaeaùra tu nc nigh. adoptbÀ*donrhcnruduic orrhe
,r. squme b bc@icÈd bu'*n am m.ù br{d on sfl.ùro (Éùs rhin a squene/squence,rjgmeùù is poreodsny nùe bioloEclnymqoùetuì .nd mon {cuàb *hùr .odpùod wirhrì. Lquivatar
ommmly cilìed ,rrari4.
sone of draÌ us úe scondry 6hdúe ol rbenev onot *.ondary rrucrurccremnb iron! rhe
15.1 Protein Secondary Structure Prediction
,l}ÙedrloloImúhod\rofFcond4
ds:bclhcsqìndî}rnrfuttypelo$ br!. or roodruF (or?/4) nìe i,,pù
15.1.1 Artificial neural networks
^d'hciaìneunlndForts(.s-,
rc rto
spe.i.llon orANNerLcdtudt nL?d crrkdkrMt
orÈurhd jn I oùhrrù
I hr4 ,.r/r
.onieded) a ,?i/,'
fe tr)
cotrered Gs opposd ro
n AsmiR
Let $w_{ij}$ be the weight between nodes $i$ and $j$, and $a_i$ the activation of node $i$. For node $j$ the value $S_j = \sum_{i=0}^{n} w_{ij} a_i$ is calculated ($i = 0$ corresponds to an extra node used as a threshold). The activation level of node $j$ is then calculated as a soft threshold function of $S_j$, and a usual function is $f(S_j) = 1/(1 + e^{-S_j})$.
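The node computation just described (a weighted sum followed by a soft threshold) is small enough to show directly. The sketch below uses the logistic function 1/(1 + e^(-S)) as the soft threshold and treats the threshold term as a separate `bias` argument; the function name and argument layout are choices made for this example.

```python
import math


def activation(weights, inputs, bias=0.0):
    """Activation of one node: logistic function of the weighted input sum."""
    s = bias + sum(w * a for w, a in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))
```

A layer is then just this function applied once per node, with each node holding its own weight vector.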
d[ir reím
nfances. i e. a sr ol ièsidu* Gn.odcd by v-ros) rlons vi{h {hln rored hb€ls (hercúdì lrbd beiiìg d, , of rJ.
-o -o +o +o
-o
id ùd qr. dssE,!nunú d{dbiq
Duing drefuìtriJ,ephG of ùè aNN. úc wcighb*sùred
virh erh of rhelim
'(pmvidiic iipùr ro rhcoùrpú rryk) 'rd b îdjuf 'heweish'sof,he rjns n'b 'r c enoBb au iayen úd rdjù$ i\e {èighb úrir a[ Neighbin rùeaxN baveb*tr rdjùsrd arr inreùs ir ù. ùairirc sc!aÉ
è weishbb@ms lqy we djured b rhe Ìharn independen' or ù\ehiditre $r (do
15.1.2 The PHD program
ifwoù i;ÒÌj HrjDdb4) dLydoll.d by Ro$ 3id srdr o 99, lisbrd of di. sìigle scqucncc,rimjl) (xìism$r) irfùfmúioi ir trlrd !5 ùpd. I mùftiplearismfd is.ùrshdcd o
reN!*b.dKE.'fol*mple,Ú]
siidow fom,r fdidúc r or !ù'!rh u (= tt. îs shovn ii rhe figrr rhcr
ù = 5
15.1.3 Accuracy in secondary structure prediction
(rrrsl)osrr, rLvcn{ ovcF0redidrons
or
ld*PEdidioD\ (raE ncsn\rt
sot,
prc\d!ú
ù Ro( a^d slDdq(r99r)
qùq,!r dìpùjrù
h ir hìv$€r
\hs!n ùr
fta!c&!y
ry ssome prcjers. r undr ot {rudu(/ 'idruÀ (liru{'rirgmdl'oJs)[r!. h!ù' d!!.lopednù.ú
Nú rndoerùù (cdrdr/r-lD Fdhodn,ùfou8hrho$ùr (Drìdqpì! menÈd rnc(rDsÉi!l)ifufúriù\$'wuuìNriru6(tru!'/ù&tll/{ddl'oJn.o'lr\!thr rionror dÉ qnoìemodd (,rd.//r,! mrhodt. ltu nnse Ìbu5rorru ! hndseron
15.3 Methods Based on Sequence Alignment
mdqrDdlcidcu herix.pdfdo !Òr (Nar fon soùenr) $r|!!N fca (s^s^ or sas). as erb o le exnìPtc, wc úr
onty 5n c x l) dhriú
15.3.1 The 3D-1D matching method
Ir rhhùdhorrofBNi. crar ( rc$, rsst). t3dìtreÈùfars (6bunlt/ I scondrry shctur úh) vù! usd Ld ÌhepLobrbitrJ, ornidiis!o lmrnorcjd(d) in in rirfrri.d(r)hdqÈc d Í P.r k) .i}*hèrc & Pi rhey.henrkùk
,r=*(?) (sìrdinsúneddx rrriùievr.ft cbÒuidlnsto'sasa dîr$îrÒL róprulcin\ !r u!s) búwr.i diÒsr hund {!rcs {erc adjùsed$ úid 'hu {m or tKoE5 $rf r p|n of dÈ nìiirg $r ('bc 'r .rr, ,he iocx (rnEqufiú ( r5.ì r.'fte Esùìri'rs ùt. orlrrp.6ni6 |.r cachposrioiin drol P@/r(,b.iD-È rìmiLf b eurú LF.N orrùriPlrquocPrìhrs(chrpd1j. I
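A 3D-1D score table of the kind used here can be illustrated as a log-odds calculation: the score for amino acid a in environment class e is taken as ln(P(a|e)/P(a)), estimated from counts in a training set. The sketch below assumes that form and a plain list of (amino acid, environment) observations; the environment labels in the usage example are placeholders, not the classes of the original method.

```python
import math
from collections import Counter


def scores_3d1d(observations):
    """Log-odds scores ln(P(a|e) / P(a)) from (amino_acid, environment) pairs."""
    pair_counts = Counter(observations)
    env_counts = Counter()
    aa_counts = Counter()
    for (aa, env), n in pair_counts.items():
        env_counts[env] += n
        aa_counts[aa] += n
    total = sum(aa_counts.values())
    return {
        (aa, env): math.log((n / env_counts[env]) / (aa_counts[aa] / total))
        for (aa, env), n in pair_counts.items()
    }
```

For instance, with leucine seen three times buried and once exposed, and lysine the reverse, leucine in a buried position scores ln(1.5) and lysine in a buried position scores below zero. Pairs never observed are absent from the table, so a real implementation would need pseudocounts.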
n)oelobiisi:itfmrghhin5htù9o
15.3.2 The FUGUE method
-nìisapploelìuseaEefdkìelyol Fmrd
spdlìc subsúùionrabre(ovdiielor d aL ree2ishie'xr.200r) rhe redbyi'shydrcgenboiditrgpinem.i6sc íNd é n!ii.!hù
lNon {st
siù.cbaù
y doùoN lù ù Go gtuup) Ùd rcccprorsCd N H ,!rcu0). solvcî' crpo{É. ho{ britr3 rhesehÀ ,esuÌredm a sr ol ó4
15.4 Methods Using 3D Interactions
sinpldl;trdioísfur.úkj]ìolprded i1ìgotrly '*o ftsidr.s (pai$ne in'flction,
ùcadi']rcdigíIneísftgeiertdpnùbeú|uo^, r'dsrde ctar$a€ flexrbre. $rturcsrdB. sy iÍ alliiine(ì1)dìdim 0anlrc(D).
ber{.enrcsidus (ar Mì aoms in rhcÈir mu{ li! \irhi! 5A) aod 'bci dord
rhsèr d $rudùet. AJÌ{ rnrbh io
lÎl
siduesioi crchof ùe .!lo (20 : 20)lr)s br! rheprorh rrudùfr dilblok (or I fcdrced
. ú or rniio x.id sonrs mÍries
a:= f,NciccrrieùEúor(4 a) 6ùs R .dtdtde
ùì! w't
.,J tF aliPììùr:
r o re & hr s i d ù ea r j r n € ! ; l + r d o tr (rj. ir) ù'sads ùù
lntit sat^!ù1Ìt4M4t n'ad
:= 5+a(.r.b) €nd
15.4.1 Potentials of mean force
ileìihoolorapitrore\idu$(ùr g ùrrypd) ipsù.h dmlj!{ bfùùc nuik.Îú usds.Lopldbysjppì(19q0)kdeln he |ollpcr'iiL lhrn ) h Í!di!t.
d,! urEr
sipplùrlbó^iq!$ù6I]l(ùnr olidstrfuNidllì}'dmp|rcbjcrc\jiuc\.Ju. iiie (a) ùd vi rnc (v). bdh of \i, ! 54,Ún\pol]ri|g'oìelÌ.qÙeyor] i(rj.rl)
lr slrouLd Lìendùr Lhr 'hcrclirtu
. \Llình hre a f€qùeì.r' dn'nhdim /"'itr l (ok rì! di{un!! ircruln
Tlì\. ri.
^*=
,"(-4*)
qùorc/(r) i!'berreqren.yorrr0ui6
orni )ft$! Nurcr bftikpÌriú apnaiùD (o rlihù disriirgfùl ririq
{rudù!
n (ìcuìr'iig
1rir6) uld i Nd'ufund
e
ùc tulolirls is flc (r) Ùd th!
MEINOD5 U'NC ]D INTEE{'-TIONS m'hcfructorcdúbùkis/(rr,rhe /d(J)
//(J)!, *--rlG)+affu!'!u. lhqc /(5) is Ìhe farùercy lror dúrame inrí
r) fù îr amiio rid pain ad 4
15.4
nd 3id 'h. sFcific più (1 ,) ob{rudon, al8e(big,r. rhcnùc lsr bm oi rhesùn
prùishst. sivins/,('
= s,c), wtuìe.
bam'ionq (,ì = 0), rh /ó(.) (nrc fdor d rrsoplaysr tut h 6e rrnsforiì ni PdMF)
= sd(.).
chrù ron lypcf jlcrùdùE (cd{d. cr{r. N-o. N,o. N N ùd o o. {orh4 pliings rch s cd{p {ùe,r$vgsered ) Thc* s6 ofdnkills providei ferM ro rhenq
\wó
{
;
i i,rs
,ù .14 h1
cd se,mrioo rh pnncipalidvùrú$ or ùc N aid o dkÌaas jr rù
15.5
.o. '. 1l
rc (of 'he qu@@ of dìe f0ctore nstD jj
me r novel (sry ro'i I r.cd s6omc) rhe e '|yi']d'jj'yùgfruNe.Hol€t(dÙnrgùl
hbrùslrtu trlining set. Mof wo*s
g
15.5.1
caFÀsqrrilliviL
qninic !d
o
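The idea behind a potential of mean force can be illustrated with the inverse Boltzmann relation: for a residue pair type ab, the energy at distance bin s is taken as -kT ln(f_ab(s)/f(s)), where f_ab is the observed frequency for that pair type and f is the reference frequency over all pairs. The sketch below assumes that basic form only, with no sparse-data correction, and its names and the default kT = 1 are illustrative.

```python
import math


def mean_force_energy(pair_counts, all_counts, kT=1.0):
    """Energy per distance bin from counts for one pair type and for all pairs.

    Bins with a zero count are returned as None, since the log is undefined.
    """
    n_pair = sum(pair_counts)
    n_all = sum(all_counts)
    energies = []
    for c_ab, c in zip(pair_counts, all_counts):
        if c_ab == 0 or c == 0:
            energies.append(None)
        else:
            energies.append(-kT * math.log((c_ab / n_pair) / (c / n_all)))
    return energies
```

Bins where the pair is seen more often than the reference get negative (favourable) energies; rarer bins get positive ones.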
15.4.2 Towards modelling methods
nn dr! vrrL. rqid)
h ir rh*e lp$ s Gqrjr ùú rì ùo iuohqofrnndds
rr
15.5 Alignment Methods
ArNft li|e mrir)
i.d dic n.0'lùd drMnic
15.5.1 Frozen approximation
b$.11rfoì tri\1!nroi r.rv ìotr
D i ù r c ù l o - 5 1< j - ? l < ? - e l < n r r K l
q=MT E RV'L ' i otùÈ ladrg d x 6., = 1lrdj = N 4 (F)
r (
\vih 'hD alucmc 4 (of ririo*n
!dÈsqequL'or)
shúùÉ) aloig orc rimetrioi and rhe sqùen.e
posi'ion/itr,.lTjrep:jJliseirtrodi ubr. r.I Leh lofhrcd) îmiro rcid rrpè pRn úd c!(h bbÈ h6 a0 $rqr tara dishîce
hr,!h$' sconrr pú lor ìoqd eEsy rbf PoMF meÀufen n exhd.d by ordinùy rc ùÒ?Ìnxprrmdion arlonrhm (FAA) reMs ilrlrvdrb ùsd by skolritrt dd rbla ir0oD. Th
nedioi aidn\i {as
nK ncÌlrior. 5J Íuf'nù tua1iÒtr nrg rhcture de cìosìy ftrard (bomologout,
n.lsi'iplè*+oGjignmm'ÚasimPlr
\1: f,
.E'
15.5.2 Double dynamic programming
,,i;h,ó"d
ù,h" '.,.. ",.r"*Í
A'.{^"*...Éi.*lh"h""i,
,ituj;clrr,\,...nb.Ni!."d\L',1ìdìe! ;:
"dL
r"d" -h."--"
, . , i a i . ir *
u^"i
' <,.1r
i.*
!r orrhopùpaú1(/r'i,
P.MD n-"-i
> i.! > r rcdh.d!udld
( rhe
" " hqPded " n - i 'rob! i ' hc('!ourire ore thr\ÒÙldbe [Nrì {'r ùe rr,.j !$ùìPrio0 pnr.iN krìoLln)rhÈùirciÍ hohoPud(ifr|tsorc rúùÙiofnd dl nru{ùd ofhdh (úL') tulridic),hf ùt issisnms\ \ rl ìtriirì a hish(gool) sÙE !hih spÙrious (|îr) I lov ssisnndL siìì or\ sd ù 0 , ; r h e l ù c hL - J ' . , d r g m i t r n ( " ' d d c f i b e d n c b l o L . e r ' u d i r o i f F A A d
dùeÙrerbighìe€|Jigfnù'hhjls'heiìì!ù.eds|n,núelo!ls.'eìmr.i.5
15.6
rfc andobs^3d bu1ìir (ii hdh prc,cmnr howeve(n' $rc!dù!, .he ìo(ì rNcur
quiv enrfor$nÈ fuiúionatFason lpeòxps
15.6 Multiple Sequence/Structure Threading
t'hod\rtitdsmbcdfo|noìlipkaqubc!
For mdhods ùsine 3D (Èi.vnel rcn
(5umrfl)lift or sP sore). rn rhrcldù!, hoqrE. a
or.pù(vnd nly norbe
15.6.1 Simple multiple sequence threading
Ls! \dr drnblrs 1ìrlo1(r9trbrdù ndl'oJ (MsTrbisedor rhe!r3re D!
15.7 Combined Sequence/Threading Methods
Nr rir. {,1' ú!È i,sîi,ui,)
A\ n$
on$ (rerehì krù!d.L idúLnd
15.8 Assessment of Threading Methods
ro fec
r$sqlÈcùdy by Mccrdìùcr rl. (200r).
15.8.1 Fold recognition
Llithlbesmcfo]d'llgeù€dlii.dliesÌn
hemmirù.r by rbcrunb€. of on-' mfth6 (ru! È\nd!t p ored sgrimr ùe ùù,nbùoÍmis* (fitr Il*nnrn orfumriois onhes quftli1j(, jln ó ssriÌivir], 4d srediviry (k cb!p€.3). oùìctuoùFi$ns irc.ird ii ùe bibìiographi!ùfcs
15.8.2 Alignment accuracy
Gù.h A ndr').h
car&r*
Dhcc
e .onú.! +,1 (or +ù \htnj ù o hdics or i2 (o: +:1)sbú5 ù tj.rnnds 1orliiI di5pLrd k, rhcFsi'ior orùc ldjae auù$rnisbrbe orÀahdix(ùrErd)
frcmrheed-qcoradomir
15.8.3 CASP and CAFASP
/t f, k r," rrri,
llro ebNrk).aì
ù(n!hòhlcn rhr
ri'r lc^sP). \ $ h.!m D
nLbùi'ù€irprdiúiùm
\d { (!
.diraFAs
ri! rù\ù\ (!dr!d 0ù&\cnùn
15.9 Bibliographic notes
reer) oLhùDdhod\n'ruqrlLtrt rr!dút $!.bcd ù aú^ ud srjlorslj (ressrmd
dcslnbeditrBovi.dr (r990.1991) d REeandEisàìbùe orBrunddr 0997).àndrhcrndhod d d lFlJclrr) il oùng!ú dd.(r992)iÍdshidir.e00r) poÈDrùì orrìlri rù( j\ dcscnb.d iDsippt( r 990) ru|'jgmd'helMMdg|isinkJpìNdù1, FLrlNîqcr i'1.(1995)s,l Skinrk r|d Khúa (200r),andDDPiù r4b ed orcrlo(tqe) lnd Jones dat. (1992h). u* of mur'ielc e,ìùd!$ i\ m,faylor hd orci30( leeT),dmhin.d{q!crcdrh€adiis in Joies(lr99bl, hd Qmn{isr or ptug.Jm\fof rord€.osniriodrc nùnd in sìù dr 1200 r) andRie ùìdEi$hú! ( reeT).
oprdhy codTtkd .1.( 1992) hú vnhoù rh. .kÍd (ihrn (2001)iskoLrt.k d d (200t)
deErop.dby Ffid;ds md woìyìls a 9391;Frcdnchs.r Ít. oser ) ji yhich rhe
(avH) TtÀ is ! miùixbsedona H
ctrtutrd(aNN! bortì r3D lD md6d (Lù d ar.2002i) aìd iNùfrdúù8 rD irrormrioi (Lin d r. 2drb). nÈ ANN
Appendix A
Basics in Mathematics, Probability and Algorithms
A.1 Mathematical Formulae and Notation
n*.ru,i, = 1,r^* ,r" *"*, Ilî=,r, ms
rhcPrcdrtlÙ1 úe tr vù (/t r)r NùLerlft0kdebrcdò l
l" ì,,r ur--r.*m.*'. *'*J * -1:
"""''u=,,.,,='' -
/6\
-:L
L , . h m -re)
tiE Id|. Ìk
ti|c Ìk tùrÈ
kk. tart! tqkt
îhh. tnE rw
uR tE Jak.
A.2 Boolean Algebra
Boolean algebra makes use of logical variables, which may take the values true or false. The basic logical operators are and, or and not.
Let P = false, Q = true, R = true. Then the expression (P and Q) or (Q and R) = (false and true) or (true and true) = true.
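A Boolean expression of this kind can be evaluated directly in code. The sketch below assumes the expression (P and Q) or (Q and R) and the values P = false, Q = true, R = true, which are a best reading of the damaged text; the function name is illustrative.

```python
from itertools import product


def expression(p: bool, q: bool, r: bool) -> bool:
    """Evaluate (P and Q) or (Q and R)."""
    return (p and q) or (q and r)


# The single assignment assumed above.
result = expression(False, True, True)
print(result)

# Full truth table over all eight assignments of (P, Q, R).
truth_table = {values: expression(*values)
               for values in product([False, True], repeat=3)}
for values, outcome in truth_table.items():
    print(values, outcome)
```

The expression is true exactly when Q is true and at least one of P and R is true, which the truth table confirms.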
únbsor
y rype:nmbd, ÈreB. Gùb)seqùei.es, (strbxhduc\ s,FFurd ùh, 'ype. rlìe elerc$ u ùìdù.d bl 0.
E t u n r lre2; , 7 , 1 , n , f a!, t.,. h t . t . r . . r ! rr. ( n rr.r . ( r r . 3 r l
Ìdu
t/
is,u itr s no.rn clcnsno
v i5 thc /ir.i.,.r. a E d d i yu v + v
thor. ot rI c u
$U = \{C, K, L, R, S, Y\}$, $V = \{A, C, R, T, W\}$.
$U \cup V = \{A, C, K, L, R, S, T, W, Y\}$.
$U \cup (V - \{A, R, T\}) = \{C, K, L, R, S, W, Y\}$.
$\{C, D, E, H, K, R, Y\}^{c} = \{A, F, G, I, L, M, N, P, Q, S, T, V, W\}$.
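The set operations in this example map directly onto built-in set types. The sketch below assumes U = {C, K, L, R, S, Y} and V = {A, C, R, T, W}, and that the complement is taken with respect to the 20 standard amino acid letters; these values are a best reading of the damaged example.

```python
# Universe: one-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

U = set("CKLRSY")   # assumed reading of the example set U
V = set("ACRTW")    # assumed reading of the example set V

union = U | V                                # elements in U or V
intersection = U & V                         # elements in both U and V
union_after_difference = U | (V - set("ART"))
complement = AMINO_ACIDS - set("CDEHKRY")    # everything not in the given set

print(sorted(union))
print(sorted(intersection))
print(sorted(union_after_difference))
print(sorted(complement))
```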
A.4 Probability
A.4.1 Permutation and combination
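Ordered and replaced. 16 samples: AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT.
Ordered and not replaced. 12 samples: AC, AG, AT, CA, CG, CT, GA, GC, GT, TA, TC, TG.
Unordered and replaced. 10 samples: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT.
Unordered and not replaced. 6 samples: AC, AG, AT, CG, CT, GT.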
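The four ways of drawing two letters from the alphabet {A, C, G, T} can be enumerated with the standard library. This is a minimal sketch; the variable names are illustrative, and the sample size of 2 is assumed from the listed samples.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

ALPHABET = "ACGT"
K = 2  # number of letters drawn


def as_strings(tuples):
    """Join each tuple of letters into a string such as 'AC'."""
    return ["".join(t) for t in tuples]


ordered_replaced = as_strings(product(ALPHABET, repeat=K))
ordered_not_replaced = as_strings(permutations(ALPHABET, K))
unordered_replaced = as_strings(combinations_with_replacement(ALPHABET, K))
unordered_not_replaced = as_strings(combinations(ALPHABET, K))

for name, samples in [("ordered, replaced", ordered_replaced),
                      ("ordered, not replaced", ordered_not_replaced),
                      ("unordered, replaced", unordered_replaced),
                      ("unordered, not replaced", unordered_not_replaced)]:
    print(name, len(samples), samples)
```

The counts are 4^2 = 16, 4 * 3 = 12, C(5, 2) = 10 and C(4, 2) = 6, respectively.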
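A.4.2 Probability distributions
The probability distribution (or simply the distribution) of a variable $X$ is the list of values $x_1, x_2, \ldots, x_n$ that $X$ can take, together with $p_i = P(X = x_i)$, the probability that $X$ has the value $x_i$. Note that $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$.
Let $X$ be the result of throwing a (true) die. Then the value space is $\{1, 2, 3, 4, 5, 6\}$, and the probability distribution is $\{1/6, 1/6, 1/6, 1/6, 1/6, 1/6\}$.
The expectation of $X$ is the weighted (by the probability) average:
$$E = \sum_{i=1}^{n} p_i x_i.$$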
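As a worked instance of the weighted average, the sketch below computes the expectation of a fair six-sided die, assuming the formula E = sum of p_i * x_i over all values. Exact fractions are used so that the check on the probabilities is not affected by rounding; the function name is illustrative.

```python
from fractions import Fraction


def expectation(values, probabilities):
    """Return the probability-weighted average of the values."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have equal length")
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in zip(values, probabilities))


die_values = [1, 2, 3, 4, 5, 6]
die_probabilities = [Fraction(1, 6)] * 6

expected = expectation(die_values, die_probabilities)
print(expected, float(expected))
```

For the fair die the result is 7/2, that is 3.5, a value the die itself can never show.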
A.5 Tables, Vectors and Matrices
m,r hrt dli.ùùrn one (. Lad4 ). u rcdo.ae sp!! rld br$. úer. /i
A.6 Algorithmic Language
6fs|e.fyii3ùlrds'.n']=j\Àù
hù A |.\\lti1úuÌ
ìt alxÒnr? 2
îm {ix serúeHru,ri=r&.
-ùm\,rsrúe \iEt:=,
f,.
\ ù m : =0 ; r r i é ( 2 . 73 . l t d 0 $ n = $ m + { c n d i r I I v i h . nv : = y ! { r l . r d
A.7 Complexity
dda size(i'ìporsi*) Fùrridpricny\
r[c. (/r r)r. î.
'iiù ùfeas$ rrparlr {irh ù.xrd v! sy rhd.henmecomphxiry
;
Appendix B
Introduction to Molecular Biology
B.1 The Cell and the Molecules of Life: DNA-RNA Proteins
ar ! Geììsofliisrìq orgìÌisnn lrhin a u,r4A sfùidd
(Led DN^ rdeoryribqùdo! ú ) lr hsNror. ( i Frk,ryoÈt of !s?aÌ trtr!úrrú$)
h\ úecybplòù rhc
xid rlr ene DNA , DLc!ùL\
fl &-{
\a
DNA is built up of four different nucleotides (denoted by A, C, G and T). The structure of DNA is a double helix (which Watson and Crick discovered in 1953). The
oNi\linsd qc Dunn.us! tA ora) rndoÌe Dîmidiic bi$ (c or r) paied rccord c pria vnh c rhi\ me!6 rhd irone
ror.rùryrc,
r 5.qù.ica acrTccr
fsrùsn(orr ceìì a'nuhi..llùlfofgrniii)
[Figure, panels (a) and (b): the amino acids along the polypeptide chain.]
onsotifrc]idlgsisc.lledlltz.Plid.,,,/, aldlh.chiinarodg'hsa6msNcd{
.ùbÒ\ (nc) 'cm rh ("hlE 'hec I úr prkc of rhFine. $ úc brs ìn RNA ú. dsnoreda, c. c .nd u. wììjtc DNA
(hq$ensùRlra), úi.h camcsgeieri
B.2 Chromosomes and Genes
TheÙdÙsd'ùes|irydictrr5@ €rns(mi.ty hnbn* aid roeeúùùc conchrcmrii) Elrry soiìrj. {hddy)cell ùsùx[y ùduda.*o.opic lld is Òarrcd (exdùdir! 'hc sex x. y .bDmosoDet. rh. nùmbq or ch oI sch .hfonorm
iinrmrin
j.rqnirudi'r! rrdrj"
(ruì RNA.qrlùrif rrr !ùLL.JI gNproù ú). h r.N (i priùirive s'Niryoric or,lrntrm).
beencrlsd.lÚkDNA!n!cnnu]dùfufrhd.i\ ùp ol s.xller piefts (exoN inc+|gsd bt $ (lllqiftEqiiD!
ÈsiEs (ùLrùn
ir! ofr ofrhegcicÉ tr Drlnbcd(c s (opidk,nRNA) r!nrhci'rurfo{)li!.11 ouraìd rbeEnrùù3 {Fru()nRNA n hDikd ro r pturir GÈ fiem r.5)
{iys (ndudifs direEí
$bid\
Òr thc orB
in , hxÙr fs;a)
-!iriù,! sero i
B.3 The Central Dogma of Molecular Biology
,.'sdrg* RNA)comaininc'herme i0lorú hm rh piots is.ikd 1a,3.,?/, oehnery (ohsomt Nhilhrer A irpùr Nhirhn goirgro be (pú ot l pdeir sÈ Fietre 8.5 tu iù' a porypeorre Dùrbúof rifcred p.odods hm.eint.
B.4 The Genetic Code
o acidro be in 'heprobii rhc fdtislnn' .'hc'hrc! d'sofrcs vùrd hc (r) rs,sd.ì,.. . (2) r!-!, .r4.
. aod G) sc
(
FEIEB!
EIMPIè GÀI&Ù d{ DNA
^Jlj
Fdd, oF!D
There are 64 (= 4^3) different triplets, more than the number of amino acids (the code is said to be redundant). The mapping from triplets to
(hùnrn) gcncn 'bo!n ù Fuúo B 1
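The count of 64 triplets follows from choosing one of four bases at each of three positions. A minimal sketch that enumerates them is given below; it assumes the RNA alphabet A, C, G, U (the DNA alphabet A, C, G, T gives the same count), and the variable names are illustrative.

```python
from itertools import product

BASES = "ACGU"      # RNA bases; an assumption, DNA would use T instead of U
CODON_LENGTH = 3

# Every ordered choice of three bases, with repetition allowed.
codons = ["".join(triplet) for triplet in product(BASES, repeat=CODON_LENGTH)]

print(len(codons))                  # number of distinct triplets
print(codons[:4], "...", codons[-1])
```

Because 64 triplets are available for a much smaller set of amino acids and stop signals, several triplets must map to the same amino acid, which is what redundancy means here.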
B.5 Protein Function
u! NLd {rr 'li|!' wr.L (h}dropìùìic an o a.idi). oùc6m rìo{ hrppj'qhetr trorii difú coÌrld Nfh rd{(hydrcphob,! lmuo a!i'(). HldùphÒbiraùi',o i.
í lompq,otr! ((nFrdutrt
or llr Gìl
s@ft
---fiìd
PksÙTtIcFU
l.!|lo|BA4c
xerrAs$fù
vi
arr kP
cry
rhis i imFdbr
b;rb!
-
B.5.1 The gene ontology
údioi
ru! qos-
}|'J!tr"3l,.hadddfI
B.6 Protein Structure
[rd4qb.ú!bfukplrc!.ùc9+d
ron'nÒr). i' NaN dìal NÍh
nrd\ 'h
snrtue Òr'G rur $m4 FoÈh' b! ùlirs pórn dot.ù.
dÈ Èfabúhld (F!t (\{ucnfl
ùb hy] nÎn lìùÈns'ú,di.dad
ìúir iùren.rlois
bd{ùr
p
Thc/3fNd\l€romedbylì],drceeùb iD3, ed iq sd jud! qDnfúc I , tfd. bÈpùdÈr (gÒùg ù rhÈ$Fc dn4riotr
rro îrj&rd
fdÀ
n ! $@' mrshr
I :
!rudure or a snrn prcbio (instin). fte pdr of I imtur mr beinchori
t.\
:". i ,. ^.
i\
-..1\
,'ltrt.;' i-; .\ t I i"'l ,-'.tQF* i
'--/
,5i"''
\
hllPr !,r rcs of/ù/mdq!
Prnmry This
"
óhù ! rù
simrr! Lhcùùf Glqúùtù
|qriJù,sùcbúnbesiroby(r., oùrtmrry.
or mino úds lLons ùè porypepùde
cNiúaL !ùf,ùrice or ùe omroù (Òrepory J.ooidiir$incrlhrq!i,tunotùre
f rhcPro'Èi'ì.oùsist or
uÈ wlúchmr bc gtrcnhYd,o ( r. r. ,
Bindi.s!Èi dic!úlystona,tur(Lh.sùbúrceuoD{hi(hrholn/ynoldt @!o''icÙ.ùis'in]cdbylstdfc\ùu$' nsuquult!FoL.rùÙph.thco4neL-ypln
rdss.ies'fd!E{.'1úaklbúoi
ll;n{r!.
tu{rns (xodsmnúnhe súqed ido/dir.r. vhicììcoùnLtì' pftrns (or
rEPfesùrld
6 I ke) mon tiliuly kr g
bfih ù'h! (tuDgodio roirs.rìlerqrenes of r€n.qnwld qr 3eF\),oi rorÈ{ npi..ù\iig gciÈ
gÙ€sG.g I hGomi
B.8 Insulin Example
(i ìo) c|mrro id{ìized oriimprijirdextutrples
nÌci
sqrcrq. sltr
posible.i red
(oLdìonr poìyDeÍids) ThedÙa
r joNr ú di! rÙcÈ$ Go úr Ì Én rt \F!ìrì.Jly tor sll !úir!. Frci
(tr,cNLir €.epro').rhrcush\ùtb dnnesrle
^!n\ dùlt Thei.su!ùmdrùebNo aDddùdphidecmslii$(hd\cs!y{c m {orcdft,Ò\irisPrled $pfucpoìypeprid.chah(!r!èJA,idÙ.ìxii,1 xs r ùn or n r ndcar$ (itÀÌDrt) s
ql prpmi'sùìil. pPr)h!! rr0 rcliles ir ÈrofrPÌ n mbdrúobir beexporedfmmrh!ldrnù\ î roi:g. rùidc k, $úr rchò9 Thi i'Nir
rorLcdL (!
À Lrt,sD) Î,nqÍriukd
{fnh sf j'Nlitr.i'ro
rcure.nùung
Re Fi4FÌ.e
rùsù4ior des.tuftLh
or
rodermiDed.n u\ hund dBri6ulin wù] r.e..
)j,
1. , r tmed'.
'i .....^,1"' ,'",,"
? \:r
r\ nrli
B.9 Bibliographic notes
More about these topics can be learned in Branden and Tooze (1999) and Lesk (2001).
, _. úr.,s î! rPfl b,
References
úrwd14
t 1@2 Mdda.
bìdap atb
%fuhúioîdrcÈlqb|ogidndGji'drú
cat rercè ù ct@rúLd D|LBn,z {rd hsruoîlÌd
arMùt st c8rcùRud Liptu D rs39weùhstue
rcùshcMllMd&ir?4J'I?-4,
GùdrD, m ssr6.
eL,ú byI rE,
^ ùd Èr-" d.úo
rd ,br ú,
ùtd"ù"dihsù
. bri Nrljrur
&!ús s{ s+r \
{in4LrDjry,P|1140]Jtr,$/'
s ordr !úrqr
.lu
.rz rr ,ù/ 6. sr ro
r r 3, ù0 b ù{r rj sùaoi ri!. uùiituE!
oiniú!úrfDB
cùret
iì.;,ùi.rf
tùtl 4, Eiútd
ù4\:
4rùrt@
G.8e3l
d,.1..!r&rrlr
RqìLù! r
ù
\<ìd
tu .on 4 hkÌzùt
XùÙn ùl Fs ,( ú/a/,/,4
sst!ù'lo
4'i
,dtrlt!]i:'',/41:f'55'fAaenL!Prc$
,iiit,ro,!r
rlèj
r. ciiJr r dr Rrd
s reri î'D
s(r0). ú5...ri?.
.ttr\ 4rl{h4ù"-
! iso{L
!\
Úbrlo,ù/ir !l'
r07r1
msbl J. Di ne |! D.lùDor N6' s
\6dr Púrir
! Lic
I ó'F|(J'A4cPI
nD)!
+i.'P4
$ùhdurc5
D Prubi s. ,r'rtqzdd
Pdiùlao M r4r hrtuic
tqN.K6.
h rh.
eìjor
at ù.rú
14, nG5,
on
ht cùI
ùt Lttthat !ri414 ttt Mù!ùù BbL9 @ L$gWT,sldFiryl'EJctfi0dùt$4
nfbdù
Ro$Ìar!
Nrc 4d&q
v ú6r sr@drÌLry
orPllscPd&
P re76 rixrhntrg
úros itrùml rsrcoes .rrnr
!(r), 14.
ùoDig
tuhd
i
Br
Nfhd
ù
!q
!ùq!!
s$d, (, I r{ nì qq RH!D Pr{.itri! .l r,r 4 r, r(nr. spnos
ù,h/j'|44[,tJ(edc!]so!ùdiì[nÈid
silhdùK
k4rs
K. B,{ , \'. ùlstrlr-
or .rr?osr00) re le
Int qhs4@rcè{ci!lú4.Nliinvsú!sll
ondm ii er nùÌ{Nrc
r+nmsr }i[ 4pri.
Index
PROTEIN BIOINFORMATICS: An Algorithmic Approach to Sequence and Structure Analysis
nùfl
aÍtthntittu,Irli*rir'.l
Dtuikr 4Ndkúúi
Birl'}
tuedù oI Fn$ Ed Fdúi.
Bú!ù, Nr''q Nútn.t hritút rt !tuti.tt ù&îr\
búhn. uK
r h( i rt rcrhs rbú
Dhitrlo'rrnJi
PNlù' uùilanùúr, an Akú,ìiùi app trmrj5r ù idary uisr Io, lrEnù,r r 6ren( soúa ù ihe sbied 6r