DOI 10.1093bioinformaticsbtg1024 Predicting

Vol. 19 Suppl. 1 2003, pages i183–i189

BIOINFORMATICS

DOI: 10.1093/bioinformatics/btg1024

Predicting phenotype from patterns of annotation

Oliver D. King1, Jeffrey C. Lee1, Aimée M. Dudley2, Daniel M. Janse2, George M. Church2 and Frederick P. Roth1,*

1Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 250 Longwood Avenue, Boston, Massachusetts, 02115, USA and 2Department of Genetics, Harvard Medical School, 200 Longwood Avenue, Boston, Massachusetts, 02115, USA

Received on January 6, 2003; accepted on February 20, 2003

ABSTRACT

Motivation: Predicting the outcome of specific experiments (such as the growth of a particular mutant strain in a particular medium) has the potential to allow researchers to devote resources to experiments with higher expected numbers of ‘hits’.

Results: We use decision trees to predict phenotypes associated with Saccharomyces cerevisiae genes on the basis of Gene Ontology (GO) functional annotations from the Saccharomyces Genome Database (SGD) and other phenotypic annotations from the Yeast Phenotype Catalog at the Munich Information Center for Protein Sequences (MIPS). We assess the methodology in three ways: (1) we use cross-validation on the phenotypic annotations listed in MIPS, and show ROC curves indicating the tradeoff between true-positive rate and false-positive rate; (2) we do a literature search for 100 of the predicted gene-phenotype associations that are not listed in MIPS, and find evidence for 43 of them; (3) we use deletion strains to experimentally assess 61 predicted gene-phenotype associations not listed in MIPS; significantly more of these deletion strains show abnormal growth than would be expected by chance.

Contact: fritzroth@hms.harvard.edu

Supplementary Information: Complete results are available at http://llama.med.harvard.edu/~king/pheno.html

Keywords: decision trees; phenotype; gene function

INTRODUCTION

When an organism’s genome has been sequenced and its genes identified, there remains the task of determining the role of each gene in the organism, aspects of which include the gene’s molecular function and the phenotypes associated with the gene’s disruption. There is an interplay between a gene’s functional attributes and phenotypic attributes, with each providing information about the other; see Hampsey (1997) for an overview of Saccharomyces cerevisiae phenotypes and their relationships to function.

*To whom correspondence should be addressed.

Efforts to standardize the vocabulary of function and phenotype have facilitated the use of statistical methods to infer function from phenotype and vice versa. In Clare and King (2002), decision trees were used to extract rules for inferring function from phenotype in S. cerevisiae. In this paper we build decision trees for inferring phenotype from function in S. cerevisiae. Our approach differs in many details from the approach in Clare and King (2002), but closely follows the approach used in King et al. (2003) for predicting functional annotations on the basis of other functional annotations. As genes may have multiple phenotypic annotations, and as there may be informative patterns among these annotations, in this paper we predict phenotype not on the basis of annotated function alone, but on the basis of both function and other phenotypic annotations.
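To make the idea of predicting one phenotype from function and other phenotypic annotations concrete, the following is a minimal sketch of how a single decision-tree split might be chosen. It is an illustration under stated assumptions, not the procedure of King et al. (2003): the gene names, the attribute labels (`GO:...`, `PH:...`) and the use of information gain as the splitting criterion are all hypothetical choices made for this example.

```python
from math import log2

# Toy, made-up annotations (NOT real SGD or MIPS data).
# Each gene maps to a set of binary attributes: GO terms and phenotypes.
annotations = {
    "geneA": {"GO:dna_repair", "PH:uv_sensitive", "PH:slow_growth"},
    "geneB": {"GO:dna_repair", "PH:uv_sensitive"},
    "geneC": {"GO:dna_repair"},
    "geneD": {"GO:ribosome", "PH:slow_growth"},
    "geneE": {"GO:ribosome"},
    "geneF": {"GO:transport"},
}


def entropy(labels):
    """Binary entropy (in bits) of a list of True/False labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    genes = sorted(annotations)
    labels = [target in annotations[g] for g in genes]
    with_attr = [target in annotations[g] for g in genes
                 if attribute in annotations[g]]
    without_attr = [target in annotations[g] for g in genes
                    if attribute not in annotations[g]]
    n = len(genes)
    remainder = (len(with_attr) / n) * entropy(with_attr) \
        + (len(without_attr) / n) * entropy(without_attr)
    return entropy(labels) - remainder


def best_split(target):
    """Attribute (GO term or other phenotype) that best separates `target`."""
    candidates = sorted(
        {a for attrs in annotations.values() for a in attrs} - {target}
    )
    return max(candidates, key=lambda a: information_gain(a, target))


print(best_split("PH:uv_sensitive"))
```

Note that the candidate attributes include the other phenotype (`PH:slow_growth`) as well as the GO terms, which mirrors the point that a gene's remaining phenotypic annotations may carry information about the phenotype being predicted. A full tree would apply the same selection recursively to each branch.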

The training data we use consists of the Gene Ontology (GO; The Gene Ontology Consortium, 2000) annotations of function from the Saccharomyces Genome Database (SGD; Cherry et al., 1998), and the annotations of phenotype from the Yeast Phenotype Catalog at the Munich Information Center for Protein Sequences (MIPS; Mewes et al., 2002).

We assess our methodology using three approaches: cross-validation; a literature search on top-scoring predictions of gene-phenotype associations that are not listed in MIPS; and comparison with a high-throughput experimental determination of phenotype for a comprehensive collection of yeast deletion strains.
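As an illustration of the cross-validation assessment, the sketch below shows how an ROC curve (true-positive rate against false-positive rate) can be traced from prediction scores. It assumes that each gene has already received a score from a model trained without that gene's label; the scores and labels here are invented for the example and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs obtained by
    lowering the score threshold one distinct score value at a time."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Genes with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented held-out scores and true labels (1 = phenotype annotated).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(area_under_curve(points))
```

Each point on the curve corresponds to one choice of score threshold, so the curve shows how many false positives must be accepted to recover a given fraction of the known gene-phenotype associations.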

METHODS

Training data

We downloaded the MIPS phenotypic annotations from http://mips.gsf.de/proj/yeast/catalogues/phenotype and the SGD GO annotations from http://www.51wendang.com. The versions of the files that we used were downloaded on October 10, 2002.


© Oxford University Press 2003; all rights reserved. Bioinformatics 19(Suppl. 1)
