<HTML><HEAD><TITLE>25º Congresso Brasileiro de Microbiologia </TITLE><link rel=STYLESHEET type=text/css href=css.css></HEAD><BODY aLink=#ff0000 bgColor=#FFFFFF leftMargin=0 link=#000000 text=#000000 topMargin=0 vLink=#000000 marginheight=0 marginwidth=0><table align=center width=700 cellpadding=0 cellspacing=0><tr><td align=left bgcolor=#cccccc valign=top width=550><font face=arial size=2><strong><font face=Verdana, Arial, Helvetica, sans-serif size=3><font size=1>25º Congresso Brasileiro de Microbiologia </font></font></strong><font face=Verdana size=1><b><br></b></font><font face=Verdana, Arial,Helvetica, sans-serif size=1><strong> </strong></font></font></td><td align=right bgcolor=#cccccc valign=top width=150><font face=arial size=2><strong><font face=Verdana, Arial, Helvetica, sans-serif size=1><font size=1>ResumoID:1919-1</font></em></font></strong></font></td></tr><tr><td colspan=2><br><br><table align=center width=700><tr><td>Area: <b>Genetics and Molecular Biology (Division N)</b><p align=justify><strong><SPAN STYLE="FONT-WEIGHT: BOLD;"></SPAN>ARTIFICIAL INTELLIGENCE APPLIED TO BIOENERGY GENOMICS: PROBABILISTIC ANNOTATION OF MICROBIAL GENOMES<BR></strong></p><p align=justify><b>Fabio Filocomo </b> (<i>FMRP-USP</i>); <b><u>Ricardo Vêncio </u></b> (<i>FMRP-USP</i>)<br><br></p><b><font size=2>Abstract</font></b><p align=justify class=tres><font size=2> <meta http-equiv="CONTENT-TYPE" content="text/html; charset=utf-8"><title></title><meta name="GENERATOR" content="OpenOffice.org 2.3 (Linux)"> <style type="text/css"> <!-- @page { size: 21.59cm 27.94cm; margin: 2cm } P { margin-bottom: 0.21cm } --> </style> <p style="background: transparent none repeat scroll 0% 50%; margin-bottom: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; line-height: 150%;" align="justify"> <font size="3"><b>INTRODUCTION:</b></font><font size="3"> Brazil has become a key player in energy production from renewable 
sources, in particular ethanol derived from sugarcane (1st-gen) and waste cellulose (2nd-gen). Microbial genomics plays a key role in both approaches. However, it is important to acknowledge that, although the volume of sequencing data keeps growing, the functional characterization of these sequences</font><font size="3"><i> </i></font><font size="3">advances much more slowly. </font><font size="3"><b>OBJECTIVE:</b></font><font size="3"> We aim to develop computational tools for the automatic probabilistic annotation of microbial genomes, in particular those of two bioenergy-related organisms: </font><font size="3"><i>Leifsonia</i></font><font size="3"> </font><font size="3"><i>xyli</i></font><font size="3">, which causes sugarcane diseases, and </font><font size="3"><i>Trichoderma</i></font><font size="3"> </font><font size="3"><i>reesei</i></font><font size="3">, a cellulosic ethanol producer. The technical challenge is to estimate the probability that a gene belongs to a given functional category, instead of simply assigning a function whenever the gene meets some arbitrary criterion. </font><font size="3"><b>METHODOLOGY:</b></font><font size="3"> The starting point is the phylogenomics-based SIFTER method, developed by </font><font size="3"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">Engelhardt and colleagues in 2005, 2006 and 2009, which uses the</span></font><font size="3"> </font><font size="3"><i>Bayesian Networks</i></font><font size="3"> (BN) methodology. The BN topology for each gene is built from a phylogenetic tree (PT), and any information available on related genes is propagated through the tree using classical BN algorithms. This procedure is an advance over BLAST-based approaches. 
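As an illustration only, the propagation step described above can be sketched with classical sum-product message passing on a tree. This is a toy model and not SIFTER's actual model: the two-state function variable, the gene names, the flip_prob parameter and the uniform prior are all assumptions made for the example.

```python
# Toy sketch (assumption, not SIFTER): each gene either lacks (index 0) or has
# (index 1) a given function. Along every branch of the tree the state flips
# with a hypothetical probability flip_prob. Because this transition is
# symmetric and the prior is uniform, the tree can be treated as undirected
# and messages can be passed towards whichever gene is being queried.

def edge_message(tree, evidence, node, parent, flip_prob):
    """Message sent by node to parent: for each state of the parent, the
    probability of all evidence found on node's side of the branch."""
    belief = node_belief(tree, evidence, node, parent, flip_prob)
    stay = 1.0 - flip_prob
    return [stay * belief[0] + flip_prob * belief[1],
            flip_prob * belief[0] + stay * belief[1]]

def node_belief(tree, evidence, node, exclude, flip_prob):
    """Unnormalized belief at node, combining its own evidence with the
    messages from every neighbor except exclude."""
    # A gene without annotation contributes an uninformative likelihood.
    belief = list(evidence.get(node, [1.0, 1.0]))
    for neighbor in tree[node]:
        if neighbor != exclude:
            msg = edge_message(tree, evidence, neighbor, node, flip_prob)
            belief = [belief[0] * msg[0], belief[1] * msg[1]]
    return belief

def posterior_has_function(tree, evidence, query, flip_prob=0.1):
    """Posterior probability that the query gene has the function."""
    belief = node_belief(tree, evidence, query, None, flip_prob)
    return belief[1] / (belief[0] + belief[1])

# Hypothetical phylogenetic tree as an adjacency map (names are made up).
tree = {
    "root": ["anc", "geneC"],
    "anc": ["root", "geneA", "query"],
    "geneA": ["anc"],
    "query": ["anc"],
    "geneC": ["root"],
}
# Curated annotations: geneA has the function, geneC does not.
evidence = {"geneA": [0.0, 1.0], "geneC": [1.0, 0.0]}

print(round(posterior_has_function(tree, evidence, "query"), 3))
```

Under these assumptions the unannotated gene receives a probability of roughly 0.63 of having the function, since its closest relative is annotated with it; the output is a probability rather than a yes/no assignment based on a similarity cutoff.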
The datasets are the genome sequences of </font><font size="3"><i>L. xyli</i></font><font size="3"><span style="font-style: normal;"> </span></font><font size="3"><span style="font-style: normal;"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">and </span></span></font><font size="3"><i><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">T</span></i></font><font size="3"><span style="font-style: normal;"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">. </span></span></font><font size="3"><i><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">reesei</span></i></font><font size="3"><span style="font-style: normal;"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">, made available in 2004 and 2008, respectively. 
</span></span></font><font size="3"><span style="font-style: normal;"><b><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">RESULTS</span></b></span></font><font size="3"><span style="font-style: normal;"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">: The</span></span></font><font size="3"> SIFTER methodology was implemented as a pipeline in our lab, and preliminary tests against current state-of-the-art tools confirm the claims of superior performance</font><font size="3"><span style="font-style: normal;"><span style="background: transparent none repeat scroll 0% 50%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">, taking the published, manually curated data as the gold standard. </span></span></font><font size="3">A careful theoretical study of BN properties and of the requirements of microbial genomics identified some limitations in SIFTER's underlying models. In microbiology it is well known that a phylogenetic network (PN) represents the evolutionary relationships among genes better than a PT, because of events such as lateral gene transfer. Our results indicate that BNs have properties, such as being directed acyclic graphs, that fit perfectly into the PN framework. These theoretical studies led to improvements in the annotation methodology, and we anticipate that the new mathematical models will produce better results when applied specifically to microorganisms. The necessary validation is under way. 
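To make the structural argument concrete, the following minimal sketch checks that a phylogenetic tree extended with one lateral-transfer edge is still a directed acyclic graph, and so remains a valid BN topology. The node names and the transfer event are hypothetical, and the check uses Python's standard graphlib module rather than any part of our pipeline.

```python
from graphlib import CycleError, TopologicalSorter

def is_valid_bn_topology(parents):
    """Return True if the map from each node to its parents has no directed cycle."""
    try:
        # static_order() raises CycleError when the graph contains a cycle.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

# Hypothetical phylogenetic tree: every node except the root has one parent.
tree_parents = {
    "anc1": ["root"],
    "anc2": ["root"],
    "geneA": ["anc1"],
    "geneB": ["anc1"],
    "geneC": ["anc2"],
}

# Phylogenetic network: geneB also receives material from the anc2 lineage
# (an assumed lateral transfer), so it now has two parents.
network_parents = dict(tree_parents)
network_parents["geneB"] = ["anc1", "anc2"]

# Nodes with two or more parents are the reticulation points of the network.
reticulations = [n for n, ps in network_parents.items() if len(ps) >= 2]

print(is_valid_bn_topology(tree_parents))
print(is_valid_bn_topology(network_parents))
print(reticulations)
```

In this sketch the only difference between the PT and the PN is that one node has two parents instead of one. A BN handles this with a conditional distribution over both parents, whereas a strictly tree-shaped model cannot express it.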
</font><font size="3"><span style="">Acknowledgments: </span></font><font size="3">MCT/CNPq grant Universal-A/470616/2008-3.</font></p> </font></p><br><b>Keywords: </b>&nbsp;Bioinformatics, Computational Biology, Genome Annotation, Artificial Intelligence, Bayesian Networks</td></tr></table></tr></td></table></body></html>