Bioinformatics Applications in Medicine: The Human Genome. How Can We See Such Detail?
DOI: https://doi.org/10.15446/abc.v21n1Supl.51233
Keywords: bioinformatics, human genome, personalized genomics, sequencing
Bioinformatics is a relatively new field that supports part of the biological research aimed at identifying gene variants that can be discovered through whole-genome studies. With this motivation, we present an overview of the main contributions of bioinformatics to the sequencing of the first human genome. We also summarize the main computational advances developed to meet the demands that next-generation sequencing methods impose when re-sequencing a human genome. Finally, we introduce some of the new challenges that must be addressed in order to apply personalized genomics to the development of medicine.
License
Copyright 2016 Acta Biológica Colombiana
This work is licensed under a Creative Commons Attribution 4.0 International license.
1. Acceptance of a manuscript by the journal implies, in addition to its open-access electronic publication under the Attribution-NonCommercial-ShareAlike 4.0 license (CC BY NC SA), the inclusion and dissemination of the full text through the institutional repository of the Universidad Nacional de Colombia and in any specialized databases that the editor considers suitable for indexing, with a view to increasing the journal's visibility.
2. Acta Biológica Colombiana allows authors to archive, download, and share the final published version, as well as the pre-print and post-print versions, provided they include a header with the bibliographic reference of the published article.
3. Authors may enter into other non-exclusive license agreements for distributing the published version of the work (e.g., depositing it in an institutional electronic archive or publishing it in a monographic volume), provided that its initial publication in this journal is indicated.
4. Authors are permitted and encouraged to disseminate their work online (e.g., in institutional archives, on their personal website, or on academic social networks such as Academia, ResearchGate, or Mendeley), which can lead to interesting exchanges and increase citations of the published work. (See The Effect of Open Access.)