Current Bioinformatics
ISSN: 1574-8936

Current Bioinformatics
Volume 5, Number 4, December 2010
Contents
Graphical Representations of Protein Sequences for Alignment-Free
Comparative and Predictive Studies. Recognition of Protease
Inhibition Pattern from H-Depleted Molecular Graph Representation
of Protease Sequences Pp. 241-252
Michael Fernandez, Julio Caballero, Leyden Fernandez
and Akinori Sarai
A Review of Methods and Tools for Database
Integration in Biomedicine Pp. 253-269
Alberto Anguita, Luis Martín, David Pérez-Rey
and Víctor Maojo
Pre-Processing of Affymetrix Gene Chip
Microarray Data Pp. 270-279
Ahmed R. Hasan, John E. Pattison and Alex Hariz
Microarray Data Integration: Frameworks
and a List of Underlying Issues Pp. 280-289
Chintanu K. Sarmah and Sandhya Samarasinghe
Network Building of Proteins in a Biochemical
Pathway: A Computational Biology Related Model for Target
Discovery and Drug-Design Pp. 290-295
Chiranjib Chakraborty, Sanjiban S. Roy, Chi-Hsin Hsu,
Zhi-Hong Wen and Chan-Shing Lin
A Review of Ensemble Methods in Bioinformatics
Pp. 296-308
Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert
Y. Zomaya
Abstracts

Graphical Representations of Protein Sequences for Alignment-Free
Comparative and Predictive Studies. Recognition of Protease
Inhibition Pattern from H-Depleted Molecular Graph Representation
of Protease Sequences
Michael Fernandez, Julio Caballero, Leyden Fernandez
and Akinori Sarai
Biomacromolecular information hinges on sequence and
structure representations. Because structure is often more
conserved than sequence, achieving function inference from
structural similarity is easier than from sequence analysis.
However, structural information is sparse and only available
for a small part of the protein space. Detecting subtle similarities
between proteins from sequence depends strongly on the representations
used. Continuous-space representations yield promising results
in comparative evolution analysis, structural classification
and sequence-function/property relationship studies. These
simple methods provide pre-classification and/or feature
generation stages for sophisticated classification methods.
We review the state of the art in protein sequence graphical
representations along with some derived metrics for statistical
pattern recognition analysis. In addition, the binding stability
pattern of protease-inhibitor complexes is modelled from H-depleted
molecular graph representation of protease sequences and ligands
using support vector machines with about 80% prediction accuracy.
A Review of Methods and Tools for Database Integration
in Biomedicine
Alberto Anguita, Luis Martín, David Pérez-Rey
and Víctor Maojo
The post-genomic era, beginning at the end of the Human
Genome Project, has led to new needs and challenges in the
management of clinical and -omics data. One of the main issues
in this new context for biomedical data management is the
integration of heterogeneous sources, enabling access to different,
remote biological data sources and the interpretation and
discovery of new knowledge. Many researchers and practitioners
in a wide range of biomedical areas, such as those related
to genomic and personalized medicine, have
to access data located at numerous remote sources. Over
the last decade, this new scientific context has stimulated
research into developing new techniques for seamless web-based
data integration and access. Some of the main challenges include
the integration of scattered, non-structured public databases,
how to deal with sensitive personal information, or how to
manage image data. This paper presents a review of methods,
techniques and tools for data integration.
Pre-Processing of Affymetrix Gene Chip Microarray
Data
Ahmed R. Hasan, John E. Pattison and Alex Hariz
Microarray technology has revolutionized biomedical
research because it is now possible to concurrently determine
the gene expression levels for the whole genome of a target
organism. The accuracy of the computed gene expression levels
is extremely important for the successful use of this technology.
However, microarray gene expression measurements are inherently
very ‘noisy’, meaning that appropriate techniques
are required to compute accurate gene expression levels. Therefore,
the pre-processing of microarray data warrants special consideration.
Although there are many candidate techniques for the pre-processing
of microarray data, there is no clear-cut best option. In
this review, we discuss some of the most important pre-processing
techniques applicable to the Affymetrix microarray platform.
We also discuss the problems involved in evaluating the different
candidate techniques and consider other crucial issues related
to the pre-processing of Affymetrix microarray data.
Microarray Data Integration: Frameworks and a List
of Underlying Issues
Chintanu K. Sarmah and Sandhya Samarasinghe
Microarray technology is expanding rapidly, providing an extensive
as well as promising source of data for better addressing
complex questions involving biological processes. The ever-increasing
number of publicly available gene expression studies
of human and other organisms provides strong motivation to
carry out cross-study analyses. In addition, microarray technology
offers investigators several platforms, including arrays
from commercial vendors such as Affymetrix®
(Santa Clara, CA, USA) and Agilent®
(Palo Alto, CA, USA), and other proprietary arrays of various
laboratories. Integrating multiple studies that are based
on the same technological platform, or combining data from
different array platforms, has the potential to yield higher
accuracy, consistency and more robust information mining. The integrated
result often allows a more complete and broader picture to be
constructed.
In this work, we highlight and exemplify two frameworks
of microarray data integration approaches that are in practice.
This is followed by a discussion of the important issues that may
influence any microarray data integration attempt. The review,
in general, is intended to serve as a starting point for those
interested in exploring this area of microarray study, while
making them aware of the pertinent underlying issues.
Network Building of Proteins in a Biochemical Pathway:
A Computational Biology Related Model for Target Discovery
and Drug-Design
Chiranjib Chakraborty, Sanjiban S. Roy, Chi-Hsin Hsu,
Zhi-Hong Wen and Chan-Shing Lin
With the advances in bioinformatics, drug design strategies
have progressed with a focus on target discovery. The ‘proteins
and enzymes’ target class represents potential drug
targets for different diseases. The network of the biochemical
pathway of a disease and its drug targets needs to be thoroughly
understood. The accessibility of fully sequenced genomes,
their total information and their products (proteins and enzymes)
has enabled researchers to reconstruct and study network-based
biochemical pathways. Network building of proteins in
a biochemical pathway is of utmost importance because it helps
to identify the protein (enzyme or receptor) to use as a drug target,
whose inhibition guides the discovery of a set of compounds
(drug-like molecules) while incurring minimal side effects.
In this paper, we discuss the networking of proteins
using biochemical pathways, which can be used as targets for
drug development, the architecture of biochemical networks, network
algorithms, and tools and software packages for network building.
A Review of Ensemble Methods in Bioinformatics
Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert
Y. Zomaya
Ensemble learning is an intensively studied technique in machine
learning and pattern recognition. Recent work in computational
biology has seen an increasing use of ensemble learning methods
due to their unique advantages in dealing with small sample
size, high dimensionality, and complex data structures. The
aim of this article is twofold. First, it provides
a review of the most widely used ensemble learning methods
and their application in various bioinformatics problems,
including the main topics of gene expression, mass spectrometry-based
proteomics, gene-gene interaction identification from genome-wide
association studies, and prediction of regulatory elements
from DNA and protein sequences. Second, it identifies
and summarizes future trends of ensemble methods in bioinformatics.
Promising directions such as ensembles of support vector machines,
meta-ensembles, and ensemble-based feature selection are discussed.