Protein secondary structure carries information about local structural
arrangements. The great majority of successful methods for predicting secondary
structure are based on multiple sequence alignment. However, multiple alignment
fails to achieve accurate results when a protein sequence is characterized by low
homology. To address this, we propose a novel method for prediction of secondary
structure content through a comprehensive sequence representation. The method
employs a support vector machine (SVM) regression system and adopts
a different pseudo amino acid composition (PseAAC), which partially takes into
account sequence-order effects, to represent protein samples. It was shown by both
the self-consistency test and the independent-dataset test that the trained SVM is
highly effective at capturing the relationship between the PseAAC and the content of
protein secondary structural elements, including α-helix, 3₁₀-helix, π-helix, β-strand,
β-bridge, turn, bend and the remaining random coil. Results superior to or competitive with the
popular methods have been obtained, indicating that the present method may at
least serve as an alternative to existing predictors in this area.
Keywords: pseudo amino acid composition, support vector machine, protein
secondary structure content, prediction.
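
To illustrate how a pseudo amino acid composition can partially encode
sequence-order effects, the following minimal Python sketch computes a
conventional PseAAC-style vector: 20 amino acid frequencies followed by a few
sequence-order correlation factors. It is not the different PseAAC form
adopted in this work. The use of the Kyte-Doolittle hydropathy scale as the
single residue property, the function name pseaac, and the parameters lam
(number of correlation tiers) and weight are all assumptions made for
illustration only.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed here as the only residue property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / std for a in AMINO_ACIDS}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style vector (illustrative only)."""
    seq = sequence.upper()
    if any(residue not in KYTE_DOOLITTLE for residue in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = _standardize(KYTE_DOOLITTLE)
    n = len(seq)

    # Conventional amino acid composition: order-independent frequencies.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Tier-k correlation factor: mean squared property difference between
    # residues that are k positions apart; this carries the order information.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Calling pseaac on a sequence with lam=3 yields a 23-component vector whose
entries sum to 1. In a pipeline of the kind described above, such vectors would
serve as the inputs to one SVM regressor per secondary structural element, with
the content of that element as the regression target.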