Structural domains sharing no more than 40% sequence identity with one another
were obtained from ASTRAL 2.06 (corresponding to SCOPe 2.06) and organized into
superfamilies. The superfamilies were then classified, according to the number
of domains each contains, into single-member superfamilies (SMS) and
multi-member superfamilies (MMS).
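
The SMS/MMS split described above can be sketched in a few lines of Python. This is only an illustration, not the script used for the database: the function name, the input mapping and the domain and superfamily identifiers are all hypothetical.

```python
from collections import defaultdict


def classify_superfamilies(domain_to_superfamily):
    """Group domains by superfamily and label each superfamily SMS or MMS.

    `domain_to_superfamily` maps a domain identifier to its superfamily
    code (a hypothetical input format, assumed for this sketch).
    """
    members = defaultdict(list)
    for domain, superfamily in domain_to_superfamily.items():
        members[superfamily].append(domain)
    # One domain -> single-member (SMS); two or more -> multi-member (MMS).
    return {
        sf: ("SMS" if len(domains) == 1 else "MMS", sorted(domains))
        for sf, domains in members.items()
    }


# Hypothetical identifiers, for illustration only.
example = {"dom_a": "sf.1", "dom_b": "sf.1", "dom_c": "sf.2"}
labels = classify_superfamilies(example)
```

Only MMS entries would go on to the multiple-alignment steps, since a single-member superfamily has nothing to align against.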

The initial alignment of the structural domains within each superfamily was
obtained using MATT (Multiple Alignment with Translations and Twists) (5),
which also provided a structural distance-based tree. Using JOY-5.0v (6), the
initial alignment was annotated with secondary structure, hydrophobicity,
solvent accessibility and other structural features. JOY-4.0v (6) was used to
identify equivalences in the initial alignment, that is, non-gapped aligned
regions in each member of the superfamily. The structure-guided tree and the
equivalences were provided as inputs to COMPARER (7), which uses variable gap
penalties and local structural features such as backbone conformation, solvent
accessibility and hydrogen bonding patterns to create the final
structure-based sequence alignment. In general, the variable gap penalties
ensure that no unreasonable gaps are placed within secondary structures and
conserved regions of the alignment. JOY-3.2v or MNYFIT (6) was used for
rigid-body superposition of the structures; it requires equivalences as input,
which were extracted from the final alignment using JOY-4.0v.

Although members of a superfamily are expected to be structurally similar or
to share a common fold, we came across cases where one or more domains in a
superfamily were structurally deviant. Such domains either had more than 5.5 Å
RMSD with other members (structural outliers) or failed to align with other
members (extreme structural outliers) (8,9). We also encountered situations
where the extreme structural outliers aligned among themselves, forming
subgroups of superfamilies (split superfamilies). In some cases, the remaining
members of a superfamily failed to align even after several extreme structural
outliers had been removed. In such cases, using the structural phylogeny of
all the members as reference, the superfamily was split and each subgroup was
aligned separately, thus giving rise to ‘split superfamilies’.
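
The outlier categories above amount to a simple decision rule, sketched below in Python. The 5.5 Å cutoff comes from the text; everything else is an assumption made for illustration. In particular, the text does not say how a domain's RMSD against the other members is summarized into one value, so the sketch takes that value as given and uses `None` to mean that the domain failed to align.

```python
RMSD_CUTOFF = 5.5  # Å, the threshold quoted in the text


def classify_member(rmsd):
    """Label one domain from its RMSD against the rest of the superfamily.

    `rmsd` is a float in Å, or None if the domain failed to align at all.
    How this single value is derived is an assumption of this sketch.
    """
    if rmsd is None:
        return "extreme structural outlier"
    if rmsd > RMSD_CUTOFF:
        return "structural outlier"
    return "regular member"


# Hypothetical domains and values, for illustration only.
observed = {"dom_a": 1.8, "dom_b": 6.2, "dom_c": None}
labels = {name: classify_member(value) for name, value in observed.items()}
```

Deciding when to split a superfamily additionally depends on the structural phylogeny and is not captured by this rule.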

 

Hidden Markov models (HMMs) of the superfamily alignments and conserved
secondary structural motifs were generated using the hmmbuild module of the
HMMER suite and the in-house SMotif programs, respectively (10–12). Absolutely
conserved residues were extracted from the alignments of all superfamilies
using a Python script that reads each alignment and reports the positions
showing 100% amino acid conservation.
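
A minimal sketch of such a conservation scan is given below. It is not the in-house script: the function name and the toy sequences are hypothetical, and it assumes the alignment has already been parsed into equal-length strings with `-` or `.` marking gaps.

```python
def absolutely_conserved(aligned_seqs, gap_chars="-."):
    """Return (column, residue) pairs where every sequence has the same residue.

    Columns are numbered from 1 along the alignment. Columns that contain
    a gap in any sequence are never reported.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    conserved = []
    for index, column in enumerate(zip(*aligned_seqs), start=1):
        residues = {residue.upper() for residue in column}
        if len(residues) == 1:
            (residue,) = residues
            if residue not in gap_chars:
                conserved.append((index, residue))
    return conserved


# Toy alignment, for illustration only.
toy_alignment = ["MKT-LV", "MRTALV", "MKTGLI"]
positions = absolutely_conserved(toy_alignment)
# positions == [(1, 'M'), (3, 'T'), (5, 'L')]
```

Reporting alignment columns rather than per-structure residue numbers keeps the sketch short; mapping back to each member's own numbering would need the ungapped index of every sequence.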

The alignments were annotated with JOY-5.0v to produce accessory files such as
PSA, HBD and SST. Principal component analysis (PCA) plots were constructed
from the distribution of sequence similarity among the members of a
superfamily and are available for download. In addition, alignment statistics
(ALISTAT) and indel information (CUSP) are provided for each superfamily (13).
Cα RMSDs at structurally equivalent positions of the members of each
superfamily were used to construct structure-guided trees, which are available
for download. Gene Ontology (GO) describes the properties of gene products
under three major terms: cellular component, molecular function and biological
process (14). GO terms corresponding to each member within the superfamilies
were retrieved dynamically from www.rcsb.org using RESTful API clients written
in Python.
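
The Cα RMSD used for the structure-guided trees can be illustrated as follows. The sketch assumes the two structures are already superposed and that the coordinate lists contain only the structurally equivalent Cα atoms, in matching order. The function name and coordinates are hypothetical, and the tree-building step itself is not shown because the text does not specify it.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in Å between two equal-length lists of (x, y, z) Cα coordinates.

    No superposition is performed here; the inputs are assumed to be
    in the same frame already.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy Cα traces offset by 1 Å along z, for illustration only.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = ca_rmsd(trace_a, trace_b)  # 1.0
```

Computing this value for every pair of members gives a distance matrix from which a tree could be derived by a standard clustering method.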

MySQL 5.2 was employed as the database engine for this version, along with
Python 2.7 and Biopython (15) for the back-end data retrieval and manipulation
logic. The user interface was built with HTML5, CSS, JavaScript, Ajax and
jQuery. Molecular structures and phylogenetic trees are visualized using
JSmol, Raphaël and jsPhyloSVG (16). The alignment view and the mapping of
conserved residues are implemented using an in-house plug-in.
