# [phenixbb] sculptor : "Wrong alignment format:"

Dr G. Bunkoczi gb360 at cam.ac.uk
Mon Nov 29 03:36:59 PST 2010

Yes, it could be. Sculptor has to find the sequence corresponding to the
protein model: it first aligns the chain sequence with every sequence in
the alignment and then picks the best match. This can take some time. On my
machine, searching a 190-sequence alignment takes about 5 mins. However, if
you have several chains in your model and you want all of them to be
processed, the total time will be roughly 5 mins multiplied by the number
of protein chains.
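
To illustrate why the cost grows with both the number of sequences and the
number of chains, here is a minimal sketch of a "compare the chain with
every sequence and keep the best" search. It is not sculptor's code: the
function name best_matching_sequence is hypothetical, and the standard
library's difflib.SequenceMatcher stands in for a real sequence alignment.

```python
from difflib import SequenceMatcher


def best_matching_sequence(chain_seq, alignment):
    """Return (name, similarity) of the alignment row closest to chain_seq.

    alignment maps a sequence name to its gapped row. Every row is compared
    against the chain, so the work is proportional to the number of rows,
    and it is repeated once per chain in the model.
    """
    best_name, best_ratio = None, -1.0
    for name, row in alignment.items():
        ungapped = row.replace("-", "")  # drop gap characters before comparing
        # Stand-in similarity measure (assumption): not a real alignment score.
        ratio = SequenceMatcher(None, chain_seq, ungapped, autojunk=False).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name, best_ratio


# Toy data, made up for illustration only.
toy_alignment = {
    "seqA": "MKV-LAG",
    "seqB": "MRILSAG",
    "seqC": "MKVLLAG",
}
print(best_matching_sequence("MKVLAG", toy_alignment))
```

With 190 rows and several chains, this loop runs 190 times per chain, which
is consistent with the total time scaling with the number of chains.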

Now, I am wondering what you are trying to achieve by using such a large
alignment. If this is something you consider routine, I will spend some
time speeding up the calculation.

Obviously, you must be trying to extract as much information from the
sequence alignment as possible, and I am not sure the sequence similarity
calculation as implemented in sculptor is optimal for this (right now,
sculptor will just take the minimum of all pairwise substitution scores for
a certain position). This works well for a pairwise sequence alignment, but
for a 190-sequence alignment just results in gap scores everywhere. Could
you also give some advice on how this is best calculated? Would it be
better to calculate the average?
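
To make the minimum-versus-average question concrete, here is a small
sketch comparing the two per-position summaries. The scores (MATCH,
MISMATCH, GAP) are made-up toy values, not sculptor's substitution matrix,
and it assumes the scores are taken between the model's sequence and each
other sequence in the column; the actual pairing used in sculptor may
differ.

```python
from statistics import mean

# Hypothetical toy scores (assumption), chosen only for illustration.
MATCH, MISMATCH, GAP = 1.0, -1.0, -4.0


def pair_score(a, b):
    """Score one residue pair; any gap gets the gap score."""
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_scores(rows, ref_index=0):
    """Return a (minimum, average) pair of pairwise scores per position.

    Each position of the reference row is scored against the same position
    in every other row of the alignment.
    """
    ref = rows[ref_index]
    others = [row for i, row in enumerate(rows) if i != ref_index]
    result = []
    for pos, ref_res in enumerate(ref):
        scores = [pair_score(ref_res, row[pos]) for row in others]
        result.append((min(scores), mean(scores)))
    return result


# Toy alignment, made up for illustration; the first row is the reference.
toy_rows = ["MKVL", "MKIL", "M-VL", "MRVL"]
for pos, (lowest, average) in enumerate(column_scores(toy_rows)):
    print(pos, lowest, round(average, 2))
```

A single gap in a column is enough to pull the minimum down to the gap
score, while the average still reflects the rows that do match. With 190
rows, most columns are likely to contain at least one gap, which is why the
minimum ends up as the gap score almost everywhere.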

Best wishes, Gabor

On Nov 26 2010, Bryan Lepore wrote:

>finally got back around to this one, but it's about speed now, not format :
>
>On Mon, Nov 22, 2010 at 5:07 AM, Dr G. Bunkoczi <gb360 at cam.ac.uk> wrote:
>>>> Is this what you are running (0.3.0)?
>
>yes (via dev-590)
>
>> Could you point me to an example that takes very long? I can have
>> another go at finding the bottleneck.
>
>i could - but if i told you i have 190 sequences or 189529 characters
>(via `wc`) in the alignment, does that indicate anything?
>
>-Bryan
