Welcome to MilkyWay@home

Unusual..........time to completion

Message boards : Number crunching : Unusual..........time to completion
Message board moderation

To post messages, you must log in.

Previous · 1 · 2

AuthorMessage
Profile Conan
Avatar

Send message
Joined: 2 Jan 08
Posts: 123
Credit: 69,546,274
RAC: 1,032
Message 11383 - Posted: 18 Feb 2009, 11:40:29 UTC

G'Day Travis,

Here is something different to credit (I might contribute to that later if this problem keeps happening).

I have just returned a result 5048610 , that has gone way past normal run time with optimised app. Normal time is around 990 to 1,100 seconds for the s21 and s20 work units but this one took 6,301 seconds.
I have another s86 running at the moment that has been going for 33 minutes and is at 30%, this one should of finished after 670 seconds.
If this becomes normal then I will have to stop this computer as the recent credit cuts make this a 5 credit/ hour machine and that is not worth the electricity.

Thanks, Conan.
ID: 11383 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11393 - Posted: 18 Feb 2009, 12:07:44 UTC - in response to Message 11383.  

G'Day Travis,

Here is something different to credit (I might contribute to that later if this problem keeps happening).

I have just returned a result 5048610 , that has gone way past normal run time with optimised app. Normal time is around 990 to 1,100 seconds for the s21 and s20 work units but this one took 6,301 seconds.
I have another s86 running at the moment that has been going for 33 minutes and is at 30%, this one should of finished after 670 seconds.
If this becomes normal then I will have to stop this computer as the recent credit cuts make this a 5 credit/ hour machine and that is not worth the electricity.

Thanks, Conan.


Have you tried using the stock app? I'm not quite sure what would be bugging your optimized application out... especially since it might not necessarily be our code. If anything it shouldn't be normal...
ID: 11393 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11445 - Posted: 18 Feb 2009, 15:55:44 UTC - in response to Message 11383.  

Here is something different to credit (I might contribute to that later if this problem keeps happening).

I have just returned a result 5048610 , that has gone way past normal run time with optimised app. Normal time is around 990 to 1,100 seconds for the s21 and s20 work units but this one took 6,301 seconds.
I have another s86 running at the moment that has been going for 33 minutes and is at 30%, this one should of finished after 670 seconds.
If this becomes normal then I will have to stop this computer as the recent credit cuts make this a 5 credit/ hour machine and that is not worth the electricity.

Thanks, Conan.

I have seen this, too. I couldn't find the reason yet, but it appears to happen only for certain WUs (but for all types, so it is no single/dual stream issue).

@Travis:
Are there some WUs with an unusual parameter set (some angles outside of [-2Pi, 2Pi] or whatever) and/or are you seeing some unusual results? The affected WUs just take longer, they still come out as valid, so at least the outlier detection does not kick in.

I will try to narrow it down and will report, if the stock app also shows this behaviour. At least if I can find an affected WU on my computers.
ID: 11445 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11448 - Posted: 18 Feb 2009, 16:07:55 UTC - in response to Message 11445.  

Here is something different to credit (I might contribute to that later if this problem keeps happening).

I have just returned a result 5048610 , that has gone way past normal run time with optimised app. Normal time is around 990 to 1,100 seconds for the s21 and s20 work units but this one took 6,301 seconds.
I have another s86 running at the moment that has been going for 33 minutes and is at 30%, this one should of finished after 670 seconds.
If this becomes normal then I will have to stop this computer as the recent credit cuts make this a 5 credit/ hour machine and that is not worth the electricity.

Thanks, Conan.

I have seen this, too. I couldn't find the reason yet, but it appears to happen only for certain WUs (but for all types, so it is no single/dual stream issue).

@Travis:
Are there some WUs with an unusual parameter set (some angles outside of [-2Pi, 2Pi] or whatever) and/or are you seeing some unusual results? The affected WUs just take longer, they still come out as valid, so at least the outlier detection does not kick in.

I will try to narrow it down and will report, if the stock app also shows this behaviour. At least if I can find an affected WU on my computers.



Angles outside of +/- 2PI shouldn't effect the calculation time at all.

What might be happening is there's a line in the optimized client code that does the following check:

if (alpha==delta==1) {
double rs = rg + r0;
(*bg_prob) += qw_r3_N[i] / (rg*rs*rs*rs);
} else (*bg_prob) += qw_r3_N[i] / (pow(rg, alpha) * pow(rg + r0, alpha_delta3));

What might be happening is that since alpha and delta are doubles (not integers) they might not always evaluate == 1, which would make the application do the more expensive calculation.

It'd be doing 2 pows and a multiply, instead of 3 multiplies in the middle of our integral loop, which could make it run quite a bit longer.
ID: 11448 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11472 - Posted: 18 Feb 2009, 18:25:44 UTC - in response to Message 11448.  
Last modified: 18 Feb 2009, 18:30:35 UTC

Angles outside of +/- 2PI shouldn't effect the calculation time at all.
Who knows? But the trigonometric stuff is really unlikely. Maybe these WUs are hitting a limit of the argument range for some other function causing some additional work. I have no idea in the moment.

What might be happening is there's a line in the optimized client code that does the following check:

if (alpha==delta==1) {
double rs = rg + r0;
(*bg_prob) += qw_r3_N[i] / (rg*rs*rs*rs);
} else (*bg_prob) += qw_r3_N[i] / (pow(rg, alpha) * pow(rg + r0, alpha_delta3));

What might be happening is that since alpha and delta are doubles (not integers) they might not always evaluate == 1, which would make the application do the more expensive calculation.

It'd be doing 2 pows and a multiply, instead of 3 multiplies in the middle of our integral loop, which could make it run quite a bit longer.

Have you seen this behaviour also with the stock application?

If this is the reason, it would surprise me as 1.0 has an exact binary representation and the values in alpha and delta are just read from the parameters file and not the result of a calculation. I think the runtime difference is slightly too small for this being the reason. Furthermore it should have been visible much earlier, not only in the last days. Already the public version of the optimized 1.22 app has used this trick.

Maybe I find some time to look into it later today. First I need to find an affected WU to play around with. But in the moment I'm busy trying to get the Win32 ATI GPU app running ;)
ID: 11472 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11543 - Posted: 19 Feb 2009, 1:14:12 UTC

It is definitely not the selection of the code path going wrong.

Just found two neighbouring WUs:
ps_s23_9_search_parameters_132749_1235002485
ps_s23_9_search_parameters_132750_1235002485

The first being a slow one, the second takes the normal time. The difference is that for the slow one the first stream integral (it is a dual stream WU) is very close to zero (10^-269 after half of the WU, just looked in a checkpoint). The second one shows more common values (some hundred).
I guess what happens is an underflow within the calculations. The handling of the denormalized floating point numbers most likely cause the speed difference.

I have saved the parameter files, so one can investigate that later. Have to get some sleep now (it's 2:13 a.m. here).
ID: 11543 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile arkayn
Avatar

Send message
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 11568 - Posted: 19 Feb 2009, 3:40:55 UTC - in response to Message 11543.  

Yep, just got a pair as well.



One is blazing and the other iced.
ID: 11568 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Paul D. Buck

Send message
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 11579 - Posted: 19 Feb 2009, 6:35:13 UTC - in response to Message 11448.  

What might be happening is that since alpha and delta are doubles (not integers) they might not always evaluate == 1, which would make the application do the more expensive calculation.

It'd be doing 2 pows and a multiply, instead of 3 multiplies in the middle of our integral loop, which could make it run quite a bit longer.


Should you not be then doing an ABS difference and then checking to see if they are within a tolerance?

Probably not C code but:

if ABS( alpha - delta) < 0.0001 then ... (or whatever the error bound is ...
ID: 11579 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11591 - Posted: 19 Feb 2009, 9:25:34 UTC - in response to Message 11568.  
Last modified: 19 Feb 2009, 9:38:29 UTC

Yep, just got a pair as well.



One is blazing and the other iced.

Is that your Linux Notebook or the OSX one? For your Windows Core2Duo it's too slow I guess.
That would show it happens also with the stock app (OSX) and the Linux variant (compiled from unaltered sources). That would make it a general problem.
Maybe not for the GPU app (at least performance wise) as GPUs flush denormals to zero. In fact, I have not found slow WUs running on a GPU, yet.

So we are back to some strange parameters of the affected WUs or some quirk of the algorithm present in all applications. Have to work now, a more thorough look at the problem has to wait a bit.
ID: 11591 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11593 - Posted: 19 Feb 2009, 9:31:43 UTC - in response to Message 11579.  

What might be happening is that since alpha and delta are doubles (not integers) they might not always evaluate == 1, which would make the application do the more expensive calculation.

It'd be doing 2 pows and a multiply, instead of 3 multiplies in the middle of our integral loop, which could make it run quite a bit longer.


Should you not be then doing an ABS difference and then checking to see if they are within a tolerance?

Probably not C code but:

if ABS( alpha - delta) < 0.0001 then ... (or whatever the error bound is ...

As I explained above, the values alpha and delta are not the result of a calculation, they are read from a parameter file. It is okay to test them for being equal to exactly 1.0.
ID: 11593 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11598 - Posted: 19 Feb 2009, 9:50:48 UTC - in response to Message 11593.  

What might be happening is that since alpha and delta are doubles (not integers) they might not always evaluate == 1, which would make the application do the more expensive calculation.

It'd be doing 2 pows and a multiply, instead of 3 multiplies in the middle of our integral loop, which could make it run quite a bit longer.


Should you not be then doing an ABS difference and then checking to see if they are within a tolerance?

Probably not C code but:

if ABS( alpha - delta) < 0.0001 then ... (or whatever the error bound is ...

As I explained above, the values alpha and delta are not the result of a calculation, they are read from a parameter file. It is okay to test them for being equal to exactly 1.0.



Yeah, I haven't had any problems with this either, however if we compile our code using MPI it does throw a warning. That's why I thought it might be a problem.
ID: 11598 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11599 - Posted: 19 Feb 2009, 9:54:18 UTC - in response to Message 11591.  

So we are back to some strange parameters of the affected WUs or some quirk of the algorithm present in all applications. Have to work now, a more thorough look at the problem has to wait a bit.


Yeah this is very strange, especially since the code is pretty deterministic, just a bunch of for loops with some calculations in the middle. I guess the next step would be to recreate the problem.


If anyone sees a WU running slowly and can copy the parameter files for it and post them here (or in the code discussion forum), that would be really helpful in seeing if it's some problem with the parameter files or something else.
ID: 11599 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11603 - Posted: 19 Feb 2009, 10:31:46 UTC - in response to Message 11599.  
Last modified: 19 Feb 2009, 10:36:20 UTC

So we are back to some strange parameters of the affected WUs or some quirk of the algorithm present in all applications. Have to work now, a more thorough look at the problem has to wait a bit.


Yeah this is very strange, especially since the code is pretty deterministic, just a bunch of for loops with some calculations in the middle. I guess the next step would be to recreate the problem.


If anyone sees a WU running slowly and can copy the parameter files for it and post them here (or in the code discussion forum), that would be really helpful in seeing if it's some problem with the parameter files or something else.

Done that already.
slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 24.879482473428617 0.671980681029048 1.195094522806092 0.347010963442642 -2.503350179338553 127.081445210280563 42.589815649952655 -0.773784478366408 -0.170601219297428 4.217621990463419
metadata: p: 25, v: -0.32212604431145441808 -7.24983019171141052084 -9.99016639791811300597 83.60989900627373572206 -3.96856150450361511162 -0.89646890334008133561 -0.14631965843522620752 -3.41733213852705741687 -2.50228714667047125175 -12.67721075492640459004 -1.45595380931267825986 -0.55616183961568199656 -0.11059700452165788231 1.41831072203777797114


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 34.791471228744534 -0.409786454609398 0.197787903906911 10.587824794891745 -10.612079867124347 120.000000000000000 39.700270546547692 -2.369933143104002 0.456520722526815 11.994485610071056
metadata: p: 26, v: -0.00658038053454158313 -0.42932120142078844527 2.49709041383319219065 -10.91291757291236130811 -10.26014421603925796944 -0.96618751761879506645 -0.09580232216327819184 7.53826578563810478784 -1.87679150831014185385 0.00000000000000000000 -5.57224893664772302770 -3.01263491905762048617 0.65262994540488483697 1.64374790339396126271


Both should use the astronomy parameters:
number_parameters: 4
background_weight: 0.000000
background_parameters[4]: 1.00000000000000000000, 0.56999999999999995115, 12.00000000000000000000, 1.00000000000000000000
background_step[4]: 0.02000000000000000042, 0.00000400000000000000, 0.00008000000000000001, 0.02000000000000000042
background_min[4]: 0.00000000000000000000, 0.29999999999999998890, 1.00000000000000000000, 0.10000000000000000555
background_max[4]: 3.00000000000000000000, 1.00000000000000000000, 30.00000000000000000000, 3.00000000000000000000
optimize_parameter[4]: 0, 1, 1, 0
number_streams: 2, 5
stream_weight: -3.300000
stream_weight_step: 0.000001
stream_weight_min: -20.000000
stream_weight_max: 20.000000
optimize_weight: 1
stream_parameters[5]: 148.00000000000000000000, 22.00000000000000000000, 0.41999999999999998446, -0.46999999999999997335, 0.80000000000000004441
stream_step[5]: 0.00003000000000000000, 0.00004000000000000000, 0.00006000000000000000, 0.00004000000000000000, 0.00000400000000000000
stream_min[5]: 120.00000000000000000000, 2.29999999999999982236, -6.28318530717958623200, -6.28318530717958623200, 0.10000000000000000555
stream_max[5]: 244.00000000000000000000, 45.60000000000000142109, 6.28318530717958623200, 6.28318530717958623200, 20.00000000000000000000
optimize_parameter[5]: 1, 1, 1, 1, 1
stream_weight: -1.400000
stream_weight_step: 0.000001
stream_weight_min: -20.000000
stream_weight_max: 20.000000
optimize_weight: 1
stream_parameters[5]: 177.00000000000000000000, 23.00000000000000000000, 1.19999999999999995559, -1.60000000000000008882, 8.50000000000000000000
stream_step[5]: 0.00003000000000000000, 0.00004000000000000000, 0.00006000000000000000, 0.00004000000000000000, 0.00000400000000000000
stream_min[5]: 120.00000000000000000000, 2.39999999999999991118, -6.28318530717958623200, -6.28318530717958623200, 0.10000000000000000555
stream_max[5]: 244.00000000000000000000, 45.60000000000000142109, 6.28318530717958623200, 6.28318530717958623200, 20.00000000000000000000
optimize_parameter[5]: 1, 1, 1, 1, 1
convolve: 60
sgr_coordinates: 0
wedge: 23
r[min,max,steps]: 16.000000, 22.500000, 700
mu[min,max,steps]: 120.000000, 244.000000, 1600
nu[min,max,steps]: -1.250000, 1.250000, 160
number_cuts: 1
r_cut[min,max,steps][3]: 16.000000, 22.500000, 350
mu_cut[min,max,steps][3]: 207.000000, 209.000000, 400
nu_cut[min,max,steps][3]: -1.250000, -0.800000, 40


As already mentioned the stream_integral[0] comes out very low for the slow WU (which gives a quite bad fitness value). Most likely the application experiences an underflow condition and the handling of the denormalized numbers slows things down.
ID: 11603 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11606 - Posted: 19 Feb 2009, 10:51:01 UTC - in response to Message 11603.  
Last modified: 19 Feb 2009, 10:52:23 UTC


slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 24.879482473428617 0.671980681029048 1.195094522806092 0.347010963442642 -2.503350179338553 127.081445210280563 42.589815649952655 -0.773784478366408 -0.170601219297428 4.217621990463419
metadata: p: 25, v: -0.32212604431145441808 -7.24983019171141052084 -9.99016639791811300597 83.60989900627373572206 -3.96856150450361511162 -0.89646890334008133561 -0.14631965843522620752 -3.41733213852705741687 -2.50228714667047125175 -12.67721075492640459004 -1.45595380931267825986 -0.55616183961568199656 -0.11059700452165788231 1.41831072203777797114


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 34.791471228744534 -0.409786454609398 0.197787903906911 10.587824794891745 -10.612079867124347 120.000000000000000 39.700270546547692 -2.369933143104002 0.456520722526815 11.994485610071056
metadata: p: 26, v: -0.00658038053454158313 -0.42932120142078844527 2.49709041383319219065 -10.91291757291236130811 -10.26014421603925796944 -0.96618751761879506645 -0.09580232216327819184 7.53826578563810478784 -1.87679150831014185385 0.00000000000000000000 -5.57224893664772302770 -3.01263491905762048617 0.65262994540488483697 1.64374790339396126271


The v in the metadata shouldn't effect what the application is doing. The metadata is basically just cut and pasted from the parameters to the output. The v is just that particular parameter sets velocity for particle swarm optimization.
ID: 11606 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11607 - Posted: 19 Feb 2009, 10:55:14 UTC - in response to Message 11606.  
Last modified: 19 Feb 2009, 10:56:12 UTC

It might be because one of the stream parameters is at the bound in the main integral (ie, the expensive integral):


slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 24.879482473428617 0.671980681029048 1.195094522806092 0.347010963442642 -2.503350179338553 127.081445210280563 42.589815649952655 -0.773784478366408 -0.170601219297428 4.217621990463419
metadata: p: 25, v: -0.32212604431145441808 -7.24983019171141052084 -9.99016639791811300597 83.60989900627373572206 -3.96856150450361511162 -0.89646890334008133561 -0.14631965843522620752 -3.41733213852705741687 -2.50228714667047125175 -12.67721075492640459004 -1.45595380931267825986 -0.55616183961568199656 -0.11059700452165788231 1.41831072203777797114


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 34.791471228744534 -0.409786454609398 0.197787903906911 10.587824794891745 -10.612079867124347 120.000000000000000 39.700270546547692 -2.369933143104002 0.456520722526815 11.994485610071056
metadata: p: 26, v: -0.00658038053454158313 -0.42932120142078844527 2.49709041383319219065 -10.91291757291236130811 -10.26014421603925796944 -0.96618751761879506645 -0.09580232216327819184 7.53826578563810478784 -1.87679150831014185385 0.00000000000000000000 -5.57224893664772302770 -3.01263491905762048617 0.65262994540488483697 1.64374790339396126271


ID: 11607 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11608 - Posted: 19 Feb 2009, 11:09:12 UTC - in response to Message 11607.  

It might be because one of the stream parameters is at the bound in the main integral (ie, the expensive integral):


slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 [snip]


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 [snip]



So maybe the bound can be lowered a bit to prevent these cases (it appears to be a bad fit either way from the fitness values of the affected WUs).
ID: 11608 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile arkayn
Avatar

Send message
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 11618 - Posted: 19 Feb 2009, 12:36:13 UTC - in response to Message 11591.  

It was the OSX host, too bad it happened just before I went to bed.
ID: 11618 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 11619 - Posted: 19 Feb 2009, 12:44:18 UTC - in response to Message 11608.  

It might be because one of the stream parameters is at the bound in the main integral (ie, the expensive integral):


slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 [snip]


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 [snip]



So maybe the bound can be lowered a bit to prevent these cases (it appears to be a bad fit either way from the fitness values of the affected WUs).


I'll get a change in the boundary checking code today which will keep things slightly inside the bounds, which hopefully will fix the problem.

ID: 11619 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Cluster Physik

Send message
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 11650 - Posted: 19 Feb 2009, 17:23:15 UTC - in response to Message 11619.  

It might be because one of the stream parameters is at the bound in the main integral (ie, the expensive integral):


slow WU (ps_s23_9_132749_1235002485):
ps_s23_9
parameters [14]: 0.384982866758338 11.716938315232948 -9.530803058571149 244.000000000000000 [snip]


normal WU (ps_s23_9_132750_1235002485):
ps_s23_9
parameters [14]: 0.500328464135606 14.146488926207834 -7.482278426832348 125.656033665464520 [snip]



So maybe the bound can be lowered a bit to prevent these cases (it appears to be a bad fit either way from the fitness values of the affected WUs).


I'll get a change in the boundary checking code today which will keep things slightly inside the bounds, which hopefully will fix the problem.

Just tested it, one needs quite some distance to the bounds, just setting this parameter to 243.0 is not a solution. Furthermore, the slowdown doesn't disappear suddenly, it just gets less and less pronounced.
ID: 11650 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Previous · 1 · 2

Message boards : Number crunching : Unusual..........time to completion

©2024 Astroinformatics Group