Welcome to MilkyWay@home

Posts by S@NL - EStorm

1) Message boards : News : maximum time limit exceeded bug (Message 50376)
Posted 26 Jul 2011 by S@NL - EStorm
Post:
I also had the problem, unless the app_info was used.
So I tested it yesterday on boinc 6.12.26, and you know what happened?
When the WU started, the estimated app speed was larger than the estimated WU size. But after that WU finished OK and the next one started, the DCF, which had been just below 1, changed to 99 or 100, which gave the next one estimates that were OK.
When I was on boinc 6.12.33 this did not happen.
2) Message boards : Number crunching : HD5870 dead - help with replacement! (Message 50375)
Posted 26 Jul 2011 by S@NL - EStorm
Post:
My new 6950 times are between 1:00 and 2:00 minutes. I think I even had a few below 1 minute, but most are 1:04.
Memory runs at 900 MHz (downclocked for the temp), GPU at 850 MHz.
It is the MSI R6950 Twin Frozr III Power Edition/OC.
3) Message boards : News : maximum time limit elapsed bug (Message 50331)
Posted 21 Jul 2011 by S@NL - EStorm
Post:
seti
<rsc_fpops_est>8707547718264.028300</rsc_fpops_est>
<rsc_fpops_bound>500000000000000000.000000</rsc_fpops_bound>

Where did you get that from? All my SETI WUs have rsc_fpops_bound = 10x rsc_fpops_est, and that's what it's supposed to be. Milkyway has 100x rsc_fpops_est, which is VERY high compared to other projects: a CPU task that should complete within, for example, 10 hours would run 1000 hours if it gets stuck.


As mentioned, from the client_state file on one of my machines. But it doesn't really matter, and I don't know why the values differ so much. It could be because I replaced my 2x5770 with a 6950.
When I started a week ago I checked the properties of a running WU, and there the app speed estimate was greater than the WU size estimate, hence the problem. That was fixed by creating an app_info file.
4) Message boards : News : maximum time limit elapsed bug (Message 50324)
Posted 21 Jul 2011 by S@NL - EStorm
Post:
I know, Link, which is why I did this test: to explain what happens if you do not have the app_info and did not follow your solution.

I also checked the client_state difference between seti and milky:
milky
<rsc_fpops_est> 25262095395789.531000</rsc_fpops_est>
<rsc_fpops_bound>2526209539578953.000000</rsc_fpops_bound>

seti
<rsc_fpops_est> 8707547718264.028300</rsc_fpops_est>
<rsc_fpops_bound>500000000000000000.000000</rsc_fpops_bound>

And I think that in milky the values translate into an app speed estimate that is larger than the WU size.
So they have to increase the WU size estimate (not too happy with that) or decrease the app speed estimate (which would be the same as with app_info).
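
A minimal sketch of the comparison, using only the four values quoted above. The dictionary layout and the function name are made up for illustration; nothing here reads a real client_state file.

```python
# Values copied from the two client_state excerpts quoted in this post.
WORKUNITS = {
    "milky": {
        "rsc_fpops_est": 25262095395789.531,
        "rsc_fpops_bound": 2526209539578953.0,
    },
    "seti": {
        "rsc_fpops_est": 8707547718264.0283,
        "rsc_fpops_bound": 500000000000000000.0,
    },
}


def bound_to_est_ratio(wu):
    """Return how many times larger the FLOP bound is than the FLOP estimate."""
    return wu["rsc_fpops_bound"] / wu["rsc_fpops_est"]


for name, wu in WORKUNITS.items():
    print(f"{name}: bound is about {bound_to_est_ratio(wu):.1f}x the estimate")
```

For these numbers the milky bound comes out at about 100x the estimate, while the seti bound on this machine is tens of thousands of times its estimate.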
5) Message boards : News : maximum time limit elapsed bug (Message 50321)
Posted 21 Jul 2011 by S@NL - EStorm
Post:
I just did a small test and I got:
<core_client_version>6.12.33</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
]]>

I changed the <flops>1.0e11</flops> to <flops>6002.0e11</flops> in the app_info file.
So if the program runs without app_info and the estimated application speed is greater than the estimated task size, you will receive this error.
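
A rough sketch of why a too-large flops value could trigger the error. It assumes the client computes its elapsed-time limit as rsc_fpops_bound divided by the app's flops value; that formula is an assumption and is not confirmed anywhere in this thread. The bound is borrowed from the Milkyway client_state excerpt in Message 50324 (a different WU), and the 120-second runtime is a hypothetical figure.

```python
def max_elapsed_seconds(rsc_fpops_bound, flops):
    """Assumed limit: the FLOP bound divided by the app's claimed speed."""
    return rsc_fpops_bound / flops


def exceeds_limit(runtime_seconds, rsc_fpops_bound, flops):
    """True if a task running this long would pass the assumed limit."""
    return runtime_seconds > max_elapsed_seconds(rsc_fpops_bound, flops)


BOUND = 2526209539578953.0  # Milkyway rsc_fpops_bound from Message 50324
RUNTIME = 120.0             # hypothetical real runtime in seconds

# The two <flops> values tried in the app_info file.
for flops in (1.0e11, 6002.0e11):
    limit = max_elapsed_seconds(BOUND, flops)
    hit = exceeds_limit(RUNTIME, BOUND, flops)
    print(f"flops={flops:.4g}: limit {limit:.1f} s, error expected: {hit}")
```

Under that assumption the 1.0e11 setting leaves hours of headroom, while 6002.0e11 shrinks the limit to a few seconds, which any real run would exceed.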
6) Message boards : News : maximum time limit elapsed bug (Message 50319)
Posted 21 Jul 2011 by S@NL - EStorm
Post:
I agree, but not if people are talking about BSOD\Incorrect function. (0x1) - exit code 1 (0x1)\etc. instead of the maximum time limit error.

On that note:
I wonder what will happen if, instead of <flops>1.0e11</flops> (100 GFLOPS) in my app_info file, I put in a number greater than the estimated task size received from the server.
So if the server estimates 50000 GFLOPs for the WU and I enter a value greater than that, the program should finish in less than a second. That is not true of course; it takes longer.
Perhaps a test for later.
7) Message boards : News : maximum time limit elapsed bug (Message 50315)
Posted 21 Jul 2011 by S@NL - EStorm
Post:
Can I ask a stupid question?
Why are people discussing other problems?
This is a thread for "maximum time limit elapsed bug".
8) Message boards : News : maximum time limit elapsed bug (Message 50173)
Posted 18 Jul 2011 by S@NL - EStorm
Post:
When the program is running, check the properties of the WU in boinc manager.
That is without app_info of course; otherwise it would show the value from app_info, if one is entered in the file. But perhaps it is only reporting this high number on my machine.
9) Message boards : News : maximum time limit elapsed bug (Message 50170)
Posted 18 Jul 2011 by S@NL - EStorm
Post:
Ok, I can understand that.
But where does the Estimated app speed of 36043.67 GFLOPs/sec come from? It is way too high.
Perhaps the programmer should change the program so that when this error occurs the task is allowed to continue and report the new speed value to the server, or the server-side value is reset when the error occurs.
10) Message boards : News : maximum time limit elapsed bug (Message 50164)
Posted 18 Jul 2011 by S@NL - EStorm
Post:
As far as I understand the program, and from what I see in the properties of the workunit without app_info: the server gives you a time in which you have to finish the WU; in my case it was a few days or so. But the WU takes about 2 to 5 minutes to finish and then gives the maximum time limit elapsed error. So it looks to me like the program estimates how long the WU should take, and with a too-high GFLOPS value it can never finish within that estimate, so the program gives the error even though it finished within the time given by the server.

Without app_info:
Estimated app speed 36043.67 GFLOPs/sec
Estimated task size 29604 GFLOPs

Time needed would be less than a second.
Perhaps that is why it is crashing.

With app_info.xml:
Estimated app speed 100.00 GFLOPs/sec
Estimated task size 14777 GFLOPs

Time needed 147 seconds.
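
The arithmetic behind those two figures, as a small sketch. It assumes the estimated runtime is simply task size divided by app speed, and it ignores the duration correction factor; the function name is made up for illustration.

```python
def estimated_seconds(task_size_gflops, app_speed_gflops_per_sec):
    """Estimated runtime, assuming it is just task size over app speed."""
    return task_size_gflops / app_speed_gflops_per_sec


# (task size in GFLOPs, app speed in GFLOPs/sec) from the WU properties above.
cases = {
    "without app_info": (29604.0, 36043.67),
    "with app_info": (14777.0, 100.0),
}

for label, (size, speed) in cases.items():
    print(f"{label}: {estimated_seconds(size, speed):.2f} s")
```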
11) Message boards : News : maximum time limit elapsed bug (Message 50162)
Posted 18 Jul 2011 by S@NL - EStorm
Post:
True, but if the application bombs out on the estimates, how can it calculate the correct value?
I am running with the app_info, so no problem for me.
I also checked collatz, which is running without app_info, and it reports a GFLOPS value below 200, not 36043.67, on the same GPU. I think it is better to start with a too-low GFLOPS value than a too-high one if the program uses this value to calculate how long the application should run. With too high a value the maximum time will surely be exceeded; with too low a value it will correct itself through the duration factor.
Since I don't know how the program calculates it, what I am saying could be total rubbish.
I also think the program estimates how long it should run based on these values, and if the task exceeds that it gives the error.
When I started a few days ago on milkyway all the workunits crashed, so how can the server learn the correct values if they all crash? Only after adding the app_info and changing the duration factor did the program run without error.
I have no problem changing the xml files to fool the server, but your average user does.
12) Message boards : News : maximum time limit elapsed bug (Message 50159)
Posted 18 Jul 2011 by S@NL - EStorm
Post:
I did some testing (running the same program milkyway_separation_0.82_windows_x86_64__ati14.exe):

When I remove my app_info.xml file and look at the properties of the workunit:
Estimated app speed 36043.67 GFLOPs/sec
Estimated task size 29604 GFLOPs
Resources 0.05 CPUs + 1.00 ATI GPUs (device 1)

And now it crashes.

With app_info.xml:
Estimated app speed 100.00 GFLOPs/sec
Estimated task size 14777 GFLOPs
Resources 0.02 CPUs + 1.00 ATI GPUs (device 1)


So it's no wonder the maximum time limit elapsed error occurs. Or am I wrong?

I have the following in the app_info.xml (<flops>1.0e11</flops>)
Perhaps a combination of the duration factor and the flops calculation is way off.

It also would be nice if the files stayed a little bit longer on the server to check the results.
13) Message boards : News : maximum time limit elapsed bug (Message 50147)
Posted 17 Jul 2011 by S@NL - EStorm
Post:
Running 6.12.33.

I had the same problem until I changed the <duration_correction_factor> in the client_state.xml file (I changed it first to 100, which was way too high, then changed it to 10 and let it run). Currently it is 0.959884, so it's OK now.
I found this in one of the forums. If it was mentioned here, then I am sorry.
Currently running win7 x64 catalyst 11.6b.

Found it in number crunching thread - Massive "exceeded elapsed time limit" errors


And I do hope you stopped boinc before the change and restarted after ?
14) Message boards : News : maximum time limit elapsed bug (Message 50140)
Posted 17 Jul 2011 by S@NL - EStorm
Post:
I had the same problem until I changed the <duration_correction_factor> in the client_state.xml file (I changed it first to 100, which was way too high, then changed it to 10 and let it run). Currently it is 0.959884, so it's OK now.
I found this in one of the forums. If it was mentioned here, then I am sorry.
Currently running win7 x64 catalyst 11.6b.

Found it in number crunching thread - Massive "exceeded elapsed time limit" errors
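
For anyone who wants to check the current value before editing anything by hand, here is a minimal read-only sketch. The sample XML is a hypothetical, heavily trimmed stand-in for client_state.xml: only the duration_correction_factor tag comes from this post, and the surrounding client_state and project elements are assumptions about the layout. It searches the whole document for the tag, so it does not depend on that nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for client_state.xml (layout assumed).
SAMPLE = """<client_state>
  <project>
    <duration_correction_factor>0.959884</duration_correction_factor>
  </project>
</client_state>"""


def read_dcf_values(xml_text):
    """Return every duration_correction_factor found in the XML text."""
    root = ET.fromstring(xml_text)
    return [float(el.text) for el in root.iter("duration_correction_factor")]


print(read_dcf_values(SAMPLE))
```

To try it on a real file, read the file's text and pass it to read_dcf_values. This only reads the value; if you do change the file by hand, stop boinc first and restart it afterwards.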



