Separation updated to 0.82
Author | Message |
---|---|
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
All I get is "Maximum time limit exceeded" with v.82, 64bit ATI on an HD 4770. That's after around 3 minutes. Turns out that installing .82 on the first 2 machines coincided with the rash of bad test WUs that were all failing with "Maximum time limit exceeded". |
Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0 |
Next: it's a bit difficult to say due to insta-purge, but it seems the ps_test are OK now, after manually correcting the "result duration correction factor". Which makes me think.. if you're including this factor into your completion time estimation you're bound to get all sorts of seemingly random problems, since this value sometimes gets totally screwed (e.g. application changes etc.). Might this explain the recent problems with "exceeded elapsed time limit"? (I don't want to spam this thread, but I think these observations are worth publishing) MrS Scanning for our furry friends since Jan 2002 |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
Next: it's a bit difficult to say due to insta-purge, but it seems the ps_test are OK now, after manually correcting the "result duration correction factor". I hope you've found the problem. If it wasn't for insta-purge all these bugs would be found and corrected more easily with fewer headaches and ill will. |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
I'm getting these error lines in the output of all my .82 tasks: <stderr_txt> Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' Error reading astronomy parameters from file 'astronomy_parameters.txt' Trying old parameters file Using SSE3 path Found 2 CAL devices Chose device 0 They run and validate fine, but is something wrong with the setup? |
Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Perhaps it's OK, the same as with 0.80? http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2454&nowrap=true#49242 |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
Which makes me think.. if you're including this factor into your completion time estimation you're bound to get all sorts of seemingly random problems, since this value sometimes gets totally screwed (e.g. application changes etc.). Might this explain the recent problems with "exceeded elapsed time limit"? The maximum time exceeded thing has nothing to do with the client code or the time estimates used for the GPU. Workunits have to be assigned some flops values and then BOINC uses those to estimate how long they take to prevent broken things from never finishing. It used to be a frequent problem with N-body workunits since they're hard to estimate, but after the server update it started happening to some separation workunits. |
Send message Joined: 8 May 10 Posts: 576 Credit: 15,979,383 RAC: 0 |
I'm getting these error lines in the output of all my .82 tasks: That's fine. It tries to use a new parameters file before the actual parameters file since we haven't actually switched to using it yet. |
Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0 |
The maximum time exceeded thing has nothing to do with the client code or the time estimates used for the GPU. Workunits have to be assigned some flops values and then BOINC uses those to estimate how long they take to prevent broken things from never finishing. It used to be a frequent problem with N-body workunits since they're hard to estimate, but after the server update it started happening to some separation workunits. Thanks Matt. So it seems BOINC factors in the result duration correction factor when it calculates whether a task is overdue or not. Which is correct, as long as the factor is correct. However: if the correction factor is much too large (in my case it started at ~100 when going from 0.62 to 0.82), BOINC assumes the tasks should finish in 1/(correction factor), which may lead to WU aborts. MrS Scanning for our furry friends since Jan 2002 |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
On my 5850 I get quite different figures for a unit that finished in 151 secs for a credit of 267. Can you give your device info from the log? Mine are Device target: CAL_TARGET_CYPRESS Maybe this will give us an idea why estimated to average iteration time varies even between same gpu types. |
Send message Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
I've seen an increase in completion times on my Q9450, Win7/64bit, 5970 (also has a 4850 in it, but these times are not reported here). Previously (0.62) I was seeing the following (app_info, target frequency 90, 2 WUs at a time): 159 credits -> 132 to 137 seconds; 213 credits -> 184 to 192 seconds; 267 credits -> 232 to 239 seconds. Now (0.82) I'm seeing (app_info, target frequency 90, 2 WUs at a time): 159 credits -> 141 to 145 seconds; 213 credits -> 194 to 197 seconds; 267 credits -> 241 to 246 seconds. Tonight I'll reduce the target frequency to see if there is a difference, but this is a fairly large increase in calculation time. |
Send message Joined: 30 Sep 09 Posts: 211 Credit: 36,977,315 RAC: 0 |
Next: it's a bit difficult to say due to insta-purge, but it seems the ps_test are OK now, after manually correcting the "result duration correction factor". I remember a few messages on boinc_dev saying that at least some of the 6.12.* versions of BOINC never initialize one of the values that many BOINC projects use for calculating time limits. Could this mean you've identified which one? |
Send message Joined: 1 Sep 08 Posts: 204 Credit: 219,354,537 RAC: 0 |
@Gas Giant: interesting.. I tried to check the time for my HD6950, but couldn't find any WUs still within the database for which credits were given. Nevertheless, I was seeing 94 s for some WU type previously (running 1 at a time), now I've got a few at 96 - 99 s. Previously I used target frequency 60, now I'm running without app_info and it's smoother than before. This improved responsiveness might directly lead to the slight drop in performance. Could also be checkpointing, though. MrS Scanning for our furry friends since Jan 2002 |
Send message Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
On my 3850 0.62 wu's completed in 530-560 seconds for 213 cs. Now completing 0.82 wu's in 590 seconds. No app_info. |
Send message Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
On my P4 XP de_separation_13_3s ran 28140 seconds/7.75 hours. So a vast improvement over the previous application, but still a tad slow in comparison to the old opti apps. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Send message Joined: 15 Jul 08 Posts: 383 Credit: 729,293,740 RAC: 0 |
I've seen an increase in completion times on my Q9450, Win7/64bit, 5970 (also has a 4850 in it, but these times are not reported here) I downgraded one machine (with 2 x HD5850 cards) back to v.62 from v.82 to test this. The older version completed WUs 1-2 seconds faster on average. That's running 2x WU/GPU with the same commandline parameters. |
Send message Joined: 2 Apr 11 Posts: 14 Credit: 4,527,461 RAC: 0 |
PowerPC, Mac OS 10.4.11, BOINC 6.10.58. Same for 0.82 as for 0.80. Completed in about 14.5 hours, inconclusive validation, Stderr output shows several iterations of the "Error loading Lua script…" message. |
Send message Joined: 20 Sep 08 Posts: 1391 Credit: 203,563,566 RAC: 0 |
Can you give your device info from the log? Device target: CAL_TARGET_CYPRESS Revision: 2 CAL Version: 1.4.1016 Engine clock: 775 Mhz Memory clock: 1125 Mhz GPU RAM: 1024 Wavefront size: 64 Double precision: CAL_TRUE Compute shader: CAL_TRUE Number SIMD: 18 Number shader engines: 2 Pitch alignment: 256 Surface alignment: 4096 Max size 2D: { 16384, 16384 } Don't drink water, that's the stuff that rusts pipes |
Send message Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
On my P4 XP de_separation_13_3s ran 28140 seconds/7.75 hours. So a vast improvement over the previous application, but still a tad slow in comparison to the old opti apps. Had 2 of the same tasks complete in 34800 seconds. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
Can you give your device info from the log? Interesting: Same gpu, same clocks, same mem size but 10% slower on the calculations. You are using cat 11.2 right? Mine runs with cat 11.3. Wonder if it's the different cat version or something else slowing your gpu down. |
Send message Joined: 2 Apr 11 Posts: 14 Credit: 4,527,461 RAC: 0 |
I wrote: PowerPC, Mac OS 10.4.11, BOINC 6.10.58. Same for 0.82 as for 0.80. Completed in about 14.5 hours, inconclusive validation, Stderr output shows several iterations of the "Error loading Lua script…" message. A 0.82 task reported this morning has validated OK, although with the same error messages. NG |
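
Editor's note on the "Maximum time limit exceeded" discussion: the posts above describe a run-time limit derived from the flops values assigned to a workunit, and suggest that a speed or correction factor that is off by ~100 shrinks the allowed time by the same factor. Here is a rough Python sketch of that arithmetic. It is not BOINC's code; the function names, the simple `bound / speed` model, and all numbers are assumptions picked only for illustration.

```python
# Simplified model (an assumption, not BOINC source code):
# allowed run time = operation bound for the task / assumed device speed.

def runtime_limit(fpops_bound, assumed_flops):
    """Seconds a task may run before it is aborted, under this model."""
    return fpops_bound / assumed_flops

def is_aborted(elapsed_seconds, fpops_bound, assumed_flops):
    """True if the task would hit 'maximum time limit exceeded'."""
    return elapsed_seconds > runtime_limit(fpops_bound, assumed_flops)

# Illustrative values only, not taken from any real workunit.
FPOPS_BOUND = 1.8e15        # hypothetical operation bound
REAL_SPEED = 1.0e12         # hypothetical true device speed (flops)
ELAPSED = 180.0             # task has run for about 3 minutes

# With a sane speed estimate the limit is generous.
sane_limit = runtime_limit(FPOPS_BOUND, REAL_SPEED)
# With a speed estimate 100x too high the limit shrinks 100x.
bad_limit = runtime_limit(FPOPS_BOUND, REAL_SPEED * 100)

print(f"limit with sane estimate: {sane_limit:.0f} s")
print(f"limit with 100x estimate: {bad_limit:.0f} s")
print("aborted (sane):", is_aborted(ELAPSED, FPOPS_BOUND, REAL_SPEED))
print("aborted (100x):", is_aborted(ELAPSED, FPOPS_BOUND, REAL_SPEED * 100))
```

Under these made-up numbers a healthy task that needs three minutes is fine with the sane estimate and gets aborted with the inflated one, which is the failure pattern reported in the thread. Whether the real client uses the correction factor in this way is exactly what the posters were debating.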
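
Editor's note on the "Error loading Lua script" lines: per the explanation above, the application first tries to read the parameters file in the new format, logs an error when that fails, and then falls back to the old format, so the task still runs. The pattern can be sketched in Python as below. The real client evaluates the file as a Lua script; the two parsers here are hypothetical stand-ins, and only the `number_parameters: 4` line and the "Trying old parameters file" message come from the posted output.

```python
def parse_new_format(text):
    """Hypothetical stand-in for the new parser: accepts 'name = value' lines only."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep or not name.strip().isidentifier():
            raise ValueError(f"'<name>' expected near {line!r}")
        params[name.strip()] = value.strip()
    return params

def parse_old_format(text):
    """Hypothetical stand-in for the legacy parser: 'key: value' lines."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip()] = value.strip()
    return params

def load_parameters(text, log):
    """Try the new format first; on failure, log it and fall back to the old one."""
    try:
        return parse_new_format(text)
    except ValueError as err:
        log.append(f"Error loading new-format parameters: {err}")
        log.append("Trying old parameters file")
        return parse_old_format(text)

log = []
params = load_parameters("number_parameters: 4\n", log)
print(params)
print("\n".join(log))
```

The error lines end up in the log even though the fallback succeeds, which is why tasks can show the messages and still validate.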
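
Editor's note on the 0.62 vs 0.82 timings: to put the reported 5970 run times in proportion, this small Python sketch takes the midpoint of each posted range and computes the relative slowdown. The time ranges are the ones from the post above; using the midpoint as a summary, and labelling the third class as 267 credits in both versions, are assumptions.

```python
# (old range, new range) in seconds, keyed by credits per workunit.
timings = {
    159: ((132, 137), (141, 145)),
    213: ((184, 192), (194, 197)),
    267: ((232, 239), (241, 246)),
}

def midpoint(time_range):
    """Midpoint of a (low, high) range in seconds."""
    low, high = time_range
    return (low + high) / 2

def slowdown_percent(old_range, new_range):
    """Relative increase of the new midpoint over the old one, in percent."""
    return (midpoint(new_range) / midpoint(old_range) - 1) * 100

for credits, (old, new) in timings.items():
    print(f"{credits} credits: {slowdown_percent(old, new):.1f}% slower")
```

By this measure the slowdown is roughly 3 to 6 percent, largest for the shortest workunits.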