Welcome to MilkyWay@home

maximum time limit elapsed bug

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 49940 - Posted: 6 Jul 2011, 11:27:24 UTC - in response to Message 49939.  

Thanks for that suggestion FruehwF. I'll have a play with the --gpu-target-frequency parameter in a day or so. Can't stop now because I'm on a Spinhenge binge and checkpointing only after 95% task completion is useless.
ID: 49940 · Rating: 0
cristipurdel
Joined: 1 Jul 11
Posts: 10
Credit: 422,543
RAC: 0
Message 49941 - Posted: 6 Jul 2011, 11:39:27 UTC

So on my 6950, with the new BOINC 6.12.33, I downloaded several GPU WUs.
The first one starts crunching normally (estimated time is 4 min at 500/1250 MHz), but at around 40% the WU gets aborted. Then a massive system freeze happens for about 30-60 seconds, during which, I presume, the second WU starts and the first WU finishes as a computational error.
@Arkayn maybe an installer like the one from Lunatics would be more suitable, with all the flags inside the installer along with some explanations, so that everybody could tinker with their own app based on community feedback.
ID: 49941 · Rating: 0
Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 49945 - Posted: 6 Jul 2011, 14:51:05 UTC - in response to Message 49930.  

maximum time limit elapsed bug

Perhaps you could try the optimised application then. The flops in that is specified as <flops>1.0e11</flops>

Back to the original topic. Is anyone currently seeing the maximum time limit elapsed bug when using an app_info.xml?
ID: 49945 · Rating: 0
Confusius
Joined: 31 Mar 10
Posts: 12
Credit: 13,722,511
RAC: 0
Message 49949 - Posted: 6 Jul 2011, 17:13:42 UTC - in response to Message 49945.  

Unfortunately arkayn's app seems to be the same as the stock app delivered by MW. Not even one byte differs.

But I gave it a try anyway, with the app_info.xml provided in the .zip archive.

Result: 4 WUs in a row without an error!
ID: 49949 · Rating: 0
FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49951 - Posted: 6 Jul 2011, 18:03:00 UTC - in response to Message 49949.  

...
Result: 4 WUs in a row without an error!



Baaaang!!!

If others confirm this, then forum admins, please make a pinned thread with the app_info.xml posted, where everybody can see this info.

I can provide the information, if this is desired.
ID: 49951 · Rating: 0
Confusius
Joined: 31 Mar 10
Posts: 12
Credit: 13,722,511
RAC: 0
Message 49954 - Posted: 6 Jul 2011, 18:32:56 UTC
Last modified: 6 Jul 2011, 18:33:46 UTC

Update: Running more than one hour, no errors (Computer WKS01)
ID: 49954 · Rating: 0
davebodger
Joined: 3 Jul 11
Posts: 8
Credit: 66,330,086
RAC: 0
Message 49958 - Posted: 6 Jul 2011, 22:54:14 UTC - in response to Message 49939.  


Milkyway optimized Apps on Arkayn's download page

Maybe somebody with the TE bug could try it out.


I tried this and followed the instructions but got lots of errors when restarting BOINC - should I have done anything else?
I saw little in the xml file - i.e. lots of labels with most empty of content - is that correct?
I'm afraid I have no experience in hacking this stuff so am not sure what I am looking at.

06/07/2011 23:19:32 | Milkyway@home | Found app_info.xml; using anonymous platform
06/07/2011 23:19:32 | Milkyway@home | [error] State file error: missing application milkyway_nbody
06/07/2011 23:19:32 | Milkyway@home | [error] Can't handle workunit in state file
06/07/2011 23:19:32 | Milkyway@home | [error] State file error: missing application milkyway_nbody
06/07/2011 23:19:32 | Milkyway@home | [error] Can't handle workunit in state file
06/07/2011 23:19:32 | Milkyway@home | [error] State file error: missing task ps_nbody_test3_243318
06/07/2011 23:19:32 | Milkyway@home | [error] Can't link task ps_nbody_test3_243318_0 in state file
06/07/2011 23:19:32 | Milkyway@home | [error] State file error: missing task ps_nbody_test3_243317
06/07/2011 23:19:32 | Milkyway@home | [error] Can't link task ps_nbody_test3_243317_0 in state file
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
06/07/2011 23:19:32 | Milkyway@home | [error] No application found for task: windows_x86_64 82 ati14; discarding
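
A log like this is easier to read once the repeated lines are collapsed. The following is a minimal Python sketch, assuming every line has the `timestamp | project | message` layout seen above; `summarize_errors` is a hypothetical helper name, not part of BOINC.

```python
from collections import Counter


def summarize_errors(log_text):
    """Count how often each distinct [error] message occurs in event-log text.

    Assumes the 'timestamp | project | message' layout shown in this post.
    """
    counts = Counter()
    for line in log_text.splitlines():
        # Split into at most three fields so a '|' inside the message survives.
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not an event-log line
        message = parts[2]
        if message.startswith("[error]"):
            counts[message[len("[error]"):].strip()] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "06/07/2011 23:19:32 | Milkyway@home | [error] Can't handle workunit in state file\n"
        "06/07/2011 23:19:32 | Milkyway@home | [error] Can't handle workunit in state file\n"
        "06/07/2011 23:19:32 | Milkyway@home | Found app_info.xml; using anonymous platform\n"
    )
    for message, count in summarize_errors(sample).most_common():
        print(count, message)
```

Run over the full log, this would fold the twelve identical "No application found" lines into one entry with a count, so the distinct messages stand out.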


Now since I restarted I do not have any Milkyway tasks that want to use the GPU, so I suppose that is a kind of way to stop the Abort errors. :-)

Regards.

Dave.
ID: 49958 · Rating: 0
arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 49960 - Posted: 6 Jul 2011, 23:49:25 UTC - in response to Message 49941.  

I wish, but I can only do the Mac Installer.

I do not know what they used to make it either.
ID: 49960 · Rating: 0
Ziffen63
Joined: 26 Mar 09
Posts: 7
Credit: 702,781
RAC: 0
Message 49962 - Posted: 7 Jul 2011, 2:30:28 UTC

I let my old computer run overnight several times so it could finish your WU by its Jul 8th deadline. Well, it finished yesterday morning and uploaded the results, but it's as if they disappeared from your database, because no points were awarded and there is nothing in the Pending or Tasks lists. I'm about to throw in the towel on Milkyway and detach it. If it's not a time expired issue, then lately the WUs just disappear from your website when finished. Very discouraging!
ID: 49962 · Rating: 0
FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49963 - Posted: 7 Jul 2011, 2:34:29 UTC - in response to Message 49958.  
Last modified: 7 Jul 2011, 2:34:55 UTC


I tried this and followed the instructions but got lots of errors when restarting BOINC - should I have done anything else?
I saw little in the xml file - i.e. lots of labels with most empty of content - is that correct?
I'm afraid I have no experience in hacking this stuff so am not sure what I am looking at.
...



I think you chose the wrong download. For your Win XP machine you have to use this:

http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=29

and for your Win7 machine this:

http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=32

Make sure you don't have any WUs in progress, because they will error out: for BOINC the application is then no longer the same, so BOINC discards them.

The above is for GPU crunching!

If that works, you can then try to merge the app_info.xml files so that CPU WUs are also done, if you want that.

greetings franz
ID: 49963 · Rating: 0
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 49964 - Posted: 7 Jul 2011, 2:38:10 UTC - in response to Message 49962.  
Last modified: 7 Jul 2011, 2:39:11 UTC

I let my old computer run overnight several times so it could finish your WU by its Jul 8th deadline. Well, it finished yesterday morning and uploaded the results, but it's as if they disappeared from your database, because no points were awarded and there is nothing in the Pending or Tasks lists. I'm about to throw in the towel on Milkyway and detach it. If it's not a time expired issue, then lately the WUs just disappear from your website when finished. Very discouraging!

For a while now, any validated WU has been purged from the server within a few seconds. Which task was it? If it was an N-body task and you got it before the update, then those gave zero credit for some users. See this thread.
Also this one.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
ID: 49964 · Rating: 0
kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 49965 - Posted: 7 Jul 2011, 4:10:50 UTC
Last modified: 7 Jul 2011, 4:54:39 UTC

Brief note for those who wish to try an optimised application to overcome maximum time limit elapsed bug on ATI GPUs and are unfamiliar with app_info.xml files and the oddities of files compressed on a Mac. This relates to the optimised applications previously mentioned found on arkayn's Crunchers Anonymous forum.

For Windows 64-bit you should use the correct app_info.xml file from the "Win64_0.82_ati\Win64_0.82_ati\Files to install" folder and not the ._app_info.xml file from the "Win64_0.82_ati\__MACOSX\Win64_0.82_ati\Files to install" folder which contains some strange looking Mac zipping stuff:   Mac OS X  2 u  § ATTR z± § œ  œ com.apple.TextEncoding macintosh;0

For Windows 32-bit you should use the correct app_info.xml file from the "Win32_0.82_ati\Win32_0.82_ati\Files to install" folder and not the ._app_info.xml file from the "Win32_0.82_ati\__MACOSX\Win32_0.82_ati\Files to install" folder which contains similar Mac stuff as 64-bit version above.

For both 64-bit and 32-bit versions you do not need to copy the .DS_Store file.

The optimised application itself is the same as the default application automatically downloaded from the MilkyWay server. For those who are not having any problems there is little to no advantage in using an optimised application; it only allows some parameters to be added or altered by those who have trouble, such as a sluggish user interface, or who wish to tweak a bit.
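
The selection rule described above can be sketched in Python. This is only an illustration, assuming the archive has already been extracted and you have a list of relative paths; `is_mac_metadata` and `files_to_install` are hypothetical helper names.

```python
def is_mac_metadata(path):
    """Return True for entries that macOS adds when zipping a folder."""
    # Accept both Windows and POSIX separators.
    parts = path.replace("\\", "/").split("/")
    name = parts[-1]
    return (
        "__MACOSX" in parts        # resource-fork side folder
        or name.startswith("._")   # AppleDouble companion file, e.g. ._app_info.xml
        or name == ".DS_Store"     # Finder folder settings
    )


def files_to_install(paths):
    """Keep only the entries that are worth copying to the project folder."""
    return [p for p in paths if not is_mac_metadata(p)]


if __name__ == "__main__":
    listing = [
        r"Win64_0.82_ati\Win64_0.82_ati\Files to install\app_info.xml",
        r"Win64_0.82_ati\__MACOSX\Win64_0.82_ati\Files to install\._app_info.xml",
        r"Win64_0.82_ati\Win64_0.82_ati\Files to install\.DS_Store",
    ]
    for path in files_to_install(listing):
        print(path)
```

With the three example paths, only the real app_info.xml is kept.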
ID: 49965 · Rating: 0
cristipurdel
Joined: 1 Jul 11
Posts: 10
Credit: 422,543
RAC: 0
Message 49967 - Posted: 7 Jul 2011, 7:07:16 UTC - in response to Message 49965.  

Brief note for those who wish to try an optimised application to overcome maximum time limit elapsed bug on ATI GPUs and are unfamiliar with app_info.xml files and the oddities of files compressed on a Mac. This relates to the optimised applications previously mentioned found on arkayn's Crunchers Anonymous forum.

For Windows 64-bit you should use the correct app_info.xml file from the "Win64_0.82_ati\Win64_0.82_ati\Files to install" folder and not the ._app_info.xml file from the "Win64_0.82_ati\__MACOSX\Win64_0.82_ati\Files to install" folder which contains some strange looking Mac zipping stuff:   Mac OS X  2 u  § ATTR z± § œ  œ com.apple.TextEncoding macintosh;0

For Windows 32-bit you should use the correct app_info.xml file from the "Win32_0.82_ati\Win32_0.82_ati\Files to install" folder and not the ._app_info.xml file from the "Win32_0.82_ati\__MACOSX\Win32_0.82_ati\Files to install" folder which contains similar Mac stuff as 64-bit version above.

For both 64-bit and 32-bit versions you do not need to copy the .DS_Store file.

The optimised application itself is the same as the default application automatically downloaded from the MilkyWay server. For those who are not having any problems there is little to no advantage in using an optimised application; it only allows some parameters to be added or altered by those who have trouble, such as a sluggish user interface, or who wish to tweak a bit.


thx ... it worked

But, for the record, and just so I won't die stupid...

For my 6950, according to http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units#Northern_Islands_.28HD_6xxx.29_series my peak SP power at the stock 800 MHz should be
2253 GFLOPS, which is 1408*800*2 (in MFLOPS).

After I down-clocked the GPU from 800 to 500 MHz (I run 500/1250 MHz), BOINC says I should have 1760 GFLOPS:
ATI GPU 0: AMD Radeon HD 6900 series (Cayman) (CAL version 1.4.1417, 1024MB, 1760 GFLOPS peak)

when actually I have 1408 GFLOPS = 1408*2*500 (in MFLOPS).

I know this is for SP, and that for DP my peak power is 25% of that on Cayman.

So my DP should hover in the vicinity of 352-440 GFLOPS.

The question is: I had the TE bug, and the app_info that fixed it has 1e11 flops (= 100 GFLOPS), which is lower than my peak DP. So, in order to bypass the TE bug, should the flops value in app_info be lower than the peak DP of one's card?
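
One way to reason about this question, as a sketch only: as far as I understand the BOINC client, a task is aborted with "maximum time limit elapsed" once its elapsed time exceeds the workunit's `rsc_fpops_bound` divided by the flops estimate of the app version. If that assumption holds, a smaller `<flops>` value simply gives the task more time. The bound used below is a made-up number, not a real MilkyWay value.

```python
def max_elapsed_seconds(rsc_fpops_bound, flops):
    """Time limit under the assumed rule: limit = rsc_fpops_bound / flops."""
    return rsc_fpops_bound / flops


if __name__ == "__main__":
    hypothetical_bound = 1e15  # illustrative only, not taken from a real workunit

    # flops value from the app_info.xml discussed in this thread
    print(max_elapsed_seconds(hypothetical_bound, 1.0e11))   # 10000.0 seconds

    # an overestimated speed, such as the 1760 GFLOPS peak reported for Cayman
    print(max_elapsed_seconds(hypothetical_bound, 1.76e12))  # a much shorter limit
```

Under this reading, what matters is not the card's peak DP as such but that the flops estimate is not so high that the derived limit drops below the real run time.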
ID: 49967 · Rating: 0
Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 49969 - Posted: 7 Jul 2011, 8:38:38 UTC - in response to Message 49962.  

I let my old computer run overnight several times so it could finish your WU by its Jul 8th deadline. Well, it finished yesterday morning and uploaded the results, but it's as if they disappeared from your database, because no points were awarded and there is nothing in the Pending or Tasks lists. I'm about to throw in the towel on Milkyway and detach it. If it's not a time expired issue, then lately the WUs just disappear from your website when finished. Very discouraging!


Application details for host 273341 (your Sempron)

MilkyWay@Home 0.88 windows_intelx86
Number of tasks completed 2
Max tasks per day 10001
Number of tasks today 1
Consecutive valid tasks 0
Average processing rate 0.39453419025849
Average turnaround time 4.24 days

Total credit 481

So it looks like you did 2 WUs, and the credit fits that too.
The reason for the quick purge is the heavy server load and database size.
ID: 49969 · Rating: 0
FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49970 - Posted: 7 Jul 2011, 9:29:02 UTC - in response to Message 49967.  
Last modified: 7 Jul 2011, 9:30:02 UTC

..

After I down-clocked the GPU from 800 to 500, boinc says I should have 1760GFLOPS
ATI GPU 0: AMD Radeon HD 6900 series (Cayman) (CAL version 1.4.1417, 1024MB, 1760 GFLOPS peak)

..


I wouldn't put much weight on this information. I think it is not a measured value; it's only information from the app server, which identifies your card merely as "HD 6900 series (Cayman)".

May I ask you another question? Why have you downclocked your GPU core instead of the memory? For MW@H only the core speed is significant for performance.
You can downclock the memory as low as your system runs stable (50% of stock speed should work fine) to save energy and lower the temperature.

greetings
franz
ID: 49970 · Rating: 0
kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 49972 - Posted: 7 Jul 2011, 10:38:21 UTC - in response to Message 49967.  

.....After I down-clocked the GPU from 800 to 500, boinc says I should have 1760GFLOPS
ATI GPU 0: AMD Radeon HD 6900 series (Cayman) (CAL version 1.4.1417, 1024MB, 1760 GFLOPS peak)

When Actually I'm having 1408= 1408*2*500.....

Yes, BOINC miscalculates GFLOPS on Cayman. It calculates correctly for the previous VLIW5 architecture, but on the Cayman VLIW4 architecture it multiplies by 5 instead of 4, so the result is 25% too high.

HD 6950 example:

Actual: 1408 shader units organized in 22 SIMDs with 16 VLIW units (4-issue)

BOINC calculation: 1760 shader units organized in 22 SIMDs with 16 VLIW units (5-issue)

As I noted in a post on the Collatz forum about 6 months ago, the older MilkyWay application Stderr output used to misreport the number of shaders on Cayman too and the Collatz application still does.
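
The arithmetic in this post can be checked with a short Python sketch. It assumes 2 floating-point operations per shader per clock (one multiply-add), which is the convention behind the figures quoted in this thread, and takes the 25% DP ratio for Cayman from the earlier post; the function name is made up for illustration.

```python
def peak_sp_gflops(simds, vliw_units_per_simd, issue_width, core_mhz):
    """Theoretical single-precision peak in GFLOPS.

    Assumes 2 FLOPs per shader per clock (one multiply-add).
    """
    shaders = simds * vliw_units_per_simd * issue_width
    return shaders * 2 * core_mhz / 1000.0


if __name__ == "__main__":
    # HD 6950 at 500 MHz, counted as VLIW4 (the actual layout)
    actual = peak_sp_gflops(22, 16, 4, 500)
    # Same card counted as VLIW5, which matches the figure BOINC reports
    reported = peak_sp_gflops(22, 16, 5, 500)

    print(actual)            # 1408.0
    print(reported)          # 1760.0
    print(reported / actual) # 1.25, i.e. 25% too high
    print(0.25 * actual)     # 352.0 GFLOPS double precision at the 25% ratio
```

The same function gives about 2253 GFLOPS at the stock 800 MHz core clock.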

ID: 49972 · Rating: 0
cristipurdel
Joined: 1 Jul 11
Posts: 10
Credit: 422,543
RAC: 0
Message 49973 - Posted: 7 Jul 2011, 10:48:40 UTC - in response to Message 49970.  
Last modified: 7 Jul 2011, 10:50:21 UTC



I wouldn't put much weight on this information. I think it is not a measured value; it's only information from the app server, which identifies your card merely as "HD 6900 series (Cayman)".

May I ask you another question? Why have you downclocked your GPU core instead of the memory? For MW@H only the core speed is significant for performance.
You can downclock the memory as low as your system runs stable (50% of stock speed should work fine) to save energy and lower the temperature.

greetings
franz


I'm also running 4 WCG 64-bit applications.
I want to keep my PC as cool as it can be. Right now the GPU hovers around 72C, while the CPU is at 77C.

From my experience with GPU load vs GPU temp, it's much better to lower the clocks than to limit the GPU load (e.g. with TThrottle) for the same temperature.
500/1250 is the lowest I can go with AMD OverDrive.

And I also have to consider the speed of the fan which becomes annoying when it goes into a 'higher' gear :P
ID: 49973 · Rating: 0
FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49977 - Posted: 7 Jul 2011, 12:34:44 UTC

I see! If you are running other projects in parallel, you have to make compromises.

Your card is one of the best DP cards. MW is one of only a few projects that need DP, and I think it pays well for that.

So happy crunching.

franz
ID: 49977 · Rating: 0
davebodger
Joined: 3 Jul 11
Posts: 8
Credit: 66,330,086
RAC: 0
Message 49994 - Posted: 7 Jul 2011, 23:30:22 UTC - in response to Message 49963.  

I think you chose the wrong download. For your Win XP machine you have to use this:
http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=29
and for your Win7 machine this:
http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=32

Make sure you don't have any WUs in progress, because they will error out: for BOINC the application is then no longer the same, so BOINC discards them.
The above is for GPU crunching!
If that works, you can then try to merge the app_info.xml files so that CPU WUs are also done, if you want that.
greetings franz

Thanks franz, I have merged the app_info files together and now have both .82 and .88 running OK at the same time.
No more MTLE problems. :-)

Now if only I could get the files I need to run the ps_nbody stuff then I would be home and dry.
Presumably when someone (Travis?) fixes the base code then I can delete the app_info file and everything will revert back to normal ?

Regards.

Dave.

P.S. here's what I put in the app_info.xml :-
<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_separation_0.82_windows_x86_64__ati14.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>82</version_num>
    <flops>1.0e11</flops>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <plan_class>ati14ati</plan_class>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <cmdline></cmdline>
    <file_ref>
      <file_name>milkyway_separation_0.82_windows_x86_64__ati14.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <file_info>
    <name>milkyway_separation_0.88_windows_x86_64.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>88</version_num>
    <cmdline></cmdline>
    <file_ref>
      <file_name>milkyway_separation_0.88_windows_x86_64.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>

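A merged file like the one above can be sanity-checked before restarting BOINC. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes only the element names that appear in that file; `check_app_info` is a hypothetical helper, and the checks are illustrative rather than a statement of what the BOINC client itself validates.

```python
import xml.etree.ElementTree as ET


def check_app_info(xml_text):
    """Return a list of internal inconsistencies found in app_info.xml text."""
    root = ET.fromstring(xml_text)

    # Names declared by <app> and <file_info> blocks.
    declared_apps = {a.findtext("name", "").strip() for a in root.findall("app")}
    declared_files = {f.findtext("name", "").strip() for f in root.findall("file_info")}

    problems = []
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name", "").strip()
        if app_name not in declared_apps:
            problems.append(f"app_version refers to undeclared app {app_name!r}")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name", "").strip()
            if file_name not in declared_files:
                problems.append(f"file_ref refers to undeclared file {file_name!r}")
    return problems


if __name__ == "__main__":
    # Path is an assumption: run this from the project folder that holds the file.
    with open("app_info.xml", encoding="utf-8") as handle:
        for problem in check_app_info(handle.read()):
            print(problem)
```

An empty result means every app_version points at a declared app and a declared file; it says nothing about whether the executables are actually present on disk.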

ID: 49994 · Rating: 0
Andy
Joined: 27 Aug 10
Posts: 19
Credit: 153,747,675
RAC: 0
Message 49997 - Posted: 8 Jul 2011, 8:33:12 UTC - in response to Message 49994.  

I'm having the same issue which is becoming really annoying.

I've tried drivers 11.3, 11.5 and 11.6 on HD 6950 and 6990 cards.

I've tried it on 4 PCs with different specs and 4 different ATI cards; the only thing in common was Windows 7 64-bit. I get this problem on each PC.

I've tried 2 versions of BOINC, with the same problem.

I downloaded http://www.arkayn.us/forum/index.php?action=downloads;sa=view;down=32 and copied it to my BOINC data folder as instructed, with the same problem. I did this while the service was not running.

I've tried 2 different versions of boinc client.

I only have this problem with milkyway. Other projects are fine.

My working pcs are using 6.10 and 11.5 drivers 6950. All pcs with this problem are running the same now.

:(

ID: 49997 · Rating: 0

©2024 Astroinformatics Group