Welcome to MilkyWay@home

Posts by kashi

21) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53359)
Posted 21 Feb 2012 by Profile kashi
Post:
Yes, that log appears to show that the BOINC OpenCL detection is broken. It is detecting the HD 4290 as an OpenCL capable Cypress class GPU, which it is not, and it is failing to detect the OpenCL capability of the HD 5870. With incorrect OpenCL detection by BOINC as a starting point, any configuration settings and .xml files used to try to get an OpenCL application to work are ineffective.

What a horror. I can't think of any solution other than disabling the onboard video, as you have already thought of yourself.

As for the target parameters, I use "--gpu-target-frequency 100" with my 5870 to process 2 concurrent tasks. Any lower than 100 and the lag with 2 concurrent tasks is unbearable on my system.
22) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53355)
Posted 21 Feb 2012 by Profile kashi
Post:
Backup project problem. Aha, I hadn't thought of that as I do not use a backup project. Perhaps that was why Sunny129/Eric was having trouble getting any work for Collatz with the BOINC 7.0.xx versions he had tried. He may have had Collatz set with a resource share of 0 as a backup project.

Thanks for that news arkayn.
23) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53353)
Posted 21 Feb 2012 by Profile kashi
Post:
I should have said --device 0 before. Removing that and using BOINC 7 should work

You did say --device 0 before.

The reason is how BOINC handles device indexing. If you look, the first one is using BOINC 7 and the second one with the error is using 6.12.34. Re-upgrade to a BOINC 7 (I think 7.0.15 is the newest), or since you are using app_info already you could add <cmdline> --device 0</cmdline> to force it to use that GPU....

24) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53342)
Posted 20 Feb 2012 by Profile kashi
Post:
Excuse the double post, but I just noticed Matt advised you to try <cmdline> --device 0</cmdline> and you said you used <cmdline> --device 1</cmdline>.

This lines up with what I was saying and what Matt has already posted. The CAL applications and the OpenCL applications appear to handle the device numbering differently because the onboard graphics is not OpenCL capable, so only one device is detected by the OpenCL application. If you or the OpenCL application tries to force or use Device 1, that is a higher number than the number of devices available, hence the message "Requested device is out of range of number found devices".

+ if (clr->devNum >= nDev)
+ {
+ warn("Requested device is out of range of number found devices\n");

Whereas your successful CAL tasks have "Found 2 CAL devices. Chose device 1" in stderr.
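
To illustrate why Device 1 fails when only one OpenCL device is found, here is a rough Python sketch of the same kind of range check as the excerpt above. It is not the MilkyWay code: the function name and the device counts are made up for the example, and it assumes devices are numbered from 0.

```python
def choose_device(requested: int, found: int) -> int:
    """Return the requested index if it is valid, i.e. in 0 .. found - 1."""
    if requested >= found:
        # Same condition as the quoted check: devNum >= nDev
        raise ValueError("Requested device is out of range of number found devices")
    return requested


# Hypothetical counts for this system: CAL sees both cards,
# OpenCL sees only the HD 5870.
cal_devices = 2
opencl_devices = 1

print("CAL, device 1:", choose_device(1, cal_devices))
print("OpenCL, device 0:", choose_device(0, opencl_devices))

try:
    choose_device(1, opencl_devices)
except ValueError as err:
    print("OpenCL, device 1:", err)
```

With one device found the only valid index is 0, so a forced --device 1 can never pass the check.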

Getting the excluded GPU and the detected GPU correct may require different combinations of ignore, exclude and force arguments for OpenCL applications as compared to CAL applications. In other words what works for one may not work for the other as you have experienced. If there is only one OpenCL device detected and you use <ignore_ati_dev>0</ignore_ati_dev> perhaps that leaves no available OpenCL devices. The <exclude_gpu> cc_config settings available in BOINC 7.0.xx give greater flexibility in configuring all this separately for each GPU project or application.

So if there is only one OpenCL capable device it may be sufficient to exclude the HD 4290 for CAL applications only. So perhaps try removing <cmdline> --device 1</cmdline> and <ignore_ati_dev>0</ignore_ati_dev> and instead use:

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://boinc.thesonntags.com/collatz/</url>
      <device_num>0</device_num>
    </exclude_gpu>
  </options>
</cc_config>

Not sure if it will work but worth a try.
25) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53337)
Posted 20 Feb 2012 by Profile kashi
Post:
Hmm, that's strange because the Stderr you posted of the error task is saying Device 1 is being used or rather trying to be used. Perhaps BOINC and CAL applications are identifying Device 0 and Device 1 differently than how the MilkyWay OpenCL application is identifying them.

If you use multiple GPU exclusions and inclusions at the same time, perhaps it causes differences in different places in how the Devices get numbered.

Yes, work fetch on BOINC 7.0.xx versions caused me a lot of problems at first too. It seems to work alright with my current settings now, but I'm not doing any Collatz. Maybe my report results immediately setting is helping to cause new work to be requested.
26) Message boards : Number crunching : getting errors with new v1.02 separation application? (Message 53335)
Posted 20 Feb 2012 by Profile kashi
Post:
Since it's giving errors on Device 1, I assume that is the card to exclude. So you tried the following cc_config.xml with a BOINC 7.0.xx version and it still didn't work?

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>

You know about the different work buffer system of BOINC 7.0.xx versions? Connect about every x.xx days has now effectively become Minimum work buffer. In fact in the later 7.0.xx versions it has been renamed. If you leave it at 0 days which was previously recommended for an always on connection it will not download any new tasks until your cache is empty. With BOINC 7.0.15 I use a value of 1 day for Minimum work buffer and 0.1 days for Max additional work buffer. Due to unreliable work availability on another project I also use report_results_immediately in my cc_config.xml file.
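
If it helps to avoid typing mistakes, a cc_config.xml like the one above can also be generated. This is only a sketch: the function name is made up, and it assumes the <exclude_gpu> layout shown above plus <report_results_immediately> as a 0/1 flag inside <options>, so check the result against the BOINC client configuration documentation before using it. As far as I know the work buffer values are computing preferences rather than cc_config options, so they are left out.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url: str, device_num: int,
                    report_immediately: bool = True) -> str:
    """Build a cc_config document that excludes one GPU for one project."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")

    # One <exclude_gpu> block: project URL plus the device number to skip.
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    ET.SubElement(exclude, "device_num").text = str(device_num)

    if report_immediately:
        # Assumed to be a 0/1 flag inside <options>.
        ET.SubElement(options, "report_results_immediately").text = "1"

    return ET.tostring(root, encoding="unicode")


print(build_cc_config("http://milkyway.cs.rpi.edu/milkyway/", 1))
```

The output is a single unindented line, which is still valid XML; save it as cc_config.xml in the BOINC data directory and restart BOINC or re-read the config file.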
27) Message boards : Number crunching : GPU Requirements (Message 53226)
Posted 16 Feb 2012 by Profile kashi
Post:
Yes the 78xx models should have excellent performance/watt on single precision projects. MilkyWay too if double precision is one quarter of single precision like 79xx models.

I haven't been following developments, so I don't know if Pitcairn 78xx models will have double precision of one quarter or one sixteenth like the 77xx models. If it is only one sixteenth, then a possible future Tahiti LE model (7890?) would be the better choice for MilkyWay.
28) Message boards : Number crunching : GPU Requirements (Message 53215)
Posted 16 Feb 2012 by Profile kashi
Post:
Yes, they can crunch here, but the HD 7750 and HD 7770 models will do so relatively slowly. The double precision FP values of 51.2 and 80 GFLOPS for the HD 7750 and HD 7770 are lower than that of an HD 3850.

You are probably aware of this arkayn, I was just pointing it out in case anyone decided to rush out and buy a 7750 or 7770 for MilkyWay.
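
For anyone wondering where the 51.2 and 80 figures come from, here is a rough Python sketch of the arithmetic. The shader counts and core clocks (512 at 800 MHz for the HD 7750, 640 at 1000 MHz for the HD 7770) are reference specs and not from this thread, and the formula assumes 2 single precision FLOPs per shader per clock with double precision at one sixteenth of that rate.

```python
def dp_gflops(shaders: int, clock_mhz: float, dp_ratio: int) -> float:
    """Theoretical peak double precision GFLOPS.

    Assumes 2 single precision FLOPs per shader per clock, with double
    precision running at 1/dp_ratio of the single precision rate.
    """
    sp_gflops = shaders * 2 * clock_mhz / 1000.0
    return sp_gflops / dp_ratio


# Assumed reference specs: (shader count, core clock in MHz).
cards = {
    "HD 7750": (512, 800),
    "HD 7770": (640, 1000),
}

for name, (shaders, clock_mhz) in cards.items():
    print(f"{name}: {dp_gflops(shaders, clock_mhz, 16):.1f} GFLOPS double precision")
```

These are theoretical peaks only; actual MilkyWay throughput also depends on the application and driver.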
29) Message boards : Number crunching : N-Body and the Bunker (Message 53155)
Posted 14 Feb 2012 by Profile kashi
Post:
I'm not sure I understand your question. However, if you are asking about the cache, BOINC development 7.0.xx versions handle it differently from currently released BOINC versions.

With BOINC 7.0.xx versions "Connect about every x.xx days" is now minimum cache and will be renamed to "Minimum work buffer" in the most recent BOINC 7.0.xx versions.

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2764&nowrap=true#53062

http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,32530
30) Message boards : Number crunching : 79XX Dont Run (Message 52434)
Posted 12 Jan 2012 by Profile kashi
Post:

It's interesting that even at this point the HD 7970 is running the NVidia optimized app at 3.5x faster than the top GTX580 card that I could find in the database. Pretty impressive and the HD 7970 also uses less power.....

Also interesting is the low CPU usage. Some other OpenCL GPU applications use a lot more CPU.
31) Message boards : Number crunching : Results not uploading? (Message 51817)
Posted 3 Dec 2011 by Profile kashi
Post:
I am not certain of the details of how it works, but my understanding is that one of the reasons it is done this way is to reduce load on the MilkyWay server. Because GPU tasks are processed very quickly and only a small cache of tasks is allowed, computers with one or more fast GPUs contact the MilkyWay server very frequently. If a separate upload is not required, each computer needs to contact the server fewer times, which reduces server load.

Edit: I see banditwolf has already explained.
32) Message boards : Number crunching : Results not uploading? (Message 51813)
Posted 3 Dec 2011 by Profile kashi
Post:
Perhaps http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2346&nowrap=true#47592 may help explain.
33) Message boards : Number crunching : How to run nBody WU's only? (Message 51738)
Posted 20 Nov 2011 by Profile kashi
Post:
You appear to have been using an older version of the application on that computer. If you detach and reattach to MilkyWay, the current version and files should download.

You need to put the app_info.xml file in:

Windows 2000/XP: C:\Documents and Settings\All Users\Application Data\BOINC\projects\milkyway.cs.rpi.edu_milkyway

Windows Vista and Windows 7: C:\ProgramData\BOINC\projects\milkyway.cs.rpi.edu_milkyway


34) Message boards : Number crunching : feel free to call me a noob but..... (Message 51673)
Posted 12 Nov 2011 by Profile kashi
Post:
FirePro V5800 is Juniper XT (RV840) with CAD software support. So it is similar to a Radeon HD 5770 with lower core and memory clock so as to remain under the 75 watt PCI Express bus limit. Therefore it does not support double precision. Some AMD documentation states in error that it does support double precision.

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=1505&nowrap=true#43973
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2457&nowrap=true#51152
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2457&nowrap=true#51194
35) Message boards : Number crunching : Computation error on CPU ps_seperation tasks (Message 51268)
Posted 1 Oct 2011 by Profile kashi
Post:
I think you need to update your Catalyst driver. Older Catalyst versions have not worked since the MilkyWay ATI applications were updated about 6 months ago. I use Catalyst 11.3; some have had problems on MilkyWay with Catalyst versions newer than 11.3.
36) Message boards : Number crunching : Doing work, not getting any credit (Message 51074)
Posted 15 Sep 2011 by Profile kashi
Post:
w1hue was only running CPU applications on his 3 computers. He was not using an anonymous platform so the MilkyWay applications are the current default.
37) Message boards : Number crunching : Doing work, not getting any credit (Message 51035)
Posted 13 Sep 2011 by Profile kashi
Post:
OK, the 4th iteration of the task completed:

v0.82 (ati14), ATI Radeon HD5800 series, Computer ID 320514, Task 105471785
<background_likelihood> -3.002860510921325 </background_likelihood>
<stream_only_likelihood> -143.187387266921520 -10.166715550631581 </stream_only_likelihood>
<search_likelihood> -2.928372007258208 </search_likelihood>

The 3rd and 4th task iterations validated with Status "Completed and validated". The first 2 task iterations, which included your task 103501154, were invalid with Status "Completed, marked as invalid". The work unit was purged from the database within a few minutes after it was completed.

It means your computer 285024 is not producing valid results.

Excuse double post, I just thought you may be interested to know what happened if you didn't happen to catch it, seeing as the information is only available for a minute or two.
38) Message boards : Number crunching : Doing work, not getting any credit (Message 51033)
Posted 13 Sep 2011 by Profile kashi
Post:
Your computer 285024 downloaded task 103501154 and reported it about 2 days later. Another task was sent to a CPU wingman for validation. Validation was still inconclusive and a 3rd iteration was sent to a GPU. Validation was again inconclusive and a 4th iteration has been sent to a GPU.

The 2 HD 5800 wingmen both have over 100 consecutive valid tasks. Therefore if the 4th task iteration of this work unit is completed and reported it is likely that your task 103501154 will be marked invalid. This is probably what has been happening to the other tasks that you have completed on computer 285024. It is possible that task 105094683 completed on the Xeon CPU will also be marked invalid. Once the work unit is completed it is purged from the database very quickly.

Here are the 3 task iterations that have been reported so far for work unit 71312287:

v0.88, Pentium(R) 4 CPU 3.00GHz, Computer ID 285024, Task 103501154
<background_likelihood> -3.002860514894183 </background_likelihood>
<stream_only_likelihood> -143.187387267274180 -10.166715562831477 </stream_only_likelihood>
<search_likelihood> -2.928372010919170 </search_likelihood>

v0.88, Xeon(R) CPU E5506 @ 2.13GHz, Computer ID 216111, Task 105094683
<background_likelihood> -3.002860513689446 </background_likelihood>
<stream_only_likelihood> -143.187387266921520 -10.166715554441502 </stream_only_likelihood>
<search_likelihood> -2.928372009750395 </search_likelihood>

v0.82 (ati14), ATI Radeon HD5800 series, Computer ID 285598, Task 105458923
<background_likelihood> -3.002860510921325 </background_likelihood>
<stream_only_likelihood> -143.187387266921520 -10.166715550631581 </stream_only_likelihood>
<search_likelihood> -2.928372007258208 </search_likelihood>
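
To see how far apart these results are, here is a rough Python sketch that compares each one against the HD 5800 result. Using the GPU result as the reference is only an assumption for the comparison, and I do not know the tolerance the MilkyWay validator actually applies, so this shows relative spread and not a pass/fail verdict.

```python
# Values in order: background, stream_only (2 values), search likelihood.
results = {
    "Pentium 4 (task 103501154)": [
        -3.002860514894183, -143.187387267274180,
        -10.166715562831477, -2.928372010919170,
    ],
    "Xeon E5506 (task 105094683)": [
        -3.002860513689446, -143.187387266921520,
        -10.166715554441502, -2.928372009750395,
    ],
    "HD 5800 (task 105458923)": [
        -3.002860510921325, -143.187387266921520,
        -10.166715550631581, -2.928372007258208,
    ],
}

# Assumption: treat the GPU result as the reference for comparison.
REFERENCE = "HD 5800 (task 105458923)"


def max_abs_diff(a, b):
    """Largest absolute difference between two equal-length value lists."""
    return max(abs(x - y) for x, y in zip(a, b))


deviation = {
    name: max_abs_diff(values, results[REFERENCE])
    for name, values in results.items()
}

for name, diff in deviation.items():
    print(f"{name}: max deviation from reference = {diff:.3e}")
```

On these numbers the Pentium 4 result sits further from the GPU result than the Xeon result does, with the largest gap in the second stream_only value.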
39) Message boards : Number crunching : Validation Problem (Message 51013)
Posted 12 Sep 2011 by Profile kashi
Post:
Yes, it could be related to the power supply. I would think 400W with an i7 920 and a 5870 is really pushing your luck. AMD system requirements for the 5870 = "500 Watt or greater power supply". MilkyWay in standard configuration puts a lot of strain on ATI/AMD cards. Not only are they at full load, it is also an unusually heavy load. The current draw often exceeds that of the most demanding stress test, and this doesn't just last for a short time but continues all day, every day, while MilkyWay is being processed. Cards that come slightly overclocked from the factory are set up that way with gaming in mind. The same small factory overclock that is fine for playing games may be unstable when using the card for MilkyWay.

When I suggested trying a much lower core speed with 500 MHz memory speed as a test, I meant substantially lower, for example 100-200 MHz or so below default core speed.

The reduction in memory speed is important too as it usually reduces heat/power draw a noticeable amount. Many who process MilkyWay on ATI/AMD cards use a low memory speed all the time. This reduces power consumption and heat and potentially helps the card last longer. I use 500 MHz memory speed and have done so for a long time now.

Although one or two people with heavily overclocked cards have claimed that reduced memory speed reduced their processing speed, the majority have found that it has no effect and does not slow MilkyWay at all. Lower memory speed in MilkyWay = lower electricity bill, less heat and a more stable, durable GPU.
40) Message boards : Number crunching : Validation Problem (Message 51011)
Posted 11 Sep 2011 by Profile kashi
Post:
There is a single task with a computation error currently showing on your HD 6950 but other tasks appear to be validating OK. This is a different type of error to your HD 5870 on computer 246261 which is completing and reporting tasks but giving incorrect, invalid results.

Trying with a lower core clock and memory clock on the 5870 that is producing invalid results was just a suggestion to try and find if it is a hardware problem. If there is something wrong with a video card when processing tasks, sometimes reducing the load by reducing the speeds can enable it to work successfully. If it starts producing valid results at lower core and memory speeds then you know it is the hardware at fault and not software. It is just a way of trying to diagnose the problem, that's all. Just like you tried a newer Catalyst driver to see if that was causing the problem.

If you prefer you could swap the 5870 into the computer with the other 5870 that is working correctly. If it still produces invalid results after you swap it into the other computer then it is likely to be a fault with the 5870 itself and not software related.

