
Posts by pvh

1) Message boards : Number crunching : tasks de_modfit_fast_19_3s_140 all failing (Message 66350)
Posted 3 May 2017 by pvh
Post:
Thanks! I had only checked the Number crunching section...
2) Message boards : Number crunching : tasks de_modfit_fast_19_3s_140 all failing (Message 66337)
Posted 3 May 2017 by pvh
Post:
I just started MilkyWay again on my old HD 6950... It looks like all de_modfit_fast_19_3s_140 tasks are failing. This is the error report from one of them:

<stderr_txt>
<search_application> milkyway_separation 1.46 Linux x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Setting process priority to 0 (13): Permission denied
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File 'astronomy_parameters.txt'
Error reading number_streams
19:32:56 (1249): called boinc_finish(1)


Tasks from the batch de_modfit_fast_Sim19_3s_146 seem to be running fine. Is this some problem with the de_modfit_fast_19_3s_140 batch?
3) Message boards : Number crunching : Not getting any new work (Message 64900)
Posted 16 Jul 2016 by pvh
Post:
The BOINC event log listed the drivers and this rig was successfully running a backup project on the GPU. I rebooted the system anyway, and now things seem to be OK. Weird... Thanks for the help!
4) Message boards : Number crunching : Not getting any new work (Message 64898)
Posted 16 Jul 2016 by pvh
Post:
One of my rigs has consistently not been getting any work for the past 10 hours (after I had to reboot it). The other is getting work, but also gets frequent messages saying "got 0 new tasks". Are the servers not capable of keeping up with the load?
5) Message boards : Number crunching : All fixedangles WUs crashing immediately (Message 64878)
Posted 13 Jul 2016 by pvh
Post:
I have been getting a lot of errors lately. They all come from WUs of the modfit fixedangles type and crash immediately after startup. This is in the log:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.36 Linux x86_64 double OpenCL </search_application>
Reading preferences ended prematurely
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Setting process priority to 0 (13): Permission denied
Parameter file 'astronomy_parameters.txt' is empty
Switching to Parameter File
Error reading number_parameters
19:10:46 (18424): called boinc_finish

</stderr_txt>
]]>


I assumed it was a bad batch of WUs, but was surprised to see that it had not been discussed here (at least I could not find any discussion). Is anybody else having this problem? It happens on both of my GPUs running MilkyWay. I am on openSUSE Leap 42.1 with fglrx 15.201.1151 and 15.300.1025.
6) Message boards : Number crunching : MilkyWay@Home v1.02 (opencl_amd_ati) Compute Errors (Message 62838)
Posted 15 Dec 2014 by pvh
Post:
I see the same thing as well. Looks like a bad batch of WUs.

PS - "de_81_DR8_Rev_8_4" and "de_82_DR8_Rev_8_4" appear to be OK.
7) Message boards : Number crunching : ps_modfit_16TestStars immediately result in error (Message 62292)
Posted 9 Sep 2014 by pvh
Post:
I see many errors in ps_modfit_16TestStars WUs; they fail after about 1 second, as follows:

Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4' 
Switching to Parameter File
Integral area dimensions must be even: cut 0: { nu_steps = 20, mu_steps = 400, r_steps = 175 }
Error reading parameters file
Failed to read parameters file
07:08:53 (824): called boinc_finish


Other modfit WUs work fine on the same machine. This is openSUSE 13.1 with BOINC 7.2.42. I suspect that all WUs of this type fail.
8) Message boards : Number crunching : process exited with code 22 (Message 58694)
Posted 11 Jun 2013 by pvh
Post:
Well, there is something weird going on here. The website (the overview of tasks) claims that the WUs are done with client v1.02, but the output identifies the client as v0.82, and if I look at the running processes on my machines I indeed see the v0.82 client running. This should not be necessary, as I ran v1.02 successfully in the past...

I am now also getting invalid WUs. I never had those in the past. It says: "Validate state: Workunit error - check skipped". I don't see anything in the output that suggests an error though...
9) Message boards : Number crunching : process exited with code 22 (Message 58666)
Posted 11 Jun 2013 by pvh
Post:
Never mind, I already figured it out. This is an old bug in the milkyway_separation_0.82_x86_64-pc-linux-gnu__ati14 binary. It is 64-bit, but erroneously expects ld-linux-x86-64.so.2 to be in /lib/ld-linux-x86-64.so.2. This should be /lib64/ld-linux-x86-64.so.2. Creating a symlink /lib/ld-linux-x86-64.so.2 -> /lib64/ld-linux-x86-64.so.2 will fix this, but it would be better if you fixed the binary...
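
For the record, here is a minimal sketch of that workaround in Python (run as root; the loader paths are the ones from this post, adjust them if your distribution differs):

    import os

    # Workaround sketch: point /lib/ld-linux-x86-64.so.2 at the real loader in /lib64.
    # Paths are taken from the post above; needs root privileges.
    target = "/lib64/ld-linux-x86-64.so.2"   # where the dynamic loader actually lives
    link = "/lib/ld-linux-x86-64.so.2"       # where the 0.82 binary looks for it

    if os.path.exists(target) and not os.path.lexists(link):
        os.symlink(target, link)             # same effect as: ln -s <target> <link>
        print("created", link, "->", target)
    else:
        print("nothing to do (link already exists or loader not found)")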
10) Message boards : Number crunching : process exited with code 22 (Message 58664)
Posted 10 Jun 2013 by pvh
Post:
All my AMD GPU WUs now immediately error out with error code 22. Does anybody have an idea what is causing that? I ran MW successfully before and as far as I know nothing changed on my machines. I am running openSUSE 12.2 and BOINC 7.0.65.
11) Message boards : Number crunching : Credit and CPU utilisation of NBody_104 (Message 56745)
Posted 5 Jan 2013 by pvh
Post:
Another problem with these nbody tasks is that they are parallel tasks (presumably OpenMP) and by default use all the cores they can get. However, my version of BOINC (7.0.28 for Linux) doesn't realize that this is happening and assumes the nbody WU uses a single core, so it loads the remaining N-1 cores with other WUs. This is bad news for the parallel task: having it compete with other jobs for the same cores is generally a Really Bad Idea and can significantly slow it down. How much depends on how the code is written, but in most cases there will be a slowdown (a rough illustration is sketched below).

So in my opinion these parallel nbody WUs should be kept on ice anyway until BOINC is able to handle parallel WUs correctly.
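
To make the oversubscription concrete, here is a rough illustration in Python (the 4-core host and the scheduling behaviour are assumptions for the sake of the example, not measurements):

    # Rough illustration of the oversubscription described above.
    # Assumes a 4-core host and a client that schedules the N-body task as single-core.
    cores = 4
    nbody_threads = cores          # OpenMP grabs every core it can see
    other_wus = cores - 1          # BOINC fills the "free" cores with other single-core WUs
    runnable = nbody_threads + other_wus
    print(f"{runnable} runnable threads competing for {cores} cores")  # 7 vs. 4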
12) Message boards : Number crunching : Credit and CPU utilisation of NBody_104 (Message 56744)
Posted 5 Jan 2013 by pvh
Post:
I see the same thing here. No GPU usage and dismally low credits. One example:

run time: 1,037.00 seconds
CPU time: 11,426.18 seconds
credit: 2.93

This is a bad joke. I have the project set up to _only_ receive GPU tasks, so even if the credits had been OK, I would not want to get these work units as they are CPU only. I will disable this project until this is sorted.

Please stop sending these work units as fake GPU tasks immediately!
13) Message boards : Number crunching : WU stuck at 100% (Message 53686)
Posted 15 Mar 2012 by pvh
Post:
When I got home today, I found that the WU ps_separation_15_2s_real_1_1072579_0 was stuck at 100% for more than 8 hours. I manually aborted the WU. This was on openSUSE 11.4 with BOINC 7.0.18.
14) Message boards : Number crunching : CPD going down... (Message 53616)
Posted 10 Mar 2012 by pvh
Post:
Turns out it was a problem with the driver. Rebooted the system and forced a rebuild of the driver. Now things are back to normal. Thanks for the help!
15) Message boards : Number crunching : CPD going down... (Message 53604)
Posted 10 Mar 2012 by pvh
Post:
I changed nothing in my setup. This started some 3 days ago; before that my WUs ran in just under 100 sec, just like you said. I'll try reserving a core to see if it makes a difference, but I never needed to do that before either... Thanks!
16) Message boards : Number crunching : CPD going down... (Message 53590)
Posted 8 Mar 2012 by pvh
Post:
What is going on with the latest WUs I have been getting for my ATI card? They run almost twice as long, but I still get the same credit per WU... As a result my CPD has taken quite a plunge. Is the credit per WU going to be readjusted? I do believe that more work done per WU should also mean that I get more credit...
17) Message boards : News : feel free to cancel any in progress WUs (Message 51350)
Posted 10 Oct 2011 by pvh
Post:
Increasing the WU length could make it impossible for CPUs to crunch for MilkyWay@home in a reasonable time.


Is it strictly necessary that GPU and CPU WUs do the same amount of work? If so, then you will always have a problem since GPUs are so much faster... But I am not convinced that they need to be of the same size...
18) Message boards : News : feel free to cancel any in progress WUs (Message 51349)
Posted 10 Oct 2011 by pvh
Post:
If Collatz is your backup project, you can set the resource share to 0. This means only one WU per GPU will be picked up. When that one finishes, the next one (again a single WU) is downloaded.


I have PrimeGrid as my backup; it is the only backup project that runs on an ATI card and that I consider at least vaguely useful... Setting the resource share to zero only makes the project a backup; it does not limit the number of WUs that are downloaded once the backup kicks in. I think backup projects should work the way you describe, but they don't. I checked on the BOINC site: there is no way to force BOINC to download only a single WU at a time.
19) Message boards : News : feel free to cancel any in progress WUs (Message 51345)
Posted 10 Oct 2011 by pvh
Post:
I for one would also be in favor of _much_ bigger workunits for GPUs (I would say roughly 100x bigger). As it is now, the turnaround time is ridiculously short: roughly every 1-2 minutes the server needs to be contacted for a new WU, and that is for a single GPU. No wonder your server cannot keep up.

A side effect is that MW gets completely bullied by the backup projects. I get a maximum of roughly 25-35 minutes of work in my cache, so every time the server is unresponsive for that long (and that happens quite often), my backup project immediately dumps 20 hours of work on me. If that happened once a day (and we are not far off), I would be running 20 hours of backup project and only 4 hours of MW per day. I too have an ATI GPU, so the choices for backup projects are very limited and I find them all more or less useless, so I really don't want to be running these backup projects at all...
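
A back-of-the-envelope check of that arithmetic (the numbers are the rough figures from this post, treated as assumptions):

    # Rough figures from the post above, treated as assumptions.
    wu_runtime_min = 1.5         # roughly 1-2 minutes per GPU WU
    backup_dump_hours = 20       # work the backup project downloads after one outage

    scheduler_contacts = 24 * 60 / wu_runtime_min
    print(f"~{scheduler_contacts:.0f} scheduler contacts per GPU per day")

    # One outage per day longer than the ~25-35 minute cache would leave roughly:
    mw_hours = 24 - backup_dump_hours
    print(f"~{mw_hours} hours of MilkyWay and ~{backup_dump_hours} hours of backup work per day")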
20) Message boards : Number crunching : GPU Requirements [OLD] (Message 41939)
Posted 5 Sep 2010 by pvh
Post:
As far as I can see there is no Linux client for ATI GPUs. Is there any chance of getting one? I am currently running Collatz, but would love to do something more meaningful...

