a fix for the output file issue

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 42920 - Posted: 18 Oct 2010 | 3:09:45 UTC
Last modified: 18 Oct 2010 | 3:11:26 UTC

I added the <optional/> tag to the result xml so the issue some people have been having with that file not being found should hopefully be fixed. I think the fix will only be for newly generated WUs, so if you've been having this problem I'd cancel whichever ones you're running, so you can get new ones with the right result xml.
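For readers who have not seen a result XML: the following is a minimal sketch, assuming a BOINC-style result template in which each output file is declared in a `<file_ref>` and an `<optional/>` child marks a file the application is allowed not to produce. The template text and the file names in it are hypothetical; the project's actual template is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical result template, for illustration only.
RESULT_TEMPLATE = """
<result>
    <file_ref>
        <file_name>example_output_0</file_name>
        <open_name>out</open_name>
        <optional/>
    </file_ref>
    <file_ref>
        <file_name>example_output_1</file_name>
        <open_name>stderr_copy</open_name>
    </file_ref>
</result>
"""


def optional_outputs(template_xml):
    """Return {open_name: is_optional} for every <file_ref> in the template."""
    root = ET.fromstring(template_xml)
    flags = {}
    for ref in root.findall("file_ref"):
        name = ref.findtext("open_name")
        # The tag's presence is the flag; it carries no text.
        flags[name] = ref.find("optional") is not None
    return flags


print(optional_outputs(RESULT_TEMPLATE))
```

A check like this only reads the flag. How the client treats a missing optional file is up to BOINC itself.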
____________

w1hue
Send message
Joined: 13 Feb 09
Posts: 24
Credit: 644,318
RAC: 1,493
Message 42921 - Posted: 18 Oct 2010 | 3:51:17 UTC - in response to Message 42920.

Does this fix the problem of not getting credit for completed WUs that show "computation error"?
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 42923 - Posted: 18 Oct 2010 | 4:48:19 UTC - in response to Message 42920.
Last modified: 18 Oct 2010 | 4:49:18 UTC

And sorry for taking so long, was kind of incapacitated in bed with the flu all week and just found out about this tonight. :(
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 42924 - Posted: 18 Oct 2010 | 4:48:43 UTC - in response to Message 42921.
Last modified: 18 Oct 2010 | 4:49:00 UTC

Does this fix the problem of not getting credit for completed WUs that show "computation error"?


I think it should. If not, please let me know.
____________

Profile mdhittle*
Avatar
Send message
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 42925 - Posted: 18 Oct 2010 | 4:52:53 UTC - in response to Message 42924.

Will it fix this problem, also?

feeder milkyway Not Running
transitioner milkyway Not Running
milkyway_purge milkyway Not Running
file_deleter milkyway Not Running
nbody_assimilator milkyway Not Running
separation_assimilator milkyway Not Running

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 42926 - Posted: 18 Oct 2010 | 5:32:03 UTC - in response to Message 42925.

They're back up.
____________

Profile John Black
Send message
Joined: 3 May 10
Posts: 53
Credit: 689,784
RAC: 905
Message 42930 - Posted: 18 Oct 2010 | 10:54:52 UTC

Thanks Travis, and I am sorry to hear that you had the flu. I hold with Linus Pauling and vitamin C, 500 mg/day, which seems to keep away the worst of it.
I am not worried about lost credits. What difference does that make to anything, least of all to whether we figure out where the Sagittarius arm is going?
I do wish that my fellow contributors realised the difficulties that you guys are working under and how hard you work to send us stuff for free. If you were levying a charge for this service then maybe people would have a right to complain.
I will try the modified software when I run out of SETI stuff, which I just received a load of.
Thanks again

Profile Nekodemus vom Wolkenstein
Avatar
Send message
Joined: 28 Sep 10
Posts: 5
Credit: 46,734,494
RAC: 0
Message 42933 - Posted: 18 Oct 2010 | 14:30:30 UTC

Hi Travis,
thanks for the info. I have had no more error messages for about 2 hours now. I'm thrilled. A THANK YOU to those who found the BUG.


regards Nekodemus
____________
regards
Nekodemus vom Wolkenstein

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 42938 - Posted: 18 Oct 2010 | 15:43:31 UTC

Hello community!

Please, can someone explain why my teammate gets these errors on his host (2x GTX295)?

'Error overview of host 218321'

What is the problem?

Thanks!

____________
Best regards!

Traills
Send message
Joined: 16 Aug 10
Posts: 6
Credit: 509,129
RAC: 2,451
Message 42953 - Posted: 19 Oct 2010 | 4:17:58 UTC - in response to Message 42920.

Thank you, Travis. Sorry about the flu, that's never fun. I will cancel out of the affected WUs, which I had suspended, and try a new batch of MW after my slug of Cosmology finishes in about a week.

w1hue
Send message
Joined: 13 Feb 09
Posts: 24
Credit: 644,318
RAC: 1,493
Message 42955 - Posted: 19 Oct 2010 | 5:00:52 UTC - in response to Message 42924.

My credits have not changed for about ten days during which time I completed at least three WUs with over 60 hrs CPU time, my pending credits list shows 0, and my task list is blank. So, guess not... at least not as far as getting credit for the "lost WUs" is concerned. Oh well, life is a bitch, then you die. (Just kidding!!) I'll un-suspend the project and see what happens with new WUs.
____________

Brian Priebe
Send message
Joined: 27 Nov 09
Posts: 98
Credit: 172,238,802
RAC: 78,660
Message 42961 - Posted: 19 Oct 2010 | 8:06:52 UTC - in response to Message 42924.

I think it should. If not, please let me know.

So far so good on those file transfer errors going away. But the Windows CPU app is still taking at least twice the time it used to...

Profile White Mountain Wes
Avatar
Send message
Joined: 24 Jul 09
Posts: 21
Credit: 1,974,883
RAC: 1,198
Message 42966 - Posted: 19 Oct 2010 | 16:30:43 UTC

I just finished my first post-fix WU and it completed and validated without any errors. Thanks for the fix! It's nice to be able to crunch for MW again. But I would have to concur that the WUs are now taking about 3X longer than they used to.
____________

Profile prairie69
Send message
Joined: 2 Nov 08
Posts: 11
Credit: 164,511
RAC: 159
Message 42969 - Posted: 19 Oct 2010 | 18:02:20 UTC - in response to Message 42966.

I just finished my first post-fix WU and it completed and validated without any errors. Thanks for the fix! It's nice to be able to crunch for MW again. But I would have to concur that the WUs are now taking about 3X longer than they used to.


Me too. Just got my first successful WU in NINE days!

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 42979 - Posted: 19 Oct 2010 | 22:39:36 UTC - in response to Message 42969.

Great news! Glad it's working. :)
____________

Profile John Black
Send message
Joined: 3 May 10
Posts: 53
Credit: 689,784
RAC: 905
Message 42998 - Posted: 20 Oct 2010 | 5:50:53 UTC

Hi Travis (or anybody who knows),

after the fix was announced I downloaded a few new WUs and got to work. The first, de_14-2s_5_126674_1287400167_0, ran for 21+ hours and showed 16 still to go, so I aborted it as it looked like a bad one. The next, de_14_2s_5_126673_1287400167_0, has been running for 17 hours and shows 27 hours to go.

Are these new WUs or still some from the bad batch? If they are the bad ones then no problem, I will abort and download new ones. If they are not, then I have a different problem and I don't know what to do about it. Both of these WUs showed a "to completion" time of about 18 hours, so running for about 44 hours is way beyond what was expected. The application running for both WUs is MW@H 0.04.

I am afraid that I am stumped. HELP!!

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43000 - Posted: 20 Oct 2010 | 8:43:22 UTC

BTW.

Is it OK to keep using the opt. CPU app available 'here'?

Should it be named V0.21 in app_info.xml?

Like this:

<app_name>milkyway</app_name> <version_num>21</version_num>

Does the new stock MW@h CPU app have version 0.04?

Where can we download the other new N-body CPU app?
And how should this app be named in the app_info.xml file?

Did I understand correctly that MW@h currently has 2 CPU apps?

____________
Best regards!

Travis
Send message
Joined: 30 Aug 07
Posts: 15
Credit: 6,571
RAC: 0
Message 43001 - Posted: 20 Oct 2010 | 9:01:52 UTC - in response to Message 42998.

Hi Travis (or anybody who knows),

after the fix was announced I downloaded a few new WUs and got to work. The first, de_14-2s_5_126674_1287400167_0, ran for 21+ hours and showed 16 still to go, so I aborted it as it looked like a bad one. The next, de_14_2s_5_126673_1287400167_0, has been running for 17 hours and shows 27 hours to go.

Are these new WUs or still some from the bad batch? If they are the bad ones then no problem, I will abort and download new ones. If they are not, then I have a different problem and I don't know what to do about it. Both of these WUs showed a "to completion" time of about 18 hours, so running for about 44 hours is way beyond what was expected. The application running for both WUs is MW@H 0.04.

I am afraid that I am stumped. HELP!!


There shouldn't be any problem with them if you're running 0.40...

Travis
Send message
Joined: 30 Aug 07
Posts: 15
Credit: 6,571
RAC: 0
Message 43002 - Posted: 20 Oct 2010 | 9:03:13 UTC - in response to Message 43000.

BTW.

Is it OK to keep using the opt. CPU app available 'here'?

Should it be named V0.21 in app_info.xml?

Like this:
<app_name>milkyway</app_name> <version_num>21</version_num>

Does the new stock MW@h CPU app have version 0.04?

Where can we download the other new N-body CPU app?
And how should this app be named in the app_info.xml file?

Did I understand correctly that MW@h currently has 2 CPU apps?


Yes there are two CPU apps now. The 'optimized' apps probably won't work with a bunch of the new workunits we are sending out, so I would stick with the applications the server is sending out.

Profile John Black
Send message
Joined: 3 May 10
Posts: 53
Credit: 689,784
RAC: 905
Message 43003 - Posted: 20 Oct 2010 | 9:20:16 UTC - in response to Message 43001.

Hi Travis,
the application claims to be MW@H 0.04, not 0.4. Is that just a typo or does it have some meaning? The latest WU has been running for 20 hours with 30 hours to completion; that surely can't be right.

What do you advise?

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43006 - Posted: 20 Oct 2010 | 12:12:19 UTC
Last modified: 20 Oct 2010 | 12:15:13 UTC

O.K., it's now confusing..

'http://milkyway.cs.rpi.edu/milkyway/apps.php'

For Windows (32bit) CPU now:
MW@h:
0.04
0.04 (SSE2)

MW@h N-body:
0.21 (SSE2)


I have now also allowed CPU WUs, and BOINC downloaded milkyway_0.4_windows_intelx86.exe.
BOINC's Tasks overview shows it as MW@h 0.04.
BOINC's Messages overview shows it as milkyway version 4.

This confuses us members.. ;-)

So, this is the 0.04 app, correct?


BTW. My E7600 supports up to SSE4.1, so why did it get the app without the SSE2 extension?
____________
Best regards!

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 43013 - Posted: 20 Oct 2010 | 15:06:04 UTC - in response to Message 43006.

O.K., it's now confusing..

'http://milkyway.cs.rpi.edu/milkyway/apps.php'

For Windows (32bit) CPU now:
MW@h:
0.04
0.04 (SSE2)

MW@h N-body:
0.21 (SSE2)


I have now also allowed CPU WUs, and BOINC downloaded milkyway_0.4_windows_intelx86.exe.
BOINC's Tasks overview shows it as MW@h 0.04.
BOINC's Messages overview shows it as milkyway version 4.

This confuses us members.. ;-)

So, this is the 0.04 app, correct?


BTW. My E7600 supports up to SSE4.1, so why did it get the app without the SSE2 extension?


It's supposed to be 0.4. However, this is mysteriously reported by the page and BOINC as 0.04.

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43021 - Posted: 20 Oct 2010 | 18:10:39 UTC - in response to Message 43013.
Last modified: 20 Oct 2010 | 18:30:37 UTC

In the host overview here, the WUs are also listed as MilkyWay@Home v0.04.


I tested the old opt. SSE3 CPU app, but after ~20 mins the progress was still at 0 %. So this app shouldn't be used any longer.


With everything stock, I got only milkyway_0.4_windows_intelx86.exe, which makes no sense if my CPU supports up to SSE4.1. So I manually downloaded (for my WinXP 32bit) 'milkyway_0.4_windows_intelx86__sse2.exe' and 'milkyway_nbody_0.21_windows_intelx86__sse2.exe', and made an app_info.xml with the stock 0.24 cuda23 (GPU) and 0.4 + 0.21 (CPU) apps.

The 0.4 app is named:

<app_name>milkyway</app_name> <version_num>4</version_num>

..in app_info.xml and is shown in BOINC as I described in the message above.
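A rough sketch of how an app_info.xml for the two CPU apps named above could be generated. It assumes the usual anonymous-platform layout (`<app>`, `<file_info>`, `<app_version>` with a `<file_ref>` marked `<main_program/>`); the helper `add_app` is hypothetical, and `version_num` 21 for the N-body app is an assumption that follows the pattern in the post. The GPU entry is left out because it needs extra fields not discussed here.

```python
import xml.etree.ElementTree as ET


def add_app(root, app_name, version_num, exe_name):
    """Append the <app>, <file_info> and <app_version> entries for one executable."""
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    info = ET.SubElement(root, "file_info")
    ET.SubElement(info, "name").text = exe_name
    ET.SubElement(info, "executable")

    ver = ET.SubElement(root, "app_version")
    ET.SubElement(ver, "app_name").text = app_name
    ET.SubElement(ver, "version_num").text = str(version_num)
    ref = ET.SubElement(ver, "file_ref")
    ET.SubElement(ref, "file_name").text = exe_name
    ET.SubElement(ref, "main_program")


root = ET.Element("app_info")
# File names are the ones quoted in the post; version numbers as discussed there.
add_app(root, "milkyway", 4, "milkyway_0.4_windows_intelx86__sse2.exe")
add_app(root, "milkyway_nbody", 21, "milkyway_nbody_0.21_windows_intelx86__sse2.exe")

app_info_xml = ET.tostring(root, encoding="unicode")
print(app_info_xml)
```

Generating the file rather than typing it by hand mainly guards against mismatched file names between `<file_info>` and `<file_ref>`.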


For comparison: the old stock (IIRC, 0.19) app needed ~10 hours/WU and the opt. SSE3 app needed ~2 hours/WU. It looks like the new 0.4 SSE2 app needs ~17 hours/WU (estimated time), so the Cr./CPU WU should be higher than in the past.
If not, it would be worthless for credit hunters to run MW@h CPU WUs.


If I enable CPU + GPU in the prefs, my PC gets only GPU WUs.
I need to disable GPU to get CPU WUs.
Then, after my PC has downloaded a few CPU WUs, I need to enable the GPU again.
Would it be possible to 'optimize' the scheduler?
Maybe 4 WUs/CPU-Core and 12 WUs/GPU (limit of tasks in progress)?
____________
Best regards!

Len LE/GE
Send message
Joined: 8 Feb 08
Posts: 232
Credit: 86,905,955
RAC: 38,195
Message 43037 - Posted: 21 Oct 2010 | 2:12:33 UTC - in response to Message 43013.


It's supposed to be 0.4. However, this is mysteriously reported by the page and BOINC as 0.04.


Maybe separation_VERSION_MINOR needs 2 digits to make it consistent across the different platform builds?

1) separation_VERSION_MINOR 04 --> .04
2) separation_VERSION_MINOR 40 --> .40
3) separation_VERSION_MINOR 4 ---> .40 or .04 depending on platform build

Only 3) seems to cause problems ;)
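One reading that fits what is reported in this thread (a `version_num` of 4 shown as 0.04, and 21 shown as 0.21): the version is stored as a single integer, major * 100 + minor, and displayed with a two-digit minor part. The sketch below assumes that rule; the function names are made up for illustration and the thread does not confirm this is the exact formatting BOINC applies.

```python
def encode_version(major, minor):
    """Pack a version into one integer, assuming major * 100 + minor."""
    return major * 100 + minor


def display_version(version_num):
    """Format the packed integer as major.minor with a zero-padded two-digit minor."""
    major, minor = divmod(version_num, 100)
    return f"{major}.{minor:02d}"


for num in (4, 21, 40):
    print(num, "->", display_version(num))
```

Under this assumption a release meant to read as 0.4 would need a minor value of 40, which matches case 2) in the list above.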

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43055 - Posted: 21 Oct 2010 | 8:50:26 UTC
Last modified: 21 Oct 2010 | 8:51:10 UTC

O.K., here we go..

The first CPU WU with the new 0.4 SSE2 app has finished: 'resultid=224519892'.
57,179 secs CPU time, ~16 hours on an Intel Core2 Duo E7600 @ 2x 3.06 GHz.

The 'wingman' used: Running Milkyway@home ATI GPU application version 0.23 (Win64, CAL 1.4) by Gipsel ..an old app.

Why does the CPU get a new app if the GPU still has the old one, and the results match well?

Why can't we use the old opt. CPU app any longer? The old GPU app works.


I don't know how much Cr. this WU will get, but compare with S@h: ~16 hour S@h WUs are ~800 Cr. And MW@h gives ~double the Cr./WU of S@h, so this MW@h CPU WU should get ~1,600 Cr.
But I guess it will get only ~220 Cr.

Is there room for more opt. of the new CPU app?
____________
Best regards!

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43092 - Posted: 22 Oct 2010 | 13:56:43 UTC - in response to Message 43021.
Last modified: 22 Oct 2010 | 14:06:10 UTC

(...)
If I enable CPU + GPU in the prefs, my PC gets only GPU WUs.
I need to disable GPU to get CPU WUs.
Then, after my PC has downloaded a few CPU WUs, I need to enable the GPU again.
Would it be possible to 'optimize' the scheduler?
Maybe 4 WUs/CPU-Core and 12 WUs/GPU (limit of tasks in progress)?


Just curious..

Am I the only one who has problems getting CPU & GPU WUs simultaneously?

(with app_info.xml file)

If I enable CPU & GPU in the project prefs, my Intel Core2 Duo E7600 with GTX260-216 gets only GPU WUs. Max. 12 WUs.

Once BOINC has downloaded 12 GPU WUs, I need to disable GPU in the prefs.
BOINC then downloads a new CPU WU after every upload/report of a GPU result.
Once 10 GPU WUs have been uploaded/reported and BOINC has downloaded 10 CPU WUs, I enable GPU again (as well).

Then I have one GPU WU in calculation and one GPU WU prepared for calculation, and 2 CPU WUs in calculation and 8 CPU WUs prepared. When a CPU result is uploaded/reported, BOINC downloads a new GPU WU. When only the 2 CPU WUs in calculation are left, I need to disable the GPU in the prefs again.

How is it with your BOINC, and how do you do it?

Could the MW@h scheduler do this better?
As I mentioned: 'reached limit of x tasks in progress' for CPU and GPU separately.
S@h always sets a limit after the 3-day outages, separately for CPU and GPU. The server/scheduler can do it.
IIRC, the low limit is 40 WUs/CPU-Core and 320 WUs/GPU.

So here at MW@h maybe 2 WUs/CPU-Core and 6 WUs/GPU.

Would this be possible?

This would be very helpful for us.

Thanks!
____________
Best regards!

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 43094 - Posted: 22 Oct 2010 | 14:05:39 UTC - in response to Message 43055.

Is there room for more opt. of the new CPU app?


In general, the 0.4s should be ~30% faster than before (and they are on Linux / OS X). It's only on Windows that the 0.4s are very, very slow (over 200% slower than on the same hardware in Linux). I've figured out the main reason though, and have a temporary fix which hasn't been put on the servers yet. It turns out the combination of standard math library functions from the MSVCRT and the ones replaced by MinGW is pretty terrible. Building the separation with crlibm (which should actually be slower, and it is in Linux/OS X, since crlibm has slower, more precise math functions) actually ends up being much faster on Windows. Built with crlibm, it runs only 33% slower than in Linux, so about the same speed that the old ones would be. I have to come up with a better solution with faster / less precise math on Windows to get closer to Linux.
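A quick arithmetic check of the figures above, as a sketch rather than a measurement. It assumes "N% slower" means the runtime is (1 + N/100) times the baseline, and it tries both common readings of "~30% faster" (30% less runtime, or 30% more throughput). All runtimes are relative to the old application, which is set to 1.0.

```python
def relative_runtime(percent_slower):
    """Read 'N% slower' as: runtime is (1 + N/100) times the baseline."""
    return 1.0 + percent_slower / 100.0


# Linux 0.4 runtime relative to the old app, under two readings of "~30% faster".
linux_new = {
    "30% less runtime": 1.0 - 0.30,
    "30% more throughput": 1.0 / 1.30,
}

windows_msvcrt = relative_runtime(200)  # "over 200% slower": at least 3x Linux
windows_crlibm = relative_runtime(33)   # "33% slower": 1.33x Linux

results = {}
for reading, linux_time in linux_new.items():
    results[reading] = (
        round(windows_crlibm * linux_time, 2),  # crlibm build vs. old app
        round(windows_msvcrt * linux_time, 2),  # current Windows build vs. old app
    )
    print(reading, results[reading])
```

Under either reading the crlibm build lands within a few percent of the old runtime, and the current Windows build comes out at a bit over twice the old runtime, which is in line with the two-to-three-times slowdowns reported earlier in the thread.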

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43095 - Posted: 22 Oct 2010 | 14:28:36 UTC - in response to Message 43094.
Last modified: 22 Oct 2010 | 14:36:17 UTC

Will we soon see a new 0.41 (with and without SSE2) CPU app?

It would be nice to know this, because I have an app_info.xml file with the milkyway 0.4 sse2, milkyway_nbody 0.21 sse2 and milkyway 0.24 cuda23 apps. ;-)

This morning, my E7600 finished a 0.21 WU in ~45 mins and got 18.11 Cr./WU.


I don't have much knowledge about MW@h..
I don't understand why the CPUs got a new app and the GPUs did not. When CPU and GPU results are compared, they are similar.
Why didn't the GPUs need a new app?

The old 0.21 SSE3 opt. app (from the time of stock 0.19) was very quick, ~2 hours/WU on my E7600 @ 3.06 GHz.
____________
Best regards!

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Send message
Joined: 8 May 10
Posts: 576
Credit: 15,704,253
RAC: 0
Message 43096 - Posted: 22 Oct 2010 | 14:31:47 UTC - in response to Message 43095.

Why didn't the GPUs need a new app?

They do. We do not have the source for the ATI application, which is a major problem so I'm working on replacing it.

Profile Crunch3r
Volunteer developer
Avatar
Send message
Joined: 17 Feb 08
Posts: 358
Credit: 256,958,531
RAC: 2,898
Message 43097 - Posted: 22 Oct 2010 | 14:41:32 UTC - in response to Message 43094.

Is there room for more opt. of the new CPU app?


In general, the 0.4s should be ~30% faster than before (and they are on Linux / OS X). It's only on Windows that the 0.4s are very, very slow (over 200% slower than on the same hardware in Linux). I've figured out the main reason though, and have a temporary fix which hasn't been put on the servers yet. It turns out the combination of standard math library functions from the MSVCRT and the ones replaced by MinGW is pretty terrible. Building the separation with crlibm (which should actually be slower, and it is in Linux/OS X, since crlibm has slower, more precise math functions) actually ends up being much faster on Windows. Built with crlibm, it runs only 33% slower than in Linux, so about the same speed that the old ones would be. I have to come up with a better solution with faster / less precise math on Windows to get closer to Linux.


Interesting... How about not using MinGW at all for the Windows platform?
VS Express doesn't cost a dime...


Why didn't the GPUs need a new app?



They do. We do not have the source for the ATI application, which is a major problem so I'm working on replacing it.


Hopefully that will NOT be using OpenCL... you should take a look at this one -> http://sourceforge.net/projects/calpp/
____________

Join BOINC United now!

Haris Dublas
Send message
Joined: 25 Feb 10
Posts: 49
Credit: 10,136,474
RAC: 0
Message 43103 - Posted: 23 Oct 2010 | 11:38:16 UTC - in response to Message 43092.

(...)
If I enable CPU + GPU in the prefs, my PC gets only GPU WUs.
I need to disable GPU to get CPU WUs.
Then, after my PC has downloaded a few CPU WUs, I need to enable the GPU again.
Would it be possible to 'optimize' the scheduler?
Maybe 4 WUs/CPU-Core and 12 WUs/GPU (limit of tasks in progress)?


Just curious..

Am I the only one who has problems getting CPU & GPU WUs simultaneously?

(with app_info.xml file)

If I enable CPU & GPU in the project prefs, my Intel Core2 Duo E7600 with GTX260-216 gets only GPU WUs. Max. 12 WUs.

Once BOINC has downloaded 12 GPU WUs, I need to disable GPU in the prefs.
BOINC then downloads a new CPU WU after every upload/report of a GPU result.
Once 10 GPU WUs have been uploaded/reported and BOINC has downloaded 10 CPU WUs, I enable GPU again (as well).

Then I have one GPU WU in calculation and one GPU WU prepared for calculation, and 2 CPU WUs in calculation and 8 CPU WUs prepared. When a CPU result is uploaded/reported, BOINC downloads a new GPU WU. When only the 2 CPU WUs in calculation are left, I need to disable the GPU in the prefs again.

How is it with your BOINC, and how do you do it?

Could the MW@h scheduler do this better?
As I mentioned: 'reached limit of x tasks in progress' for CPU and GPU separately.
S@h always sets a limit after the 3-day outages, separately for CPU and GPU. The server/scheduler can do it.
IIRC, the low limit is 40 WUs/CPU-Core and 320 WUs/GPU.

So here at MW@h maybe 2 WUs/CPU-Core and 6 WUs/GPU.

Would this be possible?

This would be very helpful for us.

Thanks!


CPU or GPU, the limit is 6 WUs per CPU core. So if you have a dual core, you can only have 12 WUs, even if you have a 10-day cache. Even if you only crunch on the GPU, the limit is still based on the number of cores in your CPU.

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43105 - Posted: 23 Oct 2010 | 13:22:48 UTC - in response to Message 43103.
Last modified: 23 Oct 2010 | 13:28:58 UTC

Yes, I know.. ;-)
That is how it is currently.

But for us members who would also like to crunch on the CPU alongside the GPU, this does not work well.

I don't know whether other members continuously get enough CPU and GPU WUs at the same time if they enable CPU + GPU in the prefs (and leave it in 'auto mode'),
or whether they also need to keep changing the prefs like I do.
Please leave a message here about how you do it and how it works for you all..

If the MW@h admins would like members to crunch on the CPU as well, they should change the 'scheduler laws'.. ;-)

Maybe 2 WUs/CPU-Core and 6 WUs/GPU (this would be more correct).
Then I/we wouldn't need to follow the instructions I described above and could let the project run in 'auto mode'. ;-)
____________
Best regards!

Profile mdhittle*
Avatar
Send message
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 43106 - Posted: 23 Oct 2010 | 13:59:41 UTC - in response to Message 43105.
Last modified: 23 Oct 2010 | 14:23:26 UTC

Maybe 2 WUs/CPU-Core and 6 WUs/GPU (this would be more correct).
Then I/we wouldn't need to follow the instructions I described above and could let the project run in 'auto mode'. ;-)


This is not a good idea.

Right now I get a cache of 72 work units (12 cores x 6 = 72). Since I only run on the GPUs, this is a 27 minute cache.

With what you propose, I would only get 48 work units (12 cores x 2 + 4 GPUs x 6 = 48). 48 work units would only be an 18 minute cache.

Maybe 2 WUs per CPU core and 600 WUs per GPU would be much better. This would give me a 15 hour cache per GPU.

But, it will never happen. The database would need to be changed and the admins have resisted implementing this kind of change.

Edited: to correct the math
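The cache arithmetic in this post can be reproduced with a short sketch. It assumes the work is spread evenly over the 4 GPUs and derives the per-WU time (1.5 minutes) from the post's own "72 WUs = 27 minutes" figure; the function name is made up for illustration.

```python
def cache_minutes(cached_wus, gpus, minutes_per_wu):
    """Minutes until a GPU-only host drains its cache, with WUs spread evenly over the GPUs."""
    return cached_wus / gpus * minutes_per_wu


# Implied by "72 WUs is a 27 minute cache" on a host with 4 GPUs.
MINUTES_PER_WU = 27 / (72 / 4)

current = cache_minutes(12 * 6, 4, MINUTES_PER_WU)           # 6 WUs per core, 12 cores
proposed = cache_minutes(12 * 2 + 4 * 6, 4, MINUTES_PER_WU)   # 2 per core + 6 per GPU
per_gpu_600_hours = cache_minutes(600, 1, MINUTES_PER_WU) / 60

print(MINUTES_PER_WU, current, proposed, per_gpu_600_hours)
```

Note that the "proposed" line follows the post in counting all 48 WUs as GPU work, since the host in question runs GPU tasks only.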

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43107 - Posted: 23 Oct 2010 | 14:14:56 UTC - in response to Message 43106.

No, the new 'scheduler law' should be separate for CPU and GPU.

With 2 WUs/CPU-Core and 6 WUs/GPU:
If you have a dual-core CPU and one GPU, that is 4 CPU WUs and 6 GPU WUs in BOINC.

Your 12 thread CPU with 4 GPUs would then have 24 CPU WUs and 24 GPU WUs in BOINC.

How high the limit should be is for the admins to decide.
My idea is only to get separate CPU and GPU limits, so that the CPU also gets WUs automatically.
The S@h scheduler can do this, so the MW@h scheduler could do it too.. ;-)
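To make the two rules concrete, here is a small sketch comparing the limit described earlier in the thread (6 WUs per CPU core, shared by CPU and GPU work) with the separate per-resource limits proposed in this post. The function names and default values are illustrative only; this is not the scheduler's actual code.

```python
def combined_limit(cpu_cores, per_core=6):
    """Rule as described in the thread: one shared limit based only on CPU cores."""
    return cpu_cores * per_core


def separate_limits(cpu_cores, gpus, per_core=2, per_gpu=6):
    """Proposed rule: independent in-progress limits for CPU and GPU work."""
    return {"cpu": cpu_cores * per_core, "gpu": gpus * per_gpu}


# Dual-core host with one GPU, and a 12-thread host with four GPUs.
print(combined_limit(2), separate_limits(2, 1))
print(combined_limit(12), separate_limits(12, 4))
```

With a shared limit, whichever resource requests work first can fill every slot, which matches the behaviour reported above where a host ends up holding only GPU WUs.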

____________
Best regards!

Profile mdhittle*
Avatar
Send message
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 43108 - Posted: 23 Oct 2010 | 14:29:59 UTC - in response to Message 43107.
Last modified: 23 Oct 2010 | 14:30:33 UTC

I disagree.

If they change it, the amount of WUs per GPU needs to be raised significantly.

I have never run the CPU app here at Milky Way, but from what I have read, it takes somewhere between 7 and 15 hours to complete one workunit. Having 2 workunits per CPU core would give you a 14 to 30 hour cache. Right now you have a 42 to 90 hour cache per CPU core, depending on how long a CPU task takes to run.

If it is changed, the GPU cache should be the same length of time.

But, like I said, it won't happen, either way. Why? Because it would require a significant change to the database and the admins have resisted this kind of change in the past.

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43109 - Posted: 23 Oct 2010 | 14:41:19 UTC - in response to Message 43108.
Last modified: 23 Oct 2010 | 14:43:37 UTC

We don't need to 'argue' about the limit.. ;-)
..the admins will have to decide.

You currently have 18 WUs/GPU.
Why would you need hundreds of WUs now..?

O.K., maybe 2 WUs/CPU-Core (one in calculation, one prepared) and 18 WUs/GPU (one in calculation, 17 prepared)..
____________
Best regards!

Profile mdhittle*
Avatar
Send message
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 43111 - Posted: 23 Oct 2010 | 15:05:05 UTC - in response to Message 43109.
Last modified: 23 Oct 2010 | 15:46:54 UTC

You currently have 18 WUs/GPU.
Why would you need hundreds of WUs now..?


A better question would be, if you have GPUs, why would you want to run the CPU app?

I just checked some random accounts to see how long it is taking to run a CPU WU. The average is around 20 hours per WU. In 20 hours, a single 5800 GPU can complete 800 WUs.

You said earlier that S@H does what you want it to do, run both CPU and GPU tasks. But, this isn't S@H, it is MW@H. The configuration of MW@H works for the majority of the users. Why change it for you? If they do change it, it should be to accommodate the majority of the users. And the majority of the users are GPU ONLY users.

Now, back to your original question. If the server fails, a CPU user is more than likely not going to have their WU flow interrupted by it, since they have a 120 hour cache, based on 20 hours per CPU WU. A GPU-only user runs out of cached WUs in 10 to 15 minutes, causing an interruption of WU availability. If the cache were increased to the same length of time as the CPU users', it wouldn't cause an interruption. But a 120 hour cache for one single 5800 would be 4800 WUs. Many of us have four 5800s or two 5970s (the same thing as four 5800s).

The changes you are asking for, to make MW@H like S@H would hurt the majority of the users at MW@H.

But any change to the cache would require a significant change to the WU database, and the MW@H admins have resisted this kind of change in the past.

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43113 - Posted: 23 Oct 2010 | 17:44:13 UTC - in response to Message 43111.
Last modified: 23 Oct 2010 | 17:55:16 UTC

How do you know that most members of MW@h crunch only on the GPU?

Another question: if you have a CPU, why don't you want to crunch on it?

I see you have no experience with the MW@h CPU apps.
In the past, with the stock 0.19 app, a MW@h WU needed ~10 hours on my E7600 @ 3.06 GHz.
After installing the opt. SSE3 MW@h 0.21 app, ~2 hours/WU.

Now the new Windows 0.4 app has a BUG. The Linux app is ~200 % faster.
The new Windows 0.4 app should be ~30 % faster than the old 0.19.
The admins need to look at the app again.
Then the stock 0.41 app would need ~7 hours.
If a coder then looks at the app and releases opt. CPU apps again, using SSE3 like in the past and maybe this time also SSSE3 and SSE4.1, the calculation time will again be ~2 hours/WU..

Change it only for me?
MW@h would profit from all the CPUs out there.
Much more so if it automatically sent out opt. CPU apps to all members (if enabled in the prefs).
BOINC is now smart enough to read the CPU's extensions.

I don't know if you understood my question..
Not hundreds/thousands of WUs/host..

Keep the 18 WUs/GPU on your host.. and add my proposed 2 WUs/CPU-Core.
My E7600 with GTX260 would have 4 CPU WUs and 18 GPU WUs in BOINC.
Your 12 thread CPU with 4 GPUs would have 24 CPU WUs (additional) and 72 GPU WUs (like now) in BOINC.
..not more.
____________
Best regards!

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43116 - Posted: 23 Oct 2010 | 19:16:04 UTC - in response to Message 43113.

BTW.

Since 19 Oct 2010, MW@h has had CPU-only apps.
http://milkyway.cs.rpi.edu/milkyway/apps.php

There is no MilkyWay@Home N-Body Simulation GPU app.

On my E7600 @ 3.06 GHz, this kind of WU needs ~45 mins with the 0.21 sse2 app.

____________
Best regards!

Profile [seti.international] Philip J. Fry
Avatar
Send message
Joined: 30 Apr 09
Posts: 67
Credit: 3,120,494
RAC: 0
Message 43396 - Posted: 1 Nov 2010 | 20:07:05 UTC
Last modified: 1 Nov 2010 | 20:07:42 UTC

@ admins

Please let us know whether it will be possible in the future for MW@h to have separate CPU and GPU limits on WUs in progress. Thanks!
(As I mentioned in the messages above)


I have read about GPU-only and CPU-only users; they have no problems.. ;-)

So far I haven't seen anyone report whether running CPU + GPU simultaneously works for them or not.
For me, it doesn't work well automatically.
If CPU + GPU are enabled, there are only GPU WUs in BOINC.
____________
Best regards!



Copyright © 2013 AstroInformatics Group