Welcome to MilkyWay@home

a fix for the output file issue


Message boards : News : a fix for the output file issue

Sutaru Tsureku
Joined: 30 Apr 09
Posts: 95
Credit: 24,003,766
RAC: 0
Message 43006 - Posted: 20 Oct 2010, 12:12:19 UTC
Last modified: 20 Oct 2010, 12:15:13 UTC

O.K., now it's confusing...

'http://milkyway.cs.rpi.edu/milkyway/apps.php'

For Windows (32-bit) CPU the page now lists:
MW@h:
0.04
0.04 (SSE2)

MW@h N-body:
0.21 (SSE2)


I have now also allowed CPU WUs, and BOINC downloaded milkyway_0.4_windows_intelx86.exe.
BOINC's Tasks overview shows it as MW@h 0.04.
BOINC's Messages overview shows it as milkyway version 4.

This confuses us members... ;-)

So this is the 0.04 app, correct?


BTW, my E7600 supports up to SSE4.1, so why did it get the app without the SSE2 extensions?

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 43013 - Posted: 20 Oct 2010, 15:06:04 UTC - in response to Message 43006.  

Sutaru Tsureku wrote:
> BOINC downloaded milkyway_0.4_windows_intelx86.exe.
> So this is the 0.04 app, correct?


It's supposed to be 0.4. However, the page and BOINC mysteriously report it as 0.04.

Sutaru Tsureku
Message 43021 - Posted: 20 Oct 2010, 18:10:39 UTC - in response to Message 43013.  
Last modified: 20 Oct 2010, 18:30:37 UTC

In the host overview here, the WUs are also listed as MilkyWay@Home v0.04.


I tested the old optimized SSE3 CPU app, but after ~20 minutes the task was still at 0%. So that app shouldn't be used any longer.


With everything stock, I only got milkyway_0.4_windows_intelx86.exe, which makes no sense when my CPU supports up to SSE4.1. So I manually downloaded (for my 32-bit WinXP) 'milkyway_0.4_windows_intelx86__sse2.exe' and 'milkyway_nbody_0.21_windows_intelx86__sse2.exe', and made an app_info.xml with the stock 0.24 cuda23 (GPU) app and the 0.4 + 0.21 (CPU) apps.

The 0.4 app is named:
<app_name>milkyway</app_name>
<version_num>4</version_num>

...in app_info.xml, and BOINC shows it the way I described in my earlier message.
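
For readers who have not written one, here is a minimal sketch of where the two elements above sit inside an app_info.xml, generated with Python's standard library. It assumes the usual BOINC anonymous-platform layout (app, file_info, app_version) and shows only the milkyway CPU entry; a real file for this setup would also need entries for the N-body and CUDA apps, which are omitted here.

```python
import xml.etree.ElementTree as ET

EXE = "milkyway_0.4_windows_intelx86__sse2.exe"  # file name taken from the post

root = ET.Element("app_info")

# Declare the application itself.
app = ET.SubElement(root, "app")
ET.SubElement(app, "name").text = "milkyway"

# Declare the executable file that belongs to it.
file_info = ET.SubElement(root, "file_info")
ET.SubElement(file_info, "name").text = EXE
ET.SubElement(file_info, "executable")

# Tie a version number to the executable.
version = ET.SubElement(root, "app_version")
ET.SubElement(version, "app_name").text = "milkyway"
ET.SubElement(version, "version_num").text = "4"  # shown as 0.04 per this thread
file_ref = ET.SubElement(version, "file_ref")
ET.SubElement(file_ref, "file_name").text = EXE
ET.SubElement(file_ref, "main_program")

ET.indent(root)  # pretty-print; needs Python 3.9+
print(ET.tostring(root, encoding="unicode"))
```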


Comparing runtimes: the old stock app (0.19, IIRC) needed ~10 hours/WU and the optimized SSE3 app ~2 hours/WU. The new 0.4 SSE2 app looks like it needs ~17 hours/WU (estimated time), so the credit per CPU WU should be higher than in the past.
If not, running MW@h CPU WUs would be worthless for credit hunters.


If I enable CPU + GPU in the prefs, my PC gets only GPU WUs.
I have to disable GPU to get CPU WUs.
Then, after my PC has downloaded a few CPU WUs, I have to enable the GPU again.
Would it be possible to 'optimize' the scheduler?
Maybe 4 WUs/CPU core and 12 WUs/GPU (limit of tasks in progress)?

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 43037 - Posted: 21 Oct 2010, 2:12:33 UTC - in response to Message 43013.  


Matt Arsenault wrote:
> It's supposed to be 0.4. However, this is mysteriously reported by the page and BOINC as 0.04.


Maybe separation_VERSION_MINOR needs two digits to be consistent across the different platform builds?

1) separation_VERSION_MINOR 04 --> .04
2) separation_VERSION_MINOR 40 --> .40
3) separation_VERSION_MINOR 4 ---> .40 or .04 depending on platform build

Only 3) seems to cause problems ;)
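
To illustrate case 3): BOINC stores an app version as a single integer, version_num = major * 100 + minor, which is why the app_info.xml earlier in the thread carries version_num 4. A minimal sketch, assuming the displayed string is produced by zero-padding the minor part to two digits (format_version is a hypothetical helper, not BOINC code):

```python
def format_version(version_num: int) -> str:
    """Render an integer version number as 'major.minor' with a two-digit minor."""
    major, minor = divmod(version_num, 100)
    return f"{major}.{minor:02d}"


print(format_version(4))   # 0.04  (a minor of 4 reads as "0.04", not "0.4")
print(format_version(40))  # 0.40
print(format_version(21))  # 0.21
```

Under that assumption, a release meant to read as "0.4" would have to be registered with a minor of 40.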

Sutaru Tsureku
Message 43055 - Posted: 21 Oct 2010, 8:50:26 UTC
Last modified: 21 Oct 2010, 8:51:10 UTC

O.K., here we go...

The first CPU WU with the new 0.4 SSE2 app has finished: 'resultid=224519892'.
57,179 secs CPU time, i.e. ~16 hours on an Intel Core2 Duo E7600 @ 2x 3.06 GHz.

The 'wingman' used 'Running Milkyway@home ATI GPU application version 0.23 (Win64, CAL 1.4) by Gipsel', i.e. an old app.

Why does the CPU get a new app if the GPU still has the old one and the results match well?

Why can't we use the old optimized CPU app any longer? The old GPU app still works.


I don't know how much credit this WU will get, but compare with S@h: ~16-hour S@h WUs earn ~800 Cr. MW@h gives roughly double the credit per WU that S@h does, so this MW@h CPU WU should get ~1,600 Cr.
But I guess it'll get only ~220 Cr.

Is there room for more optimization of the new CPU app?

Sutaru Tsureku
Message 43092 - Posted: 22 Oct 2010, 13:56:43 UTC - in response to Message 43021.  
Last modified: 22 Oct 2010, 14:06:10 UTC

Sutaru Tsureku wrote:
> (...)
> If I enable CPU + GPU in the prefs, my PC gets only GPU WUs.
> I have to disable GPU to get CPU WUs.
> Then, after my PC has downloaded a few CPU WUs, I have to enable the GPU again.
> Would it be possible to 'optimize' the scheduler?
> Maybe 4 WUs/CPU core and 12 WUs/GPU (limit of tasks in progress)?


Just curious...

Am I the only one who has problems getting CPU and GPU WUs simultaneously?

(with an app_info.xml file)

If I enable CPU and GPU in the project prefs, my Intel Core2 Duo E7600 with a GTX260-216 gets only GPU WUs, 12 at most.

Once BOINC has downloaded 12 GPU WUs, I have to disable GPU in the prefs.
After every upload/report of a GPU result, BOINC then downloads a new CPU WU.
Once 10 GPU results have been uploaded/reported and BOINC has downloaded 10 CPU WUs, I enable the GPU again.

Then I have one GPU WU running and one GPU WU queued, plus 2 CPU WUs running and 8 CPU WUs queued. Whenever a CPU result is uploaded/reported, BOINC downloads a new GPU WU. When only the 2 running CPU WUs are left, I have to disable the GPU in the prefs again.

How is it with your BOINC? How do you do it?

Could the MW@h scheduler handle this better?
As I mentioned: a 'reached limit of x tasks in progress' for CPU and GPU separately.
S@h always sets a limit after its 3-day outages, separately for CPU and GPU, so the server/scheduler can do it.
IIRC, the low limit there is 40 WUs/CPU core and 320 WUs/GPU.

So here at MW@h maybe 2 WUs/CPU core and 6 WUs/GPU.

Would this be possible?

It would be very helpful for us.

Thanks!

Matt Arsenault
Message 43094 - Posted: 22 Oct 2010, 14:05:39 UTC - in response to Message 43055.  

Sutaru Tsureku wrote:
> Is there room for more optimization of the new CPU app?


In general, the 0.4s should be ~30% faster than before (and they are on Linux / OS X). It's only on Windows that the 0.4s are very, very slow (over 200% slower than on the same hardware in Linux). I've figured out the main reason, though, and have a temporary fix which hasn't been put on the servers yet. It turns out the combination of the standard math library functions from the MSVCRT and the ones replaced by MinGW is pretty terrible. Building separation with crlibm (which should actually be slower, and is on Linux / OS X, since crlibm has slower, more precise math functions) actually ends up being much faster on Windows. Built with crlibm, it runs only 33% slower than on Linux, so about the same speed as the old ones. I still have to come up with a better solution with faster / less precise math on Windows to get closer to Linux.
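
A rough check of why "30% faster" followed by "33% slower" lands back at about the old speed. This sketch assumes "x% faster" means throughput rises by x% and "x% slower" means a task takes x% longer; the 10 hours/WU baseline is the stock 0.19 figure reported earlier in the thread for an E7600, used here purely as an illustration. The helper names are made up.

```python
def runtime_after_speedup(old_hours: float, faster_pct: float) -> float:
    """Runtime once throughput has risen by faster_pct percent."""
    return old_hours / (1 + faster_pct / 100)


def runtime_after_slowdown(base_hours: float, slower_pct: float) -> float:
    """Runtime when a task takes slower_pct percent longer than the base."""
    return base_hours * (1 + slower_pct / 100)


old_stock = 10.0  # hours/WU, stock 0.19 on the E7600 (from the thread)
linux_like = runtime_after_speedup(old_stock, 30)        # the intended 0.4 speed
windows_crlibm = runtime_after_slowdown(linux_like, 33)  # crlibm build on Windows

print(f"{linux_like:.1f} h, {windows_crlibm:.1f} h")  # 7.7 h, 10.2 h
```

If "30% faster" is instead read as "30% less runtime", the first figure becomes 7.0 hours and the second about 9.3, which is still close to the old speed.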

Sutaru Tsureku
Message 43095 - Posted: 22 Oct 2010, 14:28:36 UTC - in response to Message 43094.  
Last modified: 22 Oct 2010, 14:36:17 UTC

So will we soon see a new 0.41 CPU app (with and without SSE2)?

It would be nice to know, because I have an app_info.xml file with the milkyway 0.4 sse2, milkyway_nbody 0.21 sse2 and milkyway 0.24 cuda23 apps. ;-)

This morning my E7600 finished a 0.21 WU in ~45 minutes and got 18.11 Cr. for it.


I don't know much about MW@h...
I don't understand why the CPUs got a new app and the GPUs didn't. When CPU and GPU results are compared, they are similar.
Why didn't the GPUs need a new app?

The old optimized SSE3 0.21 app (from the time of stock 0.19) was very quick: ~2 hours/WU on my E7600 @ 3.06 GHz.

Matt Arsenault
Message 43096 - Posted: 22 Oct 2010, 14:31:47 UTC - in response to Message 43095.  

Sutaru Tsureku wrote:
> Why didn't the GPUs need a new app?

They do. We do not have the source for the ATI application, which is a major problem, so I'm working on replacing it.

Crunch3r
Volunteer developer
Joined: 17 Feb 08
Posts: 363
Credit: 258,227,990
RAC: 0
Message 43097 - Posted: 22 Oct 2010, 14:41:32 UTC - in response to Message 43094.  

Matt Arsenault wrote:
> It turns out the combination of the standard math library functions from the MSVCRT and the ones replaced by MinGW is pretty terrible.


Interesting... How about not using MinGW at all for the Windows platform?
VS Express doesn't cost a dime...


Matt Arsenault wrote:
> We do not have the source for the ATI application, which is a major problem, so I'm working on replacing it.


Hopefully that will NOT be using OpenCL... you should take a look at this one -> http://sourceforge.net/projects/calpp/

Haris Dublas
Joined: 25 Feb 10
Posts: 49
Credit: 10,136,474
RAC: 0
Message 43103 - Posted: 23 Oct 2010, 11:38:16 UTC - in response to Message 43092.  

Sutaru Tsureku wrote:
> (...)
> Could the MW@h scheduler handle this better?
> As I mentioned: a 'reached limit of x tasks in progress' for CPU and GPU separately.
> (...)
> So here at MW@h maybe 2 WUs/CPU core and 6 WUs/GPU.


CPU or GPU, the limit is 6 WUs per CPU core. So if you have a dual core, you can only have 12 WUs, even with a 10-day cache. Even if you only crunch on the GPU, the limit is still based on the number of cores of your CPU.

Sutaru Tsureku
Message 43105 - Posted: 23 Oct 2010, 13:22:48 UTC - in response to Message 43103.  
Last modified: 23 Oct 2010, 13:28:58 UTC

Yes, I know... ;-)
That's how it is currently.

But for those of us who would like to crunch on the CPU and the GPU simultaneously, this doesn't work well.

I don't know whether other members continuously get enough CPU and GPU WUs at the same time when they enable CPU + GPU in the prefs (in 'auto mode'),
or whether they also have to keep changing the prefs like I do.
Please leave a message here about how you do it and how it works for you all.

If the MW@h admins would like members to crunch on the CPU as well, they should change the 'scheduler laws'... ;-)

Maybe 2 WUs/CPU core and 6 WUs/GPU (this would be more correct).
Then we wouldn't need the workaround I described above and could let the project run in 'auto mode'. ;-)

mdhittle*
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 43106 - Posted: 23 Oct 2010, 13:59:41 UTC - in response to Message 43105.  
Last modified: 23 Oct 2010, 14:23:26 UTC

Sutaru Tsureku wrote:
> Maybe 2 WUs/CPU core and 6 WUs/GPU (this would be more correct).
> Then we wouldn't need the workaround I described above and could let the project run in 'auto mode'. ;-)


This is not a good idea.

Right now I get a cache of 72 work units (12 cores x 6 = 72). Since I only run on the GPUs, this is a 27-minute cache.

With what you propose, I would only get 48 work units (12 cores x 2 + 4 GPUs x 6 = 48). 48 work units would only be an 18-minute cache.

Maybe 2 WUs per CPU core and 600 WUs per GPU would be much better. This would give me a 15-hour cache per GPU.
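
The cache arithmetic above can be checked with a short sketch. It assumes (this is not stated in the post) that the four GPUs drain the queue at a constant combined rate, derived from the quoted figure of 72 work units lasting 27 minutes. All names are illustrative.

```python
CORES = 12
GPUS = 4

# Current rule from the thread: 6 WUs per CPU core, whatever device runs them.
current_cache = CORES * 6            # 72 work units
minutes_per_wu = 27 / current_cache  # host-wide drain rate: 0.375 min per WU


def cache_minutes(work_units: int) -> float:
    """How long a queue lasts when all GPUs work through it together."""
    return work_units * minutes_per_wu


proposed_cache = CORES * 2 + GPUS * 6  # 48 work units

print(current_cache, cache_minutes(current_cache))    # 72 27.0
print(proposed_cache, cache_minutes(proposed_cache))  # 48 18.0

# With 600 WUs per GPU, each GPU finishes one WU every minutes_per_wu * GPUS minutes.
hours_per_gpu = 600 * minutes_per_wu * GPUS / 60
print(hours_per_gpu)  # 15.0
```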

But, it will never happen. The database would need to be changed and the admins have resisted implementing this kind of change.

Edited: to correct the math

Sutaru Tsureku
Message 43107 - Posted: 23 Oct 2010, 14:14:56 UTC - in response to Message 43106.  

No, the new 'scheduler law' should be separate for CPU and GPU.

With, say, 2 WUs/CPU core and 6 WUs/GPU:
If you have a dual-core CPU and one GPU, that's 4 CPU WUs and 6 GPU WUs in BOINC.

Your 12-thread CPU with 4 GPUs would then have 24 CPU WUs and 24 GPU WUs in BOINC.
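
A small sketch of the difference between the two rules being discussed. It only reproduces the counting in this thread and is not the actual scheduler logic; the function names and default values are illustrative assumptions.

```python
from typing import Tuple


def shared_limit(cpu_cores: int, per_core: int = 6) -> int:
    """Current rule as described in the thread: one pool sized by core count."""
    return cpu_cores * per_core


def separate_limits(cpu_cores: int, gpus: int,
                    per_core: int = 2, per_gpu: int = 6) -> Tuple[int, int]:
    """Proposed rule: independent in-progress limits, returned as (cpu, gpu)."""
    return cpu_cores * per_core, gpus * per_gpu


print(shared_limit(2), shared_limit(12))  # 12 72
print(separate_limits(2, 1))              # (4, 6)
print(separate_limits(12, 4))             # (24, 24)
```

Because the CPU share is counted on its own, GPU tasks could no longer fill the whole allowance, which is the behaviour being asked for.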

How high the limits should be is for the admins to decide.
My idea is only to have separate CPU and GPU limits, so that the CPU automatically gets WUs too.
The S@h scheduler can do this, so the MW@h scheduler could do it as well... ;-)

mdhittle*
Message 43108 - Posted: 23 Oct 2010, 14:29:59 UTC - in response to Message 43107.  
Last modified: 23 Oct 2010, 14:30:33 UTC

I disagree.

If they change it, the number of WUs per GPU needs to be raised significantly.

I have never run the CPU app here at MilkyWay, but from what I have read, it takes somewhere between 7 and 15 hours to complete one work unit. Having 2 work units per CPU core would give you a 14- to 30-hour cache. Right now you have a 42- to 90-hour cache per CPU core (6 work units each), depending on how long a CPU task takes to run.

If it is changed, the GPU cache should cover the same length of time.

But, like I said, it won't happen, either way. Why? Because it would require a significant change to the database and the admins have resisted this kind of change in the past.

Sutaru Tsureku
Message 43109 - Posted: 23 Oct 2010, 14:41:19 UTC - in response to Message 43108.  
Last modified: 23 Oct 2010, 14:43:37 UTC

We don't need to 'argue' about the limit... ;-)
...the admins will have to decide.

You currently have 18 WUs/GPU.
So why would you need hundreds of WUs now?

O.K., maybe 2 WUs/CPU core (one running, one queued) and 18 WUs/GPU (one running, 17 queued)...

mdhittle*
Message 43111 - Posted: 23 Oct 2010, 15:05:05 UTC - in response to Message 43109.  
Last modified: 23 Oct 2010, 15:46:54 UTC

Sutaru Tsureku
Message 43113 - Posted: 23 Oct 2010, 17:44:13 UTC - in response to Message 43111.  
Last modified: 23 Oct 2010, 17:55:16 UTC

How do you know that most MW@h members crunch only on the GPU?

Another question: if you have a CPU, why don't you want to crunch on it?

I see you have no experience with the MW@h CPU apps.
In the past, with the stock 0.19 app, a MW@h WU needed ~10 hours on my E7600 @ 3.06 GHz.
After installing the optimized SSE3 MW@h 0.21 app, it was ~2 hours/WU.

Now the new Windows 0.4 app has a bug: the Linux app is ~200% faster.
The new Windows 0.4 app should be ~30% faster than the old 0.19.
The admins need to look at the app again.
Then a stock 0.41 app would need ~7 hours.
If a coder then looks at the app and releases optimized CPU apps again, using SSE3 like in the past and maybe this time also SSSE3 and SSE4.1, the runtime will be ~2 hours/WU again...

Change it only for me?
MW@h would profit from all the CPUs out there,
and much more so if it automatically sent optimized CPU apps to all members (if enabled in the prefs).
BOINC is now smart enough to detect the CPU's extensions.

I don't know if you understood my question...
Not hundreds/thousands of WUs per host.

With the 18 WUs/GPU your host has now, plus my suggested 2 WUs/CPU core:
My E7600 with a GTX260 would have 4 CPU WUs and 18 GPU WUs in BOINC.
Your 12-thread CPU with 4 GPUs would have 24 CPU WUs (additional) and 72 GPU WUs (like now) in BOINC.
...not more.

Sutaru Tsureku
Message 43116 - Posted: 23 Oct 2010, 19:16:04 UTC - in response to Message 43113.  

BTW:

Since 19 Oct 2010, MW@h has had CPU-only apps.
http://milkyway.cs.rpi.edu/milkyway/apps.php

There is no GPU app for the MilkyWay@Home N-Body Simulation.

On my E7600 @ 3.06 GHz, this kind of WU needs ~45 minutes with the 0.21 sse2 app.

Sutaru Tsureku
Message 43396 - Posted: 1 Nov 2010, 20:07:05 UTC
Last modified: 1 Nov 2010, 20:07:42 UTC

@ admins

Please let us know whether MW@h will in future be able to have separate in-progress WU limits for CPU and GPU. Thanks!
(As I described in the messages above)


I have read posts from GPU-only and CPU-only users; they have no problems... ;-)

So far I haven't seen anyone report whether running CPU + GPU simultaneously works for them or not.
For me, it doesn't work well automatically:
if CPU + GPU are enabled, there are only GPU WUs in BOINC.

©2019 Astroinformatics Group