Welcome to MilkyWay@home

Recompiled Linux 32/64 apps

Message boards : Application Code Discussion : Recompiled Linux 32/64 apps

Dingo
Joined: 28 Aug 07
Posts: 35
Credit: 88,705,546
RAC: 0
Message 11995 - Posted: 21 Feb 2009, 3:27:01 UTC - in response to Message 11989.  

OK, on one Linux PC it just said SSE2, but on Windows, CPU-Z says SSE3, so I guess I am fine and do not need an SSE2 version. I am not up on all this stuff :)

Thanks

ID: 11995 · Rating: 0
kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 12017 - Posted: 21 Feb 2009, 4:00:41 UTC - in response to Message 11995.  

No worries, judging by your stats for many projects you're struggling along quite well.
A few suitable ATI cards in some of those boxes and you'll be in orbit. :)

In case I didn't make myself clear before: Linux does not report SSE3 as "sse3" in /proc/cpuinfo, but as "pni" (Prescott New Instructions). pni=SSE3.

I don't know what CPU-Z shows, as I haven't got a working SSE3-capable computer. With perfect timing, my X3350 computer with an HD3850 failed a few days before the ATI 32-bit Windows teaser application was released.
ID: 12017 · Rating: 0
Dingo
Joined: 28 Aug 07
Posts: 35
Credit: 88,705,546
RAC: 0
Message 12021 - Posted: 21 Feb 2009, 4:13:45 UTC - in response to Message 12017.  

Turns out that Ubuntu (cat /proc/cpuinfo) shows it as ssse3, and the optimized SSSE3 app works fine, so far anyway.

ID: 12021 · Rating: 0
kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 12034 - Posted: 21 Feb 2009, 4:54:00 UTC - in response to Message 12021.  
Last modified: 21 Feb 2009, 5:09:28 UTC

Yes, that's fine. SSSE3 is different to SSE3 though: it has some extra instructions and an extra "S". :)
In /proc/cpuinfo, SSE3=pni and SSSE3=ssse3.

As mentioned in an earlier post in this thread, an older Linux kernel may not report the most recent flags. This may explain why cat /proc/cpuinfo reports the same CPU differently on different boxes if you are running older and newer versions of Linux on them.

Not sure of the difference in speed between SSE3 and SSSE3 versions; I've only ever tried the Linux64 SSE4.1.
The main thing is you are crunching OK.
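
As a rough illustration of the flag naming discussed above, here is a minimal Python sketch that reads the "flags" line of /proc/cpuinfo and translates it into the SSE level names used in this thread. The function name `sse_levels` and the mapping table are made up for this example; only the flag names themselves (sse2, pni, ssse3, sse4_1) come from the kernel.

```python
# Map /proc/cpuinfo flag names to the SSE level names used in this thread.
# The table and function name are illustrative, not part of any BOINC tool.
FLAG_TO_LEVEL = {
    "sse2": "SSE2",
    "pni": "SSE3",       # Linux reports SSE3 as "pni"
    "ssse3": "SSSE3",
    "sse4_1": "SSE4.1",
}


def sse_levels(cpuinfo_text):
    """Return the SSE levels listed in the first 'flags' line of cpuinfo_text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return [level for flag, level in FLAG_TO_LEVEL.items() if flag in flags]
    return []  # no flags line found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(sse_levels(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

If an older kernel does not know a newer flag, that level will simply be missing from the list even though the CPU supports it.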
ID: 12034 · Rating: 0
ebahapo
Joined: 6 Sep 07
Posts: 66
Credit: 636,861
RAC: 0
Message 12403 - Posted: 22 Feb 2009, 17:51:13 UTC - in response to Message 12034.  

Not sure of the difference in speed between SSE3 and SSSE3 versions...

It should be zilch, since it's unlikely that the compiler will find opportunities in MW code to fit the multimedia-like SSSE3 instructions.

HTH

ID: 12403 · Rating: 0
speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 12411 - Posted: 22 Feb 2009, 18:28:58 UTC - in response to Message 12403.  

Not sure of the difference in speed between SSE3 and SSSE3 versions...

It should be zilch, since it's unlikely that the compiler will find opportunities in MW code to fit the multimedia-like SSSE3 instructions.

HTH


Right, not much difference, but a higher SSE level surely looks faster. :D
mic.


ID: 12411 · Rating: 0
RAMen
Joined: 8 Apr 08
Posts: 45
Credit: 161,943,995
RAC: 0
Message 13746 - Posted: 3 Mar 2009, 6:58:44 UTC
Last modified: 3 Mar 2009, 7:03:50 UTC

Is there any plan for a Linux app for ATI GPUs in the pipeline, for a quad or i7?
ID: 13746 · Rating: 0
gabberattack (johnny, eriq, se...
Joined: 6 Jun 08
Posts: 8
Credit: 152,709
RAC: 0
Message 16864 - Posted: 25 Mar 2009, 22:18:41 UTC

Yes, please release an ATI app for Linux. CUDA for all systems is under construction now, but it makes sense to make an app for the stronger cards first, and those are ATI. I hate running my machine under Windows just to be able to run MilkyWay on the GPU. :-( And I do not think there are more people with NVIDIA GTS/GTX 200 cards than Linux users with ATI 38xx/48xx cards.
ID: 16864 · Rating: 0
shaf*
Joined: 9 Mar 09
Posts: 37
Credit: 37,538,556
RAC: 0
Message 17624 - Posted: 5 Apr 2009, 12:37:55 UTC

Still a little disappointed with the linux apps.

I have the SSE3 app running on an E6400 @ 2.4 GHz test machine:

-rw-r--r-- 1 root root 594 2009-02-18 21:33 app_info.xml
-rw-r--r-- 1 root root 35127 2009-01-19 20:07 GPL.txt
-rwxr-xr-x 1 root root 1222624 2009-02-18 21:25 milkyway_0.18_sse3_i686-pc-linux-gnu

This yields a WU time of 35 minutes,

whereas the Windows optimized app yields a time of 20 minutes on a slower processor (1.8 GHz T2390).

The difference seems like a lot to me. Or is this typical?
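
To put a number on that gap, here is a small Python sketch that scales each run time by the clock speed. It assumes that run time multiplied by clock frequency is a fair stand-in for clock cycles spent per work unit, and that both CPUs do similar work per clock; neither assumption is verified in this thread. The function name is made up for the example.

```python
def clock_normalized_ratio(minutes_a, ghz_a, minutes_b, ghz_b):
    """Ratio of (run time x clock speed) for host A over host B.

    Run time times clock frequency is proportional to elapsed clock cycles,
    so a value above 1 means host A spent more cycles per work unit.
    """
    return (minutes_a * ghz_a) / (minutes_b * ghz_b)


# Figures from the post: Linux E6400 at 2.4 GHz, Windows T2390 at 1.8 GHz.
ratio = clock_normalized_ratio(35, 2.4, 20, 1.8)
print(f"Linux host used about {ratio:.2f}x the clock cycles per WU")
```

By this rough measure the gap is larger than the raw 35 versus 20 minutes suggests, since the Linux machine also runs at the higher clock.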



ID: 17624 · Rating: 0
mfl0p
Joined: 18 Feb 09
Posts: 8
Credit: 2,424,453
RAC: 0
Message 17633 - Posted: 5 Apr 2009, 15:05:54 UTC
Last modified: 5 Apr 2009, 15:08:00 UTC

After reading that Intel compiler flag "-fp-model fast=2" was being used on these applications, I figured I would run a test, and results are as expected.

The Linux 64bit SSSE3 0.18d mkII app downloaded from zslip.com gives invalid results. I suspect that the same flag was used on the other apps, too. Flag "-fp-model precise" needs to be used to get the fitness in line with expected output.

This app's output:
[admin@ntellx4 test_files]$ ./milkyway_0.18_SSSE3_x86_64-pc-linux-gnu 
[admin@ntellx4 test_files]$ more out
searchname
parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501
metadata: this is the metadata
fitness: -3.027909854710229
speedimic_SSSE3_64: 0.18



Correct output:
86:
searchname
parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501
metadata: this is the metadata
fitness: -3.027909854710189
your_app_name: 0.18


Travis, I'm starting to think a quorum should be used, as in other BOINC projects, since the code is open source.
ID: 17633 · Rating: 0
Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 17634 - Posted: 5 Apr 2009, 15:15:09 UTC - in response to Message 17633.  
Last modified: 5 Apr 2009, 15:17:20 UTC

The Linux 64bit SSSE3 0.18d mkII app downloaded from zslip.com gives invalid results.
[..]
This app's output:
fitness: -3.027909854710229

Correct output:
fitness: -3.027909854710189

No. Travis has given a range for an allowed deviation from the result you quoted, and speedimic's results are within those requirements.
You also get small deviations between the results using the x87 FPU, SSE2, PowerPC FPU or AltiVec. As long as they are small enough, it is okay.
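
One general reason such small deviations appear is that floating-point addition is not associative, so a compiler or instruction set that groups the same operations differently can change the last few digits. The Python sketch below only demonstrates that effect on three numbers; whether reordering is the actual cause of the differences seen in the MilkyWay apps is an assumption, not something established in this thread.

```python
values = [0.1, 0.2, 0.3]

# Same three numbers, two different groupings of the additions.
forward = (values[0] + values[1]) + values[2]
regrouped = values[0] + (values[1] + values[2])

print(forward, regrouped, forward == regrouped)
print("difference:", abs(forward - regrouped))
```

The two sums differ only in the last bit or so of a double, which is the same flavour of discrepancy as a fitness value that differs far out in the decimal places.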
ID: 17634 · Rating: 0
mfl0p
Joined: 18 Feb 09
Posts: 8
Credit: 2,424,453
RAC: 0
Message 17637 - Posted: 5 Apr 2009, 15:37:23 UTC
Last modified: 5 Apr 2009, 15:41:29 UTC

If that is the case, then why does Travis have a thread about testing custom apps, saying "The results should look like the following:" instead of "should be within a deviation of the following:"?

I'm sure others would like to know what this allowed deviation is. The result posted above trades that error for a speed increase of about 50%. That's pretty major.
ID: 17637 · Rating: 0
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 17639 - Posted: 5 Apr 2009, 16:07:47 UTC

He did post a while ago that the results must match to a certain decimal place; I believe it was the 10th. Those results agree to the 12th.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
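
The claim about decimal places can be checked with a short Python sketch. It uses the two fitness values quoted earlier in the thread and the standard decimal module so that no binary rounding interferes. The function name and the idea of comparing truncated values are choices made for this example, not the project's actual validation rule, which is not spelled out here.

```python
from decimal import Decimal, ROUND_DOWN


def matching_decimal_places(a, b, max_places=15):
    """Count how many leading decimal places of a and b (given as strings) agree."""
    x, y = Decimal(a), Decimal(b)
    places = 0
    for n in range(1, max_places + 1):
        step = Decimal(1).scaleb(-n)  # 10**-n
        # Truncate both values to n decimal places and compare.
        if x.quantize(step, rounding=ROUND_DOWN) != y.quantize(step, rounding=ROUND_DOWN):
            break
        places = n
    return places


app_fitness = "-3.027909854710229"       # output of the SSSE3 app
reference_fitness = "-3.027909854710189"  # reference output

print(matching_decimal_places(app_fitness, reference_fitness))
print(abs(Decimal(app_fitness) - Decimal(reference_fitness)))
```

For these two values the first 12 decimal places agree, so a requirement of 10 matching places would be met.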
ID: 17639 · Rating: 0
ebahapo
Joined: 6 Sep 07
Posts: 66
Credit: 636,861
RAC: 0
Message 17665 - Posted: 5 Apr 2009, 20:48:49 UTC - in response to Message 17624.  

Still a little disappointed with the linux apps...

Whereas the Windows optimized app yields a time of 20 minutes on a slower processor (1.8 GHz T2390).

The difference seems like a lot to me. Or is this typical?

It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here.

HTH
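
One way to check this on a given box is to look at the kernel's cpufreq entries in sysfs. The Python sketch below reads the governor and the current and maximum frequency for one core. The default path and file names are the usual Linux cpufreq ones, but they are absent when frequency scaling is not available, so the helper (whose name is made up for this example) returns None for anything it cannot read.

```python
from pathlib import Path


def read_cpufreq(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return governor and current/maximum frequency in MHz from a cpufreq directory.

    Entries that cannot be read come back as None.
    """
    base = Path(cpufreq_dir)

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            return None

    def khz_to_mhz(text):
        # sysfs reports frequencies in kHz.
        return int(text) / 1000 if text and text.isdigit() else None

    return {
        "governor": read("scaling_governor"),
        "cur_mhz": khz_to_mhz(read("scaling_cur_freq")),
        "max_mhz": khz_to_mhz(read("cpuinfo_max_freq")),
    }


if __name__ == "__main__":
    print(read_cpufreq())
```

If the current frequency sits well below the maximum while a work unit is running, frequency scaling is a plausible suspect.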
ID: 17665 · Rating: 0
shaf*
Joined: 9 Mar 09
Posts: 37
Credit: 37,538,556
RAC: 0
Message 17668 - Posted: 5 Apr 2009, 21:42:59 UTC - in response to Message 17665.  

It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here.

HTH


Afraid not: SpeedStep and thermal throttling are both disabled in the BIOS. A solid overclock is applied, and the powernow/acpi daemons are not running once booted.

The linux client is simply inefficient compared to win32.
ID: 17668 · Rating: 0
ebahapo
Joined: 6 Sep 07
Posts: 66
Credit: 636,861
RAC: 0
Message 17677 - Posted: 5 Apr 2009, 22:48:49 UTC - in response to Message 17668.  

The linux client is simply inefficient compared to win32.

Different compilers perhaps?

ID: 17677 · Rating: 0
Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 17678 - Posted: 5 Apr 2009, 22:49:57 UTC - in response to Message 17668.  

The linux client is simply inefficient compared to win32.

I thought the fastest version of Speedimic is quite close.

Just looked it up: speedimic's own Q9550 (45nm, 2.83GHz) takes about 1030 seconds for a 27.77-credit WU under Linux.

He has another host (Q6600, 65nm, apparently overclocked to 2.7-2.8GHz judging from the benchmark values) completing the same tasks in about 1150 seconds under Windows. It is a somewhat poor comparison because of the different clock speeds and because the 45nm CPUs should be faster per clock than their 65nm counterparts, but nonetheless it is clear that his Linux version can't be that bad.
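
For a quick side-by-side of those two hosts, the run times can be turned into credits per hour for a single task. This Python sketch uses only the figures quoted above; the function name is made up for the example, and it deliberately ignores the differences in clock speed and process generation mentioned in the post.

```python
def credits_per_hour(credit_per_wu, seconds_per_wu):
    """Credits earned per hour by one task slot at the given run time."""
    return credit_per_wu * 3600 / seconds_per_wu


linux_q9550 = credits_per_hour(27.77, 1030)    # Linux host
windows_q6600 = credits_per_hour(27.77, 1150)  # Windows host

print(f"Linux Q9550:   {linux_q9550:.1f} credits/hour")
print(f"Windows Q6600: {windows_q6600:.1f} credits/hour")
```

The two figures land within roughly 12% of each other, which fits the point that the Linux build is not far behind.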
ID: 17678 · Rating: 0
shaf*
Joined: 9 Mar 09
Posts: 37
Credit: 37,538,556
RAC: 0
Message 17680 - Posted: 5 Apr 2009, 22:57:12 UTC - in response to Message 17678.  
Last modified: 5 Apr 2009, 22:58:38 UTC

The linux client is simply inefficient compared to win32.

I thought the fastest version of Speedimic is quite close.

Just looked it up: speedimic's own Q9550 (45nm, 2.83GHz) takes about 1030 seconds for a 27.77-credit WU under Linux.

He has another host (Q6600, 65nm, apparently overclocked to 2.7-2.8GHz judging from the benchmark values) completing the same tasks in about 1150 seconds under Windows. It is a somewhat poor comparison because of the different clock speeds and because the 45nm CPUs should be faster per clock than their 65nm counterparts, but nonetheless it is clear that his Linux version can't be that bad.


Here are my tasks ....

Laptop win32 : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39511370

Linux : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39523720

Thanks for the pointers. I'll recheck whether I'm using the wrong client.
ID: 17680 · Rating: 0
shaf*
Joined: 9 Mar 09
Posts: 37
Credit: 37,538,556
RAC: 0
Message 17681 - Posted: 5 Apr 2009, 23:04:16 UTC - in response to Message 17678.  


I thought the fastest version of Speedimic is quite close.

Just looked it up: speedimic's own Q9550 (45nm, 2.83GHz) takes about 1030 seconds for a 27.77-credit WU under Linux.

He has another host (Q6600, 65nm, apparently overclocked to 2.7-2.8GHz judging from the benchmark values) completing the same tasks in about 1150 seconds under Windows. It is a somewhat poor comparison because of the different clock speeds and because the 45nm CPUs should be faster per clock than their 65nm counterparts, but nonetheless it is clear that his Linux version can't be that bad.


OK, the Q6600 is closer in architecture to my CPU than the Q9550 is. He's using SSE4.1 on the Q9550.

Odd, he has a different client from mine on his Q6600. Any idea where I can get it?

ID: 17681 · Rating: 0
Cluster Physik
Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 17683 - Posted: 5 Apr 2009, 23:15:20 UTC - in response to Message 17680.  

Here are my tasks ....

Laptop win32 : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39511370

Linux : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39523720

Thanks for the pointers. I'll recheck whether I'm using the wrong client.

You could use the SSSE3 version (instead of only SSE3) on your Core2. The SSE3 binary retains compatibility with AMD CPUs and may be a bit slower.
ID: 17683 · Rating: 0