Recompiled Linux 32/64 apps
Author | Message |
---|---|
Send message Joined: 28 Aug 07 Posts: 35 Credit: 88,813,415 RAC: 5,082 |
OK, on one Linux PC it just said SSE2, but on Windows CPU-Z says SSE3, so I guess I am fine and do not need an SSE2 version. I am not up on all this stuff :) Thanks |
Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
No worries, judging by your stats for many projects you're struggling along quite well. A few suitable ATI cards in some of those boxes and you'll be in orbit. :) In case I didn't make myself clear before, Linux will not report SSE3 but pni instead. pni=SSE3. I don't know what CPU-Z shows, as I haven't got a working SSE3 capable computer. With perfect timing my X3350 computer with a HD3850 failed a few days before the ATI 32-bit Windows teaser application was released. |
Send message Joined: 28 Aug 07 Posts: 35 Credit: 88,813,415 RAC: 5,082 |
Turns out that Ubuntu (cat /proc/cpuinfo) shows it as ssse3, and the optimized SSSE3 app works fine, well, so far anyway. |
Send message Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Yes, that's fine. SSSE3 is different to SSE3 though: it has some extra instructions and an extra "S". :) In /proc/cpuinfo, SSE3 shows up as pni and SSSE3 as ssse3. As mentioned in an earlier post in this thread, an older Linux kernel may not report the most recent flags. This may explain why cat /proc/cpuinfo reports the same CPU differently on different boxes if you are running older and newer versions of Linux on them. Not sure of the difference in speed between the SSE3 and SSSE3 versions; I've only ever tried the Linux64 SSE4.1. The main thing is you are crunching OK. |
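For anyone who wants to check the flags without reading the whole cpuinfo dump, here is a minimal Python sketch. It assumes an x86 Linux box where /proc/cpuinfo has a "flags" line; the function name `sse_levels` and the mapping table are made up for illustration, and the table only covers the levels discussed in this thread.

```python
# Map /proc/cpuinfo flag names to the SSE level names used in this thread.
# Linux reports SSE3 as "pni", so that is the flag to look for.
FLAG_TO_LEVEL = [
    ("sse2", "SSE2"),
    ("pni", "SSE3"),
    ("ssse3", "SSSE3"),
    ("sse4_1", "SSE4.1"),
]


def sse_levels(cpuinfo_text):
    """Return the SSE levels advertised by the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return [level for flag, level in FLAG_TO_LEVEL if flag in flags]
    return []  # no flags line found (e.g. a non-x86 system)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(sse_levels(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

If the list stops at SSE3 on a CPU that should have SSSE3, an older kernel that does not know the newer flag is one possible explanation, as described above.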
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
"Not sure of the difference in speed between SSE3 and SSSE3 versions..." It should be zilch, since it's unlikely that the compiler will find opportunities in MW code to fit the multimedia-like SSSE3 instructions. HTH |
Send message Joined: 22 Feb 08 Posts: 260 Credit: 57,387,048 RAC: 0 |
"Not sure of the difference in speed between SSE3 and SSSE3 versions..." Right, not much difference, but a higher SSE level surely looks faster. :D mic. |
Send message Joined: 8 Apr 08 Posts: 45 Credit: 161,943,995 RAC: 0 |
Is there any plan for a Linux app for ATI GPUs in the pipeline? quad or i7 |
Send message Joined: 6 Jun 08 Posts: 8 Credit: 152,709 RAC: 0 |
Yes, release an ATI app for Linux, please. CUDA for all systems is under construction now, but it makes sense to make an app for the stronger cards first, and ATI is the one. I hate running my machine under Windows just to be able to run Milky on the GPU. :-( And I do not think there are more guys with NVidia GTS/GTX 200 cards than Linux users with ATI 38xx/48xx. |
Send message Joined: 9 Mar 09 Posts: 37 Credit: 37,538,556 RAC: 0 |
Still a little disappointed with the Linux apps. I have SSE3 running on an E6400 @ 2.4 GHz test machine ... -rw-r--r-- 1 root root 594 2009-02-18 21:33 app_info.xml -rw-r--r-- 1 root root 35127 2009-01-19 20:07 GPL.txt -rwxr-xr-x 1 root root 1222624 2009-02-18 21:25 milkyway_0.18_sse3_i686-pc-linux-gnu This yields a WU time of 35 minutes, whereas the Windows optimized app yields a time of 20 minutes on a slower processor (1.8 GHz T2390). The difference seems a lot to me, or is this typical? |
Send message Joined: 18 Feb 09 Posts: 8 Credit: 2,424,453 RAC: 0 |
After reading that the Intel compiler flag "-fp-model fast=2" was being used on these applications, I figured I would run a test, and the results are as expected. The Linux 64-bit SSSE3 0.18d mkII app downloaded from zslip.com gives invalid results. I suspect that the same flag was used on the other apps, too. The flag "-fp-model precise" needs to be used to get the fitness in line with the expected output. This app's output: [admin@ntellx4 test_files]$ ./milkyway_0.18_SSSE3_x86_64-pc-linux-gnu [admin@ntellx4 test_files]$ more out searchname parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501 metadata: this is the metadata fitness: -3.027909854710229 speedimic_SSSE3_64: 0.18 Correct output: 86: searchname parameters [8]: 0.733171635575244 14.657212876628332 -1.705465347395041 16.911711745343634 28.077212666463502 -1.203290851581461 3.527360643924728 2.224821450587501 metadata: this is the metadata fitness: -3.027909854710189 your_app_name: 0.18 Travis, I'm starting to think a quorum should be used as in other BOINC projects, since the code is open source. |
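As background on why a fast floating-point model can shift the last few digits: floating-point addition is not associative, so a compiler that is allowed to reorder additions can legitimately produce a slightly different sum. The Python sketch below only illustrates that general effect with three hand-picked values; it is not the MilkyWay code and says nothing about what the Intel compiler actually did to this particular binary.

```python
# The same three numbers added in two different groupings.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # grouping a strict evaluation order would keep
regrouped = a + (b + c)       # grouping a reordering optimizer might choose

print(repr(left_to_right))
print(repr(regrouped))
print("identical:", left_to_right == regrouped)
print("difference:", abs(left_to_right - regrouped))
```

The two results differ only in the last bit or so, but a fitness value is the result of a very large number of such additions, so differences of this kind can accumulate into the trailing digits.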
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
"The Linux 64bit SSSE3 0.18d mkII app downloaded from zslip.com gives invalid results." No. Travis has given a range for the allowed deviation from the result you quoted, and speedimic's results are within those requirements. You also get small deviations between the results using the x87 FPU, SSE2, PowerPC FPU or AltiVec. As long as they are small enough, it is okay. |
Send message Joined: 18 Feb 09 Posts: 8 Credit: 2,424,453 RAC: 0 |
If that is the case, then why does Travis have a thread about testing custom apps, saying "The results should look like the following:" instead of "should be within a deviation of the following:"? I'm sure others would like to know what this allowed deviation is. The result posted above trades some error for about a 50% speed increase. That's pretty major. |
Send message Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
He did post a while ago that the results must match to the xth decimal place; I believe it was the 10th. Those results are equal to the 12th. Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
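The "equal to the 12th" claim can be checked directly from the two fitness values posted earlier in the thread. This is a small Python sketch; the helper names are invented here, and the 10-decimal-place tolerance is only the figure recalled in the post above, not a confirmed project rule.

```python
from decimal import Decimal


def matching_decimals(a, b):
    """Count the leading decimal places on which two decimal strings agree."""
    int_a, frac_a = a.split(".")
    int_b, frac_b = b.split(".")
    if int_a != int_b:
        return 0
    count = 0
    for digit_a, digit_b in zip(frac_a, frac_b):
        if digit_a != digit_b:
            break
        count += 1
    return count


def within_places(a, b, places):
    """True if the absolute difference is below 10**-places (assumed rule)."""
    return abs(Decimal(a) - Decimal(b)) < Decimal(10) ** -places


optimized = "-3.027909854710229"   # fitness from the SSSE3 app
reference = "-3.027909854710189"   # fitness from the reference output

print(matching_decimals(optimized, reference))
print(abs(Decimal(optimized) - Decimal(reference)))
print(within_places(optimized, reference, 10))
```

The values are kept as strings and compared with `Decimal` so that the check itself does not add any binary floating-point rounding of its own. Counting matching digits is only a quick sanity check; the difference test is the more robust way to express a tolerance.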
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
"Still a little disappointed with the linux apps..." It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here. HTH |
Send message Joined: 9 Mar 09 Posts: 37 Credit: 37,538,556 RAC: 0 |
"It might be because Linux manages power differently from Windows, running BOINC applications at a slow CPU frequency in order to save energy. See more details here." Afraid not: SpeedStep and thermal throttling are both disabled in the BIOS, a solid overclock is applied, and the powernow/acpi daemons are not running once booted. The Linux client is simply inefficient compared to win32. |
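One way to rule frequency scaling in or out from inside the running system is to read the kernel's cpufreq files while a task is crunching. The Python sketch below assumes the usual sysfs location /sys/devices/system/cpu/cpuN/cpufreq; the function name `read_cpufreq` is made up for illustration, and on a machine with scaling disabled or no cpufreq driver loaded the files may simply be missing.

```python
from pathlib import Path

FIELDS = ("scaling_governor", "scaling_cur_freq", "scaling_max_freq")


def read_cpufreq(cpu=0, root="/sys/devices/system/cpu"):
    """Return the cpufreq settings for one CPU, or None for missing files."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    info = {}
    for name in FIELDS:
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            info[name] = None  # no cpufreq driver, or not a Linux system
    return info


if __name__ == "__main__":
    for name, value in read_cpufreq().items():
        print(f"{name}: {value}")
```

If the current frequency stays well below the maximum while BOINC is at full load, scaling is worth a closer look; if all three come back as None, scaling is probably not active and the explanation lies elsewhere.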
Send message Joined: 6 Sep 07 Posts: 66 Credit: 636,861 RAC: 0 |
"The linux client is simply inefficient compared to win32." Different compilers perhaps? |
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
"The linux client is simply inefficient compared to win32." I thought the fastest version from speedimic was quite close. Just looked it up: speedimic's own Q9550 (45nm, 2.83GHz) is taking about 1030 seconds for a 27.77 credit WU under Linux. He has another host (Q6600, 65nm, apparently overclocked to 2.7-2.8GHz judging from the benchmark values) completing the same tasks in about 1150 seconds under Windows. It is a somewhat poor comparison because of the different clock speeds and because the 45nm CPUs should be faster per clock than their 65nm counterparts, but nonetheless it is clear that his Linux version can't be that bad. |
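To put the different clock speeds on a common footing, run time can be multiplied by clock frequency to get a rough cycle count per task. The Python sketch below does that for the figures quoted in this thread; the helper name `gigacycles` is invented here, and the comparison assumes the tasks are of comparable size and ignores per-clock differences between CPU generations, so treat it as a rough guide only.

```python
def gigacycles(seconds, clock_ghz):
    """Rough CPU work for one task: run time multiplied by clock speed."""
    return seconds * clock_ghz


# Figures quoted in the post above (the Q6600 clock is an estimated range).
linux_q9550 = gigacycles(1030, 2.83)
windows_q6600_low = gigacycles(1150, 2.7)
windows_q6600_high = gigacycles(1150, 2.8)

# Figures from the earlier E6400 (Linux) vs T2390 (Windows) report.
linux_e6400 = gigacycles(35 * 60, 2.4)
windows_t2390 = gigacycles(20 * 60, 1.8)

print(f"Q9550 Linux:   {linux_q9550:.0f} Gcycles per task")
print(f"Q6600 Windows: {windows_q6600_low:.0f} to {windows_q6600_high:.0f} Gcycles per task")
print(f"E6400 Linux / T2390 Windows: {linux_e6400 / windows_t2390:.2f}x")
```

On these numbers the Q9550 under Linux needs about 2915 gigacycles per task against 3105 to 3220 for the Q6600 under Windows, while the E6400 report works out to roughly 2.3 times the cycles of the Windows laptop. A gap that large fits the later suggestion to check which binary is actually being run.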
Send message Joined: 9 Mar 09 Posts: 37 Credit: 37,538,556 RAC: 0 |
"The linux client is simply inefficient compared to win32." Here are my tasks .... Laptop win32 : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39511370 Linux : http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=39523720 Thanks for the pointers, I'll recheck whether I'm using the wrong client. |
Send message Joined: 9 Mar 09 Posts: 37 Credit: 37,538,556 RAC: 0 |
OK, the Q6600 is closer in architecture to my CPU than the Q9550 is. He's using SSE4.1 on the Q9550. Odd, he has a different client from mine on his Q6600. Any ideas where I can get it from? |
Send message Joined: 26 Jul 08 Posts: 627 Credit: 94,940,203 RAC: 0 |
"Here are my tasks ...." You could use the SSSE3 version (instead of only SSE3) on your Core2. The SSE3 binary retains compatibility with AMD CPUs and may be a bit slower. |