Welcome to MilkyWay@home

Posts by Len LE/GE

41) Message boards : Number crunching : AMD 4890 still supported? (Message 57733)
Posted 30 Mar 2013 by Len LE/GE
Post:

Exit status -1073741819 (0xffffffffc0000005) Unknown error number

...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x000007FEE52046D5 read attempt to address 0x00000000


Did you reinstall (or repair, if offered) the driver after you changed the card?
If a reinstall or repair doesn't help, try uninstalling the driver, cleaning out the leftovers, and installing it again.
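
The exit status and the exception code in the log above are the same number: the task page shows the exit code as a signed integer, and masking it to 32 bits gives the Windows status code 0xC0000005 (access violation). A minimal Python sketch of that conversion; the function name is made up for illustration:

```python
def as_unsigned32(exit_status: int) -> int:
    """Reinterpret a signed exit status as an unsigned 32-bit value."""
    return exit_status & 0xFFFFFFFF


print(hex(as_unsigned32(-1073741819)))  # prints 0xc0000005
```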
42) Message boards : Number crunching : AMD 4890 still supported? (Message 57668)
Posted 27 Mar 2013 by Len LE/GE
Post:
With Win 7 you should be OK as long as you don't go higher than Cat 13.1. Above that you will get OpenCL problems with the current MW app.
It could help to see a log of a WU that errored out.
43) Message boards : Number crunching : AMD 4890 still supported? (Message 57652)
Posted 26 Mar 2013 by Len LE/GE
Post:
From what I saw, Cat 13.1 still supports the HD 4000 series. But if you are running Win XP, then the latest version with OpenCL (needed for MW) for this OS is 12.1.
44) Message boards : Number crunching : Computation errors (Message 57477)
Posted 11 Mar 2013 by Len LE/GE
Post:
I will try the update, thanks.

One more thing to mention:
I have updated the AMD drivers to the latest beta ones, as I got spontaneous reboots of the system when doing GPU work (Einstein or Milkyway).
Screens blink as well at times.
Seems to be stable now with Einstein and Rosetta for 24 hours.
Win7 64bit and AMD HD5870 (1GB).


You are not the first one with that problem. Others have reported it too (see the thread "After driver update all gpu wu's fail").
Your problem is the driver update.
AFAIK CCC 13.1 throws that warning about "OpenCL extension is now part of core" but still works; CCC 13.2 gives the clBuildProgram failure.

Possibly the trick used to get the old IL kernel running isn't working anymore. If HD 79xx cards don't show these errors, that would point in the same direction.
Or it could simply be a bug that started in CCC 13.1 and got worse in CCC 13.2.
45) Message boards : News : New Separation Runs Started (Message 57475)
Posted 10 Mar 2013 by Len LE/GE
Post:
Got 1 error WU too.
WU 320835109
"Non-finite result
Failed to calculate likelihood"
46) Message boards : Number crunching : Computation errors (Message 57440)
Posted 7 Mar 2013 by Len LE/GE
Post:
Suddenly, all separation tasks I get are erroring out immediately on this computer! 6 tasks, then another 6 tasks, etc.... :O

http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=344187

Even after resetting the project! ...

Is there a "bad" series of WUs, please?
What's happening? :(


Client state Compute error
Exit status -185 (0xffffffffffffff47) ERR_RESULT_START
<core_client_version>7.0.44</core_client_version>
<![CDATA[
<message>
couldn't start app: CreateProcess() failed - A required privilege is not held by the client. (0x522)
</message>
]]>

Your computer has a problem starting the MW app.
Do you have problems with other projects on this computer too?
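
The hexadecimal value at the end of the CreateProcess() message is a Win32 error code. 0x522 is 1314 in decimal, which Windows defines as ERROR_PRIVILEGE_NOT_HELD ("A required privilege is not held by the client"). A small Python sketch of the lookup; the table is hand-written, covers only two codes, and is not a complete list:

```python
# Hand-picked Win32 error codes; an illustrative subset, not the full table.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED",
    1314: "ERROR_PRIVILEGE_NOT_HELD",
}


def describe_win32_error(hex_code: str) -> str:
    """Turn a hex string such as '0x522' into 'decimal NAME'."""
    value = int(hex_code, 16)
    name = WIN32_ERRORS.get(value, "unknown (not in this table)")
    return f"{value} {name}"


print(describe_win32_error("0x522"))  # prints 1314 ERROR_PRIVILEGE_NOT_HELD
```

That suggests a permissions problem on the host rather than a bad batch of WUs.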
47) Message boards : Number crunching : ATI generate some invalids and nVidia not (Message 57335)
Posted 24 Feb 2013 by Len LE/GE
Post:
Your invalid WU is only valid up to 7 or 8 digits and your error WUs are showing lots of "NAN" in the results.
What type of 7970 is it exactly?
"Clock frequency: 1000 Mhz"
Is it factory overclocked, or did you overclock it yourself?
Try setting it back to default clock and see if you still get errors.
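
To make "valid up to 7 or 8 digits" concrete, here is a rough Python sketch that estimates how many significant digits two results share and treats NaN as no agreement at all. The two sample values are invented, and the tolerance the MW validator actually applies is not stated here, so this is only an illustration:

```python
import math


def matching_digits(a: float, b: float) -> float:
    """Approximate number of significant decimal digits on which a and b agree."""
    if math.isnan(a) or math.isnan(b):
        return 0.0  # a NaN result agrees with nothing
    if a == b:
        return math.inf
    relative_diff = abs(a - b) / max(abs(a), abs(b))
    return -math.log10(relative_diff)


# Invented likelihood values from two hypothetical hosts.
mine, wingman = -3.123456789, -3.123456712
print(f"{matching_digits(mine, wingman):.1f} digits")
print(matching_digits(float("nan"), wingman))
```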
48) Message boards : Number crunching : After driver update all gpu wu's fail (Message 57170)
Posted 5 Feb 2013 by Len LE/GE
Post:
On the Lunatics board there are messages about them having problems with OpenCL with the newest ATI drivers too. Sounds like it started with 13.1 and got really bad with 13.2.

I saw a few MW WUs throw the warning "line 30: warning: OpenCL extension is now part of core" but still run through with driver 1084.2 (CCC 13.1?), while you get the additional
clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE
Error building program from source (-11): CL_BUILD_PROGRAM_FAILURE
Error creating integral program from source

with your driver 1124.2 (ccc 13.2)
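
When comparing task logs across driver versions, it can help to pull out just the OpenCL error lines. A small Python sketch, assuming the errors appear in the "(code): CL_NAME" form shown above; the function name is hypothetical:

```python
import re

# Matches fragments such as "(-11): CL_BUILD_PROGRAM_FAILURE".
ERROR_PATTERN = re.compile(r"\((-\d+)\): (CL_[A-Z_]+)")


def opencl_errors(stderr_text: str) -> list[tuple[int, str]]:
    """Return (code, name) pairs for every OpenCL error found in a task log."""
    return [(int(code), name) for code, name in ERROR_PATTERN.findall(stderr_text)]


log = """clBuildProgram: Build failure (-11): CL_BUILD_PROGRAM_FAILURE
Error building program from source (-11): CL_BUILD_PROGRAM_FAILURE
Error creating integral program from source"""

for code, name in opencl_errors(log):
    print(code, name)
```

The compiler warning about the extension being part of core does not match the pattern, so only hard failures are listed.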
49) Message boards : News : 1.06 Errors with Windows Clients (Message 57164)
Posted 4 Feb 2013 by Len LE/GE
Post:
Anxiously applied this fix but am still having computation errors. Any other ideas would be greatly appreciated.


Seems the quick fix didn't work out for you.
Your choices are
- to set MW to NNT (No New Tasks) and, once all tasks have been sent back, reset the project
- or to disable n-body in the preferences until v1.07 is out (expected tomorrow)

In your case I would choose to reset MW to make sure you have a fresh and clean MW folder with no old/bad nbody DLL hanging around.
50) Message boards : News : Nbody 1.06 (Message 57145)
Posted 2 Feb 2013 by Len LE/GE
Post:
I tried the bug-fix, still 6 not validated wu's, and a lot of computation errors from the wingmen.


Seems it's working for you. Your invalid list has only tasks that 'can't validate' because of wingmen with still-faulty DLLs.
I don't know if there is a server option to force a reload of the app and its DLLs without pushing out a v1.07.

51) Message boards : News : Nbody 1.06 (Message 57141)
Posted 2 Feb 2013 by Len LE/GE
Post:
Milkyway distributes a 'versioned' copy of this file with the name (currently) "pthreadGC2_64_nbody_1.06.dll"

AFAIR, the last time I did the automatic download for nbody it did the renaming while downloading. That was so long ago that my memory could be wrong. You are right that the file nowadays comes down to the client with the version number. Personally I prefer using an app_info, so I can set the command line params I want.

It should not be too hard to see the proper name from the faulty DLL; the correct filename is there, only the wrong binary is inside. So it's no rocket science to replace it with a fresh copy from the server. ;)
I see Jeffery posted the proper steps in his announcement, Message 57128.
That makes it clear that the versioned filename is to be used now.

BOINC's soft link 'feature' and its side effects (search path etc.) ... I really don't want to comment on that.
52) Message boards : News : Nbody 1.06 (Message 57126)
Posted 1 Feb 2013 by Len LE/GE
Post:
Instead of resetting the project, you could copy
pthreadGC2_64_nbody_1.06.dll into your MW directory and, after deleting the old pthreadGC2_64.dll, rename the copy to pthreadGC2_64.dll.
Remember to stop MW while doing this.
53) Message boards : Number crunching : ATI generate some invalids and nVidia not (Message 57095)
Posted 31 Jan 2013 by Len LE/GE
Post:
Running an ATI Radeon HD 5850 (cat 12.1) I just checked my application details:
Consecutive valid tasks 15020
for
milkyway_separation 1.02 Windows x86 double OpenCL
54) Message boards : Number crunching : Average credit all over the place (Message 57088)
Posted 30 Jan 2013 by Len LE/GE
Post:
You only get the credits after a WU is validated.
Currently you have a high percentage of WUs that need results from other computers to compare and validate against yours.
With some WUs validated immediately and others having to wait for more results to compare, the graph will never be a flat line. With a low number of results per day you will see big jumps in the graph; with a higher number of results per day this will level out better.
55) Message boards : Number crunching : Got Multiple WU's to run on my GTX 660TI but lost std CPU Apps (Message 56760)
Posted 6 Jan 2013 by Len LE/GE
Post:

Here is the app_info.xml file that made it work:

<app_info>
    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_separation_1.00_windows_x86_64.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>100</version_num>
        <cmdline></cmdline>
        <file_ref>
            <file_name>milkyway_separation_1.00_windows_x86_64.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>

    <app>
        <name>milkyway_nbody</name>
    </app>
    <file_info>
        <name>milkyway_nbody_1.04_windows_x86_64__mt.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <version_num>104</version_num>
        <cmdline></cmdline>
        <file_ref>
            <file_name>milkyway_nbody_1.04_windows_x86_64__mt.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>

    <app>
        <name>milkyway</name>
    </app>
    <file_info>
        <name>milkyway_separation_1.02_windows_x86_64__opencl_nvidia.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>milkyway</app_name>
        <version_num>102</version_num>
        <flops>2.0e11</flops>
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>0.45</count>
        </coproc>
        <cmdline></cmdline>
        <file_ref>
            <file_name>milkyway_separation_1.02_windows_x86_64__opencl_nvidia.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>


Your nbody section is missing the references to libgomp_64-1_nbody_1.04.dll and pthreadGC2_64_nbody_1.04.dll. While you are at it, you should add the mt plan class. I would also define the number of threads to use; in the past that helped to force BOINC to make use of multithreading.

<app>
    <name>milkyway_nbody</name>
</app>
<file_info>
    <name>milkyway_nbody_1.04_windows_x86_64__mt.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>libgomp_64-1_nbody_1.04.dll</name>
    <executable/>
</file_info>
<file_info>
    <name>pthreadGC2_64_nbody_1.04.dll</name>
    <executable/>
</file_info>
<app_version>
    <app_name>milkyway_nbody</app_name>
    <version_num>104</version_num>
    <plan_class>mt</plan_class>
    <!-- the following 3 lines are setting 4 threads per WU -->
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <cmdline>--nthreads=4 --disable-opencl</cmdline>
    <file_ref>
        <file_name>milkyway_nbody_1.04_windows_x86_64__mt.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>libgomp_64-1_nbody_1.04.dll</file_name>
        <open_name>libgomp_64-1.dll</open_name>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>pthreadGC2_64_nbody_1.04.dll</file_name>
        <open_name>pthreadGC2_64.dll</open_name>
        <copy_file/>
    </file_ref>
</app_version>

You might try it with and without the --disable-opencl switch.
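
Before restarting BOINC with an edited app_info.xml, a quick consistency check can catch one class of mistake: a file that is referenced but not declared, or declared but never referenced. The Python sketch below only cross-checks the XML against itself. It cannot know which DLLs an application needs, and the sample file names in it are placeholders:

```python
import xml.etree.ElementTree as ET


def check_app_info(xml_text: str) -> dict[str, list[str]]:
    """Cross-check <file_ref> entries against <file_info> entries."""
    root = ET.fromstring(xml_text)
    declared = [(fi.findtext("name") or "").strip() for fi in root.iter("file_info")]
    referenced = [(fr.findtext("file_name") or "").strip() for fr in root.iter("file_ref")]
    return {
        # used by an <app_version> but never declared in a <file_info>
        "undeclared": [name for name in referenced if name not in declared],
        # declared in a <file_info> but not used by any <app_version>
        "unreferenced": [name for name in declared if name not in referenced],
    }


# Placeholder document: app.exe is fine, helper.dll lacks a <file_info>,
# and spare.dll is declared but never referenced.
sample = """
<app_info>
  <file_info><name>app.exe</name><executable/></file_info>
  <file_info><name>spare.dll</name><executable/></file_info>
  <app_version>
    <file_ref><file_name>app.exe</file_name><main_program/></file_ref>
    <file_ref><file_name>helper.dll</file_name><copy_file/></file_ref>
  </app_version>
</app_info>
"""
print(check_app_info(sample))
```

To check a real file, read its contents and pass the text to check_app_info(); the full file with its <app_info> root element is required, not a fragment.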
56) Message boards : Number crunching : project work generation question (Message 56605)
Posted 22 Dec 2012 by Len LE/GE
Post:
That number is somewhat misleading. 395 WUs are gone faster than you can read the first digit.
New workunits are generated based on the results that are coming in and are queued in 'ready to send'. Generation is started manually for a WU type and then runs automatically. We have problems if the validator isn't working ('waiting for validation' jumps up big time) or the WU generator can't keep up with the demand.
Currently the demand is high (~320k WUs downloaded), so the 'ready to send' queue got a bit small (it seems stable at ~400 right now), but new WUs are steadily generated as the results come back, and the generator is still able to keep up with the demand.
The number of WUs per core and per GPU is limited for each computer to prevent single computers from grabbing tons of WUs while others can't get any. As long as your queue gets loaded with this maximum number of WUs without problems, you don't need to worry. I haven't seen a message like "Project has no tasks available" for MW separation in my logs for a while.

57) Message boards : Number crunching : GTX670's and the MilkyWay project (Message 56555)
Posted 17 Dec 2012 by Len LE/GE
Post:
Here is one of the WU's that failed.

Task 361228420, WU 28093553.

If I need to copy and paste in the task details, please let me know and I will do it.


Here is the relevant part of 1 task log:

Found 2 platforms
Platform 0 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Platform 1 information:
Name: NVIDIA CUDA
Version: OpenCL 1.1 CUDA 4.2.1
Vendor: NVIDIA Corporation
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 2 CL devices
Device 'GeForce GTX 670' (NVIDIA Corporation:0x10de) (CL_DEVICE_TYPE_GPU)
Driver version: 306.97
Version: OpenCL 1.1 CUDA
Compute capability: 3.0
Max compute units: 7
Clock frequency: 1045 Mhz
Global mem size: 2147483648
Local mem size: 49152
Max const buf size: 65536
Double extension: cl_khr_fp64
Error creating context (-5): CL_OUT_OF_RESOURCES
Error getting device and context (-5): CL_OUT_OF_RESOURCES

Are you running other GPU projects at the same time on the same card?
If not, then something else is using too many resources on that card, not leaving enough room for MW to run.
58) Message boards : Number crunching : Finally, BOINC 7.0.40 and Intel GPU detection! Application coming? (Message 56457)
Posted 10 Dec 2012 by Len LE/GE
Post:
He is talking about the Intel® Core™ i7-3610QM and its integrated graphics, detected by BOINC 7.0.40.
Sounds like BOINC versions before this did not detect it.
59) Message boards : Number crunching : Using other path... (Message 56389)
Posted 5 Dec 2012 by Len LE/GE
Post:
Below the SSE2 path there is only x87, so there is nothing going wrong. I think cruncher said there wasn't really a difference between SSE and X87 with the functions he was giving to Matt.

Gipsel used the library functions and the MS compiler for the X87 and SSE version because the Intel compiled versions for these older CPUs were slower (but for SSE2 and up the Intel compiled versions were running faster).
From an old PM from Gipsel:
- Athlon X2 CPUs with the SSE2/3 app are almost twice as fast (at same clock) as the SSE app on AthlonXPs
- with SSE2 AthlonX2 shows only about half of the performance of a Phenom or Core2 if the code is vectorized properly (otherwise there would be no difference)

Hope that gives you some idea about the speed differences between the different versions and CPUs, even though the above is for the old and outdated apps.

I think Matt is using a current MS compiler, but with other libraries and some important functions replaced by functions from cruncher.

Started MW on an old MP2200 again to see what runtimes I get with the current app on it. As expected, the stderr shows 'other path'. With less than 1% done it looks like 50 hrs, with MW using around 80% CPU and 20% used by other programs running. So your 42 hrs seem reasonable for this type of CPU.
Will give you some better runtime estimates tomorrow after I have more of this WU done.
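
The comparison with 42 hrs can be made explicit: 50 hrs of wall-clock time at roughly 80% CPU corresponds to about 40 hrs of CPU time. A sketch of that arithmetic in Python; the elapsed time and progress values are made-up inputs chosen to reproduce the 50 hr projection, not measurements from the machine:

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation of total wall-clock time from the progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def cpu_hours(wall_clock_hours: float, cpu_share: float) -> float:
    """CPU time consumed when the task only gets a share of the core."""
    return wall_clock_hours * cpu_share


# Hypothetical reading: 0.45 h elapsed at 0.9 % progress, MW at 80 % CPU.
total = projected_total_hours(0.45, 0.009)
print(round(total), round(cpu_hours(total, 0.80)))  # prints 50 40
```

An extrapolation from less than 1% progress is rough, which is why a later reading gives a better estimate.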
60) Message boards : Number crunching : Crunch with Ati 4870 & 6870 in the same computer (Message 56366)
Posted 3 Dec 2012 by Len LE/GE
Post:
For the 4870 you need Cat 12.4; the OpenCL drivers in newer Cat versions do NOT support that card anymore. OpenCL in version 12.6 and later needs at least a 5xxx series card.


