Welcome to MilkyWay@home

getting errors with new v1.02 separation application?


Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53303 - Posted: 19 Feb 2012, 18:05:24 UTC

like the title says, v1.02 tasks are all erroring out pretty much as soon as they start. the longest only ran for ~3 seconds. i followed the Separation updated to 1.00 thread very closely to see which ATI driver might work best for my combination of hardware and software, but have had no success thus far. i first started by looking for folks w/ 58xx series GPUs, most of which were running on a Win7 x64 platform...so i had to narrow those search results down to WinXP 32-bit platforms, of which i only found one belonging to RAMen. i had a look at some of his validated results to see what a successful Stderr output should look like under the new v1.02 application - a typical successful result looks like this:

Stderr output

<core_client_version>7.0.12</core_client_version>
<![CDATA[
<stderr_txt>
BOINC: parse gpu_opencl_dev_index 0
<search_application> milkyway_separation 1.02 Windows x86 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 20
Skipping: /max_gfx_cpu_pct
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE4.1 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 1.1 AMD-APP-SDK-v2.5 (732.1)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 0 on platform 0
Found 1 CL device
Device 'Cypress' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Driver version: CAL 1.4.1546
Version: OpenCL 1.1 AMD-APP-SDK-v2.5 (732.1)
Compute capability: 0.0
Max compute units: 18
Clock frequency: 765 Mhz
Global mem size: 536870912
Local mem size: 32768
Max const buf size: 65536
Double extension: cl_khr_fp64
Build log:
--------------------------------------------------------------------------------
LOOP UNROLL: pragma unroll (line 288)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 280)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 273)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 244)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 202)
Unrolled as requested!

--------------------------------------------------------------------------------
Using AMD IL kernel
Binary status (0): CL_SUCCESS
Estimated AMD GPU GFLOP/s: 2203 SP GFLOP/s, 441 DP FLOP/s
Using a target frequency of 60.0
Using a block size of 4608 with 69 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 7
Num chunks: 8
Chunk size: 317952
Added area: 303616
Effective area: 2543616
Initial wait: 14 ms
Could not load Ktm32.dll (126): The specified module could not be found.

Integration time: 74.807670 s. Average time per iteration = 116.886984 ms
Integral 0 time = 76.035948 s
Running likelihood with 109999 stars
Likelihood time = 2.222093 s
<background_integral> 0.000386955133682 </background_integral>
<stream_integral> 2.648622095744236 100.520898384170820 </stream_integral>
<background_likelihood> -3.072900153659742 </background_likelihood>
<stream_only_likelihood> -48.038597437283549 -107.797345431890760 </stream_only_likelihood>
<search_likelihood> -3.012478096868883 </search_likelihood>
22:12:36 (4628): called boinc_finish

</stderr_txt>
]]>
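
The chunking figures in the successful log above are consistent with each other under a simple relationship. The following is a minimal Python sketch of that arithmetic, reverse-engineered from the printed numbers only; it is an assumption about how the values relate, not the application's actual code, and the function name `chunk_layout` is made up for illustration.

```python
import math

def chunk_layout(mu_steps, r_steps, block_size, blocks_per_chunk):
    """Recompute the chunking figures printed in the stderr log."""
    iteration_area = mu_steps * r_steps
    chunk_size = block_size * blocks_per_chunk
    # Round up so that the chunks cover the whole iteration area.
    num_chunks = math.ceil(iteration_area / chunk_size)
    effective_area = num_chunks * chunk_size
    return {
        "iteration_area": iteration_area,
        "chunk_size": chunk_size,
        "num_chunks": num_chunks,
        "effective_area": effective_area,
        "added_area": effective_area - iteration_area,
    }

# Values taken from the log: mu_steps = 1600, r_steps = 1400,
# block size 4608 with 69 blocks/chunk.
layout = chunk_layout(mu_steps=1600, r_steps=1400, block_size=4608, blocks_per_chunk=69)
print(layout)
```

With the inputs from the log, this reproduces the reported iteration area (2240000), chunk size (317952), number of chunks (8), effective area (2543616), and added area (303616).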



here is what one of my v1.02 errors looks like:

Stderr output

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
<search_application> milkyway_separation 1.02 Windows x86 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 0
Skipping: /max_gfx_cpu_pct
Unrecognized XML in project preferences: apps_selected
Skipping: app_id
Skipping: /apps_selected
Guessing preferred OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE3 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 1.1 AMD-APP (851.4)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 1 on platform 0
Found 1 CL device
Requested device is out of range of number found devices
Failed to select a device (1): MW_CL_ERROR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood

<background_integral> 1.#QNAN0000000000 </background_integral>
<stream_integral> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_integral>
<background_likelihood> 1.#QNAN0000000000 </background_likelihood>
<stream_only_likelihood> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_only_likelihood>
<search_likelihood> 1.#QNAN0000000000 </search_likelihood>
00:35:40 (2596): called boinc_finish

</stderr_txt>
]]>


the "Using device 1 on platform 0" line marks the point at which my output starts to differ from a successful result. of all the platforms crunching w/ a 58xx series GPU that i looked at (regardless of OS), they seem to be "using device 0 on platform 0," whereas mine seems to be "using device 1 on platform 0." this makes sense to me b/c my device 0 is actually the motherboard's HD4290 IGP, while device 1 is my HD 5870 GPU...however i think this device labeling system is causing a GPU recognition problem. the successful platforms are finding the "Cypress" GPU after detecting 1 CL device, whereas mine finds 1 CL device, and then claims that the "requested device is out of range of number found devices."
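
The comparison described above can be checked mechanically. The following is a minimal Python sketch, not part of the MilkyWay@home application: the function name `check_device_selection` is hypothetical, and the regular expressions assume the exact stderr wording shown in the outputs in this thread.

```python
import re

def check_device_selection(stderr_text):
    """Return (requested_index, devices_found, in_range) parsed from stderr text.

    Assumes the line formats seen in this thread:
    'Using device N on platform M' and 'Found K CL device(s)'.
    Returns None if either line is missing.
    """
    requested = re.search(r"Using device (\d+) on platform \d+", stderr_text)
    found = re.search(r"Found (\d+) CL devices?", stderr_text)
    if requested is None or found is None:
        return None
    index = int(requested.group(1))
    count = int(found.group(1))
    # Device indices are zero-based, so a valid index must be below the count.
    return index, count, index < count

failing = "Using device 1 on platform 0\nFound 1 CL device\n"
working = "Using device 0 on platform 0\nFound 1 CL device\n"
print(check_device_selection(failing))  # (1, 1, False)
print(check_device_selection(working))  # (0, 1, True)
```

For the failing output, index 1 is requested while only one device (index 0) was found, which is the condition the "out of range" message reports.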

i've gotten this same error and same Stderr output w/ Catalyst drivers 11.8, 11.10, and 12.1. if there is at least one other participant who is able to crunch v1.02 tasks successfully w/ his 58xx series GPU on WinXP x32, then i don't see why i shouldn't be able to...i just have to get to the bottom of this error...

does anyone have a clue what might be going on here and what i might have to do in order to fix it? i've reverted back to v0.82 for the time being, but i know it's only a matter of time before it gets deprecated for good. at that point, i'll no longer be able to crunch for MW@H unless i can get the v1.02 tasks running error-free.

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 53304 - Posted: 19 Feb 2012, 18:28:06 UTC

Are you still using the on-board graphics to crunch at another project?

If not, you might try disabling it in the cc_config.xml file and see if that might help.

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 53306 - Posted: 19 Feb 2012, 18:30:17 UTC - in response to Message 53303.  

The reason is how BOINC handles device indexing. If you look, the first one is using BOINC 7 and the second one with the error is using 6.12.34. Upgrade to BOINC 7 (I think 7.0.15 is the newest), or since you are using app_info already you could add <cmdline> --device 0</cmdline> to force it to use that GPU.

You have a kind of weird case where you have 2 GPUs that both support CAL but only 1 supports OpenCL.

The 4290 is based on an R600 core and doesn't support OpenCL. The older version of BOINC will give a device index of 1 to use the other GPU based on the CAL detection which would include that. The OpenCL device BOINC provides in 7 is correct and uses the OpenCL capable GPU.
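
The index mismatch described above can be illustrated with a small Python sketch. It is an illustration only, not BOINC's or the application's code: `opencl_index_for` is a hypothetical helper, and it assumes OpenCL enumerates the capable devices in the same relative order as CAL.

```python
def opencl_index_for(cal_index, devices):
    """Translate a CAL device index into an OpenCL device index.

    `devices` is a list of (name, supports_opencl) tuples in CAL order.
    Returns None when the CAL device has no OpenCL support.
    Assumption: OpenCL keeps the CAL ordering and simply skips
    devices that cannot run OpenCL.
    """
    _, supports_opencl = devices[cal_index]
    if not supports_opencl:
        return None
    # Count only the OpenCL-capable devices that come before this one.
    return sum(1 for _, ok in devices[:cal_index] if ok)

# The host in this thread: the IGP is CAL device 0, the HD 5870 is CAL device 1.
devices = [("HD 4290 (IGP)", False), ("HD 5870 (Cypress)", True)]
print(opencl_index_for(1, devices))  # 0
print(opencl_index_for(0, devices))  # None
```

Under that assumption, the HD 5870 is device 1 for CAL but device 0 for OpenCL, so passing the CAL index straight to an OpenCL application asks for a device that does not exist.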

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 53307 - Posted: 19 Feb 2012, 18:30:42 UTC - in response to Message 53304.  

The GPU exclusion should also work with 6.12

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53308 - Posted: 19 Feb 2012, 18:30:45 UTC - in response to Message 53304.  

Are you still using the on-board graphics to crunch at another project?

If not, you might try disabling it in the cc_config.xml file and see if that might help.

no, i don't use the IGP to crunch at all. and yes, it is already ignored by BOINC via the cc_config.xml file.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53310 - Posted: 19 Feb 2012, 18:39:09 UTC - in response to Message 53306.  

The reason is how BOINC handles device indexing. If you look, the first one is using BOINC 7 and the second one with the error is using 6.12.34. Upgrade to BOINC 7 (I think 7.0.15 is the newest), or since you are using app_info already you could add <cmdline> --device 0</cmdline> to force it to use that GPU.

i was hoping to avoid changing my version of BOINC since v6.12.34 has worked so well for me for quite some time. but if that's what it'll take, then i'll do that later tonight and report back. i just hope it doesn't wreak havoc on the functionality of all the other projects i participate in...

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53326 - Posted: 20 Feb 2012, 4:31:36 UTC
Last modified: 20 Feb 2012, 4:37:00 UTC

ok, so i gave BOINC v7.0.15 a try and had no luck. the error seems to be of the same nature as the ones i was getting w/ v1.02 tasks on BOINC v6.12.34:

Stderr output

<core_client_version>7.0.15</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
BOINC: parse gpu_opencl_dev_index 0
<search_application> milkyway_separation 1.02 Windows x86 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 0
Skipping: /max_gfx_cpu_pct
Unrecognized XML in project preferences: apps_selected
Skipping: app_id
Skipping: /apps_selected
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE3 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 1.1 AMD-APP (851.4)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 1 on platform 0
Found 1 CL device
Requested device is out of range of number found devices
Failed to select a device (1): MW_CL_ERROR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
<background_integral> 1.#QNAN0000000000 </background_integral>
<stream_integral> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_integral>
<background_likelihood> 1.#QNAN0000000000 </background_likelihood>
<stream_only_likelihood> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_only_likelihood>
<search_likelihood> 1.#QNAN0000000000 </search_likelihood>
22:51:55 (308): called boinc_finish

</stderr_txt>
]]>

by the way, i had to detach and reattach to the project to even get a task in the first place. prior to that, i was getting the "not reporting or requesting tasks" crap in the event log. at that point i thought i would give BOINC v7.0.12 a try, as that is the version that is working for RAMen's 58xx series GPU WinXP 32-bit system...but again, no luck. i also had to detach and reattach from the project again to get a task and avoid the "not reporting or requesting tasks" message in the event log. and again, the error seems to be of the same nature as before:

Stderr output

<core_client_version>7.0.12</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
BOINC: parse gpu_opencl_dev_index 1236176
<search_application> milkyway_separation 1.02 Windows x86 double OpenCL </search_application>
Unrecognized XML in project preferences: max_gfx_cpu_pct
Skipping: 0
Skipping: /max_gfx_cpu_pct
Unrecognized XML in project preferences: apps_selected
Skipping: app_id
Skipping: /apps_selected
BOINC GPU type suggests using OpenCL vendor 'Advanced Micro Devices, Inc.'
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE3 path
Found 1 platform
Platform 0 information:
Name: AMD Accelerated Parallel Processing
Version: OpenCL 1.1 AMD-APP (851.4)
Vendor: Advanced Micro Devices, Inc.
Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Profile: FULL_PROFILE
Using device 1 on platform 0
Found 1 CL device
Requested device is out of range of number found devices
Failed to select a device (1): MW_CL_ERROR
Failed to get information about device
Error getting device and context (1): MW_CL_ERROR
Failed to calculate likelihood
<background_integral> 1.#QNAN0000000000 </background_integral>
<stream_integral> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_integral>
<background_likelihood> 1.#QNAN0000000000 </background_likelihood>
<stream_only_likelihood> 1.#QNAN0000000000 1.#QNAN0000000000 </stream_only_likelihood>
<search_likelihood> 1.#QNAN0000000000 </search_likelihood>
23:18:26 (308): called boinc_finish

</stderr_txt>
]]>


i'm not sure what to try next. i will say that i added the device argument to the app_info.xml, but that changed nothing, and the stderr output error was the same. i also maintain that it's just as big a problem that my host doesn't want to report or request tasks now that i've updated to BOINC >= v7.x.xx, and that i have to detach and reattach just to get a single task.

any ideas?

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53328 - Posted: 20 Feb 2012, 12:46:34 UTC

*UPDATE*

i let the system crunch overnight on some other projects with BOINC v7.0.15 and Catalyst 12.1 since i couldn't get MW@H running, and as i suspected, switching to BOINC v7.0.15 has had some adverse effects on my other projects. specifically, Einstein@Home crunches just fine on the CPU, and Collatz crunches just fine on the GPU, but neither of them reported any completed tasks overnight, nor did my host request any new work from either project. so it would appear that MW@H isn't the only project that has seen the "not reporting or requesting tasks" message in the event log since switching to BOINC v7.0.15.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 53335 - Posted: 20 Feb 2012, 13:50:14 UTC
Last modified: 20 Feb 2012, 13:50:30 UTC

Since it's giving errors on Device 1, I assume that is the card to exclude. So you tried the following cc_config.xml with a BOINC 7.0.xx version and it still didn't work?

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>

You know about the different work buffer system of BOINC 7.0.xx versions? Connect about every x.xx days has now effectively become Minimum work buffer. In fact in the later 7.0.xx versions it has been renamed. If you leave it at 0 days which was previously recommended for an always on connection it will not download any new tasks until your cache is empty. With BOINC 7.0.15 I use a value of 1 day for Minimum work buffer and 0.1 days for Max additional work buffer. Due to unreliable work availability on another project I also use report_results_immediately in my cc_config.xml file.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53336 - Posted: 20 Feb 2012, 14:43:01 UTC - in response to Message 53335.  

Since it's giving errors on Device 1, I assume that is the card to exclude. So you tried the following cc_config.xml with a BOINC 7.0.xx version and it still didn't work?

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>

actually no - i used a GPU inclusion argument in the project's app_info.xml file (<cmdline> --device 1</cmdline>), not a GPU exclusion argument in the cc_config.xml file. and actually device 1 is not the device to ignore b/c that's my double precision-capable 5870. i suppose i should give this a try though, only i'll exclude device 0 instead of device 1. i should note that i already have an <ignore_ati_dev>0</ignore_ati_dev> argument in the cc_config.xml file so that BOINC ignores my motherboard's integrated HD 4290 GPU.


You know about the different work buffer system of BOINC 7.0.xx versions? Connect about every x.xx days has now effectively become Minimum work buffer. In fact in the later 7.0.xx versions it has been renamed. If you leave it at 0 days which was previously recommended for an always on connection it will not download any new tasks until your cache is empty. With BOINC 7.0.15 I use a value of 1 day for Minimum work buffer and 0.1 days for Max additional work buffer. Due to unreliable work availability on another project I also use report_results_immediately in my cc_config.xml file.

i read something about that, but i don't think that's what's affecting my ability to get new work. first of all, my "connect every x.xx days" was set to 0.10, not 0...and on top of that, i ran clean out of Collatz work overnight, and my host still didn't have any new Collatz work in the morning. so i met BOINC v7.0.15's requirement of running the Collatz project cache dry before requesting new tasks, and still it didn't fetch any...

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 53337 - Posted: 20 Feb 2012, 15:05:59 UTC
Last modified: 20 Feb 2012, 15:12:14 UTC

Hmm, that's strange because the Stderr you posted of the error task is saying Device 1 is being used or rather trying to be used. Perhaps BOINC and CAL applications are identifying Device 0 and Device 1 differently than how the MilkyWay OpenCL application is identifying them.

If you use multiple GPU exclusions and inclusions at the same time, perhaps it causes differences in different places in how the Devices get numbered.

Yes work fetch on BOINC 7.0.xx versions caused me a lot of problems at first too. Seems to work alright with my current settings now though but I'm not doing any Collatz. Maybe my report results immediately setting is helping to cause new work to be requested.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 53342 - Posted: 20 Feb 2012, 16:54:51 UTC
Last modified: 20 Feb 2012, 17:30:09 UTC

Excuse the double post but I just noticed Matt advised you to try <cmdline> --device 0</cmdline> and you said you used <cmdline> --device 1</cmdline>

Kind of lines up with what I was saying and Matt has already posted. The CAL applications and the OpenCL applications appear to be handling the device numbering differently due to the onboard graphics not being OpenCL capable. So there is only one device being detected by the OpenCL application. If you or the OpenCL application tries to force or use Device 1 then that is a higher number than the number of devices available hence the message "Requested device is out of range of number found devices"

+ if (clr->devNum >= nDev)
+ {
+ warn("Requested device is out of range of number found devices\n");

Whereas your successful CAL tasks have "Found 2 CAL devices. Chose device 1" in stderr.

Getting the excluded GPU and the detected GPU correct may require different combinations of ignore, exclude and force arguments for OpenCL applications as compared to CAL applications. In other words what works for one may not work for the other as you have experienced. If there is only one OpenCL device detected and you use <ignore_ati_dev>0</ignore_ati_dev> perhaps that leaves no available OpenCL devices. The <exclude_gpu> cc_config settings available in BOINC 7.0.xx give greater flexibility in configuring all this separately for each GPU project or application.

So if there is only one OpenCL capable device it may be sufficient to exclude the HD 4290 for CAL applications only. So perhaps try removing <cmdline> --device 1</cmdline> and <ignore_ati_dev>0</ignore_ati_dev> and instead use:

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://boinc.thesonntags.com/collatz/</url>
      <device_num>0</device_num>
    </exclude_gpu>
  </options>
</cc_config>

Not sure if it will work but worth a try.
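
When trying several exclusion combinations, it can help to generate the file rather than edit it by hand. The following is a minimal Python sketch under stated assumptions: `build_cc_config` is a hypothetical helper, it only emits the `<exclude_gpu>` entries with the `<url>` and `<device_num>` elements shown in this thread, and the result would still need to be saved as cc_config.xml in the BOINC data directory.

```python
import xml.etree.ElementTree as ET

def build_cc_config(exclusions):
    """Build cc_config.xml text from (project_url, device_num) pairs."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for url, device_num in exclusions:
        entry = ET.SubElement(options, "exclude_gpu")
        ET.SubElement(entry, "url").text = url
        ET.SubElement(entry, "device_num").text = str(device_num)
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")

# Exclude device 0 for Collatz only, as suggested above.
xml_text = build_cc_config([("http://boinc.thesonntags.com/collatz/", 0)])
print(xml_text)
```

Passing more than one pair produces one `<exclude_gpu>` block per project, which keeps each experiment reproducible.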

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53345 - Posted: 20 Feb 2012, 17:35:22 UTC

actually i tried both <cmdline> --device 0</cmdline> and <cmdline> --device 1</cmdline>. nevertheless, it's probably worth a try to remove both <cmdline> --device x</cmdline> from the app_info.xml and <ignore_ati_dev>x</ignore_ati_dev> from the cc_config.xml for now, and add what you suggested to the cc_config.xml file. it appears i have a few more things to experiment with, so i'll try to get on it tonight and report back as soon as i can...

thanks,
Eric

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 53352 - Posted: 20 Feb 2012, 20:56:03 UTC - in response to Message 53345.  

actually i tried both <cmdline> --device 0</cmdline> and <cmdline> --device 1</cmdline>. nevertheless, it's probably worth a try to remove both <cmdline> --device x</cmdline> from the app_info.xml and <ignore_ati_dev>x</ignore_ati_dev> from the cc_config.xml for now, and add what you suggested to the cc_config.xml file. it appears i have a few more things to experiment with, so i'll try to get on it tonight and report back as soon as i can...

thanks,
Eric
I should have said --device 0 before. Removing that and using BOINC 7 should work

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 53353 - Posted: 21 Feb 2012, 1:16:10 UTC - in response to Message 53352.  

I should have said --device 0 before. Removing that and using BOINC 7 should work

You did say --device 0 before.

The reason is how BOINC handles device indexing. If you look, the first one is using BOINC 7 and the second one with the error is using 6.12.34. Upgrade to BOINC 7 (I think 7.0.15 is the newest), or since you are using app_info already you could add <cmdline> --device 0</cmdline> to force it to use that GPU....


arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 53354 - Posted: 21 Feb 2012, 2:04:38 UTC - in response to Message 53353.  

I should have said --device 0 before. Removing that and using BOINC 7 should work

You did say --device 0 before.

The reason is how BOINC handles device indexing. If you look, the first one is using BOINC 7 and the second one with the error is using 6.12.34. Upgrade to BOINC 7 (I think 7.0.15 is the newest), or since you are using app_info already you could add <cmdline> --device 0</cmdline> to force it to use that GPU....



Newest version is now 7.0.17, which fixes the backup project problem and several other problems.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 53355 - Posted: 21 Feb 2012, 2:43:19 UTC
Last modified: 21 Feb 2012, 2:44:25 UTC

Backup project problem. Aha, I hadn't thought of that as I do not use a backup project. Perhaps that was why Sunny129/Eric was having trouble getting any work for Collatz with the BOINC 7.0.xx versions he had tried. He may have had Collatz set with a resource share of 0 as a backup project.

Thanks for that news arkayn.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53356 - Posted: 21 Feb 2012, 3:14:32 UTC

i was unaware of the "backup project" problem...though i don't think it should affect me, as i do not use Collatz strictly as a backup project. i actually have Collatz and MW@H set to use equal [non-zero] resources, and i switch between the two of them by keeping one or the other suspended.

at any rate, i went back to BOINC v6.12.34 last night simply b/c i was unable to crunch either of my 2 GPU projects on v7.0.12 or 15. i set "no new tasks," so i should be able to start testing again, this time on v7.0.17.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0
Message 53357 - Posted: 21 Feb 2012, 4:50:16 UTC
Last modified: 21 Feb 2012, 4:53:46 UTC

ok guys, on BOINC v7.0.17 i've tried excluding, ignoring, and forcing devices 0 (integrated HD 4290) and 1 (HD 5870) just for the heck of it, and nothing worked. i also removed all exclude, ignore, and force arguments and i'm still getting the same errors as before.

i don't know why i didn't think of it before, but perhaps i should post my BOINC client's startup log. i'm not sure if it'll help, but there's a good chance that someone will see or understand something that i did not...here it is with no cc_config.xml file in the BOINC data directory or an app_info.xml file in the MW@H data directory:

2/20/2012 11:42:10 PM | | No config file found - using defaults
2/20/2012 11:42:10 PM | | Starting BOINC client version 7.0.17 for windows_intelx86
2/20/2012 11:42:10 PM | | log flags: file_xfer, sched_ops, task
2/20/2012 11:42:10 PM | | Libraries: libcurl/7.21.6 OpenSSL/1.0.0d zlib/1.2.5
2/20/2012 11:42:10 PM | | Data directory: D:\Documents and Settings\All Users\Application Data\BOINC
2/20/2012 11:42:10 PM | | Running under account Eric
2/20/2012 11:42:10 PM | | Processor: 6 AuthenticAMD AMD Phenom(tm) II X6 1090T Processor [Family 16 Model 10 Stepping 0]
2/20/2012 11:42:10 PM | | Processor: 512.00 KB cache
2/20/2012 11:42:10 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
2/20/2012 11:42:10 PM | | OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 3, (05.01.2600.00)
2/20/2012 11:42:10 PM | | Memory: 3.00 GB physical, 10.81 GB virtual
2/20/2012 11:42:10 PM | | Disk: 176.53 GB total, 155.15 GB free
2/20/2012 11:42:10 PM | | Local time is UTC -5 hours
2/20/2012 11:42:10 PM | | VirtualBox version: 4.1.8
2/20/2012 11:42:10 PM | | ATI GPU 0: (not used) Cypress (CAL version 1.4.1664, 341MB, 324MB available, 107 GFLOPS peak)
2/20/2012 11:42:10 PM | | ATI GPU 1: ATI Radeon HD 5800 series (Cypress) (CAL version 1.4.1664, 2048MB, 2031MB available, 5440 GFLOPS peak)
2/20/2012 11:42:10 PM | | OpenCL: ATI GPU 0 (not used): Cypress (driver version CAL 1.4.1664, device version OpenCL 1.1 AMD-APP (851.4), 1024MB, 324MB available)
2/20/2012 11:42:10 PM | SETI@home | Found app_info.xml; using anonymous platform
2/20/2012 11:42:10 PM | Collatz Conjecture | URL http://boinc.thesonntags.com/collatz/; Computer ID 86460; resource share 50
2/20/2012 11:42:10 PM | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 4563909; resource share 50
2/20/2012 11:42:10 PM | LHC@home 1.0 | URL http://lhcathomeclassic.cern.ch/sixtrack/; Computer ID 9906341; resource share 1890
2/20/2012 11:42:10 PM | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 402429; resource share 50
2/20/2012 11:42:10 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6334707; resource share 50
2/20/2012 11:42:10 PM | Einstein@Home | General prefs: from Einstein@Home (last modified 06-Sep-2011 15:03:44)
2/20/2012 11:42:10 PM | Einstein@Home | Computer location: home
2/20/2012 11:42:10 PM | | General prefs: using separate prefs for home
2/20/2012 11:42:10 PM | | Reading preferences override file
2/20/2012 11:42:10 PM | | Preferences:
2/20/2012 11:42:10 PM | | max memory usage when active: 3070.10MB
2/20/2012 11:42:10 PM | | max memory usage when idle: 3070.10MB
2/20/2012 11:42:10 PM | | max disk usage: 88.27GB
2/20/2012 11:42:10 PM | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
2/20/2012 11:42:10 PM | | Not using a proxy
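
The GPU lines in a startup log like the one above can be pulled out with a short script. This is a minimal Python sketch, not a BOINC tool: `summarize_gpus` and `GPU_LINE` are hypothetical names, and the pattern assumes the "ATI GPU N" and "OpenCL: ATI GPU N" wording seen in this particular 7.0.17 log. The sample lines are abbreviated from that log.

```python
import re

# Matches both 'ATI GPU 0: ...' (CAL) and 'OpenCL: ATI GPU 0 ...' log lines.
GPU_LINE = re.compile(r"\|\s*(OpenCL: )?ATI GPU (\d+)")

def summarize_gpus(log_text):
    """Return a list of (api, index, used) tuples, one per GPU line in the log."""
    summary = []
    for line in log_text.splitlines():
        match = GPU_LINE.search(line)
        if match is None:
            continue
        api = "OpenCL" if match.group(1) else "CAL"
        # BOINC marks devices it will not schedule with '(not used)'.
        used = "(not used)" not in line
        summary.append((api, int(match.group(2)), used))
    return summary

sample_log = """\
2/20/2012 11:42:10 PM | | VirtualBox version: 4.1.8
2/20/2012 11:42:10 PM | | ATI GPU 0: (not used) Cypress (CAL version 1.4.1664)
2/20/2012 11:42:10 PM | | ATI GPU 1: ATI Radeon HD 5800 series (Cypress) (CAL version 1.4.1664)
2/20/2012 11:42:10 PM | | OpenCL: ATI GPU 0 (not used): Cypress (driver version CAL 1.4.1664)
"""
for entry in summarize_gpus(sample_log):
    print(entry)
```

For these lines the summary contains two CAL entries and a single OpenCL entry, and the OpenCL entry is the one flagged "(not used)".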


if this doesn't help, i'm not sure i'm up to the task of trying all of the several combinations of exclude, ignore, and force arguments just to see if any such combination works. it looks like it might finally be time to stop using the IGP as a dedicated display GPU and start getting used to using the HD 5870 for the display AND crunching...who knows, maybe i'll find some target parameters that'll make GUI lag at least bearable.

btw Matt, any idea how much time we have left before separation v0.82 is permanently deprecated?

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 53358 - Posted: 21 Feb 2012, 5:17:56 UTC - in response to Message 53357.  
Last modified: 21 Feb 2012, 5:18:53 UTC

Is there any way to disable the internal GPU via the BIOS? It looks like the 5870 is being disabled for OpenCL because the internal GPU is being disabled as the lesser GPU.