Welcome to MilkyWay@home

I surrender...

Message boards : Number crunching : I surrender...

pieface

Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 49757 - Posted: 29 Jun 2011, 21:20:57 UTC

I recently received a new computer with a GPU that could crunch MW, so I merrily downloaded BOINC and signed up for MW. Sure enough, it downloaded a bunch of MW GPU units and a half dozen or so of the N-body 0.66 multi-threaded ones as well. I figured all was well, as the new guy is a 6-core (12 with HT enabled), but... The GPU units all crunched, but not one of the MT WUs did. I downloaded some Rosetta and some Enigma WUs just to see what would happen, and they all crunched merrily to completion, but the N-body guys haven't budged a bit.

Any ideas???

Computer is here:

298955
banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 49758 - Posted: 29 Jun 2011, 21:34:58 UTC

All except the 2 MW tasks that say they are (cuda) are CPU WUs. It looks like you are only getting 1 GPU WU at a time. You may need to tweak your settings if you want to run multiple GPU WUs.
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
pieface

Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 49762 - Posted: 29 Jun 2011, 21:45:02 UTC

un-hunh...

All of the GPU units processed (and most merrily went off to la-la-land)...

It's the multi-threaded CPU units that refuse to run at all...
ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 49764 - Posted: 29 Jun 2011, 22:02:27 UTC

Any error message? What's in the BOINC messages if you start it, with only MW allowed for the CPU?

MrS
Scanning for our furry friends since Jan 2002
pieface

Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 49766 - Posted: 29 Jun 2011, 22:26:25 UTC

No error messages that I can see in the log. Every time a GPU unit finished there was a scheduler request for more GPU work; usually only one more unit downloaded, with a message saying the unit had reached the max-tasks-in-process limit. I have one more Rosetta WU running on the CPU at 99.84 pct. Once that one is finished and uploaded I will exit BOINC and bounce the machine to see if maybe the GPU WUs locked something up.
pieface

Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 49767 - Posted: 30 Jun 2011, 1:08:24 UTC

Oh well... it took a couple of hours to do the shut-down/restart, as Mr Softie decided he just had to co-opt the machine for updates. The situation is still the same: seven CPU N-body 0.66 MT WUs sitting there all by themselves in 'ready to start' status, but none of them running. I am going to abort these WUs and see what happens with another batch later. At least that way they will get re-sent to someone else.
dp

Joined: 23 Jan 08
Posts: 7
Credit: 156,772,001
RAC: 7,895
Message 49772 - Posted: 30 Jun 2011, 8:11:02 UTC

I am having similar non-results.
Have disconnected and reconnected without success.

Any suggestions???

Thanks dp


Stderr output
<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE3 path
Failed to get CAL device attributes: Parameter passed in is invalid (CAL_RESULT_INVALID_PARAMETER)
Error getting device information: Parameter passed in is invalid (CAL_RESULT_INVALID_PARAMETER)
Failed to get CAL info: Parameter passed in is invalid (CAL_RESULT_INVALID_PARAMETER)
Failed to setup CAL
03:33:26 (5972): called boinc_finish

</stderr_txt>
ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 49777 - Posted: 30 Jun 2011, 17:25:25 UTC

dp, your problem is very different. It looks like your ATI driver is way out of date.
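
For anyone reading a stderr dump like the one above, here is a minimal sketch of how the lines could be sorted into likely causes. The signature table and the hint wording are illustrative assumptions drawn from this thread, not an official MilkyWay@home or BOINC list, and `diagnose` is a hypothetical helper name.

```python
import re

# Illustrative signature table (an assumption based on this thread, not an official list).
SIGNATURES = [
    (re.compile(r"CAL_RESULT_\w+|Failed to setup CAL"),
     "ATI CAL setup failed (check the GPU driver)"),
    (re.compile(r"Error loading Lua script"),
     "Lua parameter script failed to load (app tried the old parameters file)"),
]


def diagnose(stderr_text):
    """Return the hints whose pattern matches at least one line, in table order."""
    lines = stderr_text.splitlines()
    hints = []
    for pattern, hint in SIGNATURES:
        if any(pattern.search(line) for line in lines):
            hints.append(hint)
    return hints


# Lines copied from the stderr output posted earlier in this thread.
SAMPLE = """Error loading Lua script 'astronomy_parameters.txt': [string "number_parameters: 4..."]:1: '<name>' expected near '4'
Error reading astronomy parameters from file 'astronomy_parameters.txt'
Trying old parameters file
Using SSE3 path
Failed to get CAL device attributes: Parameter passed in is invalid (CAL_RESULT_INVALID_PARAMETER)
Failed to setup CAL
"""

if __name__ == "__main__":
    for hint in diagnose(SAMPLE):
        print(hint)
```

In that dump the Lua message is followed by "Trying old parameters file" and the run continues, so the CAL lines are the ones that actually end the task.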

pieface, it could be that the n_body WUs require something that is not yet available to BOINC (be it system memory or whatever). Aborting these and downloading more WUs will very probably change nothing. Could you post the messages upon BOINC startup, preferably with only the MW WUs on the machine?

MrS
Scanning for our furry friends since Jan 2002
pieface

Joined: 7 Nov 07
Posts: 13
Credit: 20,143,762
RAC: 0
Message 49795 - Posted: 1 Jul 2011, 18:03:59 UTC

Hmmm...

Sorry, I had already aborted them when I checked the message boards again. I ran some different projects overnight, still getting the new machine identified to all the other projects I run, and didn't encounter any problems.

When I came back around to MW this afternoon it merrily downloaded the whole spectrum of types: GPU running in around 10 mins, multi-thread running very quickly also (using all 10 'cores' that I let BOINC have), and now the single-threaded ones are starting to run their chunk of time. I have no idea what it could have been; probably something related to running with a usable GPU for the first time got it confused.

Thanks for all the time!! Hopefully this was a one-shot wonder and won't happen again...
dp

Joined: 23 Jan 08
Posts: 7
Credit: 156,772,001
RAC: 7,895
Message 49987 - Posted: 7 Jul 2011, 19:35:15 UTC - in response to Message 49795.  

Thanks for getting back.
Updated the driver, detached and re-attached to the MW project, but still no luck.

Any suggestions?
dp
FruehwF

Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 49989 - Posted: 7 Jul 2011, 19:40:16 UTC
Last modified: 7 Jul 2011, 19:40:39 UTC

Please look at
http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2468

There is a solution there that has worked for others.

For further questions, don't be shy about PMing me.

regards

franz
ExtraTerrestrial Apes
Joined: 1 Sep 08
Posts: 204
Credit: 219,354,537
RAC: 0
Message 49996 - Posted: 8 Jul 2011, 7:43:12 UTC - in response to Message 49989.  

You're getting a different error now - moving in the right direction :)
When I switched from 0.62 to 0.82 I had to manually reset the "result duration correction factor" to not get the "maximum elapsed time exceeded" error. Don't know what the solution in the other thread is - just ask if you need further assistance.
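
A rough sketch of what such a manual reset could look like, for illustration only. It assumes the factor is stored per project as a `<duration_correction_factor>` element next to `<master_url>` inside `client_state.xml`, that 1.0 is a sensible value to reset to, and that the file parses as plain XML; the function name `reset_dcf` and the sample numbers are made up. The code only edits a string, so nothing on disk is touched. On a real machine you would stop BOINC and back up the file first.

```python
import xml.etree.ElementTree as ET


def reset_dcf(state_xml, master_url, new_value=1.0):
    """Set duration_correction_factor for one project; return (new_xml, count changed).

    Assumes each <project> holds <master_url> and <duration_correction_factor>.
    """
    root = ET.fromstring(state_xml)
    changed = 0
    for project in root.iter("project"):
        url = (project.findtext("master_url") or "").strip()
        if url != master_url:
            continue  # leave other projects alone
        dcf = project.find("duration_correction_factor")
        if dcf is not None:
            dcf.text = f"{new_value:.6f}"
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical, heavily trimmed stand-in for client_state.xml (values invented).
SAMPLE_STATE = """<client_state>
<project>
    <master_url>http://milkyway.cs.rpi.edu/milkyway/</master_url>
    <duration_correction_factor>37.500000</duration_correction_factor>
</project>
<project>
    <master_url>http://example.org/other/</master_url>
    <duration_correction_factor>2.000000</duration_correction_factor>
</project>
</client_state>"""

if __name__ == "__main__":
    new_xml, count = reset_dcf(SAMPLE_STATE, "http://milkyway.cs.rpi.edu/milkyway/")
    print(count, "project(s) updated")
    print(new_xml)
```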

MrS
Scanning for our furry friends since Jan 2002


©2024 Astroinformatics Group