Welcome to MilkyWay@home

[coproc] Insufficient CUDA for de_separation_23_3s_fix errors

Profile Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 557,043,259
RAC: 42,493
Message 45894 - Posted: 31 Jan 2011, 6:22:34 UTC
Last modified: 31 Jan 2011, 7:08:36 UTC

I am seeing new errors in the Event Log in the new BOINC Manager for the MW@H project. I never saw them before. The only thing that has changed recently is the new BOINC Manager and Client 6.12.12. I just updated the app for better sharing of CUDA resources among multiple projects. Can anyone explain these errors and whether they indicate a problem?

Keith
Profile mdhittle*
Joined: 25 Jun 10
Posts: 284
Credit: 260,490,091
RAC: 0
Message 45926 - Posted: 3 Feb 2011, 0:41:14 UTC - in response to Message 45894.  

Keith Myers wrote: "I am seeing new errors in the Event Log in the new BOINC Manager for the MW@H project. I never saw them before. The only thing that has changed recently is the new BOINC Manager and Client 6.12.12. I just updated the app for better sharing of CUDA resources among multiple projects. Can anyone explain these errors and whether they indicate a problem?"


6.12.12 Development version (MAY BE UNSTABLE - USE ONLY FOR TESTING)

You might try upgrading to the stable version 6.10.58

-Mike
Profile Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 45932 - Posted: 3 Feb 2011, 2:19:41 UTC

What errors are you seeing in the log? I'm using 6.12.12 here on 10 machines and IMO it's superior to 6.10.58. No problems at all and works far better for GPU projects.

Profile Keith Myers
Message 45935 - Posted: 3 Feb 2011, 5:13:15 UTC - in response to Message 45926.  

BOINC 6.10.58 would be a downgrade. It does not play NICE at all when multitasking multiple GPU projects, and it does not obey the task switch interval (TSI) at all. I have my general and local preferences set to switch projects every 5 minutes, but 6.10.58 would never obey my rules. It would hang up exclusively on Seti GPU work and never do any MW GPU work. The new BOINC 6.12.12 works MUCH, MUCH better and obeys the TSI very nicely. I am now splitting CPU/GPU time 50/50 between Seti and MW.

The error reported in the Event Log seems to be benign. I am not seeing MW errors or invalidated results in Tasks on the MW account page. So I would still like some explanation of what the error means. I think it has something to do with BOINC starting a new MW GPU task while the GPU is already running a GPU task from Seti. I don't see the error when running two Seti GPU tasks at the same time or two MW GPU tasks at the same time. It only appears when one of the Seti GPU tasks finishes and an MW task takes its place. I have 1GB of memory on the GTX 460, and the GPU memory monitor says that only 61% of memory is in use, so I don't think the error is real.

Keith
Profile Keith Myers
Message 45936 - Posted: 3 Feb 2011, 5:19:26 UTC - in response to Message 45932.  

The errors in the Event Log all start with the text in the topic title and name the current MW GPU task. I only see the errors with MW, never from Seti. It probably has something to do with the MW MB GPU application when running with an anonymous platform file so I can run 2 tasks on the 460 at a time.

I would agree with you whole-heartedly about how much better BOINC 6.12.12 is compared to 6.10.58. I don't like the move of the Messages tab into the separate Event Log. If the Event Log didn't stay on top of all windows, I wouldn't complain. I have found out that you can run the 6.12.12 client with the 6.10.58 Manager, so that is an option I might switch to eventually. I am going to live with the 6.12.12 Manager for a while, or until I get fed up with the Event Log. Too soon to tell.

Keith

Profile Beyond
Message 45938 - Posted: 3 Feb 2011, 6:41:26 UTC

I don't see any of those errors on any of my machines, but I'm not running SETI either. I do have a dual GPU box that switches between MW and Collatz, with no errors though. The reason they put the event log in the advanced menu was that users kept seeing benign error messages and thought there was a problem when there wasn't. As far as the interface goes, I hardly ever use BOINC Manager. Instead I control my whole local network of BOINC clients with the far more capable eFMer BoincTasks (and the companion TThrottle). Just curious, why do you have the jobs switch every 5 minutes? I would think that would cause all kinds of havoc.
Profile Keith Myers
Message 45952 - Posted: 3 Feb 2011, 23:13:54 UTC - in response to Message 45938.  

I'll have to investigate those other managers. I was having all kinds of trouble getting MW to give up control of the GPU and was trying all kinds of things. Someone told me to change the task switch interval to five minutes instead of the default 60 minutes, in an attempt to get the client to give equal opportunity to Seti GPU tasks. I have since changed it to ten minutes, but with the 6.12.12 client the default TSI of 60 minutes would probably work fine. I did see a very large increase in the size of the reported task completion logs: every time the task switched, it wrote all the app header info along with the task restart times. That makes for an unnecessarily large result report.

Keith
Profile Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 46011 - Posted: 6 Feb 2011, 4:27:57 UTC - in response to Message 45935.  

Keith Myers wrote: "Does not obey TDI parameters at all. I have my general and local preferences to switch projects every 5 minutes. 6.10.58 would never obey my rules."

Switching projects every 5 minutes is not a good idea. Unless all the projects you are running have tasks shorter than the TSI, well, you are just thrashing. TSI at default allows most projects to get a shot at the CPU and to complete a reasonable amount of work before a switch.

On multi-core systems, particularly with 8 or more cores, a far better strategy is to extend TSI out so that most tasks complete before the TSI expires (mine is set to 720 min, 6 hours). There are a host of issues with honoring TSI on GPU projects where the tasks are longer than 5 minutes, in that you will waste considerable time unloading the GPU and loading it with the next task, and rinse and repeat...
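
To put rough numbers on the thrashing argument, here is a minimal sketch in Python. It assumes one project switch per TSI and a fixed cost per switch (time spent unloading one task from the GPU and loading the next). The 20-second cost and the function name `switch_overhead_fraction` are hypothetical, picked only for illustration; they are not measured on any host and are not part of BOINC.

```python
def switch_overhead_fraction(tsi_minutes: float, overhead_seconds: float) -> float:
    """Fraction of each switch interval lost to swapping tasks.

    Assumes exactly one switch per interval and a fixed cost per switch.
    Both are simplifying assumptions, not documented BOINC behavior.
    """
    if tsi_minutes <= 0:
        raise ValueError("tsi_minutes must be positive")
    interval_seconds = tsi_minutes * 60
    # The time lost in one interval cannot exceed the interval itself.
    return min(overhead_seconds, interval_seconds) / interval_seconds


if __name__ == "__main__":
    HYPOTHETICAL_OVERHEAD_S = 20  # assumed cost of one GPU task swap
    for tsi in (5, 10, 60):
        lost = switch_overhead_fraction(tsi, HYPOTHETICAL_OVERHEAD_S)
        print(f"TSI {tsi:>2} min: {lost:.2%} of GPU time lost to switching")
```

Under these assumptions the lost fraction shrinks in proportion to the interval length. The real cost per switch depends on the application and the card, so treat the output as an illustration of the trend only.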
Profile Keith Myers
Message 46018 - Posted: 6 Feb 2011, 16:11:44 UTC - in response to Message 46011.  

Yes, I concluded that also after seeing the results headers. But before I went to the 6.12.12 client, the only way to crunch Seti after adding the MW project was to set the TSI to five minutes. Now with 6.12.12, all is well and I have reset the TSI back to the default of 60 minutes. That allows each project to complete about 8 GPU tasks each session before switching projects.

Keith
Profile mdhittle*
Message 46027 - Posted: 7 Feb 2011, 0:39:27 UTC - in response to Message 45935.  

Keith Myers wrote: "BOINC 6.10.58 would be a downgrade."


Actually, 6.10.58 would be an upgrade. You would be upgrading from a BETA version to a stable RELEASED version.

-Mike
Profile Beyond
Message 46034 - Posted: 7 Feb 2011, 3:47:58 UTC
Last modified: 7 Feb 2011, 3:52:31 UTC

Have you tried 6.12.12? Or 6.12.13? Or 6.12.11 for that matter? Others posting here have. They're all more stable than 6.10.58 IMO. Being so insistent about things you haven't tried is not a good idea and not a way to learn anything. Run what you want, maybe it's a free country.

Edit: And a BIG CONGRATS to the Packers
Profile Paul D. Buck

Message 46134 - Posted: 10 Feb 2011, 11:10:16 UTC - in response to Message 46027.  

Keith Myers wrote: "BOINC 6.10.58 would be a downgrade."

mdhittle* wrote: "Actually, 6.10.58 would be an upgrade. You would be upgrading from a BETA version to a stable RELEASED version."

Most of us that test the later versions of BOINC *NEVER* suggest trying a version that we ourselves have not been running... usually for some considerable time. I personally also watch the change logs carefully to see what changes have been made and how radical of a shift there has been...

There are things to dislike about the 6.12.x series, like the hiding of the event log, and the order of the columns is not what I would choose ... but, though Beta, the push is on to make one of these the next stable. 6.12.13 has only minor tweaks from 6.12.12, like removing the URL change from notices because of a suggestion I made ...

But the biggest change is getting rid of strict FIFO, which should not have been allowed to persist as long as it did ... if you were single project it mattered little, but if you did multiple projects it was a real bad thing ...

Lastly, even the "stable" release versions have their issues ... just not usually big enough that most people notice...

