Welcome to MilkyWay@home

Problem?

BrainSmashR

Joined: 16 Dec 07
Posts: 5
Credit: 745,296
RAC: 0
Message 63027 - Posted: 17 Jan 2015, 8:44:05 UTC

So I have some WU that take all 6 cores to crunch, but they keep ticking after they reach 100% completed and surpass the "time remaining" variable...essentially halting ALL work being done by BOINC.

Can someone explain to me what's going on? I enjoy crunching for Milkyway but I do not like you "idling" my system so I can't crunch for anyone else....

mikey
Joined: 8 May 09
Posts: 3319
Credit: 520,345,392
RAC: 21,895
Message 63028 - Posted: 17 Jan 2015, 12:05:14 UTC - in response to Message 63027.  

So I have some WU that take all 6 cores to crunch, but they keep ticking after they reach 100% completed and surpass the "time remaining" variable...essentially halting ALL work being done by BOINC.

Can someone explain to me what's going on? I enjoy crunching for Milkyway but I do not like you "idling" my system so I can't crunch for anyone else....


You must stop thinking of 100% as 'finished'. It is a rounded-off number, NOT a hard stop at which you move on to the next unit. MW is notorious for running several minutes beyond the 100% point while it makes sure all crunching is done and gets the results ready to send back to MW, etc. Your system is NOT idle; it is still crunching and can use all 6 cores if needed to finish the current workunit before it starts the next one.

BrainSmashR
Message 63031 - Posted: 17 Jan 2015, 15:00:22 UTC - in response to Message 63028.  

Gotcha....thanks for the response

John Woods
Joined: 30 Jan 09
Posts: 3
Credit: 81,093,970
RAC: 0
Message 63033 - Posted: 17 Jan 2015, 18:02:03 UTC

I've seen a lot of WUs recently which take several hours past the 100% point, rather than just minutes:
ps_nbody_12_20_orphan_sim_3_1413455402_1706770_3 (task 695052043) is up to 18:04:01 right now, having reached 99% after about 10 minutes and 100% after about 6 hours or so. It seems like a lot of the previous WUs which ran for roughly a day have failed validation for "too many results", too.

BrainSmashR
Message 63034 - Posted: 17 Jan 2015, 20:32:20 UTC - in response to Message 63033.  

Well that's sort of what I was wondering. I have one that says 100% (waiting to run) with 21:52:58 elapsed and --- remaining.

Is this really not finished after 21 hours with all 6 cores?

mikey
Message 63037 - Posted: 18 Jan 2015, 11:39:47 UTC - in response to Message 63034.  
Last modified: 18 Jan 2015, 11:43:13 UTC

Well that's sort of what I was wondering. I have one that says 100% (waiting to run) with 21:52:58 elapsed and --- remaining.

Is this really not finished after 21 hours with all 6 cores?


Suspend BOINC and then resume it through the BOINC Manager, and see if it magically changes the times and finishes. Sometimes BOINC gets 'confused', and although the clock keeps going, no work is getting done.

John Woods, you should try that too.

When I am crunching and see a REALLY long unit like that, that is the first thing I do; if it doesn't work, I abort the unit and move on to the next one. I am interested in crunching multiple units, not one unit forever!

I also do NOT crunch the units that take multiple cpu cores, it's just a me thing that way.
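
For anyone who prefers the command line, the suspend, resume, and abort steps above could be sketched roughly as follows. This is only a sketch, assuming the `boinccmd` tool that ships with the BOINC client is on your PATH and that `--set_run_mode auto` is the equivalent of resuming in the Manager. The helper function names and the `example.org` project URL are made up for illustration; the task name is the one quoted earlier in this thread. By default the script only prints the commands instead of running them.

```python
import subprocess


def suspend_resume_commands():
    """Return two client-wide commands: pause all computing, then go back
    to preference-based scheduling (assumed equivalent of 'resume')."""
    return [
        ["boinccmd", "--set_run_mode", "never"],
        ["boinccmd", "--set_run_mode", "auto"],
    ]


def abort_task_command(project_url, task_name):
    """Return the command that aborts a single task of one project."""
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder URL: use the project URL reported by `boinccmd --get_tasks`.
    project_url = "https://example.org/project/"
    task_name = "ps_nbody_12_20_orphan_sim_3_1413455402_1706770_3"
    run(suspend_resume_commands())
    run([abort_task_command(project_url, task_name)])
```

Aborting throws away whatever work the task has done so far, so it probably makes sense to try the suspend and resume pair first and keep the abort as a last resort.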

BrainSmashR
Message 63040 - Posted: 18 Jan 2015, 15:04:18 UTC - in response to Message 63037.  
Last modified: 18 Jan 2015, 15:35:09 UTC

Did that and suspended other projects to force that WU to start again...we'll see how it goes

Edit: Forced an update and somewhere between 3 deferments the WU completed and uploaded (or magically disappeared) :)

John Woods
Message 63042 - Posted: 18 Jan 2015, 21:43:56 UTC

I did a suspend and resume, and nothing changed. I then suspended BOINC again and quit BOINC Manager; when I started it again, that job had reset to 0.000% complete with no elapsed time and 02:11:55 remaining. I suspended other projects, and it looks like that job has just started over from scratch (the Remaining time dropped to about 36 minutes as soon as it started). I'll let it cook for a while and see if it acts any different this time.

(There's also a new BOINC Manager that I should install, but the changelist didn't seem to have anything relevant to this kind of problem...)

John Woods
Message 63044 - Posted: 18 Jan 2015, 23:23:24 UTC - in response to Message 63042.  

Yup, second time around it seems to have needed only a couple of hours and completed fine.

mikey
Message 63047 - Posted: 19 Jan 2015, 12:10:28 UTC - in response to Message 63042.  


(There's also a new BOINC Manager that I should install, but the changelist didn't seem to have anything relevant to this kind of problem...)


It doesn't affect a lot of people, but if you want to sign up for World Community Grid, a BOINC project run by IBM that offers several project crunching options, then you will need to upgrade to 7.4.36. WCG can be found here:
https://secure.worldcommunitygrid.org/research/viewAllProjects.do

mikey
Message 63048 - Posted: 19 Jan 2015, 12:10:45 UTC - in response to Message 63044.  

Yup, second time around it seems to have needed only a couple of hours and completed fine.


+1

©2024 Astroinformatics Group