Welcome to MilkyWay@home

Home Separation (Modified Fit) v1.28 taking ages


Message boards : Number crunching : Home Separation (Modified Fit) v1.28 taking ages
[AF>Le_Pommier] Jerome_C2005
Joined: 1 Apr 08
Posts: 19
Credit: 6,895,549
RAC: 9,457
Message 60744 - Posted: 15 Jan 2014, 14:12:26 UTC

Hi,

I have a Home Separation (Modified Fit) v1.28 Mac OS X WU that has been crunching on the CPU (there is no MW GPU application for my Mac; I'd need OpenCL!) for more than 70 hours. It is at less than 54% completion, shows no remaining time ("-"), and its deadline is less than 140 hours away.

Should I worry? :)

(Another one has been running for 26 hours and is at 59% completion, with a displayed remaining time of 21 hours.)

Thanks for your help.

mikey
Joined: 8 May 09
Posts: 2240
Credit: 288,203,433
RAC: 1,142,538
Message 60749 - Posted: 16 Jan 2014, 11:14:23 UTC - in response to Message 60744.  

Try suspending the project and then resuming it; that can unstick a unit if it is stuck. To make it resume on the same unit, you need to suspend and resume the whole project. I suspend it for a slow count of five, and that seems to work for me when it happens.

[AF>Le_Pommier] Jerome_C2005
Message 60751 - Posted: 16 Jan 2014, 21:22:38 UTC

Gee, I don't get it. I came home and BOINC now says one has crunched for 33 hours with 7 hours remaining (67%) and the other for 33 hours with 14 hours remaining (59%). They are the same WUs as before. I can't see anything in the BOINC messages apart from the WU resuming at some point, and on my account page I only see the same 2 WUs, sent on 09/01!

WTF?

mikey
Message 60756 - Posted: 17 Jan 2014, 11:46:07 UTC - in response to Message 60751.  

I think you need to figure out why it is 'suspending'. Since you said it is 'resuming', it must have been suspended earlier; figuring out why it was suspended is the key, I think.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 219
Credit: 448,778
RAC: 0
Message 60758 - Posted: 17 Jan 2014, 12:42:35 UTC - in response to Message 60756.  

Enable the <sched_op_debug> log flag to see when (and possibly why) tasks are being suspended or pre-empted.

That information used to be in the default logs, but recent BOINC versions have moved it into the debug flags.
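
For readers who have not edited log flags before, here is a minimal Python sketch that builds a cc_config.xml with the flag named above switched on. The helper name build_cc_config is made up for this illustration. The extra cpu_sched flag is an assumption on my part: as far as I know it is the flag that logs CPU scheduler preemption and resumption, so it may be the more direct one to watch for suspended tasks.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1.

    build_cc_config is a hypothetical helper for this example only.
    """
    lines = ["<cc_config>", "  <log_flags>"]
    lines += [f"    <{name}>1</{name}>" for name in flags]
    lines += ["  </log_flags>", "</cc_config>"]
    return "\n".join(lines) + "\n"


# sched_op_debug is the flag named in the post above;
# cpu_sched is an assumed addition (scheduler preemption/resumption messages).
config_text = build_cc_config(["sched_op_debug", "cpu_sched"])

# Parsing the text back confirms it is well-formed XML before it is saved.
ET.fromstring(config_text)
print(config_text)
```

Saving the printed text as cc_config.xml in the BOINC data directory and asking the client to re-read its config files (or restarting it) should make the extra messages appear in the event log. The location of the data directory depends on the platform.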

[AF>Le_Pommier] Jerome_C2005
Message 60759 - Posted: 17 Jan 2014, 17:37:16 UTC

Well, I'm running many different projects on my Mac. Most of the long-running WUs of any project are, at some point, suspended by BOINC (then another WU or project starts or restarts), and after some time they are resumed. This can happen several times if the WU is very long (e.g. CPDN), unless they are running in high-priority mode because the deadline is close, in which case they won't stop.

I have increased the BOINC parameter that defines the time before switching WUs (I set 1440 min instead of the low default value, which I can't remember), and I have the "keep in memory while suspended" parameter ON because I have lots of RAM, so that for projects or applications that don't implement checkpoints, the crunching already done is not lost.
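
The two settings described above can also be written to a global_prefs_override.xml file in the BOINC data directory. The sketch below is an illustration under assumptions, not a recommendation: the helper name build_prefs_override is hypothetical, and the element names cpu_scheduling_period_minutes and leave_apps_in_memory are the preference names as I understand them, so check them against your client version.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(switch_minutes, leave_in_memory):
    """Return override-file text for the two preferences discussed above.

    build_prefs_override is a hypothetical helper; the element names are
    assumed to match the BOINC client's preference names.
    """
    root = ET.Element("global_preferences")
    # Time between task switches, in minutes (1440 min = 24 hours).
    ET.SubElement(root, "cpu_scheduling_period_minutes").text = str(switch_minutes)
    # 1 keeps suspended tasks in memory, 0 lets them be removed from memory.
    ET.SubElement(root, "leave_apps_in_memory").text = "1" if leave_in_memory else "0"
    return ET.tostring(root, encoding="unicode")


override_text = build_prefs_override(1440, True)
print(override_text)
```

A 1440-minute switch interval means BOINC only considers swapping to another task about once a day, which fits the pattern of long tasks pausing and resuming at widely spaced intervals.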

mikey
Message 60761 - Posted: 18 Jan 2014, 12:32:55 UTC - in response to Message 60759.  

I do believe we have a winner: running multiple projects is the root cause of the problem. BOINC is trying to manage multiple projects, which it does fairly well, and your MW units are being suspended because their deadlines are not in danger of being missed. You either need more PCs so you can spread your projects out more evenly, or you need to cut back on a few projects this month and concentrate on one group at a time instead of ALL of them at once.

I just counted, and you have recent credit at 33 different projects, according to the list that appears when I click on your name. Even assuming some of those aren't accurate, that still leaves lots of projects you are crunching for. If you were to separate them into groups, you could do 5 a month and after 6 months have the same average credit, while each month giving more to the 5 you are crunching for. You could even have a couple of higher-priority projects that always crunch and rotate some of the others through, just to stop some of the constant swapping and deadline issues.

[AF>Le_Pommier] Jerome_C2005
Message 60773 - Posted: 19 Jan 2014, 19:10:47 UTC
Last modified: 19 Jan 2014, 19:12:50 UTC

Well, both finally finished. Very oddly, only one can be seen in my WU list now. Luckily I have the full history in BoincTasks: the last one finished today after 45 hours of calculation, and the first one finished 2 days ago after 47 hours. Why is only one still visible?

Anyway, I see I got 320 credits for a 45-hour WU. This is... I mean... wow, thank you so much. (Hey, don't take this badly, I'm just kidding. If I only participated in projects for the credits, I wouldn't be here and wouldn't do most of the projects I do! ;) )

Regarding the fact that I crunch many projects at the same time: I've always done this and I think I always will. This is the way I like it, and only during raids or special events organized by my team do I switch to "mono-project" or a reduced list for a limited time (max 2 weeks). When it's over, I'm always very happy to go back to my multi-project habit.

So if a project application is not able to crunch alongside other apps, well, it's just missing the BOINC spirit. If BOINC were supposed to be mono-project only, it wouldn't exist at all (SETI, WCG, etc. would still have their own systems). BOINC is about sharing and leaving options open.

Anyway, thank you all for your time and explanations. I always like to discuss and share on project forums, and I'm very happy when I do get feedback. A project with a dead forum is just a dead project :)

mikey
Message 60776 - Posted: 20 Jan 2014, 11:52:21 UTC - in response to Message 60773.  
Last modified: 20 Jan 2014, 11:53:04 UTC

Credits are funny things, with each project doing its own thing for the most part. If you look here
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=482801852
you will see the history of your unit, how long it took the other people who had the same unit to crunch it, and the credits they got. You can see that even though you took more than 10 times as long as the next guy, you both got the same credits for the same unit.

I am glad you got the help you needed. If you have any more questions, please ask them.

[AF>Le_Pommier] Jerome_C2005
Message 60782 - Posted: 21 Jan 2014, 14:08:44 UTC

... and the first guy took 816 secs to crunch the SAME WU??

Isn't there something very wrong here? How can all 3 produce the same output???

mikey
Message 60786 - Posted: 21 Jan 2014, 23:34:40 UTC - in response to Message 60782.  

Credits are fixed here, so everyone finishing the same unit correctly gets the same credits.

[AF>Le_Pommier] Jerome_C2005
Message 60789 - Posted: 22 Jan 2014, 13:50:30 UTC

My question is not about credit. My question is how on earth 816 secs of calculation can produce the same scientific result as the 45 hours it took to calculate the same task on my Mac. Is the application doing some random loops or something? Having been a developer in my early IT days, I find this quite astounding...

Regarding credit, as I said, I actually don't care.

mikey
Message 60795 - Posted: 23 Jan 2014, 11:57:59 UTC - in response to Message 60789.  

You are forgetting that yours was suspended for a lot of the time, meaning it wasn't actually being crunched, while the other person probably had an i7 PC crunching like crazy on that unit straight through, nonstop. That, combined with your processor itself being slower, means they finished the unit much faster than you did. Does that mean your PC can't contribute? NO, it just means you will never be first with the result. But since it takes at least two results, from two separate users, to confirm an actual result here, your PC IS helpful!

Len LE/GE
Joined: 8 Feb 08
Posts: 261
Credit: 104,050,322
RAC: 0
Message 60796 - Posted: 23 Jan 2014, 12:41:01 UTC

From the time differences, I guess it was GPU (nonstop) vs. CPU (often suspended).
Since the WU has been purged now, I can't verify.

mikey
Message 60797 - Posted: 23 Jan 2014, 14:46:44 UTC - in response to Message 60796.  

Yeah it could be, I don't remember either.

Richard Haselgrove
Message 60798 - Posted: 23 Jan 2014, 15:03:56 UTC - in response to Message 60797.  

The 816 seconds was done under an ATI GPU plan_class, but I didn't follow through to see the exact card model.
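
To put the numbers quoted in this thread side by side, here is a rough back-of-the-envelope calculation in Python. It assumes the 45 hours and the 816 seconds are comparable run times for the same workunit and that both results were granted the 320 credits mentioned earlier. The variable names are only illustrative.

```python
gpu_seconds = 816   # fastest result on the workunit, as quoted in the thread
cpu_hours = 45      # run time reported for the Mac CPU task
credit = 320        # credit reported for the CPU task; assumed equal for both

cpu_seconds = cpu_hours * 3600

# How many times longer the CPU run took than the GPU run.
speed_ratio = cpu_seconds / gpu_seconds

# Fixed credit per workunit means credit per hour scales with speed.
cpu_credit_per_hour = credit / cpu_hours
gpu_credit_per_hour = credit / (gpu_seconds / 3600)

print(f"run-time ratio: {speed_ratio:.1f}x")
print(f"CPU: {cpu_credit_per_hour:.1f} credits/hour")
print(f"GPU: {gpu_credit_per_hour:.1f} credits/hour")
```

With these figures the GPU run is roughly 200 times shorter, which is why a fixed credit per workunit translates into very different credit per hour.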

mikey
Message 60803 - Posted: 24 Jan 2014, 12:38:40 UTC - in response to Message 60798.  

Thanks Richard.

©2019 Astroinformatics Group