Welcome to MilkyWay@home

Long crunch time on new N-Body simulations?


Message boards : Number crunching : Long crunch time on new N-Body simulations?

Bill
Message 68734 - Posted: 16 May 2019, 12:15:14 UTC

I've noticed I have a few N-Body tasks that are taking way longer than I have experienced in the past. What would normally take a few hours is creeping up to 24 hours or more.

Not all of my N-Body tasks are this way, but a few of them are here and here.

I'm letting them crunch through. They don't seem to be "stuck", just taking a while. They should be able to complete before the deadline, assuming no delays. I just wanted to see if anyone else is experiencing this.

Jim1348
Message 68736 - Posted: 16 May 2019, 13:57:39 UTC - in response to Message 68734.  

Yes, I saw it. I think it is normal.

Link
Message 68743 - Posted: 17 May 2019, 19:57:11 UTC - in response to Message 68734.  

Well, this isn't very surprising, since the new application uses only a single core.

Bill
Message 68744 - Posted: 17 May 2019, 20:47:34 UTC - in response to Message 68743.  

Ok, true, but I think the completion time is being underestimated. I had some tasks that took 24-26 hours to complete, but I think they were originally estimated at 11 hours or so. I didn't pay close enough attention to know for sure, so I'll have to check upcoming tasks to see whether this is really the case.

marmot
Message 68752 - Posted: 19 May 2019, 3:52:19 UTC - in response to Message 68744.  

The BOINC client keeps a running average of completion times to estimate completion, and the old runtimes outweigh the new runtimes in the average.

I think if you set <rec_half_life_days>0</rec_half_life_days> in cc_config.xml, restart BOINC, run it for an hour, and then set it back to the default of 10 days, <rec_half_life_days>10</rec_half_life_days>, you'll reset the running averages (for all WUs), and the estimate should be close to the right number within 24 hours.

I don't usually worry about the estimate (it's usually wrong), so I haven't tested this.
https://boinc.berkeley.edu/wiki/Client_configuration
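
For anyone who would rather script the cc_config.xml change than edit it by hand, here is a minimal Python sketch. It assumes the usual layout of that file, with options nested as cc_config > options; the function name `set_rec_half_life` and the file path passed to it are hypothetical. As far as I can tell, the client configuration page describes rec_half_life_days in terms of scheduling priority based on recent estimated credit, so whether it resets runtime estimates remains untested.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_rec_half_life(config_path, days):
    """Set <rec_half_life_days> under <options>, creating elements if missing.

    config_path is wherever your cc_config.xml lives (an assumption here;
    the BOINC data directory differs between systems).
    """
    path = Path(config_path)
    if path.exists():
        tree = ET.parse(str(path))
        root = tree.getroot()
    else:
        # No config yet: start from an empty <cc_config> document.
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    entry = options.find("rec_half_life_days")
    if entry is None:
        entry = ET.SubElement(options, "rec_half_life_days")
    entry.text = str(days)

    tree.write(str(path), encoding="utf-8")
    return entry.text
```

The client still has to re-read the file afterwards, for example by restarting BOINC as described above.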

Keith Myers
Message 68756 - Posted: 19 May 2019, 18:12:41 UTC - in response to Message 68752.  

Marmot's method is the correct way to update the estimated times. Or, if you want an estimate that changes more rapidly with a quickly changing data mix, change <rec_half_life_days>10</rec_half_life_days> to <rec_half_life_days>1</rec_half_life_days> and your estimates will only average over the last day.

That is what I run on my clients, since Seti has a fairly diverse data mix that changes daily.
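
To make the half-life idea concrete, here is a generic exponential-decay average in Python. It is only an illustration of what a half-life means, not BOINC's actual estimator; the function names and the sample runtimes are made up.

```python
def decay_weight(age_days, half_life_days):
    """Weight of a sample that is age_days old; halves every half_life_days.

    half_life_days must be greater than zero in this sketch.
    """
    return 0.5 ** (age_days / half_life_days)


def decayed_average(samples, half_life_days):
    """Weighted mean of (age_days, runtime_hours) pairs, newest weighted most."""
    weights = [decay_weight(age, half_life_days) for age, _ in samples]
    total = sum(w * runtime for w, (_, runtime) in zip(weights, samples))
    return total / sum(weights)


# Hypothetical history: three old 3-hour tasks and one fresh 24-hour task.
history = [(9, 3.0), (8, 3.0), (7, 3.0), (0, 24.0)]

slow = decayed_average(history, half_life_days=10)  # old runtimes still count
fast = decayed_average(history, half_life_days=1)   # dominated by the new task
print(f"half-life 10 days: {slow:.1f} h, half-life 1 day: {fast:.1f} h")
```

With the longer half-life the old short runtimes pull the average well below the newest runtime, which is the effect described above.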

Bill
Message 68759 - Posted: 19 May 2019, 23:01:26 UTC

Thanks for the input, Marmot and Keith. The reason I noticed this was because I am running MW, Einstein, and Seti all on the same machine. Einstein also has some tasks that are taking about 10 hours. I had several tasks from MW and Einstein that needed a lot of compute time, and their due dates were approaching.

I now suspect that I just had the "store at least" and "store up to an additional" days-of-work settings a little high. Compounding that with two projects whose newer tasks take longer to compute than before probably caused my concern. I had the settings at 1 and 5 days, but I have dialed them back to 0.5 and 2. We’ll see how it goes from there.

Bill
Message 68862 - Posted: 17 Jun 2019, 12:55:02 UTC

So, I think something is going on here. I currently have four N-Body 1.76 tasks running that have so far run for 2 or 3 DAYS, most of them with 10+ hours remaining. These tasks were downloaded over a week ago (6 June 2019, 9 June 2019), and I am pretty sure I would have noticed estimated times that long. Marmot, I did try adjusting the half-life as you suggested, and it did not pick up this discrepancy in ETA. This morning I adjusted my half-life setting down to 1.

I feel that this is a problem that shouldn't be happening. The CPU, despite being a laptop part, is from 2017, so it isn't a slow processor. It has a higher GFLOPS/core than an i7-8700. Is there any kind of debugging I can do to evaluate this?

swiftmallard
Message 68945 - Posted: 2 Aug 2019, 22:03:05 UTC - in response to Message 68862.  

My aging dual-core AMD Phenom II is processing N-Body 1.76 tasks in just a few hours, and all have been pretty close to the estimated time.

Keith Myers
Message 68946 - Posted: 3 Aug 2019, 3:05:28 UTC - in response to Message 68945.  

When the run_time greatly exceeds the cpu_time, that indicates a CPU that is overcommitted. Try running fewer tasks, or reduce the number of background processes that are stealing CPU cycles from the crunching.
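
A quick way to apply that check to numbers copied from a task's result page is sketched below in Python. It assumes a single-threaded task, so the ratio of cpu_time to run_time should sit near 1.0 on a healthy machine; the function names and the 0.8 cutoff are arbitrary choices for illustration, not BOINC values.

```python
def cpu_efficiency(cpu_time_s, run_time_s):
    """Fraction of wall-clock run time the task actually spent on the CPU."""
    if run_time_s <= 0:
        raise ValueError("run_time_s must be positive")
    return cpu_time_s / run_time_s


def looks_overcommitted(cpu_time_s, run_time_s, threshold=0.8):
    """True when the task got less CPU than the (assumed) threshold share."""
    return cpu_efficiency(cpu_time_s, run_time_s) < threshold


# Hypothetical task: 10 hours of CPU time spread over 48 hours of run time.
print(looks_overcommitted(10 * 3600, 48 * 3600))
```

A task that fails this check is waiting for the processor most of the time, which points at competing work rather than at the task itself.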

Bill
Message 68979 - Posted: 19 Aug 2019, 15:49:08 UTC

Sorry for the late response, but I think I know the culprit. I was crunching Intel GPU tasks for SETI@home, and that slowed everything down. I don't know why I didn't figure that out in the first place. Times have sped up significantly since I stopped running the iGPU tasks.

©2019 Astroinformatics Group