Welcome to MilkyWay@home

Long running wu.


Message boards : Number crunching : Long running wu.
adrianxw
Joined: 25 May 14
Posts: 24
Credit: 44,663,020
RAC: 20,438
Message 66947 - Posted: 10 Jan 2018, 9:29:39 UTC

I saw this wu...

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1559573628

on my system. Normally, the jobs here run to completion in 5-6 minutes, but this one was 56% complete, not increasing, and had over a day of crunching time logged. I suspended and released it; the percentage done dropped and it started again, running to completion quickly. You can see from the sent/returned dates that something weird happened. I've not noticed this here before, but it has certainly worried me, as I run the project on a system I do not look at much; it is simply a number cruncher.
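
For anyone who would rather not babysit a machine like this, the symptom described above (elapsed time keeps growing while the percentage done stays put) can be checked mechanically. The following is only a minimal sketch: the `Snapshot` type, the `is_stalled` function, and the 30-minute threshold are hypothetical names and values chosen for illustration, and how the snapshots are collected from the BOINC client is deliberately left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Snapshot:
    """One observation of a task (hypothetical structure, not a BOINC type)."""
    elapsed_s: float       # wall-clock time the task has been running, in seconds
    fraction_done: float   # progress reported for the task, between 0.0 and 1.0


def is_stalled(snapshots: Sequence[Snapshot],
               min_stall_s: float = 1800.0,
               eps: float = 1e-6) -> bool:
    """Return True if progress has not changed for at least min_stall_s seconds.

    Snapshots are assumed to be in chronological order. The 30-minute default
    is an arbitrary assumption, picked because normal tasks finish in minutes.
    """
    if len(snapshots) < 2:
        return False
    latest = snapshots[-1]
    stall_start = latest
    # Walk backwards to the oldest snapshot that reports the same progress.
    for snap in reversed(snapshots[:-1]):
        if abs(snap.fraction_done - latest.fraction_done) > eps:
            break
        stall_start = snap
    return latest.elapsed_s - stall_start.elapsed_s >= min_stall_s


# A task that finishes normally versus one stuck at 56% for about a day.
healthy = [Snapshot(0, 0.0), Snapshot(120, 0.40), Snapshot(300, 0.95)]
stuck = [Snapshot(0, 0.0), Snapshot(200, 0.56),
         Snapshot(3000, 0.56), Snapshot(90000, 0.56)]
print(is_stalled(healthy), is_stalled(stuck))
```

A check like this only flags the task; whether to suspend and resume it or restart the client is still a manual decision.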
ID: 66947 · Rating: 0
David Guymer
Joined: 8 Mar 09
Posts: 1
Credit: 15,064,161
RAC: 5,126
Message 66959 - Posted: 12 Jan 2018, 6:28:15 UTC

I have this problem too. Stopping and restarting the BOINC client resets the time to 5-6 minutes again.
ID: 66959 · Rating: 0
Bill
Joined: 8 Jan 18
Posts: 35
Credit: 4,860,686
RAC: 7,996
Message 66970 - Posted: 17 Jan 2018, 18:17:17 UTC

First, the link posted does not seem to work?

I had a similar problem with SETI@home WUs that used Intel GPUs. Work would start, but the ETA would never run out. If you paused crunching on the WU, it would just restart. I discovered that thrashing was an overall problem with iGPUs, so I just stopped using the processor on BOINC.

This problem appeared all of a sudden one day. I don't think I had made any changes to my system, so I suspect it was something with the WU itself.

Sorry, I know that doesn't help.
ID: 66970 · Rating: 0
adrianxw
Joined: 25 May 14
Posts: 24
Credit: 44,663,020
RAC: 20,438
Message 67029 - Posted: 2 Feb 2018, 14:21:13 UTC
Last modified: 2 Feb 2018, 14:21:52 UTC

>>> First, the link posted does not seem to work?

It does not work anymore; it did when I posted it.
ID: 67029 · Rating: 0
Yavanius
Joined: 27 Jan 15
Posts: 6
Credit: 1,304,712
RAC: 0
Message 67054 - Posted: 10 Feb 2018, 17:58:12 UTC - in response to Message 67029.  

That is because results are purged from the BOINC server's database after a certain number of days; otherwise you'd start to see huge performance degradation.

Typically, the more active a project, the shorter the period. The alternative is investing in really expensive corporate enterprise servers...
ID: 67054 · Rating: 0
Yavanius
Joined: 27 Jan 15
Posts: 6
Credit: 1,304,712
RAC: 0
Message 67055 - Posted: 10 Feb 2018, 18:01:16 UTC - in response to Message 66947.  

>>> I saw this wu...
>>>
>>> http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1559573628
>>>
>>> on my system. Normally, the jobs here run to completion in 5-6 minutes, but this one was 56% complete, not increasing, and had over a day of crunching time logged. I suspended and released it, the percentage done dropped and it started again running to completion quickly.


Seen that a number of times. I think it's an ongoing bug they are battling. If you don't want to babysit, turn N-body off. Pretty annoying to come home and realize your computer has been spinning its wheels on a WU since last night...
ID: 67055 · Rating: 0
Suicyder
Joined: 25 Mar 18
Posts: 2
Credit: 1,636,029
RAC: 0
Message 67277 - Posted: 27 Mar 2018, 16:49:46 UTC
Last modified: 27 Mar 2018, 16:51:57 UTC

A bit of a kick to this topic, but apparently I am running into the same issue with N-body tasks.

Once my display turns off (power saving option), the task basically stops running: the timer moves on, but the task calculations don't.

After returning, it shows the following:
https://tweakers.net/ext/f/ulINereZ3BO5WoqV9aqqHxlB/full.png

After restarting BOINC, this:
https://tweakers.net/ext/f/iqFyPTv2veSk6YY07Rwg5hxS/full.png

So far it has only happened with N-body tasks during my project testing period.
When it doesn't occur, the WU finishes within 10 minutes, unlike the 20 or so it needed after the stall occurred.
ID: 67277 · Rating: 0
Suicyder
Joined: 25 Mar 18
Posts: 2
Credit: 1,636,029
RAC: 0
Message 67282 - Posted: 28 Mar 2018, 17:54:27 UTC

I've been trying a lot: updated drivers, the latest version of BOINC, no other programs running, keeping the machine active.

But it doesn't matter whether I keep the machine active or not; eventually the {mt} task will stop running. As it is now, I will just remove the {mt} WUs and turn them off. It's a shame I have to, but they keep stalling.
ID: 67282 · Rating: 0
mikey
Joined: 8 May 09
Posts: 2254
Credit: 351,689,895
RAC: 943,490
Message 67284 - Posted: 29 Mar 2018, 10:25:49 UTC - in response to Message 67282.  

>>> Been trying a lot, drivers updated, latest version of BOINC, no other programs running, keeping the machine active.
>>>
>>> But doesn't matter if I keep the machine active or not, eventually the {mt} task will stop running. As it is now I will just remove {mt} WU's and turn them off. Shame I have to, but they keep suspending progression.


A lot of us are in the same boat: they just won't work like they should all the time, so we've moved on. As long as they work for enough people, they will keep sending them out.
ID: 67284 · Rating: 0
Jim1348
Joined: 9 Jul 17
Posts: 59
Credit: 6,780,407
RAC: 4,218
Message 67285 - Posted: 31 Mar 2018, 8:05:10 UTC

I haven't seen a long-running work unit yet, but I am on Ubuntu 16.04 (i7-4770). It might be a Windows problem?
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=763264
ID: 67285 · Rating: 0


©2019 Astroinformatics Group