Long running wu.

Message boards : Number crunching : Long running wu.

adrianxw
Joined: 25 May 14
Posts: 16
Credit: 34,526,677
RAC: 32,305

Message 66947 - Posted: 10 Jan 2018, 9:29:39 UTC

I saw this wu...

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1559573628

on my system. Normally, the jobs here run to completion in 5-6 minutes, but this one was 56% complete, not increasing, and had over a day of crunching time logged. I suspended and released it, the percentage done dropped and it started again running to completion quickly. You can see by the sent/returned dates that something weird happened. I've not noticed this before here, but it has certainly worried me, as I have the project on here, a system I do not look at much, it is simply a number cruncher.

David Guymer
Joined: 8 Mar 09
Posts: 1
Credit: 12,311,053
RAC: 6,367

Message 66959 - Posted: 12 Jan 2018, 6:28:15 UTC

I have this problem too. Stopping and restarting the BOINC client resets the time to 5-6 minutes again.

Bill
Joined: 8 Jan 18
Posts: 1
Credit: 1,290,447
RAC: 6,574

Message 66970 - Posted: 17 Jan 2018, 18:17:17 UTC

First, the link posted does not seem to work?

I had a similar problem for Seti WUs that used Intel GPUs. Work would start, but ETA would never end. If you paused crunching on the WU, it would just restart. I discovered that thrashing was an overall problem with iGPUs, so I just stopped using the processor on BOINC.

This problem sprouted all of a sudden one day; I don't think I had any changes on my system so I suspect it was something with the WU itself.

Sorry, I know that doesn't help.

adrianxw
Joined: 25 May 14
Posts: 16
Credit: 34,526,677
RAC: 32,305

Message 67029 - Posted: 2 Feb 2018, 14:21:13 UTC
Last modified: 2 Feb 2018, 14:21:52 UTC

>>> First, the link posted does not seem to work?

It does not work anymore, but it did when I posted it.

Yavanius
Joined: 27 Jan 15
Posts: 6
Credit: 1,026,724
RAC: 0

Message 67054 - Posted: 10 Feb 2018, 17:58:12 UTC - in response to Message 67029.

That is because the database clears results from the BOINC server after so many days; otherwise you'd start to get huge performance degradation.

Typically, the more active a project, the shorter the period. The alternative is investing in really expensive corporate enterprise servers...

Yavanius
Joined: 27 Jan 15
Posts: 6
Credit: 1,026,724
RAC: 0

Message 67055 - Posted: 10 Feb 2018, 18:01:16 UTC - in response to Message 66947.

>>> I saw this wu...
>>>
>>> http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1559573628
>>>
>>> on my system. Normally, the jobs here run to completion in 5-6 minutes, but this one was 56% complete, not increasing, and had over a day of crunching time logged. I suspended and released it, the percentage done dropped and it started again running to completion quickly.


Seen that a number of times. I think it's an ongoing bug they are battling. If you don't want to babysit, turn N-body off. Pretty annoying to come home and realize your computer has been spinning its wheels on a WU since last night...

Suicyder
Joined: 25 Mar 18
Posts: 2
Credit: 1,636,029
RAC: 0

Message 67277 - Posted: 27 Mar 2018, 16:49:46 UTC
Last modified: 27 Mar 2018, 16:51:57 UTC

A bit of a kick to this topic, but apparently I am running into the same issue with N-body tasks.

Once my display turns off (power-saving option), the task basically stops running: the timer moves on, but the task calculations don't.

After returning, it shows the following:
https://tweakers.net/ext/f/ulINereZ3BO5WoqV9aqqHxlB/full.png

After restarting BOINC, this:
https://tweakers.net/ext/f/iqFyPTv2veSk6YY07Rwg5hxS/full.png

So far it has only happened with N-body tasks during my project testing period.
When this doesn't occur, the WU finishes within 10 minutes, unlike the 20 or so it needed after it occurred.

Suicyder
Joined: 25 Mar 18
Posts: 2
Credit: 1,636,029
RAC: 0

Message 67282 - Posted: 28 Mar 2018, 17:54:27 UTC

Been trying a lot, drivers updated, latest version of BOINC, no other programs running, keeping the machine active.

But doesn't matter if I keep the machine active or not, eventually the {mt} task will stop running. As it is now I will just remove {mt} WU's and turn them off. Shame I have to, but they keep suspending progression.

mikey
Joined: 8 May 09
Posts: 2183
Credit: 232,361,889
RAC: 230,124

Message 67284 - Posted: 29 Mar 2018, 10:25:49 UTC - in response to Message 67282.

>>> Been trying a lot, drivers updated, latest version of BOINC, no other programs running, keeping the machine active.
>>>
>>> But doesn't matter if I keep the machine active or not, eventually the {mt} task will stop running. As it is now I will just remove {mt} WU's and turn them off. Shame I have to, but they keep suspending progression.


A lot of us are in the same boat; they just won't work like they should all the time, so we've moved on. As long as they work for enough people, they will keep sending them out.

Jim1348
Joined: 9 Jul 17
Posts: 18
Credit: 843,472
RAC: 4,253

Message 67285 - Posted: 31 Mar 2018, 8:05:10 UTC

I haven't seen a long running work unit yet, but I am on Ubuntu 16.04 (i7-4770). It might be a Windows problem?
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=763264



Copyright © 2018 AstroInformatics Group