Welcome to MilkyWay@home

Failing workunits

James Sotherden
Joined: 3 Jan 09
Posts: 139
Credit: 50,066,562
RAC: 0
Message 43164 - Posted: 25 Oct 2010, 20:25:46 UTC

I'm also wondering if I should run MW on my i7. 68 hours for a work unit to complete, and only getting 213 credits for it, stinks. And I have more of them to crunch. NNT (No New Tasks) for me.

The Mac is doing them in 9 hours. Why so slow on an i7 quad? They are the same DE-separation units.
Dirk Sadowski
Joined: 30 Apr 09
Posts: 101
Credit: 29,874,293
RAC: 0
Message 43175 - Posted: 26 Oct 2010, 13:27:42 UTC - in response to Message 43164.  

Currently I wouldn't run the stock 0.4 app for Windows.
AFAIK, the Windows app is a little bit buggy: about 200% slower than the Linux app.
The 0.4 app should be about 30% faster than 0.19, but currently it isn't.
So we need to wait for 0.41, or whatever the next version number turns out to be.

The 0.21 (SSE2) N-body app is running well: about 45-60 minutes per WU on my E7600 @ 3.06 GHz.

http://milkyway.cs.rpi.edu/milkyway/apps.php

Ski King
Joined: 7 Nov 09
Posts: 3
Credit: 1,093,996
RAC: 0
Message 43365 - Posted: 1 Nov 2010, 3:36:33 UTC - in response to Message 43164.  

I just gave MW another chance; however, it downloaded a WU with an estimated run time of 78 hours on a 4-CPU Mac, where it elevated itself to high priority and grabbed 3.9 CPUs. I immediately aborted it and will give MW a 1-2 week vacation. Maybe it will be better later.
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 43367 - Posted: 1 Nov 2010, 3:57:50 UTC - in response to Message 43365.  

Ski King wrote: "I just gave MW another chance; however, it downloaded a WU with an estimated run time of 78 hours on a 4-CPU Mac, where it elevated itself to high priority and grabbed 3.9 CPUs. I immediately aborted it and will give MW a 1-2 week vacation. Maybe it will be better later."

The N-body time estimates are almost meaningless; they are set really high to keep BOINC from killing off the processes, and a genuinely good time estimate is hard to make. The N-body app uses however many processors BOINC tells it to use.
Ski King
Joined: 7 Nov 09
Posts: 3
Credit: 1,093,996
RAC: 0
Message 43410 - Posted: 2 Nov 2010, 3:40:19 UTC - in response to Message 43367.  

That's what I thought, but MW still locks out all other Projects!

There are 5 projects and all have Resource share set to 100. There are 4 CPUs so there should be 4 tasks running at one time. MW shows a Status of Running, high priority (3.90 CPUs) and the other tasks are Waiting to run.

BOINC processor usage preferences are set to:

On multiprocessors, use at most 4 processors
On multiprocessors, use at most 100 % of the processors
Use at most 100 percent of CPU time

How can I force MW to not lock out the other tasks?
Bill Walker
Joined: 19 Aug 09
Posts: 23
Credit: 631,303
RAC: 0
Message 43414 - Posted: 2 Nov 2010, 11:40:18 UTC - in response to Message 43410.  

Ski King, it can take weeks for BOINC to even out the work history of all projects when you add a new project. In effect, BOINC looks at the long-term average CPU usage of all your projects and tries to even them out. If you just added MW, its long-term average is very low, probably much lower than that of the projects you have been running. Right now, BOINC is using all your CPUs on MW, so its average goes up a bit every day, while the long-term averages of the other projects go down a bit every day. Once the averages are roughly equal, BOINC will start sharing the CPUs between all the projects.
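The averaging described above can be pictured with a toy model. This is only a sketch under stated assumptions, not BOINC's actual scheduler: the function name `days_monopolized`, the exponential-decay rule, the `decay` values, and the starting averages are all made up for illustration. Each simulated day, whichever project has the lowest long-term average gets every CPU.

```python
def days_monopolized(averages, newcomer, decay=0.9, max_days=365):
    """Count the days a newly added project keeps every CPU in a toy model.

    Each simulated day, the project with the lowest long-term average gets
    all the CPUs. Averages are exponentially decayed: a day of running pulls
    a project's average toward 1.0, a day of waiting pulls it toward 0.0.
    This is an illustrative assumption, not BOINC's real algorithm.
    """
    avg = dict(averages)  # copy so the caller's dict is left untouched
    for day in range(max_days):
        lowest = min(avg, key=avg.get)
        if lowest != newcomer:
            return day  # another project is now the one most "owed" CPU time
        for name in avg:
            ran_today = 1.0 if name == newcomer else 0.0
            avg[name] = decay * avg[name] + (1.0 - decay) * ran_today
    return max_days


# Hypothetical starting point: four established projects that have shared
# the CPUs equally, plus MilkyWay@home freshly attached with no history.
history = {"MW": 0.0, "A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

print(days_monopolized(history, "MW", decay=0.9))   # short memory
print(days_monopolized(history, "MW", decay=0.99))  # long memory
```

The closer `decay` is to 1, the longer the model remembers past usage and the longer the new project holds all the CPUs before the others get a turn, which matches the "it can take weeks" behaviour in spirit only.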
Brian Priebe
Joined: 27 Nov 09
Posts: 108
Credit: 430,760,953
RAC: 0
Message 43427 - Posted: 2 Nov 2010, 17:53:11 UTC - in response to Message 43414.  

From my observations of BOINC 6.10.56/58, it's not at all clear that the BOINC scheduler is working as you might expect. It's had over a year to balance the workload between 5 CPU projects on my machines here. Once the MW 0.04/0.40 WU run times headed for the moon, it did not throttle back MW WU counts on the machines. Instead, as I write this, it has 15-17 WUs downloaded for MW on each machine (to satisfy only a 1-day cache), each estimated at 31-35 hours of run time. Each machine only has 16 (hyperthreaded) cores available.

With a 7-day turnaround time required, it's now running up to 6 of these MW WUs per machine at high priority. With MW version 0.19, it never ran any at high priority.
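A back-of-the-envelope check suggests why those WUs could end up at high priority. The task counts, run-time estimates, core count, and 7-day deadline come from the post above; the equal one-fifth share for MW is an assumption (the post does not state the resource shares), as is treating every WU as single-threaded. BOINC's real deadline check is more involved than this, so treat it as a rough sketch only.

```python
def core_hours_needed(n_tasks, hours_per_task):
    """Total core-hours required if each task occupies one core."""
    return n_tasks * hours_per_task


def core_hours_available(n_cores, deadline_days, share):
    """Core-hours one project can expect before the deadline at a given share."""
    return n_cores * 24 * deadline_days * share


CORES = 16         # hyperthreaded cores per machine (from the post)
DEADLINE_DAYS = 7  # required turnaround (from the post)
MW_SHARE = 1 / 5   # ASSUMPTION: five projects with equal resource share

available = core_hours_available(CORES, DEADLINE_DAYS, MW_SHARE)
low = core_hours_needed(15, 31)   # fewest tasks, shortest estimate
high = core_hours_needed(17, 35)  # most tasks, longest estimate

print(f"available to MW within the deadline: {available:.1f} core-hours")
print(f"needed: {low} to {high} core-hours")
print("low end fits: ", low <= available)
print("high end fits:", high <= available)
```

Under these assumptions the low end of the range fits into MW's normal share and the high end does not, so the client would have to give MW more than its share to meet the deadlines.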
Odysseus
Joined: 10 Nov 07
Posts: 96
Credit: 29,931,027
RAC: 0
Message 43438 - Posted: 3 Nov 2010, 0:36:39 UTC

Still getting errors on my G4 Macs; a small mercy is that the tasks bomb right away, so the time-wastage is negligible. Have set the project to NNT on all three until a new app becomes available.

Brent
Joined: 16 Mar 10
Posts: 12
Credit: 22,284,745
RAC: 0
Message 43895 - Posted: 16 Nov 2010, 23:19:04 UTC

Well, I still don't see anything that states this problem is resolved, so I will remain NNT on MilkyWay until I see something positive.


Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 43896 - Posted: 16 Nov 2010, 23:38:30 UTC - in response to Message 43895.  

Brent wrote: "Well, I still don't see anything that states this problem is resolved, so I will remain NNT on MilkyWay until I see something positive."
This problem was fixed a long time ago now. I think the only problem mentioned here that's probably still present is the failures on some old OS X PPC systems.
wobo
Joined: 3 Nov 10
Posts: 3
Credit: 115,198
RAC: 0
Message 43905 - Posted: 17 Nov 2010, 2:56:04 UTC

The problem I have been experiencing for a couple of days:
I am getting MW WUs by the dozens, but all of them claim to finish within 2 seconds, upload, and are then marked invalid. The problem is that MW keeps sending those WUs and I have completely run out of "real" WUs.

Example:
Task 246027162
Name	de_separation_82_2s_20_2_40432_1289765787_0
Workunit	185699499
Created	14 Nov 2010 20:16:39 UTC
Sent	14 Nov 2010 20:18:44 UTC
Received	14 Nov 2010 22:43:41 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x0)
Computer ID	230245
Report deadline	22 Nov 2010 20:18:44 UTC
Run time	5.087995
CPU time	3.15052
stderr out	

<core_client_version>6.10.56</core_client_version>
<![CDATA[
<stderr_txt>
BOINC_APP_VERSION: 0.18
BOINC_APP_NAME: speedimic_SSE4.1_64
COMPILER: icpc (ICC) 11.0 20090131
Code-Optimizations by Gipsel
Compiled by speedimic
called boinc_finish

</stderr_txt>
]]>

Validate state	Invalid
Claimed credit	0.0240752142977578
Granted credit	0
application version	Anonymous platform

Workunit 185699499
	
name	de_separation_82_2s_20_2_40432_1289765787
application	MilkyWay@Home
created	14 Nov 2010 20:16:27 UTC
minimum quorum	1
initial replication	2
max # of error/total/success tasks	3, 9, 6
Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 43906 - Posted: 17 Nov 2010, 3:03:04 UTC - in response to Message 43905.  

That isn't the stock application. That doesn't look like it's going to work.
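The signs of a non-stock build can be pulled out of a task report mechanically. The sketch below is a minimal illustration, assuming the task summary is tab-separated as in the pasted listing and that the stderr header uses `NAME: value` lines; the helper names `parse_fields` and `task_notes` are hypothetical and not part of BOINC. "Anonymous platform" is how BOINC labels an application the volunteer installed themselves rather than one sent by the project.

```python
def parse_fields(text, separator):
    """Collect 'key<separator>value' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, found, value = line.partition(separator)
        if found and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def task_notes(summary, stderr):
    """Return human-readable warnings about a finished task."""
    notes = []
    if summary.get("application version") == "Anonymous platform":
        app = stderr.get("BOINC_APP_NAME", "unknown")
        notes.append("user-supplied app: " + app)
    if (summary.get("Outcome") == "Success"
            and summary.get("Validate state") == "Invalid"):
        notes.append("finished without error but failed validation")
    return notes


# Excerpt of the task report quoted in the thread (tabs written as \t).
SUMMARY = (
    "Outcome\tSuccess\n"
    "Run time\t5.087995\n"
    "Validate state\tInvalid\n"
    "application version\tAnonymous platform\n"
)
STDERR = (
    "BOINC_APP_VERSION: 0.18\n"
    "BOINC_APP_NAME: speedimic_SSE4.1_64\n"
    "COMPILER: icpc (ICC) 11.0 20090131\n"
    "Code-Optimizations by Gipsel\n"
    "called boinc_finish\n"
)

for note in task_notes(parse_fields(SUMMARY, "\t"), parse_fields(STDERR, ":")):
    print(note)
```

A report that triggers both notes, as this one does, points at the locally installed application rather than at the workunits themselves.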