new workunit limit

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 6951 - Posted: 29 Nov 2008 | 21:06:40 UTC

It looks like the transitioner really can't keep up with what's going on with milkyway right now, so in order to speed things up I would like to reduce the workunit limit (5 would be ideal, 10 passable). That reduces the size of the database, which should speed things up.

Now that the server is assigning WUs at a per-core rate as opposed to a per-computer rate, I think this should work out fine; it will also give us better results for the searches we're running.

I'm going to lower the WU limit to 5, and if this is really unworkable I'll raise it. Hopefully this should speed up the transitioner and make more work available.
____________

Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 6953 - Posted: 29 Nov 2008 | 21:26:46 UTC

5 doesn't do it unless they are made longer, as in an hour not 10 min.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

John Clark
Joined: 4 Oct 08
Posts: 1613
Credit: 62,023,268
RAC: 27,502
Message 6958 - Posted: 29 Nov 2008 | 21:37:15 UTC
Last modified: 29 Nov 2008 | 21:50:19 UTC

Travis

I think 5 per core is too low for the new Penryn Quads. These crunch WUs in about 3 - 5 minutes.

If the server contact script were modified to allow a maximum server recontact time of, say, 10 minutes, then this may work. But I would also recommend a slightly higher WU-per-core number (say 10 max). ATM the server recontact time starts at a few minutes, then quickly escalates to 45 minutes, then 55 minutes, then to 1 hour, 2 hours and 3 hours.
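
To illustrate the effect of such a cap, here is a minimal Python sketch. The deferral values are the ones listed in this post, with "a few minutes" assumed to be 3; the function name and the 10-minute cap are purely illustrative, and nothing here reflects how the BOINC client or the scheduler actually computes its backoff.

```python
# Deferral times in minutes as described in the post; the first value
# ("a few minutes") is an assumed 3.
OBSERVED_DEFERRALS = [3, 45, 55, 60, 120, 180]


def cap_deferrals(deferrals, cap_minutes):
    """Return the deferral sequence with every value limited to cap_minutes."""
    return [min(d, cap_minutes) for d in deferrals]


capped = cap_deferrals(OBSERVED_DEFERRALS, 10)
print("uncapped total wait:", sum(OBSERVED_DEFERRALS), "min")
print("capped total wait:  ", sum(capped), "min")
```

With a cap, six consecutive failed requests would cost under an hour of waiting in total instead of several hours.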

My server recontact time is the same as JAMC's 'communication deferred' time.

JAMC
Joined: 9 Sep 08
Posts: 96
Credit: 336,443,946
RAC: 0
Message 6959 - Posted: 29 Nov 2008 | 21:38:14 UTC
Last modified: 29 Nov 2008 | 21:40:57 UTC

The killer for my quads is the 'communication deferred' time in the BOINC projects tab. Is that set by the project or by BOINC? Even with network activity set to always on, when the communication deferred time goes to 2 or 3+ hours I lose all hope of keeping WUs cached and always run out, and that's with a 20 WU/core limit. We have to get longer WUs to make the change to 5 WUs/core work, and I guess that means we all have to run test apps as well...

Thierry Godefroy
Joined: 29 Jul 08
Posts: 9
Credit: 839,694
RAC: 0
Message 6972 - Posted: 29 Nov 2008 | 22:44:13 UTC

This is pure nonsense... Even with 20 WUs per core the queue was stalling unless I manually requested more work.

The problem is that when BOINC gets the reply "Reached CPU limit" several times in a row, it starts delaying the work request, and in the end it gets delayed by over 3 hours... And as 20 WUs are crunched in under 110 minutes, you get a queue stall (not to mention it's a hell of a nightmare to get just a few more WUs at the next request).
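
The stall can be put into numbers with a small Python sketch. It assumes about 5.5 minutes per WU (20 WUs in 110 minutes, an upper bound from the figures above) and a single 3-hour deferral; the function names are made up for illustration and the model ignores everything else the client does.

```python
def cache_minutes(wus_per_core, minutes_per_wu):
    """How long a full per-core cache keeps one core busy."""
    return wus_per_core * minutes_per_wu


def idle_minutes(wus_per_core, minutes_per_wu, deferral_minutes):
    """Idle time per core if the next work request is deferred that long."""
    busy = cache_minutes(wus_per_core, minutes_per_wu)
    return max(0.0, deferral_minutes - busy)


# 20 WUs in 110 minutes implies about 5.5 minutes per WU (assumed constant).
MINUTES_PER_WU = 110 / 20

for limit in (5, 10, 20):
    busy = cache_minutes(limit, MINUTES_PER_WU)
    idle = idle_minutes(limit, MINUTES_PER_WU, 180)
    print(f"limit {limit:2d}: cache lasts {busy:5.1f} min, idle {idle:5.1f} min")
```

Under these assumptions even the 20-WU limit leaves a core idle for over an hour during a 3-hour deferral, and lower limits make it worse.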

The solution is simple: make it so that the optimized apps will need 60 minutes or so to crunch each WU (multiply the work to do per WU by 12).

As it is, I would rather crunch for another project than let the queue stall and leave the computer powered on for nothing at all...

Profile Misfit
Joined: 27 Aug 07
Posts: 915
Credit: 1,503,115
RAC: 0
Message 6974 - Posted: 29 Nov 2008 | 22:46:04 UTC - in response to Message 6951.
Last modified: 29 Nov 2008 | 22:47:13 UTC

11/29/2008 2:44:13 PM|Milkyway@home|Sending scheduler request: Requested by user. Requesting 3317 seconds of work, reporting 3 completed tasks
11/29/2008 2:44:28 PM|Milkyway@home|Scheduler request succeeded: got 0 new tasks
11/29/2008 2:44:28 PM|Milkyway@home|Message from server: No work sent
11/29/2008 2:44:28 PM|Milkyway@home|Message from server: (reached per-CPU limit of 5 tasks)
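
For anyone who wants to pull the limit out of a saved message log, a minimal Python sketch follows. It assumes the pipe-separated "timestamp|project|message" layout seen in the lines above; the function names are hypothetical and this is not part of any BOINC tool.

```python
import re

# Matches the server message shown in the log excerpt above.
LIMIT_PATTERN = re.compile(r"reached per-CPU limit of (\d+) tasks")


def parse_log_line(line):
    """Split 'timestamp|project|message' into its three fields."""
    timestamp, project, message = line.split("|", 2)
    return timestamp, project, message


def per_cpu_limit(lines):
    """Return the per-CPU task limit reported in the log, or None if absent."""
    for line in lines:
        _, _, message = parse_log_line(line)
        match = LIMIT_PATTERN.search(message)
        if match:
            return int(match.group(1))
    return None


LOG = [
    "11/29/2008 2:44:13 PM|Milkyway@home|Sending scheduler request: Requested by user. Requesting 3317 seconds of work, reporting 3 completed tasks",
    "11/29/2008 2:44:28 PM|Milkyway@home|Scheduler request succeeded: got 0 new tasks",
    "11/29/2008 2:44:28 PM|Milkyway@home|Message from server: No work sent",
    "11/29/2008 2:44:28 PM|Milkyway@home|Message from server: (reached per-CPU limit of 5 tasks)",
]

print(per_cpu_limit(LOG))
```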


Well that brought me to the message board. :/

I'm going to lower the WU limit to 5, and if this is really unworkable I'll raise it.

It's really unworkable. You should raise it.
____________

Bigred
Joined: 23 Nov 07
Posts: 33
Credit: 300,042,542
RAC: 0
Message 6975 - Posted: 29 Nov 2008 | 22:50:24 UTC

This strategy seems to be working for me. My Quads are staying at 20 tasks. As soon as any are done they are reported and replaced.


____________

Profile caspr
Joined: 22 Mar 08
Posts: 90
Credit: 501,728
RAC: 0
Message 6977 - Posted: 29 Nov 2008 | 23:00:51 UTC - in response to Message 6975.

This strategy seems to be working for me. My Quads are staying at 20 tasks. As soon as any are done they are reported and replaced.




Same here but also my pendings are still growing.
____________
A clear conscience is usually the sign of a bad memory



Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 6979 - Posted: 29 Nov 2008 | 23:05:32 UTC - in response to Message 6977.

This strategy seems to be working for me. My Quads are staying at 20 tasks. As soon as any are done they are reported and replaced.




Same here but also my pendings are still growing.


I'm hoping with the lower limit the transitioner will be able to keep up with the work requests. I'll bump things up to 8 and see how that works out -- I don't want people getting communication deferreds if they're crunching too fast.

The assimilator/validator for the new app are a lot faster than the old one, so when we make the switch to running only the new app, this should help a bit as well.
____________

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 6980 - Posted: 29 Nov 2008 | 23:10:02 UTC - in response to Message 6951.

No problem, you did also say that the work units would be 12 to 20 times longer, right? If you REALLY want to lower the stress on the transitioner, then you must increase the length of the work unit. A 25 minute per core cache is only going to increase the stress on the transitioner, as it will compel everyone running MW to be continuously hitting the server.
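
A rough way to see this, sketched in Python: assume a host contacts the server at least once each time its per-core cache drains, and ignore backoff and separate result reporting. The function name and the simple model are assumptions made for illustration, not a description of the real scheduler.

```python
MINUTES_PER_DAY = 24 * 60


def min_refills_per_day(wu_limit, minutes_per_wu):
    """Lower bound on daily server contacts per core: one per drained cache."""
    return MINUTES_PER_DAY / (wu_limit * minutes_per_wu)


# 5 WUs of 5 minutes is the 25 minute cache mentioned above.
for limit, wu_minutes in ((5, 5), (20, 5), (5, 60)):
    contacts = min_refills_per_day(limit, wu_minutes)
    print(f"{limit:2d} WUs x {wu_minutes:2d} min: at least {contacts:.1f} contacts/day per core")
```

In this simplified model, lengthening the WUs cuts the contact rate much more than the cache limit alone can.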

Seriously, the problem isn't a 5, 10 or 20 WU cache limit, it is the 5 minute WU, fix that and things would be fine, keep it the way it is, and you end up wasting your time chasing server problems along with users wasting their time chasing WU's.


It looks like the transitioner really can't keep up with what's going on with milkyway right now, so in order to speed things up I would like to reduce the workunit limit (5 would be ideal, 10 passable). That reduces the size of the database, which should speed things up.

Now that the server is assigning WUs at a per-core rate as opposed to a per-computer rate, I think this should work out fine; it will also give us better results for the searches we're running.

I'm going to lower the WU limit to 5, and if this is really unworkable I'll raise it. Hopefully this should speed up the transitioner and make more work available.


____________

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 6982 - Posted: 29 Nov 2008 | 23:11:50 UTC

I see you just bumped up the cache to 8 WU's from 5. But until you have reasonably timed WU's, keeping things running smoothly is just a dream.

____________

Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 6984 - Posted: 29 Nov 2008 | 23:15:44 UTC - in response to Message 6980.

Seriously, the problem isn't a 5, 10 or 20 WU cache limit, it is the 5 minute WU,


How many times has this been said? I know I have. It was why the old,old,old wu's were made into hours in the first place, until the optimised apps came along.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 6988 - Posted: 29 Nov 2008 | 23:32:42 UTC - in response to Message 6984.

Seriously, the problem isn't a 5, 10 or 20 WU cache limit, it is the 5 minute WU,


How many times has this been said? I know I have. It was why the old,old,old wu's were made into hours in the first place, until the optimised apps came along.


I know we need longer WUs. In fact, I think this next meeting will be all about how we can get them much longer :P
____________

John Clark
Joined: 4 Oct 08
Posts: 1613
Credit: 62,023,268
RAC: 27,502
Message 6991 - Posted: 29 Nov 2008 | 23:43:05 UTC - in response to Message 6988.
Last modified: 29 Nov 2008 | 23:44:13 UTC

Seriously, the problem isn't a 5, 10 or 20 WU cache limit, it is the 5 minute WU,


How many times has this been said? I know I have. It was why the old,old,old wu's were made into hours in the first place, until the optimised apps came along.


I know we need longer WUs. In fact, I think this next meeting will be all about how we can get them much longer :P


I presume you can now think about more science to make the WUs longer and more useful to your science.

On an aside -

So far my PCs are being kept fed, and the work ready to send on the servers has been high (compared to the past), which means that the work is there to satisfy demand.

The only problem I see, when the computers here are unsupervised, is the build-up of the 'deferring communications for xxxx' time. As long as this does not rise above, say, 20 minutes, I think the current WUs-per-core limit might work OK.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 6994 - Posted: 29 Nov 2008 | 23:46:36 UTC - in response to Message 6991.

Seriously, the problem isn't a 5, 10 or 20 WU cache limit, it is the 5 minute WU,


How many times has this been said? I know I have. It was why the old,old,old wu's were made into hours in the first place, until the optimised apps came along.


I know we need longer WUs. In fact, I think this next meeting will be all about how we can get them much longer :P


I presume you can now think about more science to make the WUs longer and more useful to your science.


Yes, but unfortunately that takes time :( I'm going to see what we can do more short-term, until we can take the analysis up to the next level (and hopefully make the WUs really long).
____________

Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 6995 - Posted: 29 Nov 2008 | 23:47:11 UTC - in response to Message 6991.

Not quite the same, but when I get 'No work' it will defer: 1 min, 1 min, 1 min, 3 hours (or some variation).
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 6996 - Posted: 29 Nov 2008 | 23:49:44 UTC - in response to Message 6994.

Well, it might take longer for you, since you are both the cook and bottle washer: the more time you spend nursing the server, the less time you have for all the good stuff (smile).


Yes, but unfortunately that takes time :( I'm going to see what we can do more short-term, until we can take the analysis up to the next level (and hopefully make the WUs really long).


____________

Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 7006 - Posted: 30 Nov 2008 | 0:25:22 UTC

Seems like only a temp fix:

[As of 30 Nov 2008 0:21:21 UTC]
Results ready to send 1,505
Results in progress 47,165
Workunits waiting for validation 3,524
Workunits waiting for assimilation 461
Workunits waiting for deletion 67
Results waiting for deletion 92
Transitioner backlog (hours) 2

~30 min ago
ready to send: 15k
in progress: 35k
valid: >100
(others about same)
deletion: 2
backlog: 2 hours
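
As a rough check on how fast the queue moved between the two snapshots, here is a small Python sketch. It takes "15k" and "35k" as 15,000 and 35,000 and "~30 min" as exactly 30 minutes, so the rates are only approximate; the function name is made up.

```python
def change_per_minute(before, after, minutes):
    """Average change in a server counter per minute between two snapshots."""
    return (after - before) / minutes


# Approximate earlier values (15k, 35k) against the later status page figures.
ready_rate = change_per_minute(15_000, 1_505, 30)
in_progress_rate = change_per_minute(35_000, 47_165, 30)

print(f"ready to send: {ready_rate:+.0f} results/min")
print(f"in progress:   {in_progress_rate:+.0f} results/min")
```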
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile m4rtyn
Joined: 16 Jan 08
Posts: 18
Credit: 4,111,257
RAC: 0
Message 7008 - Posted: 30 Nov 2008 | 0:55:10 UTC

It's not gonna work! Already I'm getting repeated "No Work Sent" messages and my PCs are backing off to between 1 & 3 hours. Without constant attendance they'll spend most of the time with an empty cache.
____________
m4rtyn

JAMC
Joined: 9 Sep 08
Posts: 96
Credit: 336,443,946
RAC: 0
Message 7009 - Posted: 30 Nov 2008 | 0:59:02 UTC - in response to Message 7008.

It's not gonna work! Already I'm getting repeated "No Work Sent" messages and my PCs are backing off to between 1 & 3 hours. Without constant attendance they'll spend most of the time with an empty cache.


...same here :(

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 7010 - Posted: 30 Nov 2008 | 1:00:02 UTC - in response to Message 7008.

I suspect that all the change really did was reduce the number of work units folks were pulling from the server for about an hour, and that hour is up, so the server is now getting the same sort of constant drone from hungry workstations.

It's not gonna work! Already I'm getting repeated "No Work Sent" messages and my PCs are backing off to between 1 & 3 hours. Without constant attendance they'll spend most of the time with an empty cache.


____________

John Clark
Joined: 4 Oct 08
Posts: 1613
Credit: 62,023,268
RAC: 27,502
Message 7011 - Posted: 30 Nov 2008 | 1:01:31 UTC
Last modified: 30 Nov 2008 | 1:02:03 UTC

Mine are heading that way, but have not reached that point. Most will be out within 30 minutes and the rest will take a little longer.

Looks like my parallel projects (Einstein and Malaria) will pull more cores than they expected soon.

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 7012 - Posted: 30 Nov 2008 | 1:05:45 UTC - in response to Message 7011.

Right, I am now pulling more work down from SETI, POEM and Spinhenge as well.

Mine are heading that way, but have not reached that point. Most will be out within 30 minutes and the rest will take a little longer.

Looks like my parallel projects (Einstein and Malaria) will pull more cores than they expected soon.


____________

Profile caspr
Joined: 22 Mar 08
Posts: 90
Credit: 501,728
RAC: 0
Message 7014 - Posted: 30 Nov 2008 | 1:15:21 UTC

Travis, this is hopeless! You need 1 app, and to spend your time working on it instead of trying to fix all the problems with 10 apps! It's not about the crunchers... it's about the science! While I do like the credits I'm getting from the opt. app (credit whore), I joined for the project itself! Get 1 app working and don't accept work from other apps! A simple "reset project" or "detach/reattach" should take care of it. Put everyone on level ground and get rid of all this drama! If you want to update the app later... cool, EVERYONE updates!

Just my 2cents!
____________
A clear conscience is usually the sign of a bad memory



Dorphas
Joined: 5 Dec 07
Posts: 6
Credit: 1,033,816
RAC: 0
Message 7016 - Posted: 30 Nov 2008 | 1:37:49 UTC - in response to Message 7014.
Last modified: 30 Nov 2008 | 1:53:51 UTC

i guess i will now have to find another project to crunch. i really don't want to crunch seti anymore. all this waiting for work and the new lower 8 wu limits means, based on the past few days, over 1/3rd of the day my fleet will be idle waiting for something to do..esp when the time often goes to 2 hrs and 3 hrs before the server is even contacted again. milkyway has become too hands on lately for me.

oh well, it was fun and enjoyable while it lasted...

best of luck to everyone.........
____________

Wassertropfen
Joined: 6 Mar 08
Posts: 5
Credit: 7,133,381
RAC: 938
Message 7019 - Posted: 30 Nov 2008 | 2:10:14 UTC
Last modified: 30 Nov 2008 | 2:10:37 UTC

Hello Travis,

5 WU is nothing. 8 WU per core is better, but still nothing.

Can't you increase (400%) the size of the WU?
____________
Constant dripping wears away the stone. :)

Profile nutcase
Joined: 25 Nov 07
Posts: 11
Credit: 40,758,862
RAC: 0
Message 7022 - Posted: 30 Nov 2008 | 2:54:08 UTC - in response to Message 7019.

this project has now become a waste of my time.

80% of your problems would go away just by increasing the size of the wu's. this would be a good Idea when you switch to the new app.

But for now, it has been placed as a backup project with my computers.

Emanuel
Joined: 18 Nov 07
Posts: 280
Credit: 2,442,757
RAC: 539
Message 7023 - Posted: 30 Nov 2008 | 3:06:30 UTC - in response to Message 7022.

this project has now become a waste of my time.

80% of your problems would go away just by increasing the size of the wu's. this would be a good Idea when you switch to the new app.

But for now, it has been placed as a backup project with my computers.


Oh come on, at least read the damn thread. Travis has already said that increasing the size of WUs is on the agenda, but that doing this isn't as easy as we may think.

Profile nutcase
Joined: 25 Nov 07
Posts: 11
Credit: 40,758,862
RAC: 0
Message 7026 - Posted: 30 Nov 2008 | 3:19:56 UTC - in response to Message 7023.

this project has now become a waste of my time.

80% of your problems would go away just by increasing the size of the wu's. this would be a good Idea when you switch to the new app.

But for now, it has been placed as a backup project with my computers.


Oh come on, at least read the damn thread. Travis has already said that increasing the size of WUs is on the agenda, but that doing this isn't as easy as we may think.


1% resource share seems to be doing well with the project at the present time. the only real computer now running the project is My atom 230 system.

My dual xeon and core-i7 systems just go through the project too fast to even worry about running the project on at the present time.

mycal
Joined: 13 Jan 08
Posts: 18
Credit: 600,484
RAC: 0
Message 7031 - Posted: 30 Nov 2008 | 10:27:36 UTC

30/11/2008 10:19:29|Milkyway@home|Message from server: No work sent
30/11/2008 10:19:29|Milkyway@home|Message from server: (reached per-CPU limit of 8 tasks)
30/11/2008 10:20:30|Milkyway@home|Sending scheduler request: To fetch work. Requesting 65098 seconds of work, reporting 0 completed tasks


Even after running all night I still have a full download of 32 on both quads and 16 on the laptop. Increase from five to eight, nice one.

How about another approach: claimed credits as against granted credits seems to work all right on other projects.


Michael

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 7041 - Posted: 30 Nov 2008 | 20:10:49 UTC

One thing to consider here: given the fragility of the servers, the workload presented by the use of the optimized client, and the 'interesting' approach for awarding credits, it seems to make some sense to NOT use the optimized client. Since the current award scheme seems to be based on CPU time spent and not work done, there is no 'credit penalty' for running the vastly less efficient regular client. The other benefit is that with the much longer running time of the regular client, and the minuscule cache, users don't spend an inordinate amount of time trying to pry a 5 minute work unit from the gasping server; rather, each work unit runs 5 hours.

The two downsides to this are, one produces less science (but perhaps the optimized client produces too much work for the project to handle anyway), and the extra time processing on MilkyWay (for which you get an ample credit award) is not spent on other projects.

I'd not suggest this at all if the credit scheme were work based instead of CPU time based, nor would I suggest this at all if adequate work was readily available and rational queues of say 10 to 24 hours were available. But there it is. I'm seriously considering dumping the optimized client out in favor of the very inefficient 'regular' client -- it will mean I have more time to spend on other things.
____________

Alinator
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7043 - Posted: 30 Nov 2008 | 20:16:44 UTC
Last modified: 30 Nov 2008 | 20:18:43 UTC

LOL...

I was thinking along the same lines, but to even suggest not using the opti borders on heresy in some circles. :-D

Even if that means it drives the project to near meltdown. ;-)

As far as scoring goes, agreed. If the basis was set to something closer to the current 'nominal', there would probably be a lot less CW'ing going on. ;-)

Alinator

Profile Kevint
Joined: 22 Nov 07
Posts: 285
Credit: 1,076,786,368
RAC: 0
Message 7044 - Posted: 30 Nov 2008 | 20:21:56 UTC - in response to Message 7031.



5 was too low, 8 is too low - Travis, your network, and servers were having a hard time keeping up with the demand at 20?

If you feel you need to lower it, do so, but make it reasonable - 10-15 would be a much better figure.

My hosts are constantly asking for more work, and if this is causing my network trouble, I am sure it is causing your systems to cry out in pain.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7047 - Posted: 30 Nov 2008 | 20:29:39 UTC - in response to Message 7044.



5 was too low, 8 is too low - Travis, your network, and servers were having a hard time keeping up with the demand at 20?

If you feel you need to lower it, do so, but make it reasonable - 10-15 would be a much better figure.

My hosts are constantly asking for more work, and if this is causing my network trouble, I am sure it is causing your systems to cry out in pain.



This should be a non-issue now that we're moving over to the new application. WUs for that should be quite a bit longer.
____________

John Clark
Joined: 4 Oct 08
Posts: 1613
Credit: 62,023,268
RAC: 27,502
Message 7055 - Posted: 30 Nov 2008 | 21:25:08 UTC

I presume the best way to install the new client is to detach and reattach, as it is now the stock client?

I have downloaded the 1.5 Mb windows Zip file and unzipped it. But there are no install instructions I can see.

How simple is it? Is it just copying it in to the project file in the BOINC folder?

Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 7056 - Posted: 30 Nov 2008 | 21:28:29 UTC - in response to Message 7055.

I presume the best way to install the new client is to detach and reattach, as it is now the stock client?

I have downloaded the 1.5 Mb windows Zip file and unzipped it. But there are no install instructions I can see.

How simple is it? Is it just copying it in to the project file in the BOINC folder?


Isn't that just the app so it can be optimized? The new app should auto. download right?
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7057 - Posted: 30 Nov 2008 | 21:29:03 UTC - in response to Message 7056.
Last modified: 30 Nov 2008 | 21:31:16 UTC

I presume the best way to install the new client is to detach and reattach, as it is now the stock client?

I have downloaded the 1.5 Mb windows Zip file and unzipped it. But there are no install instructions I can see.

How simple is it? Is it just copying it in to the project file in the BOINC folder?


Isn't that just the app so it can be optimized? The new app should auto. download right?


Yeah, the new app should automatically download.

Update: I edited the news post so people don't get confused and think they need to compile it themselves :)
____________

Mr Mystery
Joined: 21 Nov 08
Posts: 90
Credit: 2,601
RAC: 0
Message 7058 - Posted: 30 Nov 2008 | 21:33:05 UTC - in response to Message 7055.
Last modified: 30 Nov 2008 | 21:33:25 UTC

I presume the best way to install the new client is to detach and reattach, as it is now the stock client?


Make sure you don't have an app_info file in the milkyway folder, then the new program will arrive as soon as it's released.
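
If you'd rather check from a script than by eye, a minimal Python sketch could look like this. The file name `app_info.xml` is the one BOINC uses for manually installed apps; the project directory path below is only a hypothetical example and will differ by operating system and install location, and `has_app_info` is a made-up helper name.

```python
from pathlib import Path


def has_app_info(project_dir):
    """Return True if an app_info.xml file exists in the given project folder."""
    return (Path(project_dir) / "app_info.xml").is_file()


# Hypothetical location; adjust to wherever your BOINC data directory lives.
project_dir = Path("/var/lib/boinc-client/projects/milkyway.cs.rpi.edu_milkyway")

if has_app_info(project_dir):
    print("app_info.xml found: the stock app will not be downloaded automatically")
else:
    print("no app_info.xml: the stock app should arrive on its own")
```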
____________

John Clark
Joined: 4 Oct 08
Posts: 1613
Credit: 62,023,268
RAC: 27,502
Message 7060 - Posted: 30 Nov 2008 | 21:46:49 UTC

Detached my first quad and reattached. The system automatically downloaded the new client and the 8 WUs-per-core limit and got on crunching.

The WUs are labeled as - Milkyway@Home optimised 0.04. I assume 0.04 WUs go with the 0.5 client?

These WUs are definitely slower than with the Milksop client, which was crunching them in an average of 200 - 205 seconds.

Looking at the first 4 crunched on the rig I type from, the average time is 505 - 514 seconds. That means they are about 2.5 times as long.
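
The "about 2.5 times" figure can be bracketed from the two ranges quoted above. A minimal Python sketch, assuming the old and new WUs were timed on the same rig (the function name is illustrative):

```python
def ratio_range(old_range, new_range):
    """Smallest and largest slowdown implied by two (low, high) time ranges."""
    old_lo, old_hi = old_range
    new_lo, new_hi = new_range
    return new_lo / old_hi, new_hi / old_lo


# Old client: 200 - 205 s per WU; new client: 505 - 514 s per WU.
low, high = ratio_range((200, 205), (505, 514))
print(f"new WUs take between {low:.2f}x and {high:.2f}x as long")
```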

Alinator
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7061 - Posted: 30 Nov 2008 | 21:59:22 UTC - in response to Message 7060.
Last modified: 30 Nov 2008 | 22:00:46 UTC

Detached my first quad and reattached. The system automatically downloaded the new client and the 8 WUs-per-core limit and got on crunching.

The WUs are labeled as - Milkyway@Home optimised 0.04. I assume 0.04 WUs go with the 0.5 client?

These WUs are definitely slower than with the Milksop client, which was crunching them in an average of 200 - 205 seconds.

Looking at the first 4 crunched on the rig I type from, the average time is 505 - 514 seconds. That means they are about 2.5 times as long.



Hmmm...

Well, one thing to keep in mind here is that the detach and/or reset will dump the app_info file (and opti) from the project directory.

What isn't clear so far is: are the Astronomy searches (gs series) completed now?

If not, then you need to build the custom app_info I mentioned in another thread or you are going to have a TDCF mess if and when the CC picks up an Astronomy task and runs it with the stock version.

Alinator

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7062 - Posted: 30 Nov 2008 | 22:02:08 UTC - in response to Message 7060.

Detached my first quad and reattached. The system automatically downloaded the new client and the 8 WUs-per-core limit and got on crunching.

The WUs are labeled as - Milkyway@Home optimised 0.04. I assume 0.04 WUs go with the 0.5 client?

These WUs are definitely slower than with the Milksop client, which was crunching them in an average of 200 - 205 seconds.

Looking at the first 4 crunched on the rig I type from, the average time is 505 - 514 seconds. That means they are about 2.5 times as long.



WUs from nm_testX and nm_stripeX (not nm_stripeX_1) don't have as much work as the nm_stripeX_1s. I've left those searches generating more work because they're almost completed and I want to see what the final results are. After those are done the WUs should be about 4x longer than what they are right now.
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7063 - Posted: 30 Nov 2008 | 22:05:27 UTC - in response to Message 7061.

Detached my first quad and reattached. The system automatically downloaded the new client and the 8 WUs-per-core limit and got on crunching.

The WUs are labeled as - Milkyway@Home optimised 0.04. I assume 0.04 WUs go with the 0.5 client?

These WUs are definitely slower than with the Milksop client, which was crunching them in an average of 200 - 205 seconds.

Looking at the first 4 crunched on the rig I type from, the average time is 505 - 514 seconds. That means they are about 2.5 times as long.



Hmmm...

Well, one thing to keep in mind here is that the detach and/or reset will dump the app_info file (and opti) from the project directory.

What isn't clear so far is: are the Astronomy searches (gs series) completed now?

If not, then you need to build the custom app_info I mentioned in another thread or you are going to have a TDCF mess if and when the CC picks up an Astronomy task and runs it with the stock version.

Alinator


We won't be generating any more work for the old application. So the gs series that's out there right now is done. I'll probably be starting up some more gs WUs using the new app after I get some results with the nm ones.

GS is doing a genetic search server side, while NM is doing an asynchronous Newton method. GS is more of a global search, while the Newton method we're testing is a local search with a fast convergence rate. Right now we're trying to get some very accurate numbers on stripes 79, 82 and 86 for an upcoming publication.

____________

Alinator
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7064 - Posted: 30 Nov 2008 | 22:06:09 UTC - in response to Message 7062.
Last modified: 30 Nov 2008 | 22:09:54 UTC

AHHHH....

Thanks for that extra info about just what the different series are targeting! Cool.

So the next question is: if you still have old GS tasks hanging around on hosts, is there any point in running them, or can we just dump them at this point?

Alinator

Profile Kevint
Joined: 22 Nov 07
Posts: 285
Credit: 1,076,786,368
RAC: 0
Message 7065 - Posted: 30 Nov 2008 | 22:06:56 UTC - in response to Message 7047.



5 was too low, 8 is too low - Travis, your network, and servers were having a hard time keeping up with the demand at 20?

If you feel you need to lower it, do so, but make it reasonable - 10-15 would be a much better figure.

My hosts are constantly asking for more work, and if this is causing my network trouble, I am sure it is causing your systems to cry out in pain.



This should be a non-issue now that we're moving over to the new application. WUs for that should be quite a bit longer.


Yep, Just noticed that you are moving to the new app - very good. Ignore my previous comment :)

I guess we will see how the new app performs with the cache settings and server load.. Hope this fixes a bunch of things.


Profile alijay
Joined: 15 Apr 08
Posts: 55
Credit: 18,566
RAC: 0
Message 7067 - Posted: 30 Nov 2008 | 22:25:55 UTC - in response to Message 7065.

When you increase the length of the WUs, please ensure that the checkpoint error that has been showing up is fixed. With the shorter WUs it is not a problem, but with longer WUs, and particularly on a slower machine, if an error occurs because of a fault in the checkpointing, a cruncher loses the credits for an hour's work, which is annoying.

Alinator
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7068 - Posted: 30 Nov 2008 | 22:28:12 UTC - in response to Message 7067.

When you increase the length of the WUs, please ensure that the checkpoint error that has been showing up is fixed. With the shorter WUs it is not a problem, but with longer WUs, and particularly on a slower machine, if an error occurs because of a fault in the checkpointing, a cruncher loses the credits for an hour's work, which is annoying.


Yep, that needs to be fixed, but keeping them in memory is a partial workaround.

Alinator

BarryAZ
Joined: 1 Sep 08
Posts: 512
Credit: 223,250,462
RAC: 169,580
Message 7069 - Posted: 30 Nov 2008 | 23:10:38 UTC - in response to Message 7060.

Same here -- about 2.5 times the processing time. The new work units take about 12 minutes to process. The 8 WU queue is still short (under 2 hours); a 20 WU queue would yield a full 4 hour queue, still short, but workable I guess. (I'd love to see 30 minute WUs with the 20 unit queue.)
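
A rough check of those queue figures, as a minimal Python sketch. It assumes every WU takes the same number of minutes and that each core works through its own queue one WU at a time; `queue_hours` is a hypothetical helper name, not anything from BOINC.

```python
def queue_hours(wus_per_core: int, minutes_per_wu: float) -> float:
    """Hours of work one core holds when its queue is full."""
    return wus_per_core * minutes_per_wu / 60.0


if __name__ == "__main__":
    # (WU limit per core, minutes per WU): the current limit, the old
    # limit of 20, and the wished-for 30 minute WUs with a limit of 20.
    for limit, minutes in [(8, 12), (20, 12), (20, 30)]:
        hours = queue_hours(limit, minutes)
        print(f"{limit} WUs x {minutes} min = {hours:.1f} h per core")
```

With 12 minute WUs the 8 WU limit comes out under 2 hours per core and the 20 WU limit at 4 hours, which matches the numbers quoted above.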

On the 'keep the BOINC world happy' front -- Dave Anderson should be pleased with the lower awards. The older workunits running the optimized client on the same workstation got me something like 108 credits per hour per core (assuming that they always processed Milkyway, which they never could for various reasons discussed all over the place). That would have gotten a RAC of over 10K on an AMD quad core 9850 (the system I am using here). The new version WUs take just under 12 minutes to process and get 9.7 credits. That slides out to under 50 credits per hour per core for a RAC of around 4700 on the same system. That number is much closer in line with what SETI offers using their readily available optimized client. With SETI on the same system, fully running, instead of a RAC of 4500, I would get something like 3900. So Milkyway is still offering a 'premium' but it is clearly in the neighborhood.
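
The credit arithmetic above can be reproduced with a short Python sketch. It rests on two assumptions that are not stated in the post: the host crunches Milkyway on all four cores around the clock, and RAC settles to roughly the daily credit total under that steady load. The function names are hypothetical.

```python
def credits_per_hour(credits_per_wu: float, minutes_per_wu: float) -> float:
    """Credits one core earns per hour at a fixed WU length."""
    return credits_per_wu * 60.0 / minutes_per_wu


def steady_state_rac(rate_per_core_hour: float, cores: int) -> float:
    """Daily credit for a host running flat out on every core.

    Assumption: under constant load RAC converges to credits per day.
    """
    return rate_per_core_hour * cores * 24


if __name__ == "__main__":
    cores = 4  # AMD quad core 9850 from the post

    old_rac = steady_state_rac(108, cores)        # old optimized client
    new_rate = credits_per_hour(9.7, 12)          # new WUs, ~12 min each
    new_rac = steady_state_rac(new_rate, cores)

    print(f"old: 108.0 credits/h/core, RAC about {old_rac:.0f}")
    print(f"new: {new_rate:.1f} credits/h/core, RAC about {new_rac:.0f}")
```

Using 12 minutes as the WU length gives a slightly low rate, since the post says "just under" 12 minutes; the result still lands near the 4700 figure.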


Detached my first quad and reattached. The system automatically downloaded the new client and the 8 WUs-per-core limit and got on crunching.

The WUs are labeled as - Milkyway@Home optimised 0.04. I assume 0.04 WUs go with the 0.5 client?

These WUs are definitely slower than those for the Milksop client, which were being crunched in an average of 200 - 205 seconds.

Looking at the first 4 crunched on the rig I type from, the average time is 505 - 514 seconds. That means they are about 2.5 times as long.



____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7073 - Posted: 30 Nov 2008 | 23:24:30 UTC - in response to Message 7068.

When you increase the length of the WUs, please ensure that the checkpoint error that has been showing up is fixed. With the shorter WUs it is not a problem, but with longer WUs, and particularly on a slower machine, if an error occurs because of a fault in the checkpointing, a cruncher loses the credits for an hour's work, which is annoying.


Yep, that needs to be fixed, but keeping them in memory is a partial workaround.

Alinator


v0.6 has the code to fix the checkpointing problems, so it should be ok.
____________

Profile Sorceress
Avatar
Send message
Joined: 30 Nov 08
Posts: 22
Credit: 63,967
RAC: 0
Message 7110 - Posted: 1 Dec 2008 | 10:20:25 UTC

OK. I seem to be having a problem. Newbie to Milky Way here. The WUs I downloaded say time to completion is 00:08:20, but after 03:33:18 hours of crunching I'm only 19.87% complete, and time to completion is 02:13:24 and increasing. My computer is - GenuineIntel
Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz [x86 Family 6 Model 15 Stepping 13]. OS is- Microsoft Windows XP Professional x86 Editon, Service Pack 3, (05.01.2600.00). Completion date is 12/3/08. At this rate, I might get 3 WU done out of the 16 WU I downloaded on top of the other 9 projects I am crunching for. Same goes for my 2nd older AMD machine. What's happening here?
____________


Alinator
Send message
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7133 - Posted: 1 Dec 2008 | 15:54:20 UTC
Last modified: 1 Dec 2008 | 16:18:58 UTC

Don't worry about it at this point. MW is a tight deadline project for almost all hosts with the stock apps. This often drives the tasks into HP when you first attach (due to no TDCF data for the project), or when running several projects and/or a low Resource Share for MW on hosts which have been around awhile.

Give it a week or so and things should settle down.

Alinator

Profile GalaxyIce
Avatar
Send message
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 7135 - Posted: 1 Dec 2008 | 16:12:39 UTC - in response to Message 7133.

Don't worry about it at this point. MW is a tight deadline project for almost all hosts with the stock apps. This often drives the tasks into HP when you first attach (due to TDCF data for the project), or when running several projects and/or a low Resource Share for MW on hosts which have been around awhile.

Give it a week or so and things should settle down.

Alinator

I think I need a week or so to settle down after the personal attention required to MW over the last week or so to keep the milk chocolate crunching ;) I can't complain though - my RAC is still flyin' :)

Welcome Sorceress - nice avatar...
____________

Profile banditwolf
Avatar
Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 7136 - Posted: 1 Dec 2008 | 16:17:17 UTC

I don't think it has ever run in normal mode for me. Not that it's a big deal, I manually get work from the project I want to do, when I want to.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Alinator
Send message
Joined: 7 Jun 08
Posts: 393
Credit: 20,834,406
RAC: 66,357
Message 7137 - Posted: 1 Dec 2008 | 16:21:22 UTC - in response to Message 7135.

I think I need a week or so to settle down after the personal attention required to MW over the last week or so to keep the milk chocolate crunching ;) I can't complain though - my RAC is still flyin' :)

Welcome Sorceress - nice avatar...


LOL...

I feel your pain.

Now if we can get off of InstaPurge and back to something a little more user friendly, I'll be a Happy Camper again! ;-)

Alinator

Profile GalaxyIce
Avatar
Send message
Joined: 6 Apr 08
Posts: 2018
Credit: 100,142,856
RAC: 0
Message 7160 - Posted: 1 Dec 2008 | 18:44:28 UTC - in response to Message 7137.


LOL...

I feel your pain.

Now if we can get off of InstaPurge and back to something a little more user friendly, I'll be a Happy Camper again! ;-)

Alinator

Yes it can be a pain not only to supply a free PC or two to a project like this, but then to try and work out what to do to keep up with it all.

But then there are good people like you Alinator to help out the likes of me, so thanks very much for that. And also thanks to Travis and MW mod/science staff here - you're doing a fine job - and thanks!

____________

Profile Sorceress
Avatar
Send message
Joined: 30 Nov 08
Posts: 22
Credit: 63,967
RAC: 0
Message 7161 - Posted: 1 Dec 2008 | 18:45:53 UTC - in response to Message 7135.

Thanks Ice. I just got home and noticed MW is running w/high priority on both machines. Don't know what happened, but it's working much better. Noticed also the 'time to completion' is more accurate. I still won't be able to complete many of the WUs I received. I take my projects seriously and get irritated when I can't get my work done. My older machine trashed the graphics card and it took a while to get a new one. I was hesitant to use my newer machine on BOINC but then I said what the heck. Now I'm crunching on TWO! Thanks all for the help! Glad to be aboard.
Sorceress
____________


Bigred
Avatar
Send message
Joined: 23 Nov 07
Posts: 33
Credit: 300,042,542
RAC: 0
Message 7170 - Posted: 1 Dec 2008 | 19:32:14 UTC - in response to Message 7161.

Thanks Ice. I just got home and noticed MW is running w/high priority on both machines. Don't know what happened, but it's working much better. Noticed also the 'time to completion' is more accurate. I still won't be able to complete many of the WUs I received. I take my projects seriously and get irritated when I can't get my work done. My older machine trashed the graphics card and it took a while to get a new one. I was hesitant to use my newer machine on BOINC but then I said what the heck. Now I'm crunching on TWO! Thanks all for the help! Glad to be aboard.
Sorceress


Try enabling experimental apps in the Milkyway settings for your account. That will allow you to get some work with the optimized app for shorter crunch times.
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 7180 - Posted: 1 Dec 2008 | 20:40:16 UTC - in response to Message 7170.

Thanks Ice. I just got home and noticed MW is running w/high priority on both machines. Don't know what happened, but it's working much better. Noticed also the 'time to completion' is more accurate. I still won't be able to complete many of the WUs I received. I take my projects seriously and get irritated when I can't get my work done. My older machine trashed the graphics card and it took a while to get a new one. I was hesitant to use my newer machine on BOINC but then I said what the heck. Now I'm crunching on TWO! Thanks all for the help! Glad to be aboard.
Sorceress


Try enabling experimental apps in the Milkyway settings for your account. That will allow you to get some work with the optimized app for shorter crunch times.


You shouldn't have to do that, as the stock app is now the new optimized version. It should download automatically. Also, no WUs are being generated for the old version anymore.
____________

Profile Sorceress
Avatar
Send message
Joined: 30 Nov 08
Posts: 22
Credit: 63,967
RAC: 0
Message 7193 - Posted: 1 Dec 2008 | 22:51:19 UTC - in response to Message 7170.

That option is not available in Milky Way preferences, Bigred.
____________


Profile caspr
Avatar
Send message
Joined: 22 Mar 08
Posts: 90
Credit: 501,728
RAC: 0
Message 7195 - Posted: 1 Dec 2008 | 23:11:22 UTC - in response to Message 7193.

That option is not available in Milky Way preferences, Bigred.



Just checked and it is there, "run test applications"?
____________
A clear conscience is usually the sign of a bad memory



Brian Silvers
Send message
Joined: 21 Aug 08
Posts: 625
Credit: 558,425
RAC: 0
Message 7197 - Posted: 1 Dec 2008 | 23:16:09 UTC - in response to Message 7195.

That option is not available in Milky Way preferences, Bigred.



Just checked and it is there, "run test applications"?


However, it is not needed to check that now.

The "test application" was promoted to the "stock application". As of today, that setting will not cause a change for any of us...but you could potentially regret it later if they release a buggy application that was marked as "test", but you had forgotten that you had enabled it...

Rick6718
Send message
Joined: 3 Apr 08
Posts: 7
Credit: 638,859
RAC: 0
Message 7205 - Posted: 2 Dec 2008 | 0:04:54 UTC

I am getting no work sent? Now for almost a day?

Profile Sorceress
Avatar
Send message
Joined: 30 Nov 08
Posts: 22
Credit: 63,967
RAC: 0
Message 7209 - Posted: 2 Dec 2008 | 0:36:48 UTC - in response to Message 7195.

Sorry folks, my bad. ;) Was not looking in the right place. I have this option disabled on advice from others as well. OH, and Rick, you can have some of my WUs. I have way too many to complete in time. After 05:33:12 hrs I'm 33.2% complete on just one. I have 15 more to go by 12.03.08. That means roughly 16 hr to complete a WU that was supposed to take 08:20 hrs. Is this normal?? Why is it taking so long to complete a MW WU? I finish most of my other projects' WUs well within the time limits. I am at a loss...
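
For anyone wanting to sanity-check a projection like this one, here is a minimal Python sketch. It assumes the progress percentage grows linearly with run time, which BOINC tasks do not guarantee, and both helper names are hypothetical.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimate_total_hours(elapsed_hms: str, percent_done: float) -> float:
    """Project the total run time from elapsed time and percent complete.

    Assumption: progress is linear, so total = elapsed / fraction done.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    elapsed = hms_to_seconds(elapsed_hms)
    return elapsed / (percent_done / 100.0) / 3600.0


if __name__ == "__main__":
    total = estimate_total_hours("05:33:12", 33.2)
    print(f"projected total: {total:.1f} h")
```

For 05:33:12 elapsed at 33.2% the projection lands between 16 and 17 hours per WU, in line with the rough figure above.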
____________


Profile caspr
Avatar
Send message
Joined: 22 Mar 08
Posts: 90
Credit: 501,728
RAC: 0
Message 7212 - Posted: 2 Dec 2008 | 1:09:36 UTC - in response to Message 7205.

I am getting no work sent? Now for almost a day?


Rick, do a detach / reattach and you should get work.

____________
A clear conscience is usually the sign of a bad memory



Profile Sorceress
Avatar
Send message
Joined: 30 Nov 08
Posts: 22
Credit: 63,967
RAC: 0
Message 7227 - Posted: 2 Dec 2008 | 3:18:57 UTC

Thanks to Caspr I am a happy camper. I detached/re-attached to the project and the new WUs are running well. I had the 1.22 WUs but now I have the Optimized 0.04. After 24 mins I'm at 43% complete. Yippee!! Thanks Caspr!! xoxox
____________




Copyright © 2013 AstroInformatics Group