41)
Message boards :
Number crunching :
20 workunit limit
(Message 2372)
Posted 18 Mar 2008 by ChertseyAl Post: 5h return time ? what's that a PII@300 MHz ? or do you mean the time that passes before your BOINC client contacts the server ? Sorry, must have explained this on another thread that you missed. 1.8G machine, WU time 15m. 20WU max. Last WU in will take a minimum of 5h to get crunched and reported. FIFO and all that. So every WU that machine crunches will take 5h from arrival to reporting. Agreed? Solution: Set a buffer of 0.001 hours, so that only one WU is ever active. But that throttles the faster hosts. Also, given the 'reliability' of MW, that ain't a good strategy ;) Now, a BOINC client that can limit the number of WUs per project (say 2 for MW), give them priority over all else, but respect the deadlines of backup projects (and we need BU projects when MW is still flaky) to avoid starving other projects would be good. I'm really thinking that genetic models don't suit BOINC. BOINC is great for boring, grinding, tedious number-crunching, but evolving models won't work. BTW, this has nothing to do with MW. MW is one of my fave projects. But, as such, I'd rather see it run for the science than for the user. Al. |
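A rough check of the 5h figure in the post above, as a minimal Python sketch. It assumes the host crunches one WU at a time, strictly first-in first-out, with no other project taking CPU time; the function name is made up for illustration and is not part of BOINC.

```python
def fifo_turnaround_hours(queue_length: int, minutes_per_wu: float) -> list[float]:
    """Hours from batch arrival to completion for each WU, crunched one at a time in FIFO order."""
    return [(position + 1) * minutes_per_wu / 60 for position in range(queue_length)]


# Figures from the post: 20 WU limit, 15 minutes per WU.
times = fifo_turnaround_hours(queue_length=20, minutes_per_wu=15)
print(f"first WU done after {times[0]} h, last WU done after {times[-1]} h")
```

With a cache that is kept topped up to 20, each new WU joins the back of the queue, which is why the post treats 5h as the turnaround for every WU and not only the last one in a batch.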
42)
Message boards :
Number crunching :
20 workunit limit
(Message 2368)
Posted 18 Mar 2008 by ChertseyAl Post: It does. So-called "panic mode". *grin* Exactly the point I raised yesterday. The current computing model doesn't really suit BOINC, it relies on rapid reporting. Options: 1) Abandon BOINC and go stand-alone. 2) As 1 above, but farm out 'child' genetic threads to BOINC. 3) Make WUs a mix of parallel genetic seeds, increasing crunching time, but feeding more improved start points back into the matrix. I'm actually thinking that option 1 might be best. My slow host is turning over WUs in 5 hours. This is too slow. I will probably set this host to NNT as I suspect more 'old' science is just that ... OLD. Al. |
43)
Message boards :
Number crunching :
More Work !!! Please :)
(Message 2338)
Posted 17 Mar 2008 by ChertseyAl Post: Got 20 on my lappy as well but not on the other boxes. ;-( I seem to have got 10 on my slowest host, nothing on the faster ones. Never mind, by tomorrow morning I may have processed a load on the other machines. Time to crunch some Sleep@Home ... Al. |
44)
Message boards :
Number crunching :
20 workunit limit
(Message 2331)
Posted 17 Mar 2008 by ChertseyAl Post: Replying to myself ... Well, at least I listen to myself sometimes ;) ... The remaining WUs are being cleared at about 132 WU/hour. I usually clear about 15 WU/hour. So we've either got 9 active 'consumers' or a load of totally irrelevant results waiting to be returned from hosts bunged-up with other work. Let's abort the old WUs from the server and make the science count. I no longer run a number of projects because they let stale work trickle in way past its sell-by date, only to be discarded. Sorry, I went all *serious* for a moment there ;) Al. |
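The "9 active consumers" estimate in the post above is just the ratio of the two rates. A minimal Python sketch, assuming every active host clears WUs at the poster's own rate of about 15 WU/hour (an assumption taken from the post, not a measured figure); the function name is hypothetical.

```python
def equivalent_hosts(observed_wu_per_hour: float, per_host_wu_per_hour: float) -> float:
    """Number of hosts, each at per_host_wu_per_hour, needed to explain the observed clearing rate."""
    return observed_wu_per_hour / per_host_wu_per_hour


# Figures from the post: server clearing about 132 WU/hour, one host clears about 15 WU/hour.
estimate = equivalent_hosts(132, 15)
print(f"roughly {estimate:.1f} hosts, so about {round(estimate)} active consumers")
```

The same clearing rate could equally come from many hosts each returning work slowly, which is the second possibility the post raises.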
45)
Message boards :
Number crunching :
More Work !!! Please :)
(Message 2329)
Posted 17 Mar 2008 by ChertseyAl Post: I heard an unfounded rumor that I was somehow involved. No no no, that's JRenkar, the HERETIC that started this FIASCO. BURN THE WITCH!!!!! Al. |
46)
Message boards :
Number crunching :
20 workunit limit
(Message 2327)
Posted 17 Mar 2008 by ChertseyAl Post: The server has been out of work for nearly 24 hours now, and there are still over 5000 WUs out in the wind. Seems to me that if new work generation depends on the old results, the deadline should be shortened. And the rate that these are being cleared is very slow. Could we use server-side aborts on 'old' WUs? Yes, it's annoying when your cache/stash gets aborted, but I can't see any point in wasting cycles on useless work. If this isn't feasible, maybe BOINC just isn't the right platform? Maybe the work driving the next generation work should be non-BOINC and other work farmed out to a slower BOINC network. Maybe try 12 hours and see how that goes, at least if the WU ends up on a duffer, it would time out and possibly get sent to a faster, more reliable host. Tight deadlines are a nightmare. MW will end up in High Pri permanently and other projects will be starved. Maybe if each WU carried work from a number of different genetic seeds and ran for a while longer? When MW is running, it's fine for me. But my slow host may take 5 hours to turn around the last WU in a batch :( I can't help but feel that server-side aborts are the way to go, but machines that are not on a permanent net connection are still going to waste work :( Al. |
47)
Message boards :
Number crunching :
Smooth sailing-Quiet board
(Message 2323)
Posted 17 Mar 2008 by ChertseyAl Post: .....Travis....Dave....wake up.... Oooh, you are in SO much trouble when they turn up ;) Al. |
48)
Message boards :
Number crunching :
GECCO2008 paper accepted
(Message 2317)
Posted 17 Mar 2008 by ChertseyAl Post: Nice paper Travis! As Travis explained in This Thread? Al. |
49)
Message boards :
Number crunching :
Smooth sailing-Quiet board
(Message 2308)
Posted 17 Mar 2008 by ChertseyAl Post:
No good :( We're well and truly Renkar'd ;) Al. |
50)
Message boards :
Number crunching :
More Work !!! Please :)
(Message 2300)
Posted 16 Mar 2008 by ChertseyAl Post: Well since I jinxed us with that other post maybe if I say need more work it will work? Nope. No Cute Kitten picture. Nice try, but no cigar. Al. |
51)
Message boards :
Number crunching :
Smooth sailing-Quiet board
(Message 2295)
Posted 16 Mar 2008 by ChertseyAl Post: It's called the 'Post of Death' server subroutine..... :( Don't worry - I've figured it out. If Cori posts *AND* includes a picture of a cute kitten, we'll be OK. Seems to be some kind of 'dark force' that operates invisibly. I've not had time to extrapolate this from forum postings across all 10 dimensions though. I got distracted by a cute kitten in the garden. Maybe Travis can factor CK (Cute Kitten) into the incomprehensible technobabble that passes for a GECCO paper. :) Al. |
52)
Message boards :
Number crunching :
Smooth sailing-Quiet board
(Message 2290)
Posted 16 Mar 2008 by ChertseyAl Post: Seems to be common in alpha-beta projects....guess nobody has anything to say then and is a compliment to the administrators :D Yup - Everything running very nicely client-side. Plenty of work, no more freezing WUs. The server even stood up to a post from Cori yesterday ;) Al. |
53)
Message boards :
Number crunching :
More Work !!! Please :)
(Message 2260)
Posted 14 Mar 2008 by ChertseyAl Post: *Grin* Cori has posted. Expect long server outages ;) Al. |
54)
Message boards :
Number crunching :
How about more than 20 workunits at a time.
(Message 2163)
Posted 11 Mar 2008 by ChertseyAl Post: Yes, would be a good workaround for the current situation! ;-) Alternatives: 1) Make the work units bigger, like they used to be. 2) Don't let Dave out of the lab. Push food under the door if he gets hungry (pizza is nice and thin). 3) Every time the server fails, each member of the IT department has to eat a MilkyWay bar. For those that are unfamiliar with this 'treat', it's like a mix of clay and window putty, fluffed up to make it half the weight it should be, and all covered in sickly sweet chocolate. That should focus their minds. Al. |
55)
Message boards :
Number crunching :
Server Outages
(Message 2150)
Posted 10 Mar 2008 by ChertseyAl Post: this really needs looked at, and fixed Alternative: Get Dave to live in the lab, do reboots etc. I have no idea who Dave is, but we need this guy within easy reach of the reboot button ;) I have to say, this is the most exciting project I've ever taken part in. Random server access, Krazy Kredit, good science, and a real community spirit. Long may it last :) Al. |
56)
Message boards :
Number crunching :
Server error: can't attach shared memory
(Message 2122)
Posted 8 Mar 2008 by ChertseyAl Post: I'm sure you are aware of this, but I'll post it anyway: 08/03/08 17:53:30|Milkyway@home|Message from server: Server error: can't attach shared memory Same behaviour on 3 hosts. All pending uploads now cleared, it's just the reporting that fails. Cheers, Al. |
57)
Message boards :
Number crunching :
More Work !!! Please :)
(Message 2046)
Posted 7 Mar 2008 by ChertseyAl Post:
Yes! Got some for 2 hosts. The second host didn't get the full amount requested - I guess I drained you dry ;) Al. |
58)
Message boards :
Number crunching :
application v1.21/v1.22 errors/memory leaks/crashes here
(Message 2024)
Posted 7 Mar 2008 by ChertseyAl Post: 1.21 runs very slightly slower than 1.19 on my P4 2.4 XP32, and very slightly faster on my Celeron 2.93 XP32 - Not much in it really. Progress bar is still running slow on both (about a third slower than it should be). No problems so far :) Al. |
59)
Message boards :
Number crunching :
credit issues
(Message 1975)
Posted 6 Mar 2008 by ChertseyAl Post: Let us know if 1.19 is running that much faster on the windows environment -- if so we might have to reduce credit awarded to keep it in line with other projects. 1.19 is running slightly slower (about 5%) than 1.18, 32bit XP. i.e. Still more than twice as fast as before. So, credit is still way too high. But rather than reduce the credit (which is OK), could we just have longer WUs? About 30 minutes to an hour would be nice :) [Oh, progress bar is running at 10% of actual progress (not a big deal)] Al. |
60)
Message boards :
Number crunching :
credit issues
(Message 1907)
Posted 5 Mar 2008 by ChertseyAl Post: Let us know if 1.19 is running that much faster on the windows environment -- if so we might have to reduce credit awarded to keep it in line with other projects. What's happening with 1.18? Might be nice if you replied to my post, it would give some idea if you've actually done anything ;) Al. |
©2024 Astroinformatics Group