Welcome to MilkyWay@home

Server updated

Message boards : News : Server updated

Beyond
Joined: 15 Jul 08
Posts: 383
Credit: 729,293,740
RAC: 0
Message 47048 - Posted: 6 Apr 2011, 16:20:31 UTC - in response to Message 47047.  

[quote]Judging by all the replies it looks like I broke everything more than I thought.[/quote]

Actually things are running better here than they ever have. It appears that the standard (non-app_info.xml) app is having problems though.

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 47049 - Posted: 6 Apr 2011, 16:38:47 UTC
Last modified: 6 Apr 2011, 16:39:57 UTC

Yes, by all means fix it so you can get work units running the stock app if we wish, but PLEASE don't pinch off our new WU download bonanza!!
I might even make it to 1 MIL RAC, LOL!

Matt Arsenault
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 8 May 10
Posts: 576
Credit: 15,979,383
RAC: 0
Message 47051 - Posted: 6 Apr 2011, 16:52:59 UTC - in response to Message 47004.  

I think I've fixed the % of CPU issue.

kashi
Joined: 30 Dec 07
Posts: 311
Credit: 149,490,184
RAC: 0
Message 47057 - Posted: 6 Apr 2011, 17:26:10 UTC

Not sure if this is relevant to the problem, but I believe the sending of the 32-bit application instead of the 64-bit application to 64-bit hosts can be disabled by using the <primary_platform_only> option.
http://boinc.berkeley.edu/trac/changeset/22183

FruehwF
Joined: 28 Feb 10
Posts: 120
Credit: 109,840,492
RAC: 0
Message 47069 - Posted: 6 Apr 2011, 19:09:37 UTC - in response to Message 47034.  

[quote]...
What should the "app_info.xml" look like?

Sorry, but I am a newbie...

You can DL the optimized apps here:

http://www.arkayn.us/milkyway/index.html

They should contain the appropriate default app_info.xml[/quote]

That's it. Works really well.

thx.

Franz

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 47070 - Posted: 6 Apr 2011, 19:30:32 UTC

So now all my guys have plenty of work units and they are all crunching away as they always have, yet EVERY ONE OF THEM IS LOSING RAC!
And while they keep crunching at their regular pace while things are stopped server-side for whatever reason, those credits are lost! Oh, we get them in our totals, sure, but the RAC is whacked again and again.
And it is taken away at a much faster pace than it can be recovered, so the issue I would like to see fixed more than anything is: STOP HOSING OUR RACs!
This has nothing to do with the current update, but I believe I am not alone in my feelings here.
(And yes, I understand the dynamic nature of averages, but this system seems weighted against us, as numbers we crunch in the "background" are lost credits when looking at our RACs.)

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 47072 - Posted: 6 Apr 2011, 20:19:10 UTC
Last modified: 6 Apr 2011, 20:20:41 UTC

In the OptApp for 64-bit AMD (with the V0.23 app) .... the brook64 file .... mine has the following attributes:

brook64.dll, compressed size 188kb, uncompressed size 447kb, original file date prior to unpacking 07/04/2010 01:01hrs

Is this the correct file, and still current for the AMD 0.23 OptApp 64-bit and the WUs now being issued?

Just double checking the obvious ... been caught before

Regards
Zy

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 47075 - Posted: 6 Apr 2011, 21:03:30 UTC

All systems GO for mine right now, and the RACs are jumping back faster than usual too!

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,525,188
RAC: 0
Message 47078 - Posted: 6 Apr 2011, 22:05:15 UTC - in response to Message 47047.  

That would be a yes.

I've seen two problems

1) Periodically, the site is flat out inaccessible. Other times it is just very slow.

2) Lack of availability of GPU work units.

Workunits have been doled out in driblets as well as larger quantities (as many as 50 on a pass) -- pretty inconsistent there.

Sometimes the site has been up, but the message boards are not usable (threads listed, no messages).

But the main thing to me is the elevator effect (server up, server down), (workunits available, workunits not available).


[quote]Judging by all the replies it looks like I broke everything more than I thought.[/quote]

Ed.T
Joined: 1 Feb 11
Posts: 17
Credit: 16,245,184
RAC: 0
Message 47080 - Posted: 6 Apr 2011, 22:14:36 UTC - in response to Message 47075.  

Are you seeing credit on the MW site? My pending-validation queue is going down but I'm still not seeing points...
Please: WCG - Help Cure Muscular Dystrophy

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 47082 - Posted: 6 Apr 2011, 22:38:36 UTC

What are the specs on the new server? Must be a beast if it can handle the stress it's currently under.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 47083 - Posted: 6 Apr 2011, 22:39:15 UTC - in response to Message 47080.  

[quote]Are you seeing credit on the MW site? My pending-validation queue is going down but I'm still not seeing points...[/quote]


The separation workunits should be awarding credit. There's a problem with the nbody workunits in that for some reason hosts are either not claiming any credit for their results or it's not being stored in the database, which is making the assimilator/validator award 0 credit because it doesn't know what to award. Working on fixing that right now.

BarryAZ
Joined: 1 Sep 08
Posts: 520
Credit: 302,525,188
RAC: 0
Message 47085 - Posted: 6 Apr 2011, 22:42:21 UTC - in response to Message 47080.  

I've seen credits increase -- can't be sure if they are yielding properly or not since my completed credits are down due to the server not working right generally.

Perhaps it might be best, instead of ping-ponging the server multiple times an hour (or at least 3 or 4 times every 6 hours), to actually troubleshoot the issues and resolve them.

Bouncing the server every 40 to 80 minutes with a 20-minute restart cycle without fixing things (hoping that a reboot clears the air, so to speak) is not getting it done. Or so it seems to me.

[quote]Are you seeing credit on the MW site? My pending-validation queue is going down but I'm still not seeing points...[/quote]



Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 47086 - Posted: 6 Apr 2011, 22:47:56 UTC - in response to Message 47085.  

[quote]I've seen credits increase -- can't be sure if they are yielding properly or not since my completed credits are down due to the server not working right generally.

Perhaps it might be best, instead of ping-ponging the server multiple times an hour (or at least 3 or 4 times every 6 hours), to actually troubleshoot the issues and resolve them.

Bouncing the server every 40 to 80 minutes with a 20-minute restart cycle without fixing things (hoping that a reboot clears the air, so to speak) is not getting it done. Or so it seems to me.

[quote]Are you seeing credit on the MW site? My pending-validation queue is going down but I'm still not seeing points...[/quote][/quote]

The server is "bouncing" because I'm updating the assimilator/validator code to try and figure out what all went wrong. I'm not just blindly restarting things :P

Zydor
Joined: 24 Feb 09
Posts: 620
Credit: 100,587,625
RAC: 0
Message 47087 - Posted: 6 Apr 2011, 22:57:42 UTC

Let's not do this .... credits are starting to flow now for GPU WUs. The waters must not be muddied by intermixing N Body issues (likely as not software related, not server related) with a massive increase in I/O throughput and all the related stresses and strains the latter puts on the various parts of the server, and on communications bandwidth, both internal to the server and on external links - parts which have not yet had a chance to be fine-tuned properly.

It's going to take two or three days to fine-tune the server properly to cope with the new environment, which is totally different from a limit of 6 GPUs per core. It's certainly far too early yet to jump on individual aspects; it's only been on less than 24 hrs.

Give the guys space for a couple of days; it will take that long to get the server initially tuned up. Then it needs a few more days of real-world use before further actions are taken based on objective observation, not anecdotal speculation.

Considering what's just hit that server, it's doing fine. It needs to be better, and given the appropriate time it will be; it just doesn't all happen by yesterday.

Regards
Zy

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 47089 - Posted: 6 Apr 2011, 23:02:19 UTC - in response to Message 47087.  

I found the problem with the nbody assimilator not granting credit -- I should have it fixed tonight, but until then it's not going to be running. I think the separation assimilator is working fine (and granting credit). I'll probably turn on work generation for it later (right now it has a huge queue of workunits waiting to be sent anyway).

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 47103 - Posted: 7 Apr 2011, 1:28:38 UTC
Last modified: 7 Apr 2011, 1:29:28 UTC

Well, at the rate of 49-82 sec/WU my main guy is already out of work and there is none to be had!
Can he steal WUs from my other guys, lol???

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 47113 - Posted: 7 Apr 2011, 3:16:14 UTC - in response to Message 47103.  

[quote]well @ the rate of 49-82 sec/wu my main guy is already out of work and there is none to be had!
Can he steal wu's from my other guys, lol???[/quote]

Glad I had the app_info already installed in my main cruncher with a cache of 1 day when the change occurred. I've got WUs coming out of my ears! So much so that I've dropped the cache to 0.75 days to limit the number of WUs cached, as client_state.xml gets a little too big and WU processing slows down because such a large file has to be written too often.
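
For a rough sense of scale, the cache arithmetic can be sketched in Python. This is only an illustration under stated assumptions: it borrows the 49-82 sec/WU range quoted above (which comes from a different host), assumes a single device crunching one WU at a time at a constant rate, and the function name `queued_workunits` is made up for the example.

```python
SECONDS_PER_DAY = 86_400


def queued_workunits(cache_days: float, seconds_per_wu: float) -> int:
    """Estimate how many WUs a cache of `cache_days` would hold.

    Assumes one device crunching one WU at a time at a constant rate
    (a simplification; real hosts vary).
    """
    return int(cache_days * SECONDS_PER_DAY // seconds_per_wu)


for days in (1.0, 0.75):
    fast = queued_workunits(days, 49)  # fastest quoted rate
    slow = queued_workunits(days, 82)  # slowest quoted rate
    print(f"{days} day cache: roughly {slow} to {fast} WUs")
```

Under those assumptions a one-day cache is on the order of a thousand WUs or more per device, each of which would need its own entries in client_state.xml, so trimming the cache shrinks that file roughly in proportion.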

cncguru
Joined: 11 Jun 10
Posts: 329
Credit: 1,166,222,661
RAC: 0
Message 47115 - Posted: 7 Apr 2011, 3:41:19 UTC
Last modified: 7 Apr 2011, 3:49:00 UTC

That's all good, but having a stockpile of WUs does nothing to keep your RAC up where it should be.
When the server is being worked on and we have cached WUs to keep crunching without interruption, how is it that they then take away (big time) from our RAC???
We didn't slow down; the system just isn't smart enough to keep up, and so we get screwed again and again!
Kind of makes all this worry about cache size irrelevant if you ask me.
Yeah, I know, you didn't ask me, LOL
And you know what, one of my guys just made a liar out of me, because he is in fact increasing his RAC right now while they are working on it, but he's my slowest guy!
I've talked too much.....gonna shut up now, LOL

The Gas Giant
Joined: 24 Dec 07
Posts: 1947
Credit: 240,884,648
RAC: 0
Message 47120 - Posted: 7 Apr 2011, 6:30:15 UTC

But if you have a stockpile of WUs which you keep crunching while the server is down, then once the server is back up and they are reported your RAC is going to go up. Whereas before, without much of a cache, your RAC would go down.
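
A small Python sketch can make the cache argument concrete. It assumes RAC is an exponentially decaying average with a one-week half-life, which is how BOINC's recent average credit is generally described; the function names, the 1000 credits/day host and the one-day outage are all invented for illustration, and this is a simplified model rather than the project's actual code.

```python
import math

SECONDS_PER_DAY = 86_400.0
HALF_LIFE = 7 * SECONDS_PER_DAY  # assumption: one-week half-life


def update_rac(rac, last_update, now, credit, half_life=HALF_LIFE):
    """Decay `rac` from `last_update` to `now`, then fold in `credit`.

    Returns the new (rac, last_update) pair. Times are in seconds.
    """
    elapsed = max(now - last_update, 0.0)
    weight = math.exp(-elapsed * math.log(2) / half_life)
    rac *= weight  # the old average fades with time
    if 1.0 - weight > 1e-6:
        # Spread the new credit over the elapsed interval (credits per day).
        rac += (1.0 - weight) * credit / (elapsed / SECONDS_PER_DAY)
    else:
        # Practically no time elapsed: use the small-interval limit.
        rac += math.log(2) * credit * SECONDS_PER_DAY / half_life
    return rac, now


def rac_after_outage(cached, daily_credit=1000.0, outage_days=1):
    """RAC at the first credit grant after an outage.

    Hypothetical host that was steady at `daily_credit` per day. With a
    cache it kept crunching and reports everything at once; without one
    it sat idle for the length of the outage.
    """
    rac, last = daily_credit, 0.0
    days_since_update = outage_days + 1
    worked_days = days_since_update if cached else 1
    rac, _ = update_rac(rac, last, days_since_update * SECONDS_PER_DAY,
                        worked_days * daily_credit)
    return rac


print(f"with cache:    {rac_after_outage(True):.1f}")
print(f"without cache: {rac_after_outage(False):.1f}")
```

In this model the cached host's RAC returns to its pre-outage level once the backlog is credited, while the host that ran dry ends up lower. Any dip seen while the server is down would correspond to the decay term running with no new credit arriving.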

Shame my main cruncher looks like it had a fit not too long after I left home for w#$^ this morning, so my RAC has been dropping.

btw, I resigned from my job today. My boss/owner of the company and I didn't get along. Glad other people think I'm good at what I do. Start work at my new job as soon as my notice period is up.

©2024 Astroinformatics Group