Posts by Jake Weiss
1) Message boards : Number crunching : MW@H DBase problems (Message 67130)
Posted 2 days ago by Jake Weiss
Awesome! Glad everything is looking better. We are actually doing another hardware upgrade late next week to put a second CPU into the server so hopefully it will be smooth sailing for a while after that.


Thanks for sticking with us everyone,

Jake
2) Message boards : News : Scheduled Server Maintenance 2/21 (Message 67123)
Posted 4 days ago by Jake Weiss
Hi Everyone,

There will be a scheduled server outage today around 1pm. This should be a brief outage while we install some additional RAM in the server.

Thank you for your understanding.

Jake

***

The server has been upgraded. Let me know if you see any more database errors or other hiccups here.

***
3) Message boards : Number crunching : Huge number of 'Validation inconclusive' WUs (Message 67116)
Posted 5 days ago by Jake Weiss
Hey Everyone,

Looks like the workunit generator went a little overboard over the weekend and clogged things up. The validation inconclusives should clear out of the queue over the next two or three days as the server works through the backlog.

On another note, the new RAM for the server should be here tomorrow and that should help prevent this from happening again.

Jake
4) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67097)
Posted 8 days ago by Jake Weiss
There is a limit on the number of workunits you can download and he is way over it. That seems like a bug in BOINC. Not sure what happened there, but it seems pretty rare. If it starts to happen more consistently, I will look into it a bit more.

Jake
5) Message boards : Number crunching : MW@H DBase problems (Message 67091)
Posted 9 days ago by Jake Weiss
Hey everyone,

I'm working on a hardware fix for the server. It's been running out of RAM, but I just ordered another 32 GB of RAM for it. Hopefully that will get it running a little more smoothly. It should be here in a week or two.

Jake
6) Message boards : News : Changing Workunit Priority to 1 from 0 (Message 67082)
Posted 11 days ago by Jake Weiss
Hey EG,

This is very strange. I'm not sure what's going on here, but I will look into it.

Jake
7) Message boards : Number crunching : MW@H DBase problems (Message 67077)
Posted 11 days ago by Jake Weiss
Hey Everyone,

I implemented something today that might fix this. The problem seems to be too many open connections on the database. I've made some configuration changes to help improve connection turnover and increase the connection limit.

Jake
8) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67072)
Posted 13 days ago by Jake Weiss
Hey Everyone,

I've done a little digging to see if unreliable hosts are getting fewer workunits now and that does seem to be the case. I have found several faulty computers that currently have 0 workunits in progress where they had 100+ as of last week.

I am going to consider this issue resolved for now unless anyone objects.

Jake
9) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67057)
Posted 14 days ago by Jake Weiss
Says you have zero workunits in progress. That's a good sign!

Jake
10) Message boards : News : Changing Workunit Priority to 1 from 0 (Message 67046)
Posted 15 days ago by Jake Weiss
Hi Everyone,

In order to use the "Use Reliable Host" feature built into the BOINC server software, we must change our workunit priority to 1 from 0. I am implementing this now and will check in over the weekend to make sure everything is running well.

If you have any questions or see any issues, please let me know.

Jake
11) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67045)
Posted 15 days ago by Jake Weiss
Hey everyone,

I found a small bug in the way that the scheduler tests for reliability on workunits with priority 0. I am going to try changing everything to priority 1 and see if that fixes things. Hopefully this won't change how quickly our workunits are processed or affect how it interacts with workunits from other projects. Let me know if you see any issues with this on your end.

Jake
12) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67034)
Posted 18 days ago by Jake Weiss
Nothing new to report yet. I've tried changing a few things on the server config side, but none of it seems to make a difference. Hopefully I will find the right knob to tweak soon.

Jake
13) Message boards : News : Nbody 1.68 release (Message 67018)
Posted 24 days ago by Jake Weiss
Congrats on the new version!
14) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 67012)
Posted 26 days ago by Jake Weiss
Hey Everyone,

I wanted to give this a couple days to see if the server just had to learn who was unreliable. Seems that's not the case. I will do a little more research into configuring the server better throughout the week and run some more tests.

Jake
15) Message boards : MilkyWay@home Science : Science Summary (Message 67005)
Posted 26 Jan 2018 by Jake Weiss
Hey Cactus Bob,

We are currently working on several publications, but our project staff is a bit stretched at the moment. Nbody is under considerable development, with hard work going into getting a working GPU client, a new dwarf galaxy model, and a new optimization metric. Separation finished most of its client development last spring, but more recently I have been working on improving the server-side optimization methods to better search our parameter space.

One reason there has been a long gap in publication is that Sidd and I have both been working hard to develop, test, deploy, and run several iterations of our code to try to get Separation and Nbody to give scientifically significant results. We have collectively fixed many issues with our optimizer and respective client codes.

With all of that said, I should have a paper submitted for publication using the results from my runs by the end of February, then I will be working on my thesis for August and then another journal publication shortly after that. We are not trying to be opaque or keep information from you. Sidd and I are very busy working towards getting results and publications and sometimes we neglect answering the forums or updating webpages. Sorry for that.

Jake
16) Message boards : MilkyWay@home Science : YouTube Channel (Message 66990)
Posted 22 Jan 2018 by Jake Weiss
Hey D Pen,

Jeff Thompson is no longer working on the project and he is no longer encouraging us to make videos. A lot of interesting stuff has been going on with NBody though so I will ask Sidd to make a couple videos.

Jake
17) Message boards : Number crunching : Hosts with only invalid results (Message 66989)
Posted 22 Jan 2018 by Jake Weiss
Hey Everyone,

I just tried turning on some options to fix this problem.

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4227

Let me know if there are any issues there.

Jake
18) Message boards : News : Reducing Workunits to Unreliable Hosts (Message 66988)
Posted 22 Jan 2018 by Jake Weiss
Hey Everyone,

I just tried turning on some options to reduce workunits sent to hosts that return a significant number of errors. If you see any issues, please let me know.

Thank you all for your continued support.

Jake
19) Message boards : Number crunching : There's something VERY wrong with the "Top GPU models" list (Message 66880)
Posted 23 Dec 2017 by Jake Weiss
Hey Everyone,

Just for a little insight as to why newer cards may not seem to be a huge improvement over older cards. About a year ago we switched to bundled work units which are about 5 times the size of old work units.

We also currently run our application using double precision calculations to improve the fitting of our models. In older GPUs you might find that 1/4 or 1/8 of the cores in the GPU could do double precision calculations. In newer GPUs there are significantly more cores in general, but the ratio of double to single precision cores has dropped to 1/24 or 1/32. This means that the number of double precision cores has not scaled to the same degree as the single precision cores have, and as such you will not see huge performance increases with newer cards.
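
To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The core counts are made up for illustration and are not the specs of any real card; only the 1/8 and 1/32 ratios come from the explanation above. It also ignores clock speed, memory bandwidth, and everything else that affects real performance.

```python
def double_precision_cores(total_cores: int, dp_ratio: float) -> float:
    """Estimate how many cores can do double precision work.

    total_cores and dp_ratio are hypothetical inputs, not real GPU specs.
    """
    return total_cores * dp_ratio


# Hypothetical older card: fewer cores, but 1/8 of them handle double precision.
older_total = 512
older_dp = double_precision_cores(older_total, 1 / 8)

# Hypothetical newer card: 7x the cores, but only 1/32 handle double precision.
newer_total = 3584
newer_dp = double_precision_cores(newer_total, 1 / 32)

# Growth in single precision cores vs. growth in double precision cores.
single_precision_growth = newer_total / older_total
double_precision_growth = newer_dp / older_dp

print(f"Older card: {older_dp:.0f} double precision cores")
print(f"Newer card: {newer_dp:.0f} double precision cores")
print(f"Single precision growth: {single_precision_growth:.2f}x")
print(f"Double precision growth: {double_precision_growth:.2f}x")
```

With these made-up numbers the newer card has 7 times the cores but only 1.75 times the double precision cores, which is the kind of gap that keeps newer cards from pulling far ahead on our application.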

That being said, I have been testing our application on single precision calculations to see if we could have any use for them in the future. The results look promising and hopefully in the first couple months of next year, you might see a test project popping up that will see if we get the same results with it as our double precision application.

Jake
20) Message boards : Number crunching : Profiles? (Message 66869)
Posted 18 Dec 2017 by Jake Weiss
All fixed. It looks like when we migrated over to the new ReCaptcha it broke.

Jake



Copyright © 2018 AstroInformatics Group