feel free to cancel any in progress WUs

Message boards : News : feel free to cancel any in progress WUs

Author Message
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 51312 - Posted: 9 Oct 2011 | 19:03:00 UTC

Looks like I'm going to have to drop the result and workunit tables to get the database working again. Feel free to cancel any workunits you have in progress. I apologize for this, but it looks like it's the only way to get the project back on its feet in any reasonable amount of time.
____________

Profile banditwolf
Avatar
Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 51313 - Posted: 9 Oct 2011 | 19:29:07 UTC

Getting:

10/9/2011 3:24:06 PM|Milkyway@home|Message from server: Project is temporarily shut down for maintenance

Also, the Server and Task pages won't load. Related, I'm sure.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile TimeRanger
Send message
Joined: 31 Oct 10
Posts: 19
Credit: 968,389
RAC: 1,476
Message 51314 - Posted: 9 Oct 2011 | 21:24:55 UTC

Just wondering if - in the future - the number of tasks/WUs a person can have cached can be increased, so we can keep working while the project is down? As it is, I have been without work for about 3 days now. Thanks.

Sunny129
Avatar
Send message
Joined: 25 Jan 11
Posts: 249
Credit: 165,504,595
RAC: 526,238
Message 51315 - Posted: 9 Oct 2011 | 21:25:14 UTC - in response to Message 51313.

Getting:
10/9/2011 3:24:06 PM|Milkyway@home|Message from server: Project is temporarily shut down for maintenance

Also, the Server and Task pages won't load. Related, I'm sure.

It's been ~2 hours since you posted. The server status page is back up, as are all the other MW@H web pages.
____________

Freewill
Send message
Joined: 27 Dec 09
Posts: 3
Credit: 41,127,992
RAC: 0
Message 51316 - Posted: 9 Oct 2011 | 22:14:48 UTC - in response to Message 51312.

Looks like I'm going to have to drop the result and workunit tables to get the database working again. Feel free to cancel any workunits you have in progress. I apologize for this, but it looks like it's the only way to get the project back on its feet in any reasonable amount of time.

I have a full set of work units which are at "Ready to Report" status. Does this mean I am not going to get credit for them?

Profile TimeRanger
Send message
Joined: 31 Oct 10
Posts: 19
Credit: 968,389
RAC: 1,476
Message 51317 - Posted: 9 Oct 2011 | 22:32:28 UTC - in response to Message 51316.

Looks like I'm going to have to drop the result and workunit tables to get the database working again. Feel free to cancel any workunits you have in progress. I apologize for this, but it looks like it's the only way to get the project back on its feet in any reasonable amount of time.

I have a full set of work units which are at "Ready to Report" status. Does this mean I am not going to get credit for them?


I'm in the same position.

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 915
Credit: 74,781,320
RAC: 237
Message 51318 - Posted: 9 Oct 2011 | 22:39:26 UTC

I only have 11 in the "Ready to Report" stage. Of course, you cannot abort those, as they have already completed.
____________

Profile BladeD
Send message
Joined: 2 Nov 10
Posts: 656
Credit: 114,350,545
RAC: 33,372
Message 51319 - Posted: 9 Oct 2011 | 22:43:04 UTC - in response to Message 51314.

Just wondering if - in the future - the number of tasks/WUs a person can have cached can be increased, so we can keep working while the project is down? As it is, I have been without work for about 3 days now. Thanks.

Don't you have a backup project?
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 51320 - Posted: 9 Oct 2011 | 22:53:09 UTC - in response to Message 51314.

Just wondering if - in the future - the number of tasks/WUs a person can have cached can be increased, so we can keep working while the project is down? As it is, I have been without work for about 3 days now. Thanks.


If we increase the cache, then this type of database crash would happen significantly more often. We just don't have a powerful enough server to increase the size of the workunit and result tables that much.
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 51321 - Posted: 9 Oct 2011 | 22:54:28 UTC - in response to Message 51317.

Looks like I'm going to have to drop the result and workunit tables to get the database working again. Feel free to cancel any workunits you have in progress. I apologize for this, but it looks like it's the only way to get the project back on its feet in any reasonable amount of time.

I have a full set of work units which are at "Ready to Report" status. Does this mean I am not going to get credit for them?


I'm in the same position.


Sadly, that's the case. :( It was taking about an hour to do a single query on the result table -- which is why everything was brought to a screeching halt. The only way I could get things responsive again was to clear the result and workunit tables.

I'm going to have to lower the time workunits are kept in the database. I think the number was too high, and that's what caused the result table to get too large, become corrupt, and then crash the whole project.
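
To illustrate why the retention time matters so much, here is a rough sketch, not the project's actual tooling. It assumes the result table settles at roughly (results arriving per second) x (seconds each row is kept), which is just Little's law. The arrival rate and both retention values are made-up placeholders, not MilkyWay@home figures.

```python
# Rough steady-state estimate of result-table size (Little's law):
# rows ~= results arriving per second * seconds each row is kept.
# Every number below is a hypothetical placeholder, not a project figure.


def steady_state_rows(results_per_second: float, retention_hours: float) -> float:
    """Approximate number of rows held once arrivals and purges balance."""
    return results_per_second * retention_hours * 3600


if __name__ == "__main__":
    rate = 50.0  # hypothetical results reported per second
    for hours in (24.0, 5.0):  # hypothetical retention settings
        rows = steady_state_rows(rate, hours)
        print(f"retention {hours:>4} h -> ~{rows:,.0f} rows")
```

Under this simple model the table size scales linearly with the retention time, so cutting retention is the cheapest lever when the arrival rate cannot be reduced.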
____________

Profile banditwolf
Avatar
Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 51322 - Posted: 9 Oct 2011 | 23:59:09 UTC - in response to Message 51321.

What time limit are you going to try?
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile TimeRanger
Send message
Joined: 31 Oct 10
Posts: 19
Credit: 968,389
RAC: 1,476
Message 51323 - Posted: 9 Oct 2011 | 23:59:53 UTC - in response to Message 51319.

Just wondering if - in the future - the number of tasks/WUs a person can have cached can be increased, so we can keep working while the project is down? As it is, I have been without work for about 3 days now. Thanks.

Don't you have a backup project?


I originally started with SETI and stayed with them for many years. However, about 18 months ago, their application kept locking up and I started doing MW as my backup...then I quit SETI altogether due to the probs. I thought I had found a good project with MW ... Guess I'm going to have to look around and see what else I can find to give this machine something to do.

Damon
Send message
Joined: 28 Sep 11
Posts: 1
Credit: 220,800
RAC: 41
Message 51324 - Posted: 10 Oct 2011 | 0:20:14 UTC - in response to Message 51323.

You could try World Community Grid. They have a bunch of projects, such as searching for a drug for a certain tropical disease (sorry, it's a pain to recall the spelling) that infects several million people a year, mostly children. There is no vaccine, and existing treatments can be fatal or have serious side effects.

Others are to find organic substances that can be used to make solar cells with 20% efficiency thereby making solar power incredibly cheap and environmentally friendly in addition to providing a lot more power.

You can use BOINC Manager to add World Community Grid to your projects and run both MilkyWay@home and World Community Grid tasks at the same time.

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 51325 - Posted: 10 Oct 2011 | 0:51:31 UTC - in response to Message 51322.

What time limit are you going to try?


Right now, 20% of a day, so about 5 hours.
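
As a quick check of that conversion (a minimal sketch; the function name is only illustrative), 20% of a day works out to slightly under 5 hours:

```python
HOURS_PER_DAY = 24


def retention_hours(fraction_of_day: float) -> float:
    """Convert a retention time given as a fraction of a day into hours."""
    return fraction_of_day * HOURS_PER_DAY


if __name__ == "__main__":
    hours = retention_hours(0.20)  # the "20% of a day" setting
    print(f"{hours:.1f} hours ({hours * 60:.0f} minutes)")  # 4.8 hours (288 minutes)
```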
____________

w42
Send message
Joined: 5 Sep 09
Posts: 1
Credit: 24,643,448
RAC: 0
Message 51326 - Posted: 10 Oct 2011 | 3:59:07 UTC - in response to Message 51320.

If we increase the cache, then this type of database crash would happen significantly more often. We just don't have a powerful enough server to increase the size of the workunit and result tables that much.


Oy.. :( What's your DB size and especially result table size when you start having problems?

And I take it you can't increase the work size, so there's more work for people to do with fewer results/workunits in the DB?

Profile BladeD
Send message
Joined: 2 Nov 10
Posts: 656
Credit: 114,350,545
RAC: 33,372
Message 51327 - Posted: 10 Oct 2011 | 4:16:07 UTC

So, what's the status on getting things flowing again?
____________

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 30 Aug 07
Posts: 1976
Credit: 26,480
RAC: 0
Message 51328 - Posted: 10 Oct 2011 | 4:46:35 UTC - in response to Message 51327.

So, what's the status on getting things flowing again?


Sometime tomorrow it's looking like. Things aren't quite ready yet and I do need to try and get some sleep. :(
____________

ryan
Send message
Joined: 5 Nov 10
Posts: 1
Credit: 35,461
RAC: 293
Message 51329 - Posted: 10 Oct 2011 | 5:28:28 UTC

Seems like I have done a ton of work for this project only to have my results say that the result is being checked, or some such thing, and no credit is given. Happens WAY too much.

Profile dskagcommunity
Avatar
Send message
Joined: 26 Feb 11
Posts: 155
Credit: 32,421,708
RAC: 170,065
Message 51330 - Posted: 10 Oct 2011 | 6:00:02 UTC
Last modified: 10 Oct 2011 | 6:00:33 UTC

We should donate something to get a new server for MW. Having this project stable would be good any day O.o
____________
DSKAG Austria Research Team: http://www.research.dskag.at



BarryAZ
Send message
Joined: 1 Sep 08
Posts: 512
Credit: 223,431,090
RAC: 162,144
Message 51331 - Posted: 10 Oct 2011 | 6:26:43 UTC - in response to Message 51323.

If you are looking for another astro project, consider Einstein (assuming you are running CPU projects). If you are running an ATI GPU configuration, there are a few options, though none are astro oriented: Collatz (excellent stability there, though the credit payout is lower than MW, for what it's worth), Moo Wrapper, or Dnet. Those projects also work with CUDA GPUs (and none of the three requires a double-precision GPU).




I originally started with SETI and stayed with them for many years. However, about 18 months ago, their application kept locking up and I started doing MW as my backup...then I quit SETI altogether due to the probs. I thought I had found a good project with MW ... Guess I'm going to have to look around and see what else I can find to give this machine something to do.


____________

Profile Tony Stark
Avatar
Send message
Joined: 21 Sep 11
Posts: 38
Credit: 184,327,575
RAC: 14
Message 51332 - Posted: 10 Oct 2011 | 7:55:32 UTC - in response to Message 51320.

(snip)... We just don't have a powerful enough server to increase the size of the workunit and result tables that much.


What. Don't you have some nice NSF grant money you could throw at it? ;-)

But seriously, what kind of hardware would we be talking about? (please be specific)

I've already spent $1500 USD on my first cruncher, specifically buying double precision Radeons because I liked this project. If we're talking about a single machine built from off-the-shelf parts, I bet if you asked around the boards, enough people would rise to the occasion to make it happen. I'd be willing to get the ball rolling with, say, $500 USD. Just chip in what you can folks.

Really.

HassanShebli
Send message
Joined: 2 Oct 10
Posts: 56
Credit: 15,870,183
RAC: 1,438
Message 51333 - Posted: 10 Oct 2011 | 8:35:19 UTC
Last modified: 10 Oct 2011 | 8:36:43 UTC

This is frustrating

The only reasonable solution is either to increase the WU size, so it takes hours to finish (along with the credit), or to increase the number of WUs, so users have enough work to crunch.

As I understand it, you need a bigger server to achieve that... how much will it cost?
I have a 6970, and it works with very few projects besides MW, which I love, but the frequent crashing drives me crazy. I think I am going to get rid of my card and get an Nvidia, as it runs with a lot more projects.


Note: still no WUs are being sent to my machine, so I guess the server is down.

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 51334 - Posted: 10 Oct 2011 | 10:53:50 UTC

DC has a big problem. Volunteer computers have increasing capabilities (faster multicore CPUs and faster GPUs), while project hardware was designed for lower traffic and less WU throughput.

MW is down, Seti is down, DNA has no WUs, Orbit has no WUs, Spinhenge is pausing for three months, LHC currently has no WUs, and I'm sure that other projects have similar problems.
In many cases it's a hardware problem, in all cases it's a manpower problem, and in some cases it's a management problem. And in every case it's a cash problem.

We, the volunteers, cannot help with manpower (as Gipsel did two years ago); the only things we can do are bring over ideas from other projects to increase performance, or donate for new hardware.

A new server: I mean, a dual-CPU motherboard with two 8-core CPUs, 12-16GB RAM, 6 SSD drives, a case and a power supply should do the job for the next 2 years. If one does not buy the cheapest available parts, $5k should be OK.
So ten guys like Tony could keep the ball rolling ...

Other projects for ATI GPUs: GPUGRID is working on an OpenCL app for ATI cards. It should be available in a couple of days, let's say as a backup project.

Or: take some time for maintenance - remove the dust from your cards, check the fans and enjoy the silence.

Transporting ideas: http://setiathome.berkeley.edu/forum_thread.php?id=65455

Last but not least: we could try not to further increase the frustration of the admins. I'm sure they are not happy with the situation either.

FruehwF
Send message
Joined: 28 Feb 10
Posts: 120
Credit: 109,830,617
RAC: 0
Message 51335 - Posted: 10 Oct 2011 | 10:55:00 UTC

Well, I don't know anything about the topology of the MW WUs, so this is just a guess. If an increase in the WU count is not possible, maybe it's possible to increase the size of the WUs.
Maybe it's possible to launch an extra application (only for GPUs) with large WUs, so that 1 WU lasts 1 or 2 hours on an HD5870?

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 51336 - Posted: 10 Oct 2011 | 11:09:49 UTC - in response to Message 51335.

Well, I don't know anything about the topology of the MW WUs, so this is just a guess. If an increase in the WU count is not possible, maybe it's possible to increase the size of the WUs.
Maybe it's possible to launch an extra application (only for GPUs) with large WUs, so that 1 WU lasts 1 or 2 hours on an HD5870?

Good point.
Collatz gives you the choice to select 'Collatz' or 'Mini Collatz'.
But with larger WUs, checkpointing must be implemented.

Profile dskagcommunity
Avatar
Send message
Joined: 26 Feb 11
Posts: 155
Credit: 32,421,708
RAC: 170,065
Message 51337 - Posted: 10 Oct 2011 | 11:49:32 UTC - in response to Message 51334.
Last modified: 10 Oct 2011 | 11:51:46 UTC


MW is down, Seti is down, DNA has no WUs, Orbit has no WUs, Spinhenge is pausing for three months, LHC currently has no WUs, and I'm sure that other projects have similar problems.
In many cases it's a hardware problem, in all cases it's a manpower problem, and in some cases it's a management problem. And in every case it's a cash problem.



Oh yes, I'm struggling too to feed my (mostly NVIDIA) machines with projects that make real scientific sense. (It's hard with an ATI card because there is only MW. Soon GPUGrid :)) *sigh*

I tried to donate but it does not accept my creditcard? O.o
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Chris S
Avatar
Send message
Joined: 20 Sep 08
Posts: 1357
Credit: 173,075,472
RAC: 7
Message 51339 - Posted: 10 Oct 2011 | 12:15:56 UTC

DNETC and Moo Wrapper are running OK for ATI cards. Useful when MW is down.

Profile BladeD
Send message
Joined: 2 Nov 10
Posts: 656
Credit: 114,350,545
RAC: 33,372
Message 51340 - Posted: 10 Oct 2011 | 13:33:52 UTC - in response to Message 51339.

DNETC and Moo Wrapper are running OK for ATI cards. Useful when MW is down.

Too bad they don't run under BOINC manager.
____________

Profile dskagcommunity
Avatar
Send message
Joined: 26 Feb 11
Posts: 155
Credit: 32,421,708
RAC: 170,065
Message 51341 - Posted: 10 Oct 2011 | 15:01:30 UTC

Hm? Why? What do you mean by that? I have DNETC as a backup backup backup emergency energy-wasting project for when the SETI backup offers no Astropulse on the MW machine. I run it normally through BOINC Manager.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



FruehwF
Send message
Joined: 28 Feb 10
Posts: 120
Credit: 109,830,617
RAC: 0
Message 51342 - Posted: 10 Oct 2011 | 15:11:48 UTC - in response to Message 51336.

...

Good point.
Collatz gives you the choice to select 'Collatz' or 'Mini Collatz'.
But with larger WUs, checkpointing must be implemented.


Checkpointing has been implemented since 0.82.

Profile Kathryn Tombaugh-Weber
Send message
Joined: 12 Aug 11
Posts: 7
Credit: 10,975
RAC: 3
Message 51343 - Posted: 10 Oct 2011 | 16:59:53 UTC
Last modified: 10 Oct 2011 | 17:03:42 UTC

10/10/2011 2:38:27 AM | Milkyway@home | Restarting task ps_separation_82_2s_mix0_1_3396869_0 using milkyway version 88
10/10/2011 4:01:47 AM | Milkyway@home | Computation for task ps_separation_82_2s_mix0_1_3396869_0 finished
10/10/2011 4:01:48 AM | Milkyway@home | Sending scheduler request: To report completed tasks.
10/10/2011 4:01:48 AM | Milkyway@home | Reporting 1 completed tasks, not requesting new tasks
10/10/2011 4:01:51 AM | Milkyway@home | Scheduler request completed


I finished a WU last night and it went out. Didn't see any changes so THEN I read this post. Am I to understand that task is LOST?

Profile BladeD
Send message
Joined: 2 Nov 10
Posts: 656
Credit: 114,350,545
RAC: 33,372
Message 51344 - Posted: 10 Oct 2011 | 18:10:56 UTC - in response to Message 51341.

Hm? Why? What do you mean by that? I have DNETC as a backup backup backup emergency energy-wasting project for when the SETI backup offers no Astropulse on the MW machine. I run it normally through BOINC Manager.

You can't just attach to DNETC and Moo Wrapper via BOINC manager.
____________

pvh
Send message
Joined: 8 Feb 10
Posts: 13
Credit: 55,706,042
RAC: 73,389
Message 51345 - Posted: 10 Oct 2011 | 18:53:33 UTC

I for one would also be in favor of _much_ bigger workunits for GPUs (I would say roughly 100x bigger). As it is now, the turnaround time is ridiculously short: around every 1 - 2 min the server needs to be contacted for a new WU. And that is for a single GPU. No wonder your server cannot keep up. A side effect is that MW gets completely bullied by the backup projects. I get a maximum of roughly 25 - 35 min of work in my cache. So every time the server is unresponsive for that amount of time (and that happens quite often) my backup project immediately dumps 20 hours of work on me. If that would happen once every day (we are not very far off), I would be running 20 hours of backup project and only 4 hours of MW per day. I too have an ATI GPU, so the choices for backup projects are very limited and I find them all more or less useless, so I really don't want to be running these backup projects at all...
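
To put rough numbers on that, here is a small back-of-the-envelope sketch. It only uses the figures from the post (1 to 2 minutes per WU, a 100x size increase, 4 hours of MW against 20 hours of backup work); the function names are illustrative, and it ignores that the client may report several WUs in one scheduler request.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def work_units_per_day(seconds_per_wu: float) -> float:
    """Work units one GPU finishes in a day if it never sits idle."""
    return SECONDS_PER_DAY / seconds_per_wu


def primary_share(primary_hours: float, backup_hours: float) -> float:
    """Fraction of crunching time that goes to the primary project."""
    return primary_hours / (primary_hours + backup_hours)


if __name__ == "__main__":
    for minutes in (1, 2):
        now = work_units_per_day(minutes * 60)
        bigger = work_units_per_day(minutes * 60 * 100)  # the proposed 100x size
        print(f"{minutes} min/WU: {now:.0f} WUs/day now, {bigger:.1f} at 100x")
    # 4 hours of MW against 20 hours of backup work per day
    print(f"MW share: {primary_share(4, 20):.0%}")
```

So a single GPU accounts for somewhere between 720 and 1,440 result rows per day at the current size, against roughly 7 to 14 at 100x, and a 4/20 split leaves MW with about a sixth of the crunching time.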

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 51346 - Posted: 10 Oct 2011 | 19:06:20 UTC - in response to Message 51345.

So every time the server is unresponsive for that amount of time (and that happens quite often) my backup project immediately dumps 20 hours of work on me.

If collatz is your backup project, you can set the resource share to 0. This means, only one wu / gpu will be picked up. When that one finishes, the next one (again a single wu) is downloaded.

Profile Carlos R. Moreira
Avatar
Send message
Joined: 12 Sep 11
Posts: 10
Credit: 4,745,303
RAC: 0
Message 51347 - Posted: 10 Oct 2011 | 19:08:05 UTC - in response to Message 51344.
Last modified: 10 Oct 2011 | 19:13:45 UTC

Hm? Why? What do you mean by that? I have DNETC as a backup backup backup emergency energy-wasting project for when the SETI backup offers no Astropulse on the MW machine. I run it normally through BOINC Manager.

You can't just attach to DNETC and Moo Wrapper via BOINC manager.


About DNETC: yes, you can attach it through BOINC Manager. I was testing it over the last couple of days and had it attached that way. I didn't find much information about what exactly I was processing, though, so I've abandoned DNETC for now. As for Moo Wrapper, I have no information on whether it can be attached through BOINC Manager. So far, I'm crunching PrimeGRID as a backup project for MW@Home...
____________

[boinc.at] Nowi
Send message
Joined: 22 Mar 09
Posts: 89
Credit: 346,001,407
RAC: 412,607
Message 51348 - Posted: 10 Oct 2011 | 19:12:00 UTC

I second longer WUs, but one problem is the mix of CPUs and GPUs used for crunching. Increasing the WU length could make it impossible for CPUs to crunch for M@W in a reasonable time. So a decision has to be made on whether M@W is going to be a GPU-only project.

And I don't know if the project leaders want to take this step...

Nowi
____________

pvh
Send message
Joined: 8 Feb 10
Posts: 13
Credit: 55,706,042
RAC: 73,389
Message 51349 - Posted: 10 Oct 2011 | 20:01:35 UTC - in response to Message 51346.
Last modified: 10 Oct 2011 | 20:02:16 UTC

If collatz is your backup project, you can set the resource share to 0. This means, only one wu / gpu will be picked up. When that one finishes, the next one (again a single wu) is downloaded.


I have Primegrid as my backup; it is the only backup project that runs on an ATI card and that I consider to be at least vaguely useful... If you set the resource share to zero, it only makes the project the backup. It does not limit the number of WUs that are downloaded once the backup kicks in. I think backup projects should work the way you describe, but they don't. I checked on the BOINC site. There is no way to force BOINC to only download a single WU at a time.

pvh
Send message
Joined: 8 Feb 10
Posts: 13
Credit: 55,706,042
RAC: 73,389
Message 51350 - Posted: 10 Oct 2011 | 20:05:33 UTC - in response to Message 51348.

Increasing the WU length could make it impossible for CPUs to crunch for M@W in a reasonable time.


Is it strictly necessary that GPU and CPU WUs do the same amount of work? If so, then you will always have a problem since GPUs are so much faster... But I am not convinced that they need to be of the same size...

Profile dskagcommunity
Avatar
Send message
Joined: 26 Feb 11
Posts: 155
Credit: 32,421,708
RAC: 170,065
Message 51351 - Posted: 10 Oct 2011 | 20:13:35 UTC - in response to Message 51344.

Hm? Why? What do you mean by that? I have DNETC as a backup backup backup emergency energy-wasting project for when the SETI backup offers no Astropulse on the MW machine. I run it normally through BOINC Manager.


You can't just attach to DNETC and Moo Wrapper via BOINC manager.


Then please explain to me how I did it with DNETC, since you know it so exactly ... ;)

____________
DSKAG Austria Research Team: http://www.research.dskag.at



[boinc.at] Nowi
Send message
Joined: 22 Mar 09
Posts: 89
Credit: 346,001,407
RAC: 412,607
Message 51352 - Posted: 10 Oct 2011 | 20:16:22 UTC - in response to Message 51350.

Increasing the WU length could make it impossible for CPUs to crunch for M@W in a reasonable time.


Is it strictly necessary that GPU and CPU WUs do the same amount of work? If so, then you will always have a problem since GPUs are so much faster... But I am not convinced that they need to be of the same size...


This is a question which must be answered by the project scientists. At present it seems to be a must.

____________

Profile Werkstatt
Send message
Joined: 19 Feb 08
Posts: 297
Credit: 105,742,273
RAC: 84
Message 51354 - Posted: 10 Oct 2011 | 20:21:58 UTC - in response to Message 51349.


I have Primegrid as my backup, it is the only backup project that runs on an ATI and I consider to be at least vaguely useful... If you set the resource share to zero, it only makes the project the backup. It does not limit the number of WUs that are downloaded once the backup kicks in. I think backup projects should work the way you describe, but they don't. I checked on the BOINC site. There is no way to force BOINC to only download a single WU at a time.


Sorry, no. I use Collatz as backup, I use ATI cards, and it's always only one WU that is downloaded.
It depends on the project's server version, but at Collatz this feature works perfectly.

Profile banditwolf
Avatar
Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 51355 - Posted: 10 Oct 2011 | 20:22:14 UTC - in response to Message 51343.

I finished a WU last night and it went out. Didn't see any changes so THEN I read this post. Am I to understand that task is LOST?


Yes.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

Profile Tony Stark
Avatar
Send message
Joined: 21 Sep 11
Posts: 38
Credit: 184,327,575
RAC: 14
Message 51356 - Posted: 10 Oct 2011 | 20:23:06 UTC - in response to Message 51320.


If we increase the cache, then this type of database crash would happen significantly more often. We just don't have a powerful enough server to increase the size of the workunit and result tables that much.


On that note...
This is my idea of a budget (best value/cost ratio) server. All prices are in USD, quoted from US retailer Newegg.com on Oct 10, 2011 (I do not work for them). I think we as a community could come up with this. I will put up $500 seed money.

I have started a separate thread for this in the cruncher section. All feedback, especially from those with experience with server configurations, is welcome.

Not Included: Case/mounting hardware, power supply, raid hardware, which are all better left to whoever sets it up.

Qty. Product Description Total Price
1 ASUS KGPE-D16 SSI EEB 3.61 Server Motherboard Dual Socket G34 AMD SR5690 DDR3 800/1066/1333
Item #: N82E16813131643 $429.9

2 AMD Opteron 6128 Magny-Cours 2.0GHz Socket G34 115W 8-Core Server Processor OS6128WKT8EGOWOF
Item #: N82E16819105266 $499.98

16 Kingston 8GB 240-Pin DDR3 SDRAM DDR3 1333 ECC Registered w/ Parity Server Memory Model KVR1333D3Q8R9S/8G
Item #: N82E16820139280 $1,215.84
Though I would discourage it, if we can get by with 64GB, we could save ~$600.

2 Mushkin Enhanced Chronos Deluxe MKNSSDCR120GB-DX 2.5" 120GB SATA III MLC Internal Solid State Drive (SSD)
Item #: N82E16820226225 $499.98
System/boot drive. 2x120GB is for possible RAID or swap/pagefile. If neither is wanted, we could get 1x240GB for the same price or just save ~$250.

2 Western Digital AV-GP WD20EURS 2TB SATA 3.0Gb/s 3.5" Internal Hard Drive -Bare Drive
Item #: N82E16822136783 $189.98
I don't know if this application needs much local storage, but it's cheap.

2 Thermaltake CLS0015 70mm 1 Ball, 1 Sleeve CPU Cooler for AMD Socket G34 1U
Item #: N82E16835106158 $71.98
------------------------
$2,907.75
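
For anyone who wants to double-check the quote, here is a minimal sketch that works backwards from the stated total. The motherboard line reads $429.9, which looks cut off, so it is left out and derived instead; the dictionary labels are shortened by hand and are not Newegg's product names.

```python
from decimal import Decimal

# Line totals (USD) as quoted in the list above. The motherboard is left out
# because its price appears truncated in the post.
OTHER_LINES = {
    "2x Opteron 6128": Decimal("499.98"),
    "16x Kingston 8GB DDR3": Decimal("1215.84"),
    "2x Mushkin 120GB SSD": Decimal("499.98"),
    "2x WD 2TB hard drive": Decimal("189.98"),
    "2x Thermaltake CPU cooler": Decimal("71.98"),
}
QUOTED_TOTAL = Decimal("2907.75")


def implied_remaining(total: Decimal, lines: dict) -> Decimal:
    """Amount left over once the listed line totals are subtracted."""
    return total - sum(lines.values(), Decimal("0"))


if __name__ == "__main__":
    # Decimal avoids the rounding noise binary floats add to currency sums.
    print(f"Implied motherboard price: ${implied_remaining(QUOTED_TOTAL, OTHER_LINES)}")
```

The remainder comes out at $429.99, which is consistent with the truncated $429.9 on the motherboard line.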

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 915
Credit: 74,781,320
RAC: 237
Message 51359 - Posted: 10 Oct 2011 | 23:03:49 UTC - in response to Message 51354.


I have Primegrid as my backup, it is the only backup project that runs on an ATI and I consider to be at least vaguely useful... If you set the resource share to zero, it only makes the project the backup. It does not limit the number of WUs that are downloaded once the backup kicks in. I think backup projects should work the way you describe, but they don't. I checked on the BOINC site. There is no way to force BOINC to only download a single WU at a time.


Sorry, no. I use Collatz as backup, I use ATI cards, and it's always only one WU that is downloaded.
It depends on the project's server version, but at Collatz this feature works perfectly.



Actually you are both correct, the difference lies in the BOINC manager version that you are running.

The 6.10.xx series requests a chunk of work and it gets multiple work units.

The 6.12.xx series requests 1 work unit per idle resource.

I have Collatz as my backup project for the HD5830 and it only maintains 1 work unit at a time right now.
____________

Profile BladeD
Send message
Joined: 2 Nov 10
Posts: 656
Credit: 114,350,545
RAC: 33,372
Message 51361 - Posted: 11 Oct 2011 | 0:19:30 UTC - in response to Message 51351.
Last modified: 11 Oct 2011 | 0:24:13 UTC

Hm? Why? What do you mean by that? I have DNETC as a backup backup backup emergency energy-wasting project for when the SETI backup offers no Astropulse on the MW machine. I run it normally through BOINC Manager.


You can't just attach to DNETC and Moo Wrapper via BOINC manager.


Then please explain to me how I did it with DNETC, since you know it so exactly ... ;)

I don't know. It's not listed for me when I go to attach a project.

I guess there was no good news to report today...
____________

Profile dskagcommunity
Avatar
Send message
Joined: 26 Feb 11
Posts: 155
Credit: 32,421,708
RAC: 170,065
Message 51365 - Posted: 11 Oct 2011 | 5:42:02 UTC

Many projects are not listed in there. That's nothing new ;) Try DNETC.org (or .net) as the project URL. One of them works.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



BarryAZ
Send message
Joined: 1 Sep 08
Posts: 512
Credit: 223,431,090
RAC: 162,144
Message 51367 - Posted: 11 Oct 2011 | 6:49:42 UTC - in response to Message 51361.

It isn't listed there (that might be a developer choice at BOINC central), but you can attach by this sequence in the BOINC client.

1) Tools
2) Attach to project or account manager
3) Attach to project
4) Type in the project URL:

For Dnet that would be http://dnetc.net/
For Moo Wrapper that would be http://moowrap.net/

I don't know that you can do this via the BOINC account manager side of things though as I attach to projects via the attach to project option.




Then please explain to me how I did it with DNETC, since you know it so exactly ... ;)

I don't know. It's not listed for me when I go to attach a project.

I guess there was no good news to report today...

____________

Profile Kathryn Tombaugh-Weber
Send message
Joined: 12 Aug 11
Posts: 7
Credit: 10,975
RAC: 3
Message 51374 - Posted: 11 Oct 2011 | 17:02:54 UTC

I was following this discussion yesterday, since I completed a nice unit and it went out to you, only to discover belatedly that you were down.

I see the "big" contributors are asking for larger work units. I don't support that at all. I understand computers always have glitches and I'm not mad about no credit, but I'm certainly not interested in using up a lot of time for nothing again. Just saying.

Profile Blurf
Volunteer moderator
Project administrator
Send message
Joined: 13 Mar 08
Posts: 619
Credit: 25,447,954
RAC: 0
Message 51376 - Posted: 11 Oct 2011 | 19:08:54 UTC

I am gauging the cost of an appropriate server right now.
____________

Profile banditwolf
Avatar
Send message
Joined: 12 Nov 07
Posts: 2425
Credit: 295,133
RAC: 0
Message 51392 - Posted: 12 Oct 2011 | 13:15:45 UTC - in response to Message 51376.
Last modified: 12 Oct 2011 | 13:17:49 UTC

nevermind.
____________
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.

TJ
Send message
Joined: 12 Aug 09
Posts: 251
Credit: 77,777,096
RAC: 628
Message 51394 - Posted: 12 Oct 2011 | 15:26:09 UTC

Travis, we had problems in the past, and then you came up with the idea of clearing the database very quickly, so most of us could not see the results page.
After that there were no more outages, apart from the air-conditioning problem at RPI, which was not your fault.

A few weeks ago you increased the results pages (sometimes 36 for me). This is of course a heavy load on the database and server. If you reduce this again, wouldn't it help?


____________
Greetings from,
TJ

Profile arkayn
Avatar
Send message
Joined: 14 Feb 09
Posts: 915
Credit: 74,781,320
RAC: 237
Message 51404 - Posted: 13 Oct 2011 | 4:16:20 UTC - in response to Message 51394.

Travis, we had problems in the past, and then you came up with the idea of clearing the database very quickly, so most of us could not see the results page.
After that there were no more outages, apart from the air-conditioning problem at RPI, which was not your fault.

A few weeks ago you increased the results pages (sometimes 36 for me). This is of course a heavy load on the database and server. If you reduce this again, wouldn't it help?



He already has actually.

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=2621&nowrap=true#51321
I'm going to have to lower the time workunits are kept in the database. I think the number was too high, and that's what caused the result table to get too large, become corrupt, and then crash the whole project.

____________


Copyright © 2013 AstroInformatics Group