Database Maintenance 9-4-2014
Author Message
aad
Joined: 30 Mar 09
Posts: 59
Credit: 391,613,431
RAC: 153,079

Message 67785 - Posted: 6 Sep 2018, 21:31:10 UTC - in response to Message 67784.
Last modified: 6 Sep 2018, 21:31:45 UTC

Yep, all of my WUs are _0 as well, so it looks like _1 are kept behind for now.


+1

Manfred Reiff
Joined: 27 Apr 18
Posts: 5
Credit: 33,419,295
RAC: 104,797

Message 67786 - Posted: 6 Sep 2018, 22:28:35 UTC

I have the same problem - 363 workunits finished, but validation inconclusive.
My computer (Intel Core i9-7900X, 64 GB RAM, nVidia GeForce GTX 1080 Ti) keeps working, but I am rewarded for at most ONE workunit out of every two or three that are uploaded at once. Often I receive no credits at all.
On a normal day I receive some 140,000 to 160,000 credits; today I received 51,000 according to my Excel database.
So I split the computer usage in the afternoon: 19 of the 20 processors work on Milkyway and SETI@Home, while one processor and the GPU work on Einstein@Home GPU tasks - I received 237,000 credits today...
From tomorrow onwards I will reduce Milkyway usage to a maximum of 5 processors until the problem is solved!
The current situation is unacceptable!

Chooka
Joined: 13 Dec 12
Posts: 48
Credit: 113,813,908
RAC: 826,461

Message 67787 - Posted: 7 Sep 2018, 2:56:37 UTC - in response to Message 67786.

Yep I have nearly 4000 invalids :(
____________

Saenger
Joined: 28 Aug 07
Posts: 130
Credit: 10,985,554
RAC: 10,797

Message 67788 - Posted: 7 Sep 2018, 4:04:18 UTC - in response to Message 67787.

Yep I have nearly 4000 invalids :(

Not invalid, just not yet validated, because they are waiting for the wing-WU.
On one and the other of your computers.
____________
Greetings from Sänger

Chooka
Joined: 13 Dec 12
Posts: 48
Credit: 113,813,908
RAC: 826,461

Message 67789 - Posted: 7 Sep 2018, 4:17:39 UTC - in response to Message 67788.

Yes sorry.... I just realised my mistake :)
____________

mikey
Joined: 8 May 09
Posts: 2205
Credit: 250,022,407
RAC: 2,009

Message 67790 - Posted: 7 Sep 2018, 11:22:16 UTC - in response to Message 67789.
Last modified: 7 Sep 2018, 11:23:16 UTC

Yes sorry.... I just realised my mistake :)


After last year's database problems and them cancelling thousands of wu's I've shut mine off from crunching here right now!!

It also bothers me that no Admin has come back and said 'we are working on it', I wonder if they even have a clue at this point.

gambatesa
Joined: 23 Feb 18
Posts: 7
Credit: 1,018,387,668
RAC: 4,364,139

Message 67791 - Posted: 7 Sep 2018, 11:44:42 UTC

Same here.. about 14000 validation inconclusive.. we have to wait

Manfred Reiff
Joined: 27 Apr 18
Posts: 5
Credit: 33,419,295
RAC: 104,797

Message 67792 - Posted: 7 Sep 2018, 12:21:02 UTC

I think things aren't as bad as discussed here.
At present there are some 350 workunits waiting for validation including several workunits finished this morning (local time, CEST = UTC+2 hours). The number of 350 is decreasing steadily... One by one older workunits from Tuesday through to Thursday are validated. Credit reward at present is 5,000 to 6,000 per hour (a total of 30,000 for Friday morning). Till Monday hourly reward was around 10,000. I also found some "invalid" workunits in the tasks section. That's disappointing while using a high-end computer...
Because of that I have changed my workflow as of today: now I'm focussing more on Einstein@Home GPU tasks (+130,000 in the past 5 hours) and SETI@Home CPU tasks (the same disappointing delay as Milkyway at present). Until Monday Milkyway was using the whole computer (CPUs and GPU). At certain times throughout the day Einstein tasks were using 1 CPU plus the GPU. Now Milkyway is using 10 processors (no GPU), Einstein is using 1 CPU and the GPU, and SETI is using the other 9 processors (CPUs).
In the past four months I was concentrating on Milkyway@Home with a credit increase of 4.5M per month according to Free-DC and BOINC stats. Now it is time to push Einstein@Home a bit (again)... Unfortunately the credit reward for SETI is really low...

mikey
Joined: 8 May 09
Posts: 2205
Credit: 250,022,407
RAC: 2,009

Message 67793 - Posted: 7 Sep 2018, 14:21:50 UTC - in response to Message 67792.
Last modified: 7 Sep 2018, 14:23:45 UTC

I think things aren't as bad as discussed here.
At present there are some 350 workunits waiting for validation including several workunits finished this morning (local time, CEST = UTC+2 hours). The number of 350 is decreasing steadily... One by one older workunits from Tuesday through to Thursday are validated. Credit reward at present is 5,000 to 6,000 per hour (a total of 30,000 for Friday morning). Till Monday hourly reward was around 10,000. I also found some "invalid" workunits in the tasks section. That's disappointing while using a high-end computer...
Because of that I have changed my workflow as of today: now I'm focussing more on Einstein@Home GPU tasks (+130,000 in the past 5 hours) and SETI@Home CPU tasks (the same disappointing delay as Milkyway at present). Until Monday Milkyway was using the whole computer (CPUs and GPU). At certain times throughout the day Einstein tasks were using 1 CPU plus the GPU. Now Milkyway is using 10 processors (no GPU), Einstein is using 1 CPU and the GPU, and SETI is using the other 9 processors (CPUs).
In the past four months I was concentrating on Milkyway@Home with a credit increase of 4.5M per month according to Free-DC and BOINC stats. Now it is time to push Einstein@Home a bit (again)... Unfortunately the credit reward for SETI is really low...


I think you are misreading the problem or the database isn't keeping up...
RIGHT THIS MINUTE I have:
Validation inconclusive (1497)

Of those the first 10 I checked HAVE been sent out to wingmen, but the next 1000 HAVE NOT and remain "unsent" to any wingman whatsoever!!

I also have In progress (479) ALL of which are suspended until the "unsent" wu's come back down to my more normal 450 to 500 level.

THIS is the problem:

name: de_modfit_sim19fixed_bundle4_4s_NoContraintsWithDisk260_1_1536186184_320295
application: MilkyWay@Home
created: 6 Sep 2018, 1:32:53 UTC
minimum quorum: 1
initial replication: 2
max # of error/total/success tasks: 2, 9, 6

Task | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Credit | Application
2727463 | 779850 | 6 Sep 2018, 14:48:13 UTC | 6 Sep 2018, 20:05:28 UTC | Completed, validation inconclusive | 224.18 | 53.73 | pending | MilkyWay@Home v1.46 (opencl_nvidia_101)
3729195 | --- | --- | --- | Unsent | --- | --- | --- | ---

http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1653413894

As you can see I have gotten the wu, crunched it and returned it and it STILL has not even been SENT to a wingman yet!!!

BTW with your 1080Ti and using the optimization codes from the Collatz website you could be getting about 10,000 RAC per day off that ONE gpu at Collatz!!
____________

Jim1348
Joined: 9 Jul 17
Posts: 37
Credit: 1,920,449
RAC: 1,841

Message 67794 - Posted: 7 Sep 2018, 15:17:41 UTC
Last modified: 7 Sep 2018, 15:19:26 UTC

I am not sure where I fit in to the overall scheme of things, but I have been crunching N-Body only on Windows 7 64-bit.
https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=783174

So I have discontinued crunching until the DB can catch up. There is no point overloading it, and possibly losing work.

Saenger
Joined: 28 Aug 07
Posts: 130
Credit: 10,985,554
RAC: 10,797

Message 67796 - Posted: 7 Sep 2018, 16:28:00 UTC
Last modified: 7 Sep 2018, 16:28:28 UTC

I'm getting credit, even for WUs that got crunched after the maintenance. And I have a bunch of _1 WUs on my machine. So they are probably getting things sorted by and by.
____________
Greetings from Sänger

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 510
Credit: 36,785,265
RAC: 161,959

Message 67797 - Posted: 7 Sep 2018, 16:41:16 UTC

Hey Everyone,

Just wanted to explain the validation inconclusive rates after this maintenance, and most maintenance. This is not an issue with how quickly our database can handle the workunits, but how quickly our users can cross validate workunits. As many of you know, we can require up to 4 other users to cross validate workunits before they are considered valid.

At any given time, we have 300,000+ workunits being calculated by volunteers. When we take the server down for maintenance, many of these are completed while we are down. They are then all sent back to us at once to be validated, so we end up with a queue of 300,000 workunits or more that have to be validated by other volunteers to catch up. This doesn't require much work for our server or database, but it does take a long time for users to work through them all.
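
As a rough illustration of why a backlog like this takes days rather than hours to clear, here is a minimal Python sketch. Only the 300,000 figure comes from the post; the daily validation capacity and the daily inflow of new results are made-up assumptions, and `days_to_clear` is a hypothetical helper, not part of BOINC or the MilkyWay@Home server.

```python
def days_to_clear(backlog, validated_per_day, new_results_per_day=0):
    """Count whole days until a validation backlog is empty.

    All rates are illustrative assumptions: `validated_per_day` is how many
    pending results volunteers cross-validate per day, `new_results_per_day`
    is how many fresh results join the pending queue per day.
    """
    if validated_per_day <= new_results_per_day:
        raise ValueError("backlog never shrinks: inflow >= validation rate")
    days = 0
    while backlog > 0:
        backlog += new_results_per_day   # newly returned, still unvalidated
        backlog -= validated_per_day     # wingmen finish and results validate
        days += 1
    return days


# Assumed numbers: 300,000 pending results, 100,000 validated per day.
print(days_to_clear(300_000, 100_000))
# Same backlog, but 50,000 new pending results arrive each day.
print(days_to_clear(300_000, 100_000, 50_000))
```

The point of the sketch is only that the drain time depends on how fast volunteers return wingman tasks, not on server speed, which matches the explanation above.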

Sorry that there is such a backlog for validation.

Jake

mikey
Joined: 8 May 09
Posts: 2205
Credit: 250,022,407
RAC: 2,009

Message 67798 - Posted: 7 Sep 2018, 21:54:17 UTC - in response to Message 67797.

Hey Everyone,

Just wanted to explain the validation inconclusive rates after this maintenance, and most maintenance. This is not an issue with how quickly our database can handle the workunits, but how quickly our users can cross validate workunits. As many of you know, we can require up to 4 other users to cross validate workunits before they are considered valid.

At any given time, we have 300,000+ workunits being calculated by volunteers. When we take the server down for maintenance, many of these are completed while we are down. They are then all sent back to us at once to be validated, so we end up with a queue of 300,000 workunits or more that have to be validated by other volunteers to catch up. This doesn't require much work for our server or database, but it does take a long time for users to work through them all.

Sorry that there is such a backlog for validation.

Jake


So you are saying you don't send out the same wu to two computers at the same time but instead wait for someone to return it first before sending it out for validation by a 2nd pc? That's contrary to what I was seeing prior to the maintenance as many days I never saw an unsent wu listed in my list of workunits waiting for validation.

Manfred Reiff
Joined: 27 Apr 18
Posts: 5
Credit: 33,419,295
RAC: 104,797

Message 67799 - Posted: 7 Sep 2018, 22:20:42 UTC

Hi Jake,

thanks for your explanations.

I think the situation is beginning to normalize - slowly but gradually... Within the last hour (21:00 to 22:00 UTC = 23:00 to midnight local time) I was rewarded with some 20,000 credits. According to my own database (an Excel datasheet) I was rewarded with nearly 82,000 credits today. That's 60% more than yesterday (50,609). And the number of workunits with "validation inconclusive" is decreasing. I think I will reach my previous daily credit rate of 140,000 to 160,000 in the near future. And the present number of unvalidated workunits (410 at 22:20 UTC) will tend towards zero...

Manfred Reiff
Joined: 27 Apr 18
Posts: 5
Credit: 33,419,295
RAC: 104,797

Message 67800 - Posted: 7 Sep 2018, 22:26:06 UTC - in response to Message 67793.

BTW with your 1080Ti and using the optimization codes from the Collatz website you could be getting about 10,000 RAC per day off that ONE gpu at Collatz!!


Thanks for the information. I will give it a try.

Tackleway
Joined: 17 Mar 10
Posts: 14
Credit: 4,964,905
RAC: 842

Message 67801 - Posted: 7 Sep 2018, 22:31:15 UTC - in response to Message 67798.

Hi Mikey, what I've noticed is that some of the recent work units state initial replication = 1, whereas it was usually = 2. Just a thought.
____________

alanb1951
Joined: 16 Mar 10
Posts: 35
Credit: 29,631,885
RAC: 11,220

Message 67802 - Posted: 8 Sep 2018, 4:17:33 UTC - in response to Message 67798.
Last modified: 8 Sep 2018, 4:19:20 UTC

Hey Everyone,

Just wanted to explain the validation inconclusive rates after this maintenance, and most maintenance. This is not an issue with how quickly our database can handle the workunits, but how quickly our users can cross validate workunits. As many of you know, we can require up to 4 other users to cross validate workunits before they are considered valid.

At any given time, we have 300,000+ workunits being calculated by volunteers. When we take the server down for maintenance, many of these are completed while we are down. They are then all sent back to us at once to be validated, so we end up with a queue of 300,000 workunits or more that have to be validated by other volunteers to catch up. This doesn't require much work for our server or database, but it does take a long time for users to work through them all.

Sorry that there is such a backlog for validation.

Jake


So you are saying you don't send out the same wu to two computers at the same time but instead wait for someone to return it first before sending it out for validation by a 2nd pc? That's contrary to what I was seeing prior to the maintenance as many days I never saw an unsent wu listed in my list of workunits waiting for validation.


I've been seeing this "We've sent one out, it's replied, now we'll send another one" behaviour for a long time. If you check the "Sent" time of the second task sent, you'll see that it will be anything from a few minutes to a lot longer from when your task was returned. It's been like that for quite a while now (or, at least, it has for my tasks...)

This is not just a recent occurrence. I think it dates back to a time when it was possible to look at a result and decide it was within a given range (possibly a near-exact match for a prior from the same start-point?) -- if it was, no wing-man would be called upon.

I'm unsure whether this stopped when we started getting batched work-units or earlier, and it could be that it's no longer possible. If it will always need at least one wing-man now, then (as you imply) the server should probably be re-configured to send out two at the beginning!...
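
To make the idea concrete, here is a minimal Python sketch of that kind of check, assuming a result can be compared against a reference value with a numeric tolerance. The function name `needs_wingman`, the reference value and the tolerance are all hypothetical; this is only a guess at the general shape of such a rule, not the actual MilkyWay@Home or BOINC validator logic.

```python
def needs_wingman(result, reference, tolerance):
    """Return True if a second task should be issued for cross-validation.

    Hypothetical rule: a result close enough to an already trusted
    reference value is accepted alone; anything else needs a wingman.
    """
    return abs(result - reference) > tolerance


# Assumed values for illustration only.
print(needs_wingman(1.0005, reference=1.0, tolerance=0.001))  # within range
print(needs_wingman(1.5, reference=1.0, tolerance=0.001))     # outside range
```

If no such check is possible any more, every first result would fall into the "needs a wingman" branch, which is why sending two tasks out at the start would save a round trip.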

Perhaps Jake can clarify this for us??? Inquiring minds want to know!

Cheers - Al.

gambatesa
Joined: 23 Feb 18
Posts: 7
Credit: 1,018,387,668
RAC: 4,364,139

Message 67803 - Posted: 8 Sep 2018, 8:04:00 UTC - in response to Message 67791.
Last modified: 8 Sep 2018, 8:09:49 UTC

Same here.. about 14000 validation inconclusive.. we have to wait


Validation inconclusive WUs are steadily decreasing.. about 5500 at the moment..

There is always a start with 0 wingmen.. But it is more evident when everyone runs out of work and everyone starts again from 0

There is no reason to panic.. this kind of validation queue is absolutely normal, and that's why, if you stop crunching today, you continue to receive (decreasing) credits in the following days..
____________
Teach to sons what they can do with a single GPU.. and then show your 20 Gpus at work..

mikey
Joined: 8 May 09
Posts: 2205
Credit: 250,022,407
RAC: 2,009

Message 67804 - Posted: 8 Sep 2018, 12:22:38 UTC - in response to Message 67801.

Hi Mikey, what I've noticed is that some of the recent work units state initial replication = 1, whereas it was usually = 2. Just a thought.


Yes that's MW's way of trying to eliminate the wingman, it's been around for awhile but I think is still undergoing testing and I'm not sure it will replace the wingman altogether in all cases.

mikey
Joined: 8 May 09
Posts: 2205
Credit: 250,022,407
RAC: 2,009

Message 67805 - Posted: 8 Sep 2018, 12:29:12 UTC - in response to Message 67803.

Same here.. about 14000 validation inconclusive.. we have to wait


Validation inconclusive WUs are steadily decreasing.. about 5500 at the moment..

There is always a start with 0 wingmen.. But it is more evident when everyone runs out of work and everyone starts again from 0

There is no reason to panic.. this kind of validation queue is absolutely normal, and that's why, if you stop crunching today, you continue to receive (decreasing) credits in the following days..


That's very true, but last time the inconclusives got sooo big they wiped out EVERYONE'S work and said 'sorry the database is bad' and we ALL lost hundreds if not thousands of already completed workunits. I would rather that not happen again. I personally lost over 500 completed workunits, while others reported losing thousands of them... To me, having my RAC dip is much better than losing already completed work. This isn't a project one crunches to get max RAC anyway, this is a project one crunches for the purpose of helping out Science.


Copyright © 2019 AstroInformatics Group