testing work generation with 'ps_separation_14_2s_null_3'
Message boards : News : testing work generation with 'ps_separation_14_2s_null_3'

Author Message
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54578 - Posted: 1 Jun 2012, 18:00:30 UTC

I'm testing work generation right now, so there should be workunits available to download. Let me know how these workunits are crunching!

--Travis
____________

Phil
Joined: 29 Aug 10
Posts: 25
Credit: 2,172,252,217
RAC: 0

Message 54579 - Posted: 1 Jun 2012, 18:01:49 UTC

I am getting computation errors on all WUs

Sebastian*
Joined: 8 Apr 09
Posts: 64
Credit: 5,815,211,677
RAC: 374,080

Message 54580 - Posted: 1 Jun 2012, 18:02:43 UTC

Same here. They run like normal, but when they reach 100% they end with a Computation Error.

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54581 - Posted: 1 Jun 2012, 18:33:03 UTC - in response to Message 54579.

I am getting computation errors on all WUs


I think I fixed the problem: I generated a new batch of workunits. Let me know how these are crunching.
____________

tomast
Joined: 9 May 12
Posts: 12
Credit: 10,339,447
RAC: 0

Message 54582 - Posted: 1 Jun 2012, 18:41:14 UTC

Still getting computation errors
(Not had any errors before today.)

Incorrect function. (0x1) - exit code 1 (0x1)
Failed to read number of star points from file
(2): No such file or directory

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54584 - Posted: 1 Jun 2012, 19:01:14 UTC - in response to Message 54582.

Still getting computation errors
(Not had any errors before today.)

Incorrect function. (0x1) - exit code 1 (0x1)
Failed to read number of star points from file
(2): No such file or directory


Looks like Matt N. gave me a bad star file. I started up 'ps_separation_14_2s_null_3_v2'; hopefully that will fix it.
____________
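
The error quoted above ("Failed to read number of star points from file" followed by "(2): No such file or directory") suggests the application could not open or parse the star file sent with the workunit; the "(2)" matches the standard errno value ENOENT. A pre-flight check along these lines might catch a bad or missing star file before workunits are generated. This is only a sketch: the file layout (an integer star count on the first line, then three numeric values per star), the function name check_star_file, and the sample file names are assumptions for illustration, not the project's actual format or tooling.

```python
import os
import tempfile


def check_star_file(path):
    """Return (ok, message) for a star file.

    ASSUMED layout (not confirmed by the thread): the first non-blank line
    is an integer star count, followed by one line of three numbers per star.
    """
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            lines = [line.strip() for line in handle if line.strip()]
    except OSError as exc:
        # A missing file gives errno 2 (ENOENT), "No such file or directory".
        return False, f"cannot open star file: ({exc.errno}) {exc.strerror}"

    if not lines:
        return False, "star file is empty"
    try:
        expected = int(lines[0])
    except ValueError:
        return False, f"first line is not a star count: {lines[0]!r}"

    rows = lines[1:]
    if len(rows) != expected:
        return False, f"header says {expected} stars, found {len(rows)} rows"
    for index, row in enumerate(rows, start=1):
        fields = row.split()
        if len(fields) != 3:
            return False, f"star row {index}: expected 3 values, got {len(fields)}"
        try:
            [float(field) for field in fields]
        except ValueError:
            return False, f"star row {index}: non-numeric value in {row!r}"
    return True, f"{expected} stars OK"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical sample data, written only to exercise the check.
        good = os.path.join(workdir, "stars_good.txt")
        with open(good, "w", encoding="utf-8") as handle:
            handle.write("2\n1.0 2.0 3.0\n4.0 5.0 6.0\n")
        print(check_star_file(good))
        print(check_star_file(os.path.join(workdir, "missing.txt")))
```

Running a check like this on the server before creating a batch would fail fast, instead of letting each task run to 100% on a volunteer's machine and then error out.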

Jimmy Gondek
Joined: 28 Sep 11
Posts: 60
Credit: 22,764,173
RAC: 0

Message 54586 - Posted: 1 Jun 2012, 19:17:33 UTC

...nope, nothing comin' out of the hose...you sure the water's turned on?...

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54587 - Posted: 1 Jun 2012, 19:27:04 UTC - in response to Message 54586.

...nope, nothing comin' out of the hose...you sure the water's turned on?...


Just made 500 more workunits from the new search.
____________

tomast
Joined: 9 May 12
Posts: 12
Credit: 10,339,447
RAC: 0

Message 54588 - Posted: 1 Jun 2012, 19:42:00 UTC

_V2 still the same error right at the end of processing.
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=225454907

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54589 - Posted: 1 Jun 2012, 20:09:18 UTC - in response to Message 54588.

_V2 still the same error right at the end of processing.
http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=225454907


I'm looking into this; it seems like something weird is going on with the star files Matt N. gave me.
____________

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54591 - Posted: 1 Jun 2012, 20:21:33 UTC - in response to Message 54589.

Looks like _v2 might have been using the old, wrong star file. I'm hoping v3 fixes that.
____________

Ray_GTI-R
Joined: 5 Nov 10
Posts: 69
Credit: 15,061,882
RAC: 0

Message 54593 - Posted: 1 Jun 2012, 21:39:36 UTC - in response to Message 54591.

Same for me, Computer 427419.
Will credits be given for completed work that fails this way?
Thanks.

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0

Message 54594 - Posted: 1 Jun 2012, 21:49:51 UTC

thank god others are having the same problem LOL. i've been pulling my hair out for the last hour trying to figure out why tasks are essentially running to completion and then erroring out at the last second...i feel much better now that i know it's a server-side issue.
____________

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54595 - Posted: 1 Jun 2012, 21:53:45 UTC - in response to Message 54594.

thank god others are having the same problem LOL. i've been pulling my hair out for the last hour trying to figure out why tasks are essentially running to completion and then erroring out at the last second...i feel much better now that i know it's a server-side issue.


From what I can tell, it looks like the newly generated 'ps_separation_14_2s_null_3_v3' workunits are crunching and validating, so I think we're in the clear from here on out.
____________

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0

Message 54596 - Posted: 1 Jun 2012, 21:58:40 UTC - in response to Message 54595.

From what I can tell, it looks like the newly generated 'ps_separation_14_2s_null_3_v3' workunits are crunching and validating, so I think we're in the clear from here on out.

thanks for the update Travis. i wouldn't know yet, as i immediately suspended all MW@H work as soon as i saw WU's erroring out. now that i've just discovered the nature of the problem, i can resume crunching the remaining MW@H tasks in my queue (even though i know they'll error out). once those tasks have cleared my host, i can test the ps_separation_14_2s_null_3_v3 WU's and confirm whether or not the errors are gone...that is, if someone doesn't beat me to it.
____________

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54597 - Posted: 1 Jun 2012, 22:01:04 UTC - in response to Message 54596.

From what I can tell, it looks like the newly generated 'ps_separation_14_2s_null_3_v3' workunits are crunching and validating, so I think we're in the clear from here on out.

thanks for the update Travis. i wouldn't know yet, as i immediately suspended all MW@H work as soon as i saw WU's erroring out. now that i've just discovered the nature of the problem, i can resume crunching the remaining MW@H tasks in my queue (even though i know they'll error out). once those tasks have cleared my host, i can test the ps_separation_14_2s_null_3_v3 WU's and confirm whether or not the errors are gone...that is, if someone doesn't beat me to it.



Well, I've gotten back a bunch of successful ps_separation_14_2s_null_3_v3 results, so it's looking like things will be good from here on out unless I screw something else up. I've actually been surprised at how smoothly things have been going so far (considering it was a total reimplementation). I did a lot of offline testing, but there are always kinks to work out when something like that goes live. Of course, I'm probably shooting myself in the foot by saying that, so expect incoming catastrophic errors. :P
____________

Sunny129
Joined: 25 Jan 11
Posts: 271
Credit: 346,072,284
RAC: 0

Message 54598 - Posted: 1 Jun 2012, 22:05:39 UTC

ok, the ps_separation_14_2s_null_3_v3 WU's are crunching to completion without errors...so it seems all is well for the time being.
____________

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0

Message 54599 - Posted: 1 Jun 2012, 22:09:53 UTC - in response to Message 54598.

I've also started a DE search: 'de_separation_14_2s_05_3'. It's using a different star file (but correctly formatted as far as I can tell), so let me know if those are crunching correctly as well.
____________

tomast
Joined: 9 May 12
Posts: 12
Credit: 10,339,447
RAC: 0

Message 54601 - Posted: 1 Jun 2012, 22:50:13 UTC

So far so good ;-) mostly...

null_3_v4 --- Completed, validation inconclusive (all good so far)
05_3 --- Completed, validation inconclusive (all good so far)
sample_1 --- Completed and validated (all good so far)
null_3_v2 --- Computation error
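
A per-search tally like the one above makes it easier to see which batch is still failing. The sketch below uses only the four statuses reported in this post; the function name failing_searches and the rule that any status containing the word "error" counts as a failure are assumptions for illustration.

```python
from collections import defaultdict

# Statuses copied from the post above: (search suffix, reported outcome).
REPORTS = [
    ("null_3_v4", "Completed, validation inconclusive"),
    ("05_3", "Completed, validation inconclusive"),
    ("sample_1", "Completed and validated"),
    ("null_3_v2", "Computation error"),
]


def failing_searches(reports):
    """Return the sorted search names with at least one error outcome."""
    outcomes = defaultdict(list)
    for search, status in reports:
        outcomes[search].append(status)
    return sorted(
        search
        for search, statuses in outcomes.items()
        # Assumed rule: a status mentioning "error" marks the search as failing.
        if any("error" in status.lower() for status in statuses)
    )


print(failing_searches(REPORTS))  # prints ['null_3_v2']
```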

Ray_GTI-R
Joined: 5 Nov 10
Posts: 69
Credit: 15,061,882
RAC: 0

Message 54602 - Posted: 1 Jun 2012, 23:40:18 UTC - in response to Message 54601.

GPU-only tasks tested ...

PC #A
Just completed a ps_separation_09 task, OK
All else fails with computation error at 100% completion:- ps_separation_14_2s_null_3_v2, v3, v4, ps_separation_14_2s_05_03
I have restarted, cold booted and detached/reattached. Same problem as above.
Result:-
Will suspend project and abort existing ps_separation_14 tasks until a fix is in place.

PC #B (only now switched it on after a couple of days, so no recent tasks have been loaded yet)
Completing ps_separation_09 tasks (7 of them so far), OK
Result:-
I've switched to "No new tasks" for now.

For those with headless/unattended servers, you're going to either be busy for a while or else waste a lot of electricity doing nothing until a fix is found.



Copyright © 2018 AstroInformatics Group