Welcome to MilkyWay@home

New Version of Separation Modified Fit (1.32)

bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 62329 - Posted: 12 Sep 2014, 19:38:31 UTC - in response to Message 62328.  

Thanks for the Update!
We'll be standing by.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62331 - Posted: 12 Sep 2014, 20:47:33 UTC

Looks like the download issue should be resolved and I can start working on some of the other issues again.

Jake W.

bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 62332 - Posted: 12 Sep 2014, 20:51:12 UTC - in response to Message 62331.  
Last modified: 12 Sep 2014, 21:48:53 UTC

Yes and no: I got one task of each kind but cannot get any more.
It took a long time to get new tasks after your reset.
Task 827160402 (work unit 617077081): sent 12 Sep 2014, 20:39:36 UTC; reported 12 Sep 2014, 20:47:01 UTC; status: Completed, validation inconclusive; run time 400.07 s; CPU time 52.63 s; credit pending; application: Milkyway@Home Separation (Modified Fit) v1.32 (opencl_nvidia)

Also, it shows 97 tasks in progress even though nothing is running on this computer (ID: 561866).
I am now looking at ID: 581205, my other rig running your project.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62333 - Posted: 13 Sep 2014, 0:19:21 UTC

It will take a while for the server to push out more of my work units because there were so many errors. It tries to distribute them evenly among the different runs, and since I had so many sent out earlier, it just needs to let the others catch up. Things should be back to normal running conditions soon.

Jake W

bcavnaugh
Joined: 14 Feb 14
Posts: 22
Credit: 195,835,315
RAC: 0
Message 62337 - Posted: 13 Sep 2014, 15:49:15 UTC

ID: 561866 is still not getting any tasks.
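Task 827160586 (work unit 617077177): sent 12 Sep 2014, 20:39:36 UTC; reported 12 Sep 2014, 20:47:01 UTC; status: Completed and validated; run time 200.36 s; CPU time 26.07 s; credit 106.88; application: MilkyWay@Home v1.02 (opencl_nvidia)

It still shows 97 tasks in progress, for example:
Task 827054583 (work unit 617018049): sent 12 Sep 2014, 16:44:48 UTC; deadline 24 Sep 2014, 16:44:48 UTC; status: In progress; run time, CPU time and credit: ---; application: MilkyWay@Home v1.02 (opencl_nvidia)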

I have reset the project a few times now.
Thanks,
Bill

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,301,877
RAC: 0
Message 62342 - Posted: 14 Sep 2014, 17:24:11 UTC - in response to Message 62283.  

I'm getting error messages. I checked my stats and none of the tasks are processing. I will include the error message the next time it appears. Mike

Larr2000
Joined: 12 Jul 14
Posts: 1
Credit: 409,907
RAC: 0
Message 62344 - Posted: 15 Sep 2014, 20:00:32 UTC - in response to Message 62283.  

Hi Jake,
I have been supporting MilkyWay for a long time and decided to get the T-shirt with my donation. Was disappointed that the shirt did not have a photo of our beautiful Milkyway!!

Second, I just got 750 hours to process BEFORE 23 Sept!! I am on an iMac and do not have that kind of horsepower. Is it OK to abort those that I cannot finish by the deadline?

Larr2000@Gmail.com

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62345 - Posted: 15 Sep 2014, 20:06:28 UTC

Hey Larr2000,

Which application is giving you a 750-hour work unit? If it is separation, you can abort it. If it is nbody, the estimate is probably just wrong, because the nbody application is notorious for giving incorrect estimated run times. Sorry the shirt didn't have a photo of the Milky Way on it. For future reference, the design is posted on the fundraiser page.

Jake W.

swiftmallard
Joined: 18 Jul 09
Posts: 300
Credit: 303,562,776
RAC: 0
Message 62346 - Posted: 15 Sep 2014, 20:13:42 UTC - in response to Message 62344.  

Is it OK to abort those that I cannot finish by the deadline?

Aborted WUs simply get sent out to other crunchers sooner than if they go past deadline. If you cannot crunch them, the sooner you abort them, then the sooner they will get completed.

DutchDK
Joined: 13 Nov 10
Posts: 5
Credit: 18,929,782
RAC: 0
Message 62347 - Posted: 15 Sep 2014, 23:55:46 UTC

129 invalid tasks (work unit errors, validation errors, etc.), plus 23 errors.

http://milkyway.cs.rpi.edu/milkyway/results.php?userid=135022&offset=0&show_names=0&state=5&appid=

Something definitely is amiss with the new version.

alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,176,722
RAC: 50,773
Message 62348 - Posted: 16 Sep 2014, 2:32:39 UTC - in response to Message 62346.  

Is it OK to abort those that I cannot finish by the deadline?

Aborted WUs simply get sent out to other crunchers sooner than if they go past deadline. If you cannot crunch them, the sooner you abort them, then the sooner they will get completed.


This may well be true, but each aborted work unit would appear to count as an "error" as far as validation is concerned, with the result that there can be a task that ends up as "Completed, can't validate" because of (typically) two aborts and one genuine crash. To the best of my knowledge, there's no way to stop BOINC treating user aborts the same as errors...

This is no big deal if a GPU job gets wasted, but it's very frustrating if one sees several hours-worth of cpu time black-holed because a couple of users have aborted MilkyWay jobs and someone else's [Nvidia, usually :-)] gpu job crashes.

However, it's been a fairly rare occurrence to date (for me, that is), so I'm merely pointing out a possible down side...
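
The failure mode described above can be sketched as a simple error counter. This is only an illustration: the class `WorkUnit`, the outcome labels, and the limit of three errors are assumptions made for the sketch, not the project's actual server code or configuration.

```python
from dataclasses import dataclass, field

# Hypothetical outcome labels for this sketch; not BOINC's internal names.
SUCCESS, CLIENT_ERROR, USER_ABORT = "success", "client_error", "user_abort"


@dataclass
class WorkUnit:
    # Assumed limit; a real project sets its own value.
    max_error_results: int = 3
    outcomes: list = field(default_factory=list)

    def report(self, outcome: str) -> None:
        """Record the outcome of one task belonging to this work unit."""
        self.outcomes.append(outcome)

    def error_count(self) -> int:
        # The behaviour described in the post: a user abort is counted
        # exactly like any other error.
        return sum(1 for o in self.outcomes if o != SUCCESS)

    def can_still_validate(self) -> bool:
        # Once the error limit is reached, no further copies are sent out,
        # so the successful result has nothing to be compared against.
        return self.error_count() < self.max_error_results


wu = WorkUnit()
wu.report(SUCCESS)        # a long CPU task that finished normally
wu.report(USER_ABORT)     # first user abort
wu.report(USER_ABORT)     # second user abort
wu.report(CLIENT_ERROR)   # e.g. a GPU task that crashed
print(wu.error_count(), wu.can_still_validate())
```

Under these assumptions, two aborts plus one crash use up the whole error budget, and the one completed result is left without a partner to validate against.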

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,672
RAC: 27,782
Message 62349 - Posted: 16 Sep 2014, 10:42:09 UTC - in response to Message 62348.  

Is it OK to abort those that I cannot finish by the deadline?

Aborted WUs simply get sent out to other crunchers sooner than if they go past deadline. If you cannot crunch them, the sooner you abort them, then the sooner they will get completed.


This may well be true, but each aborted work unit would appear to count as an "error" as far as validation is concerned, with the result that there can be a task that ends up as "Completed, can't validate" because of (typically) two aborts and one genuine crash. To the best of my knowledge, there's no way to stop BOINC treating user aborts the same as errors...

This is no big deal if a GPU job gets wasted, but it's very frustrating if one sees several hours-worth of cpu time black-holed because a couple of users have aborted MilkyWay jobs and someone else's [Nvidia, usually :-)] gpu job crashes.

However, it's been a fairly rare occurrence to date (for me, that is), so I'm merely pointing out a possible down side...


Then there is a problem somewhere in the server-side software, because aborting a unit is supposed to do nothing more than put it back in the cache of available work units. Now, if you abort too many work units, it CAN decrease the total number of work units you can get in a day, but as soon as you start returning valid units again, that limit will exponentially pop right back up to normal. The key is to abort what you have to and crunch what you can.
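
The quota behaviour described above can be sketched roughly as follows. The rule used here (minus one per error or abort, doubling per valid result, capped at a project maximum) is an assumption made for illustration, as are the function name `update_quota` and the cap of 100; the real server logic may differ.

```python
def update_quota(quota: int, outcome: str, project_max: int = 100) -> int:
    """Return the new daily task quota after one reported result.

    Assumed rule for this sketch: an error or abort lowers the quota by one
    (never below 1); a valid result doubles it, capped at project_max.
    """
    if outcome == "valid":
        return min(quota * 2, project_max)
    return max(quota - 1, 1)


quota = 100
for _ in range(95):           # ninety-five aborted or failed tasks in a row
    quota = update_quota(quota, "error")

history = [quota]
while quota < 100:            # from here on, only valid results are returned
    quota = update_quota(quota, "valid")
    history.append(quota)

print(history)
```

With these assumed numbers, the quota climbs back from 5 to the cap after only five valid results, which is the "exponential" recovery the post refers to.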

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62351 - Posted: 16 Sep 2014, 13:01:15 UTC

I talked about the aborted work unit problem about a year ago. It is a problem, but I don't work on the server side of the project much, so I can't personally fix it. I will ask Travis about it again, since he is in charge of most server-side things.

Jake W.

SLRE
Joined: 26 Jan 09
Posts: 12
Credit: 53,679,035
RAC: 0
Message 62353 - Posted: 16 Sep 2014, 22:09:51 UTC
Last modified: 16 Sep 2014, 22:11:26 UTC

For info: The following (representative) jobs all errored out over the last couple of days.

ps_modfit_15_3s_130_wrap_const_1_1405680903_7765724_3
de_modfit_15_3s_132_wrap_8_1410552780_140958_1
de_modfit_15_3s_132_wrap_8_1410552780_141748_0
de_modfit_15_3s_132_wrap_8_1410552780_131333_0
de_modfit_15_3s_132_wrap_8_1410552780_133902_0
de_modfit_15_3s_132_wrap_7_1410552780_130778_1
ps_modfit_15_3s_132_wrap_3_1410552780_136533_0
de_modfit_15_3s_132_wrap_8_1410552780_131960_2

This isn't occasional; it's endemic across all modfit jobs on my Windows machine.
Machine is Win7, Nvidia GT640. Currently running MW jobs doubled up, so that may be contributing ...

On the Linux box (Mint, 32-bit, GeForce GTX 660), most 1.30 modfit and some 1.32 teststars jobs are erroring out, as other folk have reported.
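
A quick way to see which jobs are affected is to tally the errored job names listed above by their fields. This is a minimal sketch: it assumes that the first underscore-separated field (`ps`, `de`) identifies the search run type and that the fifth field (`130`, `132`) is the application version. That reading of the names is a guess from the thread, not something the project has confirmed.

```python
from collections import Counter

# Job names copied from the list above.
errored_jobs = [
    "ps_modfit_15_3s_130_wrap_const_1_1405680903_7765724_3",
    "de_modfit_15_3s_132_wrap_8_1410552780_140958_1",
    "de_modfit_15_3s_132_wrap_8_1410552780_141748_0",
    "de_modfit_15_3s_132_wrap_8_1410552780_131333_0",
    "de_modfit_15_3s_132_wrap_8_1410552780_133902_0",
    "de_modfit_15_3s_132_wrap_7_1410552780_130778_1",
    "ps_modfit_15_3s_132_wrap_3_1410552780_136533_0",
    "de_modfit_15_3s_132_wrap_8_1410552780_131960_2",
]


def run_prefix(name: str) -> str:
    # Assumption: the first field names the kind of search run.
    return name.split("_")[0]


def version_field(name: str) -> str:
    # Assumption: the fifth field is the application version (1.30 / 1.32).
    return name.split("_")[4]


by_version = Counter(version_field(n) for n in errored_jobs)
by_prefix = Counter(run_prefix(n) for n in errored_jobs)

print("errors by version field:", dict(by_version))
print("errors by run prefix:", dict(by_prefix))
```

On this small sample, the tally shows errors under both version fields and both prefixes, which fits the observation that the problem is not limited to one run.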

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62355 - Posted: 17 Sep 2014, 13:27:54 UTC

SLRE,

Are you using the most recent NVidia drivers for your cards?

Jake W.

Richard Haselgrove
Joined: 4 Sep 12
Posts: 219
Credit: 456,474
RAC: 0
Message 62356 - Posted: 17 Sep 2014, 15:49:12 UTC - in response to Message 62355.  

SLRE,

Are you using the most recent NVidia drivers for your cards?

Jake W.

There are hints at SETI@Home of a possible OpenCL problem with NVidia driver 340.52. The problems observed so far relate to Compute Capability 1.x cards only, but that could be the tip of the iceberg.

NVidia have reproduced the observed problem and are investigating. https://developer.nvidia.com/nvbugs/cuda/edit/1554016 (accessible to registered developers only)

DutchDK
Joined: 13 Nov 10
Posts: 5
Credit: 18,929,782
RAC: 0
Message 62358 - Posted: 17 Sep 2014, 18:12:10 UTC

Still seeing errors and "unable to validate" / "validate error" results in my jobs list.

Can someone who knows the new version and its code check up on it?

Mumak
Joined: 8 Apr 13
Posts: 89
Credit: 517,085,245
RAC: 0
Message 62359 - Posted: 18 Sep 2014, 6:27:38 UTC

I'm also getting lots of ps_modfit errors recently on an AMD HD 7950.
I had to opt out of the Modfit tasks.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 62361 - Posted: 19 Sep 2014, 16:31:13 UTC

DutchDK,

I am checking up on it. I coded all of the new changes. There don't seem to be any issues with the code, though. Most likely it has to do with some library issues, since we switched over to a new, more automated build system.

Jake W.

Michael Bennett
Joined: 10 Mar 09
Posts: 13
Credit: 4,301,877
RAC: 0
Message 62362 - Posted: 20 Sep 2014, 16:50:52 UTC - in response to Message 62353.  

FYI
All my Separation (Modified Fit...) work units stop after 6-10 seconds with the message "Computation error." Only the 1.02 (opencl_nvidia) tasks get processed.
Mike

©2024 Astroinformatics Group