Welcome to MilkyWay@home

NoContraintsWithDisk200 vs. south4s (or whatever)

Saenger
Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 68533 - Posted: 14 Apr 2019, 17:19:50 UTC

I got tons of invalid tasks in the last couple of days, and I don't have a clue why. I do know that the tasks that crunch fine lately all have NoContraintsWithDisk200 in their names, while those without it, currently with south4s in their names, all crash immediately. As I don't lose much crunch time due to their immediate crashing, I won't stop with MilkyWay, but...

I'd like to know how I can avoid DLing those WUs that won't run, or, even better, how to get them to run on my machine.
Grüße vom Sänger

TB
Joined: 5 Nov 17
Posts: 4
Credit: 2,795,498
RAC: 0
Message 68534 - Posted: 15 Apr 2019, 2:32:46 UTC - in response to Message 68533.  

They haven't crashed for me. NoConstraintsWithDisk200 have worked fine--hadn't noticed the change up. Noticed "south" just now (first ones I've seen) so I searched after the mess last month (or was that February?) and saw your post. Trying to run them now.

No explanation, but thought you'd like to know it's not a general problem.

marmadukejoe
Joined: 11 May 13
Posts: 2
Credit: 6,280,452
RAC: 2,679
Message 68535 - Posted: 15 Apr 2019, 3:41:11 UTC - in response to Message 68533.  

I'm seeing the same problem. This machine is also Linux 64bit, but no GPU is being used. I have noticed that the workunits that fail are subsequently being validated by others, but the machines that complete them all seem to be running Windows.

marmot
Joined: 12 Dec 15
Posts: 53
Credit: 132,602,298
RAC: 30,608
Message 68540 - Posted: 15 Apr 2019, 20:41:54 UTC - in response to Message 68533.  

I got tons of invalid tasks in the last couple of days, and I don't have a clue why.


Those are WU's that ended in errors, not ones that ended in an invalid state. The invalid category is for WU's that completed but whose answer differs from the wingmen's results.

Your errored out WU's all report:

<stderr_txt>
Process creation (&#143;J&#252;) failed: Error -1, errno=2
execv: No such file or directory

</stderr_txt>
]]>
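For reference, errno=2 in that stderr is ENOENT, the same "No such file or directory" that execv printed. A minimal Python sketch, assuming a POSIX-like system, shows that trying to launch an executable path that does not exist gives exactly that error number. The path below is made up for illustration and has nothing to do with the real MilkyWay binaries.

```python
import errno
import os
import subprocess

# errno 2 is ENOENT; print its symbolic name and its message text.
print(errno.errorcode[2])
print(os.strerror(errno.ENOENT))

# Hypothetical path, chosen only because it should not exist.
missing = "/nonexistent/milkyway_app_for_illustration"


def try_launch(path):
    """Return the errno raised when the executable cannot be started, or None."""
    try:
        subprocess.run([path], check=False)
    except OSError as exc:
        return exc.errno
    return None


launch_errno = try_launch(missing)
print(launch_errno)
```

So the message by itself only says that the thing being executed could not be found at the path the client tried to run.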


Some reason your BOINC.exe can't create the slot directory for the WU?

Saenger
Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 68542 - Posted: 15 Apr 2019, 21:25:18 UTC - in response to Message 68540.  

Some reason your BOINC.exe can't create the slot directory for the WU?

No, as it's doing fine on the NoContraintsWithDisk200, just the south4s are crashing. And I don't care about linguistic sophistry, they fail, full stop.
Everything was fine before as well, so nothing has changed on my end.
Grüße vom Sänger

marmot
Joined: 12 Dec 15
Posts: 53
Credit: 132,602,298
RAC: 30,608
Message 68543 - Posted: 15 Apr 2019, 23:11:59 UTC - in response to Message 68542.  
Last modified: 15 Apr 2019, 23:32:28 UTC

Some reason your BOINC.exe can't create the slot directory for the WU?

No, as it's doing fine on the NoContraintsWithDisk200, just the south4s are crashing.

Everything was fine before as well, so nothing has changed on my end.

Dev needs to address why only that WU type would have issues creating the folders.

EDIT:--------
Oh, now I remember, after Milkyway did the server move, I had similar errors of inability to create folders.
Try resetting the project so it d/l's a clean set of WU's and initializes xmls.

Also had a friend's computer run WUProps WU's for 2 years; then the Spectrum Antivirus received an updated database, decided that the WUProps WU was dangerous, and blacklisted it. Those errors were a similar "unable to create folder" error in the WUProps stderr.txt.
------------

And I don't care about linguistic sophistry, they fail, full stop.

Wasn't being pedantic or pushing fallacies. When I see 'invalid' mentioned about failing work units, that has a very specific meaning on the results table and I just came off a problem with 40% Milkyway WU's being invalid which turned out to be driver related. If you were having that many invalids, I was initially going to suggest driver issues.

marmot
Joined: 12 Dec 15
Posts: 53
Credit: 132,602,298
RAC: 30,608
Message 68544 - Posted: 16 Apr 2019, 0:38:06 UTC - in response to Message 68543.  
Last modified: 16 Apr 2019, 0:40:20 UTC

Also had a friend's computer run WUProps WU's for 2 years; then the Spectrum Antivirus received an updated database, decided that the WUProps WU was dangerous, and blacklisted it. Those errors were a similar "unable to create folder" error in the WUProps stderr.txt.

WU being run should be identical in process name; only difference is data set, and so the anti-virus shouldn't be the cause.


Dev needs to address why only that WU type would have issues creating the folders.
EDIT:--------
Oh, now I remember, after Milkyway did the server move, I had similar errors of inability to create folders.
Try resetting the project so it d/l's a clean set of WU's and initializes xmls.

The WU is working properly for some users so likely not the WU coding, but probably a local error in the configuration file about that data set.

If it's the configuration file, then resetting the project may fix it.

alanb1951
Joined: 16 Mar 10
Posts: 210
Credit: 105,914,931
RAC: 25,163
Message 68545 - Posted: 16 Apr 2019, 0:44:37 UTC

Just a thought here: I suspect most (if not all) of the people having the "instant fail" trouble with the new work units might be running older versions of BOINC which have a maximum command line capability of about 1024 characters. There have been other tasks in the past that had lots of parameters and overflowed that limit, though I'm not sure it produced this error.

In particular, it appears the OP is using 7.2.42 (which certainly truncated command lines for one of the earlier groups of tasks with extra parameters!)

So if nothing else seems to work it might be worth trying a newer BOINC client if one is available for your distribution. I think 7.8 didn't have this problem, and I know 7.14.2 doesn't because that's what I use and it runs these new tasks quite happily.
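To make the idea concrete, here is a rough Python sketch of what a fixed-size command-line buffer does to a long argument list. The 1024-character limit is the approximate figure mentioned above, and the app name and parameter values are invented for illustration. This is not BOINC's actual code.

```python
# Approximate limit quoted in the post above; treated here as an assumption.
ASSUMED_LIMIT = 1024


def build_command_line(app, params):
    """Join an app name and its numeric parameters into one command-line string."""
    return " ".join([app] + [f"{p:.6f}" for p in params])


def truncate(cmdline, limit=ASSUMED_LIMIT):
    """Mimic a client that silently keeps only the first `limit` characters."""
    return cmdline[:limit]


# Hypothetical app name and parameter values.
short_cmd = build_command_line("milkyway_app", [0.5] * 20)
long_cmd = build_command_line("milkyway_app", [0.5] * 200)

for name, cmd in (("short", short_cmd), ("long", long_cmd)):
    kept = truncate(cmd)
    status = "truncated" if kept != cmd else "intact"
    print(name, len(cmd), "characters,", status)
```

The point is only that the short task survives unchanged while the long one loses everything past the limit, so whatever the app expected to find at the end of its arguments is gone.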

As I said, just a thought...

Good luck - Al.

marmot
Joined: 12 Dec 15
Posts: 53
Credit: 132,602,298
RAC: 30,608
Message 68546 - Posted: 16 Apr 2019, 3:39:15 UTC - in response to Message 68545.  

Just a thought here: I suspect most (if not all) of the people having the "instant fail" trouble with the new work units might be running older versions of BOINC which have a maximum command line capability of about 1024 characters. There have been other tasks in the past that had lots of parameters and overflowed that limit, though I'm not sure it produced this error.

In particular, it appears the OP is using 7.2.42 (which certainly truncated command lines for one of the earlier groups of tasks with extra parameters!)

So if nothing else seems to work it might be worth trying a newer BOINC client if one is available for your distribution. I think 7.8 didn't have this problem, and I know 7.14.2 doesn't because that's what I use and it runs these new tasks quite happily.

As I said, just a thought...

Good luck - Al.


Obscure and incredibly useful piece of knowledge.
Need to bookmark your answer; relevant across all projects.

Saenger
Joined: 28 Aug 07
Posts: 133
Credit: 29,423,179
RAC: 0
Message 68547 - Posted: 16 Apr 2019, 4:13:32 UTC - in response to Message 68545.  

Just a thought here: I suspect most (if not all) of the people having the "instant fail" trouble with the new work units might be running older versions of BOINC which have a maximum command line capability of about 1024 characters. There have been other tasks in the past that had lots of parameters and overflowed that limit, though I'm not sure it produced this error.

In particular, it appears the OP is using 7.2.42 (which certainly truncated command lines for one of the earlier groups of tasks with extra parameters!)

So if nothing else seems to work it might be worth trying a newer BOINC client if one is available for your distribution. I think 7.8 didn't have this problem, and I know 7.14.2 doesn't because that's what I use and it runs these new tasks quite happily.

But the names of the fine running WUs are considerably longer than the ones of the failing WUs.
Perhaps I'll try a new BOINC some time later.
Grüße vom Sänger

marmadukejoe
Joined: 11 May 13
Posts: 2
Credit: 6,280,452
RAC: 2,679
Message 68548 - Posted: 16 Apr 2019, 6:13:02 UTC - in response to Message 68547.  

Upthread I reported the same problem. I'm also on version 7.2.42, which is the currently available version (Mint 17). I've now tried upgrading to a more recent version from ppa:costamagnagianfranco/boinc (https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/locutusofborg-ppa). The client upgraded to 7.6.31 but the manager was held back. In Synaptic the manager and metapackage are still at version 7.2.42, with 7.6.31 listed as the available upgrade, but they show as 'broken packages' and Synaptic won't allow the upgrade.
Mint 17 reaches end of support this month so I'll upgrade soon and reattach to MW when I've done so.

As a test I've just now installed BOINC on a laptop running Mint 18.2. This has installed version 7.6.31. The 'south' wu's are running happily, (however once this block has completed I'll detach that machine).
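The version numbers reported in this thread can be compared mechanically. The Python sketch below is only an illustration: it parses dotted version strings into tuples of integers and checks them against 7.6.31, the lowest version reported here as running the 'south' tasks. Nothing in the thread establishes the exact version where the behaviour changed, so that threshold is an assumption.

```python
def parse_version(text):
    """Turn a dotted version string like '7.2.42' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Lowest version reported in this thread as running the tasks; the true
# cut-off is unknown, so treat this as an assumption.
REPORTED_WORKING = parse_version("7.6.31")


def at_least_reported_working(version_text):
    """True if the given client version is not older than REPORTED_WORKING."""
    return parse_version(version_text) >= REPORTED_WORKING


for v in ("7.2.42", "7.6.31", "7.14.2"):
    print(v, at_least_reported_working(v))
```

Comparing integer tuples rather than raw strings matters here, because as plain text "7.14.2" would sort before "7.6.31".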

One further observation on the original problem: the last WU downloaded in each batch I attempted errored out with 'download error'.

--------------------------------------------------
Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>stars-84-donlon.txt</file_name>
<error_code>-200 (wrong size)</error_code>
</file_xfer_error>

</message>
]]>
-----------------------------------------------------

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 68549 - Posted: 16 Apr 2019, 16:30:54 UTC

Hey Everyone,

Part of this issue is probably mine and Tom's fault. We forgot to change the number of workunits we bundle together when we increased the number of parameters in our model for these runs. We will put up new runs ASAP which have a smaller bundle size and should no longer crash on machines with smaller command-line sizes. So sorry about that.
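As a back-of-the-envelope illustration of why bundle size matters: if each bundled workunit contributes its own set of parameters to the command line, the total length grows roughly as bundle size times parameters per workunit. The Python sketch below uses made-up numbers (characters per parameter, fixed overhead, and a 1024-character limit); the real values for these runs are not stated in this thread.

```python
def estimated_length(bundle_size, params_per_wu, chars_per_param=9, overhead=100):
    """Rough command-line length: fixed overhead plus one chunk per parameter."""
    return overhead + bundle_size * params_per_wu * chars_per_param


def max_bundle_size(params_per_wu, limit=1024, chars_per_param=9, overhead=100):
    """Largest bundle whose estimated command line still fits within `limit`."""
    return max((limit - overhead) // (params_per_wu * chars_per_param), 0)


# Illustrative numbers only: more parameters per workunit means fewer fit.
for params in (20, 26):
    best = max_bundle_size(params)
    length = estimated_length(best, params)
    print(params, "params ->", best, "per bundle,", length, "chars")
```

Under these assumed numbers, adding parameters to the model without shrinking the bundle is enough to push the command line past the limit.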

Best,

Jake

Tom Donlon
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 10 Apr 19
Posts: 408
Credit: 120,203,200
RAC: 0
Message 68552 - Posted: 16 Apr 2019, 18:14:44 UTC - in response to Message 68549.  

Hey everyone,

The new runs with adjusted bundle size have been put up and the buggy runs have been taken down. If you have any problems with the new runs, please post them at https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4432 so that they are seen quickly. Sorry about that!

Thanks for your patience during this transition period,

Tom