Welcome to MilkyWay@home

New Linux system trashes all tasks

Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 540,085,541
RAC: 86,653
Message 67963 - Posted: 26 Dec 2018, 1:30:08 UTC

Just for giggles, I decided to pull a quick batch of tasks before setting NNT on the project again. I wanted to see what would happen when I crunched them after reverting to BOINC 7.4.44, which I did ahead of tomorrow's Seti outage so I can bank more tasks than 7.8.3 allows. I also wanted to see whether anything is different with the RTX 2080 card doing the processing.

I expected all of them to fail instantly, as in the past, but about half of them compute. I had wondered whether the compute kernel created under BOINC 7.8.3 would still allow computing after I swapped in just the five 7.4.44 executable files. But it looks like MW does not leave a persistent compute kernel in the project directory; it creates the kernel on the fly in each task slot when needed.

I then noticed that all the validated ones awarded 203.92 credits. I know there is another task type that awards 227.62, and I believe those are the ones that fail instantly.

So what is the difference between those two task types? And what is the likelihood of getting just the 203-credit type? Or are both evenly mixed into the RTS buffer, so that it is just luck whether you pull one type or the other?
alanb1951
Joined: 16 Mar 10
Posts: 208
Credit: 105,464,057
RAC: 36,084
Message 67964 - Posted: 26 Dec 2018, 3:08:06 UTC - in response to Message 67963.  

Keith,

The major difference between the two job types is that one does 6 tasks with 14 parameters and the other does 4 tasks with 26 parameters.

I'm guessing here, but I suspect the reason you're having problems with the older BOINC client might be the extra parameters! The command line it has to construct contains the path to the executable (a 96-character relative path on my system) and the parameters as given by the <command_line> element of the <workunit> section in client_state.xml (861 characters for one of the 26-parameter tasks I've just looked at!). The total command line length for that job would be nearly 960 characters, and that may not be the longest it could be...
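To make that arithmetic easy to repeat on another host, here is a minimal Python sketch that estimates the total command-line length for each workunit. It assumes the file content parses as well-formed XML with <workunit> elements containing <name> and <command_line> children; the function name, the sample workunit name, and the sample argument values are all made up for illustration, and the single space between the path and the arguments is an assumption about how the client joins them.

```python
import xml.etree.ElementTree as ET


def command_line_lengths(xml_text, exe_path_len):
    """Return {workunit name: estimated total command-line length}.

    exe_path_len is the length of the executable path on your own system
    (96 characters in the example from this post).
    """
    root = ET.fromstring(xml_text)
    lengths = {}
    for wu in root.iter("workunit"):
        name = (wu.findtext("name") or "").strip()
        args = (wu.findtext("command_line") or "").strip()
        # Assumption: path, one separating space, then the arguments.
        lengths[name] = exe_path_len + 1 + len(args)
    return lengths


# Hypothetical stand-in for a client_state.xml fragment; the values are invented.
sample = """<client_state>
  <workunit>
    <name>example_wu</name>
    <command_line>1.234567 -0.123456 12.34567</command_line>
  </workunit>
</client_state>"""

print(command_line_lengths(sample, exe_path_len=96))
```

With the figures quoted above, a 96-character path plus one space plus 861 characters of parameters gives 958, which matches the "nearly 960" estimate.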

The parameters seem to be free-format (in that they don't have a fixed number of decimal places) but typically have 6 or 7 digits and a decimal point. Some also have a minus sign. There are a few with fewer than 6 digits, but not many in the examples I looked at.

So I'm wondering if there's an issue with the maximum command line length that the older client can handle, and perhaps these jobs trigger that problem?

As I said, just guessing!

Cheers - Al.
Keith Myers
Joined: 24 Jan 11
Posts: 696
Credit: 540,085,541
RAC: 86,653
Message 67965 - Posted: 26 Dec 2018, 3:16:41 UTC - in response to Message 67964.  
Last modified: 26 Dec 2018, 3:30:32 UTC

Hi Al. Thanks for the explanation of the different parameter sets for the two task types. I had noticed that in the stderr.txt output, but I never looked into the actual parameter contents in the slot or in client_state.xml.

So is it just the luck of the draw whether you get a four-task bundle or a six-task bundle? Or is one more prevalent than the other now? Interesting that the six-task bundles with only 14 parameters are the ones I can process, yet they award less credit than the bundles with fewer tasks. I guess the six-task bundles are easier to compute, which would be why they hold two more tasks than the four-task bundles.

[Edit] I just asked the question over on the BOINC message boards for the client. I would like to know, since I will have to build a new client anyway once David Anderson and Richard Haselgrove figure out where and why the client blows up on my RTX 2080 host with <exclude_gpu> and <max_concurrent> statements. They haven't nailed down the fix yet.

©2024 Astroinformatics Group