Welcome to MilkyWay@home

Nbody WU Flush

Message boards : News : Nbody WU Flush

Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 73597 - Posted: 19 May 2022, 3:47:26 UTC - in response to Message 73584.  

yep me too but some machines just won't handle it so I do what I have to do to get them crunching


You have machines that won't handle windows? WTF?


Yup too old or too non standard or just too ornery or I got tired of babysitting them
Eh? I've had Windows since a 286. What is your oldest computer?
ID: 73597 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 73598 - Posted: 19 May 2022, 3:49:09 UTC - in response to Message 73585.  
Last modified: 19 May 2022, 3:56:11 UTC

I think you should bring every single one of your cpu cores here RIGHT NOW to help clear them out!!
I'd love to but the big CPUs have got GPUs running Separation. If I try to get Nbody for those CPUs, the server gives me CPU separations! Which are tremendously slow compared to GPUs and utterly pointless. I really don't understand why we don't have more options in the server preferences. Einstein manages.

Something is making the whole server sluggish, it's taking forever just to post this message, not sure what's changed.

Although I do notice the number of separations waiting to go out is 30000 instead of the usual 10000. Don't tell me we're going to get a huge mass of those to clear too (although that should be easier as we can do them on GPUs).


Just keep aborting the Separation units, that's probably why people are on the 4th go round with them

Oh and yes it's really easy to add choices to the Server code, you just have to know how
At the point I wrote that message, about 95% of the CPU tasks I was getting were separation. I can't just keep aborting them, as obviously I'm not here overnight etc. And I don't use a huge buffer. I've moved the CPUs in machines that also have GPUs to other projects.

I've asked here at LHC: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5850

Someone who hasn't been banned from Einstein could ask in there, as they have very detailed options.
ID: 73598 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 73599 - Posted: 19 May 2022, 4:00:26 UTC

Now getting the full 300 per GPU of separation on asking, but I see the validation queue is huge again.
ID: 73599 · Rating: 0
bhorlor
Joined: 13 Apr 20
Posts: 9
Credit: 166,215,305
RAC: 36,963
Message 73609 - Posted: 20 May 2022, 14:01:15 UTC - in response to Message 73578.  

I think you should bring every single one of your cpu cores here RIGHT NOW to help clear them out!!
I'd love to but the big CPUs have got GPUs running Separation. If I try to get Nbody for those CPUs, the server gives me CPU separations! Which are tremendously slow compared to GPUs and utterly pointless. I really don't understand why we don't have more options in the server preferences. Einstein manages.

Something is making the whole server sluggish, it's taking forever just to post this message, not sure what's changed.

Although I do notice the number of separations waiting to go out is 30000 instead of the usual 10000. Don't tell me we're going to get a huge mass of those to clear too (although that should be easier as we can do them on GPUs).

Why are they pointless?
ID: 73609 · Rating: 0
kotenok2000
Joined: 22 May 11
Posts: 71
Credit: 5,685,114
RAC: 0
Message 73610 - Posted: 20 May 2022, 14:07:31 UTC - in response to Message 73609.  
Last modified: 20 May 2022, 14:14:51 UTC

Because an NVIDIA GTX 1650 completes a work unit in 6 minutes, while a Ryzen 3 3100 would need an hour.
ID: 73610 · Rating: 0
bhorlor
Joined: 13 Apr 20
Posts: 9
Credit: 166,215,305
RAC: 36,963
Message 73620 - Posted: 20 May 2022, 20:40:15 UTC - in response to Message 73610.  

I see, CPU separations vs. GPU separations. My original interpretation was that CPUs are worthless here.
Thanks for the clarification.
ID: 73620 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73621 - Posted: 21 May 2022, 2:28:16 UTC - in response to Message 73620.  

I see, CPU separations vs. GPU separations. My original interpretation was that CPUs are worthless here.
Thanks for the clarification.


CPUs are NOT worthless here. There are two types of tasks, and only one of them can run on GPUs: the N-body tasks are CPU-only, while the Separation tasks run on both CPUs and GPUs, and as noted before, GPUs are faster at them. The N-body tasks are multi-threaded, meaning they will use all the CPU cores BOINC gives them unless you use an app_config.xml file to control that.
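
For anyone who hasn't set one up, here is a minimal sketch of generating such a file with Python. The app name `milkyway_nbody`, the `mt` plan class, the `--nthreads` option and the 4-thread limit are assumptions that are not confirmed anywhere in this thread, so check them against your own client_state.xml before relying on it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, plan_class: str, threads: int) -> str:
    """Build the text of an app_config.xml that caps one app version's CPU threads."""
    root = ET.Element("app_config")
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "plan_class").text = plan_class
    # avg_ncpus tells the client how many cores to budget for each task.
    ET.SubElement(app_version, "avg_ncpus").text = str(threads)
    # Assumed option: passed to the application so it starts that many threads.
    ET.SubElement(app_version, "cmdline").text = f"--nthreads {threads}"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed values; replace them with what your own client reports.
    print(build_app_config("milkyway_nbody", "mt", 4))
```

The output would be saved as app_config.xml in the MilkyWay project folder inside the BOINC data directory, after which "Read config files" in the BOINC Manager's Options menu (or a client restart) picks it up.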
ID: 73621 · Rating: 0
Eoin Moore 1971
Joined: 24 Aug 11
Posts: 2
Credit: 7,936,071
RAC: 0
Message 73627 - Posted: 21 May 2022, 15:14:44 UTC

I have 1011 WUs listed in "Tasks"

702 of these are listed as "validation inconclusive", mostly N-Body/mt (CPU), but some Separation (GPU)

Is this others' experience at the moment?
ID: 73627 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 73628 - Posted: 21 May 2022, 15:29:22 UTC - in response to Message 73627.  
Last modified: 21 May 2022, 15:30:05 UTC

I have 1011 WUs listed in "Tasks"

702 of these are listed as "validation inconclusive", mostly N-Body/mt (CPU), but some Separation (GPU)

Is this others' experience at the moment?
Even in normal times, you have to wait until your wingman has verified your tasks. At the moment the server is panting like mad with millions of work units in the database because somehow it generated 15,000,000 instead of 10,000. BOINC is not the brightest of programs, on your end or theirs.
ID: 73628 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73634 - Posted: 21 May 2022, 20:26:24 UTC - in response to Message 73574.  

Oh no not again.....

Separation for GPU:

1st request: 0 tasks
2nd request: 0 tasks
3rd request: 0 tasks
4th request: 26 tasks

I usually get 900 on that machine.

We're clearing the Nbodys well, I think I'll get more cores on there in the hope the server will be happier.


I saw this over at Number Fields post Pentathlon:

"I had to disable the "accelerated retries" mechanism due to all the abortions. When things stabilize I will turn that back on. This means reissued WUs will have their deadline halved."

So there IS a way to get the Server to use shorter deadlines for the resends, I'll bet if Tom would talk to:
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

He would help Tom figure it out as he's the one who made the above quote.
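
In case it helps to picture it, the idea of a shorter deadline on resends can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not BOINC server code: the function name, the `factor` parameter and the 12-day example deadline are all made up for the example.

```python
from datetime import timedelta


def resend_deadline(normal: timedelta, is_resend: bool, factor: float = 0.5) -> timedelta:
    """Return the deadline for a task; resent tasks get a shortened one."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be greater than 0 and at most 1")
    return normal * factor if is_resend else normal


if __name__ == "__main__":
    normal = timedelta(days=12)  # hypothetical first-issue deadline
    print("first issue:", resend_deadline(normal, is_resend=False))
    print("resend:     ", resend_deadline(normal, is_resend=True))
```

With a factor of 0.5 a resend gets half the usual time, which matches the "deadline halved" wording in the quote.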
ID: 73634 · Rating: 0
Eoin Moore 1971
Joined: 24 Aug 11
Posts: 2
Credit: 7,936,071
RAC: 0
Message 73637 - Posted: 21 May 2022, 22:15:57 UTC - in response to Message 73628.  
Last modified: 21 May 2022, 22:21:52 UTC

I have 1011 WUs listed in "Tasks"

702 of these are listed as "validation inconclusive", mostly N-Body/mt (CPU), but some Separation (GPU)

Is this others' experience at the moment?


Even in normal times, you have to wait until your wingman has verified your tasks. At the moment the server is panting like mad with millions of work units in the database because somehow it generated 15,000,000 instead of 10,000. BOINC is not the brightest of programs, on your end or theirs.


Ah, thanks, maybe I was confusing "Validation inconclusive" with "validation pending".

(edit) No, it seems that "validation pending" is a separate category, so I have 0 of those, and 702 "Completed, validation inconclusive"

I am not concerned about credit etc., but 70% of WUs seems a high proportion to waste.
ID: 73637 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73639 - Posted: 22 May 2022, 2:20:22 UTC - in response to Message 73637.  

I have 1011 WUs listed in "Tasks"

702 of these are listed as "validation inconclusive", mostly N-Body/mt (CPU), but some Separation (GPU)

Is this others' experience at the moment?


Even in normal times, you have to wait until your wingman has verified your tasks. At the moment the server is panting like mad with millions of work units in the database because somehow it generated 15,000,000 instead of 10,000. BOINC is not the brightest of programs, on your end or theirs.


Ah, thanks, maybe I was confusing "Validation inconclusive" with "validation pending".

(edit) No, it seems that "validation pending" is a separate category, so I have 0 of those, and 702 "Completed, validation inconclusive"

I am not concerned about credit etc., but 70% of WUs seems a high proportion to waste.


They are not "wasted"; they are just outside the expected range. Every task has a range of results the project is looking for, and when your result falls outside that range, the task is sent out to someone else to crunch, to confirm whether the problem is on your end, the project's, or with the task itself. That's what these numbers on every task mean:
max # of error/total/success tasks 2, 9, 6.
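
Roughly how limits like those might be applied can be sketched in Python. This is a simplified illustration, not the real BOINC server logic: the function and field names are invented, and it assumes a work unit is given up on once a count exceeds its limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_error: int = 2    # the "error" figure from the work unit page
    max_total: int = 9    # the "total" figure
    max_success: int = 6  # the "success" figure


def next_step(n_error: int, n_success: int, agreed: bool,
              n_in_progress: int = 0, limits: Limits = Limits()) -> str:
    """Decide what happens to a work unit given the tasks issued so far."""
    if agreed:
        return "validated"
    if n_error > limits.max_error:
        return "give up: too many errors"
    if n_success > limits.max_success:
        return "give up: too many results that never agreed"
    if n_error + n_success + n_in_progress > limits.max_total:
        return "give up: too many tasks in total"
    return "send another copy"


if __name__ == "__main__":
    # Two results returned that do not agree yet: the inconclusive case.
    print(next_step(n_error=0, n_success=2, agreed=False))
```

In this sketch, two returned results that don't agree lead to "send another copy", which is the situation that shows up as "validation inconclusive" while the extra copy is out.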
ID: 73639 · Rating: 0
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 73647 - Posted: 22 May 2022, 19:55:41 UTC - in response to Message 73634.  

I saw this over at Number Fields post Pentathlon:

"I had to disable the "accelerated retries" mechanism due to all the abortions. When things stabilize I will turn that back on. This means reissued WUs will have their deadline halved."

So there IS a way to get the Server to use shorter deadlines for the resends, I'll bet if Tom would talk to:
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

He would help Tom figure it out as he's the one who made the above quote.
Indeed, many projects do this. I often see shorter deadlines on retries.
ID: 73647 · Rating: 0
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 73654 - Posted: 23 May 2022, 1:49:00 UTC - in response to Message 73647.  

I saw this over at Number Fields post Pentathlon:

"I had to disable the "accelerated retries" mechanism due to all the abortions. When things stabilize I will turn that back on. This means reissued WUs will have their deadline halved."

So there IS a way to get the Server to use shorter deadlines for the resends, I'll bet if Tom would talk to:
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

He would help Tom figure it out as he's the one who made the above quote.


Indeed, many projects do this. I often see shorter deadlines on retries.


I do too
ID: 73654 · Rating: 0
Wailing Angus Beef
Joined: 24 Dec 07
Posts: 33
Credit: 1,922,935,234
RAC: 28,425
Message 73663 - Posted: 23 May 2022, 12:57:10 UTC - in response to Message 73585.  

I think you should bring every single one of your cpu cores here RIGHT NOW to help clear them out!!
I'd love to but the big CPUs have got GPUs running Separation. If I try to get Nbody for those CPUs, the server gives me CPU separations! Which are tremendously slow compared to GPUs and utterly pointless. I really don't understand why we don't have more options in the server preferences. Einstein manages.

Something is making the whole server sluggish, it's taking forever just to post this message, not sure what's changed.

Although I do notice the number of separations waiting to go out is 30000 instead of the usual 10000. Don't tell me we're going to get a huge mass of those to clear too (although that should be easier as we can do them on GPUs).


Just keep aborting the Separation units, that's probably why people are on the 4th go round with them

Oh and yes it's really easy to add choices to the Server code, you just have to know how

If anyone wishes to run just the GPU Separation WUs and N-body on their CPUs, send me the contents of your client_state.xml in a PM. I'll create an app_info.xml that will make it work. I already have a version for NVIDIA GPUs on Linux but not for Windows. I need the info from client_state.xml to see the program filenames and other details. Maybe the app_info.xml files are already available in another forum thread; I don't know.
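
For anyone who would rather not paste the whole file, something like this Python sketch could pull out just the relevant entries. It assumes client_state.xml parses as ordinary XML and that each `<app_version>` block has `<app_name>`, `<version_num>`, `<plan_class>` and `<file_ref>`/`<file_name>` children; the sample values are made up purely to exercise the code.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed layout; real names and numbers will differ.
SAMPLE = """<client_state>
  <app_version>
    <app_name>example_milkyway_app</app_name>
    <version_num>100</version_num>
    <plan_class>example_plan</plan_class>
    <file_ref>
      <file_name>example_binary</file_name>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>some_other_project_app</app_name>
    <version_num>7</version_num>
    <file_ref><file_name>other_binary</file_name></file_ref>
  </app_version>
</client_state>"""


def list_app_versions(xml_text: str, name_filter: str = "milkyway") -> list:
    """Return app name, version, plan class and file names for matching apps."""
    root = ET.fromstring(xml_text)
    rows = []
    for app_version in root.iter("app_version"):
        app_name = app_version.findtext("app_name", default="")
        if name_filter not in app_name:
            continue  # skip entries that belong to other projects
        rows.append({
            "app_name": app_name,
            "version_num": app_version.findtext("version_num", default=""),
            "plan_class": app_version.findtext("plan_class", default=""),
            "files": [ref.findtext("file_name", default="")
                      for ref in app_version.findall("file_ref")],
        })
    return rows


if __name__ == "__main__":
    # Swap SAMPLE for the text of a real client_state.xml to use it for real.
    for row in list_app_versions(SAMPLE):
        print(row)
```

The filter is a plain substring match on the app name, so it may need adjusting if the project's app names don't contain "milkyway".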
ID: 73663 · Rating: 0
AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 226,954,567
RAC: 3,943
Message 73670 - Posted: 23 May 2022, 20:11:36 UTC - in response to Message 73663.  

Isn't app_info.xml meant for anonymous platform mechanism setups? If it can be used in the standard setup, wouldn't you have to update the app_info.xml when the project updates its apps or if your system changes?
ID: 73670 · Rating: 0
Wailing Angus Beef
Joined: 24 Dec 07
Posts: 33
Credit: 1,922,935,234
RAC: 28,425
Message 73671 - Posted: 23 May 2022, 20:26:04 UTC - in response to Message 73670.  

Isn't app_info.xml meant for anonymous platform mechanism setups? If it can be used in the standard setup, wouldn't you have to update the app_info.xml when the project updates its apps or if your system changes?

Yes, if the project updates their app, the app_info.xml will need to be updated. Just offering an option to run just the separation GPU WUs and n-body CPU WUs on the same client without getting separation CPU WUs
ID: 73671 · Rating: 0
HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 0
Message 73672 - Posted: 24 May 2022, 3:25:35 UTC - in response to Message 73663.  

I think you should bring every single one of your cpu cores here RIGHT NOW to help clear them out!!
I'd love to but the big CPUs have got GPUs running Separation. If I try to get Nbody for those CPUs, the server gives me CPU separations! Which are tremendously slow compared to GPUs and utterly pointless. I really don't understand why we don't have more options in the server preferences. Einstein manages.

Something is making the whole server sluggish, it's taking forever just to post this message, not sure what's changed.

Although I do notice the number of separations waiting to go out is 30000 instead of the usual 10000. Don't tell me we're going to get a huge mass of those to clear too (although that should be easier as we can do them on GPUs).


Just keep aborting the Separation units, that's probably why people are on the 4th go round with them

Oh and yes it's really easy to add choices to the Server code, you just have to know how

If anyone wishes to run just the GPU Separation WUs and N-body on their CPUs, send me the contents of your client_state.xml in a PM. I'll create an app_info.xml that will make it work. I already have a version for NVIDIA GPUs on Linux but not for Windows. I need the info from client_state.xml to see the program filenames and other details. Maybe the app_info.xml files are already available in another forum thread; I don't know.
I would like to do that but am on travel all week. Will try to PM you on Sunday. Thanks for the offer. That is exactly what I want to do.
ID: 73672 · Rating: 0
AndreyOR
Joined: 13 Oct 21
Posts: 44
Credit: 226,954,567
RAC: 3,943
Message 73676 - Posted: 24 May 2022, 11:46:03 UTC

I was also able to create a working app_info.xml, but for Windows. If anyone is interested in trying it this way, read the following BOINC page: https://boinc.berkeley.edu/wiki/Anonymous_platform to see what app_info.xml is and what it's meant to be used for, as well as how to create one. I'd suggest draining your MilkyWay cache and setting the cache settings to absolute minimums before messing with app_info.xml, as mistakes made in it during the process can have unwanted consequences. This is not a set-it-and-forget-it option and will require some maintenance over time. Whenever the project apps, BOINC, or your system change, the corresponding entries in the app_info.xml file will have to be updated manually.

If your system is Linux, app_info.xml is probably the only way to exclude Separation CPU tasks (and only get N-Body). If it's Windows 10 or 11, another option is to use WSL2, which is what I do. I may go back to that after checking how this works out, since I use WSL2 to run other BOINC projects that either require or run better on Linux.
ID: 73676 · Rating: 0
HRFMguy
Joined: 12 Nov 21
Posts: 236
Credit: 575,038,236
RAC: 0
Message 73873 - Posted: 19 Jun 2022, 19:41:54 UTC

Hey Tom,

Now that the N-body units are totally flushed (so to speak!), what are your plans?

Will there be more N-body in the future? Or is it totally done?

Does your team need a few months to digest the data before committing to new runs?
ID: 73873 · Rating: 0

©2024 Astroinformatics Group