Welcome to MilkyWay@home

Admin Updates Discussion

MJKelleher
Joined: 23 Feb 13
Posts: 1
Credit: 5,101,800
RAC: 50
Message 76908 - Posted: 10 Feb 2024, 19:00:33 UTC - in response to Message 76904.  

But there's always Cosmology, as long as you already have an account there


Cosmology has been off-line, including the website, since before the first of the year. Einstein@Home is still working.

Finn the Human
Joined: 23 Dec 18
Posts: 23
Credit: 10,214,542
RAC: 102
Message 76909 - Posted: 10 Feb 2024, 22:04:21 UTC

We have to see this through. Eventually, the validation WUs will be sent; just be patient.

1300 WUs are waiting for validation from my side so far, and I allocated more cores to help get more WUs calculated.
Everything stays
But it still changes
Ever so slightly
Daily and nightly
In little ways
When everything stays...

Bill F
Joined: 4 Jul 09
Posts: 99
Credit: 17,434,413
RAC: 2,338
Message 76911 - Posted: 11 Feb 2024, 3:41:13 UTC - in response to Message 76909.  

At least the science here is being used and has value. The old Cosmology project is unattended and any tasks being done are at best for credit only. Any electricity used on it would be better used on Einstein@home or Asteroids or this project. Universe would be a good choice when they return to active work.

Bill F
In October of 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.


Finn the Human
Joined: 23 Dec 18
Posts: 23
Credit: 10,214,542
RAC: 102
Message 76916 - Posted: 12 Feb 2024, 17:48:22 UTC

A few validated WUs have shown up on my tasks list!
Everything stays
But it still changes
Ever so slightly
Daily and nightly
In little ways
When everything stays...

Link
Joined: 19 Jul 10
Posts: 627
Credit: 19,362,373
RAC: 3,550
Message 76918 - Posted: 12 Feb 2024, 19:24:13 UTC - in response to Message 76916.  
Last modified: 12 Feb 2024, 19:26:24 UTC

One valid here, and I now have one _1 on my computer. The ready to send buffer also dropped by about 3k. That means we made it through that huge pile of _0s and now we need to make it through the same huge pile of _1s. :D

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76921 - Posted: 13 Feb 2024, 0:23:46 UTC - in response to Message 76918.  

One valid here, and I now have one _1 on my computer. The ready to send buffer also dropped by about 3k. That means we made it through that huge pile of _0s and now we need to make it through the same huge pile of _1s. :D


I have some _2 and _3 tasks on my PC, so we ARE getting closer to normal day-to-day stuff again.

Link
Joined: 19 Jul 10
Posts: 627
Credit: 19,362,373
RAC: 3,550
Message 76922 - Posted: 13 Feb 2024, 12:13:31 UTC - in response to Message 76921.  
Last modified: 13 Feb 2024, 12:13:48 UTC

Now 3 valids for me and the results ready to send are down to 682,024. Yay.

Jerry
Joined: 16 Oct 17
Posts: 1
Credit: 45,368,680
RAC: 6,437
Message 76923 - Posted: 14 Feb 2024, 0:52:57 UTC - in response to Message 76907.  

Which problem with the database? The inconclusive tasks are not a problem with the database and it's not a problem at all,

It is a problem with volunteer relations. The process sending out tasks could trivially issue a quorum in rapid succession even if the number of workunits in the queue is large.

What useful purpose is served by operating in a manner that causes unnecessary concern for people who are donating their resources?

Finn the Human
Joined: 23 Dec 18
Posts: 23
Credit: 10,214,542
RAC: 102
Message 76924 - Posted: 14 Feb 2024, 3:55:46 UTC - in response to Message 76923.  

It is a problem with volunteer relations. The process sending out tasks could trivially issue a quorum in rapid succession even if the number of workunits in the queue is large.

What useful purpose is served by operating in a manner that causes unnecessary concern for people who are donating their resources?


I agree that the lack of proper communication requires us to extrapolate information. From our side, a sudden stop of credits being awarded is a big cause for concern.

However, I feel the sole developer here is overburdened and the lack of engagement is the consequence.
Everything stays
But it still changes
Ever so slightly
Daily and nightly
In little ways
When everything stays...

xii5ku
Joined: 1 Jan 17
Posts: 39
Credit: 113,155,484
RAC: 40,539
Message 76926 - Posted: 14 Feb 2024, 23:42:32 UTC - in response to Message 76923.  

Jerry wrote:
The process sending out tasks could trivially issue a quorum in rapid succession even if the number of workunits in the queue is large.
I agree. It seems desirable that those who have insight into the NBody validator review whether or not the current minimum quorum of 1 really makes sense: *If* it is very unlikely that a single result can be validated (or worse: actually impossible),¹ then NBody workunits should be configured to minimum quorum = 2 (and initial replication = 2). That's not only for the users' sake; it should (if the mentioned condition is true) also reduce the database size somewhat, as it should reduce the number of workunits waiting for validation.

________
¹) I for one have never spotted a workunit which was validated from a single task. Hence it seems to me that it is indeed highly unlikely or impossible.
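
For illustration only, this is roughly what such a setting could look like, assuming the server uses stock BOINC workunit input templates. The project's actual templates are not shown anywhere in this thread, so everything except the two replication values discussed above is left out, and "initial replication" is assumed to map to target_nresults.

```xml
<!-- Sketch of a BOINC workunit input template fragment.
     Assumption: stock BOINC templates; input file entries and all
     other workunit settings are omitted here. -->
<input_template>
  <workunit>
    <!-- number of agreeing results needed before a workunit validates -->
    <min_quorum>2</min_quorum>
    <!-- number of tasks created up front (the "initial replication") -->
    <target_nresults>2</target_nresults>
  </workunit>
</input_template>
```

With both values at 2, the second task of each workunit would be created together with the first instead of only after the first result comes back.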

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 556,844,997
RAC: 43,988
Message 76927 - Posted: 15 Feb 2024, 2:03:01 UTC - in response to Message 76926.  

Jerry wrote:
The process sending out tasks could trivially issue a quorum in rapid succession even if the number of workunits in the queue is large.
I agree. It seems desirable that those who have insight into the NBody validator review whether or not the current minimum quorum of 1 really makes sense: *If* it is very unlikely that a single result can be validated (or worse: actually impossible),¹ then NBody workunits should be configured to minimum quorum = 2 (and initial replication = 2). That's not only for the users' sake; it should (if the mentioned condition is true) also reduce the database size somewhat, as it should reduce the number of workunits waiting for validation.

________
¹) I for one have never spotted a workunit which was validated from a single task. Hence it seems to me that it is indeed highly unlikely or impossible.

Separation tasks were almost always validated by a single task on "trusted" hosts.

Bill F
Joined: 4 Jul 09
Posts: 99
Credit: 17,434,413
RAC: 2,338
Message 76928 - Posted: 15 Feb 2024, 2:17:30 UTC - in response to Message 76924.  
Last modified: 15 Feb 2024, 2:19:12 UTC

It is a problem with volunteer relations. The process sending out tasks could trivially issue a quorum in rapid succession even if the number of workunits in the queue is large.

What useful purpose is served by operating in a manner that causes unnecessary concern for people who are donating their resources?


I agree that the lack of proper communication requires us to extrapolate information. From our side, a sudden stop of credits being awarded is a big cause for concern.

However, I feel the sole developer here is overburdened and the lack of engagement is the consequence.


It depends on the volunteer's perspective. If a volunteer is donating their hardware and electricity for science, then the "sudden stop of credits" is interesting and requires watching BUT is not a big cause for concern. If the volunteer is donating hardware and electricity for credits and recognition and/or badges, then it might be a "big cause for concern" for that volunteer.

Science wins....

Bill F
In October of 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.


Finn the Human
Joined: 23 Dec 18
Posts: 23
Credit: 10,214,542
RAC: 102
Message 76929 - Posted: 15 Feb 2024, 4:06:01 UTC - in response to Message 76928.  


It depends on the volunteer's perspective. If a volunteer is donating their hardware and electricity for science, then the "sudden stop of credits" is interesting and requires watching BUT is not a big cause for concern. If the volunteer is donating hardware and electricity for credits and recognition and/or badges, then it might be a "big cause for concern" for that volunteer.

Science wins....

Bill F


I'm not folding for the "rewards." The sudden loss of credits raises the question of whether we are doing science at all, since credits are the only immediate empirical metric users see. None of the admin update posts mentioned that no one was getting credited, so it gave way to all the questions like "Why aren't we getting credits? Is my machine broken?" and so on. If a proactive announcement had explained what had happened in terms the public understands, this wouldn't have been a big concern.
Everything stays
But it still changes
Ever so slightly
Daily and nightly
In little ways
When everything stays...

xii5ku
Joined: 1 Jan 17
Posts: 39
Credit: 113,155,484
RAC: 40,539
Message 76930 - Posted: 15 Feb 2024, 10:53:25 UTC - in response to Message 76927.  
Last modified: 15 Feb 2024, 11:18:20 UTC

Keith Myers wrote:
xii5ku wrote:
It seems desirable that those who have insight into the NBody validator review whether or not the current minimum quorum of 1 really makes sense: [...]
I for one have never spotted a workunit which was validated from a single task. Hence it seems to me that it is indeed highly unlikely or impossible.
Separation tasks were almost always validated by a single task on "trusted" hosts.
Right; since I was referring specifically to the NBody validator, I should have written "I for one have never spotted an NBody workunit which was validated from a single task" for clarity.

[Even back in the time when both Separation and NBody were still active in parallel, it would have been technically possible to configure different quorum and replication settings for the two on the server. And perhaps not only possible but also potentially beneficial to the database size.]

--------

Bill F wrote:
Finn the Human wrote:
[...] From our side, a sudden stop of credits being awarded is a big cause for concern. [...]
It depends on the volunteer's perspective. If a volunteer is donating their hardware and electricity for science, then the "sudden stop of credits" is interesting and requires watching BUT is not a big cause for concern. If the volunteer is donating hardware and electricity for credits and recognition and/or badges, then it might be a "big cause for concern" for that volunteer.
Regardless of whether the motivation leans toward scientific contribution or toward nice scores, either way we want our hosts to return valid results. Hence the confusion among us when suddenly nothing was validated any more. However, the explanation of why this happened, as well as estimates of when validation would resume, could be found on the message board. (Although, as Finn the Human put it, this info was mostly extrapolated by users.)

rz5rqt
Joined: 5 Sep 09
Posts: 9
Credit: 562,405,735
RAC: 102,700
Message 76946 - Posted: 29 Feb 2024, 22:59:43 UTC

I got the first of the de_nbody orbit_fitting tasks today. It seems they do not follow the app_config.xml. I have configured one of my "48 CPU" computers to run 4 tasks at a time using 12 CPUs each. All of the "old" nbody tasks obey this config file, but the 10 orbit_fitting tasks I got today are all listed as "Ready to start (16 CPUs)" (none have run yet). Background: I have two identical computers. One has no app_config.xml file (it runs three tasks at a time using 16 CPUs). The other has an app_config.xml file to run 4 tasks at a time using 12 CPUs. This has always worked. Even the plain ole nbody tasks I got AFTER the orbit_fitting tasks show "Ready to start (12 CPUs)". Is this by design?

JohnDK
Joined: 18 Feb 10
Posts: 57
Credit: 222,646,444
RAC: 5,838
Message 76947 - Posted: 29 Feb 2024, 23:56:24 UTC - in response to Message 76946.  

I got the first of the de_nbody orbit_fitting tasks today. It seems they do not follow the app_config.xml. I have configured one of my "48 CPU" computers to run 4 tasks at a time using 12 CPUs each. All of the "old" nbody tasks obey this config file, but the 10 orbit_fitting tasks I got today are all listed as "Ready to start (16 CPUs)" (none have run yet). Background: I have two identical computers. One has no app_config.xml file (it runs three tasks at a time using 16 CPUs). The other has an app_config.xml file to run 4 tasks at a time using 12 CPUs. This has always worked. Even the plain ole nbody tasks I got AFTER the orbit_fitting tasks show "Ready to start (12 CPUs)". Is this by design?

You need to modify the app_config file. This is mine, which seems to work; you just need to change the numbers to fit your needs.

<app_config>

 <app>
  <name>milkyway_nbody_orbit_fitting</name>
  <max_concurrent>2</max_concurrent>
 </app>
 <app_version>
  <app_name>milkyway_nbody_orbit_fitting</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>3</avg_ncpus>
  <cmdline>--nthreads 3</cmdline>
 </app_version>

 <app>
  <name>milkyway_nbody</name>
  <max_concurrent>2</max_concurrent>
 </app>
 <app_version>
  <app_name>milkyway_nbody</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>3</avg_ncpus>
  <cmdline>--nthreads 3</cmdline>
 </app_version>

<project_max_concurrent>2</project_max_concurrent>
</app_config>
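
Adapted to the 48-CPU machine from the question (4 tasks at a time, 12 CPUs each), the same file might look roughly like this. It is only a sketch: it reuses the app names and elements from the file above, the values 12 and 4 come from the question, and it has not been tested on real hardware.

```xml
<!-- Sketch: 4 tasks at a time, 12 threads each, on a 48-CPU host.
     Values taken from the question above; untested assumption. -->
<app_config>

 <app_version>
  <app_name>milkyway_nbody_orbit_fitting</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>12</avg_ncpus>
  <cmdline>--nthreads 12</cmdline>
 </app_version>

 <app_version>
  <app_name>milkyway_nbody</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>12</avg_ncpus>
  <cmdline>--nthreads 12</cmdline>
 </app_version>

 <project_max_concurrent>4</project_max_concurrent>
</app_config>
```

The limit of 4 is set once for the whole project here, so a mix of old nbody and orbit_fitting tasks should still add up to 48 threads at most.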

rz5rqt
Joined: 5 Sep 09
Posts: 9
Credit: 562,405,735
RAC: 102,700
Message 76948 - Posted: 1 Mar 2024, 2:53:14 UTC - in response to Message 76947.  

Been so long I forgot it was application specific. Thanks.

mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 0
Message 76949 - Posted: 1 Mar 2024, 10:37:17 UTC - in response to Message 76946.  

I got the first of the de_nbody orbit_fitting tasks today. It seems they do not follow the app_config.xml. I have configured one of my "48 CPU" computers to run 4 tasks at a time using 12 CPUs each. All of the "old" nbody tasks obey this config file, but the 10 orbit_fitting tasks I got today are all listed as "Ready to start (16 CPUs)" (none have run yet). Background: I have two identical computers. One has no app_config.xml file (it runs three tasks at a time using 16 CPUs). The other has an app_config.xml file to run 4 tasks at a time using 12 CPUs. This has always worked. Even the plain ole nbody tasks I got AFTER the orbit_fitting tasks show "Ready to start (12 CPUs)". Is this by design?


Mine looks like this now and works for me:

<app_config>

 <app_version>
  <app_name>milkyway_nbody</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>2</avg_ncpus>
  <cmdline>--nthreads 2</cmdline>
 </app_version>

 <app_version>
  <app_name>milkyway_nbody_orbit_fitting</app_name>
  <plan_class>mt</plan_class>
  <avg_ncpus>2</avg_ncpus>
  <cmdline>--nthreads 2</cmdline>
 </app_version>

 <project_max_concurrent>1</project_max_concurrent>

</app_config>

You can see from mine that you have to add a new section with the new app name in it.

I run mine with 2 CPU cores each; they take longer to run, but they run just fine so far. I'm waiting for my wingmen to know for sure, of course. I am also running only 1 task at a time on my laptop; my desktops will have different settings based on the capability of each one.

Jimbocous
Joined: 7 Mar 20
Posts: 22
Credit: 106,245,701
RAC: 10,976
Message 76950 - Posted: 1 Mar 2024, 11:25:12 UTC

Thanks for the app_config reminder, guys. Much appreciated:)

Keith Myers
Joined: 24 Jan 11
Posts: 715
Credit: 556,844,997
RAC: 43,988
Message 76951 - Posted: 1 Mar 2024, 17:55:27 UTC - in response to Message 76946.  

Likely the name changed for the tasks and that is why your app_config does not work anymore. If they are releasing BOTH the old N-body and whatever the new Orbit tasks are named, just use two app_version sections.

©2024 Astroinformatics Group