Welcome to MilkyWay@home

Thread to report issues after server migration

alanb1951

Joined: 16 Mar 10
Posts: 211
Credit: 108,184,825
RAC: 5,003
Message 76525 - Posted: 3 Nov 2023, 17:11:00 UTC - in response to Message 76523.  


P.S. I note that the new master URL has -new added, but the web site doesn't -- is that the long-term plan or will the master URL end up changing back to not having -new in it?


milkyway-new.cs.rpi.edu is the master address but using just milkyway.cs.rpi.edu will also reach milkyway-new. This is the way the security team here at RPI set it up.

Thanks for the clarification!

Back to the problem at hand: I see that the certificate issues are fairly well documented in this thread by now :-) -- if your security folks are saying there's nothing wrong [because it works for Windows and for browsers on Linux] please inform them otherwise :-)

Cheers - Al.

P.S. I will not be patching my certificate store :-)
Kevin Roux
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Aug 22
Posts: 80
Credit: 2,558,312
RAC: 7,513
Message 76526 - Posted: 3 Nov 2023, 19:26:23 UTC - in response to Message 76525.  


P.S. I note that the new master URL has -new added, but the web site doesn't -- is that the long-term plan or will the master URL end up changing back to not having -new in it?


milkyway-new.cs.rpi.edu is the master address but using just milkyway.cs.rpi.edu will also reach milkyway-new. This is the way the security team here at RPI set it up.

Thanks for the clarification!

Back to the problem at hand: I see that the certificate issues are fairly well documented in this thread by now :-) -- if your security folks are saying there's nothing wrong [because it works for Windows and for browsers on Linux] please inform them otherwise :-)

Cheers - Al.

P.S. I will not be patching my certificate store :-)


I have contacted them and linked this thread as well. If they don't get back to me today, then I'm afraid it will have to wait until Monday.
I will try to fix the other issues unrelated to the certificate issue while I wait for their response.
nickfolino

Joined: 8 Jan 20
Posts: 11
Credit: 6,164,641
RAC: 0
Message 76528 - Posted: 3 Nov 2023, 19:48:15 UTC - in response to Message 76525.  

P.S. I will not be patching my certificate store :-)


It's not a patch. The cert store holds certificates that you trust.
In order for you to trust the certificate given to you by the milkyway site, you must trust the issuer of that certificate.
Certificates presented by websites generally come with a chain of certificates that links back to a "trusted" root.
In the case of the new certificate being presented by milkyway, the certificate chain was not built correctly.
The 2nd certificate in their chain is not the correct one.
Web browsers come by default with many trusted root certificates, which is why your browser isn't complaining about the new site.
The 2 certificates I posted are the correct ones that properly complete the security chain.
Adding them basically makes your OS trust their new cert just as your browser does, because they complete the trusted chain.
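
One way to see this from the client side is to attempt a TLS handshake using only the system's default CA bundle, which is roughly the position a libcurl/OpenSSL-based client such as BOINC on Linux is in. The following is a minimal Python sketch, not a BOINC tool: `check_tls` is a hypothetical helper name, and it assumes Python's `ssl` module picks up the same CA bundle as the rest of the OS (typical on Ubuntu, but not guaranteed everywhere).

```python
import socket
import ssl


def check_tls(host, port=443, timeout=10.0):
    """Try a TLS handshake using the default CA bundle.

    Returns (True, description) on success, or (False, reason) on failure.
    """
    # Loads the default trusted CA certificates; no browser store is involved.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"handshake OK, {tls.version()}"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain cannot be linked to a trusted certificate.
        return False, f"certificate verification failed: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS failures, refused connections, timeouts and other TLS errors.
        return False, f"connection failed: {exc}"
```

Calling `check_tls("milkyway-new.cs.rpi.edu")` would be expected to report a verification failure while the server sends the wrong intermediate, and a success once the chain is fixed. A browser may still load the site in the first case because it has extra certificates of its own.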

Nick
alanb1951

Joined: 16 Mar 10
Posts: 211
Credit: 108,184,825
RAC: 5,003
Message 76529 - Posted: 3 Nov 2023, 19:54:17 UTC - in response to Message 76526.  

Back to the problem at hand: I see that the certificate issues are fairly well documented in this thread by now :-) -- if your security folks are saying there's nothing wrong [because it works for Windows and for browsers on Linux] please inform them otherwise :-)

Cheers - Al.

P.S. I will not be patching my certificate store :-)


I have contacted them and linked this thread as well. If they don't get back to me today, then I'm afraid it will have to wait until Monday.
I will try to fix the other issues unrelated to the certificate issue while I wait for their response.

Thanks, Kevin. The "wait until the technicians return on Monday" issue affects other BOINC projects too (CPDN and WCG to name but two), and it's understandable [1].

Good luck with the other fixing up. Although the [limited] parts of the web site I've used seem o.k., I've seen some of the other bug reports with hosts of PHP errors [yuk!].

Cheers - Al.

[1] Having worked in a University Computing Service at one time I have seen that at first hand, both before and after the introduction of 24/7 "on call"...
JohnDK
Joined: 18 Feb 10
Posts: 57
Credit: 222,342,010
RAC: 4,984
Message 76531 - Posted: 3 Nov 2023, 20:07:21 UTC

So I removed and added the project again on my Win11 PC, getting new tasks and it seems to work.

But when BOINC requests new work I get this error message:

03/11/2023 21.00.20 | Milkyway@home | Scheduler request to url failed: Couldn't resolve host name

I do get new work even with this message.
alanb1951

Joined: 16 Mar 10
Posts: 211
Credit: 108,184,825
RAC: 5,003
Message 76534 - Posted: 3 Nov 2023, 22:50:42 UTC - in response to Message 76528.  

Nick -

It's not a patch. The cert store holds certificates that you trust.
I use the certificate bundle supplied by Ubuntu (which appears to be based on the Mozilla bundle [1]!), so as far as I'm concerned, adding to it is akin to patching (though the remark was tongue-in-cheek...) -- I tend to avoid altering something that shouldn't need modifying :-)

Web browsers come by default with many trusted root certificates. Which is why your browser isn't complaining about the new site.
Does this mean that the valid certificate chain Firefox is supposed to have downloaded [from the MW site, I presumed] is actually a concoction constructed by Firefox? If so, fair enough (but a badly worded Firefox certificate information page!) and I'd be [vaguely] interested in how it does it [2]. If that is the case, that would certainly explain why I couldn't make sense of the two very different certificate chains I could see!

Cheers - Al.

P.S. I make no claims to being an SSL/TLS guru, so excuse my [apparently] limited understanding :-)

[1] According to the package manager, ca-certificates "Contains the certificate authorities shipped with Mozilla's browser to allow SSL-based applications to check for the authenticity of SSL connections."

[2] I presume that all the Linux software relevant here (BOINC, openssl, Firefox) ends up using a recent libssl version - BOINC seems to use libcurl as an intermediary to libssl3, openssl uses libssl3 and the Firefox snap seems to be a static build...
nickfolino

Joined: 8 Jan 20
Posts: 11
Credit: 6,164,641
RAC: 0
Message 76536 - Posted: 4 Nov 2023, 0:31:31 UTC - in response to Message 76534.  

Does this mean that the valid certificate chain Firefox is supposed to have downloaded [from the MW site, I presumed] is actually a concoction constructed by Firefox? If so, fair enough (but a badly worded Firefox certificate information page!) and I'd be [vaguely] interested in how it does it [2]. If that is the case, that would certainly explain why I couldn't make sense of the two very different certificate chains I could see!


I'll try to make it simple as I know it can be confusing. I'll use the MW certificate as an example.
MW was issued a certificate that was signed by an intermediate authority.
The intermediate authority's certificate was signed by a trusted root authority.
So the certificate chain for the MW site should have 3 certificates in it. The site cert, an intermediate cert, and a root cert.
All three are presented to you when you go to the site. The application you are using to get to the site then validates the chain.
It looks at the site cert, sees it was signed by the intermediate, which was signed by the root. So the chain is verified.
The chain that MW is currently presenting has the site cert, then an intermediate that wasn't used to sign the site cert, then the root cert.
It can't validate the chain because the intermediate cert isn't correct.
But if you tell your app to trust the correct intermediate and root all will be good.

Web browsers already have these certs in their cert store so they don't break.
Windows, being Windows, uses the same cert store for the OS and the browser, which is why those clients aren't having problems: they already have the certs in their cert store.
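
The chain walk described above can be sketched as a toy model in Python. This is only an illustration: real validation checks cryptographic signatures, validity dates and more, whereas this sketch just follows issuer names. The `Cert` class, the `chain_is_trusted` function and all certificate names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str  # subject name of the certificate that signed this one


def chain_is_trusted(leaf, presented, trust_store):
    """Walk from the leaf towards a root by following issuer names.

    presented:   certificates the server sent alongside the leaf
    trust_store: certificates the client already trusts locally
    """
    candidates = {c.subject: c for c in presented}
    candidates.update({c.subject: c for c in trust_store})
    trusted = {c.subject for c in trust_store}

    current = leaf
    for _ in range(10):  # guard against loops in a malformed chain
        if current.subject in trusted:
            return True
        parent = candidates.get(current.issuer)
        if parent is None or parent == current:
            return False  # missing issuer, or an untrusted self-signed cert
        current = parent
    return False


# Hypothetical certificates, named only for this example.
root = Cert("Root CA", "Root CA")
right_intermediate = Cert("Intermediate A", "Root CA")
wrong_intermediate = Cert("Intermediate B", "Root CA")
site = Cert("milkyway.example", "Intermediate A")

os_store = [root]                           # OS bundle: roots only
browser_store = [root, right_intermediate]  # browser also knows the intermediate

print(chain_is_trusted(site, [right_intermediate, root], os_store))       # True
print(chain_is_trusted(site, [wrong_intermediate, root], os_store))       # False
print(chain_is_trusted(site, [wrong_intermediate, root], browser_store))  # True
```

The second and third calls mirror the situation in this thread: the same wrong chain fails against a roots-only store but passes for a client that already holds the correct intermediate.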

Hope that helps.

Nick
alanb1951

Joined: 16 Mar 10
Posts: 211
Credit: 108,184,825
RAC: 5,003
Message 76537 - Posted: 4 Nov 2023, 1:34:49 UTC - in response to Message 76536.  
Last modified: 4 Nov 2023, 1:36:45 UTC

Nick,

Thanks for your effort, but I rather think we're talking past one another (probably my fault) rather than communicating... What the details I saw on the Firefox certificate stuff had me pondering was whether all MW servers are actually sending out the same certificate chain!

What I read from your last reply is that if the browser can find certificates for the relevant intermediate and root CA issuers in its own store it won't bother to look at the rest of the chain sent by the [MW] server... The alternatives I can think of would be either

  1. Firefox sees the broken certificate and takes what action it can to work round it;
  2. the MW web server and whatever server(s) BOINC (and openssl) are hitting are actually returning different certificate chains!


Again, thanks for your efforts, but I don't think there's really any point in my pursuing this any further -- I'm not going to start digging with a network monitor, as I'm not that bothered :-) -- we've all combined to identify and report the problem, and it will probably be sorted out on Monday!

Cheers - Al.

Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 76549 - Posted: 4 Nov 2023, 11:02:01 UTC - in response to Message 76523.  

P.S. I note that the new master URL has -new added, but the web site doesn't -- is that the long-term plan or will the master URL end up changing back to not having -new in it?
milkyway-new.cs.rpi.edu is the master address but using just milkyway.cs.rpi.edu will also reach milkyway-new. This is the way the security team here at RPI set it up.
Seems to be working fine on my 10 Windows machines which I left connected to the old address. Can I just leave them there forever? This thread seems to be full of people creating a problem by reattaching when not necessary.
The above was double spaced between sentences, I apologise for the forum software ruining my post.
Mr P Hucker
Joined: 5 Jul 11
Posts: 990
Credit: 376,143,149
RAC: 0
Message 76550 - Posted: 4 Nov 2023, 11:05:11 UTC - in response to Message 76536.  

Windows, being Windows, uses the same cert store for the OS and the browser, which is why those clients aren't having problems as they already have the certs in their cert store.
Sounds like Windows is very sensible, why have two copies of the same thing?
The above was double spaced between sentences, I apologise for the forum software ruining my post.
Xterelle

Joined: 7 Aug 22
Posts: 9
Credit: 20,033,952
RAC: 0
Message 76553 - Posted: 4 Nov 2023, 14:50:27 UTC

There's some bullshit with the new task

https://milkyway-new.cs.rpi.edu/milkyway/result.php?resultid=930721522

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkyway_nbody 1.76 Windows x86_64 double OpenMP, Crlibm </search_application>
Using OpenMP 16 max threads on a system with 20 processors
Number of particles in bins is very small compared to total. (0 << 1). Skipping distance calculation
Number of particles in bins is very small compared to total. (0 << 1). Skipping distance calculation
<search_likelihood>-9999999.900000000400000</search_likelihood>
<search_likelihood_EMD>-0.000000000000000</search_likelihood_EMD>
<search_likelihood_Mass>-0.000000000000000</search_likelihood_Mass>
<search_likelihood_Beta>-0.000000000000000</search_likelihood_Beta>
17:44:38 (3712): called boinc_finish(0)

</stderr_txt>
]]>
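
For anyone collecting several of these result pages, the likelihood values can be pulled out of the stderr text mechanically. A rough Python sketch follows; the function names are invented here, and treating -9999999.9 as a "no usable result" marker is an assumption based on the output above, not something confirmed by the project.

```python
import re

# Matches <search_likelihood>, <search_likelihood_EMD>, and similar tags.
LIKELIHOOD_RE = re.compile(
    r"<(search_likelihood\w*)>\s*(-?\d+(?:\.\d+)?)\s*</\1>"
)

# Assumption: this value marks a run with no usable comparison.
SENTINEL = -9999999.9


def parse_likelihoods(stderr_text):
    """Return {tag: float} for every <search_likelihood*> element found."""
    return {tag: float(value) for tag, value in LIKELIHOOD_RE.findall(stderr_text)}


def looks_degenerate(likelihoods):
    """True when the overall likelihood sits at the assumed sentinel value."""
    value = likelihoods.get("search_likelihood")
    return value is not None and abs(value - SENTINEL) < 1e-3
```

Run against the stderr text above, `parse_likelihoods` would return four entries and `looks_degenerate` would flag the task, which fits the "Number of particles in bins is very small" warnings.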
brewsmith

Joined: 31 Aug 10
Posts: 11
Credit: 2,752,146
RAC: 0
Message 76558 - Posted: 4 Nov 2023, 19:00:33 UTC

Thank you to everyone working on the connection issue with MilkyWay@home. I have a couple of BOINC clients running on Linux (Ubuntu, Ubuntu Server) and they are all experiencing the same issue. I removed the MilkyWay@home connection, tried to reconnect (or add a new project), and I have tried with both the -new address and the old one. All attempts result in "Failed to connect, try again later". I am aware of the issue with certificates, and I am waiting patiently for the solution.

Someone on our Discord Server asked if there was a problem connecting and I gave them a link to this thread.

I look forward to a solution and I am willing to try to connect whenever you want to try to see if a solution works.

Cheers and keep up the good work.
mikey
Joined: 8 May 09
Posts: 3339
Credit: 524,010,781
RAC: 1
Message 76566 - Posted: 5 Nov 2023, 10:41:33 UTC - in response to Message 76558.  

Thank you to everyone working on the connection issue with MilkyWay@home. I have a couple of BOINC clients running on Linux (Ubuntu, Ubuntu Server) and they are all experiencing the same issue. I removed the MilkyWay@home connection, tried to reconnect (or add a new project), and I have tried with both the -new address and the old one. All attempts result in "Failed to connect, try again later". I am aware of the issue with certificates, and I am waiting patiently for the solution.

Someone on our Discord Server asked if there was a problem connecting and I gave them a link to this thread.

I look forward to a solution and I am willing to try to connect whenever you want to try to see if a solution works.

Cheers and keep up the good work.


The problem is a screwed-up certificate on the server side. The IT guys come back to work on Monday and should fix it then.
nickfolino

Joined: 8 Jan 20
Posts: 11
Credit: 6,164,641
RAC: 0
Message 76569 - Posted: 5 Nov 2023, 12:50:03 UTC - in response to Message 76550.  

[quote]Sounds like Windows is very sensible, why have two copies of the same thing?[/quote]

They're not the same thing. There are many reasons to keep them separate; spillage is the first one that pops into my head.

Nick
Anthony Liggins

Joined: 9 Jun 12
Posts: 6
Credit: 20,077,757
RAC: 0
Message 76572 - Posted: 5 Nov 2023, 21:30:03 UTC

Hi everyone,

After letting my WUs run out, I deleted the project and then added it again to be in sync with the migration. My issue is that since then, BOINC stats has stopped receiving my daily update.
Is anyone else experiencing this?
Will it right itself?
Or do I need to renew my CPUID to fix this?

Kind regards,
Anthony.
Bill F
Joined: 4 Jul 09
Posts: 92
Credit: 17,303,551
RAC: 2,610
Message 76575 - Posted: 6 Nov 2023, 11:03:53 UTC - in response to Message 76572.  

Yes, as part of the migration issues, stats and credits not updating has been identified as an issue to be fixed.

Bill F
alanb1951

Joined: 16 Mar 10
Posts: 211
Credit: 108,184,825
RAC: 5,003
Message 76579 - Posted: 6 Nov 2023, 16:24:24 UTC

The certificate issue seems to be resolved now -- I was able to get MW up and running again on my Linux systems from around 15:45 UTC on 2023-11-06. It even worked as I wanted once I remembered that I needed to restore/rebuild my app_config.xml files to cut the number of threads per task :-)
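
For reference, an app_config.xml that limits threads per task can be generated rather than hand-edited after a reattach. The sketch below uses Python's standard library; the element names follow BOINC's documented app_config.xml layout, but the app name `milkyway_nbody`, the plan class `mt` and the `--nthreads` option are assumptions that should be checked against the project's current applications before use. `build_app_config` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, threads):
    """Return app_config.xml text limiting one app version to `threads` CPUs."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    # Tells the client how many CPUs to budget for each task.
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    # Assumed application flag that caps the threads actually used.
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads}"
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Example values only: 4 threads for the N-body application.
print(build_app_config("milkyway_nbody", "mt", 4))
```

Saving the returned text as `app_config.xml` in the project's directory and asking the client to re-read its config files should apply it, assuming the names match what the server actually sends.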

Thank you Kevin and the RPI techs.

Cheers - Al.
Kevin Roux
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Aug 22
Posts: 80
Credit: 2,558,312
RAC: 7,513
Message 76580 - Posted: 6 Nov 2023, 17:10:48 UTC - in response to Message 76549.  

P.S. I note that the new master URL has -new added, but the web site doesn't -- is that the long-term plan or will the master URL end up changing back to not having -new in it?
milkyway-new.cs.rpi.edu is the master address but using just milkyway.cs.rpi.edu will also reach milkyway-new. This is the way the security team here at RPI set it up.
Seems to be working fine on my 10 Windows machines which I left connected to the old address. Can I just leave them there forever? This thread seems to be full of people creating a problem by reattaching when not necessary.


You shouldn't need to change anything. The old address and the new address are essentially the same thing.
Kevin Roux
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Aug 22
Posts: 80
Credit: 2,558,312
RAC: 7,513
Message 76581 - Posted: 6 Nov 2023, 17:14:05 UTC - in response to Message 76579.  

The certificate issue seems to be resolved now -- I was able to get MW up and running again on my Linux systems from around 15:45 UTC on 2023-11-06. It even worked as I wanted once I remembered that I needed to restore/rebuild my app_config.xml files to cut the number of threads per task :-)

Thank you Kevin and the RPI techs.

Cheers - Al.


Yes, the issue with the certificates has been fixed.
Please let me know if there are other issues with connecting.
I will keep checking this thread as I work to fix all the issues.
brewsmith

Joined: 31 Aug 10
Posts: 11
Credit: 2,752,146
RAC: 0
Message 76583 - Posted: 6 Nov 2023, 18:35:12 UTC - in response to Message 76581.  

Thank you for all the hard work. On my Linux machines, I have now added MilkyWay@home back successfully.

Question - Is credit for work done now working as well?

The certificate issue seems to be resolved now -- I was able to get MW up and running again on my Linux systems from around 15:45 UTC on 2023-11-06. It even worked as I wanted once I remembered that I needed to restore/rebuild my app_config.xml files to cut the number of threads per task :-)

Thank you Kevin and the RPI techs.

Cheers - Al.


Yes, the issue with the certificates has been fixed.
Please let me know if there are other issues with connecting.
I will keep checking this thread as I work to fix all the issues.

©2024 Astroinformatics Group