Thread to report issues after server migration
Author | Message |
---|---|
Send message Joined: 9 Aug 22 Posts: 80 Credit: 2,553,087 RAC: 7,641 |
Hello everyone, Please post all issues you have with the project that pertain to the recent (Nov 1st, 2023) server migration. I will be monitoring this thread closely for the next week and will be working to fix all issues as fast as I can. Having said this, please be patient with me as there will probably be many issues. Thank you, Kevin |
Send message Joined: 9 Aug 22 Posts: 80 Credit: 2,553,087 RAC: 7,641 |
I am aware that workunit downloads are failing and will be working to fix this first. |
Send message Joined: 16 Mar 10 Posts: 211 Credit: 108,180,811 RAC: 5,037 |
Kevin, As has been noted in the News thread, it appears that Linux systems can't re-attach... Trying to amend the URL offered by BOINC Manager results in a "Please try again later" message. Not helpful :-) So I used boinccmd to attach instead, and that seemed to do something. However, looking in BOINC Manager after that shows the project identified by URL rather than by name, and the status is reported as "Scheduler request pending. Project initialization" (with a Communication deferred time appended) If I try an update, it offers "Fetching scheduler list" then reports "Project communication failed"... Checking in /var/lib/boinc-client, I have what seems to be a valid account_milkyway-new.cs.rpi.edu_milkyway.xml but master_milkyway-new.cs.rpi.edu_milkyway.xml is empty. Hope that helps :-) Cheers - Al. P.S. I note that the new master URL has -new added, but the web site doesn't -- is that the long-term plan or will the master URL end up changing back to not having -new in it? |
Send message Joined: 9 Aug 22 Posts: 80 Credit: 2,553,087 RAC: 7,641 |
Kevin, So I have been trying to see if the issue was on my end. It might be that something in the cache or settings is messed up. Updating BOINC seemed to work when testing to see what the issue was. |
Send message Joined: 7 Aug 22 Posts: 9 Credit: 20,033,952 RAC: 0 |
There are a lot of errors on these pages: https://milkyway-new.cs.rpi.edu/milkyway/top_users.php https://milkyway-new.cs.rpi.edu/milkyway/top_teams.php https://milkyway-new.cs.rpi.edu/milkyway/top_hosts.php |
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
I compiled the new 7.25.0 client on a few Linux servers and still cannot connect to the new server. Nick |
Send message Joined: 9 Aug 22 Posts: 80 Credit: 2,553,087 RAC: 7,641 |
"I compiled the new 7.25.0 client on a few Linux servers and still cannot connect to the new server." What error are you receiving in your event log? |
Send message Joined: 16 Mar 10 Posts: 211 Credit: 108,180,811 RAC: 5,037 |
Kevin, I see Nick has tried the latest available client, but I'm using the latest repository clients (which are a tad older!) and given his experience I'm not going to spend ages working out how to build a client :-) On client 7.20.2 I see the following at the end of a connection attempt (after it has redirected to https on port 443): Thu 02 Nov 2023 20:54:23 GMT | http://milkyway-new.cs.rpi.edu/milkyway/ | [http] [ID#1] Info: TLSv1.2 (OUT), TLS header, Unknown (21): Thu 02 Nov 2023 20:54:23 GMT | http://milkyway-new.cs.rpi.edu/milkyway/ | [http] [ID#1] Info: TLSv1.3 (OUT), TLS alert, unknown CA (560): Thu 02 Nov 2023 20:54:23 GMT | http://milkyway-new.cs.rpi.edu/milkyway/ | [http] [ID#1] Info: SSL certificate problem: unable to get local issuer certificate Thu 02 Nov 2023 20:54:23 GMT | http://milkyway-new.cs.rpi.edu/milkyway/ | [http] [ID#1] Info: Closing connection 16 Thu 02 Nov 2023 20:54:23 GMT | http://milkyway-new.cs.rpi.edu/milkyway/ | [http] HTTP error: SSL peer certificate or SSH remote key was not OK Similar on client 7.20.5 too. Hope this helps. Cheers - Al. |
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
Nov 02 17:33:46 inspiron.folino.us boinc[209154]: 02-Nov-2023 17:33:46 [---] Fetching configuration file from https://milkyway-new.cs.rpi.edu/milkyway/get_project_config.php Nov 02 17:33:47 inspiron.folino.us boinc[209154]: 02-Nov-2023 17:33:47 [---] Project communication failed: attempting access to reference site Nov 02 17:33:49 inspiron.folino.us boinc[209154]: 02-Nov-2023 17:33:49 [---] Internet access OK - project servers may be temporarily down. Nick |
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
I can see in Wireshark the "Unknown CA" error reported by Al. If you can point me to the location of the cert in the source I can try and replace it and see if it works. Nick |
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
I updated client/http_curl.cpp to not check for a valid cert and it now works. Not the best solution, but it has me back up for now. Nick |
Send message Joined: 28 Jul 22 Posts: 4 Credit: 7,756,599 RAC: 0 |
It's definitely certificate related on my Linux Docker. Running this command on my mac works, but on my container it returns an error. curl https://milkyway-new.cs.rpi.edu/milkyway/get_project_config.php -v Results on Docker: root@84d5ad767049:/tmp/certs# curl https://milkyway.cs.rpi.edu/milkyway/get_project_config.php -v * Trying 128.113.126.54:443... * Connected to milkyway.cs.rpi.edu (128.113.126.54) port 443 (#0) * ALPN, offering h2 * ALPN, offering http/1.1 * CAfile: /etc/ssl/certs/ca-certificates.crt * CApath: /etc/ssl/certs * TLSv1.0 (OUT), TLS header, Certificate Status (22): * TLSv1.3 (OUT), TLS handshake, Client hello (1): * TLSv1.2 (IN), TLS header, Certificate Status (22): * TLSv1.3 (IN), TLS handshake, Server hello (2): * TLSv1.2 (IN), TLS header, Finished (20): * TLSv1.2 (IN), TLS header, Supplemental data (23): * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8): * TLSv1.2 (IN), TLS header, Supplemental data (23): * TLSv1.3 (IN), TLS handshake, Certificate (11): * TLSv1.2 (OUT), TLS header, Unknown (21): * TLSv1.3 (OUT), TLS alert, unknown CA (560): * SSL certificate problem: unable to get local issuer certificate * Closing connection 0 curl: (60) SSL certificate problem: unable to get local issuer certificate More details here: https://curl.se/docs/sslcerts.html curl failed to verify the legitimacy of the server and therefore could not establish a secure connection to it. To learn more about this situation and how to fix it, please visit the web page mentioned above. The root CA is in my ca-certificates.crt file, so I am not sure what is going on yet. |
Send message Joined: 28 Jul 22 Posts: 4 Credit: 7,756,599 RAC: 0 |
Looks like the issue is the server is not sending the full certificate chain. See this link for details: https://www.ssllabs.com/ssltest/analyze.html?d=milkyway-new.cs.rpi.edu The server needs to be configured to send the "InCommon RSA Server CA 2" intermediate CA certificate as well. Looks like there are instructions on how to configure Apache here: https://www.digicert.com/kb/csr-ssl-installation/apache-openssl.htm You'll need a .crt file containing the entire certificate chain. |
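For anyone assembling the ".crt file containing the entire certificate chain" mentioned above, here is a minimal Python sketch of joining PEM files in leaf-first order. The helper names `split_pem` and `build_fullchain` are made up for illustration, and the sketch only concatenates text: it does not check that the intermediate actually signed the server certificate, and how the resulting file is referenced depends on the web server configuration.

```python
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem(text):
    """Return every certificate block found in a PEM string, in order."""
    return PEM_CERT.findall(text)


def build_fullchain(leaf_pem, intermediate_pems):
    """Join the leaf certificate and its intermediates, leaf first.

    leaf_pem must contain exactly one certificate; each entry in
    intermediate_pems must contain at least one.
    """
    blocks = split_pem(leaf_pem)
    if len(blocks) != 1:
        raise ValueError("expected exactly one leaf certificate")
    for pem in intermediate_pems:
        found = split_pem(pem)
        if not found:
            raise ValueError("intermediate input contains no certificate")
        blocks.extend(found)
    return "\n".join(blocks) + "\n"
```

The function takes the file contents as strings, so reading the server certificate and intermediate files and writing the result out is left to the caller.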
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
"Looks like the issue is the server is not sending the full certificate chain." I see the full chain. Try this: openssl s_client -connect milkyway-new.cs.rpi.edu:443 -showcerts |
Send message Joined: 28 Jul 22 Posts: 4 Credit: 7,756,599 RAC: 0 |
Yeah, openssl s_client seems to report the full chain, but it also reports the error "Verify return code: 21 (unable to verify the first certificate)". The docs I read all point to the server not sending the full chain when that error is reported. One blog post said to use the website I linked to check, and sure enough it reports that the immediate issuing certificate is not being presented by the server. |
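The same check can be reproduced without curl or openssl. Below is a rough Python sketch that attempts a verified TLS handshake against the system trust store and reports the verification code and message on failure. The function name `probe` is made up for illustration, and the exact code and wording reported depend on the local OpenSSL build and trust store, so they may not match the output quoted above.

```python
import socket
import ssl


def probe(host, port=443, timeout=5.0):
    """Try a verified TLS handshake and describe the outcome as a string."""
    context = ssl.create_default_context()  # loads the system trust store
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return f"verified OK ({tls.version()})"
    except ssl.SSLCertVerificationError as exc:
        # verify_code and verify_message come from the underlying OpenSSL check.
        return f"verification failed: code {exc.verify_code}: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS failures, refused connections, timeouts and other TLS errors.
        return f"connection failed: {exc}"


if __name__ == "__main__":
    print(probe("milkyway-new.cs.rpi.edu"))
```

Running it on a machine that works (such as the Mac mentioned earlier) and on one that fails should make it easy to compare the two results.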
Send message Joined: 8 Jan 20 Posts: 11 Credit: 6,164,641 RAC: 0 |
Ahhhh... The second cert in the chain is not correct. The server cert is signed by: C = US, O = Internet2, CN = InCommon RSA Server CA 2 But the chain has: C = US, ST = MI, L = Ann Arbor, O = Internet2, OU = InCommon, CN = InCommon RSA Server CA Nick |
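The mismatch Nick describes can be stated as a simple rule: each certificate's issuer must equal the subject of the next certificate in the chain. Here is a minimal Python sketch of that rule using the two names quoted above. It compares name strings only and does not verify signatures; `CertNames` and `find_chain_breaks` are hypothetical names, and the placeholder strings stand in for values not shown in the thread.

```python
from typing import NamedTuple


class CertNames(NamedTuple):
    subject: str
    issuer: str


def find_chain_breaks(chain):
    """Return each index i where chain[i].issuer differs from chain[i + 1].subject."""
    return [
        i
        for i in range(len(chain) - 1)
        if chain[i].issuer != chain[i + 1].subject
    ]


# Names as quoted in the post above; placeholders mark values not given there.
served = [
    CertNames(
        subject="(server certificate, subject not shown in the thread)",
        issuer="C = US, O = Internet2, CN = InCommon RSA Server CA 2",
    ),
    CertNames(
        subject="C = US, ST = MI, L = Ann Arbor, O = Internet2, "
        "OU = InCommon, CN = InCommon RSA Server CA",
        issuer="(not shown in the thread)",
    ),
]

print(find_chain_breaks(served))  # prints [0]: the leaf's issuer is not the next subject
```

A chain whose second entry has the subject "C = US, O = Internet2, CN = InCommon RSA Server CA 2" would produce an empty list instead.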
Send message Joined: 28 Jul 22 Posts: 4 Credit: 7,756,599 RAC: 0 |
Wow, good catch. That is exactly the issue. |
Send message Joined: 18 Jun 09 Posts: 35 Credit: 11,811,888 RAC: 0 |
There's a problem with the certificate chain on the new server. I can confirm that after finagling with the local system certificate store and adding those certificates, lo and behold, BOINC Manager can connect to the Milkyway project. So this needs a certificate fix on the server side ASAP. |
Send message Joined: 7 Aug 22 Posts: 9 Credit: 20,033,952 RAC: 0 |
There are some errors on the pages that list all the tasks. For example, here: Deprecated: Implicit conversion from float-string "637.401947" to int loses precision in /home/boinc/boinc/milkyway/html/inc/util.inc on line 424 Deprecated: Implicit conversion from float-string "1753.75" to int loses precision in /home/boinc/boinc/milkyway/html/inc/util.inc on line 424 https://milkyway-new.cs.rpi.edu/milkyway/result.php?resultid=930340479 |
Send message Joined: 9 Aug 22 Posts: 80 Credit: 2,553,087 RAC: 7,641 |
milkyway-new.cs.rpi.edu is the master address but using just milkyway.cs.rpi.edu will also reach milkyway-new. This is the way the security team here at RPI set it up. |
©2024 Astroinformatics Group