Welcome to MilkyWay@home

BOINC in Docker Swarm

Peter Dragon
Joined: 27 Feb 22
Posts: 18
Credit: 2,967,695
RAC: 0
Message 73169 - Posted: 29 Apr 2022, 23:55:34 UTC
Last modified: 29 Apr 2022, 23:57:04 UTC

So I run a 6-node Docker Swarm (6 VMs running Ubuntu 18.04.6 LTS) on a mature IBM 3550 M4, which hosts 6 Docker containers running BOINC. Today I lost quorum in the cluster, which forced all containers to consolidate onto 1 node. However, I noticed that even though the containers were active and the BOINC service was still running in each one, no processing was happening.
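
For anyone wondering how many nodes can drop before quorum goes, here is a rough sketch of the arithmetic. Swarm managers use Raft, so a majority of managers has to stay reachable. The function names are just made up for illustration, and the 6-manager case assumes every node is a manager, which may not match every setup.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any Docker tooling.

# Majority of managers needed to keep a Raft quorum among N managers.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Number of manager failures the swarm can absorb while keeping quorum.
failures_tolerated() {
    echo $(( ($1 - 1) / 2 ))
}

# Compare a few manager counts; 6 assumes all six nodes are managers.
for n in 3 5 6 7; do
    echo "managers=$n quorum=$(quorum_needed "$n") tolerates=$(failures_tolerated "$n")"
done
```

Note that under this arithmetic 6 managers tolerate the same 2 failures as 5 would, which is why odd manager counts are usually suggested.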

The first thing I tried, after fixing the issue with the other 5 nodes, was to force a rebalance across the cluster.

:~# docker service update --force boinc
boinc
overall progress: 6 out of 6 tasks
1/6: running   [==================================================>]
2/6: running   [==================================================>]
3/6: running   [==================================================>]
4/6: running   [==================================================>]
5/6: running   [==================================================>]
6/6: running   [==================================================>]
verify: Service converged


Once the containers were rebalanced and spread evenly across the swarm, I waited, but there was still no activity. So I hopped on BOINC Manager, connected to each container, and verified the service and project. I forced an update, but the containers just wouldn't check in.

This is where I learned how picky BOINC running in Docker can be. I checked the logs and, come to find out, the consolidation (not the rebalance) had actually changed the container names, so they no longer matched what was listed in the MilkyWay project under "View Computers".
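
If you want to spot this kind of mismatch without eyeballing it, something like the sketch below could work. It assumes you have two hand-made text files, one name per line: project.txt copied from "View Computers" and current.txt collected from the containers (for example by running hostname inside each one). Both file names and the function name are hypothetical.

```shell
#!/bin/sh
# Sketch only: missing_names, project.txt and current.txt are hypothetical.

# Print names that appear in the first file but not in the second.
missing_names() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort -u "$1" > "$a"      # comm needs sorted input
    sort -u "$2" > "$b"
    comm -23 "$a" "$b"       # lines unique to the first file
    rm -f "$a" "$b"
}

# Usage (assumed file names):
#   missing_names project.txt current.txt   # registered names no container has
#   missing_names current.txt project.txt   # container names not registered
```

Any output from the first call means the project still lists names that no running container reports.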

So while annoying, it's not really a big deal. Since it's Docker, just trash all 6 containers and redeploy...

:~# docker service rm boinc
boinc

:~# docker service ls
ID        NAME      MODE      REPLICAS   IMAGE     PORTS

:~# docker service create --replicas 6 --name boinc --network=boinc -p xxxx -e BOINC_GUI_RPC_PASSWORD="xxxxxxxxxxxxx" -e BOINC_CMD_LINE_OPTIONS="--allow_remote_gui_rpc" boinc/xxxxxxxxxxxxxx
overall progress: 6 out of 6 tasks
1/6: running   [==================================================>]
2/6: running   [==================================================>]
3/6: running   [==================================================>]
4/6: running   [==================================================>]
5/6: running   [==================================================>]
6/6: running   [==================================================>]
verify: Service converged

Then, once the new containers were deployed, I just had to register them, under their new names, with MilkyWay@home.

:~# docker run --rm --network boinc boinc/client boinccmd_swarm --passwd xxxxxxxxxxxxx --project_attach http://milkyway.cs.rpi.edu/milkyway/ <insert keys here, no you can't see mine>
===== Client at M.Y.I.P1 =====
===== Client at M.Y.I.P2 =====
===== Client at M.Y.I.P3 =====
===== Client at M.Y.I.P4 =====
===== Client at M.Y.I.P5 =====
===== Client at M.Y.I.P6 =====
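
A quick way to confirm that every replica answered is to count the "Client at" header lines in that output. This is only a sketch: count_clients is a made-up helper, the sample addresses below are invented, and it assumes the header format stays exactly as shown above.

```shell
#!/bin/sh
# Sketch only: count_clients is a hypothetical helper.

# Count "===== Client at ... =====" header lines in text read from stdin.
count_clients() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c '^===== Client at ' || true
}

# Example with invented addresses; prints 2.
printf '%s\n' \
    '===== Client at 10.0.0.2 =====' \
    '===== Client at 10.0.0.3 =====' | count_clients
```

Piping the real command's output into count_clients should give 6 for a 6-replica service if every client responded.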


Boom, activity showed up after a few minutes, and the containers appeared under "View Computers". After that, it was just a matter of cleanup: merging the old computer IDs with the new computer IDs.

And yeah, I know it's a complicated and odd way to run BOINC, but idle hands and all... Figured I would share in case anyone else is running BOINC in Docker.

-PD

PS: Please excuse the redacted parts, etc...etc..

©2024 Astroinformatics Group