BOINC in Docker Swarm
Joined: 27 Feb 22 | Posts: 18 | Credit: 2,967,695 | RAC: 0
So I run a 6-node Docker Swarm (6 VMs running Ubuntu 18.04.6 LTS), which hosts 6 Docker containers running BOINC on a mature IBM 3550 M4. Today I lost quorum in the cluster, which forced all containers to consolidate onto 1 node. However, I noticed that despite the containers being active and the BOINC service still running in each container, no processing was happening.

The first thing I tried, after fixing the issue with the other 5 nodes, was to force a rebalance across the cluster.

    :~# docker service update --force boinc
    boinc
    overall progress: 6 out of 6 tasks
    1/6: running [==================================================>]
    2/6: running [==================================================>]
    3/6: running [==================================================>]
    4/6: running [==================================================>]
    5/6: running [==================================================>]
    6/6: running [==================================================>]
    verify: Service converged

Once the containers were spread evenly across the swarm and rebalanced, I waited, but still no activity. So I hopped on BOINC Manager, connected to each container, and verified the service and project. I forced an update, but the containers just wouldn't check in. This is where I learned how picky BOINC running in Docker can be. I checked the logs and, come to find out, the consolidation (not the rebalance) had actually changed the container names, so they no longer matched what was listed in the MilkyWay project under "View Computers". So while annoying, not really a big deal: since it's Docker, just trash all 6 containers and redeploy.
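For anyone wanting to confirm how the tasks are spread across nodes before and after a forced update, here is a rough sketch. It assumes the service is named "boinc" as above and that `docker service ps` is run on a manager node; the helper name `tasks_per_node` is made up for illustration.

```shell
#!/bin/sh
# Sketch: count running tasks per node from `docker service ps` output.
# Expects lines of the form "<node> <current state...>" on stdin,
# e.g. "node1 Running 5 minutes ago".
# tasks_per_node is a hypothetical helper name, not part of Docker.
tasks_per_node() {
    awk '$2 == "Running" { count[$1]++ }
         END { for (n in count) print n, count[n] }' | sort
}

# Usage on a swarm manager (service name "boinc" as in the post):
#   docker service ps boinc --filter desired-state=running \
#       --format '{{.Node}} {{.CurrentState}}' | tasks_per_node
```

If one node shows all 6 tasks, the swarm is still consolidated; an even spread should show one task per node.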
    :~# docker service rm boinc
    boinc
    :~# docker service ls
    ID NAME MODE REPLICAS IMAGE PORTS
    :~# docker service create --replicas 6 --name boinc --network=boinc -p xxxx -e BOINC_GUI_RPC_PASSWORD="xxxxxxxxxxxxx" -e BOINC_CMD_LINE_OPTIONS="--allow_remote_gui_rpc" boinc/xxxxxxxxxxxxxx
    overall progress: 6 out of 6 tasks
    1/6: running [==================================================>]
    2/6: running [==================================================>]
    3/6: running [==================================================>]
    4/6: running [==================================================>]
    5/6: running [==================================================>]
    6/6: running [==================================================>]
    verify: Service converged

Then, once the new containers were deployed, I just had to register them, under their new names, with MilkyWay@home.

    :~# docker run --rm --network boinc boinc/client boinccmd_swarm --passwd xxxxxxxxxxxxx --project_attach http://milkyway.cs.rpi.edu/milkyway/ <insert keys here, no you can't see mine>
    ===== Client at M.Y.I.P2 =====
    ===== Client at M.Y.I.P2 =====
    ===== Client at M.Y.I.P3 =====
    ===== Client at M.Y.I.P4 =====
    ===== Client at M.Y.I.P5 =====
    ===== Client at M.Y.I.P6 =====

Boom, activity seen after a few minutes, and they showed up under "View Computers". After that, it was just a matter of cleanup: merging the old computer IDs with the new computer IDs.

And yeah, I know it's a complicated and odd way to run BOINC, but idle hands and all... Figured I would share in case anyone else was running BOINC in Docker.

-PD

PS: Please excuse the redacted parts, etc...etc..
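As a possible follow-up check after the attach step, the sketch below counts how many distinct clients answered. It assumes each client's section starts with a "===== Client at <address> =====" header, as in the output above, and that `boinccmd_swarm` forwards `--get_project_status` to every client the same way it forwarded `--project_attach`; that second point is an assumption, not something verified here. The helper name `distinct_clients` is made up for illustration.

```shell
#!/bin/sh
# Sketch: count distinct client addresses in boinccmd_swarm-style output.
# Assumes each client section starts with a header line such as
#   ===== Client at 10.0.0.2 =====
# distinct_clients is a hypothetical helper name.
distinct_clients() {
    sed -n 's/^===== Client at \([^ ]*\) =====.*/\1/p' | sort -u | wc -l | tr -d ' '
}

# Usage (assumption: boinccmd_swarm passes --get_project_status through):
#   docker run --rm --network boinc boinc/client boinccmd_swarm \
#       --passwd "$BOINC_GUI_RPC_PASSWORD" --get_project_status | distinct_clients
```

A count lower than the replica count (6 here) would suggest that a client was not reached or that an address was reported more than once.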
©2025 Astroinformatics Group