Welcome to MilkyWay@home

Posts by Martin

1) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 75036)
Posted 8 Feb 2023 by Martin
Post:
My Radeon VII is starving

I checked the tasks for your computer a few times now, at different times of the day, and every time I checked there were over 300 in-progress tasks assigned to your computer. Don't you have them on your computer? 300 is IIRC the limit per GPU, and a constant cache of 300+ tasks should be enough to feed even your GPU. Sure, not for very long, but I guess 13 is the amount your GPU does between each request?



Yes, everything has been "Ok" for almost 2 days now.

On the 6th, I enabled the event log option sched_op_debug and soon saw that the 13 tasks I was getting at regularly spaced intervals were estimated by the servers to take my GPU approx. 510 seconds to complete. What the servers didn't realize was that the GPU was crunching 3 tasks at a time; they were taking the ~39-40 second completion time each task showed and using it as the estimate for a single task crunching on its own.

After seeing that (and realizing what it was doing) I soon started running 1 task at a time. The 20-22 seconds it takes for 1 separation task to finish must have started changing the servers' completion time estimates. It wasn't long before the In progress cache started increasing. After it got to about 120 or so, I switched back to running BOINC 7.21.0 and started running 3 tasks at a time again.

Now, as you saw, the cache is hovering steadily around 300 again. And that is happening because I am getting quite frequent downloads of 7-13 tasks at a time, which is how the machine was running prior to whatever caused the servers to 'slowdown' task deliveries here.
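For anyone wanting to check the arithmetic above, here is a rough sketch. The per-task times are the ones from this post; the idea that the server simply multiplies the reported completion time by the batch size is my assumption, not something taken from the scheduler code, and the function names are made up for illustration.

```python
import math

TASKS_PER_REQUEST = 13       # batch size seen in the event log
CONCURRENT_TASKS = 3         # tasks the GPU crunches side by side
SECONDS_WHEN_SHARING = 39.5  # per-task time with 3 running together (~39-40 s)
SECONDS_WHEN_ALONE = 21.0    # per-task time with 1 running alone (~20-22 s)


def serial_estimate(n_tasks, seconds_per_task):
    """Assumed server view: tasks run strictly one after another."""
    return n_tasks * seconds_per_task


def concurrent_wall_time(n_tasks, seconds_per_task, concurrent):
    """Wall-clock time when `concurrent` tasks run at once, in full rounds."""
    return math.ceil(n_tasks / concurrent) * seconds_per_task


server_view = serial_estimate(TASKS_PER_REQUEST, SECONDS_WHEN_SHARING)
actual = concurrent_wall_time(TASKS_PER_REQUEST, SECONDS_WHEN_SHARING, CONCURRENT_TASKS)
single = serial_estimate(TASKS_PER_REQUEST, SECONDS_WHEN_ALONE)

print(f"estimate if 39.5 s is treated as a single-task time: {server_view} s")
print(f"wall time with 3 tasks at once:                     {actual} s")
print(f"estimate after switching to 1 task at a time:       {single} s")
```

With these numbers the first figure is 513.5 s, close to the ~510 s shown by sched_op_debug, while the GPU actually needs about 197.5 s for the same 13 tasks.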

Martin
2) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 75028)
Posted 7 Feb 2023 by Martin
Post:
I finally just re-installed BOINC, program dir and data dir, on the WD Elements drive. This allows me to change to any C: drive without interfering at all with BOINC, as long as the OS running is a 64-bit version of Windows.

I haven't figured out the task downloading issues yet. But maybe my downloads have tanked because of what Mikey was trying to alert me about: that the servers might think there is something 'fishy' with this user/PC/setup after the OS changes and disk size changes?

Martin
3) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 75027)
Posted 6 Feb 2023 by Martin
Post:
06/02/2023 14:45:51 | Milkyway@Home | Scheduler request completed: got 26 new tasks

Just got 26 tasks, all that was needed to fill up my cache. So it's possible to get more than 13. Perhaps this might help you in case the issue won't go away by itself.



Sorry, I didn't explain the differences very well.

BOINC 7.21.0 is a modified 7.20.2 designed to bypass the server's wait time between task downloads. Joseph Stateson made the mod and you can read about it in the Windows > "GPU in non-continuous operation" thread. Joseph explains his mod: "All I have done is modify the scheduling algorithm in BOINC to bypass that 91 second delay that the Milkyway server wants."

7.21.0 was working beautifully on 3 Feb, and one typical hour of task downloads from the event log that day looks like this:

03-Feb-2023 10:02:00 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:03:38 [Milkyway@Home] Scheduler request completed: got 6 new tasks
03-Feb-2023 10:06:53 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:08:30 [Milkyway@Home] Scheduler request completed: got 7 new tasks
03-Feb-2023 10:11:44 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:13:22 [Milkyway@Home] Scheduler request completed: got 7 new tasks
03-Feb-2023 10:16:36 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:18:14 [Milkyway@Home] Scheduler request completed: got 6 new tasks
03-Feb-2023 10:21:29 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:23:06 [Milkyway@Home] Scheduler request completed: got 8 new tasks
03-Feb-2023 10:26:21 [Milkyway@Home] Scheduler request completed: got 13 new tasks
03-Feb-2023 10:27:58 [Milkyway@Home] Scheduler request completed: got 7 new tasks
03-Feb-2023 10:31:13 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:32:50 [Milkyway@Home] Scheduler request completed: got 6 new tasks
03-Feb-2023 10:36:05 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:37:42 [Milkyway@Home] Scheduler request completed: got 9 new tasks
03-Feb-2023 10:40:56 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:42:32 [Milkyway@Home] Scheduler request completed: got 6 new tasks
03-Feb-2023 10:45:46 [Milkyway@Home] Scheduler request completed: got 17 new tasks
03-Feb-2023 10:47:22 [Milkyway@Home] Scheduler request completed: got 7 new tasks
03-Feb-2023 10:50:35 [Milkyway@Home] Scheduler request completed: got 16 new tasks
03-Feb-2023 10:52:11 [Milkyway@Home] Scheduler request completed: got 6 new tasks
03-Feb-2023 10:55:30 [Milkyway@Home] Scheduler request completed: got 15 new tasks
03-Feb-2023 10:57:07 [Milkyway@Home] Scheduler request completed: got 8 new tasks

Today, 6 Feb, the same hour looks like this:

06-Feb-2023 10:07:14 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:19:03 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:38:30 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:53:56 [Milkyway@Home] Scheduler request completed: got 13 new tasks

The number of tasks being downloaded (when I do receive any tasks) is not the central problem; it's the frequency of getting new work that is the main issue. On Feb 3rd during the 10 o'clock hour, I received 264 new tasks. On the 6th, only 52.
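The hourly totals can be tallied straight from the event log rather than by hand. This is only a sketch: the regular expression is written against the line format pasted above, the function name is made up, and the sample is just the four lines from the 6th.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 06-Feb-2023 10:07:14 [Milkyway@Home] Scheduler request completed: got 13 new tasks
LINE_RE = re.compile(
    r"(?P<date>\d{2}-[A-Za-z]{3}-\d{4}) (?P<hour>\d{2}):\d{2}:\d{2} "
    r"\[Milkyway@Home\] Scheduler request completed: got (?P<n>\d+) new tasks"
)


def tasks_per_hour(log_text):
    """Sum the 'got N new tasks' counts for each (date, hour) pair."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            totals[(match["date"], match["hour"])] += int(match["n"])
    return dict(totals)


sample = """\
06-Feb-2023 10:07:14 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:19:03 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:38:30 [Milkyway@Home] Scheduler request completed: got 13 new tasks
06-Feb-2023 10:53:56 [Milkyway@Home] Scheduler request completed: got 13 new tasks
"""

print(tasks_per_hour(sample))  # {('06-Feb-2023', '10'): 52}
```

Feeding it the 24 lines from 3 Feb gives 264 for the same hour.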

My Radeon VII is starving! My task cache (In Progress) was floating around 300 before, with new work always available; now it is at 0 and the VII is sitting idle most of the time.

Martin
4) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 75023)
Posted 5 Feb 2023 by Martin
Post:

It's not that complicated, don't need to figure out anything actually. Simply run the installer on each Windows installation and point to the data directory on the shared drive, one dir for 32-bit and one for 64-bit. Program files can just go to the current C: or if you want to save some space, same like the data on shared drive. The installer does everything for you, that's not Linux. ;-)

The actual question is how the projects servers will react to OS changes, but I don't recall any issues with that after upgrade from Win7 to Win10.



It works ! Just as you said.

The shared data directory stays intact during new BOINC installs and between changes of the C: drive. So, whether BOINC is running or not, just shut down (or restart), install BOINC on the new C: if needed, and run BOINC.

The only 'issue' I've noticed is the servers are now sending just 13 tasks on each successful work request. The 91 second x2 wait between getting work seems to be enforced again, in spite of having BOINC 7.21.0 running. Prior to the first C: change, I did complete all tasks (~300) before proceeding. Now I'm running out of tasks faster than receiving a new batch of 13.

Thanks again, Link !

Martin
5) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 74993)
Posted 2 Feb 2023 by Martin
Post:
You can hook up a 2nd drive either internally or externally and do a total install of Boinc on it but some things still have to go in the Windows installation, Windows is bad about that!! If you can figure out which files those are and copy them to the right places every time your C Drive changes it should work.

It's not that complicated, don't need to figure out anything actually. Simply run the installer on each Windows installation and point to the data directory on the shared drive, one dir for 32-bit and one for 64-bit. Program files can just go to the current C: or if you want to save some space, same like the data on shared drive. The installer does everything for you, that's not Linux. ;-)

The actual question is how the projects servers will react to OS changes, but I don't recall any issues with that after upgrade from Win7 to Win10.


Link, that almost sounds too easy !

I think for now I'll skip the 32-bit BOINC option. May add too many complications.

Thanks !

Martin
6) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 74990)
Posted 2 Feb 2023 by Martin
Post:

It does NOT sound like cheating to me!! You can hook up a 2nd drive either internally or externally and do a total install of Boinc on it but some things still have to go in the Windows installation, Windows is bad about that!! If you can figure out which files those are and copy them to the right places every time your C Drive changes it should work. There is a Boinc Alpha software testing email group and this is what it says at the bottom of each email:
You received this message because you are subscribed to the Google Groups "boinc_alpha" group.
To unsubscribe from this group and stop receiving emails from it, send an email to boinc_alpha+unsubscribe@ssl.berkeley.edu.
To view this discussion on the web visit https://groups.google.com/a/ssl.berkeley.edu/d/msgid/boinc_alpha/CAFtuuLQLATEcTeXLh%3DwxhUtBQokyU9ueHg-QdEoyfTC93hfTfg%40mail.gmail.com.

Hopefully you can figure out to ask to join from one of those links and then ask which files get installed outside the Boinc folders and where they need to go. I'm guessing I used something like this to subscribe..boinc_alpha+subscribe@ssl.berkeley.edu but it's been so long since I did it I'm not sure.



Great info, Thanks !

I'll check it out.

Martin
7) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 74985)
Posted 1 Feb 2023 by Martin
Post:
If you are thinking of downloading tasks on one pc and then running them on a different pc there are safeguards against doing that as it's one of the ways people used to cheat big time back when Seti was young. YES there are people that do it today but I don't know how or if it's allowed or ignored at certain projects.


No, sir!! I have no intention of cheating or doing anything that might be construed as cheating.

My idea is to run BOINC on a single desktop from a single central location, the single desktop will just have a different C: drive occasionally. That's the idea. No networking in or out of the central BOINC location, no separate machines attempting to operate as one or anything like that.

If that seems like cheating, let me know and I'll forget the whole idea.

Martin
8) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 74978)
Posted 31 Jan 2023 by Martin
Post:
Thanks, Link!

I think I'll continue to ponder this idea some more before starting the "experiment". If I do proceed, I'll try to start as simply as possible and report what happens.

Martin
9) Questions and Answers : Windows : Make any "C:" drive run BOINC from a central location (Message 74976)
Posted 31 Jan 2023 by Martin
Post:
I have many Windows drives (3.5" HDD, SSD, NVMe) with different versions installed (Win XP 32-bit, Win 7 64-bit and up) that I use as "C:" when doing different things.

I would like to have BOINC fully installed on an external USB backup drive (or maybe a simple thumb drive?) and have the necessary BOINC bits installed on each of my drives, pointing to the central location, so I can run BOINC regardless of which drive is "C:" at the moment (only one drive at a time will be accessing the central location).

Would a simple desktop shortcut on each drive be all that's needed ?

Martin
10) Message boards : Number crunching : Future of Milkyway@Home (Message 74956)
Posted 29 Jan 2023 by Martin
Post:
Yes.... what is the alternative?
I assume that nothing will change but over time people will upgrade their GPU's but that will result in less work being completed for this project.

I'm hoping my Radeon VII's kick on for a while longer but any replacement card, won't be as good for M@H.



There are a couple of possible alternatives:

I believe the AMD MI50 and MI60 are comparable to the VII. The MI60 has 32GB of HBM2 memory (and a 1/2 FP64 divider, which gives it a 7+ TFLOPS FP64 rating, much faster than the VII's 1/4 FP64 divider). The MI50 has 16GB of HBM2 memory, just as the VII does.

Both are rated at 300 Watts, same as VII.

Info from TechPowerUp and various other web pages.

I have my VII undervolted to under 200 Watts and it still performs fine.

Martin
11) Message boards : Number crunching : AMD VII: Occasional a task never finishes and is "hot spot" too high? (Message 74925)
Posted 19 Jan 2023 by Martin
Post:
I finally got the AMD performance software to work. Problem was after driver install, I had 5 "ghost" GPU cards. I had to edit the coproc_info.xml file, remove the extra 5 GPUs and then mark the file read only so that BOINC would not be able to add the "extra" GPUs' back in.

The driver I used was "win10-radeon-pro-software-enterprise-21.Q2.1"
Possibly using a device driver cleaner and a re-install might fix the coproc_info.xml problem. I got duplicate GPUs due to BOINC (or clinfo) seeing two drivers instead of one, so it marked my system as having two OpenCL platforms, and I had almost 500 errored-out tasks before I could suspend the project and fix the problem.

What driver(s) are you using? Does one card have one version and a different card have another?

Anyway, this is the display of 4 of the 5 GPUs. The 5th would not fit in the screen capture. Note the junction temp, the so-called "hot spot".
Tuning is only available for the VII. I have not tried any tuning yet.
Is it the same app you are using?



No, I'm using an AMD Adrenalin edition, 22.6.1.0. I only have the one GPU and it also is driving the monitor. I have not had any big problems with this software.
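
The read-only step from the quoted post can also be scripted. A minimal sketch, assuming you pass in the path to your own coproc_info.xml (it lives in the BOINC data directory); the helper names are made up, and the demo below uses a throwaway file rather than a real BOINC install.

```python
import os
import stat
import tempfile


def make_read_only(path):
    """Clear the write bits; on Windows this sets the file's read-only flag."""
    os.chmod(path, stat.S_IREAD)


def is_read_only(path):
    """True when no write permission bit is set on the file."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    return os.stat(path).st_mode & write_bits == 0


# Demonstration on a placeholder file standing in for coproc_info.xml.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "coproc_info.xml")
    with open(demo, "w") as fh:
        fh.write("placeholder\n")
    make_read_only(demo)
    print(is_read_only(demo))
    # Restore write access so the temporary directory can be cleaned up.
    os.chmod(demo, stat.S_IREAD | stat.S_IWRITE)
```

Stop BOINC before editing the real file, and remember to clear the flag again if the hardware ever changes.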
12) Message boards : Number crunching : AMD VII: Occasional a task never finishes and is "hot spot" too high? (Message 74923)
Posted 19 Jan 2023 by Martin
Post:
I am running 4 work units per GPU: one AMD VII and several AMD S9xxx boards.

About once every 4-5 days a task hangs on the VII. It is easily fixed by suspending and then resuming the task. The VII is the fastest board, 2x as fast as the S9150, and the problem occurs only on the VII. I used a BoincTasks "rule" to automatically suspend any MW task taking over 5 minutes, which allows the card to continue processing 4 tasks as normal instead of 3 and a hung task.

1 - Has anyone seen a problem like this before?

2 - GPU-Z has a feature that shows the "hot spot". My RTX-2080Ti and the VII card are the only ones that report a "hot spot". My other Nvidia and AMD cards lack that feature, probably because they are too old. The VII has 3 fans and is in an open frame rack, and there is a box fan cooling the rack. Its "hot spot" runs 102-107 °C. The RTX-2080Ti is in an "Area51" case that is cramped. It shows 80 °C for its hot spot. Do these values seem ok?


Some reviews I've read have mentioned that the VII is way overvolted out of the box.

I undervolt my VII and run only 3 tasks at a time. Powering the VII with just 1006 mV and adjusting the fan curve so noise is hardly noticeable, the temps haven't recently been above 88 °C for the hotspot (from what I've read, there are several "hotspot" sensors on the GPU chip and the chip's logic reports the hottest one at the moment) and 63 °C for memory. Ambient temp is 73-74 °F.

I use HWMonitor mostly to keep an eye on what's happening system-wide, and I'm always trying to tweak performance bit by bit.
13) Questions and Answers : Unix/Linux : CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04 (Message 74797)
Posted 14 Dec 2022 by Martin
Post:
Maybe a dumb idea, but could you run Windows in a virtual machine and put BOINC and AMD's OpenCL-compatible drivers on it?

Martin
14) Message boards : Number crunching : Daily graphs of server_status (Message 74612)
Posted 31 Oct 2022 by Martin
Post:
The downward trend has resumed


I wonder if they are rebooting the Server on a regular basis to make it happen?



I haven't noticed any stutters in receiving work, but now I see that Waiting for validation is down over half a million just since this morning !

Trends are Good, when going in the right direction...
15) Message boards : Number crunching : Daily graphs of server_status (Message 74598)
Posted 30 Oct 2022 by Martin
Post:
Waiting for validation is down nearly 20k in just the last 4 hours.
16) Message boards : Number crunching : Validation Pending too many tasks (Message 74578)
Posted 27 Oct 2022 by Martin
Post:
"Waiting for Validation" is down ~500,000 since the last time I looked at it; on 10/23 it was around 5.8 million.

Looks like the queue is clearing out nicely. So far.

Martin
17) Message boards : Number crunching : Validation Pending too many tasks (Message 74516)
Posted 20 Oct 2022 by Martin
Post:
Suddenly, current tasks I've just returned are showing Completed and validated !

Martin




©2024 Astroinformatics Group