Welcome to MilkyWay@home

Posts by Keith Myers

41) Message boards : Number crunching : ATI RX 580 GPU crunching? (Message 75185)
Posted 22 Mar 2023 by Keith Myers
Post:
Yes, the MW Separation app is rather poorly optimized. A single task won't use very much of the gpu at any time. It really needs at least 2X running on the card to get the gpu utilization up closer to 90%. 2X is about the max for Nvidia cards because of how the Nvidia drivers have to virtualize their OpenCL component to run the Separation app.

AMD or ATI cards OTOH can typically run at least 4X or maybe 5X, depending on how much memory the card has. The app doesn't really use all that much memory. I don't have any running currently to tell you exactly how much the app uses. I remember it's about 1.6GB, I think, but don't quote me.
42) Questions and Answers : Unix/Linux : Problem switching from Windows to Ubuntu: ROCm question (Message 75182)
Posted 22 Mar 2023 by Keith Myers
Post:
I know that AMD cards are a pain to work with, but are you aware of this set of gpu utilities focused primarily on AMD cards? It would give you fan and clock control, for one thing, and some great displays of card info and graphs.

https://github.com/Ricks-Lab/gpu-utils

I've helped the developer out extensively on its ability to work on Nvidia cards also.
43) Questions and Answers : Unix/Linux : Problem switching from Windows to Ubuntu: ROCm question (Message 75171)
Posted 20 Mar 2023 by Keith Myers
Post:
I haven't heard of anybody being able to use those mining motherboards efficiently for BOINC crunching. The 1X slots are just too restrictive.
44) Questions and Answers : Preferences : milkyway_separation 1.46 Linux x86_64 double OpenCL (Message 75170)
Posted 20 Mar 2023 by Keith Myers
Post:
You can always look in your client_state.xml file for the app_name.
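
A minimal sketch of that lookup, assuming client_state.xml keeps each application in an <app> element with a <name> child. The sample text, the app names in it, and the list_app_names helper are made up for illustration; check them against your own file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for a real client_state.xml.
SAMPLE = """<client_state>
  <app>
    <name>milkyway</name>
    <user_friendly_name>Milkyway@home Separation</user_friendly_name>
  </app>
  <app>
    <name>milkyway_nbody</name>
    <user_friendly_name>Milkyway@home N-Body Simulation</user_friendly_name>
  </app>
</client_state>"""


def list_app_names(xml_text):
    """Return the <name> of every <app> element found in the text."""
    root = ET.fromstring(xml_text)
    names = []
    for app in root.iter("app"):
        name = app.findtext("name")
        if name:
            names.append(name.strip())
    return names


print(list_app_names(SAMPLE))
```

On a real host you would read the file from the BOINC data directory instead of using SAMPLE. The location depends on how BOINC was installed, so treat any path as something to verify.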
45) Questions and Answers : Windows : task is running infinite like in a loop (Message 75161)
Posted 16 Mar 2023 by Keith Myers
Post:
The N-body tasks prefer to utilize all your cpu at all times. So if you are trying to run other cpu tasks or using your computer for other things at the same time, the tasks can stall out as you've noticed.

To see whether your gpu is being used, simply look in the BOINC Manager for any running gpu tasks.

The Windows Task Manager can also show the gpu usage if you toggle its pages to show gpu utilization.
46) Questions and Answers : Unix/Linux : Compute errors (Message 75093)
Posted 1 Mar 2023 by Keith Myers
Post:
Make sure the software-updater didn't pull the Nvidia drivers out from underneath the running task.

There seems to be an urgent security update going out to hosts today, with the new Nvidia driver 525.89.02 being pushed out.

I've already caught two hosts with all the downloads in place and the updater just awaiting an acknowledgement to proceed.
47) Questions and Answers : Unix/Linux : CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04 (Message 75083)
Posted 23 Feb 2023 by Keith Myers
Post:
I would state that it is more than likely that most BOINC crunchers crunch on more than one project. I would guess that crunchers who only crunch Milkyway are very few, or at least very much in the minority.

So not having the most efficient gpu for just MW is not a consideration. There are a few projects that only offer CUDA applications so that leaves any AMD card out of consideration.

Based on your supposition of a MW-only cruncher, I would state it is more efficient to use the much older AMD generations of gpus that have high FP64 rankings. They would be the most cost effective but wouldn't be useful for anything other than crunching.
48) Questions and Answers : Unix/Linux : CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04 (Message 75081)
Posted 22 Feb 2023 by Keith Myers
Post:
I couldn't care less about theoretical FP64 specifications. I would just examine the actual 1X computation times for both cards. You won't see the 4090 card turning in 2X the computation time of the 7950 XTX.
49) Questions and Answers : Unix/Linux : CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04 (Message 75076)
Posted 22 Feb 2023 by Keith Myers
Post:
Don't feel like you just lack the knowledge to figure out OpenCL on the Radeon 6900XTX cards. Even Michael Larabel at Phoronix, who is an actual Linux wiz, couldn't get the ROCm drivers to run OpenCL tests without these "out of memory" errors.

https://www.phoronix.com/review/nvidia-rtx4080-rtx4090-compute

Besides many of the binary-only (CUDA) benchmarks being incompatible with the AMD ROCm compute stack, even for the common OpenCL benchmarks there were problems testing the latest driver build; the Radeon RX 7900 XTX was hitting OpenCL "out of host memory" errors when initializing the OpenCL driver with the RDNA3 GPUs. So with those issues plus the AMD ROCm compute stack still being hit or miss depending upon the particular consumer GPU, this article ended up just being a generational look at the NVIDIA compute performance on Ubuntu Linux.


I really feel anyone who is still trying to tough it out getting the newer AMD cards to do BOINC OpenCL projects is just a glutton for punishment.

Much simpler to use Nvidia cards which 'just work' and get on with crunching. There really is no difference in FP64 capabilities anymore in the latest generation of consumer cards from either camp.
50) Message boards : Number crunching : I gotta remember... (Message 75025)
Posted 6 Feb 2023 by Keith Myers
Post:
Go back to the /boot/grub/grub.cfg file and see what UUID the set root= statements give for each of the Linux OS entries.

I think you will find that the UUID of either your Windows1 chain or your Windows chain entry matches one of the Linux OS set root= values.

That is why you are getting the Linux OS booting when you select the Windows grub choice.

I don't think you should have TWO chain statements. One of them is incorrect and its entry should be removed from the grub.cfg file.

I think maybe the second entry which probably is identified as chain1 is the extraneous one. But verify first which one has the same set root= value as one of your Linux entries.

Then it is as simple as removing the entry from the file. If it is the second statement, just cut everything within the curly braces.

I am guessing here as I have never had to deal with a Windows boot partition in grub before.
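
The check described above can be sketched in a few lines, assuming each menu entry names its target partition on a search line that contains --fs-uuid and ends with the UUID. The excerpt, the entry titles, the UUIDs, and both helper functions are invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical grub.cfg excerpt; titles and UUIDs are invented.
SAMPLE_CFG = """
menuentry 'Linux Mint' {
    search --no-floppy --fs-uuid --set=root aaaa-1111
    linux /boot/vmlinuz root=UUID=aaaa-1111 ro
}
menuentry 'Windows' {
    search --no-floppy --fs-uuid --set=root BBBB-2222
    chainloader /EFI/Microsoft/Boot/bootmgfw.efi
}
menuentry 'Windows1' {
    search --no-floppy --fs-uuid --set=root aaaa-1111
    chainloader /EFI/Microsoft/Boot/bootmgfw.efi
}
"""

ENTRY_RE = re.compile(r"^\s*menuentry\s+['\"]([^'\"]+)['\"]")


def entries_by_uuid(cfg_text):
    """Map each searched-for UUID to the menu entries that use it."""
    result = defaultdict(list)
    current = None
    for line in cfg_text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            current = match.group(1)
            continue
        if "--fs-uuid" in line and current is not None:
            uuid = line.split()[-1]  # assumption: the UUID is the last token
            if current not in result[uuid]:
                result[uuid].append(current)
    return dict(result)


def shared_uuids(cfg_text):
    """Return only the UUIDs claimed by more than one menu entry."""
    found = entries_by_uuid(cfg_text)
    return {u: names for u, names in found.items() if len(names) > 1}


print(shared_uuids(SAMPLE_CFG))
```

In this made-up excerpt the Windows1 entry shares its UUID with the Linux entry, which is the situation that would boot Linux from a Windows menu choice. This only reads text; any edit to the real grub.cfg is still a manual step.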
51) Message boards : Number crunching : I gotta remember... (Message 75021)
Posted 5 Feb 2023 by Keith Myers
Post:
You looked at the wrong grub file. The one in /etc/default is just the configuration file for the actual grub.cfg file in /boot/grub.

That file is the actual grub file that produces the GRUB2 boot menu and controls the boot process. The /boot/grub/grub.cfg file identifies the target boot partitions and OS drives by their UUID number.

That file is the one I am saying has the invalid UUID number for the Windows bootloader partition. The menu item for the Windows OS must be pointing at the Linux drive.

Somehow I think that os-prober did not run or picked up the wrong UUID for the Windows drive. Can you run sudo os-prober from the command line and post its output?
52) Message boards : Number crunching : Run Multiple WU's on Your GPU (Message 75015)
Posted 4 Feb 2023 by Keith Myers
Post:
Unless he is running a PCIE Gen 4 or higher gpu, it does not really matter which gpu he uses. The great thing about the Epyc platform is the tons of PCIE lanes and bandwidth it provides to PCIE cards and the 5 or more PCIE slots spaced consistently to allow the gpus to be put in any slot you desire.

The first generation Epyc or Threadripper boards are limited to PCIE Gen 3 or lower. You need to move to the 3rd generation boards to pick up Gen 4 speeds. But even at Gen 3 speeds, since the slots you get are at least X8 and normally a full X16 lane width, the speeds don't really matter.
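
For a rough sense of the numbers involved, here is a small sketch of the theoretical one-direction slot bandwidth. It assumes the published per-lane rates of 8 GT/s for Gen 3 and 16 GT/s for Gen 4 with 128b/130b encoding, and it ignores protocol overhead, so real throughput is lower. The function name is made up for illustration.

```python
# Per-lane transfer rate in GT/s for each PCIE generation.
GT_PER_SEC = {3: 8.0, 4: 16.0}
# Gen 3 and Gen 4 both use 128b/130b line encoding.
ENCODING = 128 / 130


def slot_bandwidth_gb_s(gen, lanes):
    """Approximate one-direction bandwidth of a slot in GB/s."""
    return GT_PER_SEC[gen] * ENCODING / 8 * lanes


for gen, lanes in [(3, 1), (3, 8), (3, 16), (4, 16)]:
    print(f"Gen {gen} x{lanes}: {slot_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Whether a given BOINC app ever comes close to these figures is a separate question; this only shows how much headroom an X8 or X16 slot has over a 1X slot.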
53) Message boards : Number crunching : I gotta remember... (Message 75014)
Posted 4 Feb 2023 by Keith Myers
Post:
I only suggested fixing the Windows bootloader as I thought you said you were unable to get into Windows. Now you say you can boot Windows from the grub menu with no issues.

You must have a scrambled grub in boot. Probably the Windows UUID is pointing at the Linux drive.

You will have to look at the main grub and fstab files to see what's what.
54) Message boards : Number crunching : Run Multiple WU's on Your GPU (Message 75009)
Posted 3 Feb 2023 by Keith Myers
Post:
Thanks Keith. I have a Supermicro EPYC (Gen 1) board on the way. Looking forward to ~4.2 TFlops of FP64 in one machine :)

Nice! Always good to see more Epycs in use in BOINC.

For the earlier generations, you can get a lot of horsepower from cheap, pulled older Epycs from Ebay.
55) Message boards : Number crunching : Run Multiple WU's on Your GPU (Message 75007)
Posted 3 Feb 2023 by Keith Myers
Post:
I have 2x W8100's waiting for a new motherboard (I'm currently using just one with an 5800X+A320 board). Does the syntax change if I wanted to run 2x tasks per GPU? So 4 concurrently across 2 GPU's.

No. The app_config entry for 0.5 gpu usage applies to all detected gpus in BOINC. As long as BOINC sees both gpus, then each will run at 2X tasks per card.

You can get even more direct control over each gpu with more elaborate config file writing but that does not seem to be needed in your situation.
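
A minimal sketch of what such an app_config entry looks like, generated from Python so the 0.5 figure can be derived from the number of tasks per card. The app name "milkyway" and the cpu_usage value of 1.0 are assumptions here; use the app name from your own client_state.xml. The build_app_config helper is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpu_per_task=1.0):
    """Build app_config.xml text that runs tasks_per_gpu tasks on each gpu."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # gpu_usage is the fraction of one gpu that each task claims.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1.0 / tasks_per_gpu)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_per_task)
    ET.indent(root)  # pretty-printing helper, needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Assumed app name; 2 tasks per gpu gives gpu_usage 0.5.
print(build_app_config("milkyway", tasks_per_gpu=2))
```

The resulting text would be saved as app_config.xml in the project's folder under the BOINC data directory, and the client told to re-read its config files. The same gpu_usage value then applies to every gpu BOINC detects.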
56) Message boards : Number crunching : I gotta remember... (Message 75006)
Posted 3 Feb 2023 by Keith Myers
Post:
Well I don't know much about Windows. Is it required to use Secure Boot for that OS?

Can you disable Secure Boot in the BIOS and see if that changes the problem?

I am not sure of the order of the events that led to this problem, but I believe you said it started with installing drivers for a headset. Since Linux does not deal with "drivers" per se, I assume this means you installed Windows drivers for the headset.

So this leads me to believe that the Grub partition does not have anything to do with the inability to boot Windows now that the grub file has a Windows entry.

So I think that repairing the Windows bootloader may be the next logical step.

So you need to boot the Windows install medium and drop to a command window then issue these commands:

bootrec /fixmbr
bootrec /fixboot
bootrec /rebuildbcd

Then reboot and using the grub menu for Windows, see if you can get into it.

Also I forgot to ask if you ever tried changing the boot target in the BIOS? You said you tried the F12 key at POST to try and get into Windows but you can also preselect the Windows boot partition in the BIOS directly as the boot target and when you exit the BIOS with a Save and Exit, it should boot Windows directly. Did you ever try that approach and what was the result?
57) Message boards : Number crunching : I gotta remember... (Message 75003)
Posted 3 Feb 2023 by Keith Myers
Post:
No the location of the EFI partition is not relevant. It is typically at the beginning of a new drive, but could be at the end or in my case on one host, in the middle between two OS partitions.
I believe there is a restriction when the block count goes over a certain number for large drives but I could be wrong there.

Fixing grub is not going to fix the ACPI error messages. They are harmless. But not being able to see the Windows drive is the issue.

The whole point of having Windows on its own separate drive is to not mess with its bootloader so it will always be able to boot. The GRUB2 partition normally goes onto the drive you are installing Linux to.

It sounds like OS-PROBER in the grub file is not being run to discover the Windows drive. I think you said you reinstalled Linux Mint. If that is so, then the reinstallation is what caused the problem, since new Linux installations default to disabling OS-PROBER.

So boot into your Linux Mint installation and navigate to /etc/default and open a command terminal. Then we will change/add an entry in the grub file.

cd /etc/default

sudo nano grub

Look for an os-prober statement that reads:
#GRUB_DISABLE_OS_PROBER=false

The leading # comments the statement out, which is why OS-PROBER is not being run.

Uncomment the statement by removing the #

This will cause the statement to be read, and OS-PROBER will run when the grub menu is regenerated

If the statement isn't there, then add this:

GRUB_DISABLE_OS_PROBER=false

Save the file with CTRL-O and exit nano with CTRL-X

Then run update-grub

sudo update-grub

reboot

This should rewrite the /boot/grub/grub.cfg file with os-prober enabled, so it probes for Windows installations and should add the Windows drive to the grub menu.
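
The edit to /etc/default/grub described in this post can be sketched as a small text transformation. This is only an illustration under the assumption that the file is plain KEY=value lines; the sample contents and the enable_os_prober helper are made up, and the sketch never touches the real file.

```python
import re

SETTING = "GRUB_DISABLE_OS_PROBER=false"
# Matches the setting whether it is commented out or not, with any value.
LINE_RE = re.compile(r"^\s*#?\s*GRUB_DISABLE_OS_PROBER=.*$")


def enable_os_prober(grub_text):
    """Return the text with one active GRUB_DISABLE_OS_PROBER=false line."""
    out = []
    found = False
    for line in grub_text.splitlines():
        if LINE_RE.match(line):
            if not found:
                out.append(SETTING)  # uncomment or correct the first match
                found = True
            # any later duplicate of the setting is dropped
        else:
            out.append(line)
    if not found:
        out.append(SETTING)  # the statement was not there, so add it
    return "\n".join(out) + "\n"


# Hypothetical file contents for illustration.
SAMPLE = "GRUB_DEFAULT=0\nGRUB_TIMEOUT=10\n#GRUB_DISABLE_OS_PROBER=false\n"
print(enable_os_prober(SAMPLE))
```

Writing the result back still needs root, and sudo update-grub has to be run afterwards for the change to reach the boot menu.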
58) Message boards : Number crunching : I gotta remember... (Message 74995)
Posted 3 Feb 2023 by Keith Myers
Post:
Hi Siran,
The one thing that CSM legacy mode is for is to enable booting a non-UEFI OS. It sounds like your Windows installation might be on an MBR drive and not a UEFI-compatible GPT drive.

When you disabled CSM mode, that locked you out of the Windows drive I think.

But ACPI errors are normally innocuous and do nothing but spam the message logs. I have never heard of them causing a system to fail to boot. The error is caused by cpus that support the CPPC flag. Both AMD and Intel cpus of recent generations do. The kernel devs are working on a fix to get rid of the error messages supposedly.

I think you said you have each OS on separate drives. Your FAT32 EFI partition that contains your grub menu should be on your Mint OS drive in the first 512MB partition on the drive with your Mint OS occupying the rest of the drive in the next partition after the grub partition.

Since you said you can boot the Live USB stick you should be able to reinstall grub. Try the second part of this page while booted from your Live USB.

https://community.linuxmint.com/tutorial/view/2283
59) Message boards : News : New Poll Regarding GPU Application of N-Body (Message 74955)
Posted 29 Jan 2023 by Keith Myers
Post:
I think you are confused. There is no N-body gpu app so no N-body gpu code.

The N-body app is a multi-threaded cpu application only.
60) Questions and Answers : Unix/Linux : CL_OUT_OF_HOST_MEMORY with AMD RX 6600 XT on Xubuntu 20.04 (Message 74928)
Posted 20 Jan 2023 by Keith Myers
Post:
I believe it is just a permissions issue with the Rocr drivers, which have the OpenCL component in a different location from the legacy AMD OpenCL drivers.

You would have to get some AMD compute experts to chime in and verify that. I remember reading about the issue somewhere, on some project, but I don't know where to point you.



©2024 Astroinformatics Group