Request help updating app_info.xml for Linux
Author | Message |
---|---|
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,562,776 RAC: 0 |
I had a similar problem. Finally I got tired of trying to figure it out and just let it run. The second WU started some time after the first. I suggest using the app_config file again and letting it run for a while to see if the second WU eventually starts. |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
I wanted to thank everyone for their suggestions. What I've decided to do is to work off the existing WUs on all of the machines and upgrade the OS to the latest version (Linux Mint 14 from 12) since I see the latest BOINC (7.0.65) in that repository. After upgrading the BOINC installation and running for a few days, I'll see how things look. Regards, Steve |
Send message Joined: 8 May 09 Posts: 3319 Credit: 520,360,933 RAC: 22,692 |
"I wanted to thank everyone for their suggestions." Can't you just upgrade Boinc without having to upgrade the whole OS? |
Send message Joined: 23 Nov 09 Posts: 29 Credit: 17,119,258 RAC: 0 |
You should be using gedit to make the file, and you want to make sure it is properly aligned/formatted. You also need to make sure it has proper permissions. I'm not sure why, but when I copied an identical set of parameters from the forum which removes all the formatting, it didn't work. However, after formatting/aligning it properly it worked. The moral of the story is to write it yourself rather than copy it from here and paste it. |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Actually, it's not a big deal since these are Boinc-only machines so it's not a case of having to do any backups or re-copying of databases etc. Once the system is re-installed that will also eliminate any compatibility questions among versions that may have crept in over the years. Also, as one person noted, with Linux you have to watch the file permissions so this will let me start with a clean slate. It looks like the final Cosmology wu's will be done tomorrow so we'll see what happens then. Regards, Steve |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
I do use gedit/Pluma (same thing). Good point on the file permissions issue. These installations are years old and, through the updates, something may have gotten changed. Doing a fresh re-install will let me start off clean. Regards, Steve |
Send message Joined: 8 May 09 Posts: 3319 Credit: 520,360,933 RAC: 22,692 |
"Actually, it's not a big deal since these are Boinc-only machines so it's not a case of having to do any backups or re-copying of databases etc." I have a batch of Windows-based Boinc-only machines too and still back them up once a month by making an image of the system. I back up all of my machines, most onto one machine, but two internally as they are more important to me. I use a Windows program called Macrium Reflect; it has few options and can be a bit ornery at times, but usually works without a hitch. It does NOT do different versions as updates are done, it is a full image every time. On most of my machines it takes less than an hour once a month to do it. Then with a generic Linux CD I can put a new drive in a machine and have it restored and up and running in less than an hour, just where it was when it died. I reuse old hard drives so this lets me get back up and running quickly when they crash. I was thinking there should be a small FREE Linux program to do the same for you. I back up everything to a 2TB drive and then go through and delete the older ones regularly. I only keep the latest backup since, as you said, they are Boinc-only machines. |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
I've gotten my two BOINC machines changed over to Linux Mint 15 and BOINC 7.1 and wanted to provide an update as well as ask one or two final questions. I have had some success. To recap, all graphics cards are Nvidia Fermi's and each machine has at least 8GB of RAM. Machine 1 has the following 2 Fermi cards: Sun 19 May 2013 05:24:58 AM EDT | | CUDA: NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 5.0, compute capability 2.1, 767MB, 691MB available, 1009 GFLOPS peak) Sun 19 May 2013 05:24:58 AM EDT | | CUDA: NVIDIA GPU 1: GeForce GT 430 (driver version unknown, CUDA version 5.0, compute capability 2.1, 1024MB, 1009MB available, 269 GFLOPS peak) Sun 19 May 2013 05:24:58 AM EDT | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 304.88, device version OpenCL 1.1 CUDA, 767MB, 691MB available, 1009 GFLOPS peak) Sun 19 May 2013 05:24:58 AM EDT | | OpenCL: NVIDIA GPU 1: GeForce GT 430 (driver version 304.88, device version OpenCL 1.1 CUDA, 1024MB, 1009MB available, 269 GFLOPS peak) The system is processing two MW 1.02 cuda wu's concurrently, so success there. However, no matter what I do with the cc_config.xml or the app_config.xml, I can't process more than one N-body Simulation 1.09 cuda wu at a time. In fact, it looks like it only sends them to GPU 1, which has more video RAM, as if that might be required. It makes me wonder if the cuda N-Body Simulation GPU WUs are so computationally intensive that you can't run multiple simultaneous 1.09 WUs? I made sure that all file ownerships were the same (BOINC) so that's not an issue. 
My first app_config.xml was: <app_config> <app> <name>milkyway</name> <gpu_versions> <gpu_usage>.5</gpu_usage> <cpu_usage>.05</cpu_usage> </gpu_versions> </app> </app_config> When that didn't make a difference with the 1.09's, I remembered from the app_info.xml's that the N-Body's had their own section and file name so I changed the app_config.xml to: <app_config> <app> <name>milkyway</name> <gpu_versions> <gpu_usage>.5</gpu_usage> <cpu_usage>.05</cpu_usage> </gpu_versions> </app> <name>milkyway_nbody</name> <gpu_versions> <gpu_usage>.5</gpu_usage> <cpu_usage>.05</cpu_usage> </gpu_versions> </app> </app_config> No errors noted by BOINC but also no change to the processing of 1.09's unless it takes some time to cycle through the system. If so, it makes it hard to troubleshoot problems if the changes don't reflect fairly quickly. I'm going to run down the cache and then let it re-load one more time. If it's still not running multiple concurrent 1.09 wu's I may stop doing the N-Body's since it really ties up the GPU and not only denies it to the MW 1.02 cuda wu's but also to E@H. Situation is the same with machine #2 which is a Fermi 550TI with 2GB of VRAM. Thanks again to everyone for their suggestions. Regards, Steve |
Send message Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
You're missing an opening <app> to match the second closing </app>, in the second file. N-Body apps don't use GPUs at all (despite what it says), so you're missing nothing by not running two at once. |
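For reference, here is the second app_config.xml from the post above with the missing opening `<app>` restored and the nesting laid out. This is only a sketch: the app names and usage values are copied unchanged from the post, and given the replies saying the N-Body apps are CPU-only, it is doubtful that the second section changes how those tasks are scheduled.

```xml
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>.05</cpu_usage>
    </gpu_versions>
  </app>
  <!-- The opening <app> tag below is the one missing from the posted file. -->
  <app>
    <name>milkyway_nbody</name>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

With one `<app>` block per application, each closing `</app>` has a matching opening tag, which is the mismatch pointed out in the reply.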
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Richard, Thanks for the correction, it's been incorporated. I have to ask, are you really sure about no GPU WUs for the N-Body Simulation 1.09? I've not only seen a notation of (opencl_nvidia) after some of the WU's, but the one that's waiting also says (0.05 CPU + 1 NVIDIA GPU). It's waiting right now on some E@H GPU WUs to run, but I could have sworn that when it was running none of the available E@H were. Also, I was wondering why the N-Body 1.09 WU's that are not marked as cuda all seem to have very short expected durations (most under an hour) while the cuda-marked ones are all well over 20 and even 30 hrs! I almost felt like I was having a Cosmology@Home flashback! Thanks for any further info you can provide. Enquiring minds want to know! :) Regards, Steve |
Send message Joined: 8 Feb 08 Posts: 261 Credit: 104,050,322 RAC: 0 |
nbody exes are multithreaded but cpu only. Why there are still additional exes for linux linked indicating they would make use of the gpu has been the question here for some time. The app_config defines how many WU's of one app can run on the gpu at the same time. Trying to use it for a cpu only app can only mess things up. Since I think you can't use app_config and app_info at the same time (someone correct me if I am wrong), I fear you have to use the app_info again to make settings for separation and nbody. The nbody part in your app_info needs some cleanup and correction: <app> <name>milkyway_nbody</name> </app> <file_info> <name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</name> <executable/> </file_info> <app_version> <app_name>milkyway_nbody</app_name> <version_num>1.09</version_num> <plan_class>mt</plan_class> <avg_ncpus>4</avg_ncpus> <max_ncpus>4</max_ncpus> <cmdline>--nthreads=4</cmdline> <file_ref> <file_name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</file_name> <main_program/> </file_ref> </app_version> The <avg_ncpus> and <max_ncpus> lines (marked green in the original post) are for BOINC to know how many cpus to reserve, and the <cmdline> line (also green) tells nbody how many cpus to use. So the setting above would use 4 cpus for nbody. If you reduce those three lines to <cmdline></cmdline> only, it will run single threaded (1 WU per cpu). With separation and nbody properly defined in the app_info, you should be able to run 2 separation tasks on your gpu and at the same time nbody tasks single or multithreaded on your cpus. Note 1: I am running Win and haven't crunched an nbody WU for some time. Note 2: Maybe you need to change the commandline above to <cmdline>--nthreads=4 --disable-opencl</cmdline> |
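The N-Body fragment from the post above is easier to check when laid out with its nesting visible. The sketch below assumes it sits inside the usual `<app_info>` root element next to the existing separation entries, which the post does not show; every element and value is otherwise copied from the post.

```xml
<app_info>
  <!-- Existing separation (milkyway) entries stay here; the post does not show them. -->

  <app>
    <name>milkyway_nbody</name>
  </app>
  <file_info>
    <name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <!-- Version value kept exactly as posted. -->
    <version_num>1.09</version_num>
    <plan_class>mt</plan_class>
    <!-- Green in the original post: how many CPUs BOINC reserves for the task. -->
    <avg_ncpus>4</avg_ncpus>
    <max_ncpus>4</max_ncpus>
    <!-- Green in the original post: how many threads nbody itself uses. -->
    <cmdline>--nthreads=4</cmdline>
    <file_ref>
      <file_name>milkyway_nbody_1.09_x86_64-pc-linux-gnu__mt</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
```

Per the post, removing the two ncpus lines and leaving `<cmdline></cmdline>` empty should give single-threaded tasks, and Note 2's `--nthreads=4 --disable-opencl` is an untested alternative command line. One thing that may be worth comparing against the entries already in your file: BOINC generally records version numbers as integers (for example 109 rather than 1.09).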
Send message Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
"Thanks for any further info you can provide. Enquiring minds want to know! :)" I'm afraid the only further information I can provide is a private message exchange which took place on 2 April 2013, the day the most recent Linux applications were deployed on the server. I received an acknowledgement and thanks (which I'm not making public) for my reply, but I have seen no sign of any change to the N-Body deployment in the seven weeks since this exchange. I'll leave it to speak for itself, without comment. "When I started releasing the format was to use gpu classes and I kept the convention. Since the idea is to get a gpu version of nbody out I thought it was a place holder for turning on that option. I have been asking the people who originated this convention for their ideas on the best way to proceed. I am hoping to deprecate the gpu versions until we move past testing on the gpu versions. I will update the list when we change the format." |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Thanks for the feedback. I've modified my app_info.xml and will give it a try. Right now the two machines are using the app_config.xml from the recent re-installation. Thanks again for the help. Regards, Steve |
Send message Joined: 29 Oct 10 Posts: 89 Credit: 39,246,947 RAC: 0 |
Thank you for providing the backstory. I certainly share your sentiments on a number of levels. As an interested supporter, but not a programmer, who has dedicated some decent boxes to BOINC, I want my spending to be used in the most efficient manner that supports all of the projects that I participate in. It may not amount to much from a NASA grant perspective, but it's something to me and I'm happy to provide it as long as the projects hold up their end. I guess that's why I'm so interested in being able to process multiple simultaneous WU's, whether it's through app_config or app_info files. If a project app won't support that and then wants to tie up my GPU with its 2GB of VRAM while using only 300MB of it, and in the process denies the GPU's concurrent use to other projects, that hurts everybody in terms of wasted resources. Thanks again for the feedback. Regards, Steve |