1)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65898)
Posted 18 Nov 2016 by Michael H.W. Weber Post: "if the app processes all of the bundle BUT fails on the last of the 5, with" Well, if that is correct, then Jake has to go back to the bench and improve the server's validation logic. Michael. |
2)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65887)
Posted 17 Nov 2016 by Michael H.W. Weber Post: "Is anyone running a 390 or 390X (290 or 290x may have the same problem)" The 290X can't run MW tasks in parallel regardless of which driver I use. By contrast, the 280X can (but I don't use that option because I have both a 290X and a 280X in the same machine). Michael. |
3)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65823)
Posted 15 Nov 2016 by Michael H.W. Weber Post: "Just released the GPU version. It is a 32-bit application that works on 64 bit machines. Let me know if there are any issues." Well, first of all, congratulations on finally making it happen! The returned tasks validate as they did before the bundling efforts, i.e. many are instantly valid and a majority is first marked inconclusive but then shifted to the valid bucket (a behavior I have never understood, by the way...). "They should take about 5x longer than normal work units since you are crunching 5." In fact, a 280X requires 9 secs for a single WU, and the 5x bundle completes in 38 secs, which is quicker than five times the single-WU runtime (45 secs). Same with the 290X: 13 secs for a single task, 58 secs for a bundle of 5 tasks. So, the computation is certainly more time-efficient. Moreover, it is better for the GPU hardware, because it does not cool down and heat up as frequently as before but is kept at a fairly constant operating temperature. Finally, I am unsure whether bundling only 5 tasks will solve the DDoS-like load issues on your server. You could easily increase the bundle size by another factor of 10 or even 100 and then disallow server contacts below a reasonable time threshold. But let's see. As soon as you find that the 'GPU people' run out of work again, you might want to increase the bundle size as suggested. And thanks again for taking our concerns seriously. As a result, I am quite sure you will be flooded with new results. Michael. |
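The time savings reported in this post are easy to quantify; a minimal sketch (the function name is mine, the timings are the ones quoted above):

```python
def bundle_efficiency(single_secs, bundle_secs, n=5):
    """Fraction of runtime saved by a bundle of n WUs versus
    running n single WUs sequentially."""
    return 1 - bundle_secs / (n * single_secs)

# 280X: 9 s per single WU, 38 s for a bundle of 5 -> roughly 16 % saved
r9_280x = bundle_efficiency(9, 38)
# 290X: 13 s per single WU, 58 s for a bundle of 5 -> roughly 11 % saved
r9_290x = bundle_efficiency(13, 58)
```

The saving comes mostly from avoiding the per-task start-up and ramp-down overhead mentioned in the post.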
4)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65733)
Posted 13 Nov 2016 by Michael H.W. Weber Post: "I think that there is no need to suspend the project while these problems persist. Configure your MilkyWay@home preferences to only accept CPU work and not GPU work." Ehm, the whole discussion over here is about GPU tasks, which, because of their short duration and the limited number a single machine can retrieve, are hammering the server in a DDoS-like fashion. Michael. P.S.: Although there is the "no load issue" regarding GPU core and RAM clocks (they drop to 300/150 MHz as soon as an MW task is initiated), both types of WUs do validate - I checked one of each type on my AMD cards. |
5)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65709)
Posted 12 Nov 2016 by Michael H.W. Weber Post: "Are the individual units in the new bundle of 5 units bigger than the old units?" "The 1.42s are working for me. However, even though they are taking up to 3 times as long to complete (both GPU and CPU), the Credit applied is the same as the previous version - 1.39?" "Runtime on 290X went from 15 seconds to 27,066 seconds and I still get the same 26.73 credits, no thanks: http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1886485669" As I said above, please check your GPU core and RAM clocks: engaging v1.42 tasks results in an immediate GPU core clock reduction to 300 MHz and a GPU RAM clock reduction to 150 MHz on both 280X and 290X AMD graphics boards. I will have to suspend MW until this issue is resolved. Michael. |
6)
Message boards :
News :
Scheduled Maintenance Concluded
(Message 65690)
Posted 11 Nov 2016 by Michael H.W. Weber Post: All tasks with version <= 1.41 produce only errors. Engaging v1.42 tasks results in an immediate GPU core clock reduction to 300 MHz and a GPU RAM clock reduction to 150 MHz on both 280X and 290X AMD graphics boards. When reaching 100% of the estimated run time, the bundle tasks on my 290X reset to zero progress and appear to restart, although the total runtime is not reset (I guess from the file name that it might be a bundle of 5, so this will hopefully repeat 5-fold and then upload). By contrast, the constraints tasks do complete and upload in the expected manner on my 280X. Runtime with reduced core and RAM clocks is 741.63 secs as opposed to 9 secs with standard clocks. The result file first ends up in the "inconclusive" bunch - as usual. So, there is still something wrong with both of these task types with respect to clock resetting. I checked with Einstein: once a new Einstein WU starts after an MW one has finished, the core and RAM clocks go back up to regular speed. So, the down-clocking is caused by the MW client. Please inform us once you have solved the clocking issue. Michael. |
7)
Message boards :
Number crunching :
Is this project OVER, semi abandoned or DYING?
(Message 65608)
Posted 7 Nov 2016 by Michael H.W. Weber Post: -Click- Michael. |
8)
Message boards :
Number crunching :
Massive server issues and Wu validation delays
(Message 65561)
Posted 29 Oct 2016 by Michael H.W. Weber Post: "I will take into consideration changing how we bundle work units (hopefully packing 4-10 workunits together would be nice), but at the moment that is technically challenging since the framework we have set up for workunits and their generation does not allow for it (imagine a lot of coding and bugs for at least 6 months)." Packing 4-10 tasks into one bundle won't solve the problem given runtimes of 9 secs per WU. This is what you need to do: 1. Modify the WU generator to define criteria by which WUs are bundled. 2. Modify the application such that the bundled WUs are processed one by one. 3. Modify the validator to validate the bundled WUs. To keep things simple, use the wrapper approach. For 1: Pack tasks into one .zip bundle until the estimated run time reaches 5 hrs or until you have 200 tasks at maximum. For 2: Using the wrapper, you only need to list the required program calls in the job.xml; unpack the .zip bundle on the client machine first and re-pack everything into a .zip after computation completes. For 3: Adapt the validator to read the .zip returned by the client and compare the individual result files with a second result (or call a different validation algorithm). That's it. No further year of fiddling around required. :) Michael. P.S.: As I offered earlier, contact me or my team mates for code samples from RNA World. But don't wait too long. |
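The generator side of step 1 could look roughly like this; a hedged sketch, not RNA World's or MilkyWay's actual code (the function names are mine; the 5-hour and 200-task thresholds are the ones proposed above):

```python
import zipfile

# Bundling criteria from the post: pack WUs into one .zip until the
# estimated runtime reaches 5 hours, or 200 tasks at maximum.
MAX_BUNDLE_SECONDS = 5 * 3600
MAX_BUNDLE_TASKS = 200

def make_bundles(workunits):
    """workunits: list of (filename, estimated_seconds) tuples.
    Returns a list of bundles, each a list of filenames."""
    bundles, current, current_secs = [], [], 0
    for name, est in workunits:
        # Start a new bundle if adding this WU would exceed a limit.
        if current and (current_secs + est > MAX_BUNDLE_SECONDS
                        or len(current) >= MAX_BUNDLE_TASKS):
            bundles.append(current)
            current, current_secs = [], 0
        current.append(name)
        current_secs += est
    if current:
        bundles.append(current)
    return bundles

def write_bundle(names, zip_path):
    # Each bundled WU input file goes into one .zip that the
    # client-side wrapper unpacks before running the jobs.
    with zipfile.ZipFile(zip_path, "w") as z:
        for name in names:
            z.write(name)
```

The same zip-in/zip-out convention then lets the validator (step 3) open the returned archive and compare result files one by one.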
9)
Message boards :
Number crunching :
Massive server issues and Wu validation delays
(Message 65546)
Posted 28 Oct 2016 by Michael H.W. Weber Post: A few clarifications and additional thoughts to address some of the postings above: (1) I was talking about GPU tasks ONLY. (2) The idea of running tasks in parallel is nice, e.g. if you use systems which have a single 280X or several of them. In my case, however, I run combinations of 290X and 280X cards in the same machine. That's possible as they use the same driver, and this approach (a) combines different capabilities in the same system, (b) saves me hardware (power supplies) and (c) increases electricity efficiency. In contrast to the 280X, however, the 290X does not allow processing multiple MW WUs in parallel. Well, to be precise, you can of course make it do this, but then most of those results end up not getting validated - so you simply waste your electricity. In short: parallel processing is not a generally applicable solution. Moreover, it does not reduce the server load at all: on the contrary, it rather worsens it because, as stated above, running tasks in parallel is more time-efficient than running them individually. Hence, in the same time frame, more work is requested from the server. (3) Using scripts in an attempt to counteract the obvious issues of the MW server configuration can't be the right choice: a DC project has to expect that its volunteers are either not able to create such solutions or just don't want to spend their time on such things. In short: participation has to be kept simple. The solution has to be implemented on the project's server side, not at the user's end. In fact, there is no need to discuss this much further. I have given clear suggestions as to which measures will help relieve the server. Bundling of WUs is mandatory when increasing the server connection delay, to ensure that machines do not idle during these increased intervals. Try these suggestions and then we will see. Michael. |
10)
Message boards :
Number crunching :
Massive server issues and Wu validation delays
(Message 65539)
Posted 27 Oct 2016 by Michael H.W. Weber Post: What I am offering is a clear description of the problem plus a solution. For the latter, no grant is required. Three initial measures: 1. Increase the client delay such that connections are allowed only every 30 minutes. Not nice, yes, but server stability is the priority in the current situation. 2. Increase WU runtime to at least 1 hour per WU; 5 hrs would be better. If a simple per-WU increase is impossible for whatever reason, bundle 100 or more tasks into a single packet. 3. Keep your database small: produce WUs only when the server has no more WUs ready for delivery, and delete WUs once they have been completed - do not keep them for long. With these measures we run such projects on a laptop, even during worldwide challenges. Without any grant. Just try it. ;) Michael. |
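Measure 1 corresponds to a standard BOINC scheduler setting; a sketch of the relevant fragment of a project's server-side config.xml, assuming a stock BOINC server (the 1800-second value is the 30 minutes proposed above; the in-progress limit of 40 is an illustrative value, not from the post):

```xml
<!-- Fragment of a BOINC project's config.xml (server side). -->
<config>
  <!-- Minimum seconds a host must wait between scheduler requests. -->
  <min_sendwork_interval>1800</min_sendwork_interval>
  <!-- Cap on how many results a single host may hold at once. -->
  <max_wus_in_progress>40</max_wus_in_progress>
</config>
```

Raising the interval without also making WUs longer (measure 2) would leave fast GPUs idle between contacts, which is why the two measures belong together.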
11)
Message boards :
Number crunching :
Massive server issues and Wu validation delays
(Message 65537)
Posted 27 Oct 2016 by Michael H.W. Weber Post: For around three days now I have again been trying to support your project by using 4 GPUs (2x 280X, 2x 290X). Although I had addressed these things earlier in this forum, to date you still have not solved the following issues: 1. GPU WUs are too short (280X: 9 sec/WU, 290X: 13 sec/WU). 2. Your server hands out only a limited number of WUs at a time. At least you have implemented automatic detection of the 290X GPUs. Thanks for that (although it took you years to do so). Now, your server has repeated massive database issues roughly every 15-30 minutes, resulting in: 1. failure to upload result data, 2. failure to download new WUs (which results in idling machines), 3. failure to log in to my account, 4. inability to report this problem to you because your forum does not work either. On top of that, your validator does not seem to keep up with the incoming results. Of the 11446 tasks I was able to upload to your server within the past 3 days (and this is only a tiny, tiny fraction of what would have been possible if your server didn't crash every other minute), only 1244 were validated. The rest 'hangs in the air' waiting for validation (credit acquisition is accordingly delayed). I do not know the reason for all of these issues, but if you would like people to support this project, you need to address them quickly. Not in a year, please. I think I do have an explanation for your server problem, though: because your WUs are so small, your server can't keep up with the connections made by all the clients working on your tasks. At the RNA World distributed computing project, we solved exactly the same problem by simply bundling the tiny WUs into larger WU packets. This massively reduced the number of connection requests and also helped deliver more work to the clients. PLEASE think about that when you get free advice from people running their own DC projects and servers. :) If you like, contact Yoyo from our admin & project team and ask for code details on how we bundle the tasks. And remember: I suggested this long ago, too. What you are currently doing is a self-induced DDoS attack on your own server(s). Michael. |
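The effect of bundling on scheduler contacts can be checked on the back of an envelope (9 sec/WU is the 280X figure from this post; the bundle size of 200 is the one proposed elsewhere in this thread; the function is a sketch of my own):

```python
def requests_per_hour(wu_seconds, bundle_size):
    # One scheduler contact per completed bundle: a client that
    # fetches bundle_size WUs at once reconnects only after
    # wu_seconds * bundle_size seconds of crunching.
    return 3600 / (wu_seconds * bundle_size)

# 9-second WUs fetched one at a time: 400 contacts/hour per GPU.
unbundled = requests_per_hour(9, 1)
# Bundles of 200 such WUs: 2 contacts/hour per GPU.
bundled = requests_per_hour(9, 200)
```

With thousands of active GPUs the unbundled figure multiplies into the connection storm described above, which is why bundle size, not server tuning alone, dominates the load.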
12)
Message boards :
Number crunching :
GPU WU runtimes too short: Wast of compute power
(Message 65144)
Posted 15 Sep 2016 by Michael H.W. Weber Post: Multiple tasks don't run in parallel on 290X cards for unknown reasons, so bundling is the proper choice to solve the problem. Moreover, even if you run 6 WUs in parallel, they are still completed within a few minutes, so the problem detailed above persists. And by the way, on 280X cards you may run up to 12 tasks in parallel (tested). Michael. |
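For reference, running several tasks per GPU on a BOINC client is normally configured with an app_config.xml in the project folder; a sketch for 8 concurrent tasks (the app name "milkyway" is an assumption here and should be checked against the names in the client's client_state.xml):

```xml
<!-- app_config.xml in the MilkyWay@home project folder. -->
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <!-- 0.125 of a GPU per task => 8 tasks share one GPU. -->
      <gpu_usage>0.125</gpu_usage>
      <cpu_usage>0.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

As the post notes, this helps throughput on a 280X but does not change task runtimes, so it cannot replace server-side bundling.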
13)
Message boards :
Number crunching :
Issues with & proper support of AMD R9 290X GPUs
(Message 65123)
Posted 9 Sep 2016 by Michael H.W. Weber Post: "Regarding the number of invalids: Are you successfully running multiple tasks together on those other projects? I've seen comments about not being able to do that with that card." "I will look into that. Some think it's a driver issue. I would not be surprised if that problem is also present here." A driver issue can be excluded, as detailed above. Michael. |
14)
Message boards :
News :
GPU Issues Mega Thread
(Message 65122)
Posted 9 Sep 2016 by Michael H.W. Weber Post: "I've always gotten the impression that if you want to crunch for MilkyWay, it is best to run Nvidia cards..." Well, if double precision is important, you should never choose NVIDIA cards but instead AMD, as these are significantly more potent in this discipline. To the best of my knowledge, the best model on the consumer market is still the R9 280X (e.g. the Toxic version from Sapphire). An exception to the rule is the NVIDIA Titan Black series, but that one is so expensive that for the price of one Titan Black you can afford several 280X cards, which together will again outperform it. Also, there are no significant driver issues with AMD graphics cards as long as you are using Windows as your OS. The core problem with MW is that it simply does not recognize some AMD cards properly, while other projects have absolutely no problem doing so (Einstein, Collatz, POEM, Primegrid, SETI tested). Michael. |
15)
Message boards :
Number crunching :
Issues with & proper support of AMD R9 290X GPUs
(Message 65114)
Posted 8 Sep 2016 by Michael H.W. Weber Post: Any comments? Michael. |
16)
Message boards :
Number crunching :
Issues with & proper support of AMD R9 290X GPUs
(Message 65104)
Posted 5 Sep 2016 by Michael H.W. Weber Post: Due to its superior double precision (DP) performance (second best of AMD's consumer cards), AMD's R9 290X GPU is a valuable card worth being supported properly by the Milkyway@home project. So far, however, this card is not even recognized as a graphics board by this project when using Windows 7. Instead, one has to manually set up the following app_info.xml file and copy it to the Milkyway@home project folder: <app_info> ...followed by a manual download of the corresponding executable from the Milkyway@home website (also to be stored in the project folder): milkyway_1.36_windows_x86_64.exe After a restart of BOINC, Milkyway@home will finally start to compute tasks. Single tasks. One after the other. Task duration is around 16 seconds! Long ago, I asked for longer GPU tasks here in this forum, because initiating a task every 16 seconds is a massive waste of compute time and requires a permanent internet connection for constant up- and downloads, as the number of tasks per machine is severely limited, too. Nothing has happened. For AMD's R9 280X, which belongs to the same board family and is the most performant GPU with respect to DP, GPU recognition by Milkyway@home is, in contrast to the 290X, fully automated. This card can process several tasks in parallel, so I thought it should also be possible with the 290X. No such luck! With the 280X, the following app_config.xml does the job of running 8 tasks simultaneously, thereby significantly increasing throughput: <app_config> When I include this file in my work folder for the 290X card, some tasks do validate, others do not. The majority does not validate: they are mostly categorized as inconclusive at first and then directed into the "bad box". What I want to know is why there is this massive fraction of invalid tasks. I have tested a second R9 290X, which behaves absolutely identically. Both cards work properly with ALL other tested distributed computing GPU projects. Specifically, I tested Primegrid, Folding@home, POEM@home, Collatz Conjecture, Einstein@home and SETI@home. Since I use the latest AMD drivers (and have also tested older ones from the outdated Catalyst series), I conclude two things: neither my hardware nor the driver is the cause of the issue. Hence, something is wrong with Milkyway@home or with my manual configuration as detailed above. In order to nail down the problem, I will post three exemplary result files from my 290X card: A valid task: Task 1764179469 An inconclusive task: Task 1764185669 An invalid task: Task 1764165427 Note that these are results generated when running 8 tasks in parallel. My system is an Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz [Family 6 Model 42 Stepping 7] running Windows 7 Ultimate x64 with an MSI R9 290X Lightning GPU. The machine is equipped with 16 GB of RAM, and one CPU core is kept free to feed the GPU at maximum performance. It is an excellent card, and I am highly disappointed that this project makes so little of it. It almost appears as if you guys have enough compute power for free. If that is the case, just let me know and I won't bother you any further, as there are many projects out there in need of resources. I should also note that the configuration file above was previously a different one, which I had to update manually after the older one suddenly stopped working - without any notice from the project on its website. If you expect people to participate in your project in large numbers, you need to take the utmost care to keep things as simple as possible. A person new to Milkyway@home will most likely never get an R9 290X to run for you, given the manual intervention required to do so. Michael. |
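The actual app_info.xml contents did not survive in this listing; for readers unfamiliar with BOINC's anonymous-platform mechanism, a generic sketch of the shape such a file takes is given below (the app name, version number, and plan class are illustrative assumptions, not the file from the post; only the executable name appears in the post):

```xml
<!-- app_info.xml in the project folder (anonymous platform).
     Names and plan class below are illustrative. -->
<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_1.36_windows_x86_64.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>136</version_num>
    <plan_class>opencl_amd_ati</plan_class>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>milkyway_1.36_windows_x86_64.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
```

With such a file in place, the client stops asking the server for application versions and runs the listed executable directly, which is why a stale app_info.xml silently breaks when the project retires an executable.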
17)
Message boards :
Number crunching :
GPU WU runtimes too short: Wast of compute power
(Message 64329)
Posted 11 Feb 2016 by Michael H.W. Weber Post: "Hey Michael," Well, summer is over and I have patiently waited for another 8 months. How is the implementation of longer GPU WUs progressing? Michael. |
18)
Message boards :
Number crunching :
GPU WU runtimes too short: Wast of compute power
(Message 63702)
Posted 11 Jun 2015 by Michael H.W. Weber Post: Well, thanks for the feedback. I think it is an important issue and I hope it can be solved soon. Michael. |
19)
Message boards :
Number crunching :
GPU WU runtimes too short: Wast of compute power
(Message 63692)
Posted 10 Jun 2015 by Michael H.W. Weber Post: Is it possible to extend the individual runtimes of GPU WUs, please? My AMD 290X completes its WUs within 16 to 47 secs per WU (depending on the WU type), and there is a limit on the number of WUs that can be downloaded, too. The time required to wind down one WU and start the next takes up a good fraction of the runtime of such short WUs, so making the WUs as short as they currently are wastes a significant proportion of computation time. On top of that, a high frequency of internet connections is required; the system is virtually permanently up- or downloading work. All these issues could be resolved if the WUs were "bundled" to take, say, around 30 min to a few hours of compute time per WU. Would that be an option? Michael. |
20)
Message boards :
Number crunching :
AMD R9 290X does not receive any GPU work
(Message 63372)
Posted 13 Apr 2015 by Michael H.W. Weber Post: Finally, my 290X receives WUs. They run very short, though: I have seen completion durations ranging from only 14 to 49 seconds per WU so far. Strange... Michael. |
©2024 Astroinformatics Group