1)
Message boards :
Number crunching :
anyone running a 2200/2400G? what performance are you getting?
(Message 68115)
Posted 8 Feb 2019 by Richard Haselgrove Post: Hi everyone, I've been working with Bill in the SETI@Home thread, and we've tracked down this error to a faulty (greatly inflated) 'GFLOPS Peak' value returned to BOINC by the ATI OpenCL driver. You can see the value - about 43 ExaFLOPS, about 10,000X too big - in the opening lines of your Event Log after startup. I've submitted a formal bug report to ATI today, and we're working urgently on a hotfix version of BOINC which will trap and subdue the wayward flops value. Watch out for further announcements. |
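The kind of guard described in this post, trapping an implausible peak-FLOPS figure before it is used, might be sketched as follows. This is not the actual BOINC hotfix: the function name, the ceiling and the fallback value are all assumptions chosen for illustration.

```python
import math

# Hypothetical limits, NOT taken from BOINC.
MAX_PLAUSIBLE_FLOPS = 1e15  # 1 PetaFLOPS: far above any single consumer GPU of 2019
FALLBACK_FLOPS = 1e12       # 1 TeraFLOPS placeholder used when the report is rejected


def sanitize_peak_flops(reported: float) -> float:
    """Return the driver-reported value if plausible, else a fallback."""
    if not math.isfinite(reported) or reported <= 0 or reported > MAX_PLAUSIBLE_FLOPS:
        return FALLBACK_FLOPS
    return reported
```

With these assumed limits, a report of 43 ExaFLOPS (43e18) is replaced by the fallback, while an ordinary value passes through unchanged.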
2)
Message boards :
News :
Nbody Release 1.54
(Message 64190)
Posted 18 Dec 2015 by Richard Haselgrove Post: N-Body tasks won't be even trying to use your GPU. Try reading here. |
3)
Message boards :
Number crunching :
Run Multiple WU's on Your GPU
(Message 64187)
Posted 17 Dec 2015 by Richard Haselgrove Post: 'Read configuration' works fine for replacing one value with another, or adding a value where none existed before. It doesn't always work for removing a value once it has embedded itself in the system, but in most cases you can just leave it there. It's the way they coded it. |
4)
Message boards :
Number crunching :
Error message: App version needs OpenCL but GPU doesn't support it
(Message 64181)
Posted 15 Dec 2015 by Richard Haselgrove Post: Your computer runs Windows 10. Your working NVidia driver will have been replaced by a cut-down - limited functionality - driver supplied by Microsoft. You don't have any control over this, but you can go to NVidia and download/install the full-feature driver you need. Expect this to happen again. |
5)
Message boards :
Number crunching :
Run Multiple WU's on Your GPU
(Message 64179)
Posted 15 Dec 2015 by Richard Haselgrove Post: It does actually say at the very bottom of the Application configuration documentation: "If you remove app_config.xml, or one of its entries, you must reset the project in order to restore the proper values." Obviously, you wouldn't want to do that while you had any work cached. |
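For readers unsure what an app_config.xml entry looks like, here is a rough sketch that generates one for the case this thread is about, running two tasks per GPU. The app name "milkyway" and the usage figures are assumptions for illustration only; check the Application configuration documentation and your own project's app names before using anything like it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Build a minimal app_config.xml document as a string."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage 0.5 means each task claims half a GPU, so two run together.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# "milkyway", 0.5 and 0.05 are illustrative values, not recommendations.
xml_text = build_app_config("milkyway", 0.5, 0.05)
```

The point of the quoted documentation still applies: once a value like this has been read in, deleting the entry does not undo it until the project is reset.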
6)
Message boards :
News :
New Release- Nbody version 1.52
(Message 64167)
Posted 12 Dec 2015 by Richard Haselgrove Post: "Guess what, even though it is still marked as not selected in my account:" Yes, it is. You left the final box checked, so in English you answered "If no work for selected applications is available, accept work from other applications?" with 'yes' - you accept work from unselected applications, like N-Body. Clear the final check-box if you really don't want them. |
7)
Message boards :
Number crunching :
Aborted by User, but not
(Message 64128)
Posted 29 Nov 2015 by Richard Haselgrove Post: Use CPU Enforced by version 6.10+ no |
8)
Message boards :
Number crunching :
Aborted by User, but not
(Message 64106)
Posted 16 Nov 2015 by Richard Haselgrove Post: "Nvidia driver was and is current ver 358.91 dated 9 NOV 2015." Where did you download/install that from? Microsoft or NVidia? |
9)
Message boards :
Number crunching :
Aborted by User, but not
(Message 64101)
Posted 15 Nov 2015 by Richard Haselgrove Post: If you probe a little deeper, you can see better diagnostic information. Your most recent example was task 1342970392. That says additionally: "Client state: Aborted by user". BOINC is failing to see your GPU properly. Your computer 410714 is running Windows 10, which hasn't really stabilised yet. In particular, Windows 10 has a habit of updating your hardware drivers whether you want it to or not, and the drivers Microsoft supplies may not always include the ecosystems (like OpenCL runtime support) that scientific computing requires. I suggest your first step might be to replace the current drivers for your NVIDIA GeForce GTX 570 with certified drivers downloaded directly from http://nvidia.com |
10)
Message boards :
Number crunching :
app_info.xml to run MW@H GPU WU's for R9 3xx cards
(Message 64085)
Posted 10 Nov 2015 by Richard Haselgrove Post: "Or, what's wrong with this setup? :)" You're talking about, and your BOINC client has found, a file called app_config.xml. But the file contents you have posted are appropriate (more-or-less) for a file called - as the opening tag suggests - app_info.xml. Always refer to the documentation: Application configuration; Anonymous platform. I think you want the second of those. |
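The "opening tag" check mentioned in this post can be done mechanically. The sketch below reads the root element of some XML text and reports which file name it belongs under; the helper name and the sample snippet are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Root element -> file name the BOINC client expects for that content.
EXPECTED_FILENAME = {
    "app_config": "app_config.xml",  # Application configuration
    "app_info": "app_info.xml",      # Anonymous platform
}


def filename_for(xml_text: str) -> Optional[str]:
    """Return the file name matching the root tag, or None if unrecognised."""
    root = ET.fromstring(xml_text)
    return EXPECTED_FILENAME.get(root.tag)


# Hypothetical, heavily shortened content of the kind posted in the thread.
sample = "<app_info><app><name>milkyway</name></app></app_info>"
suggested = filename_for(sample)
```

If the suggested name differs from the name the file was saved under, the contents and the file name are mismatched, which is the problem described in this post.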
11)
Message boards :
Number crunching :
any way to change the data drive?
(Message 64058)
Posted 5 Nov 2015 by Richard Haselgrove Post: And uninstalling the BOINC programs doesn't delete your data folder. The whole process - uninstall, move folder, reinstall with manual selection of new location - takes about a minute (once you understand the process), and doesn't even lose tasks in progress. |
12)
Message boards :
News :
Fix for stderr.txt Truncation and Validation Errors
(Message 63834)
Posted 26 Jul 2015 by Richard Haselgrove Post: I just went to the BOINC website .. and the most recent version they show is 7.4.42?? http://boinc.berkeley.edu/download.php v7.6.6 is still a test version, available via the download all versions page. |
13)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63820)
Posted 21 Jul 2015 by Richard Haselgrove Post: took a look at my stats Inconclusives: that is a consequence of the way this project is configured, using Adaptive Replication. They will be validated eventually, and the number of tasks chosen for validation by wingmates will go down as the other errors reduce. Invalids: the only invalid tasks showing on your account now are from 15 July or earlier, when you were using BOINC v7.6.2 |
14)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63810)
Posted 17 Jul 2015 by Richard Haselgrove Post: Before I go, BOINC v7.6.6 is now available via the Download All page. |
15)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63809)
Posted 17 Jul 2015 by Richard Haselgrove Post: Well, I ran v7.6.6 for 48 hours (~3,300 tasks) - not a single error at my end, just one "can't validate" because too many wingmates failed, like in yesterday's screenshot. Then I regressed to v7.6.3, and within an hour got another 17-Jul-2015 12:35:04 [---] [slot] cleaning out slots/11: handle_exited_app() Again, I captured both the screenshot and the contents of the orphaned stderr.txt - it was complete, unlike task 1191232622. OK, I think that provides conclusive evidence of cause and effect - I think my work in this thread is done. Moving on to pastures new. |
16)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63807)
Posted 16 Jul 2015 by Richard Haselgrove Post: Heading close to 2,000 without error now. One additional problem at this project: the administrators have set quite a low 'maximum errors' threshold. Two validate errors together, plus one other glitch, and the whole workunit is killed. Once BOINC v7.6.6 (or its successor) is fully tested and released as 'recommended', I'd suggest you start a push to get as many people as possible to upgrade. |
17)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63804)
Posted 15 Jul 2015 by Richard Haselgrove Post: Not a single validate error, from over 500 tasks processed under BOINC v7.6.6 since this morning. |
18)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63801)
Posted 14 Jul 2015 by Richard Haselgrove Post: David has applied a possible fix for this: "client (Win): when read stderr.txt, wait for write lock to be release first." and Rom has built an installer to test it: "I've built a new version of 7.6 with David's latest change to address this issue." Those of you who have some experience already with v7.6.2 might like to try this and see how it compares - bearing in mind that at this point it is totally untested. (That's our job!) I'm clocking off for the night, but I'll switch back tomorrow morning and add to the testing effort. Edit - additional comment from David: I checked in a workaround in which the client waits until Windows programmers are invited to look at http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commitdiff;h=f2d690029c6dab9d586a9ba1a2e0af03dc7f3c70 |
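The general idea behind the fix, waiting for the writer to let go of stderr.txt before reading it, can be illustrated with a simple retry loop. This is only a sketch of the concept, not the BOINC client's code (which is C++ and is linked above); the function names, the number of attempts and the delay are all assumptions.

```python
import time


def _read_text(path):
    """Default reader: open the file and return its whole contents."""
    with open(path, "r") as f:
        return f.read()


def read_when_unlocked(path, attempts=5, delay=0.2, read_fn=_read_text):
    """Read a file, retrying while the OS reports it as locked.

    In Python on Windows a sharing violation normally surfaces as
    PermissionError, so that is the exception retried here.
    """
    for attempt in range(attempts):
        try:
            return read_fn(path)
        except PermissionError:
            if attempt == attempts - 1:
                raise  # still locked after the final attempt
            time.sleep(delay)
```

Being able to open the file does not prove the application has finished writing, so a real fix needs more care than this loop.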
19)
Message boards :
Number crunching :
What is the cause of these 'validate errors'
(Message 63799)
Posted 14 Jul 2015 by Richard Haselgrove Post: After intensive work with Keith Myers and others (mainly in the SETI message board thread Stderr Truncations), I think I've finally traced and recorded the full life-cycle of these little beasties. The easiest starting point is the debris left behind. The task completed, and for 'some reason' (we'll come back to that later) BOINC couldn't delete one of the files. So it left it for later, and moved to another slot for the next task. In the message log, that looks like:
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/2: handle_exited_app()
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/astronomy_parameters.txt
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/boinc_finish_called
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/boinc_task_state.xml
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/init_data.xml
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/milkyway_separation__modified_fit_1.36_windows_x86_64__opencl_nvidia_101.exe
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/separation_checkpoint
14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/stars.txt
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
14-Jul-2015 15:49:11 [Milkyway@Home] Computation for task ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9901989_0 finished
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/2: get_free_slot()
14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
14-Jul-2015 15:49:11 [Milkyway@Home] [slot] failed to clean out dir: unlink() failed
14-Jul-2015 15:49:11 [---] [slot] cleaning out slots/10: get_free_slot()
14-Jul-2015 15:49:11 [Milkyway@Home] [slot] assigning slot 10 to de_80_DR8_Rev_8_5_00004_1434551187_13360920_0
Note that the timestamps match. According to MSDN, error 32 is ERROR_SHARING_VIOLATION - BOINC couldn't delete the file, because Milkyway was still writing to it.
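Anyone wanting to check their own Event Log for the same symptom could pick out the 'failed to remove file' entries with a small script. The sketch below assumes the line layout seen in the log excerpt above; the regular expression and function name are illustrative, not part of BOINC.

```python
import re

# Matches e.g.:
# 14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32
FAILED_REMOVE = re.compile(
    r"^(?P<stamp>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] \[slot\] "
    r"failed to remove file (?P<path>\S+): Error (?P<code>\d+)$"
)


def find_failed_removals(log_lines):
    """Return (timestamp, path, error code) for each failed file removal."""
    hits = []
    for line in log_lines:
        m = FAILED_REMOVE.match(line.strip())
        if m:
            hits.append((m["stamp"], m["path"], int(m["code"])))
    return hits


sample_log = [
    "14-Jul-2015 15:49:11 [---] [slot] removed file slots/2/stars.txt",
    "14-Jul-2015 15:49:11 [---] [slot] failed to remove file slots/2/stderr.txt: Error 32",
    "14-Jul-2015 15:49:11 [Milkyway@Home] [slot] failed to clean out dir: unlink() failed",
]
failures = find_failed_removals(sample_log)
```

An error code of 32 on stderr.txt is the sharing violation discussed in this post.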
On the website, we see task 1187921853: Name ps_modfit_fast_15_3s_136_sim1Jun1_1_1434554402_9901989_0, Received 14 Jul 2015, 14:50:08 UTC - again it matches (my timezone is UTC+1). The stderr on the website ends ... - no final result or call to boinc_finish. But I just had time to copy stderr.txt to another part of my hard disk. That copy ends ... Again, note that the Integration time, Average time per iteration, and Integral 0 time all match (they vary from task to task), and that the call to boinc_finish timestamp matches the message log. If BOINC had waited until the last few lines had been appended to stderr.txt, as they later were, before preparing the report for the server, I have every reason to believe this would have been a valid report. It took at least 3,200 tasks to reach that point (and I think a few of the early ones have already been purged). I'll take a pause from this project for a while, and let the GPU chew on a nice restful GPUGrid task (17 hours with none of this frantic uploading and downloading). But I'll come back and test any fix that David can come up with. |
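The timestamp comparison in this post (local message log at UTC+1 against the website's UTC 'Received' time) can be checked with a few lines of arithmetic. The helper name is made up for illustration; the two timestamps are the ones quoted in the post.

```python
from datetime import datetime, timedelta, timezone

LOCAL_TZ = timezone(timedelta(hours=1))  # UTC+1, as stated in the post


def local_log_time_to_utc(stamp: str) -> datetime:
    """Convert a message-log timestamp (local time) to UTC.

    Note: %b parses English month abbreviations such as "Jul" only
    under an English/C locale.
    """
    local = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S").replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc)


finished = local_log_time_to_utc("14-Jul-2015 15:49:11")           # from the message log
received = datetime(2015, 7, 14, 14, 50, 8, tzinfo=timezone.utc)   # from the website
gap = received - finished
```

The log entry converts to 14:49:11 UTC, so the server recorded the result 57 seconds after the task finished locally.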
20)
Message boards :
News :
server issues
(Message 63665)
Posted 3 Jun 2015 by Richard Haselgrove Post: If we are - finally - to pay some attention to the server, could I remind you of three messages where I've posted about the BOINC server code being outdated?
Message 63188 - unfinished web update, corrupts < and > in [ pre ] and [ code ] blocks.
Message 63274 - php warning when 'don't move stickies to top' is selected.
BOINC message 62439 - recent ATI cards aren't recognised as being OpenCL capable.
And you'll know about the connection errors and timeouts since I started drafting the above. |