
Posts by Sunny129

41) Message boards : Number crunching : Lots of crunching errors since today (Message 55848)
Posted 18 Oct 2012 by Sunny129
Post:
They are testing new searches; if you have any issues with those, it's better to post in the News part of this forum, where there are threads for each of the new searches. The edge_1 and free_1 searches had error rates of about 5-10% according to the posts over there, so there's no reason to abort all of them; that really does not help them find the issue.

man am i glad i found this thread and this info. errors started showing up on my machine back on the 13th. i started to worry, so i switched over to Collatz Conjecture. now that i know the errors are expected as of late, and that there's nothing wrong on my end, i'll jump back into the fray.
42) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54778)
Posted 13 Jun 2012 by Sunny129
Post:
well it appears this thread has been dormant for a full week now...shame i gotta bring it back. i was happy to say that my HD 6950 had racked up ~9,000 consecutive valid tasks over the last 10 or so days, but today i got another "ps_separation_14_2s_null_3" error, specifically a "ps_separation_14_2s_null_3_1338573431_354_2" task. i don't know if it's anything meaningful or not, but i felt it was my duty as a project participant to at least report it, since it's one of the sub-types of tasks Matt is testing in this thread. i'm hoping it was just a rare glitch, but i have my doubts b/c the stderr output file looks like it crunched normally.

i'll mention that i had a completed task get marked invalid today as well. the stderr output again looked like it ran normally. so i looked at the wingmen and realized that all 4 of them got errors, rendering my result useless. though i know there was no fault on my end, i thought i'd post about it b/c it's a different type of task than the one discussed above...specifically, this task is a "de_separation_14_2s_05_3_test_1_rand_1339497601_758925" task.

anyhow, just thought i'd let the developers and testers know...i'm not reading into it too much at this point, but i'll post more if the errors start rolling in...
43) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54679)
Posted 6 Jun 2012 by Sunny129
Post:
i believe he is referring to the HD 3800 series GPU lineup.
44) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54632)
Posted 2 Jun 2012 by Sunny129
Post:
just got another "ps_separation_14_2s_null_3_..." error, though this is only my 2nd error since the server went back up several hours ago. by taking a quick look at my validated results, i'm fairly confident that the above type of task is having a 100% failure rate on my machine. i'm seeing valid "de_separation_14_2s_05_3_..." and "ps_separation_14_2s_null_3_v4_..." tasks, but no valid "ps_separation_14_2s_null_3_..." results.


EDIT - make that 3 "ps_separation_14_2s_null_3_..." errors now.
45) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54618)
Posted 2 Jun 2012 by Sunny129
Post:
i've run ~10 tasks since the server went back up, and only one errored out (a null_3 task). my first de_separation task crunched without error though.
46) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54598)
Posted 1 Jun 2012 by Sunny129
Post:
ok, the ps_separation_14_2s_null_3_v3 tasks are crunching to completion without errors...so it seems all is well for the time being.
47) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54596)
Posted 1 Jun 2012 by Sunny129
Post:
From what I can tell, it looks like the newly generating 'ps_separation_14_2s_null_3_v3' workunits are crunching and validating, so I think we're in the clear from here on out.

thanks for the update Travis. i wouldn't know yet, as i immediately suspended all MW@H work as soon as i saw WU's erroring out. now that i've just discovered the nature of the problem, i can resume crunching the remaining MW@H tasks in my queue (even though i know they'll error out). once those tasks have cleared my host, i can test the ps_separation_14_2s_null_3_v3 WU's and confirm whether or not the errors are gone...that is, if someone doesn't beat me to it.
48) Message boards : News : testing work generation with 'ps_separation_14_2s_null_3' (Message 54594)
Posted 1 Jun 2012 by Sunny129
Post:
thank god others are having the same problem LOL. i've been pulling my hair out for the last hour trying to figure out why tasks are essentially running to completion and then erroring out at the last second...i feel much better now that i know it's a server-side issue.
48) Message boards : Number crunching : Notice from server: Your app_info.xml file doesn't have a usable version of milkyway@home N-Body Simulation (Message 54558)
Posted 31 May 2012 by Sunny129
Post:
You probably have your project preferences set to receive CPU tasks, including N-Body Simulation, but your app_info.xml does not contain any CPU applications. Once you go on anonymous platform, only the applications listed in it will be used. If you want to run GPU only, change your project preferences to get rid of the messages.

that probably isn't going to fix his problem. i have my web preferences set to allow Separation tasks, but not n-body tasks. i also only have AMD (ATI) GPU enabled (CPU and nVidia GPU are disabled)...and yet i still get the "Notice from server: Your app_info.xml file doesn't have a usable version of milkyway@home N-Body Simulation" message quite regularly...

OP, it's just a message, and it has no adverse effects on crunching. i haven't been able to get rid of mine either, but i know it isn't hurting anything, so i just ignore it.
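
for context, this is roughly what a GPU-only app_info.xml looks like under anonymous platform - since there's no <app_version> entry for the N-Body app, the server has nothing it can send for it and just prints that notice. this is only a sketch to illustrate the structure; the file name and version number are made up, not copied from my actual file:

<app_info>
  <app>
    <name>milkyway</name>
  </app>
  <file_info>
    <name>milkyway_separation_ati.exe</name>  <!-- illustrative file name -->
    <executable/>
  </file_info>
  <app_version>
    <app_name>milkyway</app_name>
    <version_num>102</version_num>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>milkyway_separation_ati.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <!-- no <app_version> for N-Body anywhere in the file, hence the server notice -->
</app_info>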
50) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53964)
Posted 9 Apr 2012 by Sunny129
Post:
Ahhh yes, you are correct, more than 4 gig would not be helpful under XP!

One thing I did, since I install Win7 on my PCs more than once due to using used parts all the time: I got an 8 gig USB stick and, after making it bootable, copied the Win7 DVD onto it. It makes Win7 load up almost twice as fast as using the DVD, and it also means I don't have to put a DVD drive in each machine early on.

thanks for the suggestion...that's definitely something i'll want to do, b/c i plan on converting more than one machine over to Win7 x64 in the near future...
51) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53954)
Posted 8 Apr 2012 by Sunny129
Post:
I would first upgrade your memory to at least 8 gig, but probably 16 gig if it were my pc. There have been studies done that say a PCIe slot is not the slow-down people think it is, so that may not be your problem.

well i'm running WinXP x32 right now, so there's no sense in moving to 8GB or 16GB of memory until i upgrade to a 64-bit OS first. nevertheless, i do currently have 8GB of memory, so as soon as i make the jump to Win7 x64, it'll start recognizing more than just 3.25GB. also, i didn't intend to imply that PCIe bandwidth was part of my problem...it's just one of the upgrades i'd like to make at the same time i move to Win7 x64 on an SSD...that way i can run a 3rd GPU in the future if i so desire...

at any rate, i'm not gonna worry about the problems i was having before b/c they may or may not still exist after i switch to Win7 x64. we'll see if that eliminates them when the time comes (it may be a while before i have time to upgrade this machine), and/or creates any new problems...but i'll deal with it then.
52) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53944)
Posted 7 Apr 2012 by Sunny129
Post:
you can zero out your debts by putting this into your cc_config.xml file:
<zero_debts>1</zero_debts>. It is recommended that you only use it once and then remove it, but I know some people who just left it in there.

thanks for the additional advice, but i've made some changes that won't allow me to test this right now. i've since reverted to BOINC v6.12.43 and MW@H Separation v0.82 for a few reasons:

1) while not being able to maintain MW@H and S@H caches w/ either BOINC v6.13.xx or v7.xx.xx wasn't the end of the world, it was enough to bother me. b/c there would sometimes be dead time between the completion of the last task in the cache and the download/start of new work, my production was becoming unacceptably low for the hardware i'm running.

2) for some reason, the Separation v1.02 tasks are causing Catalyst driver resets, which i have been unable to get to the bottom of.

so in the meantime, i simply run MW@H v0.82 sometimes, and S@H sometimes (though obviously not at the same time, b/c BOINC v6.12.xx doesn't recognize the <exclude_gpu> command, and i can't run a higher BOINC version without having problems maintaining a cache). obviously this requires some babysitting, but i have little choice at this point. i'm hoping to upgrade the motherboard (something w/ more PCIe bandwidth than what i have now) and OS (Win7 x64) soon, reload BOINC and my projects, and see if that eliminates any of my existing problems or creates new ones.
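
for anyone else who wants to try the <zero_debts> suggestion: as i understand it, the tag goes inside the <options> section of cc_config.xml, something like this (a sketch based on the advice above - untested on my end, since i've reverted):

<cc_config>
  <options>
    <!-- reset the scheduling debts for all projects when the client next reads the config; remove after one use -->
    <zero_debts>1</zero_debts>
  </options>
</cc_config>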
53) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53910)
Posted 4 Apr 2012 by Sunny129
Post:
well i tried suspending all projects but Milkyway@Home, and the "drained cache" problems persisted. the same thing happened when i suspended all projects but SETI@Home. so i enabled the <coproc_debug> and <dcf_debug> logging flags to see if they would reveal any more clues through the BOINC event log. the <coproc_debug> flag didn't report anything out of the ordinary, and i really didn't let BOINC run long enough to note how DCF was changing over time. that said, for Milkyway@Home, my estimated run times are right in line w/ my actual run times, so it was no surprise to see a DCF of ~1...so why my MW@H cache would drain completely before refilling is beyond me.

SETI@Home, on the other hand, has a DCF of 0.015445 for this host, and the last check i made before that, it was at 0.02xxxx...so it's as though the DCF keeps getting smaller. if this figure seems out of line for an HD 5870 capable of 2720 GFLOPS, then i'd be interested in doing whatever is necessary to correct that value...although i'm skeptical that it'll do anything for me, seeing as how my estimated run times and DCF for Milkyway@Home are right where they should be, and yet i'm still having problems maintaining a cache.
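
for anyone following along, my rough understanding of how the client uses DCF when estimating run times (this is just how i understand BOINC's logic, not something from this thread):

estimated runtime ≈ (task's estimated flop count / device's assumed speed) × DCF

so, with made-up numbers: a task rated at 300,000 GFLOP on a card the client believes does 2,720 GFLOPS comes out to a raw estimate of ~110 seconds, and a DCF of 0.015445 would scale that down to under 2 seconds. in other words, the smaller the DCF gets, the shorter the client thinks every queued task will run, which feeds into its work-fetch math.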
54) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53902)
Posted 3 Apr 2012 by Sunny129
Post:
The new BOINC version 7 is NOT clear on the words it uses in the cache area, in the early versions anyway. If you are coming directly from version 6 to version 7, you should reverse the numbers you had in version 6. The newer version 7 releases are clearer about what each set of numbers does: basically the 1st number is now "minimum work buffer", while the 2nd is "max additional work buffer" (this is in version 7.0.23). In version 6 those two were effectively reversed, with the smaller number in the 1st field and the bigger in the 2nd. BUT carried unchanged into version 7, that says to maintain only a VERY small minimum work buffer, which is not what most of us want. I just reversed the numbers and my cache has not changed from the one version to the next.

well before i was running BOINC v6.12.41, where the "connect about every x.xx days" field was set to 0 and the "additional work buffer" field was set to 5 days. so now that i'm running BOINC v7.0.24, i have the "minimum work buffer" field set to 5 and the "additional work buffer" set to 0...in other words, the numbers are now reversed as you suggested. i made this change a good 6 hours ago, and have noticed no positive changes. my host has contacted the server several times since then to report finished tasks, but has not requested any new work, so nothing has changed...then again, the small queue of SETI tasks i had since this morning is only now winding down to zero - the last SETI WU is being crunched right now. perhaps the cache behavior will change after the next request for work (which won't happen until this last task finishes and reports). *EDIT* - the last SETI WU just completed, uploaded, and reported, but i did not receive a new cache of WU's b/c the server is down for maintenance...so i guess i'll have to wait a while before i see any new tasks...

i may have to further experiment with Kashi's suggestions and suspend one of my two GPU projects (and possibly even my CPU projects) to see how it affects the cache behavior of the one project left running.
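
for the record, the same two numbers can also be set locally via a global_prefs_override.xml file in the BOINC data directory; as i understand the field names from the BOINC documentation, it would look something like this (a sketch matching my current 5/0 settings, not a file i'm actually using):

<global_preferences>
  <!-- "minimum work buffer" in the manager UI -->
  <work_buf_min_days>5</work_buf_min_days>
  <!-- "max additional work buffer" in the manager UI -->
  <work_buf_additional_days>0</work_buf_additional_days>
</global_preferences>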
55) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53891)
Posted 2 Apr 2012 by Sunny129
Post:
I have a "full" queue from SETI with 7.0.18; I have my preferences set globally to:
Maintain enough tasks to keep busy for at least (max 10 days): 5 days
... and up to an additional 0.5 days

well those are similar to my settings (i had the minimum work buffer set to 5 days, and the additional buffer set to 2)...though i had only made those changes host-side via the BOINC manager's settings. i've since implemented those settings server-side via my web preferences as well...although i was always under the impression that the local host settings override the web preference settings, and therefore make it unnecessary to set any of the web preferences that can also be edited through the BOINC manager itself.

regardless, i've implemented the settings in both places just to be sure, and unfortunately i'm still seeing the same behavior as before - my SETI and Milkyway caches are draining completely before refilling (as opposed to downloading fewer tasks at a time, more often, and maintaining X number of tasks in the queue at all times).

any other ideas why the work buffers aren't working like they should?

it's like i'm stuck in a bad dream...either i use BOINC v7.x.xx to gain access to the <exclude_gpu> function and sacrifice normal scheduling and queue characteristics, or i go back to BOINC v6.12.xx to get normal scheduling and queues and sacrifice the ability to use the <exclude_gpu> function (which i need in order to run both Milkyway@Home and SETI@Home GPU apps at the same time on the same machine). if only BOINC v6.13.xx were a happy medium between BOINC v6.12.xx and v7.x.xx, but it's not - v6.13.xx exhibits the same buffer/queue problems for me that v7.x.xx does.

*UPDATE* - just finished a run of Multibeam tasks, at which point a single AP task was downloaded. the same thing happened the last time i got an AP task. so not only can i not maintain a queue of S@H tasks, but i'm getting no more than 1 AP task at a time...which really poses a problem for my host should the server go down...i could be out of work (both AP and MB) for ridiculous amounts of time...
56) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53889)
Posted 2 Apr 2012 by Sunny129
Post:
well i made the switch to BOINC v7.0.24 last night...mind you, the only reason i did it was to get BOINC to recognize the <exclude_gpu> function in order to force GPU 0 to run Milkyway@Home only, and GPU 1 to run SETI@Home only...not b/c the update was necessary in order to run the new OpenCL-based Separation v1.02 tasks. v7.0.24 has correctly renamed the parameter fields of interest from "connect about every x.xx days" & "additional work buffer x.xx days (max. 10)" to "minimum work buffer x.xx days" & "max. additional work buffer x.xx days." but even with the minimum work buffer and the max additional work buffer set to 5 days & 2 days respectively, i'm still experiencing the same behavior as before - both the Milkyway and SETI task queues are being drained completely before either project server allows more work to be downloaded. i should note that i believe this is being enforced server-side, b/c if i manually update either project before the queues have run down to zero, my host will report whatever tasks have completed, but won't "request any new tasks."

is anyone using BOINC v7.0.15 or later able to build up a queue and maintain it, or are we all experiencing this behavior with our hosts?
57) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53884)
Posted 1 Apr 2012 by Sunny129
Post:
With development BOINC 7.0.xx versions, "Connect about every x.xx days" has now effectively become "Minimum work buffer". In fact, in the later 7.0.xx versions it has been renamed. If you leave it at 0 days, which was previously recommended for an always-on connection, it will not download any new tasks until your cache is empty. The value needs to be set above 0 if you want to download work before the cache is empty.

This new method of controlling work download may cause tasks to run in high priority mode on projects that have a short deadline. The higher the value you use for "Connect about every x.xx days"/"Minimum work buffer", the more likely it is that tasks may go into high priority mode. It depends on the deadline; for example, on projects with a short deadline of 2 days, tasks may run high priority with a value above 0.7-0.8 days. This can cause trouble if they run out of order while leaving other tasks "waiting to run". In later BOINC versions (after 7.0.14, I think), high priority tasks run in "earliest deadline first" order, which overcomes this problem on most projects. WCG can still upset the applecart though, if a computer becomes a trusted host and gets sent repair tasks with a shorter deadline than normal tasks. These repair tasks may start to run in high priority mode as soon as they are downloaded.

thanks for the details. i remember reading something about that in another thread not too long ago, but i couldn't remember exactly what was said or which thread it was in. so again, thanks for making it clear here. i will give BOINC v7.0.15 (or later) a try this evening and treat the "Connect about every x.xx days" parameter as the minimum work buffer instead.
58) Message boards : Number crunching : tasks being sent to wrong gpu card (Message 53881)
Posted 1 Apr 2012 by Sunny129
Post:
i'm a bit late to the party, but i can also confirm that the <exclude_gpu> function works. here are my cc_config and my event log:

<cc_config>
  <options>
    <!-- keep SETI@home off device 0 (the HD 6950), leaving it free for Milkyway@Home -->
    <exclude_gpu>
      <url>http://setiathome.berkeley.edu/</url>
      <device_num>0</device_num>
    </exclude_gpu>
    <!-- keep Milkyway@Home off device 1 (the HD 5870), leaving it free for SETI@home -->
    <exclude_gpu>
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>


3/31/2012 9:57:56 PM | | ATI GPU 0: AMD Radeon HD 6900 series (Cayman) (CAL version 1.4.1664, 2048MB, 2031MB available, 6144 GFLOPS peak)
3/31/2012 9:57:56 PM | | ATI GPU 1: ATI Radeon HD 5800 series (Cypress) (CAL version 1.4.1664, 2048MB, 2023MB available, 5440 GFLOPS peak)
3/31/2012 9:57:56 PM | | OpenCL: ATI GPU 0: Cayman (driver version CAL 1.4.1664, device version OpenCL 1.1 AMD-APP (851.4), 1024MB, 2031MB available)
3/31/2012 9:57:56 PM | | OpenCL: ATI GPU 1: Cypress (driver version CAL 1.4.1664, device version OpenCL 1.1 AMD-APP (851.4), 1024MB, 2023MB available)
3/31/2012 9:57:56 PM | | ATI GPU 0 is OpenCL-capable
3/31/2012 9:57:56 PM | Milkyway@Home | Found app_info.xml; using anonymous platform
3/31/2012 9:57:56 PM | SETI@home | Found app_info.xml; using anonymous platform
3/31/2012 9:57:56 PM | SETI@home | Config: excluded GPU. Type: all. App: all. Device: 0
3/31/2012 9:57:56 PM | Milkyway@Home | Config: excluded GPU. Type: all. App: all. Device: 1


as you can see, i'm using the <exclude_gpu> function to exclude the HD 5870 from Milkyway@Home so it can focus on SETI@Home and running the display. at the same time, i'm using the function to exclude the HD 6950 from SETI@Home and display duties so it can focus solely on Milkyway@Home. i run 2 MW@H tasks simultaneously on the 6950, and either 2 S@H Astropulse tasks or 2 S@H Multibeam tasks simultaneously on the 5870 (see the fragment below for how). sometimes a single Astropulse task will run alongside a single Multibeam task.
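
in case anyone's wondering how i get 2 tasks running at once per card: under anonymous platform that's controlled by the <count> value inside the <coproc> block of the relevant <app_version> in app_info.xml. a fragment for illustration only (file name and version number made up, not from my actual file):

<app_version>
  <app_name>milkyway</app_name>
  <version_num>102</version_num>
  <coproc>
    <type>ATI</type>
    <count>0.5</count>  <!-- 0.5 GPUs per task, i.e. 2 tasks share one GPU -->
  </coproc>
  <file_ref>
    <file_name>milkyway_separation_ati.exe</file_name>  <!-- illustrative -->
    <main_program/>
  </file_ref>
</app_version>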

...i'm not really sure why it doesn't show ATI GPU 1 as OpenCL-capable as well, even though it clearly states above that my HD 5870 also makes use of OpenCL. not a big deal though...i've already run some OpenCL-based v1.02 tasks on it successfully. i should note that i was originally running BOINC v6.12.41 and was pleasantly informed by the manager under the messages tab that it didn't recognize a command in the cc_config file. going back to where i originally referenced it, i found out that only BOINC v6.13.xx and later recognize the <exclude_gpu> command. i actually skipped v6.13.xx and updated straight to v7.0.12, since it recognizes OpenCL-capable devices. my major gripe with it so far is the altered functionality of the project caches. you see, whereas with v6.12.41 i could maintain a decent-sized, but not overwhelming, cache, with v7.0.12 it drains the caches of all of my projects until they're nonexistent, and only then does my host bother to communicate w/ the project server to get more work. if i manually update a project while any amount of work is left in its cache, it'll just say "not reporting or requesting work" in the event log. is anyone else experiencing these kinds of shenanigans w/ BOINC v7.0.12? perhaps a different v7.0.xx might solve it? or might i have to go back to v6.13.xx to eliminate this behavior? and if so, which of the v6.13.xx versions are stable while crunching the new Separation v1.02 tasks?

TIA,
Eric
59) Message boards : Number crunching : No ATI work (Message 53658)
Posted 13 Mar 2012 by Sunny129
Post:
thanks
60) Message boards : Number crunching : No ATI work (Message 53651)
Posted 13 Mar 2012 by Sunny129
Post:
what challenge? i see nothing about a challenge anywhere in the forums...unless i overlooked it.


