Welcome to MilkyWay@home

Posts by SuperSluether

1) Message boards : Number crunching : Out Of Work? (Message 65418)
Posted 10 Oct 2016 by Profile SuperSluether
Post:
What is the underlying cause for the recent problems?

For example, was there a recent change to the server that caused this, or is MilkyWay being overloaded with new crunchers now that Einstein and Poem have shut down their GPU projects?
2) Message boards : Number crunching : Out Of Work? (Message 65399)
Posted 6 Oct 2016 by Profile SuperSluether
Post:
Considering "Milkyway@home" only has 82 tasks ready to send right now, and I grab 60 tasks every time I request more work, I'd say the work generator is either not keeping up, or something else is wrong.

Personally, I think we're overloading the project. Trying to log in just to post here gave me an SQL error saying there were too many connections, and that no account with my e-mail address existed.
3) Message boards : Number crunching : Milkyway@Home preventing other applications from running (Message 62644)
Posted 28 Oct 2014 by Profile SuperSluether
Post:
"It just makes sense that BOINC is going to spend a LOT of time processing Milkyway work until Milkyway "catches up". "

And that has been one of my major gripes about the scheduler for quite a while

I really don't care what the total credits or recent averages are

When I set 33% resource share for a project, what I expect is 33% FROM THAT POINT FORWARD, until I change it

(I do agree that micromanaging tends to create a lot more headaches than it fixes, tho)


I don't know for sure (and every project is a little different), but I think BOINC credit is roughly the amount of work done multiplied by the time it took. (The exceptions are projects like Bitcoin Utopia that give out way too much credit.) Because credit is usually based on how much work is done, BOINC might be trying to use that to get an equal amount of scientific work done for each project, which is different from just giving each project the same amount of processing time.
4) Message boards : Number crunching : Some "strange" crunchers (Message 62643)
Posted 28 Oct 2014 by Profile SuperSluether
Post:
I know this is an old thread, but has anyone else noticed that all of the aborted tasks are for a GPU? All of the CPU tasks complete, but each and every GPU task, whether Nvidia or ATI, has been aborted by the user. Maybe these users don't want GPU tasks for this project and don't know how to get rid of them?
5) Message boards : Number crunching : Exceeded disk limit - N-body (Message 62441)
Posted 1 Oct 2014 by Profile SuperSluether
Post:
A little off-topic, but what's going on with all these errors? I know one of the applications just updated recently, but is it really causing all these problems?
6) Message boards : Number crunching : Anyone else only getting half their usual work buffer since the feeder error? (Message 62440)
Posted 1 Oct 2014 by Profile SuperSluether
Post:
That's pretty darn fast, even for a good GPU! Are you sure the tasks are validating and not just erroring out? I'm getting off-topic...

After the feeder comes back from an outage, it's usually overloaded. It has to rebuild the work cache for the scheduler as requests come in. Yesterday, the entire MilkyWay project was running out of work at about 400 tasks per application. Today it's back at about 37,000 for the N-Body Simulation.

I don't have a graphics card, but maybe the application that runs on GPUs is just running out of work, and doesn't want to give all the work to the same person. (Tasks have to validate with more than one person.)
7) Message boards : Number crunching : Milkyway@Home preventing other applications from running (Message 62439)
Posted 1 Oct 2014 by Profile SuperSluether
Post:

Yup it sounds like deadlines are killing you right now, I suspect the short MW deadlines aren't letting the units switch. I would lower your overall cache size and see if Boinc doesn't start downloading fewer units from each project, but swapping back and forth at the 60 minute mark.


On my machine there hasn't been a switch away from the current nbody task for over 7 hours using the 6 out of 8 cores I've allocated to Boinc, even though another project that has unfinished tasks has a nearer deadline. It is quite puzzling - as if these tasks are programmed to prevent other tasks running.

- Richard.


It's just an overly-complicated programmatical thing that BOINC does. (Is programmatical a word?) BOINC runs benchmarks to see how fast your processor is, so it can do its best to get tasks reported on time. If BOINC thinks a task is nearing its deadline, it'll run that task at "high priority" and say so in the manager.
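
The "high priority" behaviour can be sketched roughly like this. It is a simplified illustration, not BOINC's actual scheduler: the `Task` fields, the `safety_margin` value, and both function names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours_remaining: float    # estimated compute time still needed (hypothetical field)
    hours_to_deadline: float  # wall-clock time left before the report deadline


def needs_high_priority(task, safety_margin=0.9):
    """Flag a task whose estimated remaining work nearly fills the time left.

    safety_margin is an assumed tuning knob, not a real BOINC setting.
    """
    return task.hours_remaining >= task.hours_to_deadline * safety_margin


def run_order(tasks):
    """Run flagged tasks first, nearest deadline first; the rest keep their order."""
    flagged = sorted(
        (t for t in tasks if needs_high_priority(t)),
        key=lambda t: t.hours_to_deadline,
    )
    normal = [t for t in tasks if not needs_high_priority(t)]
    return flagged + normal
```

With something like this, a task that needs 10 hours and is due in 11 hours would jump ahead of a task that needs 2 hours and is due in 100, no matter what the resource shares say.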

Just throwing out an example, Collatz Conjecture was a project of the month for my team. When I re-attached all my other projects a few days ago with every project at the same resource share, the work buffer filled with just one project. After a while (maybe the hour switch?) another project filled the buffer, and so on.

From what I've seen, it's best to let BOINC do its thing. The 'switch applications every 60 minutes' setting doesn't seem to work all the time, and it doesn't necessarily mean switching tasks. (There's a difference between tasks and their applications.) And even though my resource shares are all the same, I think BOINC tries to get the same crunch rate for each project, which is why Collatz hasn't had any tasks for a few days: it already had a lot of crunching done.
8) Message boards : Number crunching : CPU/GPU Comparison (do we need CPU apps when GPU app is available) (Message 62378)
Posted 23 Sep 2014 by Profile SuperSluether
Post:
but you are talking about alienating your user base, ie your work force. And that is most likely not a viable option for a volunteer project.


Exactly! BOINC Projects are all volunteer based. You can't pick and choose with something like this.
9) Message boards : Number crunching : CPU/GPU Comparison (do we need CPU apps when GPU app is available) (Message 62377)
Posted 23 Sep 2014 by Profile SuperSluether
Post:
Even with GPU apps, CPUs are the most common and hardworking. Aside from what mikey said about GPU vs CPU, older computers can't use onboard graphics, and newer computers still have slow onboard chips. Even though GPUs are fast, removing CPU apps would result in a lot less work coming in. That and the fact that the only people running the project would be 'rich' people with good cards.
10) Message boards : Number crunching : Deadline is too short !!! (Message 62194)
Posted 18 Aug 2014 by Profile SuperSluether
Post:
Some of the tasks lately have had outrageous processing time. I'm not sure what happens to overdue tasks on this project, but I would think that you could just abort it if it looks like it's not going to finish in time. Once the deadline has passed, the task is resent to somebody else anyway.
11) Message boards : Number crunching : N-Body long processing time (Message 62185)
Posted 16 Aug 2014 by Profile SuperSluether
Post:
It looks like it's just a ridiculous estimated time. I have an i7 4770 running at 3.5 GHz, and I've only ever gotten one task that took that long. It was taking too long, so after about 12 hours I aborted it manually.
12) Message boards : Number crunching : New N-Body Release 1.42 (Message 62184)
Posted 16 Aug 2014 by Profile SuperSluether
Post:
I haven't looked at the error report, but I've been getting computation errors with N-Body Simulation 1.42 as well. They run on 8 CPUs and error out after about 15 minutes. Probably just a glitch. I heard that some computers trash loads of tasks every day. :)
13) Message boards : Number crunching : Memory Leak (Message 62173)
Posted 15 Aug 2014 by Profile SuperSluether
Post:
I'm not running Milkyway now but I'm sure there was a memory leak when I last ran Milkyway Separation (not Modified Fit) OpenCL tasks, to the extent where a single task used the whole 2GB I allowed for BOINC and other tasks were suspended for lack of memory. When restarted from a checkpoint, the same task would run with only 90MB, but increasing again. The rate of memory loss depended on the work chunk frequency setting in the Milkyway preferences.


At least some of what you are seeing as a 'leak' is by design: as the unit progresses and finds things, it takes more memory to process that data than at the start of the unit. If it goes back to a checkpoint it should then build up the memory usage again to a similar place it was before.


No, that's not it. A memory leak occurs when a program stores something but doesn't need or use it. All tasks through BOINC should checkpoint to the hard drive at regular intervals to save processed data, avoiding this problem.
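
To make that definition concrete, here is a minimal Python sketch. It is not MilkyWay or BOINC code; the class names and the `checkpoint` method are made up for illustration. The first worker keeps every chunk it has ever seen even though it never reads them again, so its memory use grows without any benefit. The second keeps only the running result, which is also all it would need to save at a checkpoint.

```python
class LeakyWorker:
    """Keeps data it never uses again: the kind of growth described as a leak."""

    def __init__(self):
        self._scratch = []  # grows with every chunk and is never read

    def process(self, chunk):
        self._scratch.append(list(chunk))  # stored but not needed afterwards
        return sum(chunk)


class TidyWorker:
    """Keeps only the state needed to continue the computation."""

    def __init__(self):
        self.total = 0

    def process(self, chunk):
        self.total += sum(chunk)
        return sum(chunk)

    def checkpoint(self):
        # Hypothetical checkpoint: the small state that would be written to disk.
        return {"total": self.total}
```

After the same number of chunks, `LeakyWorker` holds one stored list per chunk, while `TidyWorker` still holds a single number.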
14) Message boards : Number crunching : Memory Leak (Message 62160)
Posted 13 Aug 2014 by Profile SuperSluether
Post:
We recommend that you tell your AV software to ignore the BOINC data folder as they have all seemed to flag BOINC apps falsely lately.


It's not flagging it as a virus; it's detecting an application error. Either way, it doesn't seem to matter.
15) Message boards : Number crunching : Memory Leak (Message 62157)
Posted 13 Aug 2014 by Profile SuperSluether
Post:
I recently started using Webroot SecureAnywhere as my antivirus program. In a system analysis, Webroot has detected a possible memory leak for milkyway_separation_modified_fit_1.30_windows_x86_64.exe. Additionally, sometimes a possible handle leak is also detected for this process.

Is this error serious, or nothing to worry about? If I need to do something, is there a way to fix the error myself? I have 8GB of RAM on my system, so I don't know if the memory and handle leaks will affect my system because the tasks finish so quickly. This might cause problems on older computers with less RAM though.




©2024 Astroinformatics Group