Welcome to MilkyWay@home

Posts by mmonnin

1) Questions and Answers : Web site : Question about new team.. (Message 70901)
Posted 23 Jun 2021 by mmonnin
Post:
Just make it. Some projects never pick up the BOINC teams.
2) Message boards : Number crunching : Not receiving enough with WUs (Message 70900)
Posted 23 Jun 2021 by mmonnin
Post:
I always run several BOINC clients when running MW to allow for 300 tasks on each client. Radeon VII spits them out in a couple of hours. Stagger the clients a bit and the card will always stay busy.
3) Message boards : MilkyWay@home Science : Galactic Rotation Curves Question. (Message 70887)
Posted 19 Jun 2021 by mmonnin
Post:
I never cared for this reasoning that the outer stars are orbiting too fast and that there must be more mass outside of the visible areas of galaxies. The ESA Gaia mission shows that the orbits of stars around the galactic center are neither circular nor at constant speed. Some have orbits like comets, which would seem to throw off the idea that the stars are rotating with similar periods and thus too fast in the outer regions.
https://www.youtube.com/watch?v=z6xdKs9KlCQ
4) Message boards : Number crunching : GPU upgrade time (Message 70374)
Posted 16 Jan 2021 by mmonnin
Post:
MW scales right along with FP64 performance. Not shaders. This is an old list but a good reference for most of the top FP64 cards since double precision keeps being cut.
https://www.geeks3d.com/20140305/amd-radeon-and-nvidia-geforce-fp32-fp64-gflops-table-computing/

A 3080 has 465.1 GFLOPS of FP64 performance, which puts it in between a Radeon HD 5850 and a Radeon HD 5870. So yeah, NV 30xx cards will suck here just like their desktop parts always have.
5) Message boards : Number crunching : Intel Xeon Phi Coprocessor (Message 69330)
Posted 5 Dec 2019 by mmonnin
Post:
I don't think there is any BOINC app at any project for this GPU.
6) Message boards : Number crunching : Delay in getting new work units untill all work units have cleared (Message 69143)
Posted 30 Sep 2019 by mmonnin
Post:
We have been monitoring the situation, and it seems like the community has found fixes to some of the problems you are experiencing.

Jake said that the problem appeared to be some obscure BOINC setting somewhere, and had asked BOINC forums about it. It looks like this issue disappears in the new beta of a BOINC client, so they must have patched whatever was causing problems. When that is released, hopefully the problem will be resolved.

- Tom


What fixes are those? MW work runs out, waits a couple of minutes, then the server finally gives us more work. The server should be giving us more work the entire time, not waiting until our MW queues are empty to provide more.
7) Message boards : Number crunching : How many CPUs? (Message 69142)
Posted 30 Sep 2019 by mmonnin
Post:
The BOINC client manages this, not the project. mt apps will sometimes use 1 thread during the initial setup phase before the science app starts doing work, then BOINC stops other tasks once the science app starts to use more CPU. The exceptions I have seen are when some tasks are close to deadline and in high priority mode.

The client needs a full CPU thread to reserve a thread so 0.0497 of a CPU does not count towards your 6 available threads.
8) Message boards : Number crunching : Rx570 vs. gtx 1080, 1080ti, 2080 (Message 69116)
Posted 24 Sep 2019 by mmonnin
Post:
PrimeGrid GFN are FP64 as well. The last NV consumer card with high FP64 was the Titan Black I believe. Now NV leaves that to the Pro cards like Tesla.
9) Message boards : Number crunching : Delay in getting new work units untill all work units have cleared (Message 69109)
Posted 23 Sep 2019 by mmonnin
Post:
Set up a 2nd gpu project with a zero resource share so when MW runs out it will get tasks and your gpu won't be idle while waiting for MW to refill the cache. PrimeGrid and Collatz are two projects that almost always have tasks. At PrimeGrid you can pick wu's that run very quickly like the MW wu's do, and at Collatz the wu's will run much faster as well if you use the optimization codes.


This is just a workaround, not a solution.. developers must fix it

This is for sure a server-side misconfiguration, in 15years of boinc it never happened with any other project


I do this for every client and it's always a good idea no matter your main project.
10) Message boards : Number crunching : WUs not downloaded in time - rig is idling - doing no work ... (Message 69098)
Posted 21 Sep 2019 by mmonnin
Post:
when you run out of WU the first deferred communication is always more or less 1.40 minutes.. in this first stage it uploads the latest results (and doesn't download anything)

then the second deferred communication of more or less 12 minutes begins (can't remember).. the gpu is idling and then, since there are no results to report, the download of 300 wu begins

this loop is always the same on all hosts.. next time i'll run out of WU i will post the exact times..


There is nothing to actually upload. Set the client to no networking and the tasks go straight to Waiting to Report and not Uploading. There are no data files to download or upload for this project per task.
11) Message boards : Number crunching : Rx570 vs. gtx 1080, 1080ti, 2080 (Message 69066)
Posted 19 Sep 2019 by mmonnin
Post:
AMD RX cards perform MUCH better at E@H than MW@H. That is where I ran my RX580.

This project favors cards with high FP64 compute power. So AMD 78xx/R9, NV Titan Black, AMD Radeon VII, NV Titan V. Most of the top PCs run one of those 4 generations of GPUs.
https://milkyway.cs.rpi.edu/milkyway/top_hosts.php

I have most of your cards, the RX, 1080 and Ti, but will never run them at MW@H as they have low FP64 compute power. But if you want, run as many simultaneous tasks as it takes to keep the utilization pegged. The CPU part of <gpu_versions> just dedicates that much CPU to NOT run CPU tasks. It in no way affects the actual CPU usage of the GPU app. If you put <cpu_usage>4</cpu_usage>, BOINC will run 4 fewer CPU tasks and leave those 4 CPUs for the GPU even if the GPU task uses 0.1 CPU threads in Task Mgr.
12) Message boards : Number crunching : de_modfit_80_bundle4_4s_south4s - error messages (Message 69065)
Posted 19 Sep 2019 by mmonnin
Post:
0130 Thursday (Central EU time) the reinstallation of studio driver manually seems to have fixed this issue.
task is running ok.


The very 1st suggestion was the fix...
13) Message boards : Number crunching : Delay in getting new work units untill all work units have cleared (Message 69064)
Posted 19 Sep 2019 by mmonnin
Post:
https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424

Already discussed.
14) Message boards : Number crunching : WUs not downloaded in time - rig is idling - doing no work ... (Message 69050)
Posted 17 Sep 2019 by mmonnin
Post:
Yes, this was reported in May. From the last result there needs to be a 10min period of no requests until the clients can get more work.
https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424&postid=68441#68441

I'm about to set up a script to turn off networking for about 11min, resume and do a project update, allow 30min or so, then repeat.
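A rough sketch of what that loop could look like in Python. It assumes `boinccmd` is on the PATH and can talk to the local client, and it uses the project URL from the scheduler reply file; use whatever URL your own client lists. The function names, the 5 second pause before the update and the exact timings are placeholders for illustration, not a tested recipe.

```python
import subprocess
import time

# Assumption: the project URL as your client knows it; adjust if yours differs.
PROJECT_URL = "http://milkyway.cs.rpi.edu/milkyway/"


def cycle_commands(project_url=PROJECT_URL):
    """Return (boinccmd call, seconds to wait afterwards) for one cycle."""
    return [
        # Stay silent for 11 minutes so the server-side delay can expire.
        (["boinccmd", "--set_network_mode", "never"], 11 * 60),
        # Hand networking back to the client's preferences (short pause is a guess).
        (["boinccmd", "--set_network_mode", "auto"], 5),
        # Ask for work, then crunch normally for about 30 minutes.
        (["boinccmd", "--project", project_url, "update"], 30 * 60),
    ]


def run_cycles(cycles, run=subprocess.run, sleep=time.sleep):
    """Run the off / on / update sequence the given number of times."""
    for _ in range(cycles):
        for cmd, wait in cycle_commands():
            run(cmd, check=True)
            sleep(wait)
```

Each cycle takes about 41 minutes, so something like `run_cycles(35)` covers roughly a day. The `run` and `sleep` parameters are only there so the sequence can be dry-run without touching a real client.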
15) Message boards : Number crunching : de_modfit_80_bundle4_4s_south4s - error messages (Message 69037)
Posted 15 Sep 2019 by mmonnin
Post:
Post the results of the command 'clinfo' from a CMD prompt. You may have installed NV drivers but Win10 probably overwrote them.
16) Message boards : Number crunching : MilkyWay takes a backseat to Einstein ??? (Message 68992)
Posted 28 Aug 2019 by mmonnin
Post:
I always run 1 client for CPUs and a separate client for GPU work on a single PC. That'll fix caching issues between CPU and GPU tasks.
17) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68483)
Posted 6 Apr 2019 by mmonnin
Post:
200 computed tasks in less than 10 minutes? It is not possible even for the fastest machines. The very best computers, with a few modern powerful GPUs working in parallel and dedicated to the single project of MW, can do 200 tasks "only" in ~20-40 min.


Beg to differ. If I have a host with 8 RTX 2080 TI cards or similar, I can easily crunch through 200 tasks in ten minutes. There are many hosts with mining rig pedigrees that have multiple gpus. I have a minimum of 3 cards in every host.


Then your task count is higher with more cards. 200 is the limit for 1 GPU and the statement was in regards to 1 single GPU. Only a Titan V or Radeon VII gets through 200 tasks per GPU in that time.

It seems like it's hard enough to get the admins to realize the issue wasn't how many tasks can be downloaded at once but the timeout issue completely preventing tasks from downloading at all. Please stay on topic instead of e-peening about omg my gpus can do it in 10 minutes.
18) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68455)
Posted 29 Mar 2019 by mmonnin
Post:
Well for my 280x it takes over 2 hours but there are quite a few requests within 10min. I wish for 30min. :) I'd guess a PC would need to run 1 task for over 10min to not run into the issue.

These 4 lines come every 2-4 task completions. Some complete, none are downloaded. Queue runs dry. Moo/Collatz take over 10min for a task and then a new set of MW tasks arrives.

362062 Milkyway@Home 3/28/2019 11:01:04 PM Sending scheduler request: To fetch work.
362063 Milkyway@Home 3/28/2019 11:01:04 PM Reporting 2 completed tasks
362064 Milkyway@Home 3/28/2019 11:01:04 PM Requesting new tasks for AMD/ATI GPU
362065 Milkyway@Home 3/28/2019 11:01:06 PM Scheduler request completed: got 0 new tasks

Can the server distinguish between auto updates like those above and user updates so that the former could have a lower limit than possible user spam? The log file mentions a user update but does the server know?
19) Message boards : Number crunching : Download Stalled? (Message 68452)
Posted 28 Mar 2019 by mmonnin
Post:
Recently been having trouble getting new tasks, because some N-Body won't download - and I haven't opted for N-Body tasks for a couple of years? Is there a simple fix for this?

3/28/2019 3:03:11 PM | Milkyway@Home | Reporting 4 completed tasks
3/28/2019 3:03:11 PM | Milkyway@Home | Not requesting tasks: some download is stalled
3/28/2019 3:03:13 PM | Milkyway@Home | Scheduler request completed
3/28/2019 3:13:16 PM | Milkyway@Home | Sending scheduler request: Requested by project.
3/28/2019 3:13:16 PM | Milkyway@Home | Not requesting tasks: some download is stalled
3/28/2019 3:13:17 PM | Milkyway@Home | Scheduler request completed
3/28/2019 3:23:18 PM | Milkyway@Home | Sending scheduler request: Requested by project.
3/28/2019 3:23:18 PM | Milkyway@Home | Not requesting tasks: some download is stalled
3/28/2019 3:23:20 PM | Milkyway@Home | Scheduler request completed
3/28/2019 3:33:23 PM | Milkyway@Home | Sending scheduler request: Requested by project.
3/28/2019 3:33:23 PM | Milkyway@Home | Not requesting tasks: some download is stalled
3/28/2019 3:33:25 PM | Milkyway@Home | Scheduler request completed


Have you tried resetting the project?
20) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 68450)
Posted 28 Mar 2019 by mmonnin
Post:
Hey guys,

So the current set up allows for users to have up to 200 workunits per GPU on their computer and another 40 workunits per CPU with a maximum of 600 possible workunits.

On the server, we try to store a cache of 10,000 workunits. Sometimes when a lot of people request work all at the same time, this cache will run low.

So all of the numbers I have listed are tunable. What would you guys recommend for changes to these numbers?

Jake


It's not any of these settings. When the server allows work, we get work. But there is a timeout to prevent users from spamming projects with frequent requests. These tasks are so quick that they are constantly uploading, about every 30-35 seconds for me. So we are constantly requesting too frequently until all tasks are done, the delay passes and then we can get more work.

I still have a PC that has not contacted the server after the upgrade. The sched_reply_milkyway.cs.rpi.edu_milkyway.xml file does not have this line at all in the old version.

<next_rpc_delay>600.000000</next_rpc_delay>

https://boinc.berkeley.edu/trac/wiki/ProjectOptions#client-control
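To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a simplified model that is not confirmed by the project: one task finishes every 35 seconds, the queue starts at the 200-per-GPU limit, nothing new arrives while results keep being reported, and the GPU then sits idle for exactly one 600 second delay window. Real idle time can be longer because of client backoff.

```python
TASKS_PER_GPU = 200     # per-GPU limit quoted by Jake
SECONDS_PER_TASK = 35   # upper end of the 30-35 s mentioned above
RPC_DELAY = 600         # <next_rpc_delay> value in the new scheduler reply


def idle_fraction(tasks, seconds_per_task, rpc_delay):
    """Share of wall-clock time the GPU idles per drain-and-refill cycle."""
    busy = tasks * seconds_per_task
    return rpc_delay / (busy + rpc_delay)


busy_seconds = TASKS_PER_GPU * SECONDS_PER_TASK
print(busy_seconds)  # 7000
print(round(idle_fraction(TASKS_PER_GPU, SECONDS_PER_TASK, RPC_DELAY), 3))  # 0.079
```

So under these assumptions the card crunches for a bit under two hours and then loses at least ten minutes, roughly 8% of the cycle, and changing the per-GPU task limit only shifts that ratio rather than removing the gap.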

For reference, the entire old version of the file minus some user info.

<scheduler_reply>
<scheduler_version>707</scheduler_version>
<dont_use_dcf/>
<master_url>http://milkyway.cs.rpi.edu/milkyway/</master_url>
<request_delay>91.000000</request_delay>
<project_name>Milkyway@Home</project_name>
<project_preferences>
<resource_share>10</resource_share>
<no_cpu>1</no_cpu>
<no_ati>0</no_ati>
<no_cuda>0</no_cuda>
<project_specific>
<max_gfx_cpu_pct>20</max_gfx_cpu_pct>
<gpu_target_frequency>60</gpu_target_frequency>
<nbody_graphics_poll_period>30</nbody_graphics_poll_period>
<nbody_graphics_float_speed>5</nbody_graphics_float_speed>
<nbody_graphics_textured_point_size>250</nbody_graphics_textured_point_size>
<nbody_graphics_point_point_size>40</nbody_graphics_point_point_size>
</project_specific>
<venue name="home">
<resource_share>50</resource_share>
<no_cpu>0</no_cpu>
<no_ati>1</no_ati>
<no_cuda>1</no_cuda>
<project_specific>
<max_gfx_cpu_pct>20</max_gfx_cpu_pct>
<gpu_target_frequency>60</gpu_target_frequency>
<nbody_graphics_poll_period>30</nbody_graphics_poll_period>
<nbody_graphics_float_speed>5</nbody_graphics_float_speed>
<nbody_graphics_textured_point_size>250</nbody_graphics_textured_point_size>
<nbody_graphics_point_point_size>40</nbody_graphics_point_point_size>
</project_specific>
</venue>
</project_preferences>

<result_ack>
    <name>de_modfit_sim19fixed_bundle4_4s_NoContraintsWithDisk260_3_1533467104_9447502_1</name>
</result_ack>
<result_ack>
    <name>de_modfit_sim19fixed_bundle4_4s_NoContraintsWithDisk260_1_1533467104_9241919_1</name>
</result_ack>
</scheduler_reply>
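For anyone who wants to check their own reply file, a minimal Python sketch that pulls out the two delay values. The sample string is a trimmed copy of the old reply above, and `read_delays` is just an illustrative helper name. It assumes the file parses as well-formed XML, which may not hold for every reply the client writes; in that case `ElementTree` raises a `ParseError`.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the old reply shown above (preferences and acks removed).
OLD_REPLY = """<scheduler_reply>
<scheduler_version>707</scheduler_version>
<request_delay>91.000000</request_delay>
<project_name>Milkyway@Home</project_name>
</scheduler_reply>"""


def read_delays(xml_text):
    """Return request_delay and next_rpc_delay as floats, or None if absent."""
    root = ET.fromstring(xml_text)

    def as_float(tag):
        # findtext only looks at direct children of <scheduler_reply>.
        text = root.findtext(tag)
        return float(text) if text is not None else None

    return {
        "request_delay": as_float("request_delay"),
        "next_rpc_delay": as_float("next_rpc_delay"),
    }


print(read_delays(OLD_REPLY))
# {'request_delay': 91.0, 'next_rpc_delay': None}
```

To run it against a real file, read sched_reply_milkyway.cs.rpi.edu_milkyway.xml from the BOINC data directory and pass its contents to `read_delays`; a reply from the upgraded server should show 600.0 for `next_rpc_delay` if the line quoted above is present.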


©2021 Astroinformatics Group