Welcome to MilkyWay@home

Posts by JStateson

1) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69289)
Posted 4 hours ago by JStateson
Post:
Great work @JStateson!
From what I understand, the SETI guys only made the custom BOINC for Linux. If we're running Windows then we'd have to recompile our own, like you did. I'm running Milky on W10, so my cap is 900. But again, like you said, we can set up multiple instances to grab work if we anticipate a long down time.


Thanks VietOZ!

With 6 GPUs I am averaging just over 7 seconds per work unit, so 900 units last only about 2 hours. I am currently running 2 clients on that same system, with each client getting 900 units. This will last for a total of about 4 hours. I could run 6 clients and spoof the number of GPUs to allow me to crunch through about 12 hours of down time. I accidentally deleted 900 work units setting up the second client, but I now know how to do it correctly and am working on a script to automate the extra clients.
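
The cache arithmetic above can be sketched roughly as follows. This is only an illustration: the function name is made up, and it assumes the 7 seconds is the host-wide average per work unit and that every client gets the full 900.

```python
def cache_hours(work_units, seconds_per_unit, clients=1):
    """Hours of crunching that a full cache lasts.

    seconds_per_unit is the average over the whole host (all GPUs together),
    and each client is assumed to hold its own cache of work_units tasks.
    """
    return work_units * clients * seconds_per_unit / 3600.0


for n in (1, 2, 6):
    print(n, "client(s):", cache_hours(900, 7, clients=n), "hours")
```

At exactly 7 seconds per unit this comes out a little under the rounded figures quoted in the post, which is expected since those are approximations.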
2) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69286)
Posted 10 hours ago by JStateson
Post:
I simply alter the coproc file so I can get the max of 900, then run an update command every 92 seconds.

For linux:
watch -n 92 boinccmd --project http://milkyway.cs.rpi.edu/milkyway/ update


for windows:
:top
"C:\Program Files\BOINC\boinccmd" --passwd PASSWORD --project http://milkyway.cs.rpi.edu/milkyway/ update

TIMEOUT /T 92

goto top


Yes, that works and you might also use
"C:\Program Files\BOINC\boinccmd" --host hostname:port --passwd …..
for remote systems
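
For anyone who would rather have one script for both Windows and Linux, here is a minimal Python sketch of the same update loop. It only uses the boinccmd options already shown above (--host, --passwd, --project ... update); the function names and the defaults are assumptions made for illustration.

```python
import subprocess
import time

PROJECT_URL = "http://milkyway.cs.rpi.edu/milkyway/"


def update_command(boinccmd="boinccmd", host=None, passwd=None):
    """Build the boinccmd argument list for a project update.

    host is "hostname:port" for a remote client, passwd is the GUI RPC
    password; both are optional, as in the commands quoted above.
    """
    cmd = [boinccmd]
    if host:
        cmd += ["--host", host]
    if passwd:
        cmd += ["--passwd", passwd]
    cmd += ["--project", PROJECT_URL, "update"]
    return cmd


def update_forever(interval=92, **kwargs):
    """Send an update every `interval` seconds until interrupted (Ctrl-C)."""
    while True:
        subprocess.run(update_command(**kwargs), check=False)
        time.sleep(interval)
```

Calling update_forever(host="hostname:port", passwd="PASSWORD") would then play the role of the batch loop, with the placeholders replaced by real values.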

It is also possible to get more than 900 but that is only useful if the project goes offline for maintenance as you can continue to crunch until it comes back online. It is easier to do that with a change to boinc but it could still be done using those same update commands, a change to the coproc file and multiple clients. It would be better if the project could handle this problem.

The SETI GPU users club has a secret Boinc client they share among themselves to bypass project and Boinc download restrictions. I did not want to join their club so it became a challenge for me to come up with the same type of mod to the client. I am making all my changes public on GitHub for anyone to see.
3) Questions and Answers : Windows : Connecting to my cruncher remotely (Message 69275)
Posted 2 days ago by JStateson
Post:
Hey guys,

I've got quite some problems connecting to my machine with no monitor attached.

This is what I have tried:

RDP - GPU computing stops
TeamViewer - Once I close the RDP session I can't connect anymore
TightVNC - Whenever I connect I just see a black screen
Anydesk - I get the Windows Login screen, then I log on and it says "Waiting to reconnect" or something like that and nothing happens anymore

Is it really that difficult to connect to a Windows 10 machine?


Splashtop works fine for Windows and uses CUDA on systems with NVIDIA chips. There is a limit of 5 free servers. There is no Linux support, but there is iPhone support. If a remote system is on a different subnet you can pay $11 a year for truly "remote" or "mobile" access.

I also use RealVNC, which has 5 free but does not support different subnets or mobile. I never got the Linux version to work, but I am not a Linux expert and ssh seems to be OK for what I use Linux for.
4) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69272)
Posted 2 days ago by JStateson
Post:


The latest version available for download is 7.14.2

Where can I download these new binaries?


http://stateson.net/bthistory/boinc_x64_for_milkyway.zip

The following procedure assumes that your original boinc.exe is at "/Program Files/boinc"

I do not have an install procedure so it must be installed manually

Extract the boinc.exe file from the zip archive and save it at /Downloads or where convenient
It can only be executed from the program directory so trying "boinc.exe --version" will tell you files are missing

You must stop boinc from executing before replacing it.
To stop boinc, first bring up the boinc manager, then exit the boinc manager and specify that running programs should also be stopped.

After stopping boinc you should rename the original program from boinc.exe to old_boinc.exe

Copy the new program into the /Program Files/boinc folder
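
The rename-and-copy part of the steps above could be scripted along these lines. This is a sketch only: the function name is hypothetical, it assumes boinc has already been stopped as described, and on Windows it would need to run with enough rights to write to the program folder.

```python
import shutil
from pathlib import Path


def swap_client(program_dir, new_exe, backup_name="old_boinc.exe"):
    """Rename the installed boinc.exe and copy the replacement into place.

    program_dir: folder holding the original boinc.exe
    new_exe:     path of the boinc.exe extracted from the zip archive
    Refuses to run if a backup already exists, so an earlier backup of the
    original client is never overwritten.
    """
    program_dir = Path(program_dir)
    current = program_dir / "boinc.exe"
    backup = program_dir / backup_name
    if backup.exists():
        raise FileExistsError(f"{backup} already exists, not overwriting it")
    current.rename(backup)          # keep the original client
    shutil.copy2(new_exe, current)  # put the new client in its place
    return backup
```

Undoing the change is the reverse: stop boinc, delete the new boinc.exe and rename old_boinc.exe back.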


Starting up the boinc manager should also start up boinc. Check to see if the version is 7.15.0 for the new program. After a few minutes of looking at the event message you should notice a download of a few files. Eventually the number of work units waiting to be processed will rise up and hover near the maximum. The only time it will drop to 0 is when the project goes off-line. On my system the count stays between 850 - 890 all the time.



I have shortcuts for starting and stopping boinc but the normal startup for boinc must be removed from the windows registry or a conflict arises. PM me if you want to do this. They are not needed to get this milkyway version to work.

Let me know if there is a problem and I can put together a better set of instructions.

[EDIT]
for 32 bit systems (I have no way of testing this and no longer have xp, vista, 7 or 8)
http://stateson.net/bthistory/boinc_x32_for_milkyway.zip
5) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69268)
Posted 3 days ago by JStateson
Post:
I can't fully understand how it works.

Can you please detail the setup process?


I used VS2013 to build a new version of boinc, "7.15.1". The boinc people have put together a nice package for building on Windows and also Linux.

I made a change to the source code of cs_schedule.cpp to delay boinc from asking the milkyway project for more data on every upload. It lets a minimum of 256 seconds go by before asking for more data. This allows one or two uploads to occur before data is requested and that seems to be what is needed to bypass the "you asked for data too soon" or the 91 second required limit.
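
The idea of the delay can be shown with a small stand-alone sketch. This is not the actual change made to cs_schedule.cpp (that is in the GitHub repository); the class and method names here are invented, and only the 256 second figure comes from the post.

```python
class WorkRequestGate:
    """Allow a work request only if enough time has passed since the last one."""

    def __init__(self, min_interval=256.0):
        self.min_interval = min_interval
        self.last_request = None  # time of the last allowed request, in seconds

    def may_request(self, now):
        """Return True (and remember `now`) if a work request is allowed."""
        if self.last_request is None or now - self.last_request >= self.min_interval:
            self.last_request = now
            return True
        return False


gate = WorkRequestGate()
# Uploads finishing at these times (seconds); only some may also ask for work.
for t in (0, 100, 255, 256, 300, 512):
    print(t, gate.may_request(t))
```

Uploads that arrive inside the window simply go out without a work request, which is what keeps the client clear of the 91 second limit.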

The changes to the program are here
https://github.com/JStateson/MilkywayNewWork
All you need (ha ha) is Visual Studio 2013 or earlier, plus the sources and Windows dependencies downloaded from Berkeley. I can help you through the download from Berkeley, but you will have to find your own VS2013 iso file. I don't recommend a torrent that pulls from any Eastern Bloc countries. No telling what "extra stuff" was included in the package. Once you download and get the original built, then download my changes and add them in.

If you don't want to do this you can download the boinc executable files I put there at GitHub. You will have to answer a lot of "are you sure" questions as I did not buy any certificates that "bless" the download. It needs to go in Program Files\boinc if 64 bit, or get the 32 bit one for the x86 program folder. You might want to rename the original executable to boinc_original.exe

I deleted the Linux one because it was built for my 18.04 and probably would not work on other Linux versions. It is actually a lot easier to build the Linux version as there is no need to hunt down a 7+ year old compiler or get all the Windows dependencies.

It is not necessary to replace the cc_config.xml with the one I put at GitHub.

What the change does is this: about every four minutes or so it will ask for data from milkyway and will download enough to bring your total count up to 900 or whatever the limit is.

Lemme know if there is a problem. I tested it with world community grid to make sure other projects are not affected. It only delays the milkyway project; no others are affected. I have no way of testing the 32bit version nor do I have a copy of xp, vista or win7 to test on, so if there is a problem let me know.
6) Message boards : Number crunching : Number crunching with AMD S9100 (Message 69261)
Posted 3 days ago by JStateson
Post:
Your computers are hidden, so it is hard to advise. I think you are S.O.L. on getting better video. I gave up getting an hd7950 to work with my s9100. The s9000 pairs perfectly but has only one vid out. The fan to use on the s9000 is NMB BG0903-B047-VTL, but it must be an open mining rig, not a case. There is a mini display port hidden behind the retainer panel (the grill) on the s9100 but I never got it to work.
7) Questions and Answers : Windows : General thinking about GPU computation (Message 69259)
Posted 4 days ago by JStateson
Post:

The good point, in my opinion, is that I consume about 115 W for this computation power, with a GPU load of 98-99 % per WU and a temperature of only 64-65 °C (separation with ATI GPU). That's good for preserving the electronic device.

Does someone want to share his/her thinking?

Diego


Yea, I can share my problem:

I have six of the "firepro"
https://milkyway.cs.rpi.edu/milkyway/top_hosts.php
There are 3 people ahead of me on the stats list. Two have a pair of VII and one has a pair of Titans. My 6 firepro (five s9000, one s9100) draw 1050 watts on a 220v line in the garage.

However, I bought all 6 of those on ebay for under what a single VII cost. Titans are even more expensive. I have no idea what the power draw of those boards is at the wall.

When I complain to my GF that she failed to turn on the lights at night she complains about my milkyway gridcoin mining system that cannot pay for its electricity.

I may switch to SETI in which case I will slowly disappear off the stats list.
8) Message boards : Number crunching : How many CPUs? (Message 69248)
Posted 11 days ago by JStateson
Post:

Would you happen to know the entire app_config.xml statement/expression? I am not very familiar with these files.

Thanks!


At one time the eVga forum had a thread going about app_config files for various projects.

https://forums.evga.com/BOINC-app_configxml-file-settings-for-GPU-Projects-m2213432.aspx

I learned a few things there but that thread has not been updated for 2 years.

You can google for "boinc app_config.xml" and look for various projects

I had the idea once of helping put together a wiki for all the various app_config files for all projects, but it would require help from the project principals to get, for example, a list of command line arguments to the project's app. I am not sure if some of the moderators even know what options the apps have.
9) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69244)
Posted 13 days ago by JStateson
Post:
Got the ubuntu version to work.

The win32, win64 and ubuntu executables are at

https://github.com/JStateson/MilkywayNewWork
10) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69243)
Posted 13 days ago by JStateson
Post:

Are you sure this is a cc_config.xml file and not an app_config.xml file? When I put it in my Boinc directory it gives me error messages:

11/9/2019 9:19:22 AM | | Unrecognized tag in cc_config.xml: <mw_low_water_pct>
11/9/2019 9:19:22 AM | | Unrecognized tag in cc_config.xml: <mw_high_water_pct>
11/9/2019 9:19:22 AM | | Unrecognized tag in cc_config.xml: <mw_wait_interval>


The old program will give that error message as it does not know about the new variables added.
You have to use the new .exe.
Rename the old one to boinc_old.exe or whatever.
11) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69241)
Posted 13 days ago by JStateson
Post:

In your work flow file you seem to be sending more units than you get back from MW, but the 'wait_interval' does seem to be working for you. Do you think a longer interval will result in the number being returned and the number being sent to you by MW evening out?



The delay just needs to be big enough to satisfy the project's requirement of "shut up for x seconds". Looking at "sched_reply_milkyway.cs.rpi.edu_milkyway.xml" I see
<request_delay>91.000000</request_delay>

So one could go down to 92.
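
As a rough illustration of where the 92 comes from, the delay can be read out of the reply and rounded up to the next whole second. In this sketch only the <request_delay> line is taken from the post; the <scheduler_reply> wrapper and the function name are assumptions, and a real reply file may need more forgiving parsing.

```python
import math
import xml.etree.ElementTree as ET

# Assumed sample: only the <request_delay> line comes from the post, the
# enclosing element is a guess at the surrounding structure.
SAMPLE_REPLY = """<scheduler_reply>
<request_delay>91.000000</request_delay>
</scheduler_reply>"""


def min_update_interval(reply_xml):
    """Smallest whole number of seconds strictly above the requested delay."""
    root = ET.fromstring(reply_xml)
    text = root.findtext(".//request_delay")
    if text is None:
        raise ValueError("no <request_delay> element found")
    return math.floor(float(text)) + 1


print(min_update_interval(SAMPLE_REPLY))
```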
The 256 keeps me at a total of near 900. It has been running on two machines overnight, both win, and the total tasks remain near 900. I believe the client queues them up in order of arrival (FIFO) so there should be no stale work units even if the total count never drops below about 850-880.

Working on a Linux version.

I have only tested this with one project, "milkyway", but the mod I made tests for that project and does not muck with scheduling for other projects.

The other system I am testing this on has a pair of rx560, one rx580 and one hd7950, so it is slower. It has a total count of between 890 and 905 work units.
12) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69238)
Posted 13 days ago by JStateson
Post:
That master-slave project is on hold as I got a simpler way.

The problem I found is that one cannot upload results every time when asking for data on a fast system, so I made a mod to check if the elapsed time from the last upload was greater than 256 seconds and only uploaded results when that happens. This actually worked too well, as I started to get over 900 total work units, so I had to back off to allow more results to be uploaded. Currently, on my system, there are around 800 units at any one time, which is nice if Milkyway goes offline as I can continue to crunch.

Program is here
https://github.com/JStateson/MilkywayNewWork

check out the document sample_mw_work_flow.txt

!!!!!!!!!!!!!Use at your own risk.!!!!!!!!!!!!!!!
13) Message boards : Number crunching : Long crunch time on new N-Body simulations? (Message 69233)
Posted 16 days ago by JStateson
Post:
Is there a difference in runtimes between the work units? A recent 4-core unit took longer than a 3-core unit!


I no longer run n-body. They have had some problems in the past.

What was the difference in time?

You might want to run a resource monitor to see how many cores it is using.

I have a tool here that I used to do a credit comparison on your 3 valid results. Ideally for the same amount of credit the cpu time should be the same
        Run Time     CPU Time     Credit
         (sec)         (sec)
            11.6          43.1          1.0
            26.1          74.6          1.0
            21.2          59.5          1.0

normalizing can show problems with credit calculations
or which gpu devices are faster or slower than others
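
A small sketch of that comparison, using the three rows from the table above. It is not the tool mentioned in the post; the function names are made up, and it simply expresses each task's CPU time per credit relative to the cheapest task.

```python
# (run time s, CPU time s, credit) for the three valid results in the table
ROWS = [
    (11.6, 43.1, 1.0),
    (26.1, 74.6, 1.0),
    (21.2, 59.5, 1.0),
]


def cpu_seconds_per_credit(rows):
    """CPU seconds spent for each unit of credit, one value per task."""
    return [cpu / credit for _run, cpu, credit in rows]


def relative_to_best(values):
    """Each value divided by the smallest one (1.0 means the cheapest task)."""
    best = min(values)
    return [round(v / best, 2) for v in values]


print(relative_to_best(cpu_seconds_per_credit(ROWS)))  # [1.0, 1.73, 1.38]
```

If credit tracked CPU time perfectly, every entry would be close to 1.0.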


Looks like the top task had fewer CPUs available than the bottom two
14) Message boards : Number crunching : Finally getting new tasks only seconds after running out. May not be worth the hassle. (Message 69225)
Posted 19 days ago by JStateson
Post:
Ideally the project should download a few tasks on every upload but life is not fair.

The best I could do previously was to send that "UPDATE" message about 2-3 minutes after the last task was completed. That gave an average of 7 minutes of idle time, unlike the 12-15 without that update. With 6 GPUs each handling 5 concurrent tasks at 55 seconds per work unit I was losing roughly 50 tasks every 2 or so hours, probably 500 a day minimum, using just that 7 minutes. I spent a long time looking at this and it quickly became a "challenge" even though the amount of credit was small.

I had to create a pair of clients: "slave" and "master". Both start up within seconds of one another and both exit when idle. There is actually a command to do that, "boinc.exe --exit_when_idle", which was convenient. All I had to do (there was more*** of course) was to have each program send the message "allow_new_tasks" to the other, with each program in a "goto" loop. The idea was that the slave would introduce itself to Milkyway but not ask for data. The master would start right in and, as soon as the last work unit was crunched, tell the slave it was time to start, and vice versa.

The scripts and the raw output are here
https://stateson.net/images/mw_chatter_m_s.txt

However, I pasted the important stuff below

Slave task started at:

03-Nov-2019 15:12:30 [---] Running under account jstateson...
---
---
45 minutes later it ran out of data, exited and started right back up
---
03-Nov-2019 15:57:59 [Milkyway@Home] Reporting 5 completed tasks
03-Nov-2019 15:57:59 [Milkyway@Home] Not requesting tasks: "no new tasks" requested via Manager
03-Nov-2019 15:58:00 [Milkyway@Home] Scheduler request completed
03-Nov-2019 15:58:00 [---] exiting because no more results
03-Nov-2019 15:58:00 [---] Time to exit
03-Nov-2019 15:58:00 [---] Starting BOINC client version 7.15.0 for windows_x86_64

The message "allow new tasks" was sent by the slave to the master at exactly 15:58:00, as shown below.
The master was started 12 seconds after the slave.

03-Nov-2019 15:12:42 [---] Starting BOINC client version 7.15.0 for windows_x86_64
---
---
---
03-Nov-2019 15:12:44 Initialization completed
03-Nov-2019 15:15:07 [Milkyway@Home] project resumed by user
03-Nov-2019 15:58:00 [Milkyway@Home] work fetch resumed by user
03-Nov-2019 15:58:01 [Milkyway@Home] Sending scheduler request: To fetch work.
03-Nov-2019 15:58:01 [Milkyway@Home] Requesting new tasks for AMD/ATI GPU
03-Nov-2019 15:58:06 [Milkyway@Home] Scheduler request completed: got 900 new tasks


=================improvement============
15:57:59 the slave is out of data
15:58:06 the master got 900 tasks

About 7 seconds of idle time, and I manually counted about 8 seconds before all 6 GPUs had 5 tasks running on each one.
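
The 7 second gap can also be taken straight from the two log lines instead of by eye. A minimal sketch, assuming every line starts with the 20 character "DD-Mon-YYYY HH:MM:SS" stamp seen in the output above and that month names are in English; the function names are invented.

```python
from datetime import datetime

STAMP = "%d-%b-%Y %H:%M:%S"  # e.g. 03-Nov-2019 15:57:59 (English month names)


def log_time(line):
    """Parse the timestamp at the start of a client log line."""
    return datetime.strptime(line[:20], STAMP)


def idle_seconds(last_report_line, first_fetch_line):
    """Seconds between one client running dry and the other getting work."""
    return (log_time(first_fetch_line) - log_time(last_report_line)).total_seconds()


slave = "03-Nov-2019 15:57:59 [Milkyway@Home] Reporting 5 completed tasks"
master = "03-Nov-2019 15:58:06 [Milkyway@Home] Scheduler request completed: got 900 new tasks"
print(idle_seconds(slave, master))  # 7.0
```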

Anyway, if anyone wants to try this the scripts I used are listed in the above url.
***unfortunately, it was not possible to implement this without a few modifications to the boinc client using VS2013
I can put the source code changes at GitHub if anyone wants to try to build the app. I could use some help with the remaining debugging and some features such as buffering up unprocessed work units to survive a scheduled off line period.
15) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69217)
Posted 21 days ago by JStateson
Post:
PrimeGrid would work well for you if you pick the GFN-15 units, then after the 10 minute wait MilkyWay will refill your cache and you will be crunching here again.


Primegrid is not on the Gridcoin whitelist, discussion here

Clearly, the principals involved have a fear of miners faking prime numbers to get credits, which is rightly called the "gridcoin derangement syndrome". However, that does not stop them from accepting gridcoin contributions in addition to paypal. Reminds me of the "A-listers" who took private jets to lecture us on climate change. The typical celebrity uses more electricity in one month than first Worlders do in 2 or 3 years.
16) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69214)
Posted 22 days ago by JStateson
Post:
I may have to use multiple clients when I get my 20 GPU rig running. 900 WUs for 20 GPUs would run out very quickly, although that would be fine if they fixed the problem of not getting new work until you've stopped reporting the completed ones, otherwise I'd have them sat idle quite often.


I agree, it would be best if the project would send a few new tasks each time results get uploaded. From my own experience building the client under both win & Linux, I know it is difficult to figure out what is going on, plus the latest usable Windows compiler is VS-2013. It is really difficult to maintain source code for all the different systems that boinc can run on and I didn't appreciate the effort until I tried making a few changes.

I brought up a feature I thought might be useful here but the head-shed didn't seem to like it.
https://github.com/BOINC/boinc/issues/3337

I was able to add the following to the boinc client
    --force_hostname <name>        use this as hostname
    --set_password <password>      rpc gui password
    --mwBackoff N                  seconds to force project backoff
    --spoof_gpus N                 fake number of gpus


However, my "spoof_gpus" is NOT the same as the ones the SETI GPU users group have in their secret boinc app. It just allowed a single client to claim ownership for all the actual GPUs instead of just the one it is assigned in cc_config after I excluded the others. I am not sure if I need this but I noticed it got me a lot more than I expected for a new client. It will not get more than 900 from MW in any event.

I have put together a script that will create any number of clients on a single system. It is designed for SETI only and I plan to use it on the next SETI CRUNCH EVENT, as I found that some users were archiving and processing work units months before the event started. I don't plan on using my script here, only next year at that SETI crunch-a-thon.
17) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69212)
Posted 22 days ago by JStateson
Post:
I tried to test it by aborting all tasks in progress, maybe I confused it,


That should have run your batch file.

You might be able to debug the problem by picking another project and setting resources to 0 so only one task runs at a time. Won't have to wait several hours unless you pick gpugrid. However, it won't work if the project sends a second one before the first one finishes.

Plants-vs-zombies can run with milkyway, no problem. It can be made interesting if you try to pass all the rooftop levels with only 1 sun. If interested, I have a list of challenges for PVZ you can try. I cannot play any FPS due to motion sickness so I stick to PVZ and the win7 spider solitaire.
18) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69210)
Posted 22 days ago by JStateson
Post:
Hey guys,

So the current set up allows for users to have up to 200 workunits per GPU on their computer and another 40 workunits per CPU with a maximum of 600 possible workunits.

On the server, we try to store a cache of 10,000 workunits. Sometimes when a lot of people request work all at the same time, this cache will run low.

So all of the numbers I have listed are tunable. What would you guys recommend for changes to these numbers?

Jake


I'd like the limit to be only related to the number of GPUs. Eg. if I had (and I will shortly) a host with 20 GPUs, I could get 20 x 300 units, not just 600 (which would be only 30 per GPU).


Not sure where those numbers came from, but I have never seen more than 900 work units on my 6 GPU system and I get 600 on my two GPU system.
I tried to figure this out as follows:
GPUs   Cores   Threads   Number acquired   How calculated
----   -----   -------   ---------------   --------------
6      4       8         900               have no idea
2      12      24        600               ditto
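
For comparison, here is the rule exactly as Jake states it in the quote (200 per GPU, 40 per CPU, capped at 600), applied to the two hosts in the table. This is a sketch of the stated rule only, not of what the server actually does, and the function name is made up.

```python
def stated_limit(gpus, cpus, per_gpu=200, per_cpu=40, cap=600):
    """Work unit limit according to the numbers quoted from Jake's post."""
    return min(cap, gpus * per_gpu + cpus * per_cpu)


# (GPUs, cores, threads, number actually acquired) from the table above
HOSTS = [(6, 4, 8, 900), (2, 12, 24, 600)]

for gpus, cores, threads, seen in HOSTS:
    # Counting cores or threads as "CPUs" makes no difference: the cap wins.
    print(gpus, "GPUs:", stated_limit(gpus, cores), stated_limit(gpus, threads), "observed", seen)
```

The stated rule gives 600 for both hosts, which matches the 2 GPU system but not the 900 seen on the 6 GPU system, so the server is evidently using different numbers from the ones quoted.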

However, all the effort put into this project and others (especially SETI) to prevent overloading the server by throttling users is for naught when users can "spoof" the number of GPUs. One can even get around the maximum number of tasks per device by having multiple clients on the same system.
19) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69208)
Posted 22 days ago by JStateson
Post:
I posted too soon and was unable to correct before you replied. My fingers get ahead of my thoughts. I have not figured out which key does it, but it is possible to post without clicking the OK button with the mouse.

Should work with "00d,00:02:30" for the time field and value of 1 is ok. Make sure you run the "Check" button at least once and then make sure it is active.


I've never managed to post with a key on this forum, but I've had a similar problem with emails in Opera. By default it has HUGE numbers of keystrokes assigned to tasks. It's very badly designed, as single keypresses do stuff. I would prefer always something like CTRL-F to make something happen. With single keypresses, they can occur when I think I'm typing an email, and because the wrong thing was selected, I've now performed 10 unknown tasks instead of typing a sentence.

Why 2:30? AFAIK the timer on MW is 1:31.

The active tick annoyed me. So many things in Boinc and Boinctasks you can set something up then you have to activate it too. Nothing is sensible anymore. I remember when you set things up in a dialog box then pressed ok or cancel. Nowadays ok is assumed in windows, for example there's no ok or cancel in control panel anymore, you have to hope it saved it. And if you changed your mind, tough!

Anyway, nothing you've suggested I think will cause it to run the program. I wasn't getting a "too soon" complaint from MW, it just wasn't starting the batch file for some reason. But it did start it a lot later after some minutes, not sure why. Something is causing it not to immediately detect a lack of WUs in MW.


I am not privy to the inner workings of BT, but I believe it works on a transition from having tasks to not having any, so it won't bother calling your batch file if it never saw any work units in the first place. As a consequence it does not send out subsequent commands to run the batch file if nothing shows up again (project went off line).
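
The behaviour guessed at above is what is usually called edge triggering. The sketch below only illustrates that guess; it is not taken from BoincTasks, and the function name is invented.

```python
def fire_points(task_counts):
    """Indexes at which a rule would fire if it only reacts to the
    transition from "has tasks" to "has none"."""
    fired = []
    previous = 0  # nothing seen yet counts as "no tasks"
    for i, count in enumerate(task_counts):
        if previous > 0 and count == 0:
            fired.append(i)
        previous = count
    return fired


print(fire_points([0, 0, 0]))        # never saw work, so never fires
print(fire_points([5, 2, 0, 0, 0]))  # fires once, then stays quiet
print(fire_points([3, 0, 4, 0]))     # fires again only after new work arrived
```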
20) Message boards : News : 30 Workunit Limit Per Request - Fix Implemented (Message 69206)
Posted 22 days ago by JStateson
Post:
You have 0 under "Time", it needs to be 160 seconds at least else you get the "last request too soon"


I've changed it to 1 minute 40 seconds, as the MW backoff appears to be 1 minute 31 seconds judging by "Project requested delay of 91 seconds " in the messages.

At "value" you need to have a positive number as BT has to see the project is empty for at least one second.


I don't understand, I already have a positive value don't I? Of 1 second. As in "if less than 1 second of work left, then run the program".


I posted too soon and was unable to correct before you replied. My fingers get ahead of my thoughts. I have not figured out which key does it, but it is possible to post without clicking the OK button with the mouse.

Should work with "00d,00:02:30" for the time field and value of 1 is ok. Make sure you run the "Check" button at least once and then make sure it is active.



©2019 Astroinformatics Group