Welcome to MilkyWay@home

Posts by Travis

41) Message boards : News : another scheduler update (Message 58746)
Posted 11 Jun 2013 by Travis
Post:
3. hosts with ATI GPUs aren't getting workunits...

I'm in that boat. My 5870 host got its last WU at about 1950 UTC.


Have you tried to grab work recently? I just made a couple more updates.

If it's still not getting work, what's the error message, if any?
42) Message boards : News : another scheduler update (Message 58744)
Posted 11 Jun 2013 by Travis
Post:
I fall under this category.
2. hosts with ATI GPUs that don't have the compute capability are getting GPU workunits.


As of now?

Also, what's the error message (if it's printing out one)?
43) Message boards : News : another scheduler update (Message 58743)
Posted 11 Jun 2013 by Travis
Post:
4.) Winbox CPU host not getting anything.


Looks like you just got some? From the scheduler (I XXXed out your IP and host id just in case you're hiding those):

2013-06-11 19:32:15.5706 [PID=22629] Request: [USER#5696] [HOST#XXXXX] [IP XXXXXXXX] client 6.12.34
2013-06-11 19:32:15.5955 [PID=22629] [send] [HOST#XXXX] app version 321 is reliable
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: random choice for cons valid 1165: yes
2013-06-11 19:32:15.5955 [PID=22629] [send] [AV#385] not reliable; cons valid 0 < 10
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: cons valid 0 < 10, don't use single replication
2013-06-11 19:32:15.5955 [PID=22629] [send] [HOST#XX] app version 398 is reliable
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: random choice for cons valid 76: yes
2013-06-11 19:32:15.5955 [PID=22629] [send] [HOST#XX] app version 418 is reliable
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: random choice for cons valid 17442: yes
2013-06-11 19:32:15.5955 [PID=22629] [send] [AV#430] not reliable; cons valid 0 < 10
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: cons valid 0 < 10, don't use single replication
2013-06-11 19:32:15.5955 [PID=22629] [send] [AV#436] not reliable; cons valid 0 < 10
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: cons valid 0 < 10, don't use single replication
2013-06-11 19:32:15.5955 [PID=22629] [send] [AV#438] not reliable; cons valid 1 < 10
2013-06-11 19:32:15.5955 [PID=22629] [send] set_trust: cons valid 1 < 10, don't use single replication
2013-06-11 19:32:15.5955 [PID=22629] [send] [HOST#XX] app version 445 is reliable
2013-06-11 19:32:15.5956 [PID=22629] [send] set_trust: random choice for cons valid 23: yes
2013-06-11 19:32:15.5956 [PID=22629] [send] [HOST#XX] app version 451 is reliable
2013-06-11 19:32:15.5956 [PID=22629] [send] set_trust: random choice for cons valid 148: yes
2013-06-11 19:32:15.5956 [PID=22629] [send] [HOST#XX] app version 485 is reliable
2013-06-11 19:32:15.5956 [PID=22629] [send] set_trust: random choice for cons valid 510: yes
2013-06-11 19:32:15.5956 [PID=22629] [send] [AV#3000002] not reliable; cons valid 0 < 10
2013-06-11 19:32:15.5956 [PID=22629] [send] set_trust: cons valid 0 < 10, don't use single replication
2013-06-11 19:32:15.5956 [PID=22629] [quota] effective ncpus 4 ngpus 1
2013-06-11 19:32:15.5956 [PID=22629] [quota] max jobs per RPC: 400
2013-06-11 19:32:15.5956 [PID=22629] [quota] Overall limits on jobs in progress:
2013-06-11 19:32:15.5956 [PID=22629] [quota] CPU: base 3 scaled 12 njobs 0
2013-06-11 19:32:15.5956 [PID=22629] [quota] GPU: base 40 scaled 40 njobs 38
2013-06-11 19:32:15.5956 [PID=22629] [send] Not using matchmaker scheduling; Not using EDF sim
2013-06-11 19:32:15.5956 [PID=22629] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-06-11 19:32:15.5956 [PID=22629] [send] AMD/ATI GPU: req 81174.22 sec, 0.00 instances; est delay 0.00
2013-06-11 19:32:15.5956 [PID=22629] [send] work_req_seconds: 0.00 secs
2013-06-11 19:32:15.5956 [PID=22629] [send] available disk 2.82 GB, work_buf_min 86400
2013-06-11 19:32:15.5957 [PID=22629] [send] active_frac 0.945916 on_frac 0.996949
2013-06-11 19:32:15.5957 [PID=22629] [send] CPU features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
2013-06-11 19:32:15.5984 [PID=22629] [version] looking for version of milkyway
2013-06-11 19:32:15.5984 [PID=22629] [version] Checking plan class 'ati14'
2013-06-11 19:32:15.5984 [PID=22629] [version] Couldn't open plan class spec file '../plan_class_spec.xml'
2013-06-11 19:32:15.5984 [PID=22629] [version] ati14 ATI app projected 51.07G peak 5775.35G 0.963 CPUs
2013-06-11 19:32:15.5984 [PID=22629] [quota] [AV#485] scaled max jobs per day: 10510
2013-06-11 19:32:15.5984 [PID=22629] [version] [AV#485] (ati14) setting projected flops based on host elapsed time avg: 437.54G
2013-06-11 19:32:15.5984 [PID=22629] [version] [AV#485] (ati14) comparison pfc: 437.57G et: 437.54G
2013-06-11 19:32:15.5985 [PID=22629] [version] Best app version is now AV485 (437.61 GFLOP)
2013-06-11 19:32:15.5985 [PID=22629] [version] Checking plan class 'opencl_amd_ati'
2013-06-11 19:32:15.5985 [PID=22629] [version] plan_class opencl_amd_ati uses OpenCl version 0
2013-06-11 19:32:15.5985 [PID=22629] [version] [opencl] GPU/Driver/BOINC revision doesn not support OpenCL
2013-06-11 19:32:15.5985 [PID=22629] [quota] [AV#418] scaled max jobs per day: 27442
2013-06-11 19:32:15.5985 [PID=22629] [version] [AV#418] (opencl_amd_ati) setting projected flops based on host elapsed time avg: 363.93G
2013-06-11 19:32:15.5985 [PID=22629] [version] [AV#418] (opencl_amd_ati) comparison pfc: 364.08G et: 363.93G
2013-06-11 19:32:15.5985 [PID=22629] [version] Comparing AV#418 (363.92 GFLOP) against AV#485 (437.61 GFLOP)
2013-06-11 19:32:15.5986 [PID=22629] [version] Checking plan class 'opencl_nvidia'
2013-06-11 19:32:15.5986 [PID=22629] [version] plan_class opencl_nvidia uses OpenCl version 0
2013-06-11 19:32:15.5986 [PID=22629] [version] [AV#416] app_plan() returned false
2013-06-11 19:32:15.5986 [PID=22629] [version] [AV#485] (ati14) setting projected flops based on host elapsed time avg: 437.54G
2013-06-11 19:32:15.5986 [PID=22629] [version] [AV#485] (ati14) comparison pfc: 437.57G et: 437.54G
2013-06-11 19:32:15.5986 [PID=22629] [version] Best version of app milkyway is [AV#485] (437.54 GFLOPS)
2013-06-11 19:32:15.5986 [PID=22629] [send] est delay 0, skipping deadline check
2013-06-11 19:32:15.5987 [PID=22629] [version] returning cached version: [AV#485]
2013-06-11 19:32:15.5987 [PID=22629] [send] est delay 0, skipping deadline check
2013-06-11 19:32:15.6013 [PID=22629] [send] Sending app_version milkyway 2 102 ati14; projected 437.54 GFLOPS
2013-06-11 19:32:15.6014 [PID=22629] [send] est. duration for WU 380375116: unscaled 45.24 scaled 47.97
2013-06-11 19:32:15.6014 [PID=22629] [send] [HOST#XX] sending [RESULT#498050348 de_separation_79_DR8_rev_2_1370993394_149_0] (est. dur. 47.97 seconds)
2013-06-11 19:32:15.6017 [PID=22629] [version] looking for version of milkyway_nbody
2013-06-11 19:32:15.6017 [PID=22629] [version] [AV#475] Skipping CPU version - user prefs say no CPU
2013-06-11 19:32:15.6017 [PID=22629] [version] Checking plan class 'mt'
2013-06-11 19:32:15.6017 [PID=22629] [version] Multi-thread app projected 10.50GS
2013-06-11 19:32:15.6017 [PID=22629] [version] [AV#481] Skipping CPU version - user prefs say no CPU
2013-06-11 19:32:15.6017 [PID=22629] [version] returning NULL; platforms:
2013-06-11 19:32:15.6017 [PID=22629] [version] windows_x86_64
2013-06-11 19:32:15.6017 [PID=22629] [version] windows_intelx86
2013-06-11 19:32:15.6017 [PID=22629] [version] returning cached version: [AV#485]
2013-06-11 19:32:15.6017 [PID=22629] [send] est. duration for WU 380375117: unscaled 33.83 scaled 35.88
2013-06-11 19:32:15.6017 [PID=22629] [send] [WU#380375117] meets deadline: 47.97 + 35.88 < 1036800
2013-06-11 19:32:15.6017 [PID=22629] [version] returning cached version: [AV#485]
2013-06-11 19:32:15.6017 [PID=22629] [send] est. duration for WU 380375117: unscaled 33.83 scaled 35.88
2013-06-11 19:32:15.6017 [PID=22629] [send] [WU#380375117] meets deadline: 47.97 + 35.88 < 1036800
2013-06-11 19:32:15.6034 [PID=22629] [send] Sending app_version milkyway 2 102 ati14; projected 437.54 GFLOPS
2013-06-11 19:32:15.6036 [PID=22629] [send] est. duration for WU 380375117: unscaled 33.83 scaled 35.88
2013-06-11 19:32:15.6036 [PID=22629] [send] [HOST#XX] sending [RESULT#498050349 de_separation_20_2s_sscon_1_1370993394_150_0] (est. dur. 35.88 seconds)
2013-06-11 19:32:15.6039 [PID=22629] [quota] reached limit on GPU jobs in progress
2013-06-11 19:32:15.6039 [PID=22629] [quota] Overall limits on jobs in progress:
2013-06-11 19:32:15.6039 [PID=22629] [quota] CPU: base 3 scaled 12 njobs 0
2013-06-11 19:32:15.6039 [PID=22629] [quota] GPU: base 40 scaled 40 njobs 40
2013-06-11 19:32:15.6039 [PID=22629] [send] don't need more work
2013-06-11 19:32:15.6048 [PID=22629] Sending reply to [HOST#XX]: 2 results, delay req 61.00
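
The arithmetic in that log can be checked by hand. Below is a minimal sketch in Python that reproduces three of the numbers: the scaled duration estimate, the remaining GPU slots, and the deadline check. The function names are hypothetical, and the scaling formula (dividing by active_frac times on_frac) is an assumption that happens to match the logged values; it is not taken from the BOINC source.

```python
# Values are copied from the scheduler log above.
# Function names are hypothetical; the scaling formula is an assumption
# that matches the logged numbers, not a quote of the BOINC server code.

def scaled_duration(unscaled_secs, active_frac, on_frac):
    """Stretch a raw runtime estimate by the fraction of time the host is usable."""
    return unscaled_secs / (active_frac * on_frac)

def gpu_slots_left(scaled_limit, jobs_in_progress):
    """Number of additional GPU jobs allowed before the in-progress limit is hit."""
    return max(scaled_limit - jobs_in_progress, 0)

def meets_deadline(queued_secs, new_job_secs, delay_bound_secs):
    """Mirror of the log line '47.97 + 35.88 < 1036800'."""
    return queued_secs + new_job_secs < delay_bound_secs

ACTIVE_FRAC = 0.945916   # from the 'active_frac' log line
ON_FRAC = 0.996949       # from the 'on_frac' log line

first = scaled_duration(45.24, ACTIVE_FRAC, ON_FRAC)    # log reports 47.97
second = scaled_duration(33.83, ACTIVE_FRAC, ON_FRAC)   # log reports 35.88
slots = gpu_slots_left(40, 38)                          # 'GPU: base 40 scaled 40 njobs 38'
ok = meets_deadline(47.97, 35.88, 1036800)

print(f"first WU:  {first:.2f} s")
print(f"second WU: {second:.2f} s")
print(f"GPU slots left: {slots}")
print(f"second WU meets deadline: {ok}")
```

The second estimate can differ from the log in the last digit because the unscaled value in the log is already rounded. The two free slots match the "2 results" in the reply line, after which the host sits at the limit of 40 GPU jobs in progress.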
44) Message boards : News : another scheduler update (Message 58740)
Posted 11 Jun 2013 by Travis
Post:
#3. But it just started again.


Made an update, let me know if this let you get some ATI GPU workunits.
45) Message boards : News : another scheduler update (Message 58735)
Posted 11 Jun 2013 by Travis
Post:
Updated the scheduler yet again.

I just want to double-check which of the following problems people are having (since the update):

1. non-GPU hosts are getting GPU workunits.
2. hosts with ATI GPUs that don't have the compute capability are getting GPU workunits.
3. hosts with ATI GPUs aren't getting workunits.

Is anyone having problems with NVIDIA GPUs? Or is this just an ATI thing?

--Travis
46) Message boards : News : yet another scheduler update (Message 58715)
Posted 11 Jun 2013 by Travis
Post:
Made a few more tweaks, how's this going? Let me know if anything isn't working that was working before we did the server code upgrade.
47) Message boards : Number crunching : MilkyWay not up loading new workunits (Message 58706)
Posted 11 Jun 2013 by Travis
Post:
Please note, while we are now receiving AMD GPU work units, my machines, all day, have ZERO CPU work units. All 88 of my CPU cores are doing absolutely jack-shit all.

I'm just wondering. Is this a "serious" project, or is my $700 CDN per month in electricity costs all pure waste? I want something to result from all this. When a project "dies" because of a lack of work units, it makes me think no one cares to complete this project.

Just saying, it kind of feels like we are on our own and this is all a waste of money and time. :)


They did a Server side update and are now having 'issues', hopefully they will sort it out soon. This is what happens when the Server side programmers want things their way and the projects like things their way. Lots of modifications needed which often results in few Server side updates.


The student who had done most of our server-side modifications has basically been MIA, so in doing the server update I've had to figure out everything he did and move it all over into the new main BOINC code. Now that BOINC is using git as its version control software, keeping things up to date will be much easier, as we can have our own local repository, pull the main BOINC changes into it, and update things as needed.

Before, when BOINC used SVN, this was much more difficult. So once we get everything working again, I think things should be pretty good from here on out, and it should be much easier to keep the software up to date, especially as I'll have figured out all the changes that were made here.
48) Message boards : Number crunching : Bunch of new computational errors (Message 58705)
Posted 11 Jun 2013 by Travis
Post:
Now I have the same problem with WU 1.02. I have noticed that the warning that you don't have double precision processor is not being logged any more? Could this be the culprit?


I think I just got this fixed.
49) Message boards : Number crunching : GPU apps delivered to single precision GPU (Message 58704)
Posted 11 Jun 2013 by Travis
Post:
Appears to be fixed. I'm getting the message that my card isn't good enough and I'm not getting GPU WU's.


Awesome! Hopefully this last fix did the trick.
50) Message boards : News : scheduler update (Message 58702)
Posted 11 Jun 2013 by Travis
Post:
I've made some updates to the scheduler which I think should fix the problem of GPU workunits going to hosts that shouldn't be getting them.

Let me know if this change fixed things.

--Travis
51) Message boards : News : added applications for ati only GPUs (Message 58663)
Posted 10 Jun 2013 by Travis
Post:
I'd still like to know why on a computer that is CPU ONLY I'm getting ATI GPU units that can't run on this computer. I never had a problem only getting CPU WU's until whatever it was you all did last week. I'm getting the ATI units and they are refusing to start now. No surprise since this gfx card has never been able to run WU's. And it's a HD 6670 but reads as a HD5700. I've even reset the project to no avail.

My other computer is running a HD 3870 with an appinfo file and hasn't missed a beat since all this mess started last week.


Domain name FX-4170
Local Standard Time UTC -4 hours
Name FX-4170
Created 30 Apr 2012, 21:55:49 UTC
Total credit 1,676,526
Average credit 3,470.56
Cross project credit
CPU type AuthenticAMD
AMD FX(tm)-4170 Quad-Core Processor [Family 21 Model 1 Stepping 2]
Number of processors 4
Coprocessors AMD ATI Radeon HD 5700 series (Juniper) (1024MB) driver: 1.4.1741 OpenCL: 1.02
Operating System Microsoft Windows 7
Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
BOINC version 7.0.28
Memory 8093.61 MB
Cache 2048 KB
Swap space 16185.39 MB
Total disk space 365.64 GB
Free Disk Space 301.38 GB
Measured floating point speed 3068.57 million ops/sec
Measured integer speed 9326.21 million ops/sec
Average upload rate 81.19 KB/sec
Average download rate 624.69 KB/sec
Average turnaround time 0.13 days
Application details
Show Tasks 42
Number of times client has contacted server 34023
Last time contacted server 10 Jun 2013, 22:34:40 UTC


We're looking into it and hopefully will have it figured out soon. Thanks for your patience with all this.
52) Message boards : News : What users aren't getting GPU workunits? (Message 58662)
Posted 10 Jun 2013 by Travis
Post:
I'm getting ATI WU's on a computer that can't run them and has been set for CPU ONLY for months... They won't even start and so I'm aborting them.


I've just sent out an email to the BOINC projects mailing lists, so we're looking into it and hopefully can have a fix for you soon.
53) Message boards : Number crunching : MilkyWay not up loading new workunits (Message 58661)
Posted 10 Jun 2013 by Travis
Post:
This weekend I downloaded the new BOINC Manager...
I've just finished processing several (ALL) of my MilkyWay workunits. The MilkyWay server status reports there are 400+ units ready to be sent out and everything appears to be running, but my BOINC program manager will not up/download anything from your server. I have several SETI workunits available to process... but nothing is coming down the pike from MW?
Any suggestions... or pointing the finger at the new BOINC manager?
thanks for any help.

MississippiSteve


I think I just put in a fix for this. I'm assuming you're running the GPU workunits?
54) Message boards : Number crunching : No more GPU work units? (Message 58655)
Posted 10 Jun 2013 by Travis
Post:
Any idea what caused this catastrophic malfunction? What steps are you guys taking to prevent these issues from arising in the future?


We needed to update the BOINC server code due to a security issue. The version of the BOINC server code we had been using was a few months out of date, so in making the change a bunch of things broke, as the BOINC server code is being constantly updated.

To make matters worse our old version of the BOINC server code had a few hacks in it to get things to work, so we had to port those over, or swap out our hacks for stuff in the main BOINC codebase now.

It was a lot messier than expected, but now that we're back up to date with the server code things should be good for a while.

You need to recognize that both MilkyWay@Home and, to an extent, BOINC are actively developed research projects, and most of the work is done by myself (and I am not even at RPI anymore) and students at RPI. We don't have full-time IT people getting paid to support these projects.

I work on MilkyWay@Home mostly in my free time, which is extremely limited now that I'm an assistant professor at the University of North Dakota and have many other research projects to work on and classes to teach.

I'd like to say things like this won't happen in the future, but I'd be lying. Given the nature of MilkyWay@Home and BOINC, problems are going to happen as we have new students learn things and code gets updated.
55) Message boards : News : What users aren't getting GPU workunits? (Message 58646)
Posted 10 Jun 2013 by Travis
Post:
Win7x64, BOINC 7.0.64, CPU i7-920, 6GB RAM w/ 1 each 7770, 7850, 7950
Preferences
CPU = No
NVIDIA = No
ATI = Yes
All projects selected



Getting any work after the recent update?
56) Message boards : News : potential fix for bad applications being sent out (Message 58641)
Posted 10 Jun 2013 by Travis
Post:
Like arkayn said ...

My 3 x HD3850 AGP cards crunch MilkyWay@Home v0.82 (ati14) tasks with ease, undervolted, and have done so for many moons.

There's an opencl ati version available, any reason this can't be used?
... if only it were possible to run opencl on these cards (sigh).


Just tried to put in a fix for this, let me know if it worked.

--Travis
57) Message boards : News : added applications for ati only GPUs (Message 58639)
Posted 10 Jun 2013 by Travis
Post:
I added the 0.82 applications as 1.02 with the ati14 plan class, so hopefully those who were not getting GPU workunits because their ATI GPUs don't support OpenCL should be getting them now.

Let me know if this is working.

--Travis
58) Message boards : News : What users aren't getting GPU workunits? (Message 58636)
Posted 10 Jun 2013 by Travis
Post:
What applications do you guys have selected in your project preferences? I'm wondering if maybe you turned off the OpenCL ATI version?

--Travis
59) Message boards : Number crunching : No more GPU work units? (Message 58629)
Posted 10 Jun 2013 by Travis
Post:
6/10/2013 12:14:12 PM | Milkyway@Home | update requested by user
6/10/2013 12:14:15 PM | Milkyway@Home | Sending scheduler request: Requested by user.
6/10/2013 12:14:15 PM | Milkyway@Home | Requesting new tasks for CPU and ATI
6/10/2013 12:14:17 PM | Milkyway@Home | Scheduler request completed: got 0 new tasks
6/10/2013 12:14:17 PM | Milkyway@Home | No tasks sent
6/10/2013 12:14:17 PM | Milkyway@Home | No tasks are available for MilkyWay@Home
6/10/2013 12:14:17 PM | Milkyway@Home | No tasks are available for Milkyway@Home Separation
6/10/2013 12:14:17 PM | Milkyway@Home | No tasks are available for the applications you have selected.


Which version of the BOINC client are you using? What applications do you have selected for the project to send?
60) Message boards : News : What users aren't getting GPU workunits? (Message 58628)
Posted 10 Jun 2013 by Travis
Post:
I'm trying to figure out which users aren't getting GPU workunits. Is it everyone? Or just some people with certain types of ATI cards?

As far as I can tell, there should be both OpenCL AMD/ATI and OpenCL NVIDIA applications available.

--Travis


