Welcome to MilkyWay@home

Testing Some New Plan Classes

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 67229 - Posted: 8 Mar 2018, 18:06:31 UTC

Hey Everyone,

I am going to try changing the GPU plan classes to reduce the number of workunits sent to users without double-precision GPUs. If you notice any issues on your end, please let me know.

Thanks,

Jake
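As a rough illustration of what "double precision" means for a GPU here: OpenCL devices usually advertise 64-bit floating-point support through the `cl_khr_fp64` extension (or AMD's older `cl_amd_fp64`). The Python sketch below checks an extension string for either name. The function name and the sample strings are made up for illustration; this is not the project's actual plan-class logic, which is not shown in this thread.

```python
# Hypothetical helper: NOT MilkyWay@home's or BOINC's actual plan-class code.
# OpenCL devices advertise double-precision support through extension names.
FP64_EXTENSIONS = {"cl_khr_fp64", "cl_amd_fp64"}


def supports_double_precision(extensions: str) -> bool:
    """Return True if a space-separated OpenCL extension string lists an fp64 extension."""
    return any(name in FP64_EXTENSIONS for name in extensions.split())


# Illustrative extension strings (assumed, not taken from any real card).
dp_card = "cl_khr_global_int32_base_atomics cl_khr_fp64 cl_khr_byte_addressable_store"
sp_card = "cl_khr_global_int32_base_atomics cl_khr_byte_addressable_store"

print(supports_double_precision(dp_card))
print(supports_double_precision(sp_card))
```

A card whose extension list has neither name would be treated as single precision only, which is the kind of host the plan-class change is meant to stop sending work to.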

Tom*
Joined: 4 Oct 11
Posts: 38
Credit: 309,729,457
RAC: 0
Message 67230 - Posted: 8 Mar 2018, 18:46:10 UTC

Thank you Jake,

Most of my workunits that fail to complete successfully are due to wingers
without DP.

Bill

Slywy
Joined: 22 Jul 12
Posts: 11
Credit: 1,008,373
RAC: 0
Message 67235 - Posted: 10 Mar 2018, 13:57:25 UTC - in response to Message 67229.  

I must not have a double-precision GPU, because I'm suddenly getting hundreds of computation errors.

Dunx
Joined: 13 Feb 11
Posts: 31
Credit: 1,403,524,537
RAC: 0
Message 67236 - Posted: 10 Mar 2018, 15:55:13 UTC

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,614
RAC: 27,784
Message 67240 - Posted: 11 Mar 2018, 12:26:53 UTC - in response to Message 67235.  

I checked Nvidia and your GPU is listed as single precision.

macgeyer
Joined: 2 Mar 18
Posts: 9
Credit: 457,043,383
RAC: 0
Message 67241 - Posted: 13 Mar 2018, 19:02:56 UTC

The website had problems during the last hour, and my computer (ID: 767940) isn't getting any new tasks:
13/03/2018 19:57:35 | Milkyway@Home | Scheduler request completed: got 0 new tasks

macgeyer
Joined: 2 Mar 18
Posts: 9
Credit: 457,043,383
RAC: 0
Message 67242 - Posted: 13 Mar 2018, 19:39:02 UTC - in response to Message 67241.  

Got new tasks:
13/03/2018 20:37:37 | Milkyway@Home | Scheduler request completed: got 192 new tasks
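For anyone who wants to track these scheduler replies over time, here is a minimal Python sketch that pulls the timestamp, project name, and task count out of a log line. The pattern is inferred only from the two log lines quoted in this thread (day/month/year timestamps), so treat the format as an assumption; `parse_scheduler_line` is a hypothetical helper, not part of BOINC.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines in this thread; other BOINC client
# versions or locales may format the timestamp differently (assumption).
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| (?P<project>[^|]+?) \| "
    r"Scheduler request completed: got (?P<count>\d+) new tasks$"
)


def parse_scheduler_line(line: str):
    """Return (timestamp, project, task_count), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match["stamp"], "%d/%m/%Y %H:%M:%S")
    return stamp, match["project"], int(match["count"])


line = "13/03/2018 20:37:37 | Milkyway@Home | Scheduler request completed: got 192 new tasks"
print(parse_scheduler_line(line))
```

Lines that are not scheduler replies return `None`, so the helper can be run over a whole event log and the non-matching lines simply skipped.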

macgeyer
Joined: 2 Mar 18
Posts: 9
Credit: 457,043,383
RAC: 0
Message 67243 - Posted: 13 Mar 2018, 19:39:02 UTC - in response to Message 67241.  
Last modified: 13 Mar 2018, 19:39:59 UTC

Sorry, double posted.

Jon A. Robison
Joined: 6 Oct 10
Posts: 1
Credit: 6,703,932
RAC: 0
Message 67244 - Posted: 14 Mar 2018, 15:54:03 UTC

OK, yesterday (March 13, 2018) I was trying to post that I had 27 "Computation Error" work units and only 2 "Waiting to Report". Over the last several weeks I've been inundated with work units that end up this way. My User ID is 128146. I have changed nothing in my machine (unless MS updates did something) that should cause this. My processor is an Intel Q9550 and my video board is an ATI Radeon HD 4870 (1 GB GDDR5), and I've never had this type of problem before. My conclusion is that your work units are at fault. I've suspended the current batch of work units and may drop this work altogether, since my machine doesn't appear to be able to process your data!!!!

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,614
RAC: 27,784
Message 67245 - Posted: 15 Mar 2018, 11:24:59 UTC - in response to Message 67244.  

It might help to update to a newer version of BOINC; 7.2.47 is fairly old at this point.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 67247 - Posted: 15 Mar 2018, 20:27:54 UTC

Hey Jon,

It looks to me like it's probably a driver error, since the error you are getting is "Failed to compute likelihood." Can you try a different driver version and get back to me?

Jake

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 67252 - Posted: 17 Mar 2018, 13:26:24 UTC
Last modified: 17 Mar 2018, 13:57:47 UTC

Something strange has happened, and I'm not sure if it is your plan change. Over a 30-minute period, all my ATI (AMD) Tahiti-class cards (7950, S9000) went from just over 3 minutes per WU to 17-19 minutes. The GPU clock went from 900 MHz down to 300 MHz, indicating very little load on the GPU. CPU load went from 11% to 30%. I changed the ngpu assignment from 4 down to 1 and suspended all CPU tasks, but that had no effect. Time to complete stayed in the 15-minute range even with only 1 WU per S9000.

The ATI Pitcairn-class card (7850) did not show any change: the GPU clock was 925 MHz and WUs were taking 12-14 minutes. However, this system had a full day of tasks, so maybe the problem has not shown up yet. Same for the RX560: WUs take 18 minutes with no change and a somewhat large cache.

All of the above systems are old Core 2 Quads with 8 GB of memory, if that makes any difference.

[EDIT] When I switched projects from Milkyway to Collatz, the GPU clocks shot back up to the 900 MHz range where they normally run. Something broke.

[EDIT-2] I just compared one of my slow WUs to my wingman's. His took the normal 3 or so minutes to complete the same WU, where mine took 18 minutes with the GPU clock running at 300 MHz instead of 900 MHz.
http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=1589386532
Something is not right. The video boards are both Tahiti class. However, the motherboard and CPU are vastly different.

mmonnin
Joined: 2 Oct 16
Posts: 162
Credit: 1,004,163,109
RAC: 887
Message 67253 - Posted: 17 Mar 2018, 15:12:43 UTC

Sounds more like a driver crash than a project problem, even if load did go back up with Collatz.

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 67255 - Posted: 17 Mar 2018, 15:38:27 UTC - in response to Message 67253.  
Last modified: 17 Mar 2018, 16:10:06 UTC

Yes, that is what I thought, but it happened on two systems:

Asus P5E, starting 3/17 from 1:03am to 1:27am, went from 3 minutes per WU to 15.

MSI P7N SLI FTW, starting 3/16 from 9:56pm to 10:33, went from 3 minutes per WU to 17-20 minutes.

The work unit report states 900 MHz for the video boards, but I suspect that is just the core frequency being reported. GPU-Z shows 300 MHz and low temps in the 30s C. After switching to Collatz, GPU-Z shows 900 MHz and temps in the 60s and 70s, as these cards are air cooled. After switching back, the frequency drops to 300 MHz again for all 7950s and S9000s on both systems.

It could still be a driver problem. Microsoft did release a bunch of stuff on Tuesday, and the systems may have just gotten around to rebooting. I will look into it.

I brought up Afterburner but do not see how to change the clock speed from 300 to a higher number. I know that if tasks are starved for data (CPU cores busy or low on memory), the GPU clock will drop because the card is not busy enough to run at 900. There are a lot of possibilities. None of my Linux systems show a Milkyway problem, but they have 16 threads to feed a pair of really slow Nvidia 1050 Tis, whereas the above systems have a total of 4 threads (1 per core) for the much more productive (double precision) Tahitis.

Looking at the report, there is a lot of info being presented. Perhaps that leaves my few threads unable to feed the GPU?

[EDIT] I compared a much faster task against a much slower one and don't really see any reason for the much longer time to complete. The consistency is totally unlike any Milkyway ATI tasks I have seen on the same type of system.

Jake Weiss
Volunteer moderator
Project developer
Project tester
Project scientist
Joined: 25 Feb 13
Posts: 580
Credit: 94,200,158
RAC: 0
Message 67256 - Posted: 17 Mar 2018, 16:41:22 UTC

Hey BeemerBiker,

So the runs I put up a couple of days ago have a few extra calculations in them, so they should take a little bit longer. Your credits should have been adjusted accordingly.

Please let me know if you aren't getting an increase in credits for the increased work.

Jake

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 67258 - Posted: 17 Mar 2018, 16:54:13 UTC - in response to Message 67256.  

Yeah, I probably will get more credit, thanks, but why are 4 of my Tahiti-class GPUs running at 300 MHz instead of 850 or 900? I suspect they are not being fed properly. It could be a problem at my end, as I don't see anyone else reporting anything like what I am getting.

mmonnin
Joined: 2 Oct 16
Posts: 162
Credit: 1,004,163,109
RAC: 887
Message 67260 - Posted: 18 Mar 2018, 0:13:53 UTC - in response to Message 67258.  

I have a 280X in Win10 running at 1070 MHz.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,614
RAC: 27,784
Message 67262 - Posted: 18 Mar 2018, 12:56:40 UTC - in response to Message 67255.  

The only way to reset a crashed driver in Windows is to restart the whole PC. If you have Win10, it could have done an update and crashed the driver in the process. There could also be a newer, better driver if Win10 was updated.

Joseph Stateson
Joined: 18 Nov 08
Posts: 291
Credit: 2,461,693,501
RAC: 0
Message 67264 - Posted: 18 Mar 2018, 16:24:28 UTC - in response to Message 67262.  
Last modified: 18 Mar 2018, 16:37:52 UTC

Yeah, the problem was drivers. I looked at my P5E and it was waiting to reboot to install drivers to fix whatever Microsoft had done the previous Tuesday. After rebooting, exactly 30 Milkyway ATI tasks reported an error, but all the remaining tasks plus the new downloads were back at their normal 3-minute WU time to complete. It has been working just fine for 24 hours. I don't think anything was wrong with those 30 tasks; the driver change just bumped them out. I had tried an S9000 graphics board in this system before putting it in the P7N.

The P7N, on the other hand, did not respond to a reboot like my P5E. This system worked fine on Collatz ATI tasks but ran at only 300 MHz for Milkyway. I suspect the same problem with the driver. The driver failed to uninstall (Win10 x64); even the ATI "cleanup" program was unable to uninstall the AMD software on this Intel system. I deleted both S9000 video boards from the system manager and rebooted. They were recognized as W8000 video boards, but they worked, and time to complete is back down to 3 minutes per WU when running 4 on each GPU. Apparently the Adrenalin Radeon driver caused problems with mixed S9000 and 7950 graphics boards. I did not attempt to update whatever Microsoft installed to handle the "W8000", as it is working and I don't want to mess with it any more. It was just a coincidence that these problems occurred at the same time as the plan classes were changed here.

The "S9000" boards were $160 new with free shipping on eBay, and I could not pass up a chance to get a new Tahiti system with 6 GB of memory, not just 3. They just cannot be mixed with normal 7950 boards, and they require DIY cooling.

mikey
Joined: 8 May 09
Posts: 3315
Credit: 519,761,614
RAC: 27,784
Message 67266 - Posted: 19 Mar 2018, 11:34:33 UTC - in response to Message 67264.  

Sounds like a good deal for someone who knows how to do that stuff. I'm glad you are back and crunching fast again.

©2024 Astroinformatics Group