Host with WAY too many tasks.
Author | Message |
---|---|
Joined: 21 Nov 09 Posts: 49 Credit: 20,942,758 RAC: 0 |
So I was going through my WUs that were sitting around waiting for wingmen to report back and came across http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=247482. At first I thought perhaps it was a bug and the task count would go down as tasks timed out or were reported. But it went from 74xx last night to 76xx when I just checked, so obviously something is wrong here. Not being sure what to do, I figured I'd post it here so that someone who knows more, a mod, or a project admin would see it. |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
All but 3 tasks are under 'In Progress', and none are listed as 'Invalid or Error'. Info from one of the "inconclusive" WUs:
Name: de_separation_19_3s_fix_1_313826_1295527124_0
Workunit: 222092203
Created: 20 Jan 2011 12:38:48 UTC
Sent: 20 Jan 2011 12:38:52 UTC
Received: 20 Jan 2011 14:59:38 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 247482
Report deadline: 28 Jan 2011 12:38:52 UTC
Run time: 687.59375
CPU time: 27.64063
stderr out:
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
<search_application> milkywayathome separation 0.50 Windows x86 double OpenCL </search_application>
Found 1 platforms
Platform 0 information:
Platform name: NVIDIA CUDA
Platform version: OpenCL 1.0 CUDA 3.2.1
Platform vendor:
Platform profile:
Platform extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Using device 0 on platform 0
Found 1 CL devices
Device GeForce GTX 460 (NVIDIA Corporation:0x10de)
Type: CL_DEVICE_TYPE_GPU
Driver version: 263.06
Version: OpenCL 1.0 CUDA
Compute capability: 2.1
Little endian: CL_TRUE
Error correction: CL_FALSE
Image support: CL_TRUE
Address bits: 32
Max compute units: 7
Clock frequency: 1420 Mhz
Global mem size: 2147024896
Max mem alloc: 536756224
Global mem cache: 114688
Cacheline size: 128
Local mem type: CL_LOCAL
Local mem size: 49152
Max const args: 9
Max const buf size: 65536
Max parameter size: 4352
Max work group size: 1024
Max work item dim: 3
Max work item sizes: { 1024, 1024, 64 }
Mem base addr align: 4096
Min type align size: 128
Timer resolution: 1000 ns
Double extension: MW_CL_KHR_FP64
Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
Compiler flags: -cl-mad-enable -cl-no-signed-zeros -cl-strict-aliasing -cl-finite-math-only -DUSE_CL_MATH_TYPES=0 -DUSE_MAD=1 -DUSE_FMA=0 -cl-nv-verbose -DDOUBLEPREC=1 -DMILKYWAY_MATH_COMPILATION -DNSTREAM=3 -DFAST_H_PROB=1 -DAUX_BG_PROFILE=0 -DUSE_IMAGES=1 -DI_DONT_KNOW_WHY_THIS_DOESNT_WORK_HERE=1
Build status: CL_BUILD_SUCCESS
Build log: : Considering profile 'compute_20' for gpu='sm_21' in 'cuModuleLoadDataEx_4'
Kernel work group info: Work group size = 512 Kernel local mem size = 0 Compile work group size = { 0, 0, 0 }
Group size = 64, per CU = 32, threads per CU = 2048
Block size = 14336
Desired = 79
Min sol: 79 25088
Lower n solution: n = 79, x = 25088
Higher n solution: n = 79, x = 25088
Using solution: n = 79, x = 25088
Range: { nu_steps = 640, mu_steps = 1600, r_steps = 1400 }
Iteration area: 2240000
Chunk estimate: 79
Num chunks: 79
Added area: 25088
Effective area: 2265088
Integration time: 661.995528 s. Average time per iteration = 1034.368012 ms
<background_integral> 0.00050653271838051195 </background_integral>
<stream_integrals> 30.29634760623997300000 418.20851865009780000000 1225.58300161937880000000 </stream_integrals>
<background_only_likelihood> -3.17210622225932020000 </background_only_likelihood>
<stream_only_likelihood> -118.87773635757581000000 -3.83210281007669450000 -10.20086471572446700000 </stream_only_likelihood>
<search_likelihood> -2.97032635821775330000 </search_likelihood>
15:59:26 (2084): called boinc_finish
</stderr_txt>
]]>
Doesn't expecting the unexpected make the unexpected the expected? If it makes sense, DON'T do it. |
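For anyone wondering whether that returned result itself looks sane: the numbers in the quoted stderr output appear to be internally consistent. The sketch below only re-does the arithmetic on values copied from the log. The interpretations in the comments (iteration area being mu_steps times r_steps, the added area being padding up to a multiple of the block size) are inferred from the numbers and are assumptions, not something the application documents in this thread. The function name `check_log_arithmetic` is made up for illustration.

```python
# Values copied from the stderr output quoted in the post above.
MU_STEPS, R_STEPS, NU_STEPS = 1600, 1400, 640
ITERATION_AREA = 2_240_000
ADDED_AREA = 25_088
EFFECTIVE_AREA = 2_265_088
NUM_CHUNKS = 79
COMPUTE_UNITS = 7
GROUP_SIZE, GROUPS_PER_CU, THREADS_PER_CU = 64, 32, 2048
BLOCK_SIZE = 14_336
INTEGRATION_TIME_S = 661.995528
AVG_ITER_MS = 1034.368012


def check_log_arithmetic():
    """Return a dict of named consistency checks (True means the numbers agree)."""
    return {
        # Assumption: one "iteration" covers the mu x r plane.
        "iteration_area": MU_STEPS * R_STEPS == ITERATION_AREA,
        # Effective area is the iteration area plus the added area.
        "effective_area": ITERATION_AREA + ADDED_AREA == EFFECTIVE_AREA,
        # Threads per compute unit matches group size x groups per CU.
        "threads_per_cu": GROUP_SIZE * GROUPS_PER_CU == THREADS_PER_CU,
        # Block size matches compute units x threads per CU.
        "block_size": COMPUTE_UNITS * THREADS_PER_CU == BLOCK_SIZE,
        # Assumption: the added area pads the work to a whole number of blocks.
        "padded_to_block": EFFECTIVE_AREA % BLOCK_SIZE == 0,
        # The work splits evenly into the reported number of chunks.
        "even_chunks": EFFECTIVE_AREA % NUM_CHUNKS == 0,
        # Average time per iteration is total time divided by nu_steps.
        "avg_iteration": abs(INTEGRATION_TIME_S / NU_STEPS * 1000 - AVG_ITER_MS) < 1e-3,
    }


if __name__ == "__main__":
    for name, ok in check_log_arithmetic().items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

If all of these agree, the one result the host did return looks like a normal GPU run, which fits the idea that the problem is in how tasks are being fetched rather than in how they are computed.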
Joined: 21 Nov 09 Posts: 49 Credit: 20,942,758 RAC: 0 |
In case you think it is, it's not my computer. The only reason I saw it is that I was checking on who was crunching some of my older pendings. Since the number of tasks they have is going up while I'm still waiting for them to crunch something from the 13th, I'm wondering if it's some kind of error like the SETI ghost WUs or something similar. But whether it's an error or someone cheating the system to get way more WUs than they should, it's not good. |
Joined: 10 Mar 08 Posts: 7 Credit: 60,169,291 RAC: 0 |
I would like a computer like this one: http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=243441. 10000 CPUs would be nice to help me fill my cache. Are computers like that expensive? One of my WUs is pending as "Completed, validation inconclusive", and I'm waiting on this computer; that is how I found it. |
Joined: 24 Dec 07 Posts: 1947 Credit: 240,884,648 RAC: 0 |
PM them and tell them to stop cheating... |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
Up to 9000 WUs now. At least it should stop at 10000. It is receiving WUs every minute when it requests them. It seems like a downloading error. Total credit is at 741,001 and doesn't seem to be jumping up. |
Joined: 10 Oct 07 Posts: 79 Credit: 69,337,972 RAC: 0 |
Hello, how on earth do you convince BOINC that any CPU has thousands of cores on it? I thought it was written in stone that your PC only told the truth about it? Best regards, Ian |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
"I would like a computer like this one" Something is up with this. It seems to be more than just falsely adding processors. 155 separate computers show up with nearly identical stats, and they all list as having 10000 processors. There is 30 between each computer's last contact; it seems it was set up to register as a new system each time so that it doesn't show only one computer with all of the credit. Together they have 107 million credit with a RAC of 770k. You can see this thread as to having 10000 processors. |
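The reasoning here (many host entries with identical specs, an implausible processor count, and evenly spaced last-contact times pointing to one machine that keeps re-registering) can be sketched as a simple grouping check. This is only an illustration under stated assumptions: the `Host` record, its field names, the thresholds, and the sample data are all hypothetical, and nothing here reflects the actual BOINC server schema or how the project admins investigated. Time values are in arbitrary units, since the post does not give one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical host record; field names are illustrative only."""
    host_id: int
    cpu_model: str
    ncpus: int
    gpu_desc: str
    last_contact: int  # arbitrary time units


def suspicious_groups(hosts, max_plausible_cpus=256, min_group=3):
    """Group hosts by identical specs and flag large groups with absurd CPU counts."""
    groups = defaultdict(list)
    for host in hosts:
        groups[(host.cpu_model, host.ncpus, host.gpu_desc)].append(host)

    flagged = []
    for spec, members in groups.items():
        ncpus = spec[1]
        if len(members) < min_group or ncpus <= max_plausible_cpus:
            continue
        # Evenly spaced last-contact times hint at a scripted re-registration.
        times = sorted(m.last_contact for m in members)
        gaps = {later - earlier for earlier, later in zip(times, times[1:])}
        flagged.append(
            {"spec": spec, "count": len(members), "regular_gaps": len(gaps) == 1}
        )
    return flagged


# Made-up sample data: five clones spaced 30 units apart, plus one ordinary host.
sample_hosts = [
    Host(i, "i7-980", 10000, "4x HD 5800-class", last_contact=30 * i) for i in range(5)
] + [Host(99, "Core 2 Quad", 4, "GTX 460", last_contact=7)]

if __name__ == "__main__":
    for group in suspicious_groups(sample_hosts):
        print(group)
```

A check like this cannot tell a scripted setup from a real farm of identical machines on its own; the processor count and the contact spacing are what make the group stand out.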
Joined: 13 Mar 08 Posts: 804 Credit: 26,380,161 RAC: 0 |
This issue has been escalated and is being worked on as I type this note. Thank you for the information. |
Joined: 29 Aug 07 Posts: 81 Credit: 60,360,858 RAC: 0 |
"This issue has been escalated and is being worked on as I type this note." Hi Blurf, I know this is OT, but could you please also escalate the issue of the too-old server build that makes it impossible to set MW as a backup project (i.e. a resource share of 0 is still invalid here)? Thank you. BR |
Joined: 13 Mar 08 Posts: 804 Credit: 26,380,161 RAC: 0 |
"Hi Blurf, I know this is OT, but could you please also escalate the issue of the too-old server build that makes it impossible to set MW as a backup project (i.e. a resource share of 0 is still invalid here)? Thank you." I've asked Matt to specifically respond to your concern. |
Joined: 6 May 09 Posts: 217 Credit: 6,856,375 RAC: 0 |
"This issue has been escalated and is being worked on as I type this note." We are looking into this. We'll let you know when we've figured out what's going on. "Hi Blurf, I know this is OT, but could you please also escalate the issue of the too-old server build that makes it impossible to set MW as a backup project (i.e. a resource share of 0 is still invalid here)? Thank you." The server code is not yet my domain; I'll inform the mighty Travis of your concern. |
Joined: 6 Nov 09 Posts: 12 Credit: 348,876,876 RAC: 0 |
Not sure how you get 10000 CPUs from a 980, but I have read in another forum that Blox is trying to schedule multiple instances on one rig to get round the limit of tasks in MW. It may be something to do with that, as the link is to one of his rigs. Here is a link to his page: http://www.overclock.net/blogs/blox/2050-bloxcache-boinc-caching-batch-file-initialisation.html |
Joined: 11 Dec 09 Posts: 17 Credit: 62,324,991 RAC: 98 |
Just look at that idiot: http://milkyway.cs.rpi.edu/milkyway/hosts_user.php?userid=100412 154 boxes that have been hard-coded to show 10,000 cores, all with 4(!) HD 5800-class GPUs. Having 154 i7-980s and 616 HD 5850s should be enough for everybody (provided they have more than 640k memory). |
Joined: 12 Nov 07 Posts: 2425 Credit: 524,164 RAC: 0 |
"Just look at that idiot:" I believe they are all the same system, since they all have exactly the same specs. Also, each one has a low RAC individually for a host with a GPU. Added together it makes sense: 770k RAC. |
Joined: 29 May 09 Posts: 37 Credit: 34,016,951 RAC: 0 |
Hmm, maybe MW should give the guy a job to fix the problems everyone complains about. ;) |
Joined: 29 May 09 Posts: 37 Credit: 34,016,951 RAC: 0 |
Ok, so what about THIS guy? http://boincstats.com/stats/boinc_host_stats.php?pr=bo&st=0&userid=d8bb9fbafc211071566af410be12256f Does it seem as though he's doing the same thing, even though it's on a different project, or am I mistaken? |
Joined: 30 Dec 07 Posts: 311 Credit: 149,490,184 RAC: 0 |
Haha, no not the same, "Mr Nemo" actually has a rather large number of computers running at his "house". Welcome to the Nemo cluster at UWM! Nemo Cluster Report |
Joined: 29 May 09 Posts: 37 Credit: 34,016,951 RAC: 0 |
Well, I see I stand corrected. My gosh, that is such a sight to see. It seems as though "Nemo" is living up to the name he stands for: strength and ability, reliability and resolve. (I've used "Nemo" as a logon name for work since the mid '80s and still use it today as a gaming logon for several accounts I have.) To see that the name has reached, well, almost stardom, I am beside myself looking at the two links you posted. I see myself kneeling and praising, "Nemo", "Nemo", "Nemo", with hands outstretched and bowing down, over and over again. It's just an awesome thing to see. Now, if I win the lotto any time soon, then there will be some competition rolling. |
Joined: 11 Dec 09 Posts: 17 Credit: 62,324,991 RAC: 98 |
"I believe they are all the same system being they all have the exact specs" Could it be one (or more) actual machine and 153 (or fewer) virtual machines, all on the same actual host(s), and clogging up tasks by doing so? Just my 2 cents... |
©2024 Astroinformatics Group