FLOPS Estimate
Author | Message |
---|---|
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
I think we still have an issue with the flops estimate. I had a long queue of tasks that was going to blow the deadline, so I fiddled with the STD to force my queue to clear. After I got all the tasks done I reset the project and, lo and behold, I D/L'd about 10 tasks from stripes s20 and s21 with an estimated time to complete of 6:03 minutes ... These tasks take about 25 to 30 min to complete on my Mac Pro ... so the number seems to be off by a factor of 4 to 5 ... I know it will settle in once I process them, but this will obviously happen on each reset, so ... thought I would let you know what I just experienced ... YMMV :) |
Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0 |
"I think we still have an issue with the flops estimate." Agreed. It is unwise to set the FPOP estimate to reflect the apparent power of the fastest hosts. The reason is that, as you pointed out, a reset or new attach starts out with a TDCF of one, which will almost always result in an overfetch for the majority of hosts. Personally, I would have set the estimate to 2 or 3 E15. One other point: the bounds value should be set somewhat higher than the estimate. On other projects, a bound set too close to the estimate has caused tasks to abort for exceeding the time limit when they didn't have to. Alinator |
Joined: 30 Aug 07 Posts: 2046 Credit: 26,480 RAC: 0 |
"I think we still have an issue with the flops estimate." I think the bound is currently set to 100x the estimate, so that shouldn't be a problem. I can raise the estimate a bit, but I think once things settle down it should be pretty accurate. |
Joined: 12 Apr 08 Posts: 621 Credit: 161,934,067 RAC: 0 |
It may also be because I am using 5.10.45 on that machine (Mac Pro), since several of the other projects I want to work with seem resistant to changing the def file that contains the ID string that the new 6.x hosts use for the Intel Macs ... If you upgrade to the 6 series then not only can you not fetch work, you cannot trickle up, nor can you report tasks that you have completed on the now-illegal host ... I know it is carping, but this is the kind of issue that bites you when you don't do engineering and just hack at the code ... |
Joined: 7 Jun 08 Posts: 464 Credit: 56,639,936 RAC: 0 |
Ahhhh yes, the bounds is set to 100X now. Actually, I meant dividing the current estimates by 4 or 5. Remember that a TDCF of 1 means the task will take the same amount of time the estimate says. So if you set the FPOP estimate based on what the fastest or most optimized hosts can do, then they will have TDCFs close to 1 and everyone else will have TDCFs greater than 1. This is the scenario that can lead to overfetches and blown deadlines when the estimated runtime changes (like we just saw). Alinator |
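
The arithmetic behind this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BOINC client's actual code: it assumes the runtime estimate is the FPOP estimate divided by the host's speed and scaled by the task duration correction factor (TDCF), that "6:03" means 6 minutes 3 seconds, and that the time limit is the FPOP bound divided by the host's speed. The host speed, the one-hour work buffer, and all function names are hypothetical.

```python
def estimated_runtime(fpops_est, host_flops, tdcf=1.0):
    """Estimated seconds to finish a task: operations / speed, scaled by TDCF."""
    return tdcf * fpops_est / host_flops


def settled_tdcf(actual_runtime, fpops_est, host_flops):
    """TDCF that would make the estimate match an observed runtime."""
    return actual_runtime / (fpops_est / host_flops)


def tasks_fetched(buffer_seconds, estimate_seconds):
    """Whole tasks needed to fill a work buffer, judged by the estimate."""
    return int(buffer_seconds // estimate_seconds)


# Hypothetical host speed; the FPOP estimate is back-computed so that a
# fresh attach (TDCF = 1) shows the 6:03 estimate reported in the thread.
HOST_FLOPS = 2.0e9
FPOPS_EST = 363 * HOST_FLOPS

fresh_estimate = estimated_runtime(FPOPS_EST, HOST_FLOPS)   # 363 s
tdcf_low = settled_tdcf(25 * 60, FPOPS_EST, HOST_FLOPS)     # about 4.1
tdcf_high = settled_tdcf(30 * 60, FPOPS_EST, HOST_FLOPS)    # about 5.0

# Overfetch: a hypothetical one-hour buffer is filled using the 6:03
# estimate, but each task really takes 25 minutes.
n_tasks = tasks_fetched(3600, fresh_estimate)
real_hours = n_tasks * 25 * 60 / 3600

# Bound at 100x the estimate, expressed as a time limit on this host.
limit_seconds = 100 * FPOPS_EST / HOST_FLOPS

print(f"fresh estimate: {fresh_estimate:.0f} s")
print(f"settled TDCF: {tdcf_low:.2f} to {tdcf_high:.2f}")
print(f"tasks fetched: {n_tasks}, real work: {real_hours:.2f} h")
print(f"time limit: {limit_seconds / 3600:.1f} h")
```

Under these assumptions the settled TDCF lands between roughly 4 and 5, in line with the suggested correction of 4 or 5, and a freshly reset host queues several times more work than its buffer asks for. A bound of 100x the estimate leaves a limit of about ten hours, far above the 25 to 30 minute runtimes.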
©2024 Astroinformatics Group