FLOPS Estimate

Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 9680 - Posted: 4 Feb 2009, 22:12:03 UTC

I think we still have an issue with the flops estimate.

I had a long queue of tasks that was going to blow the deadline so I fiddled with the STD to force cleaning my queue. After I got all tasks done I reset the project and lo and behold I D/L about 10 tasks of stripe s20 and s21 with an estimated time to complete of 6:03 minutes ...

These tasks on my Mac Pro take about 25 to 30 min to complete ... so the number seems to be off by at least a factor of 5 ... I know it will settle in when I do process them, but, this will obviously happen on each reset so ...
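For concreteness, here is a minimal sketch of the arithmetic behind that gap, assuming the usual simplified BOINC model (estimated runtime = rsc_fpops_est / benchmark FLOPS x DCF; the real client also folds in on-fraction, CPU efficiency, and so on). All numbers are hypothetical, picked only to reproduce the 6-minute vs. 25-30-minute mismatch:

    # Simplified model of the BOINC client's runtime estimate.
    # All values below are hypothetical, not MilkyWay@home's actual settings.

    rsc_fpops_est = 9.0e11   # hypothetical server-side FLOP estimate per task
    p_fpops       = 2.5e9    # hypothetical benchmarked FLOPS of one core
    dcf           = 1.0      # duration correction factor; reset to 1.0 on project reset

    est = rsc_fpops_est / p_fpops * dcf
    print(f"estimated runtime: {est / 60:.1f} min")   # ~6 min at DCF = 1

    # As results complete, the client pulls DCF toward actual/estimated:
    actual = 27 * 60                                  # ~27 min observed on the Mac Pro
    dcf = actual / (rsc_fpops_est / p_fpops)
    print(f"settled DCF: {dcf:.1f}")                  # ~4.5 -- the 'factor of 5'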

Thought I would let you know what I just experienced ... YMMV :)
ID: 9680
Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 9686 - Posted: 4 Feb 2009, 23:05:07 UTC - in response to Message 9680.  
Last modified: 4 Feb 2009, 23:06:14 UTC

> I think we still have an issue with the flops estimate.
>
> I had a long queue of tasks that was going to blow the deadline so I fiddled with the STD to force cleaning my queue. After I got all tasks done I reset the project and lo and behold I D/L about 10 tasks of stripe s20 and s21 with an estimated time to complete of 6:03 minutes ...
>
> These tasks on my Mac Pro take about 25 to 30 min to complete ... so the number seems to be off by at least a factor of 5 ... I know it will settle in when I do process them, but, this will obviously happen on each reset so ...
>
> Thought I would let you know what I just experienced ... YMMV :)


Agreed. It is unwise to set the FPOP estimate to reflect the apparent power of the fastest hosts.

The reason is that, as you pointed out, a reset or new attach will start out with a TDCF of one, which will almost always result in an overfetch for the majority of hosts.
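To make the overfetch concrete, a rough sketch with a hypothetical work buffer and the runtimes from the post above (both numbers assumed, not measured):

    # Why a too-low estimate at TDCF = 1 over-fetches (hypothetical numbers).

    buffer_seconds = 2 * 86400   # user keeps a 2-day work buffer
    est_duration   = 6 * 60      # per-task estimate at TDCF = 1 (too low)
    true_duration  = 27 * 60     # what the tasks actually take

    tasks_fetched = buffer_seconds // est_duration    # ~480 tasks
    backlog_days  = tasks_fetched * true_duration / 86400
    print(f"{tasks_fetched} tasks fetched = {backlog_days:.1f} days of real work")
    # ~9 days of work queued against the deadline -> blown deadlines
    # until the TDCF settles back up.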

Personally, I would have set the estimate to 2 or 3 E15.

One other point: the bounds value should be set to something a bit higher than what the estimate is set to. This has caused problems on other projects, with tasks being aborted on the time limit when they didn't have to be.

Alinator
ID: 9686
Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 9691 - Posted: 4 Feb 2009, 23:36:21 UTC - in response to Message 9686.  

> > I think we still have an issue with the flops estimate. [...]
>
> Agreed. It is unwise to set the FPOP estimate to reflect the apparent power of the fastest hosts.
>
> The reason is that, as you pointed out, a reset or new attach will start out with a TDCF of one, which will almost always result in an overfetch for the majority of hosts.
>
> Personally, I would have set the estimate to 2 or 3 E15.
>
> One other point: the bounds value should be set to something a bit higher than what the estimate is set to. This has caused problems on other projects, with tasks being aborted on the time limit when they didn't have to be.
>
> Alinator


I think the bound is set to 100x what the estimate is right now, so that shouldn't be a problem. I can up the estimate a bit but I think once things settle down it should be pretty accurate.
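For the curious, a sketch of what that 100x bound buys, using the same simplified model as above (a task is aborted once its runtime exceeds roughly rsc_fpops_bound / host FLOPS; the host numbers are hypothetical):

    # Simplified model of the client-side time-limit check.
    # All values hypothetical.

    rsc_fpops_est   = 9.0e11
    rsc_fpops_bound = 100 * rsc_fpops_est   # bound = 100x estimate, per the post
    p_fpops         = 2.5e9

    max_runtime_h = rsc_fpops_bound / p_fpops / 3600
    print(f"forced-abort limit: {max_runtime_h:.0f} h")   # ~10 h of headroom
    # With a bound only slightly above a too-low estimate, slower hosts would
    # hit this limit and see 'maximum elapsed time exceeded' aborts instead.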
ID: 9691
Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 9694 - Posted: 5 Feb 2009, 0:01:20 UTC

It may also be because I am using 5.10.45 on that machine (the Mac Pro); several of the other projects I want to work with seem resistant to changing the def file that contains the ID string that the new 6.x clients use for the Intel Macs ...

If you upgrade to the 6 series, then not only can you not fetch work, you also cannot trickle up, nor can you report tasks that you have completed on the now-illegal host ...

I know it is carping, but this is the kind of issue that bites you when you don't do engineering and just hack at the code ...
ID: 9694
Alinator

Joined: 7 Jun 08
Posts: 464
Credit: 56,639,936
RAC: 0
Message 9698 - Posted: 5 Feb 2009, 0:34:52 UTC - in response to Message 9691.  
Last modified: 5 Feb 2009, 0:36:58 UTC



> I think the bound is set to 100x what the estimate is right now, so that shouldn't be a problem. I can up the estimate a bit but I think once things settle down it should be pretty accurate.


Ahhhh yes, the bound is set to 100X now.

Actually, I meant dividing the current estimates by 4 or 5. Remember that a TDCF of 1 means the task will take the same amount of time the estimate says. So if you set the FPOP estimate based on what the fastest or most optimized hosts can do, then they will have TDCFs close to 1 and everyone else will be greater than one.

This is the scenario that can lead to overfetches and blown deadlines when estimated runtimes change (as we just saw).
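A quick sketch of that re-centering, with three hypothetical hosts (runtimes and FLOPS assumed, not measured):

    # Effect of sizing rsc_fpops_est for the fastest host vs. a typical one.
    # All hosts and runtimes hypothetical.

    p_fpops = 2.5e9
    est_tuned_to_fastest = 9.0e11   # yields a 360 s estimate at TDCF = 1
    runtimes = {"fast/optimized": 360, "typical": 1620, "slow": 3240}  # seconds

    for name, secs in runtimes.items():
        tdcf = secs / (est_tuned_to_fastest / p_fpops)
        print(f"{name:15s} TDCF = {tdcf:.1f}")   # 1.0 / 4.5 / 9.0

    # Dividing the estimate by ~4-5 centers the typical host near TDCF = 1;
    # fast hosts drop below 1 (harmless) and fresh attaches over-fetch far less.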

Alinator
ID: 9698

