Welcome to MilkyWay@home

problem with checkpoints 2

Message boards : Application Code Discussion : problem with checkpoints 2

Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 9021 - Posted: 25 Jan 2009, 2:22:51 UTC

I'm pretty sure the application is checkpointing correctly but we're still getting the odd bad workunit. Not quite sure what's causing it, but I'm going to keep looking into the problem.
Profile Travis
Message 9024 - Posted: 25 Jan 2009, 2:26:36 UTC - in response to Message 9021.  

This is happening in all the 0.14 compiled apps, including the Gipsel app (which I think has a slightly different fix for the checkpointing problem). So I'm not quite sure whether the issue is still checkpointing or something else.
Profile Travis
Message 9041 - Posted: 25 Jan 2009, 3:30:57 UTC - in response to Message 9024.  

Pretty sure I found the issue.

At the end of calculate integrals there is:

        #ifdef GMLE_BOINC
                int retval = write_checkpoint(es);
                if (retval) {
                        fprintf(stderr,"APP: astronomy checkpoint failed %d\n",retval);
                        return retval;
                }
        #endif


So in the rare case that this is the last checkpoint written (and there hasn't been a newer one from the next integral calculation or likelihood calculation), the app will recalculate an integral when it resumes from it. I'm going to do another update, because I think this should put the final nail in the coffin of this problem.
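
To make the failure mode concrete, here is a minimal sketch of one way a checkpoint written at the end of an integral calculation could cause that integral to be redone after a restart. Everything in it is an assumption for illustration: the `eval_state` layout, the `next_integral` counter, the in-memory `saved` copy standing in for the checkpoint file, and the idea that the counter is advanced after the checkpoint is written. Only the name `write_checkpoint` comes from the snippet above, and its body here is a stand-in, not the project's code.

```c
#include <assert.h>
#include <string.h>

#define NUM_INTEGRALS 3

/* Hypothetical evaluation state; the real struct is not shown in the thread. */
typedef struct {
    int next_integral;             /* index of the next integral to compute */
    double results[NUM_INTEGRALS]; /* finished integral values */
} eval_state;

static eval_state saved;                   /* stands in for the checkpoint file */
static int times_computed[NUM_INTEGRALS];  /* bookkeeping kept outside the checkpoint */

/* Stand-in for the real write_checkpoint: copies the state, returns 0 on success. */
static int write_checkpoint(const eval_state *es) { saved = *es; return 0; }
static void read_checkpoint(eval_state *es) { *es = saved; }

/* Computes one integral and checkpoints. If advance_before_checkpoint is 0,
 * the checkpoint still points at the integral that was just finished. */
static int calculate_integral(eval_state *es, int advance_before_checkpoint) {
    int i = es->next_integral;
    assert(i >= 0 && i < NUM_INTEGRALS);
    es->results[i] = (double)(i + 1);      /* placeholder for the real work */
    times_computed[i]++;
    if (advance_before_checkpoint) es->next_integral++;
    int retval = write_checkpoint(es);
    if (retval) return retval;
    if (!advance_before_checkpoint) es->next_integral++;
    return 0;
}

/* Runs integral 0, simulates a restart from the last checkpoint, finishes the
 * run, and reports how many extra times integral 0 was computed. */
int recomputed_after_restart(int advance_before_checkpoint) {
    eval_state es;
    memset(&es, 0, sizeof es);
    memset(times_computed, 0, sizeof times_computed);
    if (calculate_integral(&es, advance_before_checkpoint)) return -1;
    read_checkpoint(&es);                  /* app is killed and resumes here */
    while (es.next_integral < NUM_INTEGRALS) {
        if (calculate_integral(&es, advance_before_checkpoint)) return -1;
    }
    return times_computed[0] - 1;
}
```

In this toy model, `recomputed_after_restart(0)` reports one repeated integral and `recomputed_after_restart(1)` reports none, so saving the state only after it already points past the finished integral removes the repeat. Whether the real fix takes that form or a different one is not stated in the thread.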

