Problem with running two milkyway work units

Message boards : Number crunching : Problem with running two milkyway work units

disturber

Joined: 11 Jan 15
Posts: 8
Credit: 75,884,093
RAC: 0
Message 64132 - Posted: 29 Nov 2015, 15:27:09 UTC

I have been running two work units per GPU on two different computers, one running Windows 7 with an R9 280X and the other running Windows 10 with a 7970 card. To get higher production I offset the two tasks by 50%: I briefly pause one task at the midway point and resume it when the other starts calculating a new one.

The issue I have is that after a period of time the two work units synchronize again, increasing my compute time from 41-42 s to 44-45 s. It may not seem like much, but that is almost a 10% increase. It is a hassle to check on this daily and I may give up on that yet. Does anyone have an idea why this would be happening? As you can see, this is not related to either the OS or the card. Even with different CPUs and different OSes, the cards produce identical compute times if they are clocked the same.
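
As a rough check on the figures quoted above, the slowdown can be worked out from the two time ranges. This is only a sketch using the numbers in the post; the function name `percent_increase` is made up for illustration.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Run times quoted in the post, in seconds.
staggered = (41.0, 42.0)  # tasks offset by about 50%
synced = (44.0, 45.0)     # tasks drifted back into step

best = percent_increase(staggered[1], synced[0])    # 42 s -> 44 s
worst = percent_increase(staggered[0], synced[1])   # 41 s -> 45 s
typical = percent_increase(sum(staggered) / 2, sum(synced) / 2)  # midpoints

print(f"smallest increase: {best:.1f}%")
print(f"midpoint increase: {typical:.1f}%")
print(f"largest increase:  {worst:.1f}%")
```

The "almost 10%" figure corresponds to the largest case (41 s to 45 s); comparing the midpoints of the two ranges gives a somewhat smaller number.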

khryl

Joined: 11 Feb 11
Posts: 57
Credit: 69,475,644
RAC: 0
Message 64133 - Posted: 29 Nov 2015, 16:38:46 UTC
Last modified: 29 Nov 2015, 16:39:13 UTC

I have the same problem on my 280X: the tasks start out finishing after 37-38 seconds, and after a while (sooner or later) they take 42 seconds because they have somehow synchronized again.

You could fix that by running 3 simultaneously (less likelihood of that happening), but if you want to stick with 2 WUs at a time, I have no clue how to fix it.
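
For anyone wanting to try 3 (or more) at once: the number of tasks per GPU is normally set with an `app_config.xml` file in the project's folder inside the BOINC data directory, where `gpu_usage` is the fraction of a GPU each task claims. Below is a minimal sketch that writes such a file's contents. The app name `milkyway` and the `cpu_usage` default of 0.05 are assumptions, not values from this thread; check the application names your client actually reports before using it.

```python
import math
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int,
                     cpu_per_task: float = 0.05) -> str:
    """Build app_config.xml text that lets `tasks_per_gpu` tasks share one GPU.

    `app_name` and `cpu_per_task` are illustrative assumptions.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    # Round the GPU share down to 3 decimals so the shares never sum above 1.0.
    gpu_share = math.floor(1000 / tasks_per_gpu) / 1000

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = f"{gpu_share:.3f}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:.2f}"

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Three tasks per GPU, as suggested above.
    print(build_app_config("milkyway", 3))
```

The share is rounded down rather than to the nearest value, so that for a count like 6 the shares add up to slightly under one GPU instead of slightly over. The client has to re-read its config files (or be restarted) before a changed file takes effect.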

Mumak

Joined: 8 Apr 13
Posts: 89
Credit: 517,085,245
RAC: 0
Message 64134 - Posted: 29 Nov 2015, 19:26:42 UTC

You might enable the Milkyway@Home Separation (Modified Fit) units too and since these take a different amount of time, running both types mixed should get them 'out of sync'.

khryl

Joined: 11 Feb 11
Posts: 57
Credit: 69,475,644
RAC: 0
Message 64135 - Posted: 29 Nov 2015, 20:52:31 UTC
Last modified: 29 Nov 2015, 20:53:06 UTC

Then he will have the problem that 2 work units at once (with Separation modfits) won't be enough to keep the GPU at full load, since he has the same card as I do. I need at least 4 running simultaneously with modfits to keep my GPU at 99% (permanently).

swiftmallard

Joined: 18 Jul 09
Posts: 300
Credit: 303,562,776
RAC: 0
Message 64136 - Posted: 30 Nov 2015, 15:40:21 UTC

I had the same issue and solved it by running a CPU project as well as GPU.

disturber

Joined: 11 Jan 15
Posts: 8
Credit: 75,884,093
RAC: 0
Message 64159 - Posted: 6 Dec 2015, 15:50:17 UTC
Last modified: 6 Dec 2015, 15:51:54 UTC

I started running Einstein CPU work units. It helps some, but they still seem to synchronize over time. I stopped running the Separation modfits because they are way too short relative to their load time. The Milkyway units are way too short too, IMO. No other project has WUs that run for only tens of seconds. I have not researched it, but does anyone know why this is, or is it because the 280X is just such a fast card for these jobs?

mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,938,381
RAC: 22,848
Message 64161 - Posted: 7 Dec 2015, 12:41:02 UTC - in response to Message 64159.  

Actually, PrimeGrid's PSP units run really short on fast cards too. My guess is that fast, capable cards just hit the sweet spot sometimes.

disturber

Joined: 11 Jan 15
Posts: 8
Credit: 75,884,093
RAC: 0
Message 64168 - Posted: 12 Dec 2015, 2:58:13 UTC

I know this is not quite on topic for my original post.

Ever since I started running Einstein cpu work, I have been getting a lot of invalids and validation errors. I don't know if the interaction between cpu and gpu causes that, but I am quitting running any cpu jobs. The amount of credit from that work does not compensate for the loss from the errors and invalids.
In addition, it does not seem to appreciably affect the synchronization.

Has anyone else found that CPU work affects the error rate?

mikey

Joined: 8 May 09
Posts: 3315
Credit: 519,938,381
RAC: 22,848
Message 64172 - Posted: 12 Dec 2015, 11:25:07 UTC - in response to Message 64168.  

Do you leave a CPU core free just to keep the GPU fed? If not, that could be the problem, yes.
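
One common way to leave a core free is the BOINC computing preference that limits the percentage of CPUs used. The sketch below works out a percentage for a given core count. The helper name is hypothetical, and rounding up is an assumption made so that the result still covers the intended number of cores however the client rounds; verify the effect in your own client.

```python
def cpu_percent_leaving_free(total_cores: int, free_cores: int = 1) -> int:
    """Smallest whole percentage that still covers (total - free) cores."""
    if total_cores < 1 or not 0 <= free_cores < total_cores:
        raise ValueError("need 0 <= free_cores < total_cores")
    busy_cores = total_cores - free_cores
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-(100 * busy_cores) // total_cores)


if __name__ == "__main__":
    for cores in (2, 4, 6, 8):
        pct = cpu_percent_leaving_free(cores)
        print(f"{cores} cores: use at most {pct}% of the CPUs")
```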
