Welcome to MilkyWay@home

testing new application (milkyway3)

Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 39013 - Posted: 22 Apr 2010, 15:44:04 UTC - in response to Message 39000.  

Well, decided to allow another milkyway WU to be crunched through. This WU is massive at 115 estimated hours to crunch.

I'm currently running Mandriva 2010.1 beta so I tend to see a lot of patching and updates in the beta stage. Some patches require restarts, while others require only logouts/logins.

What I've noticed is that before I shut down this last time, I had 23 hours done and 102 hours to go. I did a reboot due to the last patch, took a look, and I see I'm back at the initial value of 115 hours to go (23 hours done).

My opinion is that this 3.0 workunit is not desktop friendly: users are likely to shut down their computers at night and start up the next day, so with checkpoints this far apart I don't think the average desktop user is going to finish a workunit (mine is a Sempron 2+ GHz machine with only a basic built-in graphics card).

At 115 hours to be done before Apr 29th, this looks like something you would run on a server that's waiting for something to do 24x7, not a desktop that is only run sporadically when needed.

Another thing to consider: if the default BOINC setup switches between science projects every hour, then we have a problem with this app if it resets because the save points are too far apart.

(Right now this computer is set to change projects after 4000+ minutes due to the out-of-the-ordinary save points for cosmos, so I don't know if that is affecting milkyway's save points.)


I think there might be some problem with the Windows binaries that's causing this. It's not really how they're expected to run.

It might just be that the estimated time is off (if the progress is off).
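
To illustrate why a stuck progress value would also throw off the time estimate, here is a minimal sketch in Python. It assumes a plain linear extrapolation from elapsed time and fraction done; the function name is hypothetical and this is not the calculation the BOINC client actually performs, which may differ.

```python
def estimate_remaining_hours(elapsed_hours, fraction_done):
    """Linearly extrapolate remaining run time from progress so far.

    Returns None when no progress has been reported, because the
    extrapolation is undefined at 0% (assumption: a client would then
    have to fall back on its initial estimate).
    """
    if not 0.0 < fraction_done <= 1.0:
        return None
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


# A task that reports no progress gives nothing to extrapolate from.
stuck = estimate_remaining_hours(23.0, 0.0)

# A task that is 20% done after 20 hours has about 80 hours left.
healthy = estimate_remaining_hours(20.0, 0.2)
```

Under this assumption, a task that is 73 hours into an implied 109 hour total comes out at about 36 hours remaining, which matches the figures reported further down the thread.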

J. G. Peter Goes
Joined: 15 Feb 10
Posts: 5
Credit: 130,492
RAC: 0
Message 39023 - Posted: 22 Apr 2010, 21:21:32 UTC - in response to Message 39012.  

Yes, Win XP Pro. It runs fine and has two hours to go, exactly on schedule, but I miss the bar showing its progress....

Travis
Message 39024 - Posted: 22 Apr 2010, 21:23:44 UTC - in response to Message 39023.  

Yes, Win XP Pro. It runs fine and has two hours to go, exactly on schedule, but I miss the bar showing its progress....


Is this still happening with the newly upgraded applications?

J. G. Peter Goes
Message 39025 - Posted: 22 Apr 2010, 21:31:08 UTC - in response to Message 39023.  
Last modified: 22 Apr 2010, 21:50:23 UTC

Well, I don't upgrade it myself, so if this is done automatically.....
I've got an update (MilkyWay 3, I think); no difference.
****** It shows the bar afterwards ******. I think: problem resolved.
By the way, my PC is running 24/7 and can handle more and bigger stuff.
If you'd like to send some, you are welcome.

arkayn
Joined: 14 Feb 09
Posts: 999
Credit: 74,932,619
RAC: 0
Message 39032 - Posted: 22 Apr 2010, 21:50:34 UTC

Bad news on the OS X front as well: checkpointing is still not working. I ended up aborting the unit after 18 hours, when it was back at 7%. When I had last seen it, it was at 15%.
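
For anyone wondering what working checkpointing is supposed to buy you, here is a minimal sketch in Python. It is not the milkyway3 code and does not use the BOINC API (a real BOINC app would normally ask the client when to save via calls such as boinc_time_to_checkpoint() and boinc_checkpoint_completed()); the function names, the JSON file format, and the toy "sum the integers" workload are all assumptions for illustration.

```python
import json
import os


def save_checkpoint(path, step, acc):
    """Write the state to a temp file, then rename it into place so a
    crash mid-write cannot leave a half-written checkpoint behind."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "acc": acc}, f)
    os.replace(tmp_path, path)


def run_with_checkpoints(total_steps, path, checkpoint_every=1000,
                         stop_after=None):
    """Sum the integers 0..total_steps-1, resuming from `path` if present.

    `stop_after` simulates the machine being shut down after that many
    steps in this session; in that case None is returned.
    """
    step, acc = 0, 0
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
        step, acc = state["step"], state["acc"]

    done_this_session = 0
    while step < total_steps:
        if stop_after is not None and done_this_session >= stop_after:
            return None  # simulated shutdown; work since last save is lost
        acc += step
        step += 1
        done_this_session += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step, acc)
    return acc
```

With a scheme like this, a restart loses at most the work done since the last save, so progress should never drop from 15% back to 7% unless the saves are failing or are not being read back.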

Travis
Message 39033 - Posted: 22 Apr 2010, 21:56:19 UTC - in response to Message 39032.  

Bad news on the OS X front as well: checkpointing is still not working. I ended up aborting the unit after 18 hours, when it was back at 7%. When I had last seen it, it was at 15%.


I'll be updating them to the v0.04 binaries shortly, which should fix the checkpointing issue.

Tackleway
Joined: 17 Mar 10
Posts: 20
Credit: 5,641,904
RAC: 0
Message 39035 - Posted: 22 Apr 2010, 23:19:05 UTC

I keep getting files which run without the % done progressing, on Windows 7.
Is it me, or is it my machine?

J. G. Peter Goes
Message 39036 - Posted: 23 Apr 2010, 0:37:09 UTC

Due to the update, which took some time, everything is working perfectly.
Bars are showing and running like hell.

GeirM
Joined: 17 Mar 10
Posts: 1
Credit: 174,375,182
RAC: 13,291
Message 39061 - Posted: 23 Apr 2010, 15:36:56 UTC

I have the same problem that a couple of others have reported: the test 3 application does not work. It says "working" but shows no progress. This happens on both my PCs (Win 7 64-bit). Is there anything I can do, or should I just abort the WUs?

ftpd
Joined: 21 Nov 08
Posts: 23
Credit: 7,466,082
RAC: 0
Message 39065 - Posted: 23 Apr 2010, 17:17:26 UTC
Last modified: 23 Apr 2010, 17:18:32 UTC

The new application MW3 works OK for GTX260 - GTX 295 - GTX470 and GTX480.

Is it possible to arrange to download only MW3 WUs for Fermi cards, because all other WUs get cancelled?

Perhaps in the preferences?

Hope to hear soon!

Ton (ftpd) Netherlands

Vlad
Joined: 17 Aug 08
Posts: 1
Credit: 68,866,318
RAC: 0
Message 39095 - Posted: 24 Apr 2010, 8:20:01 UTC

Dear friends!
Something is wrong with the new milkyway3 applications: they are not working.
The progress counter sits at 0.000%, even though my processing time has almost run out. This applies to versions 0.01 and 0.03. So I deleted all the new applications, considering them non-working.


Regards, Vlad.

P.S. I very much hope that you will find the problem and fix it.





Joses
Joined: 8 Jul 09
Posts: 19
Credit: 1,667,175
RAC: 0
Message 39096 - Posted: 24 Apr 2010, 8:38:29 UTC - in response to Message 39013.  

I don't tend to leave the computer running 24x7, but for this particular 115hr WU, thought it best to let it run until completion so that you've got some results before time-out, therefore I'm leaving it on. It's now currently at 73hr done, 36hr to do, so the revised total time is more like 109hr. Still at "high priority" but I think it might just make it in time for the 27th deadline.

I think there might be some problem with the Windows binaries that's causing this. It's not really how they're expected to run.

It might just be that the estimated time is off (if the progress is off).


It's been mentioned before that this happens when you switch from application to application, so it may be more closely related to BOINC than to your estimates. The way I understand it, BOINC computes an estimate for one science project and uses a similar calculation for another. However, since I'm still seeing a largish 100+ hours, my guess is that BOINC gave a good estimate.

On a slight tangent (more work on your end)... I've heard about optimized apps, but I hope you could take advantage of mixed binary compilation, since the Intel compiler optimizes for Intel CPUs but not so well for competing CPUs like the AMDs, so us AMD CPU users are left with disappointing results. If you know what mixed binaries are, it could be something like...

compileIntel general_code.c -> general_code.obj
compileIntel important_rtns.c -> important_rtns1.obj -optimized_for_intel
compileAMD important_rtns.c -> important_rtns2.obj -optimized_for_amds
link general_code.obj+important_rtns1.obj+important_rtns2.obj -> main.exe

I know I'm oversimplifying this above, but it would let us normal users, who run BOINC without optimizing manually, enjoy some optimization benefits.
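
The part the build commands above leave out is how main.exe decides which copy of the important routines to call. Here is a rough sketch of that runtime selection in Python, just to show the idea. Everything in it is an assumption for illustration: the three dot-product functions are stand-ins for differently compiled builds of one routine, and the vendor string is passed in by hand rather than read from the CPU ("GenuineIntel" and "AuthenticAMD" are the vendor IDs those CPUs report).

```python
import math


def dot_generic(a, b):
    """Portable fallback that is safe on any CPU."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_intel(a, b):
    """Stand-in for the build of the routine tuned for Intel CPUs."""
    return float(sum(x * y for x, y in zip(a, b)))


def dot_amd(a, b):
    """Stand-in for the build of the same routine tuned for AMD CPUs."""
    return math.fsum(x * y for x, y in zip(a, b))


# One entry per tuned build, keyed by CPU vendor ID string.
KERNELS = {
    "GenuineIntel": dot_intel,
    "AuthenticAMD": dot_amd,
}


def select_kernel(vendor):
    """Pick the tuned routine for this vendor, else the generic one."""
    return KERNELS.get(vendor, dot_generic)


# Chosen once at startup; the rest of the program just calls `dot`.
dot = select_kernel("AuthenticAMD")
result = dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The important design point is that every variant must give the same answer, otherwise results from Intel and AMD hosts would not validate against each other.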

Anyways, guess we'll see if this old sempron makes it in time - with good results.
http://www.joescat.com/boinc/

arkayn
Message 39106 - Posted: 24 Apr 2010, 18:11:49 UTC - in response to Message 39095.  
Last modified: 24 Apr 2010, 18:13:10 UTC

Dear friends!
Something is wrong with the new milkyway3 applications: they are not working.
The progress counter sits at 0.000%, even though my processing time has almost run out. This applies to versions 0.01 and 0.03. So I deleted all the new applications, considering them non-working.


Regards, Vlad.

P.S. I very much hope that you will find the problem and fix it.


Travis is planning on releasing new binaries based on the 0.04 code soon, which should hopefully fix most of the issues with the new app.

Joses
Message 39108 - Posted: 24 Apr 2010, 18:32:23 UTC - in response to Message 39106.  

Well, I was expecting MW milkyway3_0.01_.... to be finished sometime today, so I called up BOINC Manager to see the results and it died. The command-line prompt spewed out a lot of complaints....
*** glibc detected *** ./boincmgr: double free or corruption (out): 0xadc00468 ***
======= Backtrace: =========
/lib/i686/libc.so.6(+0x6add1)[0xb6c7edd1]
./boincmgr[0x81801c8]
./boincmgr[0x8180745]
./boincmgr[0x807d9a1]
./boincmgr[0x807db4e]
./boincmgr[0x8377f55]
/lib/i686/libpthread.so.0(+0x5ae5)[0xb6ea0ae5]
/lib/i686/libc.so.6(clone+0x5e)[0xb6cec00e]
======= Memory map: ========
08048000-086a9000 r-xp 00000000 08:07 120154     /opt/BOINC/boincmgr
086a9000-086c3000 rw-p 00661000 08:07 120154     /opt/BOINC/boincmgr
086c3000-086d5000 rw-p 00000000 00:00 0
09d3c000-09f72000 rw-p 00000000 00:00 0          [heap]
adc00000-adc21000 rw-p 00000000 00:00 0
adc21000-add00000 ---p 00000000 00:00 0

...that was just a snippet and there was more stuff after that.
If anyone thinks that these error messages on the command line are useful, I could zip it and send it to someone. Just let me know where to send it.

I tried calling BOINC Manager again, and now see that the MW task started crunching again 3 hours ago. I told BOINC not to fetch any more work, so I'm guessing MW died and is now getting crunched again. Time done is now 82hr, with 221hr to do before Apr 29th. My guess is I'm crunching the same data again.

If Travis is going to release 0.04 soon, I'm guessing there is nothing useful in this task 3.0.01 now, so I'll probably abort the task later tonight unless someone says to keep it going.


http://www.joescat.com/boinc/

Christopher Herr
Joined: 12 Mar 10
Posts: 7
Credit: 104,940,688
RAC: 0
Message 39180 - Posted: 27 Apr 2010, 11:01:02 UTC

Hello everybody,

I got one of the MilkyWay@Home Version 3 v0.01 WUs, http://milkyway.cs.rpi.edu/milkyway/workunit.php?wuid=96023388, and it told me yesterday at about 22:00 UTC that there was no time remaining any more.
There was no progress bar either.
Now it has finally finished with a duration of 15:17 hours (initial estimate was around 12 hours, if memory serves) but with an inconclusive validation as you can see.
I'm working on an Intel C2D T7200 @ 2 GHz, 3 GB RAM, genuine Win 7 x64, ATI Radeon HD 5450 1 GB GDDR3 (SP, regrettably, and not used here, of course): http://milkyway.cs.rpi.edu/milkyway/show_host_detail.php?hostid=159241.
It was due on the 29th of April at 9:50. I realize that this is a new app and there might be some "bugs" to work out, no offence. But it is a little awkward nonetheless.
And one other thing: there is no checkpointing for this Windows app, is there? Because every time I pressed "exit boinc manager", the app began again from zero time. If I left it running and kept preempted apps in memory, then it could continue. Perhaps checkpointing isn't possible due to the nature of the algorithms used; that is one reason I can think of.
Thanks for answers and greetings from Germany!

Hope this helped and was worth the effort,
Christopher

Paul D. Buck
Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 39182 - Posted: 27 Apr 2010, 12:16:31 UTC - in response to Message 39180.  

Now it has finally finished with a duration of 15:17 hours (initial estimate was around 12 hours, if memory serves) but with an inconclusive validation as you can see.

Until the wingman comes in that is the "correct" response... it just means that you need to wait for step 2 to complete ...

Christopher Herr
Message 39186 - Posted: 27 Apr 2010, 13:13:24 UTC - in response to Message 39182.  

Until the wingman comes in that is the "correct" response... it just means that you need to wait for step 2 to complete ...


Hello Paul, thx for your answer, but I am aware of the principle of mutual validation in BOINC.
The point of my post was not the inconclusive validation but, imho, the longer-than-estimated running time and the missing progress bar, as well as the lack of checkpointing.
But perhaps these issues have been resolved already.

Greetings,
Christopher

Paul D. Buck
Message 39201 - Posted: 27 Apr 2010, 16:49:34 UTC - in response to Message 39186.  

Until the wingman comes in that is the "correct" response... it just means that you need to wait for step 2 to complete ...


Hello Paul, thx for your answer, but I am aware of the principle of mutual validation in BOINC.
The point of my post was not the inconclusive validation but, imho, the longer-than-estimated running time and the missing progress bar, as well as the lack of checkpointing.
But perhaps these issues have been resolved already.

Greetings,
Christopher

Sorry, I missed the point ...

I *THINK* Travis has fixed the progress bar for sure and there are checkpointing issues that also were addressed ... only testing will tell if he got all of the issues...

No idea about the longer run times, though that could be a side effect of the checkpointing issues ... my tasks that ran seemed to run in about the same time so this may be a YMMV issue ...

Aside from the changes to increase the science done and have better models the other goal is to ease the burden on the server by the elimination of file handling and that is something that will be well worth the wait ...

Anyway, sorry about the misunderstanding ... :)

Christopher Herr
Message 39202 - Posted: 27 Apr 2010, 16:58:39 UTC - in response to Message 39201.  

Sorry, I missed the point ...

...
Anyway, sorry about the misunderstanding ... :)

Never mind, no hard feelings ;-P, never about something this trivial.
Thx anyway.

Joses
Message 39210 - Posted: 27 Apr 2010, 18:22:21 UTC

Aborted last night, it was at about 110 hours done and 209 hours to go. All other tasks were put as suspended, so it was the only task running.

I think I'm running one of the more normal tasks right now. Even though the other tasks are suspended, BOINC appears to contact the other applications' URLs for status checks or something, and then I see BOINC restart this MW task again, so this may be one reason why the aborted task never got much below 200 hours to do.
http://www.joescat.com/boinc/

©2024 Astroinformatics Group