Welcome to MilkyWay@home

found a small memory error in the code

Message boards : Application Code Discussion : found a small memory error in the code
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 9234 - Posted: 26 Jan 2009, 22:43:43 UTC
Last modified: 27 Jan 2009, 22:16:19 UTC

I don't think this is affecting anything at the moment, but it should be fixed and will be in v0.18.

At the end of calculate_integral_convolved in evaluation_optimized.c:

        for (i = 0; i < ap->convolve; i++) {
                free(N[i]);
                free(r_point[i]);
                free(r3[i]);
        }


needs to be changed to:
        for (i = 0; i < ia->r_steps; i++) {
                free(N[i]);
                free(r_point[i]);
                free(r3[i]);
        }
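
For context, a minimal sketch of why the bound matters (the allocation shape here is reconstructed from the fix above, not copied from the project source): the row arrays are set up with one entry per r step, so the free loop has to run over the same count.

        /* hedged sketch -- allocation with ia->r_steps rows, each row
           holding ap->convolve doubles, as the fix above implies */
        double **N = (double**)malloc(ia->r_steps * sizeof(double*));
        for (i = 0; i < ia->r_steps; i++) {
                N[i] = (double*)malloc(ap->convolve * sizeof(double));
        }

        /* ... integration ... */

        /* the free loop must use the same bound as the allocation loop;
           looping over ap->convolve instead leaks the remaining rows
           whenever r_steps > convolve */
        for (i = 0; i < ia->r_steps; i++) {
                free(N[i]);
        }
        free(N);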

Profile speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 9239 - Posted: 26 Jan 2009, 23:42:23 UTC - in response to Message 9234.  

Can you please wait a while before releasing a new version?
As long as there are no severe problems it's time to settle down a bit... ;)

mic.


Profile Kokomiko
Joined: 27 Sep 07
Posts: 8
Credit: 25,779,225
RAC: 0
Message 9242 - Posted: 27 Jan 2009, 1:17:56 UTC

How about a release number instead of a new version for minor bug fixes? Or is there a competition somewhere for the most versions in a week? ;)
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 9243 - Posted: 27 Jan 2009, 1:52:06 UTC - in response to Message 9239.  

Can you please wait a while before releasing a new version?
As long as there are no severe problems it's time to settle down a bit... ;)


That's why I didn't update it yet :D This shouldn't have any effect other than a small memory leak.
Profile speedimic
Joined: 22 Feb 08
Posts: 260
Credit: 57,387,048
RAC: 0
Message 9279 - Posted: 27 Jan 2009, 22:13:14 UTC - in response to Message 9234.  

I don't think this is affecting anything at the moment, but it should be fixed and will be in v0.18.

At the end of calculate_integral_convolved in evaluation_state.c:


Didn't find it there... you mean evaluation_optimized.c, right?
mic.


Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 9281 - Posted: 27 Jan 2009, 22:16:08 UTC - in response to Message 9279.  
Last modified: 27 Jan 2009, 22:16:27 UTC

I don't think this is affecting anything at the moment, but it should be fixed and will be in v0.18.

At the end of calculate_integral_convolved in evaluation_state.c:


Didn't find it there... you mean evaluation_optimized.c, right?


Whoops, yeah. Edited the post :)
Profile [AF>HFR>RR] ThierryH

Joined: 2 Jan 08
Posts: 23
Credit: 495,882,464
RAC: 0
Message 13433 - Posted: 1 Mar 2009, 0:30:29 UTC
Last modified: 1 Mar 2009, 0:30:54 UTC

Another memory error in evaluation_optimized.c:

In the function free_constants, the lines:

for (i = 0; i < ap->number_streams; i++) {
        free(xyz[i]);
}

must be replaced by:

for (i = 0; i < ap->convolve; i++) {
        free(xyz[i]);
}

With this error, each work unit creates a memory leak of 1.5 KB.

Thierry.
Profile Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 13601 - Posted: 2 Mar 2009, 5:48:36 UTC

Um, am I the only one that thinks that this might be one of the reasons my computers turn into turtles?
Profile Travis
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Aug 07
Posts: 2046
Credit: 26,480
RAC: 0
Message 22964 - Posted: 21 May 2009, 23:09:29 UTC - in response to Message 13433.  

Another memory error in evaluation_optimized.c:

In the function free_constants, the lines:

for (i = 0; i < ap->number_streams; i++) {
        free(xyz[i]);
}

must be replaced by:

for (i = 0; i < ap->convolve; i++) {
        free(xyz[i]);
}

With this error, each work unit creates a memory leak of 1.5 KB.

Thierry.


Thanks, this will be in the newest version of the code.
Misfit
Joined: 27 Aug 07
Posts: 915
Credit: 1,503,319
RAC: 0
Message 22974 - Posted: 22 May 2009, 0:58:44 UTC - in response to Message 13601.  

Um, am I the only one that thinks that this might be one of the reasons my computers turn into turtles?

Check the turtle section in the BOINC WIKI for help.
me@rescam.org
Profile Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 23017 - Posted: 22 May 2009, 13:14:26 UTC - in response to Message 22974.  

Um, am I the only one that thinks that this might be one of the reasons my computers turn into turtles?

Check the turtle section in the BOINC WIKI for help.

Just did thank you ...

And strangely enough this code was there as an example ...
Misfit
Joined: 27 Aug 07
Posts: 915
Credit: 1,503,319
RAC: 0
Message 23038 - Posted: 22 May 2009, 19:07:49 UTC - in response to Message 23017.  

Um, am I the only one that thinks that this might be one of the reasons my computers turn into turtles?

Check the turtle section in the BOINC WIKI for help.

Just did thank you ...

And strangely enough this code was there as an example ...

And they made it a sticky
me@rescam.org
Profile Paul D. Buck

Joined: 12 Apr 08
Posts: 621
Credit: 161,934,067
RAC: 0
Message 23073 - Posted: 23 May 2009, 0:38:44 UTC - in response to Message 23038.  

Um, am I the only one that thinks that this might be one of the reasons my computers turn into turtles?

Check the turtle section in the BOINC WIKI for help.

Just did thank you ...

And strangely enough this code was there as an example ...

And they made it a sticky

Interesting...

Well, I don't see the point of having 4 or 5 different documentation sites... Trac Wiki for development, UCB wiki for no information to speak of ... UBW, BOINC FAQ Service ... lord only knows how many others ... Fracturing the information like that means we have that many more chances to make mistakes, and updating the same fact in 5 places means you just spent 5 times as much effort as needed to get the information right ...

It was why I quit ...

It is obvious to me that the powers that be, Dr. Anderson and the project leaders, are quite simply not interested in a good repository of information about BOINC.

John37xxxx has been trying to get some other social networking stuff up and running ... and I wish him well, but that too will likely fail because there is no project support. Sadly, the projects just cannot see that when they don't support these efforts, they die, and that is one of the reasons we have only a couple hundred thousand participants worldwide ...
dobrichev

Joined: 30 May 09
Posts: 9
Credit: 105,674
RAC: 0
Message 26947 - Posted: 2 Jul 2009, 21:26:56 UTC

Here is a list of the memory leaks I found in the v0.18 application source.

file boinc_astronomy.C

void worker()
...
free(sp);
free_search_parameters(s); // <-- missing
...


file evaluation_optimized.c

void free_constants(ASTRONOMY_PARAMETERS *ap)
...
//for (i = 0; i < ap->number_streams; i++) { // <-- wrong
for (i = 0; i < ap->convolve; i++) { // correct
...


file evaluation_state.c

void free_state(EVALUATION_STATE* es) {
int i;
free(es->stream_integrals);
for (i = 0; i < es->number_integrals; i++) {
free_integral_area(es->integral[i]);
free(es->integral[i]); // <-- missing
}
free(es->integral);
}

int read_checkpoint(EVALUATION_STATE* es) {
...
if (1 > fscanf(file, "background_integral: %lf\n", &(es->background_integral))) return 1;
free(es->stream_integrals); // <-- missing; read_double_array below allocates memory
es->number_streams = read_double_array(file, "stream_integrals", &(es->stream_integrals));
...


file search_parameters.c

void free_search_parameters(SEARCH_PARAMETERS *parameters) {
free(parameters->search_name);
free(parameters->parameters);
free(parameters->metadata);
free(parameters); // <-- missing
}

=====

There is also a 4-byte memory leak somewhere in the BOINC library, probably in diagnostics_win.cpp:
diagnostics_threads.push_back(pThreadEntry);

=====

Additionally, when a checkpoint is loaded, the memory has previously been allocated based on the workunit, but the loops that read the data are based on the checkpoint content. This causes heap corruption when the checkpoint file was generated for another workunit, due to the difference in data member counts. The same could occur if the checkpoint file is corrupt for some reason.
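
As an illustration only (the field name and format string below are assumptions, not taken from the real checkpoint layout), a defensive read could compare the checkpoint's dimensions with the workunit's before touching the allocated arrays:

        /* hedged sketch: refuse to load a checkpoint whose dimensions do not
           match what was allocated for the current workunit */
        int cp_number_streams;
        if (1 > fscanf(file, "number_streams: %d\n", &cp_number_streams)) return 1;
        if (cp_number_streams != es->number_streams) {
                fclose(file);
                return 1; /* stale or foreign checkpoint: recompute instead of corrupting the heap */
        }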

=====

In general, since the amount of required heap memory is known at the beginning, all the memory allocation calls could be reduced to one allocation per data type. This should affect performance, especially on multiprocessor machines.
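
A minimal sketch of that idea (variable names are illustrative, not the project's): one contiguous block per data type plus a table of row pointers, so the per-row mallocs and frees disappear while rows[i][j] is used exactly as before.

        /* hedged sketch: one block of r_steps * convolve doubles instead of
           r_steps separate allocations */
        double *block = (double*)malloc(r_steps * convolve * sizeof(double));
        double **rows = (double**)malloc(r_steps * sizeof(double*));
        for (i = 0; i < r_steps; i++) {
                rows[i] = block + (size_t)i * convolve;
        }

        /* ... computation using rows[i][j] ... */

        free(rows);
        free(block); /* two frees instead of r_steps + 1 */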
Cluster Physik

Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 26976 - Posted: 3 Jul 2009, 7:05:14 UTC - in response to Message 26947.  

In general, since the amount of required heap memory is known at the beginning, all the memory allocation calls could be reduced to one allocation per data type. This should affect performance, especially on multiprocessor machines.

I doubt it would have any significant effect on the performance. Why should it?
dobrichev

Joined: 30 May 09
Posts: 9
Credit: 105,674
RAC: 0
Message 26987 - Posted: 3 Jul 2009, 14:12:41 UTC - in response to Message 26976.  

All memory allocations are executed within seconds, so this will NOT affect the MW application performance. The compiler runtime usually requests larger memory blocks from the OS, and later malloc calls are managed internally by the runtime within these blocks. When the amount of necessary space is known (which is our case), it is better to allocate the memory at once instead of giving the runtime a chance to fragment the memory or to request additional space from the OS. At least this is the way I like to code.
The really slow operation is the OS memory request on a multiprocessor system, due to the locks. Frequent OS memory allocation requests will affect the rest of the running processes and cause delays if some really memory-intensive applications are running in parallel. This is not the case for desktop machines.
Actually, I am not sure how the BOINC environment runs the application. Probably on process termination all the memory leaks go away along with the process itself, so there might be no memory allocation problems at all.

A real benefit would come from replacing arrays laid out as x0, y0, z0, x1, y1, z1... with x0, x1... y0, y1... z0, z1..., which allows effective loop vectorization. But this is another matter.
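
A minimal sketch of the layout change being suggested (CONVOLVE, dist and the loop body are illustrative, not the project's code):

        /* array-of-structs layout: components interleaved in memory */
        double xyz[CONVOLVE][3];   /* x0,y0,z0, x1,y1,z1, ... */

        /* struct-of-arrays layout: each component contiguous, so the compiler
           can load several consecutive elements with one SSE instruction */
        double xs[CONVOLVE], ys[CONVOLVE], zs[CONVOLVE];

        for (i = 0; i < CONVOLVE; i++) {
                dist[i] = sqrt(xs[i]*xs[i] + ys[i]*ys[i] + zs[i]*zs[i]);
        }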
Cluster Physik

Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 26990 - Posted: 3 Jul 2009, 14:37:28 UTC - in response to Message 26987.  

A real benefit would come from replacing arrays laid out as x0, y0, z0, x1, y1, z1... with x0, x1... y0, y1... z0, z1..., which allows effective loop vectorization. But this is another matter.

Do you mean the xyz[convolve][3] array?
Loop vectorization also works with the current code, which can be observed with the optimized Linux apps (they use unmodified code, just built with an autovectorizing compiler, giving a factor of 2 or 3 speedup on capable CPUs).
dobrichev

Joined: 30 May 09
Posts: 9
Credit: 105,674
RAC: 0
Message 27040 - Posted: 4 Jul 2009, 14:12:33 UTC - in response to Message 26990.  

Do you mean the xyz[convolve][3] array?


Exactly.

I compiled the v0.18 source on Windows. Adding some intrinsics in the innermost loop and hoisting 1/(q*q), sinb, sinl, cosb and cosl out of the loops, along with some other minor changes, resulted in a performance improvement of ~11% over Gipsel's latest SSE3 build for Windows. Half of the CPU time is consumed by the exp() function in the inner loop. Using the vectorized exp() from Intel's compiler didn't help, but storing the intermediate results in an array and executing exp() in a separate loop did.
The rearrangement I made processes two subsequent loop iterations in parallel by hand, using the XMM registers. When x[j] and x[j+1] are located sequentially in memory, they can be loaded into the register with one instruction, without wasting cycles and dirtying other registers on shuffling. This may be applicable to some other arrays too.
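
A minimal sketch of the exp() restructuring described above (array names, loop bound and the formula are illustrative, not taken from the project code):

        /* hedged sketch: do the cheap arithmetic first and stash the exp()
           arguments, then run exp() in its own tight loop that the compiler
           or a vector math library can process more efficiently */
        for (j = 0; j < convolve; j++) {
                args[j] = -(r[j] * r[j]) / (2.0 * sigma_sq);
        }
        for (j = 0; j < convolve; j++) {
                w[j] = exp(args[j]);
        }
        for (j = 0; j < convolve; j++) {
                sum += coeff[j] * w[j];
        }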

BTW, any suggestions on where and whether to publish my observations (source modifications, code and comments) are welcome. The message boards are a bit of a mess (just as I am making of this post). I am not sure if anybody is still interested in CPU (not GPU) versions, since the task lends itself to parallelization and the GPU builds therefore probably contribute almost 100% of the scientific results.
Profile banditwolf
Joined: 12 Nov 07
Posts: 2425
Credit: 524,164
RAC: 0
Message 27041 - Posted: 4 Jul 2009, 14:46:10 UTC - in response to Message 27040.  

I'm interested in a faster CPU app. I only have one computer, but faster is better. :)
Doesn't expecting the unexpected make the unexpected the expected?
If it makes sense, DON'T do it.
Cluster Physik

Joined: 26 Jul 08
Posts: 627
Credit: 94,940,203
RAC: 0
Message 27090 - Posted: 5 Jul 2009, 9:13:06 UTC - in response to Message 27040.  
Last modified: 5 Jul 2009, 9:32:37 UTC

I am not sure if anybody is still interested in CPU (not GPU) versions, since the task lends itself to parallelization and the GPU builds therefore probably contribute almost 100% of the scientific results.

That is one of the reasons I didn't post a link to a slightly improved CPU version. It is generally about 10-20% faster than the build you are referring to (I would have to look it up, it's too long ago) and it does not use any intrinsics ;). Using a CPU is largely a waste of energy compared to a GPU, so I stopped looking at the CPU application. Furthermore, the credits would be even higher for CPUs with faster applications. I think the project needs to consider an adjustment.

For comparison, I calculated a WU for 74.24 credits on a C2D E8400@stock (3GHz). It took 1820 seconds (some minor stuff was running in the background).

BTW, any suggestions on where and whether to publish my observations (source modifications, code and comments) are welcome. The message boards are a bit of a mess (just as I am making of this post).

You can either open a new thread or write Travis a private message (that's the way I did it). I think he is only interested in general modifications that will help performance on all platforms. Some of the suggestions I've made never made it into the stock app (I guess they were forgotten after the major improvements were done). And at the moment he is supposed to be preparing the rollout of the GPU project and the CUDA apps. But maybe it will help, and it should not be wasted, as the CPU project is planned to run in parallel.