l*******G posts: 1191 | 1 I run MATLAB on Linux with a batch script like this:
===matlab_batch.sh===============
#!/bin/bash
# Bash script that runs the MATLAB code matlab_program.m
# repeatedly without the GUI
loopindices="1 2 3 4 5"
for loopind in $loopindices
do
    echo "running matlab $loopind"
    # Feed the program name and its input to MATLAB through a here-document
    matlab -nodesktop <<EOF1
matlab_program
$loopind
EOF1
    echo "finished running matlab $loopind"
done
==================
where matlab_program is a MATLAB script (matlab_program.m) that can be run
from the MATLAB command prompt and reads an integer, $loopind, as input
from the MATLAB command line.
The script above (matlab_batch.sh) runs fine. However, as the index $loopind
increases from 1 to 5, it slows down dramatically, even though
matlab_program.m does exactly the same work each time.
So what is going on with consecutive MATLAB processes started from the
Linux shell? The "<<EOF1 ... EOF1" section above is the usual bash trick
(a here-document) for feeding input to a non-bash program like MATLAB.
Within each iteration of the loop, a MATLAB process is started and then
exits after matlab_program finishes. So does an earlier MATLAB process
affect a later one? If so, how and why? MATLAB must be leaking a lot of
memory. My program takes 2 hours to run when loopind=1, yet it takes
20 hours by the time loopind=5.
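For readers unfamiliar with the here-document trick mentioned above, here is a minimal sketch of how it works. The function name feed_stdin is hypothetical and used only for illustration. It passes the same two lines (the program name and the loop index) to the standard input of whatever command is given, so it can be tried with cat as a stand-in before running it with matlab.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): feed two lines to a
# command's standard input through a here-document, the same way the batch
# script feeds matlab.
feed_stdin() {
    local loopind="$1"
    shift
    # "$@" is the command to run; the lines up to EOF1 become its stdin.
    # The delimiter is unquoted, so $loopind is expanded by bash first.
    "$@" <<EOF1
matlab_program
$loopind
EOF1
}
```

Calling `feed_stdin 3 cat` prints the two lines "matlab_program" and "3", which is what MATLAB would receive on its prompt. `feed_stdin 3 matlab -nodesktop` would correspond to one iteration of the loop.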
My program is not trivial (it took me a long time to write), and it is
painful to see it slow down like this. I have 24 GB of memory on a
cluster, and the MATLAB program runs on the head node without any
parallelization.
MathWorks must be doing a very bad job of memory management, especially in
the newer versions of MATLAB. Time for me to abandon MATLAB.
| l*******G posts: 1191 | 2 I finally found the reason. When the directory under which I run the
code has too many (>5000) files, MATLAB gets extremely slow. In each
iteration of the for loop ($loopind), I create about 1500 temp files to
save data to disk and then load these files one by one. Before loading
each one, I call exist(filename,'file') to make sure the file exists.
It turns out that exist(filename,'file') gets very slow when there are many
files in the folder, even though an exact filename is given. MATLAB shouldn't
need to check every file in the folder to see whether a specific file
exists, since no wildcard (*) is used in the check. It is rather ironic
that a piece of code runs slower and slower as the number of files stored
in that folder grows.
Others seem to have the same problem:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/3
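A quick way to check whether a working directory is getting crowded is to count its files before each MATLAB run. The sketch below is only an illustration: the function names count_files and warn_if_crowded are hypothetical, and the default limit of 5000 is the rough figure from this post, not a documented MATLAB threshold. It assumes a find that supports -maxdepth (GNU find on Linux does) and file names without embedded newlines.

```shell
#!/bin/bash
# Hypothetical helper: count regular files directly inside a directory
# (subdirectories are not descended into).
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Hypothetical helper: print a warning and return 1 when the directory holds
# more files than the limit. The default limit of 5000 is an assumption taken
# from the post above.
warn_if_crowded() {
    local dir="$1" limit="${2:-5000}"
    local n
    n=$(count_files "$dir")
    if [ "$n" -gt "$limit" ]; then
        echo "warning: $dir holds $n files (limit $limit)"
        return 1
    fi
    return 0
}
```

Putting `warn_if_crowded "$PWD"` just before the matlab call in the loop would show whether the slowdown tracks the file count.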
| l*******G posts: 1191 | 3 Even though I found the cause of the problem, there doesn't seem to
be a simple solution. Does MATLAB have no explicit way to turn off these
file checks on Linux? I understand that an IDE may want to track the files
in the current directory in order to find programs and so on, but when
code runs without the IDE it makes no sense!
It is terrible that MATLAB slows down just because the number of files in
a folder is large.
A sloppy workaround is to save the files in a subfolder rather than
directly in the current directory. It works, but it is ugly! It's funny
that MATLAB checks and watches for file changes in the current directory
but not in its subfolders.
The bottom line is that MATLAB monitors the files on its search path (PATH)
so it can find programs when one program calls another. If you create a lot
of files during a MATLAB session and store them in a folder on the PATH,
MATLAB gets incredibly slow. Unfortunately, the current directory is always
on the PATH, so creating a large number of files there is bad for MATLAB.
Creating files in a subdirectory is fine, because the subdirectory is not
on the PATH unless you add it to the
PATH explicitly.
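The subfolder workaround can be prepared from the batch script itself. This is a minimal sketch under stated assumptions: the function name make_tmp_subdir and the folder name tmpdata_N are hypothetical, and the MATLAB code would still need to be changed to save and load its temp files under that folder instead of the current directory.

```shell
#!/bin/bash
# Hypothetical helper: create (if needed) a per-iteration subfolder for the
# temporary data files and print its path. The "tmpdata_" prefix is an
# arbitrary name chosen for this illustration.
make_tmp_subdir() {
    local base="$1" loopind="$2"
    local sub="$base/tmpdata_$loopind"
    mkdir -p "$sub" && echo "$sub"
}
```

Inside the loop, `sub=$(make_tmp_subdir "$PWD" "$loopind")` would run before the matlab call. Removing "$sub" once the results are collected keeps the file count from growing across iterations.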