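d****n Posts: 1637 | 1 Opening 1000 files at the same time will definitely make the program hang.
Use the amortization principle: buffered I/O.
For example, you can't load all of the files at once.
But you can read 1000 lines from each file sequentially.
At most that is an array of 1000*1000 entries in memory.
Then record each file's lseek offset in a separate array.
Process the first 1000 lines of the 1000 files.
Write the results to a file.
Read the offsets back from the array, read the next 1000 lines of each of the 1000 files, and keep repeating.
Ideally this buffer should be around 10 million. That way you can get the computation done
without running out of memory. |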
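
The chunked approach described above could be sketched roughly as follows. This is only a minimal illustration, not the poster's code: the names `read_chunk`, `count_line` and `MAX_LINE_LEN` are hypothetical, it uses `ftell()`/`fseek()` (the stdio counterparts of `lseek()`) because it reads through a `FILE *`, and it assumes no line is longer than `MAX_LINE_LEN`.

```c
#include <stdio.h>

#define MAX_LINE_LEN 4096  /* assumption: no line is longer than this */

/* Example handler (hypothetical): just counts the lines it is given. */
void count_line(const char *line, void *ctx)
{
    (void)line;
    ++*(long *)ctx;
}

/* Read up to max_lines lines from path, starting at byte offset *pos.
 * Calls handle(line, ctx) for every line read, then stores the offset
 * to resume from in *pos. Returns the number of lines read, or -1 on
 * error. The file is only open for the duration of the call, so only
 * one descriptor is in use at any time. */
long read_chunk(const char *path, long *pos, long max_lines,
                void (*handle)(const char *line, void *ctx), void *ctx)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fseek(fp, *pos, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    char line[MAX_LINE_LEN];
    long n = 0;
    while (n < max_lines && fgets(line, sizeof line, fp) != NULL) {
        handle(line, ctx);
        n++;
    }

    long end = ftell(fp);  /* where the next round should resume */
    fclose(fp);
    if (end < 0)
        return -1;
    *pos = end;
    return n;
}
```

With one `long` offset per file (all starting at 0), each round would call `read_chunk(path[i], &offsets[i], 1000, handler, ctx)` for every file, write out the results, and repeat until every call returns 0.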
|
w*s Posts: 7227 | 2 I'm not getting it yet.
The offset for lseek is a number of bytes, not a line number,
so do I need to keep calculating it myself?
E.g., in the above example, if I remove the 2nd line, how can lseek help?
Many thanks! |
|
i*****e Posts: 113 | 3 1) open file2 with read/write access
2) set file2 size to length of file1, and lseek to eof
3) read a block (2k or 4k, i.e. one page) from file1
4) reverse the bits on the block
5) find proper place and write back to file2
6) if not done, go to 3) |
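
The steps above might look something like this in C. It is a sketch under assumptions, not the poster's implementation: it assumes the goal is to write file1 into file2 in reverse order, it reverses bytes within each block (the post says "bits"), and the names `reverse_file` and `BLOCK_SIZE` are made up for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096  /* one page, as suggested in step 3 */

/* Write the bytes of src into dst in reverse order, one block at a time.
 * Returns 0 on success, -1 on error. */
int reverse_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_RDWR | O_CREAT | O_TRUNC, 0644);  /* step 1 */
    if (out < 0) {
        close(in);
        return -1;
    }

    /* Step 2: size dst to the length of src. */
    off_t size = lseek(in, 0, SEEK_END);
    int rc = 0;
    if (size < 0 || ftruncate(out, size) != 0 || lseek(in, 0, SEEK_SET) < 0)
        rc = -1;

    unsigned char buf[BLOCK_SIZE];
    off_t done = 0;
    while (rc == 0 && done < size) {
        ssize_t n = read(in, buf, sizeof buf);           /* step 3 */
        if (n <= 0) {
            rc = -1;
            break;
        }
        for (ssize_t i = 0, j = n - 1; i < j; i++, j--) { /* step 4 */
            unsigned char tmp = buf[i];
            buf[i] = buf[j];
            buf[j] = tmp;
        }
        /* Step 5: the block read at offset `done` lands mirrored from the end. */
        if (lseek(out, size - done - n, SEEK_SET) < 0 || write(out, buf, n) != n)
            rc = -1;
        done += n;                                        /* step 6: loop */
    }

    close(in);
    close(out);
    return rc;
}
```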
|
S******1 Posts: 269 | 6 Unix has file descriptors, and you can use lseek and similar calls to
control where you read from and how much you read.
Not sure if that applies in this case. |
|
c*******1 Posts: 589 | 7 Read the manual pages for the system calls.
man 2 stat to get the file size.
man 2 lseek.
man 2 read.
Whether performance can be optimized depends on the specific requirements. |
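
As a small illustration of how those three calls fit together, here is a sketch that reads the last `n` bytes of a file. The function name `read_tail` is hypothetical and the task itself is only an example; the thread does not say what the original poster needed.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to n bytes from the end of path into buf.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_tail(const char *path, void *buf, size_t n)
{
    struct stat st;
    if (stat(path, &st) != 0)          /* man 2 stat: file size in st_size */
        return -1;

    off_t start = st.st_size > (off_t)n ? st.st_size - (off_t)n : 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t got = -1;
    if (lseek(fd, start, SEEK_SET) != (off_t)-1)  /* man 2 lseek: byte offset */
        got = read(fd, buf, n);                   /* man 2 read: may be short */

    close(fd);
    return got;
}
```

Note that `read()` is allowed to return fewer bytes than requested, so a more careful version would loop until it has everything or hits end of file.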
|
d****n Posts: 1637 | 8 1. First read through the file with getc() or getline().
Each time you reach a newline char '\n', call ftell() and save the current
position into a list/array sequentially.
Save the entire array to a text/binary file as the index.
2. When you want to pull several lines out of the huge file,
sort the line numbers in ascending order.
For each line number in your query:
binary search your previously saved index,
seek to the position stored in the index (fseek() pairs with ftell(); lseek() is the file-descriptor equivalent),
getc() or getline() from that offset until the first '\n'.
3.... |
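
Steps 1 and 2 could be sketched like this. It is a minimal illustration rather than the poster's code: `build_line_index` and `read_line_at` are hypothetical names, the index is kept in memory instead of being saved to a file, and because the array is dense (entry i is the start of line i) the sketch looks the offset up directly instead of binary searching.

```c
#include <stdio.h>
#include <stdlib.h>

/* Scan fp once and return a malloc'd array holding the byte offset at
 * which each line starts (0-based line numbers). Stores the number of
 * lines in *count. Returns NULL on allocation failure. */
long *build_line_index(FILE *fp, size_t *count)
{
    size_t cap = 1024, n = 0;
    long *idx = malloc(cap * sizeof *idx);
    if (idx == NULL)
        return NULL;

    rewind(fp);
    idx[n++] = 0;                       /* line 0 starts at offset 0 */
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c != '\n')
            continue;
        if (n == cap) {                 /* grow the array when it is full */
            long *tmp = realloc(idx, 2 * cap * sizeof *tmp);
            if (tmp == NULL) {
                free(idx);
                return NULL;
            }
            idx = tmp;
            cap *= 2;
        }
        idx[n++] = ftell(fp);           /* next line starts after the '\n' */
    }
    if (idx[n - 1] == ftell(fp))        /* drop the entry that points at EOF */
        n--;
    *count = n;
    return idx;
}

/* Copy line number lineno into buf (at most size - 1 characters).
 * Returns 0 on success, -1 if the line does not exist or the seek fails. */
int read_line_at(FILE *fp, const long *idx, size_t count, size_t lineno,
                 char *buf, int size)
{
    if (lineno >= count || fseek(fp, idx[lineno], SEEK_SET) != 0)
        return -1;
    return fgets(buf, size, fp) != NULL ? 0 : -1;
}
```

Visiting the requested line numbers in ascending order, as the post suggests, keeps the seeks moving forward through the file.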
|
s****n Posts: 700 | 9 How can I jump to the next n lines using C?
I can't create an index for the file. And I can't use lseek because the
offset is pretty random.
Thank you. |
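
Without an index, one option is simply to read forward and count newlines. A minimal sketch, with the hypothetical name `skip_lines`; it costs time proportional to the bytes skipped, which is unavoidable when line lengths are unknown:

```c
#include <stdio.h>

/* Advance fp past the next n newline characters.
 * Returns the number of lines actually skipped (less than n at EOF). */
long skip_lines(FILE *fp, long n)
{
    long skipped = 0;
    int c;
    while (skipped < n && (c = getc(fp)) != EOF) {
        if (c == '\n')
            skipped++;
    }
    return skipped;
}
```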
|
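c*****a Posts: 808 | 11 flag is declared outside main. Earlier, when I used fgets with a FILE * pointer,
nothing was repeated; I suspect that is because a pointer is involved. Now I'm doing it
with the lower-level read plus lseek(fd, atIndex, SEEK_SET), and it seems that fd in the
parent and in the child does not change when the other side changes it.
Can anyone advise?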
int flag = 1;
int main() {
    pid = fork();
    if (pid < 0)
        error
    else if (pid == 0) {
        /* child */
        do {
            if (flag == 0)
                read and write n bytes, atIndex + n
            if XXX break;
            kill(getppid(), SIGUSR1);
            signal(SIGUSR1, shandler);
            pause();
        } while (1);
        exit(3);
    } else {
        /* parent... |
|
c*****a Posts: 808 | 12 I am using lower-level I/O: lseek, open, and read. The output of the stream gets
printed twice: instead of "JOHN\n", it becomes "JOHN\nJOHN\n".
I managed to get it working by passing the current position between parent and
child with 2 signal handlers, but it is crappy code, because I
create a lot of temp variables for swapping.
However, when I do not use the lower-level stuff:
FILE *fp;
fp = fopen(filename,"r");
because fp is a pointer, if fp is changed in the parent, fp is also chan... |
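
One POSIX detail that may be relevant to the two posts above: a descriptor opened before fork() refers to the same open file description in parent and child, so the two processes share one file offset. The following sketch (the function name `offset_after_child_read` is made up for illustration) lets the child read `n` bytes and then reports the offset the parent sees:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open path, fork, let the child read n bytes (capped at 64), and return
 * the file offset the parent observes afterwards, or -1 on error. */
off_t offset_after_child_read(const char *path, size_t n)
{
    int fd = open(path, O_RDONLY);     /* opened before fork: shared */
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: advance the shared offset by reading */
        char buf[64];
        if (n > sizeof buf)
            n = sizeof buf;
        ssize_t got = read(fd, buf, n);
        _exit(got < 0 ? 1 : 0);
    }

    /* parent: wait for the child, then ask where the offset is now */
    int status;
    waitpid(pid, &status, 0);
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

By contrast, the user-space buffer inside a `FILE *` is copied by fork(), so each process ends up with its own copy of any data that was buffered but not yet consumed or flushed.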
|
w**n Posts: 88 | 13 lseek, if you are talking about system calls... |
|
m*****e Posts: 4193 | 14 Read the man page of lseek carefully. |
|