How to get a line from a large file if you know the offset


The file is a UTF-8 text file.

Each character takes a varying number of bytes, and each line has a varying number of characters.

Is there a table mapping line numbers to byte locations, or a function that does something like that?

Also, once I have the offset, how do I read the line?

2012-04-04 04:37
by user4951


The StreamReader class is the typical choice for line-by-line reading of a file. It does not maintain any history of what it has read and thus does not know where the last line ended or where the next one will end. When requested (via ReadLine), it simply processes characters until it reaches a newline sequence or the end of the file.

I do not know the actual implementation of the StreamReader, but I would assume that it uses the Encoding class to handle multi-byte encodings and only maintains a small buffer of potentially pre-read data to improve read performance (reading in chunks is better than reading just the 10 bytes you need now). Any other buffers, such as the characters in the current line, would be local to functions like ReadLine that need them.

If you need to seek randomly, you will need to use the BaseStream property to generate a table of line starts for yourself and then seek that stream to the beginning of the desired line. From there, you should be able to use ReadLine as usual.
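The idea of a line-start table can be sketched in a few lines. This is a rough illustration in Python rather than .NET, and the helper names build_line_index and read_line_at are made up for the example. It assumes the file is UTF-8 without a byte order mark and that lines end in "\n" or "\r\n". Scanning raw bytes for the newline byte is safe here because that byte never appears inside a multi-byte UTF-8 character.

```python
import io

def build_line_index(stream):
    """Scan a binary stream once and return the byte offset of each line start."""
    offsets = []
    stream.seek(0)
    while True:
        pos = stream.tell()        # byte position before reading the line
        line = stream.readline()   # raw bytes up to and including b"\n"
        if not line:               # empty result means end of file
            break
        offsets.append(pos)
    return offsets

def read_line_at(stream, offsets, line_number, encoding="utf-8"):
    """Seek to the start of a 0-based line number and decode just that line."""
    stream.seek(offsets[line_number])
    return stream.readline().decode(encoding).rstrip("\r\n")

# In-memory stand-in for a large file; a real file would be opened with open(path, "rb").
data = "héllo\nwörld ünïcode\n日本語\nlast".encode("utf-8")
stream = io.BytesIO(data)
index = build_line_index(stream)
print(index)
print(read_line_at(stream, index, 2))
```

The table costs one full pass over the file, after which any line can be fetched with a single seek. In .NET the same approach would mean seeking BaseStream to the stored offset; as far as I know you should also call DiscardBufferedData on the StreamReader before ReadLine, so that stale buffered characters are not returned.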

2012-04-04 05:15
by Gideon Engelberth
I noticed that StreamReader does not have an offset property. Is that where BaseStream kicks in? Yeah, I would need a table of line starts, I suppose - user4951 2012-04-04 05:25
+1. That's the stuff. I got it - user4951 2012-04-04 05:56