If I want to append the contents of a src file into the end of a dest file in Ruby, is it better to use:
while line = src.gets do
or
while buffer = src.read( 1024 )
I have seen both used and was wondering when should I use each method and why?
One is for reading "lines", one is for reading n bytes.
While byte buffering might be faster, a lot of that may disappear into the OS which likely does buffering anyway. IMO it has more to do with the context of the read--do you want lines, or are you just shuffling chunks of data around?
That said, a performance test in your specific environment may be helpful when deciding.
You have a number of options when reading a file that are tailored to different situations.
Read in the file line-by-line, but only store one line at a time:
while (line = file.gets) do
# ...
end
Read in all lines of a file at once:
file.readlines.each do |line|
# ...
end
Read the file in as a series of blocks:
while (data = file.read(block_size))
# ...
end
Read in the whole file at once:
data = file.read
It really depends on what kind of data you're working with. Generally read is better suited towards binary files, or those where you want it as one big string. gets and readlines are similar, but readlines is more convenient if you're confident the file will fit in memory. Don't do this on multi-gigabyte log files or you'll be in for a world of hurt as your system starts swapping. Use gets for situations like that.
gets will read until the end of the line based on a separator
read will read n bytes at a time
It all depends on what you are trying to read.
It may be more efficient to use read if your src file has unpredictable line lengths.