Why are there extra blank lines in my Python program output?

I'm not particularly experienced with Python, so I may be doing something silly below. I have the following program:

import os
import re
import linecache

LINENUMBER = 2

angles_file = open("d:/UserData/Robin Wilson/AlteredData/ncaveo/16-June/scan1_high/000/angles.txt")

lines = angles_file.readlines()

for line in lines:
    splitted_line = line.split(";")
    DN = float(linecache.getline(splitted_line[0], LINENUMBER))
    Zenith = splitted_line[2]
    output_file = open("d:/UserData/Robin Wilson/AlteredData/ncaveo/16-June/scan1_high/000/DNandZenith.txt", "a")
    output_file.write("0\t" + str(DN) + "\t" + Zenith + "\n")
    #print >> output_file, str(DN) + "\t" + Zenith
    #print DN, Zenith

output_file.close()

When I look at the output to the file I get the following:

0   105.5     0.0

0   104.125  18.0

0   104.0    36.0

0   104.625  54.0

0   104.25   72.0

0   104.0    90.0

0   104.75  108.0

0   104.125 126.0

0   104.875 144.0

0   104.375 162.0

0   104.125 180.0

These are the right numbers, but there is a blank line between each line of output. I've tried and tried to remove them, but I can't seem to. What am I doing wrong?

Robin

python

2009-06-16 13:47
by robintw

Somebody PLEASE remove the "readline" tag from this question. The questioner's problem is not specific to the readline method of Python file objects [which the questioner is not using anyway; he's using readlines] and is entirely unrelated to the *x console-reading readline facility (which appears to be the topic of almost all other questions tagged with "readline") - John Machin 2009-06-17 15:22

For a GENERAL solution, remove the trailing newline from your INPUT:

splitted_line = line.rstrip("\n").split(";")

Removing the extraneous newline from your output "works" in this case but it's a kludge.

ALSO: (1) It's not a good idea to open your output file in the middle of a loop; do it once, otherwise you are just wasting resources. With a long enough loop, you may run out of file handles and crash. (2) It's not a good idea to hard-wire file names like that, especially hidden in the middle of your script; try to make your scripts reusable.
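Putting those points together, here is a minimal sketch rather than a definitive rewrite. The function name write_dn_and_zenith and its parameters are hypothetical (they are not in the question); the append mode "a" and the linecache lookup are kept from the question's code, and skipping blank lines is an added assumption.

```python
import linecache


def write_dn_and_zenith(angles_path, output_path, line_number=2):
    """Hypothetical helper: copy the DN value and zenith angle for each
    entry in the angles file to the output file, one entry per line."""
    # Open both files once, outside the loop; "with" closes them for us.
    with open(angles_path) as angles_file, open(output_path, "a") as output_file:
        # Iterate over the file directly instead of calling readlines().
        for line in angles_file:
            if not line.strip():
                continue  # assumption: blank lines carry no data
            # Remove the newline from the INPUT before splitting.
            fields = line.rstrip("\n").split(";")
            dn = float(linecache.getline(fields[0], line_number))
            zenith = fields[2]
            output_file.write("0\t" + str(dn) + "\t" + zenith + "\n")
```

The two paths could then come from the command line (sys.argv, for example) instead of being hard-wired into the script.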

2009-06-16 13:58
by John Machin

why is that a kludge? he's taking a string, works with its start and uses the rest for the output. what's wrong with that - SilentGhost 2009-06-16 14:08

I am not sure why you would want to remove a character that you intend on adding to the output anyways - Andrew Hare 2009-06-16 14:14

Because the newline doesn't belong to the piece of data that he's bothered to give a presumably meaningful name (Zenith). It's only a coincidence that it's "the rest". Next episode: the output is not to a text file, it's inserted into a database column, and somebody is back here asking about the line break in the middle of a row in the CSV file that they got back from their query - John Machin 2009-06-16 14:21

well, John, it would be useful then to give a link to readlines docs, explaining how \n got there in the first place rather than provide useless piece of code, wouldn't it - SilentGhost 2009-06-16 14:58

A link to readlines docs would be pointless, because I'm recommending against it because (a) it's old hat (b) it reads the whole file into memory, which is not necessary. The newline got there because that's what reading a line does, whether with f.readline(), f.readlines(), or "for line in f". The OP had already been told by the first responder that the newline was the problem. Why do you not suggest to the first responder that he should have given a reference to the readline docs? Why do you say that using .rstrip('\n') is a "useless piece of code" - John Machin 2009-06-16 15:50

Amen, John, amen - tzot 2009-12-03 00:00

Change this:

output_file.write("0\t" + str(DN) + "\t" + Zenith + "\n")

to this:

output_file.write("0\t" + str(DN) + "\t" + Zenith)

The Zenith string already contains the trailing \n from the original file when you read it in.
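A quick way to see this for yourself, using an in-memory file in place of angles.txt (the sample contents are made up for illustration):

```python
import io

# Stand-in for angles.txt; the file names and values are invented.
sample = io.StringIO("a.txt;1;0.0\nb.txt;2;18.0\n")

lines = sample.readlines()
fields = lines[0].split(";")

# repr() makes the newline visible at the end of the last field.
print(repr(fields[2]))  # prints '0.0\n'
```

Writing that field followed by another "\n" is what produces the blank line.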

2009-06-16 13:48
by Andrew Hare

An alternative solution (handy if you are processing lines from a file) is to strip the whitespace:

Zenith = Zenith.strip()

2009-06-16 13:57
by slovon

Another kludge. If you are interested in all but the newline, do line = line.rstrip('\n'); otherwise, if you want to remove whitespace, do it from ALL fields: splitted_line = [x.strip() for x in line.split(';')] - John Machin 2009-06-16 14:03

EDIT: See comments for details, but there's definitely a better way. [:-1] isn't the best choice, no matter how cool it looks. Use line.rstrip('\n') instead.

The problem is that, unlike file_text.split('\n'), file.readlines() does not remove the \n from the end of each line of input. My default pattern for parsing lines of text goes like this:

with open(filename) as f:
    for line in f.readlines():
        parse_line(line[:-1]) # funny face trims the '\n'
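To illustrate why the EDIT above prefers rstrip, here is a small comparison on made-up input whose last line has no terminating newline:

```python
# Invented sample: the final line is not terminated by a newline.
lines = ["a.txt;1;0.0\n", "b.txt;2;18.0"]

sliced = [line[:-1] for line in lines]
stripped = [line.rstrip("\n") for line in lines]

print(sliced)    # the slice eats the final "0" of the last line
print(stripped)  # rstrip removes the newline only if there is one
```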

2009-06-16 14:02
by ojrac

Ohhh and the 3rd thing the OP's doing wrong is using readlines. He should be doing "for line in angles_file:". And you shouldn't be using funny faces; it's entirely possible that the last line is not terminated by a newline and your funny face will munch the last character. Use .rstrip('\n') which loses the newline IF ANY - John Machin 2009-06-16 14:44

As much as I like the idea of the funny face munching on the last non-\n character, I admit... that's bad behavior. ;) Good point on the readlines() vs. for line in file - ojrac 2009-06-16 14:54

If you want to make sure there's no whitespace on any of your tokens (not just the first and last), try this:

splitted_line = map(str.strip, line.split(';'))
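Note that in Python 3, map returns a lazy iterator rather than a list, so indexing the result directly will not work there. A list comprehension behaves the same way in both Python 2 and 3; the sample line below is made up:

```python
# Invented sample line with stray whitespace around every field.
line = " a.txt ; 1 ; 0.0 \n"

# Strip each field; this also removes the trailing newline.
splitted_line = [field.strip() for field in line.split(";")]

print(splitted_line)  # prints ['a.txt', '1', '0.0']
```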

2009-06-16 14:01
by eduffy