[reading & writing files] #python #file operations #readline #readlines
Reading a File
Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see open()). 'b' appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.
f = open('my_path/my_file.txt', 'r')
file_data = f.read()
f.close()
Mode Description
r Open for reading plain text
w Open for writing plain text
a Open an existing file for appending plain text
rb Open for reading binary data
wb Open for writing binary data
First open the file using the built-in function, open. This requires a string that shows the path to the file. The open function returns a file object, which is a Python object through which Python interacts with the file itself. Here, we assign this object to the variable f.
There are optional parameters you can specify in the open function. One is the mode in which we open the file. Here, we use r or read only. This is actually the default value for the mode argument.
Use the read method to access the contents from the file object. This read method takes the text contained in a file and puts it into a string. Here, we assign the string returned from this method into the variable file_data.
When finished with the file, use the close method to free up any system resources taken up by the file.
Writing to a File
f = open('my_path/my_file.txt', 'w')
f.write("Hello there!")
f.close()
Open the file in writing ('w') mode. If the file does not exist, Python will create it for you. If you open an existing file in writing mode, any content that it had contained previously will be deleted. If you're interested in adding to an existing file, without deleting its content, you should use the append ('a') mode instead of write.
Use the write method to add text to the file.
Close the file when finished.
Too Many Open Files
Run the following script in Python to see what happens when you open too many files without closing them!
files = []
for i in range(10000):
files.append(open('some_file.txt', 'r'))
print(i)
With
Python provides a special syntax that auto-closes a file for you once you're finished using it.
with open('my_path/my_file.txt', 'r') as f:
file_data = f.read()
This with keyword allows you to open a file, do operations on it, and automatically close it after the indented code is executed, in this case, reading from the file. Now, we don’t have to call f.close()! You can only access the file object, f, within this indented block.
readline
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
>>>
>>> f.readline()
'This is the first line of the file.\n'
>>> f.readline()
'Second line of the file\n'
>>> f.readline()
''
For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:
>>>
>>> for line in f:
... print(line, end='')
...
This is the first line of the file.
Second line of the file
If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
Alternatively
Conveniently, Python will loop over the lines of a file using the syntax for line in file. I can use this to create a list of lines in the file. Because each line still has its newline character attached, I remove this using .strip().
camelot_lines = []
with open("camelot.txt") as f:
for line in f:
camelot_lines.append(line.strip())
print(camelot_lines)
Outputs:
["We're the knights of the round table", "We dance whenever we're able"]
Or another example
filepath = 'Iliad.txt'
with open(filepath) as fp:
line = fp.readline()
cnt = 1
while line:
print("Line {}: {}".format(cnt, line.strip()))
line = fp.readline()
cnt += 1
The above code snippet opens a file object stored as a variable called fp, then reads in a line at a time by calling readline on that file object iteratively in a while loop and prints it to the console.
Another Example:
filepath = 'Iliad.txt'
with open(filepath) as fp:
for cnt, line in enumerate(fp):
print("Line {}: {}".format(cnt, line))
In this implementation we are taking advantage of a built-in Python functionality that allows us to iterate over the file object implicitly using a for loop in combination of using the iterable object fp. Not only is this simpler to read but it also takes fewer lines of code to write, which is always a best practice worthy of following.
readlines()
readlines() may be used to read multiple lines.
Other types of objects need to be converted – either to a string (in text mode) or a bytes object (in binary mode) – before writing them:
>>>
>>> value = ('the answer', 42)
>>> s = str(value) # convert the tuple to string
>>> f.write(s)
18
f.tell() returns an integer giving the file object’s current position in the file represented as number of bytes from the beginning of the file when in binary mode and an opaque number when in text mode.
To change the file object’s position, use f.seek(offset, whence). The position is computed from adding offset to a reference point; the reference point is selected by the whence argument. A whence value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. whence can be omitted and defaults to 0, using the beginning of the file as the reference point.
>>>
>>> f = open('workfile', 'rb+')
>>> f.write(b'0123456789abcdef')
16
>>> f.seek(5) # Go to the 6th byte in the file
5
>>> f.read(1)
b'5'
>>> f.seek(-3, 2) # Go to the 3rd byte before the end
13
>>> f.read(1)
b'd'
In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)) and the only valid offset values are those returned from the f.tell(), or zero. Any other offset value produces undefined behaviour.
File objects have some additional methods, such as isatty() and truncate() which are less frequently used; consult the Library Reference for a complete guide to file objects.