Mock reading a file line by line with Python
Easily test Python code that reads a .csv file using unittest.mock.mock_open
For one of my recent Python projects, I had to read a .csv file containing a test data set line by line and map each record to one of four regression models. My solution was to create a generator that yields every line read from the file.
Before returning the data, I packed it into a rudimentary DataRecord object that indicates whether the record is a header or a data line, together with the data read from the file.
The generator reads the given source file line by line and returns each line in the form of a DataRecord. In my example, every line that is not a header is expected to consist only of floating-point numbers. The DataRecord itself carries nothing more than the header flag and the data of the line.
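A minimal sketch of what such a record and generator might look like is shown below. The names read_records, is_header and data are assumptions made for illustration, as is the rule that the first non-empty line is the header; the code in the repository may be structured differently.

```python
from dataclasses import dataclass
from typing import Iterator, List, Union


@dataclass
class DataRecord:
    """One line of the file: either the header or a row of floats."""
    is_header: bool
    data: List[Union[str, float]]


def read_records(path: str) -> Iterator[DataRecord]:
    """Yield a DataRecord for every non-empty line of the file at `path`."""
    with open(path, "r") as source:
        header_seen = False
        for line in source:
            stripped = line.strip()
            if not stripped:
                continue  # skip blank lines, e.g. a trailing newline
            fields = stripped.split(",")
            if not header_seen:
                # Assumption: the first non-empty line is the header.
                header_seen = True
                yield DataRecord(is_header=True, data=fields)
            else:
                # Data lines are expected to contain only floats;
                # float() raises ValueError on anything else.
                yield DataRecord(is_header=False,
                                 data=[float(value) for value in fields])
```

Because read_records is a generator, the file is opened only when iteration starts, and only one line is held in memory at a time.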
The built-in Python function open is used to open the file, and each line is then read from it. When I started writing unit tests for the file reading and parsing functionality, I struggled a little to figure out how to mock the built-in open function and inject test data into it, so that my tests would not depend on the existence of a physical file.
I found what I was looking for in the Python documentation for unittest.mock.mock_open.
In the unit test, to make it easier to assert on the lines read, I used a list of lines with the header line as the first item, followed by the data lines. The tricky part was figuring out what to set the read_data parameter of mock_open to. It must be a single string of newline-delimited lines, which the code under test then reads line by line through the mocked open.
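The read_data point can be shown in isolation. The sketch below uses made-up column names and values, and it assumes Python 3.8 or later, where the handle returned by mock_open supports iteration in a for loop.

```python
from unittest.mock import mock_open, patch

# Hypothetical test data: the header first, then the data lines.
lines = ["x,y1,y2", "1.0,2.0,3.0", "4.0,5.0,6.0"]

# read_data must be one newline-delimited string, not a list of strings.
read_data = "\n".join(lines)

with patch("builtins.open", mock_open(read_data=read_data)):
    # No file named "ignored.csv" exists; the mock serves read_data instead.
    with open("ignored.csv") as source:
        lines_read = [line.rstrip("\n") for line in source]

# Outside the with block the real open is restored.
assert lines_read == lines
```

Patching builtins.open replaces the function for all code that runs inside the with block, so the path passed to open is never touched on disk.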
After that, my test code worked as expected, and I was able to assert that a DataRecord of type header was indeed the header line and that a DataRecord which was not of type header was a line containing actual data.
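Put together, a test along these lines might look like the following sketch. The DataRecord tuple and the read_records generator here are compact, hypothetical stand-ins for the code under test, not the versions from the repository, and iterating over the mocked file assumes Python 3.8 or later.

```python
import unittest
from collections import namedtuple
from unittest.mock import mock_open, patch

# Hypothetical stand-ins for the code under test.
DataRecord = namedtuple("DataRecord", ["is_header", "data"])


def read_records(path):
    """Yield the first line as a header record, the rest as float records."""
    with open(path) as source:
        for index, line in enumerate(source):
            fields = line.strip().split(",")
            if index == 0:
                yield DataRecord(True, fields)
            else:
                yield DataRecord(False, [float(value) for value in fields])


class ReadRecordsTest(unittest.TestCase):
    def test_header_and_data_lines(self):
        # Header line first, followed by the data lines.
        lines = ["x,y1,y2", "1.0,2.0,3.0", "4.0,5.0,6.0"]

        with patch("builtins.open", mock_open(read_data="\n".join(lines))):
            # Consume the generator while open is still patched.
            records = list(read_records("ignored.csv"))

        self.assertEqual(len(lines), len(records))
        # Only the first record may be flagged as a header.
        self.assertEqual([True, False, False],
                         [record.is_header for record in records])
        for expected, record in zip(lines, records):
            if record.is_header:
                self.assertEqual(expected.split(","), record.data)
            else:
                self.assertEqual(
                    [float(value) for value in expected.split(",")],
                    record.data)
```

Saved in a test module, this can be run with python -m unittest. Note that the generator has to be consumed inside the patch block; a generator created there but iterated later would call the real open.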
I hope this helps others to quickly and easily test their file reading functionality in their unit tests.
All code used in this article can be found in my GitHub repository.