JerBouma
2/9/2019 - 7:17 PM

[Dealing with Large Datasets] Using chunksizes to handle large datasets (avoiding MemoryErrors) #pandas

import pandas as pd

chunksize = 10**5 # define a chunk size -> read 100,000 rows per chunk

# text_file_reader is a TextFileReader that yields one DataFrame per chunk
text_file_reader = pd.read_csv('FILE', header=None, chunksize=chunksize)

# Combine all chunks back into a single DataFrame
df = pd.concat(text_file_reader, ignore_index=True)
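
Note that concatenating every chunk loads the whole file into memory anyway, so this only helps if the final DataFrame itself fits in RAM. If it doesn't, a common pattern is to reduce each chunk as it is read and keep only the result. A minimal sketch of that idea (the filter on column 0 is a hypothetical example, and 'FILE' is still a placeholder path):

import pandas as pd

chunksize = 10**5
filtered_chunks = []

# process one chunk at a time so only the reduced results stay in memory
for chunk in pd.read_csv('FILE', header=None, chunksize=chunksize):
    # hypothetical per-chunk reduction: keep rows where the first column is positive
    filtered_chunks.append(chunk[chunk[0] > 0])

df = pd.concat(filtered_chunks, ignore_index=True)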