Avoid Memory Error: Techniques to Reduce DataFrame Memory Usage

Image-1: Memory Error in Jupyter Notebook

Let's follow the STAR (Situation, Task, Action, Result) approach to walk through this article.

SITUATION

I was working on a machine learning project on a system with 4 GB of RAM. The project involved memory-intensive computation, and the dataset was large enough to hang my system.

TASK

Can I still complete a project that requires memory-intensive computation with 4 GB of RAM?

How can I avoid a Memory Error?

How can I reduce the memory used by the variables in my program (taking a DataFrame object as the variable for now)?

ACTION

Image-2: Example of reducing memory usage

I have written a Jupyter notebook that shows techniques to reduce DataFrame size, by as much as 98% in some cases. For a detailed explanation of the different memory reduction scenarios and the complete code, please refer to the Jupyter notebook.

Here, I am pasting just the 4 most important lines of code for your reference, i.e. 4 techniques to reduce DataFrame size:

  • Change the int datatype
## Action: conversion of dtype from "int32" to "uint8" (safe only if every value is within 0 to 255)
import numpy as np
converted_df_age = df_age.astype(np.uint8)
  • Change the float datatype
## Action: conversion of dtype from "float64" to "float16" (float16 keeps only about 3 significant decimal digits)
converted_df_query_doc = df_query_doc.astype('float16')
  • Change from object to category datatype
## Action: conversion of dtype from "object" to "category" (pays off when the column has few distinct values)
converted_df_day_of_week = df_day_of_week.astype('category')
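  • Convert to a sparse DataFrame
## Action: store the DataFrame with a sparse dtype
## (DataFrame.to_sparse() was removed in pandas 1.0; this assumes numpy as np, pandas as pd, and float data that is mostly NaN)
df_sparse = df_dense.astype(pd.SparseDtype("float", np.nan))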
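As a minimal sketch of the two numeric techniques in the list above: the frames below are small made-up stand-ins for the notebook's `df_age` and `df_query_doc`, and the column names, values, and the `fits_in` helper are assumptions for illustration, not taken from the notebook. The point is to check the value range before downcasting integers and to look at the rounding error before settling on `float16`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's df_age: ages stored as int32.
df_age = pd.DataFrame({"age": np.array([23, 35, 61, 18, 47], dtype=np.int32)})


def fits_in(series: pd.Series, dtype) -> bool:
    """Return True if every value of an integer series fits in the target integer dtype."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)


before = df_age.memory_usage(deep=True).sum()

# Downcast only when the range check passes; out-of-range values would not be preserved.
if fits_in(df_age["age"], np.uint8):
    converted_df_age = df_age.astype(np.uint8)
else:
    converted_df_age = df_age

after = converted_df_age.memory_usage(deep=True).sum()
print(f"int32 -> uint8: {before} -> {after} bytes")

# Alternative: let pandas pick the smallest unsigned dtype that holds the data.
auto_age = pd.to_numeric(df_age["age"], downcast="unsigned")
print("dtype chosen by pd.to_numeric:", auto_age.dtype)

# Hypothetical stand-in for df_query_doc: float64 scores.
df_query_doc = pd.DataFrame({"score": [0.125, 3.14159, 1000.5]})
converted_df_query_doc = df_query_doc.astype("float16")

# float16 is lossy, so measure the largest rounding error introduced by the cast.
round_trip = converted_df_query_doc["score"].astype("float64")
max_abs_error = (df_query_doc["score"] - round_trip).abs().max()
print("largest float16 rounding error:", max_abs_error)
```

If the rounding error is too large for the task at hand, `float32` is a more conservative choice that still halves the footprint of a `float64` column.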
Jupyter Notebook code: Drop Memory Usage
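The category and sparse techniques from the list above can be sketched with a before/after memory comparison. This is an illustration under assumptions, not the notebook's code: the data is made up, the column names and the `mem_bytes` helper are hypothetical, and the sparse column is assumed to be mostly zeros (so the fill value is `0.0`).

```python
import numpy as np
import pandas as pd


def mem_bytes(df: pd.DataFrame) -> int:
    """Total DataFrame memory in bytes, counting the contents of object columns."""
    return int(df.memory_usage(deep=True).sum())


# Hypothetical stand-in for df_day_of_week: 7 distinct strings repeated many times.
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
df_day_of_week = pd.DataFrame(
    {"day_of_week": pd.Series(days * 1000, dtype="object")}
)
converted_df_day_of_week = df_day_of_week.astype("category")

# Hypothetical stand-in for df_dense: 10,000 floats, of which only 100 are non-zero.
rng = np.random.default_rng(0)
values = np.zeros(10_000)
values[rng.choice(10_000, size=100, replace=False)] = 1.0
df_dense = pd.DataFrame({"x": values})

# Only values that differ from fill_value are stored explicitly.
df_sparse = df_dense.astype(pd.SparseDtype("float64", fill_value=0.0))

for name, original, converted in [
    ("object -> category", df_day_of_week, converted_df_day_of_week),
    ("dense -> sparse", df_dense, df_sparse),
]:
    b, a = mem_bytes(original), mem_bytes(converted)
    print(f"{name}: {b} -> {a} bytes ({100 * (1 - a / b):.1f}% smaller)")

# Fraction of values stored explicitly in the sparse column.
print("sparse density:", df_sparse["x"].sparse.density)
```

Both conversions depend on the shape of the data: a column with mostly unique strings gains little from `category`, and a column with few repeated fill values gains little from a sparse dtype, so it is worth measuring before and after.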

RESULT

Hurray!! You have learned how to reduce DataFrame size in different scenarios, assuming you have gone through the Jupyter notebook completely :).

Please clap if this article helps you, and share it with your friends as well.

Happy Learning!!

References

  1. https://nbviewer.jupyter.org/github/aakashgoel12/Play-DataStructure-Python-Data-Engineering/blob/master/top_4_memory_usage_drop_tricks.ipynb

Senior Data Scientist @ Fractal Analytics
Aakash Goel