Avoid Memory Errors: Techniques to Reduce DataFrame Memory Usage

Image-1: Memory Error in Jupyter Notebook

Let's follow the STAR (Situation, Task, Action, Result) approach to structure this article.

SITUATION

I was working on a machine learning project on a system with 4 GB of RAM. The project required memory-intensive computation, and the dataset was large enough to hang my system.

TASK

Can I still complete a project that requires memory-intensive computation with only 4 GB of RAM?

How can I avoid a memory error?

How can I reduce the memory used by the variables in my program? (For now, let's take the variable to be a DataFrame object.)

ACTION

I have written a Jupyter notebook that shows techniques to reduce DataFrame size, by as much as 98% in some cases. For a detailed explanation of the different memory reduction scenarios and the complete code, please refer to the Jupyter notebook linked in the References.

For quick reference, here are the 4 key techniques to reduce DataFrame size:

  • Change int columns to a smaller int datatype
  • Change float columns to a smaller float datatype
  • Change object columns to the category datatype
  • Convert to a sparse DataFrame
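
The four techniques above can be sketched in a few lines of pandas. This is a minimal illustration, not the code from the notebook: the column names (count, price, city, flag), the synthetic data, and the memory_mb helper are all assumptions made up for the example. The sparse step uses pd.SparseDtype, since the older SparseDataFrame class was removed in pandas 1.0. How much memory you actually save depends on your data.

```python
import numpy as np
import pandas as pd


def memory_mb(frame):
    """Hypothetical helper: deep memory footprint of a DataFrame in MB."""
    return frame.memory_usage(deep=True).sum() / 1024 ** 2


# Synthetic data, invented purely for illustration.
rng = np.random.default_rng(0)
n_rows = 100_000
df = pd.DataFrame({
    # Small integers stored as int64.
    "count": rng.integers(0, 100, size=n_rows, dtype=np.int64),
    # Floats stored as float64.
    "price": rng.random(n_rows) * 1000.0,
    # Few distinct strings stored as Python objects.
    "city": pd.Series(
        rng.choice(["Delhi", "Mumbai", "Pune"], size=n_rows), dtype=object
    ),
    # Mostly zeros, so a good candidate for sparse storage.
    "flag": (rng.random(n_rows) < 0.01).astype(np.int64),
})

small = df.copy()

# 1. int: let pandas pick the smallest integer type that fits the values.
small["count"] = pd.to_numeric(small["count"], downcast="integer")

# 2. float: float64 -> float32 (halves the size, but loses some precision).
small["price"] = small["price"].astype(np.float32)

# 3. object -> category: each distinct string is stored once, rows hold codes.
small["city"] = small["city"].astype("category")

# 4. sparse: only the values that differ from fill_value are stored.
small["flag"] = small["flag"].astype(pd.SparseDtype(np.int64, fill_value=0))

before, after = memory_mb(df), memory_mb(small)
print(f"before: {before:.2f} MB, after: {after:.2f} MB")
print(f"reduction: {100 * (1 - after / before):.1f}%")
```

Before downcasting, check that the smaller type can hold the column's range and precision. The category and sparse conversions pay off only when a column has few distinct values or is dominated by a single fill value, respectively.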

RESULT

Hurray! You have learned how to reduce DataFrame size in different scenarios, assuming you have gone through the Jupyter notebook completely :).

Please clap if this article helped you, and share it with your friends as well.

Happy Learning!

References

  1. https://nbviewer.jupyter.org/github/aakashgoel12/Play-DataStructure-Python-Data-Engineering/blob/master/top_4_memory_usage_drop_tricks.ipynb
