
Working with big files

OutOfMemoryException. Don’t panic, your applications are fine; this is not a real exception. The name will be familiar to anyone who has done I/O operations on heavy files, and most of you will have wondered: “if I have 8GB of RAM and I’m working with a 500MB file, how am I out of memory?” Well, to know why we get this kind of exception, we first have to understand how RAM works.

Let’s take a moment to think about an empty room of 25m². We start filling it with furniture: a couch (3m²) in front of the TV table (2m²) and next to a floor lamp (0.5m²), then a coffee table (2m²) and a comfortable reclining chair to one side (1.5m²)… At this point we still have 16m² of free space in our room, but what happens if we want to put a bed in it? Well, probably, despite all that free space, the current placement of the rest of the furniture won’t allow it. This is an elementary example of how RAM works.

While our server is running, blocks of memory are filled and emptied, and holes appear between the occupied blocks. Our free memory is made up of all these holes, and it is in them that our application will try to allocate the 500MB file we are working with. The problem is that, most of the time, we won’t have a contiguous hole of 500MB, so fitting that file is like fitting a bed in our living room. This concept is known as fragmentation.

[Figure: fragmentation diagram]

There is another problem when working with big files. Even if we can find a hole to fit our 500MB file, what happens if our application has to work with 2 files at the same time? And with 100? We are not going to find 100 holes for 500MB files, that’s for sure. We run into this problem when building a web application used by many users at the same time who need to upload or download big files.

The solution is simple: use a buffer. Instead of storing the whole file in RAM, we temporarily store just a small part of it, a part that fits easily into our free holes, which makes the allocation far less likely to fail and takes up less memory. If we create a 1MB buffer per file and we are working with 100 files at the same time, the buffers will take only 100MB of RAM, and our application can easily find 100 holes of 1MB.

Enough chit-chat, let’s get to work. This piece of code is written in C#, but it can be ported to any other language; the important thing here is the concept. You can download a complete example application, to debug or whatever, from my GitHub. In this example we are reading a file and writing it to disk, but you can modify it to perform any other operation simply by changing line 16.
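As a minimal sketch of the buffer idea (not the complete example from the repository), the loop below copies a file through a single 1MB buffer, so memory use stays around 1MB no matter how big the file is. The names `BufferedCopy` and `CopyInChunks` are made up for illustration. The line numbering of the full example does not carry over to this sketch; here the step to swap out for another operation is the `destination.Write` call, marked with a comment.

```csharp
using System;
using System.IO;

public static class BufferedCopy
{
    // 1 MB buffer, the size used as an example in the text (an assumption, tune as needed).
    private const int BufferSize = 1024 * 1024;

    // Copies sourcePath to destinationPath one chunk at a time
    // and returns the total number of bytes copied.
    public static long CopyInChunks(string sourcePath, string destinationPath)
    {
        byte[] buffer = new byte[BufferSize];
        long totalBytes = 0;

        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (FileStream destination = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
        {
            int bytesRead;
            // Read returns 0 only when the end of the file has been reached.
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first bytesRead bytes of the buffer are valid.
                // Replace this call to perform any other per-chunk operation.
                destination.Write(buffer, 0, bytesRead);
                totalBytes += bytesRead;
            }
        }

        return totalBytes;
    }
}
```

Calling it would look like `long copied = BufferedCopy.CopyInChunks("input.bin", "output.bin");`, where both paths are placeholders. If a plain copy is all you need, .NET's built-in `Stream.CopyTo(destination, bufferSize)` does the same job; the explicit loop is worth writing when you want to do something else with each chunk.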

 
