The Need

We are currently looking for an economical way to scale beyond a single massive 200TB machine. The most logical step is a so-called distributed file system. Basically, we want to add servers to increase the total amount of usable storage while keeping the structure consistent across all the storage servers. We also want to be able to get to the data if the system, or one of its subsystems, fails. This is my personal logbook of relevant articles, presentations and the like that I found in my hunt for knowledge and experience.

GlusterFS

A software-defined network storage system. GlusterFS is now owned by Red Hat, which suggests it will stay pretty stable and will most likely work well with CentOS, which we prefer for our servers. If we want support, switching over to Red Hat would be doable. It offers multiple volume types: distributed, replicated, striped, tiered, or a combination of those.
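To get a feel for what the volume types mean for our capacity planning, here is a rough sketch that estimates usable space for a distributed volume versus a distributed-replicated one. The function name `usable_capacity_tb` and the brick sizes are my own illustration, not anything from GlusterFS itself. It assumes bricks are grouped into replica sets in the order they are listed, that each set can hold no more than its smallest brick, and it ignores filesystem overhead.

```python
def usable_capacity_tb(brick_sizes_tb, replica=1):
    """Estimate usable capacity (in TB) of a Gluster-style volume.

    replica=1 models a purely distributed volume (capacity adds up).
    replica>1 models a distributed-replicated volume: bricks are grouped
    into sets of `replica` bricks, and each set stores one copy of the
    data per brick, so it is limited by its smallest brick (assumption).
    """
    if replica < 1:
        raise ValueError("replica count must be at least 1")
    if len(brick_sizes_tb) % replica != 0:
        raise ValueError("number of bricks must be a multiple of the replica count")

    total = 0
    # Walk through the bricks one replica set at a time.
    for start in range(0, len(brick_sizes_tb), replica):
        replica_set = brick_sizes_tb[start:start + replica]
        total += min(replica_set)
    return total


# Hypothetical setup: four servers with one 50 TB brick each.
bricks = [50, 50, 50, 50]
distributed = usable_capacity_tb(bricks)            # 200: all space usable, no redundancy
replicated = usable_capacity_tb(bricks, replica=2)  # 100: half the space, survives one failure per set
```

The trade-off is the one we care about: pure distribution gives the most space per euro but loses data when a server dies, while replication halves (or worse) the usable space in exchange for being able to reach the data after a failure.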

Articles :

Setup GlusterFS on CentOS 6.7 on DigitalOcean

Resources :

Note : Red Hat documents are really high-quality reports, but they might be strongly biased.

MergerFS


Note : this is a work in progress.