NFS on multi-device Btrfs vs. GlusterFS on a distributed volume?



I am considering a storage system for email.
It will run on my own private cloud, which is already replicated, so I do not care about replication.
I'm thinking about two options:
1- I will create a few "disks" (volumes on the cloud) and create a multi-device Btrfs filesystem across them. When the filesystem fills up, I'll create more "disks" and add them to the Btrfs filesystem with:
btrfs device add /dev/vdX /mnt

btrfs filesystem balance /mnt

This mount point (/mnt) will be exported over NFS, and my Dovecot server will mount this export and store emails on it.
2- I will create a few "disks" (volumes on the cloud) and create a GlusterFS distributed volume across them. When the filesystem fills up, I'll create more "disks", add them to the GlusterFS distributed volume, and rebalance it.
My Dovecot server will mount this volume using glusterfs-client and store emails on it.
(To repeat: I don't need replication, because my "disks", being volumes on the private cloud, are already replicated under the hood.)
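For comparison with the Btrfs commands in option 1, the expansion step in option 2 might look roughly like the sketch below. The volume name `mailvol` and the brick path `gluster1:/bricks/disk3/data` are placeholders, not names from my setup, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: VOL and NEW_BRICK are hypothetical placeholders.
# Commands are printed by default; set RUN=1 to execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

VOL="${VOL:-mailvol}"
NEW_BRICK="${NEW_BRICK:-gluster1:/bricks/disk3/data}"

expand_volume() {
    # Add the new "disk" (mounted as a brick) to the distributed volume.
    run gluster volume add-brick "$VOL" "$NEW_BRICK"
    # Spread existing files over the enlarged set of bricks.
    run gluster volume rebalance "$VOL" start
    # Rebalancing runs in the background; poll its progress.
    run gluster volume rebalance "$VOL" status
}

expand_volume
```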
Which option do you think has better:

performance? (a very large number of small read/write I/Os)


Answer 1:

You have to consider the I/O pattern of a mail server: reading and writing many small files as fast as possible. Both of your variants are really unsuitable for this when dealing with a large number of clients, IMHO.

Neither FS is fast enough, and I guess the locking overhead of GlusterFS in particular will be significant. Then you add another layer with NFS, which has its own overhead. Instead of this, I would try to connect the mail store with as little overhead as possible and with a fast file system. Usually this means connecting as directly to the physical storage as possible, but since you hide your architecture behind bullshit bingo terms like “private cloud”, we don’t know what would be possible.

One approach you could try would be to export the storage via iSCSI to the mail server and then use a FS that’s fast with many small files and maybe, if it’s really important, use LVM to be able to easily add space to that FS in the form of additional iSCSI volumes (which adds back some overhead).
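As a rough sketch of that approach, growing the mail store by one additional iSCSI volume could look like the following. The portal address, IQN, device name and the `mailvg`/`maillv` names are all made-up placeholders, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: portal, IQN, device and VG/LV names are hypothetical.
# Commands are printed by default; set RUN=1 to execute them as root.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

PORTAL="${PORTAL:-192.0.2.10}"
IQN="${IQN:-iqn.2024-01.example:mail2}"
NEW_DEV="${NEW_DEV:-/dev/sdX}"
VG="${VG:-mailvg}"
LV="${LV:-maillv}"

grow_mail_store() {
    # Discover and log in to the new iSCSI target on the mail server.
    run iscsiadm -m discovery -t sendtargets -p "$PORTAL"
    run iscsiadm -m node -T "$IQN" -p "$PORTAL" --login
    # Turn the new block device into an LVM physical volume and add it to the VG.
    run pvcreate "$NEW_DEV"
    run vgextend "$VG" "$NEW_DEV"
    # Grow the logical volume into the new space; -r also resizes the file system.
    run lvextend -r -l +100%FREE "/dev/$VG/$LV"
}

grow_mail_store
```

The file system itself stays local to the mail server here, so there is no network file system locking in the data path, only block I/O over iSCSI.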

Whatever you try though: You have to benchmark the different variants and see if you get the required performance out of it.
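A very crude starting point for such a benchmark is sketched below: it writes, reads back and deletes a number of small files under a target directory and reports the elapsed wall-clock seconds. TARGET, COUNT and SIZE are illustrative defaults, not recommendations. A single sequential writer like this says little about many concurrent clients, so treat it as a smoke test before reaching for a proper benchmark tool.

```shell
#!/bin/sh
# Sketch only: TARGET, COUNT and SIZE are assumptions; point TARGET at the
# mount under test (NFS export, GlusterFS mount, iSCSI-backed FS, ...).
TARGET="${TARGET:-$(mktemp -d)}"
COUNT="${COUNT:-1000}"
SIZE="${SIZE:-4096}"

bench_small_files() {
    # $1 = directory, $2 = number of files, $3 = bytes per file
    dir="$1/bench.$$"
    mkdir -p "$dir" || return 1
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$2" ]; do
        head -c "$3" /dev/zero > "$dir/msg.$i" || return 1
        i=$((i + 1))
    done
    sync                                       # flush dirty pages to the backing store
    find "$dir" -type f -exec cat {} + > /dev/null
    written=$(find "$dir" -type f | wc -l)
    rm -rf "$dir"
    end=$(date +%s)
    echo "files=$((written)) seconds=$((end - start))"
}

bench_small_files "$TARGET" "$COUNT" "$SIZE"
```

Running it with a much larger COUNT against each candidate mount, from the machine that will run Dovecot, gives a first comparison under otherwise identical conditions.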

Answer 2:

If you have to choose one of the two above, then NFS is preferable, I think.

GlusterFS loses all its benefits as a distributed filesystem in your setup, since OpenStack
volumes are still mounted from a central storage. It is neither more stable nor
smarter, since it has to take care of distributed file locking, while NFS locking
is done on a single server.

I am not sure that combining your storage from multiple devices is a good idea. Alternatively,
you may consider skipping the high-level OpenStack volume service functionality and exposing your storage directly: a formatted LVM (or ZFS/SAN) volume exported by NFS. Going this way, you will eliminate the unnecessary iSCSI layer and will be able to increase the mail storage space on demand, as long as the main storage has enough free space.
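A minimal sketch of that direct NFS export, assuming the formatted volume is already mounted on the storage host: the export path `/srv/mail`, the client address `10.0.0.20`, the host name `storage1` and the mount point `/var/vmail` are all placeholders. The script prints the `/etc/exports` line and the commands instead of running them unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: EXPORT_DIR, CLIENT, storage1 and /var/vmail are hypothetical.
# Commands are printed by default; set RUN=1 to execute them as root.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

EXPORT_DIR="${EXPORT_DIR:-/srv/mail}"
CLIENT="${CLIENT:-10.0.0.20}"

exports_line() {
    # Line to append to /etc/exports on the storage host.
    printf '%s %s(rw,sync,no_subtree_check)\n' "$EXPORT_DIR" "$CLIENT"
}

# On the storage host: show the export entry, then reload the export table.
exports_line
run exportfs -ra

# On the Dovecot server: mount the export.
run mount -t nfs "storage1:$EXPORT_DIR" /var/vmail
```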