I recently revived a “old” compute cluster. While its hardware was formidable back in the day, now it has been replaced with a younger version, which has more compute nodes. I wanted to create a data partition of two disks, normally I have a raid controller but on this server, there is none.  An open invitation for a software RAID it is, so I could go with mdadm as software raid, but I did not want to read up on all those commands (again). When in fact ZFS is already in my head.


Creating a mirror or RAID 1 can be done using : zpool create $poolname mirror $first_disk $second_disk
This would make a mirror and mount it on /data, now that is the theory, in practice you most likely will be shown this error :

invalid vdev specification
use '-f' to override the following errors:
/dev/sdb does not contain an EFI label but it may contain partition information in the MBR.
/dev/sdc does not contain an EFI label but it may contain partition information in the MBR.

So be sure to add -f , it is however a good idea to double check if those are the disks you want to use. Note that the content on both disks will be destroyed during this process.

zpool create -f data mirror /dev/sdb /dev/sdc

After that you can see

[[email protected] ~]# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
data   216K  3.51T    96K  /data
[[email protected] ~]# zpool status
  pool: data
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors

One of the tuning options for pools is ashift, which can be 9 (for 512 sector drives) or 12 (for 4k sector). However, this can only be set at creation. This can be done using the option -o ashift=value so why did I not tell you ? Cause ZOL (ZFS on Linux) since a while, will try and find the correct value.  From my finding (on the internet) almost all disks these days are 4k sector drives or advanced format drives. This you can check using hdparm -i /dev/sdb (you might need to install this)

[[email protected] ~]# hdparm -I /dev/sdb
        Model Number:       WDC WD4Y0                  
        Firmware Revision:  80.00A80
        Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
        Supported: 9 8 7 6 5 
        Likely used: 9
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors: 7814037168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                  4096 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:     3815447 MBytes
        device size with M = 1000*1000:     4000787 MBytes (4000 GB)

As you see the Logical sector size, is 512 bytes, this is for backwards compatibility, but the physical sector size is 4k. So in this situation a ashift=12 would be ideal. You can verify what your ashift is, by using the zdb tool :

[[email protected] ~]# zdb | grep ashift
            ashift: 12

From what I read in the repo, it seems that 512 bytes in some cases can give you more storage if you have allot of very tiny files, compared to 4k, but that 4k is in almost all cases allot more performant. In general terms unless you really have a corner case, default ZFS will most likely guess the best option.

After creating this pool, I would recommend you read up on basic tuning, in short :

zfs set xattr=sa data
zfs set acltype=posixacl data
zfs set compression=lz4 data
zfs set atime=off data
zfs set relatime=off data

And that’s it folks !