etckeeper setup on CentOS 7

15 December, 2015

etckeeper is a package that keeps track of changes in the /etc folder, which is where configuration is supposed to live. You get a track record of configuration changes (a must-have), especially when you run yum-cron nightly. It's also a great way to document why things were changed.

Like most cool tools, once it works you totally forget how you got it working. So here I share how I set it up and how I plan to use it. I picked git, as it is the default backend, and git is hip these days.

On the server you want to keep under revision control:

# like most things in life, the package is in EPEL
yum install epel-release

# I choose you, git!
yum install etckeeper git

# go to /etc
cd /etc

# let's init
etckeeper init

# first commit
etckeeper commit "init our configuration server"


While strictly speaking not necessary, I like to have my configuration saved somewhere else: when git FUBARs or the server won't boot, at least we can see how the configuration was (or was not) changed.

On our target server (I create a passwordless login; other methods are surely available):

# I want the configuration on a remote server (central in my case)
# note : security-wise this might not be 100%

# create a key (accept the defaults)
ssh-keygen -t rsa

# copy the key
ssh-copy-id -i ~/.ssh/id_rsa.pub etckeeper@sysadmin

# or alternatively
cat ~/.ssh/id_rsa.pub | ssh etckeeper@sysadmin 'cat >> /home/etckeeper/.ssh/authorized_keys'
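The remote side of that pipe simply appends one line to authorized_keys. A local stand-in (the key string below is fake, just for illustration) shows the append, plus the strict permissions sshd expects on that file:

```shell
# local stand-in for the remote append; the key line is a made-up example
key="ssh-rsa AAAAB3Nza...fake etckeeper@client"
dir=$(mktemp -d)
printf '%s\n' "$key" >> "$dir/authorized_keys"
chmod 600 "$dir/authorized_keys"   # sshd refuses overly permissive key files
grep -c '^ssh-rsa' "$dir/authorized_keys"   # → 1
```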

On the remote server:

# add a user and set a password for first-time login
adduser etckeeper
passwd etckeeper

# create a bare git repository as user etckeeper
su etckeeper
git init --bare /opt/etckeeper/public.git

And finally, on the target server, add the remote (adapt as needed):

git remote add origin etckeeper@sysadmin:/opt/etckeeper/public.git
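A toy version of this bare-repo-plus-remote setup (everything in a temp dir, with made-up names standing in for the real paths and the etckeeper@sysadmin remote) behaves like:

```shell
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare central.git          # stands in for /opt/etckeeper/public.git
mkdir work && cd work
git init -q .
git config user.email etc@example.com && git config user.name etckeeper
echo "nameserver" > resolv.conf
git add resolv.conf && git commit -qm "init our configuration server"
git remote add origin ../central.git    # on a real setup: etckeeper@sysadmin:/opt/etckeeper/public.git
git push -q origin HEAD
git --git-dir=../central.git log --all --pretty=oneline
```

The commit is now stored in the bare repository, which is exactly what the weekly etckeeper push gives you: an off-machine copy of /etc history.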

and change the configuration:

nano +43 /etc/etckeeper/etckeeper.conf

Around that line, set PUSH_REMOTE="origin" so etckeeper pushes every commit to the remote we just added (the exact line number may differ between versions).

Manually record changes

Changing something in /etc? A good idea to tell your colleagues why (or the future you).

etckeeper commit "I added this IP to /etc/hosts because I'm too lazy to type an IP."

Auto changes to /etc

The defaults will catch those! yum and yum-cron are caught by a plugin. I am not sure about plain rpm, but etckeeper will auto-commit any changes it finds.

What changed ?

Since we use git, most git commands work (git status, git log). So it's as easy as cd /etc && git log, or the short form cd /etc && git log --pretty=oneline.

Pulling back changes 

I have not yet pulled back from the repo, but this should work :

etckeeper vcs checkout [HASH]

if you only need one file :

etckeeper vcs checkout [HASH] [FILE]
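The vcs subcommand passes its arguments through to the underlying VCS, so restoring one file is plain git checkout under the hood. A toy repo (temp dir, made-up file names) shows the behaviour:

```shell
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.email etc@example.com && git config user.name etckeeper
echo "v1" > hosts && git add hosts && git commit -qm "first"
echo "v2" > hosts && git commit -aqm "second"
git checkout -q HEAD~1 -- hosts   # pull the file back from the previous commit
cat hosts   # → v1
```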


[:error] [pid 8725] [client :51515] PHP Warning:  date(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in on line 29, referer:

PHP being a bit silly, is it not? Fix this crazy error (CentOS 7):

nano +878 /etc/php.ini

and change

;date.timezone =

to

date.timezone = Europe/Brussels

A list of possible time zones is in the PHP documentation.

Restart (reload might also work) httpd to make it active:

service httpd restart

Be gone from my log file, evil error!

While most people (including me) would think that yum reinstall kernel kernel-headers kernel-firmware would be enough, it's not! You need to select the kernel you want to reinstall by removing it and then reinstalling it (since it's the only package that keeps multiple versions installed, I guess).

For me this was :

# remove the evil
yum remove kernel-2.6.32-573.8.1.el6.x86_64 kernel-devel-2.6.32-573.8.1.el6.x86_64

# reinstall it again
# note : in my case I was okay with the "latest and newest" kernel
yum install kernel kernel-devel kernel-firmware

Happy kernel reinstalling! (don't think that's a thing)

zfs: disagrees about version

9 December, 2015

ZFSonLinux (ZOL) is a great project that creates a Linux kernel port of the ZFS filesystem. However, when the kernel updates, it regularly causes problems with the ZFS kernel module 🙁 I have not found a stable solution, only a very dirty "Windows-like method". I will share it as a future reference for my colleagues and, primarily, myself.

Failed to load ZFS module stack.

In essence, when a new kernel is installed it will "weak"-link the ZFS modules; for some reason ZFS doesn't like that and gets partially updated. Neither the new nor the old kernel will then be able to load the ZFS module. For people now in full panic mode (like myself every time this happens): your data is not lost.

# find the version of spl and zfs
dkms status

# remove both of them
dkms remove -m zfs -v 0.6.3 --all
dkms remove -m spl -v 0.6.3 --all

# install the headers for the new kernel
# ubuntu/debian
apt-get install linux-headers-$(uname -r)

# centos
yum install kernel-headers

# reinstall zfs
yum reinstall zfs

#add & build them again
dkms add -m spl -v 0.6.3
dkms add -m zfs -v 0.6.3
dkms install -m spl -v 0.6.3
dkms install -m zfs -v 0.6.3

# try loading it :
modprobe zfs

# if you can load zfs again now, you can skip the reboot
# I couldn't, so I had to reboot my machine. (I know, it's crazy)

# find the data
zpool import

# my poolname is tank
zpool import tank

And that is how I saved myself! (for now.)

Some notes :

  • Reinstalling doesn't always work; sometimes you just need to remove zfs (yum remove zfs), and after that it's a good idea to clean up dkms manually. The commands below are floating around on the web; they come down to removing the modules from /lib/modules/$(kernel_version)/extra/. I removed them from all the kernels, as I only wanted to use the newest kernel anyway.
    find /lib/modules/$(uname -r)/extra -name "splat.ko" -or -name "zcommon.ko" -or -name "zpios.ko" -or -name "spl.ko" -or -name "zavl.ko" -or -name "zfs.ko" -or -name "znvpair.ko" -or -name "zunicode.ko" | xargs rm -f
    find /lib/modules/$(uname -r)/weak-updates -name "splat.ko" -or -name "zcommon.ko" -or -name "zpios.ko" -or -name "spl.ko" -or -name "zavl.ko" -or -name "zfs.ko" -or -name "znvpair.ko" -or -name "zunicode.ko" | xargs rm -f
  • Update 11/01/2016
    • The same problem happened today: a CentOS 6.X server crashed due to a raid controller blocking. This forced a reboot, which for some reason booted into the not-latest-installed kernel, so zfs was installed against a newer kernel and weak-linked to the "older" one. Rebooting into the right kernel is, in this case, the thing you should try first.
  • Update  14/12/2016
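The find commands in the first note match modules purely by name; a toy run in a scratch directory (not /lib/modules, and with keep.ko standing in for an unrelated module) shows that only the listed .ko files are removed:

```shell
d=$(mktemp -d)
touch "$d/zfs.ko" "$d/spl.ko" "$d/keep.ko"   # keep.ko is an unrelated module
find "$d" -name "spl.ko" -or -name "zfs.ko" | xargs rm -f
ls "$d"   # → keep.ko
```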

Let's Encrypt, with CentOS 6.

5 December, 2015

Note : There are alternative ways of getting Let's Encrypt to work in non-default environments; one is described in my newer article: Let's Encrypt on … any Linux distro

Let's encrypt the web: an easy, automated and free way to get https for your website. I already explained how you could install letsencrypt on CentOS 6.7, but things on the interwebz go fast. So fast, in fact, that that tutorial is already deprecated. Since *beta* support has been added for Python 2.6, CentOS 6.X should now work out of the box. Spoiler: it doesn't yet (hence the beta label by Let's Encrypt). This guide should help you get https in a not-yet-fully-supported environment such as CentOS 6. As you might have noticed, this site is now running on https! (not because it's really necessary, but it is cool, isn't it? :P)

My start point

  • I have a CentOS 6.X configured and yum-cron updating nightly
  • I have apache (httpd) running multiple domains, more or less configured
  • I have 0 experience with SSL setup in httpd (trust me, I have never done this before, successfully)

Let’s Encrypt the web, this is where my https story began!

Getting the certificate

The first part is easy; the docs help out a lot, and of course we all read them, right after the terms of service. Right guys/gals?

# copy the software
cd /opt
git clone
cd letsencrypt

Now the next step would be to start the tool and let it guide you. The problem is that this requires binding to port 80, which is obviously in use by apache (httpd). So that won't work. Also, if you run this with Python 2.6 (CentOS 6.X) you will get a warning, and it won't do anything unless you tell it to run in --debug mode.

There is however an alternative plugin included, which uses the webroot of the domain (in Apache terms: DocumentRoot). Now, Let's Encrypt does not hand out wildcard certificates, which means you do not get *.domain.ext validated; instead you can get domain.ext, www.domain.ext, blog.domain.ext, … Just remember that you have to request them all at the same time as you request the certificate; if you repeat the process, the earlier names won't work. Since we aim to automate, I like to use as few command line arguments as possible, so I made a config file.

create /etc/letsencrypt/cli.ini

# the default is 2048 (more is better)
rsa-key-size = 4096

# plugin
authenticator = webroot

# webroot
webroot-path = /var/www/svennd/

# domains
domains = domain.ext, www.domain.ext

# flags
# renew is good for automation

Note : change the domain names to your domain name(s).

Now we can run the tool :

/opt/letsencrypt/letsencrypt-auto --config /etc/letsencrypt/cli.ini --debug certonly

Since I am on an unsupported system I need the --debug flag. If everything goes as planned you should be congratulated as follows:

Congratulations! Your certificate and chain have been saved at
  /etc/letsencrypt/live/ Your cert will
  expire on 2016-03-04. To obtain a new version of the certificate in
  the future, simply run Let's Encrypt again.
- If you like Let's Encrypt, please consider supporting our work by:
 Donating to ISRG / Let's Encrypt:
 Donating to EFF:

Possible errors
Since it already took me some time to get here, know that these errors are also rather common:


The following 'urn:acme:error:connection' errors were reported by
the server:

This means it has no access to your server in general; the best starting point is to check firewall or connection settings. The server should be publicly accessible during the webroot challenge.


FailedChallenges: Failed authorization procedure. (http-01): urn:acme:error:unauthorized :: The client lacks sufficient authorization :: Invalid response from []: 404

I banged my head on this one. I received this error when I moved my website and configuration from http to https, which made the location unreachable. But you would also get it if your webroot differs from the usual and you just copy-pasted the config. The webroot is the directory users get their "index.php/html/asp/…" page from. For a lot of users that's somewhere like /var/www/public_html/my_domain/. If you are not sure, it's DocumentRoot in the Apache configuration. Another way to check is to create a file "test.html" and browse to domain.ext/test.html; a 404 means it's not the right directory (you expect an empty white page). Be sure that yourdomain.ext/.well-known/* is accessible! Thx to Luis for pointing this out.


Error: urn:acme:error:rateLimited :: There were too many requests of a given type :: Error creating new cert :: Too many certificates already issued for:

This happens when you have played too much with them 😀 The solution is simple and hard; it's called: wait it out. As long as the beta lasts they rate-limit rather strongly; I believe not too many people will see this after the initial beta period.

error code 1 in cryptography

Command "/root/.local/share/letsencrypt/bin/python2.7 -c "import setuptools, tokenize;__file__='/tmp/pip-build-cAuqmP/cryptography/';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-rhCaoe-record/install-record.txt --single-version-externally-managed --compile --install-headers /root/.local/share/letsencrypt/include/site/python2.7/cryptography" failed with error code 1 in /tmp/pip-build-cAuqmP/cryptography

This happened due to limited resources during crypto key generation. The solution was to free up more memory; stopping the memory hog helped. (Though one should never go straight to a production server without testing SSL first.)

Errno 22

OSError: [Errno 22] Invalid argument: ‘/etc/letsencrypt/live/cert.pem’ letsencrypt

It only happened during a server move, see the post.

Activate the SSL in Apache

Now I assume you somehow got to the point where you got congratulated and the certificate was created. This means you have four new files in /etc/letsencrypt/live/<your domain>/: cert.pem, chain.pem, fullchain.pem and privkey.pem.

I have Apache 2.2.15 (yum info httpd) and by default it won't listen on port 443. So we need to add this:

In /etc/httpd/conf/httpd.conf find Listen 80 and add

Listen 443

After that you can adapt your virtual host website configuration; I work with VirtualHost *:80. My config looked like this:

<VirtualHost *:80>
        # server setup
        DocumentRoot /var/www/svennd
        <Directory "/var/www/svennd">
                AllowOverride All
                Order allow,deny
                Allow from all
        </Directory>
</VirtualHost>

I wanted to have both http and https running first (you want to check that everything works over https before anything else!), and only after that permanently redirect all traffic to https. To do that, pretty much copy the virtualhost 80 block to a virtualhost 443 block. (full example, change to your domain!)

LoadModule ssl_module modules/

<VirtualHost *:443>
 # server setup
 DocumentRoot /var/www/svennd

 # ssl setup
 SSLEngine on
 SSLProtocol all -SSLv2 -SSLv3
 SSLHonorCipherOrder On
 SSLCertificateFile /etc/letsencrypt/live/domain.ext/cert.pem
 SSLCertificateKeyFile /etc/letsencrypt/live/domain.ext/privkey.pem
 SSLCertificateChainFile /etc/letsencrypt/live/domain.ext/chain.pem

 <Directory "/var/www/svennd">
 AllowOverride All
 Order allow,deny
 Allow from all
 </Directory>
</VirtualHost>

<VirtualHost *:80>
 # server setup
 DocumentRoot /var/www/svennd
 <Directory "/var/www/svennd">
 AllowOverride All
 Order allow,deny
 Allow from all
 </Directory>
</VirtualHost>

I also added the LoadModule line so httpd loads the ssl module; on a default installation, however, the ssl module is not even installed! Fix that with yum install mod_ssl. After that, remove /etc/httpd/conf.d/ssl.conf or comment it out, since it would conflict with the virtual host above.

Now you have to restart your httpd service; before doing so, test that the config is right: service httpd configtest

You expect: Syntax OK. If that's the case, restart your webserver:

service httpd restart

Now both http and https should be available. If not, first check whether your firewall allows connections on port 443. For me it did not; I filter on INPUT rules, so I only had to add it there:

# add it
iptables -I INPUT -p tcp --dport 443 -m state --state NEW,ESTABLISHED -j ACCEPT

# save it
service iptables save

Then my WordPress answered on both https and http. The next part is probably only relevant for WP owners, so feel free to skip it.

Getting WordPress to play nice with Lets-encrypt ssl

Adapting WordPress itself is rather easy: in wp-admin -> Settings -> General, change the WP address and site address to https://domain.ext. After that, I noticed most of my images were broken because they were embedded using http:// (note: you would get mixed content warnings; I had already adapted my .htaccess). You can fix that with a MySQL query (source):

UPDATE wp_posts 
SET post_content = ( Replace (post_content, 'src="http://', 'src="//') )
WHERE  Instr(post_content, 'jpeg') > 0 
        OR Instr(post_content, 'jpg') > 0 
        OR Instr(post_content, 'gif') > 0 
        OR Instr(post_content, 'png') > 0;

That’s all !

All http requests redirected to https, except for .well-known for renewal

I had to allow .well-known to be served over http, otherwise we can't renew the certificate. This is my .htaccess (from WP); an online .htaccess tester is a useful tool for checking files like this.

RewriteEngine On

# it's an http page request
RewriteCond %{HTTPS} off

# it's not .well-known
RewriteCond %{REQUEST_URI} !\.well-known

# perm redirect to https version
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI}  [R,L]

RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

note : change R to R=301 once you have tested this configuration (that makes the redirect permanent).

Now add a cron; I added this as /etc/cron.weekly/certificate :

/opt/letsencrypt/letsencrypt-auto --config /etc/letsencrypt/cli.ini --debug certonly
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t letsencrypt "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
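The logging in this cron hinges on capturing the renewal command's exit status right after it runs. A self-contained sketch of that pattern, with a made-up run_renewal function standing in for letsencrypt-auto and echo standing in for logger:

```shell
run_renewal() { return 3; }   # hypothetical stand-in for a failing letsencrypt-auto
run_renewal && EXITVALUE=0 || EXITVALUE=$?
if [ "$EXITVALUE" != 0 ]; then
    echo "ALERT exited abnormally with [$EXITVALUE]"   # the cron uses /usr/bin/logger here
fi
# → ALERT exited abnormally with [3]
```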

This attempts to renew your SSL certificate every week, which leaves enough time to notice when something is not running as expected. And should you miss it, don't worry: Let's Encrypt has your email address for just that case. You get a nice e-mail warning you:


Your certificate (or certificates) for the names listed below will expire in 13 days (on 2016-02-03 12:18:00 +0000 UTC). Please make sure to renew your certificate before then, or visitors to your website will encounter errors.


For any questions or support, please visit Unfortunately, we can't provide support by email.

The Let's Encrypt Team

After you have moved over, dump your url in ssllabs to see your SSL rating; some tweaks might be needed to get to A+, but I believe it's definitely worth it! (this site is now A+)

Encrypted a tiny part of the web !

Green lock! Already feel a lot safer!

This seems like something to print out and put in a frame!


This is a simple manual to install LibreNMS, a network monitoring system, on a clean CentOS 7.1. I'm a huge fan of this project, so I think it's important to expand the information on it. Recently I have tried to update the CentOS/RHEL docs; I'm not sure if the changes will be accepted, as I'm rather new to this project, and while I want to give back, I'm not that good at writing neutral documentation 🙂 Below is a quick start; I left some options out, because you want to be monitoring sooner rather than later. For more support you can always look in the LibreNMS docs.


# installation epel repo
yum install epel-release

# install dependencies
yum install php php-cli php-gd php-mysql php-snmp php-pear php-curl httpd net-snmp graphviz graphviz-php mariadb jwhois nmap mtr rrdtool MySQL-python net-snmp-utils php-mcrypt fping git mariadb-server

# some extra stuff
pear install Net_IPv4-1.3.4
pear install Net_IPv6-1.2.2b2

After installing the packages needed for LibreNMS, it's time to set up the database.

# set to start at boot
chkconfig mariadb on

# start the service
service mariadb start

# configure MariaDB

# go into the mysql console
mysql -u root -p

# in the console :
# create the database

# give permissions to a new user librenms
GRANT ALL PRIVILEGES ON librenms.*
TO 'librenms'@'localhost'
IDENTIFIED BY 'password';

# flush the permissions

Installing the software is actually just cloning the master git repo and keeping it up to date; the latter is done with a cron.

# move to /opt directory
cd /opt

# clone the source
git clone librenms

# move to work directory
cd /opt/librenms

# copy the original configuration to work config
cp config.php.default config.php

# fill in the config.php
# most important are the database (db_*), community
# also add : $config['fping'] = "/usr/sbin/fping"; 
# don't ask me why it's not there to begin with ...
nano config.php

# while we are in the right directory let's create some needed
# directories and execute some SQL data
php build-base.php

# add a new user
# change name, pass and email
php adduser.php name pass 10 email

# add a new system user
# and add it to the apache group
useradd librenms -d /opt/librenms -M -r
usermod -a -G librenms apache

# create directories
mkdir /opt/librenms/rrd 
mkdir /opt/librenms/logs

# chmod them
chmod 775 /opt/librenms/rrd

# change owner
chown -R librenms:librenms /opt/librenms
chown apache:apache /opt/librenms/logs

After that, I suggest we create the cron; good thing it's already in the git repo.

# copy it from source to cron
cp /opt/librenms/librenms.nonroot.cron /etc/cron.d

# make it readable
chmod 0644 /etc/cron.d/librenms.nonroot.cron

# I set the ownership to root
# though the execution of the crons will not be root,
# hence the name nonroot (which is recommended)
chown root.root /etc/cron.d/librenms.nonroot.cron

After that we are ready to set up the web interface. Here I use httpd (Apache), but other servers are also possible (nginx, lighttpd, …).

# make sure Apache is running and started
# it normally already is
chkconfig --levels 235 httpd on

# just in case it's not yet
service httpd start

# create a new file
touch /etc/httpd/conf.d/librenms.conf

# then add 
# note : change the ServerName to your own !
<VirtualHost *:80>
  ServerName librenms.domain.ext
  DocumentRoot /opt/librenms/html/
  CustomLog /opt/librenms/logs/access_log combined
  ErrorLog /opt/librenms/logs/error_log
  AllowEncodedSlashes On
  <Directory "/opt/librenms/html/">
    AllowOverride All
    Options FollowSymLinks MultiViews
    Require all granted
  </Directory>
</VirtualHost>

# remove or comment this file :
rm /etc/httpd/conf.d/welcome.conf
# and finally restart httpd
service httpd restart

After that you probably want to add more snmpd hosts than just localhost. I used snmp-scan.php for that; you need to set the IP ranges it may scan in config.php via $config['nets'][]. It is also possible to just add hosts manually in the web interface.

Now give it 10-15 minutes to start collecting data and you are ready to log in to the web interface! (as set up in the Apache step)

One of our servers had problems with the nfs service (network file system); considering it was a storage server mainly used for nfs, that's a problem. We were monitoring it before and after, and strangely, for a Linux machine, nothing really popped up. Last week the same server hit an out-of-memory state that pretty much locked up the kernel, and with it the server. A reboot "fixed" it, but I assumed a new kernel had been pushed and was now being a pain in the ass.

Like any production machine it could not be rebooted during the day, until, after some discussion, we noticed that the jobs that had been running for a few weeks were in fact not doing anything. I checked the updates, and to my surprise this machine had not been updated for a long while. By default I install yum-cron (unattended-upgrades for debian) on any system I set up, but this one was before my time. There were around 1 GB of updates ready, and that's a lot that can go wrong. Considering the OOM, we quickly decided that installing the latest kernel would be the best thing to do, as clearly this OOM was fixed in a later version. Since some tools depend on that kernel, we decided to take all the updates.

During the updates, however, the machine locked up: active ssh connections became unresponsive, the monitor on the machine got no signal, and worst of all, the USB keyboard & mouse did not even get power.

Clearly the machine was not functioning and we had to reset it. On reboot we were welcomed with this error:

Server fails to boot with lvm error : "Invalid argument for --available: ay Error during parsing of command line."

That exact error gives back exactly one result on google: a Red Hat article (which I wrongly assumed was behind a paywall; in fact it's just behind a login/registration wall). There is not really an article behind it, though. The only thing it gives is the "Environment" the error can occur in:

  • initscripts versions greater or equal to 9.03.32-1 and below 9.03.38-1
  • lvm versions below lvm2-2.02.97-2.el6

Normally yum protects you from this kind of problem, so by breaking the yum transaction (server reset) I assume one of those two ended up on the bad version and gave me the trouble. I used a rescue USB stick and checked the ext4 filesystems (root and home) for errors; both were OK. Then I rebooted the machine, since installing a new version of lvm, lvm-libs and initscripts looked very difficult from inside a debian-based rescue distro. The root partition, however, remained read-only. There is this tiny hack to fix that:

mount -o remount,rw /lvm_root /

Which will remount the root partition in read/write mode. Most people would now be able to finish yum with:

yum-complete-transaction
Too bad for me, that failed. The advice given then is to check for dupes with package-cleanup --dupes (since duplicates were one of the problems). However, cleaning the dupes automatically can, in my case at least, be dangerous: it wanted to remove glibc and 2.1 GB of system applications. That's really bad news, so don't let it do that!

The thing that got me back on track was yum-complete-transaction --cleanup-only; this removed the constant error telling me to finish the transaction. After that I wanted to remove the duplicates. You cannot use yum for this, because yum would remove dependencies:

rpm --nodeps --noscripts -e package

If you, like me, had around 1 GB of updates, this is an easy hack:

# Store duplicate-list in a file.
package-cleanup --dupes > duplicates.txt

# Remove odd lines.
awk 'NR%2==0' duplicates.txt > single-duplicates.txt

# Replace new lines with spaces.
awk '{ printf "%s ", $0 }' single-duplicates.txt > duplicates.txt

# Copy-paste content of duplicates after command.
rpm --nodeps --noscripts -e [paste here]
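On a toy duplicates list (package names are made up, and this assumes --dupes lists the version you want gone on the even lines), the two awk steps behave like this:

```shell
printf 'glibc-2.12-1.132\nglibc-2.12-1.166\nbash-4.1.2-15\nbash-4.1.2-33\n' > duplicates.txt
awk 'NR%2==0' duplicates.txt > single-duplicates.txt   # keep every second line
awk '{ printf "%s ", $0 }' single-duplicates.txt       # join with spaces
# prints: glibc-2.12-1.166 bash-4.1.2-33
```

Check which half of each pair is the version you actually want removed before pasting the result into rpm.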

source; this also works on CentOS 6.X. After that I reinstalled a lot of the packages. I tried to recall the problematic ones (initscripts, lvm, glibc, kernel, kernel-headers, …) and made sure those were done.

Later I also found yum check all; that did not help much as most was already fixed, but it is still worth a look!

After yum and rpm were both happy (updated), I rebooted the machine, and until now it's holding up like a champ. Also, to hell with the last sysadmin that got me into this mess! Install yum-cron if you are too lazy to update regularly!


crond: (root) BAD FILE MODE

27 November, 2015

cron is not working, and you see this error in /var/log/cron-* or /var/log/cron, or in some configurations /var/log/messages. It's panic time!

Just kidding, bad file mode is just a permissions problem. You fix it with:

chmod 0644 /etc/cron.d/your.cron

Also common is WRONG FILE OWNER; again, rather straightforward:

chown root.root /etc/cron.d/your.cron

Note that root needs to be the owner, even though the cron itself can run as a non-root user.
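A quick way to confirm both fixes took effect is stat; shown here on a scratch file, but the same command works on any /etc/cron.d entry:

```shell
f=$(mktemp)
chmod 0644 "$f"
stat -c '%a %U' "$f"   # prints the octal mode and owning user, e.g. "644 root"
```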



A great find today that I'd like to share: LibreNMS! It's a truly open-source fork of observium. I was a fan of observium, but I dislike the fact that they took "our" kickstarter money and now force you to pay for the latest codebase. The "community" version is only updated twice a year. Clearly SNMP is a bit of a black box and every vendor changes stuff whenever they feel like it (I hear it has something to do with the moon/sun-cycle), so regular updates are needed; a lot of functions are simply not available in the community edition.

I don't mind these guys making money; I just don't find it useful to pay for a license for something that is only considered 'good practice' where I work. (I only maintain ~20 servers, and downtime is not something we get killed for, just beaten up a bit.)

Comes to save my day : LibreNMS.

I really wanted to add a few lines of code to observium, but since it was closed source I could not, and so I wanted to build my own simple system. Good thing some heroes out there saved me that extra headache!

Happy network monitoring!

On a virtual server, date was responding with the correct time/date; however, the logs were filling up with a wrong time. I normally change the timezone like this (adapt as needed):

ln -sf /usr/share/zoneinfo/Europe/Brussels /etc/localtime
[root@main ~]# date
Sat Nov 14 21:55:22 CET 2015
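You can preview what a zoneinfo entry yields without touching /etc/localtime by setting TZ for a single command (GNU date; the fixed timestamp pins a December 2015 date, so Brussels prints its winter abbreviation CET rather than summer-time CEST):

```shell
TZ=Europe/Brussels date -d @1450000000 +%Z   # → CET
```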

Also change /etc/sysconfig/clock accordingly (adapt as needed):

ZONE="Europe/Brussels"
Even when those are done, the logs kept showing timestamps from a long time ago; the cause was rsyslog running behind. That never happened to me before, so:

service rsyslog restart

This was probably also causing my fail2ban to be too easy on spammers! Questions, remarks? Just shoot below!


Update : it seems /etc/sysconfig/clock is not necessarily the problem; it's rsyslog that keeps the old date. Just restarting rsyslog seems to be enough!