An easy leap second, using Google’s public smear NTP service

Isn’t it nice not to waste time on an issue you did not know you had, until someone points out a solution? Well, that’s exactly what happened today. It seems that on December 31, 2016 (it’s soon, and it’s the worst possible time if you ask me) there is a leap second. Since our Earth has some unknown variables changing its rotation speed, the time has to be adapted now and again (quite at random, it seems). Leap seconds cause all sorts of fun problems, so unless you want to spend December 31 to January 1 behind a computer screen, better apply this leap-second cream to those chunky machines.

Engineers at Google came up with a nice solution: for some hours before and after the leap second, they make every second the tiniest bit longer, smearing the extra second out. The coolest part: you can use their expertise by using their servers for NTP (Network Time Protocol). Here’s how:

Install it if you did not use ntp before:

yum install ntp

At about line 20 of /etc/ntp.conf you should find:

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst

Comment out those four lines and add Google’s:

server time1.google.com iburst
server time2.google.com iburst
server time3.google.com iburst
server time4.google.com iburst

resulting in :

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server time1.google.com iburst
server time2.google.com iburst
server time3.google.com iburst
server time4.google.com iburst

Restart (or start) the ntpd service:

service ntpd restart
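If ntp was freshly installed, you probably also want the daemon to come back after a reboot; a small sketch, hedged since the exact command depends on your CentOS version:

chkconfig ntpd on        # CentOS 6 and older (SysV init)
systemctl enable ntpd    # CentOS 7 (systemd)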

Then validate that the servers have been added correctly using ntpq -p. Note that the results may vary; as long as the remotes are time[1-4].google.com you are good.

ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*time1.google.co 71.79.79.71      2 u   36   64    3   95.522   -0.522   0.190
+time2.google.co 71.79.79.71      2 u   36   64    3  106.206   -0.371   0.052
+time3.google.co 71.79.79.71      2 u   36   64    3    4.824   -0.167   0.034
 time4.google.co 71.79.79.71      2 u   36   64    3  256.651    0.917   0.049
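The asterisk marks the peer ntpd has actually selected for synchronisation. If ntpstat is available (it usually ships with or alongside the ntp package, but treat that as an assumption), a one-line sanity check is:

ntpstat    # prints whether the clock is synchronised and to which server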

Hopefully this will ensure a happy-new-year rather than a pager-duty new-year; we all know how that story ends.

Automating zfs snapshots of proxmox using Sanoid

Slowly ZFS on Linux is becoming a mainstream file system. However, it’s more than just a file system: it’s (software) RAID, it allows for snapshots, compression, deduplication, … it’s pretty cool, and I’m in love with it. One of the most important features for me is snapshots. Those are dirt cheap in ZFS, so cheap everyone should use them, because you never know when you will issue an rm -rf /* or a cryptovirus hits you. Fixing those with ZFS snapshots is a lot easier and faster. However, in order to be able to use them, you need to make them, and honestly, everyone forgets about making snapshots, so ZFS snapshots are an excellent target for automation. One could build a snapshot system from scratch, but since ZFS is so popular, there must be some systems out there. There are: zfs-auto-snapshot (excellent tutorial) is the most common, as it has been around from close to the start of the ZOL (ZFS On Linux) project. While it works great, the configuration is limited and we want to be able to configure per (sub)volume. I found Sanoid, and the rest, as they say, is history. Sanoid seems commercially linked, but it’s GNU open-source. Setting it up is easy, but the documentation is a bit lacking right now. So here goes: how I set it up. (While not specific to Proxmox, I use it on all new -ZFS- Proxmox installations.)
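To give an idea of why those cheap snapshots matter in the rm -rf / cryptovirus scenario: getting data back is a one-liner. A minimal sketch, with hypothetical dataset and snapshot names:

# roll a dataset back to its most recent snapshot (changes made after it are gone)
zfs rollback tank/data@autosnap_2016-12-01_16:00:01_hourly

# or copy a single file back out of the hidden .zfs directory, without rolling back anything
cp /tank/data/.zfs/snapshot/autosnap_2016-12-01_16:00:01_hourly/important.odt /tank/data/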

Our snapshot bro.

Dependency
Proxmox is Debian-based, so apt-get:

apt-get install libconfig-inifiles-perl git

In case you use CentOS or the like, these are the equivalent packages:

yum install perl-Config-IniFiles git

Install
Then we can go and install Sanoid. I like to just clone the repository and work from there; there might be other options, of course.

cd /opt
git clone https://github.com/jimsalterjrs/sanoid

Then create a hard link from the repo to /usr/sbin (general system-wide root-privileged scripts) or /usr/local/bin (change the paths accordingly downstream).

ln /opt/sanoid/sanoid /usr/sbin/

Then we need to create the configuration. Sanoid reads from /etc/sanoid/sanoid.conf, so that seems a good place to put it. Also put sanoid.defaults.conf there, else sanoid refuses to run.

mkdir -p /etc/sanoid
cp /opt/sanoid/sanoid.conf /etc/sanoid/sanoid.conf
cp /opt/sanoid/sanoid.defaults.conf /etc/sanoid/sanoid.defaults.conf

After that you should configure sanoid before putting it in the cron tasks. I use the following, pretty self-explanatory config:

#################### 
# sanoid.conf file # 
####################

[tank/subvol-104-disk-1] 
        use_template = production

############################# 
# templates below this line # 
#############################

[template_production] 
        # store hourly snapshots 36h 
        hourly = 36

        # store 30 days of daily snaps 
        daily = 30

        # store back 6 months of monthly 
        monthly = 6

        # store back 3 yearly snapshots (remove manually if too large)
        yearly = 3

        # create new snapshots 
        autosnap = yes

        # prune old snapshots
        autoprune = yes

Then the cron :

nano /etc/crontab
*/5 * * * * root /usr/sbin/sanoid --cron
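Before trusting the cron job, it does not hurt to run sanoid once by hand and check that snapshots show up for the dataset you configured (dataset name taken from the example config above):

/usr/sbin/sanoid --cron
zfs list -t snapshot -r tank/subvol-104-disk-1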

Results
And after 5 minutes you will see snapshots popping up. This cron runs once every 5 minutes; while once or twice an hour should be sufficient, I did not want to stray too far from the original readme in the git repo, which actually checks for snapshots every minute. After a while the result will be:

root@rocky:~# zfs list -t snapshot
NAME                                                USED  AVAIL  REFER  MOUNTPOINT
tank/fry@autosnap_2016-11-23_16:05:30_monthly       14.0M      -  4.13T  -
tank/fry@autosnap_2016-11-24_23:59:01_daily         13.4M      -  4.13T  -
tank/fry@autosnap_2016-11-25_23:59:01_daily             0      -  4.13T  -
tank/fry@autosnap_2016-11-26_23:59:01_daily             0      -  4.13T  -
tank/fry@autosnap_2016-11-27_23:59:01_daily             0      -  4.13T  -
tank/fry@autosnap_2016-11-28_23:59:01_daily         1.38G      -  4.14T  -
tank/fry@autosnap_2016-11-29_23:59:01_daily          472K      -  4.14T  -
tank/fry@autosnap_2016-11-30_23:59:01_daily             0      -  4.14T  -
tank/fry@autosnap_2016-12-01_00:00:01_monthly           0      -  4.14T  -
tank/fry@autosnap_2016-12-01_11:00:01_hourly         191K      -  4.14T  -
tank/fry@autosnap_2016-12-01_12:00:01_hourly         191K      -  4.14T  -
tank/fry@autosnap_2016-12-01_13:00:01_hourly        69.6K      -  4.14T  -
tank/fry@autosnap_2016-12-01_14:00:01_hourly          38K      -  4.14T  -
tank/fry@autosnap_2016-12-01_15:00:01_hourly         421K      -  4.14T  -
tank/fry@autosnap_2016-12-01_16:00:01_hourly         534K      -  4.14T  -
tank/leela@autosnap_2016-11-25_11:09:01_monthly     40.4K      -  4.43G  -
tank/leela@autosnap_2016-11-25_11:09:01_daily       38.2K      -  4.47G  -
tank/leela@autosnap_2016-11-25_23:59:01_daily           0      -   883G  -
tank/leela@autosnap_2016-11-26_23:59:01_daily           0      -   883G  -
tank/leela@autosnap_2016-11-27_23:59:01_daily           0      -   883G  -
tank/leela@autosnap_2016-11-28_23:59:01_daily       1.10M      -   883G  -
tank/leela@autosnap_2016-11-29_23:59:01_daily           0      -   862G  -
tank/leela@autosnap_2016-11-30_23:59:01_daily           0      -   862G  -
tank/leela@autosnap_2016-12-01_00:00:01_monthly         0      -   862G  -
tank/subvol-104-disk-1@init                         432K      -   642M  -
tank/subvol-105-disk-1@init                        25.9M      -   689M  -
tank/subvol-106-disk-1@init                            0      -  1.24G  -

Note: this result is not from this exact config; I use no yearlies and a lot fewer hourly snapshots. On top of that, I generally store the configured system with an @init snapshot and put the data on a separate volume, which is what I snapshot automatically.

There are a lot of other ZFS “auto” snapshot systems out there; a short list can be found online. Whatever you choose, be sure to try it first and be sure you are happy with the features. But most important: use snapshots. They are not backups (as they live on the same physical system), but they are so cheap there is little argument not to use them for profit. Happy snaps!

Careers at company

We at company love the stuff we make, we are great at it, see the huge numbers that our CEO presents every year at our “fun” teambuilding event, which actually is an obligatory two-day meeting. We charge our customers way too much, but hey, that’s why we can hire little peasants like you! Our entire management has a six-figure paycheck, that’s how freaking good we are doing. We most likely will fire you at the next dip we have, but at least you can use our logo on your next CV. Aren’t we great?

Now that we are done talking about how great we are, we would like to talk about an opportunity for you! By opportunity we mean working hard, having a hugely skewed work/life balance, a low paycheck and getting fired as soon as our HR manager’s second cousin finally gets his diploma.

We are looking for an M/F, experienced, senior deployment, network, server-less, system, dev-ops engineer/administrator, who is acquainted with high-throughput data analysis and big data (cause we have like a bunch of floppy disks, and really, I can’t find that file on my 8GB stick). Oh, and when we say M/F we really mean M, since F can get pregnant and damn, that’s annoying, not to say expensive. Now that we have said all those fancy words, what we really need is someone to magically fix the printer, while doing that project the other dude (we forgot his name) didn’t finish before he left company. Not that we will credit you for it, it’s just pushing some buttons, isn’t it? It’s basically ready to deploy. Are you up for the challenge?

Look at all these happy people, they sure don’t work for us!

What we expect of you :

  • No 9-to-5 mentality: we expect you to work 24h, 7/7, and longer when our CEO’s mobile does not support the latest Candy Crush. Oh, and expect pager duty. All for that 38h paycheck!
  • Stress resistant: you can finish a project with priorities shifting every day, an impossible deadline and specs changing all the time. Also, we are going to blame you for pretty much everything that bears a close resemblance to a computer. You did something with an IT-related education, right? So don’t you know why my unrelated, legacy, made-for-Windows-95 application stopped working? Be ready to spend the night, cause I need it like, right now.
  • Team player, cause we will dump the most boring project we have on you, since the last guy left to start a bakery.
  • Wanting to learn: it’s hands-on, don’t expect to leave the room until the printer is fixed.
  • Open-minded: we need someone to clean up the kitchen as well.
  • Communication: it would be great if someone understood what our outsourced Indian QC were talking about.

What you will get from us :

  • Our love, as in “why did it take so long”. Working on cutting-edge new technology, as in whatever everyone else is already doing right now.
  • A place where you can explore your full potential as a piss pot. A great international work environment, as in our housemaid is definitely not from here. We also offer you the chance to learn a deprecated, only-used-here dev stack/programming language!
  • A competitive salary, and we mean competitive, as in how low can we go.
  • A temporary contract, we don’t take the risk of burn-outs. You’re the expensive version of toilet paper to us <3

Don’t wait, send us your soon-to-be unread, trashed CV!

(can be used for free as recruitment template)

Todo list on Ransomware attack

At work we were hit with a ransomware attack; this was a new experience for me. Luckily it was a “good” one, as we had backups and we were very fast to stop the outbreak. However, finding out what to do seemed more difficult than one would expect. For starters, pulling out all the plugs is easy. Then comes the hard part: finding out what is happening, what it’s going to do once you start again, how to get rid of it, and then trying to find ways to stop it from happening again, or at least cause the least possible amount of damage. While this new hype is growing every day, finding decent up-to-date information seems not as trivial. So, for future reference, or for someone searching for quick information, here goes:

Stop the encryption
– The moment you discover that some computer in the network is encrypting files, stop it: remove the network cable, remove the power. (This computer is going to be rebuilt from scratch, so don’t worry about the work; it’s basically lost.)
– If you have a backup system in place, stop it and disconnect it from the network. This is important because you don’t want to corrupt your backups.
– Shut down any computer you think might be infected or damaged.

That was the easy part, shutting everything down. Now people will really start freaking out.

Find out how big the damage is
– Boot into a system you know is secure; read-only systems are great for this (Chrome OS, a live USB, …) and use that to do your research.
– Determine the type of ransomware and any specifics you can find. In case of doubt, this would be an ideal time to contact a specialist. (this can help) This is also the time to search for a solution or a plan of “recovery”. In our specific case the ransomware was acting from one computer, and all other computers were affected but not infected. So we booted all the machines back up (except the infected one) and kept a close watch for new malicious activity; luckily for us nothing showed any weird behavior.

Recovery
– We started putting backups back.
– Since we did not trust the infected machine, we removed the disks, reinstalled from scratch and put a backup of the data back.

All that was lost was time. Hopefully this post can be archived, never to be used again, but I fear otherwise.

a client request body is buffered to a temporary file

A little, cute, harmless nginx [WARNING], yeah, one of those. It means an uploaded request body was larger than the allowed in-memory buffer set by client_body_buffer_size, which by default is 16k (on 64-bit platforms). So for any image upload you can expect this to be a lot larger (images can easily be 3-6 MB a piece). I decided to increase the value to 1 MB, meaning most images will still fire this warning, but smaller (web-ready) images won’t. I also changed client_max_body_size, which sets a maximum upload size (an HTTP 413 error is returned when a larger upload is attempted).

client_body_buffer_size 1M;
client_max_body_size 10M;

So this warning is just a notice that a request body has been written to the file system, which is a bit slower than memory (a lot slower, actually). When you see a lot of these popping up, it might be a good idea to increase the value so you don’t wear out your hard drive / SSD, at the cost of some reserved memory (RAM).
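After editing the values, the usual test-and-reload applies; a small sketch, assuming a systemd-based box (on older init systems service nginx reload does the same):

nginx -t                  # check the configuration for syntax errors first
systemctl reload nginx    # pick up the new values without dropping connections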

screen in lxc-attach

I tried to create a screen session in a Proxmox/LXC container, but got greeted with:

# screen -S rsync_archive_docs
Cannot access '/dev/pts/3': No such file or directory

What exactly is happening I’m not sure, but it looks like I already have some sort of session without a proper pseudo-terminal, since I logged on over the host (lxc-attach -n $ID). So either log in over a new ssh session directly to the container, or, if you are playing with NAT like me, you can just run:

script /dev/null

And that creates a fresh console, and screen works again. Hooray for screen!
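Putting it all together, the whole dance from the Proxmox host looks roughly like this (the container ID and session name are just examples):

lxc-attach -n 104             # enter the container from the host
script /dev/null              # allocate a fresh pseudo-terminal
screen -S rsync_archive_docs  # screen is happy again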

cifs mounts with spaces in /etc/fstab on Centos 6

Quick hack: when adding a cifs share to /etc/fstab you can’t use quotes to escape a space; this will fail:

"//SERVER/NAME WITH SPACE/" /local_share/  cifs  username=user,password=paswd,iocharset=utf8,sec=ntlm  0  0

That quoting trick does work when mounting shares from the command line:

mount -t cifs -o user=user,pass=pasw "//SERVER/SHARE WITH SPACE/" /local/share

So for /etc/fstab you have to use \040, which unsurprisingly maps to a space in octal (ASCII 32). Well, another thing learned! Here is what should work in /etc/fstab:

//SERVER/SHARE\040WITH\040SPACES/ /local/share/  cifs  username=user,password=paswd,iocharset=utf8,sec=ntlm  0  0
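Then mount everything from fstab and check that the share actually shows up; a quick sketch, using the paths from the example line above:

mount -a                # mount all fstab entries that are not mounted yet
mount | grep cifs       # verify the cifs share is listed
df -h /local/share/     # and that it reports sensible sizes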

Happy mounting Windows shares 🙂

 

Yum error: rpmdb : failed: BDB1507 Thread died in Berkeley DB library

Wuuut, yum broke on my CentOS 7 container! For me the error looked like this:

error: rpmdb: BDB0113 Thread/process 10680/140044063065920 failed: BDB1507 Thread died in Berkeley DB library
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 -  (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:

This happened because I ran out of memory; not completely OOM, but close enough to be an issue (I had ~60 MB free). I could not run any yum command: I tried yum clean all but that failed, and rpm --rebuilddb did not work either. In the end I had to manually remove the databases and rebuild from scratch. (It sounds worse than it is.)

So first I moved the databases to a temporary location:

mv /var/lib/rpm/__db* /tmp

(I had three: __db.001 to __db.003.) Then I did a clean all, which worked, and I finished off with an update, which pulled in quite a lot of updates (whoops).

yum clean all
yum update -y
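If yum had still complained at this point, rebuilding the rpm database would presumably be the next step now that the stale Berkeley DB files were out of the way (this is the standard recovery command; I did not need it here):

rpm --rebuilddb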

After I verified that everything worked, I cleaned up my tmp:

rm -i /tmp/__db*

Extra tip: finding which processes use the most memory:

top -o %MEM

(MySQL and PHP were the biggest RAM eaters.)
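If you prefer a one-shot list over an interactive top, something like this should paint the same picture (GNU ps assumed):

ps aux --sort=-%mem | head -n 10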

Sort on part of string using ls & sort

Sometimes sorting some files seems easy, but it’s not. For example, filenames built like this:

(random_string)_(numeric_id)_(allot_of_digits)-(random_string)

Of course I would like to sort on the numeric_id, not on the random string. This is where command-line gurus are masters; I’m but a humble user, so I’d like to share my hacky solution!

ls -l | sort -t_ -k2

bam, sorted.
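A possible small refinement, assuming the numeric_id really is purely numeric: limit the sort to that one field and compare it numerically, so that 9 sorts before 10.

ls -1 | sort -t_ -k2,2n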

split large file with … split.

A small but useful hack for when you need to cut a large (huge) file into pieces to move it to tapes (2.5 TB), CDs (800 MB), … Here comes split!

definitely not this kind of split

split -b 2T full_data.tar.gz "data.tar.gz.part"

This will split full_data.tar.gz (over 6 TB) into pieces of 2 TB each; each part will be named data.tar.gz.part*

* is the suffix split appends (see split --help); by default it is two letters (aa, ab, ac, …, zz).
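If you prefer numeric suffixes over letters, GNU split can do that as well (the -d flag; coreutils assumed):

split -b 2T -d full_data.tar.gz "data.tar.gz.part"   # produces ...part00, ...part01, and so on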

Now, to join them together again, you can use cat:

cat data.tar.gz* > full_data.tar.gz
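A quick sanity check, assuming the original file is still around: both checksums should match before you trust the parts (or the rejoined file).

sha256sum full_data.tar.gz
cat data.tar.gz.part* | sha256sum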

Go forth and split those large files!