Bareos uses FileSets to decide what to back up and what to skip. The documentation is extensive, but its approach is mostly to include everything and then exclude parts (include all, exclude after). That is how the average person wants to back up. If you work around non-IT-minded people, though, you know that they will store EVERYTHING, EVERYWHERE and mix downloadable data with self-created data. The “include all, exclude after” method would therefore make backups explode in size with a lot of unnecessary data.
So my idea is to exclude everything and include only the parts I know are worth backing up (exclude all, include some). Inevitably this will produce a lot of exceptions to the rule, but with Bareos that is easier than including everything and then finding a way to keep performance up.
I want multiple small jobs that finish “quickly” on the client. With this approach I can run several backups per day on specific data without generating an extreme load on client machines. The “backup everything” jobs can then run at a lower frequency, because the frequently changed data is already backed up.
Since most of the work happens in Microsoft Office applications, I started there. We also have a lot of scientific data floating around, but that data is backed up on the result servers and doesn’t belong in the client desktop backups. This is another reason to go with exclude everything, include what you need (although nobody ever got fired for making an extra backup…).
Creating a new FileSet
Let’s start by creating a new FileSet. You can name it whatever you like, but it is best to pick a name you will recognize.
nano /etc/bareos/bareos-dir.d/fileset/win_office.conf
A “template” would look like this:
FileSet {
# required name
Name = "win_office"
# volume shadow copy service
# this is windows specific
Enable VSS = yes
# include
Include {
# include from this directory
#
File = "C:/Users"
Options {
# config
Signature = MD5
IgnoreCase = yes
noatime = yes
# Word, Excel, Powerpoint
WildFile = "*.doc"
WildFile = "*.docx"
WildFile = "*.xls"
WildFile = "*.xlsx"
WildFile = "*.ppt"
WildFile = "*.pptx"
# open office
WildFile = "*.odt"
WildFile = "*.ods"
WildFile = "*.odp"
# pdf
WildFile = "*.pdf"
}
Options {
# all files not in include
RegExFile = ".*"
Exclude = yes
}
}
}
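To get a feel for what the two Options blocks in the template do before running anything on a client, the idea can be sketched in Python. This is only a rough model and not Bareos's actual matcher: it assumes a file is kept when its full path matches one of the WildFile patterns (case-insensitively, like IgnoreCase = yes) and that everything else falls through to the catch-all exclude block. The name `would_back_up` is hypothetical.

```python
from fnmatch import fnmatch

# Patterns copied from the first Options block of the template.
INCLUDE_PATTERNS = [
    "*.doc", "*.docx", "*.xls", "*.xlsx", "*.ppt", "*.pptx",
    "*.odt", "*.ods", "*.odp", "*.pdf",
]


def would_back_up(path: str) -> bool:
    """Return True if the path matches one of the include patterns.

    Lower-casing both sides stands in for IgnoreCase = yes. A path that
    matches nothing is treated as caught by the second Options block
    (RegExFile = ".*" with Exclude = yes), so it is not backed up.
    """
    lowered = path.lower()
    return any(fnmatch(lowered, pattern.lower()) for pattern in INCLUDE_PATTERNS)


if __name__ == "__main__":
    for candidate in (
        "C:/Users/alice/Documents/Report.DOCX",
        "C:/Users/alice/Downloads/setup.exe",
    ):
        print(candidate, would_back_up(candidate))
```

The real answer always comes from Bareos itself, so treat this as a quick sanity check on a pattern list and nothing more.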
Test the FileSet
Before deploying this FileSet, you can test it on a client to see exactly what gets backed up (a dry run). The easiest way is to create a Job Definition and a Job:
nano /etc/bareos/bareos-dir.d/jobdefs/BackupWindowsOffice.conf
JobDefs {
# name (required)
Name = "BackupWindowsOffice"
# type can be backup/restore/verify
Type = Backup
# the default level Bareos will try
# can be Full / Differential (since the last full) / Incremental (since the last backup of any level)
Level = Incremental
# the default client, to be overridden in the job config
Client = bareos-fd
# the fileset we just created
FileSet = "win_office"
# the schedule
Schedule = "Nightly"
# where to store it
Storage = File
# the message reporting
Messages = Standard
# the pool where to store it
Pool = Incremental
# a lower number means a higher priority: jobs with Priority = 1 run first,
# jobs with a higher number wait further back in the queue
Priority = 10
# the bootstrap file keeps a "log" of all the backups, and gets rewritten every time a
# full backup is made, it can be used during recovery
Write Bootstrap = "/var/lib/bareos/%c.bsr"
# in case these values get overridden,
# define a suitable pool for each level
# note that the Full pool will be used at least once, because
# no full backup exists yet
Full Backup Pool = Full
Differential Backup Pool = Differential
Incremental Backup Pool = Incremental
}
Then the Job:
nano /etc/bareos/bareos-dir.d/job/fileset_test.conf
Job {
# required
Name = "svennd-office"
# the default settings
JobDefs = "BackupWindowsOffice"
# override the client here
Client = "svennd"
}
Obviously we are not going to wait until the nightly job has run; open bconsole and run:
Enter a period to cancel a command.
*estimate job=svennd-office listing
This shows which files are scheduled for backup. The FileSet I used as a template is useful, but it includes way too much. So you can change it, run a reload, and test again using estimate:
2000 OK estimate files=7,741 bytes=141,521,221
You have messages.
*reload
reloaded
*estimate job=svennd-office listing
Repeat until you have exactly what you need.
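When comparing FileSet revisions it helps to have the estimate totals as numbers. A minimal Python sketch, assuming the summary line always has the same shape as the one in the transcript above; `parse_estimate` is a hypothetical helper and not part of Bareos:

```python
import re
from typing import Tuple


def parse_estimate(line: str) -> Tuple[int, int]:
    """Extract (file count, byte count) from an estimate summary line."""
    match = re.search(r"files=([\d,]+)\s+bytes=([\d,]+)", line)
    if match is None:
        raise ValueError("not an estimate summary line: %r" % line)
    # The numbers use thousands separators, so strip the commas first.
    files, size = (int(group.replace(",", "")) for group in match.groups())
    return files, size


if __name__ == "__main__":
    files, size = parse_estimate("2000 OK estimate files=7,741 bytes=141,521,221")
    print("%d files, %.1f MiB" % (files, size / 1024 ** 2))
```

Recording these two numbers after every reload makes it easy to see whether a new exclude actually shrank the backup.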
Current Windows FileSet
I have Windows 7 and Windows 10 users to back up. Generally that does not make much difference for Bareos, but the Windows 7 users have a particularly funny directory structure:
- C:/users/username is moved to D:/users/username, but C:/users still exists …
- D:/users/username/documents is moved to D:/documents
Bareos will only issue a warning if File = "D:/Users" is not found, so you could use a single config file. I dislike ignoring warnings, though, so I made separate FileSets for Windows 7 and Windows 10.
Below you will find my current FileSets. Since I’m still playing around with these sets, they will most likely still change. Note that you don’t have to create a separate config file for every FileSet {}, so I think it’s a good idea to combine similar configs in one file.
All of these are found in:
/etc/bareos/bareos-dir.d/fileset/
win_images.conf: images on Windows 7 and 10. Images compress badly, so this FileSet does not spend time compressing them.
# windows 7 & windows 10 images
FileSet {
Name = "Win7_images"
# volume shadow copy service
Enable VSS = yes
Include {
# location
File = "D:/Users"
Options {
# config
Signature = MD5
IgnoreCase = yes
noatime = yes
# images
WildFile = "*.jpg"
WildFile = "*.gif"
WildFile = "*.tif"
WildFile = "*.png"
}
# exclude everything else
Options {
# all files not in include
RegExFile = ".*"
# default user profiles
WildDir = "[C-D]:/Users/All Users/*"
WildDir = "[C-D]:/Users/Default/*"
# explicit don't backup
WildDir = "[C-D]:/Users/*/AppData"
WildDir = "[C-D]:/Users/*/Music"
WildDir = "[C-D]:/Users/*/Videos"
WildDir = "[C-D]:/Users/*/Searches"
WildDir = "[C-D]:/Users/*/Saved Games"
WildDir = "[C-D]:/Users/*/Links"
# application specific
WildDir = "[C-D]:/Users/*/MicrosoftEdgeBackups"
WildDir = "[C-D]:/Users/*/Documents/R"
WildDir = "*.svn/*"
WildDir = "*.git/*"
WildDir = "*.metadata/*"
WildDir = "*cache*"
WildDir = "*temp*"
# share services
WildDir = "*iCloudDrive*"
WildDir = "*OneDrive*"
WildDir = "*stack*"
# windows specific
WildDir = "*RECYCLE.BIN*"
WildDir = "[C-D]:/System Volume Information"
Exclude = yes
}
}
}
FileSet {
Name = "Win10_images"
# volume shadow copy service
Enable VSS = yes
Include {
# location
File = "C:/Users"
Options {
# config
Signature = MD5
IgnoreCase = yes
noatime = yes
# images
WildFile = "*.jpg"
WildFile = "*.gif"
WildFile = "*.tif"
WildFile = "*.png"
}
# exclude everything else
Options {
# all files not in include
RegExFile = ".*"
# default user profiles
WildDir = "[C-D]:/Users/All Users/*"
WildDir = "[C-D]:/Users/Default/*"
# explicit don't backup
WildDir = "[C-D]:/Users/*/AppData"
WildDir = "[C-D]:/Users/*/Music"
WildDir = "[C-D]:/Users/*/Videos"
WildDir = "[C-D]:/Users/*/Searches"
WildDir = "[C-D]:/Users/*/Saved Games"
WildDir = "[C-D]:/Users/*/Links"
# application specific
WildDir = "[C-D]:/Users/*/MicrosoftEdgeBackups"
WildDir = "[C-D]:/Users/*/Documents/R"
WildDir = "*.svn/*"
WildDir = "*.git/*"
WildDir = "*.metadata/*"
WildDir = "*cache*"
WildDir = "*temp*"
# share services
WildDir = "*iCloudDrive*"
WildDir = "*OneDrive*"
WildDir = "*stack*"
# windows specific
WildDir = "*RECYCLE.BIN*"
WildDir = "[C-D]:/System Volume Information"
Exclude = yes
}
}
}
win_office.conf
# all office files in users (c:/ and d:/)
# for win 7 = D
# for win 10 = C
FileSet {
Name = "Win7_office"
# volume shadow copy service
Enable VSS = yes
Include {
# location
File = "D:/Users"
File = "D:/My Documents"
Options {
# config
Signature = MD5
compression = LZ4
IgnoreCase = yes
noatime = yes
# Word
WildFile = "*.doc"
WildFile = "*.dot"
WildFile = "*.docx"
WildFile = "*.docm"
# Excel
WildFile = "*.xls"
WildFile = "*.xlt"
WildFile = "*.xlsx"
WildFile = "*.xlsm"
WildFile = "*.xltx"
WildFile = "*.xltm"
# Powerpoint
WildFile = "*.ppt"
WildFile = "*.pot"
WildFile = "*.pps"
WildFile = "*.pptx"
WildFile = "*.pptm"
WildFile = "*.ppsx"
WildFile = "*.ppsm"
WildFile = "*.sldx"
# access
WildFile = "*.accdb"
WildFile = "*.mdb"
WildFile = "*.accde"
WildFile = "*.accdt"
WildFile = "*.accdr"
# publisher
WildFile = "*.pub"
# open office
WildFile = "*.odt"
WildFile = "*.ods"
WildFile = "*.odp"
# pdf
WildFile = "*.pdf"
# flat text / code
WildFile = "*.xml"
WildFile = "*.log"
WildFile = "*.rtf"
WildFile = "*.tex"
WildFile = "*.sql"
WildFile = "*.txt"
WildFile = "*.tsv"
WildFile = "*.csv"
WildFile = "*.php"
WildFile = "*.sh"
WildFile = "*.py"
WildFile = "*.r"
WildFile = "*.rProj"
WildFile = "*.js"
WildFile = "*.html"
WildFile = "*.css"
WildFile = "*.htm"
}
# exclude everything else
Options {
# all files not in include
RegExFile = ".*"
# default user profiles
WildDir = "[C-D]:/Users/All Users/*"
WildDir = "[C-D]:/Users/Default/*"
# explicit don't backup
WildDir = "[C-D]:/Users/*/AppData"
WildDir = "[C-D]:/Users/*/Music"
WildDir = "[C-D]:/Users/*/Videos"
WildDir = "[C-D]:/Users/*/Searches"
WildDir = "[C-D]:/Users/*/Saved Games"
WildDir = "[C-D]:/Users/*/Favorites"
WildDir = "[C-D]:/Users/*/Links"
# application specific
WildDir = "[C-D]:/Users/*/MicrosoftEdgeBackups"
WildDir = "[C-D]:/Users/*/Documents/R"
WildDir = "*iCloudDrive*"
WildDir = "*.svn/*"
WildDir = "*.git/*"
WildDir = "*.metadata/*"
WildDir = "*cache*"
WildDir = "*temp*"
WildDir = "*OneDrive*"
WildDir = "*RECYCLE.BIN*"
WildDir = "[C-D]:/System Volume Information"
Exclude = yes
}
}
}
FileSet {
Name = "Win10_office"
# volume shadow copy service
Enable VSS = yes
Include {
# location
File = "C:/Users"
Options {
# config
Signature = MD5
compression = LZ4
IgnoreCase = yes
noatime = yes
# Word
WildFile = "*.doc"
WildFile = "*.dot"
WildFile = "*.docx"
WildFile = "*.docm"
# Excel
WildFile = "*.xls"
WildFile = "*.xlt"
WildFile = "*.xlsx"
WildFile = "*.xlsm"
WildFile = "*.xltx"
WildFile = "*.xltm"
# Powerpoint
WildFile = "*.ppt"
WildFile = "*.pot"
WildFile = "*.pps"
WildFile = "*.pptx"
WildFile = "*.pptm"
WildFile = "*.ppsx"
WildFile = "*.ppsm"
WildFile = "*.sldx"
# access
WildFile = "*.accdb"
WildFile = "*.mdb"
WildFile = "*.accde"
WildFile = "*.accdt"
WildFile = "*.accdr"
# publisher
WildFile = "*.pub"
# open office
WildFile = "*.odt"
WildFile = "*.ods"
WildFile = "*.odp"
# pdf
WildFile = "*.pdf"
# flat text / code
WildFile = "*.xml"
WildFile = "*.log"
WildFile = "*.rtf"
WildFile = "*.tex"
WildFile = "*.sql"
WildFile = "*.txt"
WildFile = "*.tsv"
WildFile = "*.csv"
WildFile = "*.php"
WildFile = "*.sh"
WildFile = "*.py"
WildFile = "*.r"
WildFile = "*.rProj"
WildFile = "*.js"
WildFile = "*.html"
WildFile = "*.css"
WildFile = "*.htm"
}
# exclude everything else
Options {
# all files not in include
RegExFile = ".*"
# default user profiles
WildDir = "[C-D]:/Users/All Users/*"
WildDir = "[C-D]:/Users/Default/*"
# explicit don't backup
WildDir = "[C-D]:/Users/*/AppData"
WildDir = "[C-D]:/Users/*/Music"
WildDir = "[C-D]:/Users/*/Videos"
WildDir = "[C-D]:/Users/*/Searches"
WildDir = "[C-D]:/Users/*/Saved Games"
WildDir = "[C-D]:/Users/*/Favorites"
WildDir = "[C-D]:/Users/*/Links"
# application specific
WildDir = "[C-D]:/Users/*/MicrosoftEdgeBackups"
WildDir = "[C-D]:/Users/*/Documents/R"
WildDir = "*iCloudDrive*"
WildDir = "*.svn/*"
WildDir = "*.git/*"
WildDir = "*.metadata/*"
WildDir = "*cache*"
WildDir = "*temp*"
WildDir = "*OneDrive*"
WildDir = "*RECYCLE.BIN*"
WildDir = "[C-D]:/System Volume Information"
Exclude = yes
}
}
}
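The Windows 7 and Windows 10 FileSets above differ only in their Name and File lines, so the shared pattern lists could be kept in one place and the config text generated from them. The following Python sketch shows one possible way to do that; `render_fileset` and the shortened lists are hypothetical, and the output only mirrors the directives already used above.

```python
from typing import List, Optional


def render_fileset(name: str, locations: List[str], include: List[str],
                   exclude_dirs: List[str],
                   compression: Optional[str] = None) -> str:
    """Build the text of one FileSet resource in the layout used above."""
    lines = ["FileSet {", '  Name = "%s"' % name, "  Enable VSS = yes", "  Include {"]
    lines += ['    File = "%s"' % path for path in locations]
    # First Options block: the file patterns to keep.
    lines += ["    Options {", "      Signature = MD5"]
    if compression:
        lines.append("      compression = %s" % compression)
    lines += ["      IgnoreCase = yes", "      noatime = yes"]
    lines += ['      WildFile = "%s"' % pattern for pattern in include]
    lines.append("    }")
    # Second Options block: exclude everything else.
    lines += ["    Options {", '      RegExFile = ".*"']
    lines += ['      WildDir = "%s"' % pattern for pattern in exclude_dirs]
    lines += ["      Exclude = yes", "    }", "  }", "}"]
    return "\n".join(lines) + "\n"


# Shortened lists, for illustration only.
IMAGES = ["*.jpg", "*.gif", "*.tif", "*.png"]
EXCLUDED = ["[C-D]:/Users/*/AppData", "*RECYCLE.BIN*"]

win7 = render_fileset("Win7_images", ["D:/Users"], IMAGES, EXCLUDED)
win10 = render_fileset("Win10_images", ["C:/Users"], IMAGES, EXCLUDED)

if __name__ == "__main__":
    print(win7 + win10)
```

The generated text would still need to be written to a file under the fileset directory and checked with estimate, exactly like a hand-written FileSet.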