VCLI: a universal Veeam CLI

https://github.com/shapedthought/vcli

VCLI is a tool I have been developing for a while that allows you to run API commands against almost all of the Veeam products.

The supported APIs are:

  • VBR
  • VB365
  • VB Azure
  • VB AWS
  • VB GCP
  • VONE

It is written in Go, so it ships as a small, portable binary with all dependencies included. Python is a great language and one I use often, but dependency management can be a pain.

But why?

I wanted a way to get data from the APIs with little setup and without having to write a specific script each time.

I know that a lot of people are very familiar with various tools to get data from APIs so this tool may not be for you. It’s not meant to be a replacement for any current tools, just another tool for the kit bag.

Currently the tool only allows GET requests, but I am working on adding POST and PUT. I doubt that I will be adding DELETE, as I don’t see any good coming from that.

Nushell

VCLI is pretty simple and only dumps out the data to either JSON or YAML. If you want to do more with the tool I suggest looking at https://www.nushell.sh/.

Nushell is a new approach to a shell that focuses on manipulating structured data, which works brilliantly for data coming back from APIs.
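For example, if you save the VCLI output to a JSON file, Nushell can open it as structured data and filter it (a quick sketch; the file and column names here are only illustrative):

open jobs.json | where isDisabled == false | select name type | sort-by name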

I suggest you try out both VCLI and Nushell for yourself, let me know if you like them, and suggest any improvements.

M365 Reports, the Graph way

In this post I want to discuss Microsoft Graph; in my personal experience I haven’t seen many people use Graph for aspects such as reporting.

I believe the main reason for this is that using it within an application, or with most languages, can be quite a chore.

The reason for this is the requirement to set up an application in Azure AD, deal with secrets and so on. If you are new to scripting or programming it can be quite daunting.

Enter the PowerShell SDK

The PowerShell SDK provides all the power of the Graph API, but with the convenience of PowerShell. The one major advantage over all other options is that you do not need to manually set up the application within Azure AD.

That said, it does require a bit of setup and some knowledge of where to find the commands.

Most of the installation information can be found here:

https://docs.microsoft.com/en-us/powershell/microsoftgraph/installation?view=graph-powershell-1.0

The main steps are:

  1. Allow PowerShell to install the SDK module
  2. Install the SDK
  3. Log into Graph
  4. Start running commands

First allow PowerShell to install the Module

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Next install the SDK

Install-Module Microsoft.Graph -Scope CurrentUser

Next we need to log into Graph, and in doing so we have to set the scope of the commands that we want to use.

Knowing which scopes to use takes a bit of practice, but the documentation does show you how to discover the scopes you need to run a command.

The easiest way I’ve found is to run Get-Help on the command with the -Detailed flag.

Get-Help Get-MgUser -Detailed

The examples tend to show you what scopes you need. The other option is:

Find-MgGraphCommand -Command 'Get-MgUser'

The output lists the permissions for the command. However, not all of them are needed; for my use, “User.Read.All” tends to be enough to get user details.
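Once you know the scopes, connecting is a single command; for example, with the scope mentioned above:

Connect-MgGraph -Scopes "User.Read.All"

Get-MgUser -Top 10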

Authorisation

When you first connect to Graph it will prompt you to log in and grant admin consent; this is required to give PowerShell the scopes that you have requested. It does not give the application administrative rights.

This is similar to installing something on your computer: you need to be an admin to install the app, but not to operate it.

Getting Reports

The main reason for me to use Graph is to get all the reports from M365, which is really easy once you know where to look.

https://learn.microsoft.com/en-us/graph/api/resources/report?view=graph-rest-1.0

There are reports for all aspects of M365, ranging from Mail through OneDrive and SharePoint, and there are dozens of commands available, so it is well worth exploring!
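As an example of pulling one of them (the cmdlet name here comes from the Microsoft.Graph.Reports module and report scopes such as “Reports.Read.All” are needed; if in doubt, Find-MgGraphCommand will confirm the name), the 7-day active user detail report can be saved straight to CSV:

Connect-MgGraph -Scopes "Reports.Read.All"

Get-MgReportOffice365ActiveUserDetail -Period D7 -OutFile .\active_users.csv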

One thing I found is that in some of the reports the data is obfuscated, and MS does this on purpose. If you want to be able to see full details such as the “user principal name” in those reports, you need to go into the M365 admin portal and do the following:

Settings > Org Settings > Reports > uncheck “Display concealed user, group and site names in all reports”.

Once you have done that you’ll be able to see full details in the reports.

Keep Coding 🦀

Awesome log search with cmder, nushell and ripgrep

In this article I wanted to go through how to parse log files like a boss using three tools.

  • cmder
  • nushell
  • ripgrep

Cmder is a console emulator that is far nicer than either the PowerShell console or Windows Terminal; it is available from https://cmder.net/

Note that you need to manually add cmder to the environment’s PATH.

Nushell can run within cmder: simply install it and type “nu”, then run something like “ls” to see how the output has changed.

You can also perform operations on the data like sorting and listing, as well as a host of others. It is also great for parsing API/JSON data.

You can install Nushell in various ways, but the easiest is using Chocolatey.

choco install nushell

Ripgrep is a command-line tool that is a grep alternative for all platforms and provides blazing-fast search for patterns in files or folders.

https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md

It can be easily installed using a few different methods:

https://github.com/BurntSushi/ripgrep/blob/master/README.md#installation

Combining all the tools to search through logs.

First fire up nushell, then go to the logs folder.

nu

cd C:\ProgramData\Veeam\Backup

Then list the contents, but only the directories. Note that the --reverse flag is needed because sort-by normally sorts ascending, and we want the most recently modified first.

ls | where type == Dir | sort-by modified --reverse | first 10

Select the folder you want with the standard cd command, then list the directories by modified date.

Next we can use ripgrep to quickly look through a log file for errors, outputting to a table format.

rg error -C 5 some-log-file.log | table

The -C flag is “context”, which shows 5 lines above and below the matched term. There are a ton of other flags and features, including recursive search through all the files in a directory by simply omitting the file name:

rg error -C 5 

You can also check how many occurrences of a word are in each file using the lowercase -c flag.

rg -c error
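Since the -c output is plain file:count text, Nushell can turn it into a sortable table (a small sketch; the column names are my own):

rg -c error | lines | split column ":" file count | into int count | sort-by count --reverse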

You can also easily output the data to a file in various formats: csv, html, json and yaml.

rg error -C 5 some-log-file.log | save csv output.csv

Both Nushell and ripgrep are written in Rust, which is blazingly fast and a language to watch in the future.

Keep coding 🦀

Veeam Easy Connect – Python Module

For the last few months I’ve been mulling over the idea of creating a module that makes it a lot easier to work with the Veeam APIs in Python.

The result of this is a new module called Veeam Easy Connect or vec for short.

https://github.com/shapedthought/veeam-easy-connect

The module, as the name suggests, makes it easy to connect to the API and then start making requests. For example, to log into the VBR server you would do this:

from veeam_easy_connect import VeeamEasyConnect

vec = VeeamEasyConnect("username", "password")

vec.vbr().login("api_address")

So you first import the module, then create an instance with the username and password. Finally you call the API type method, chained with the login method and the API address.

I should mention that vec is available on PyPI, which means you can install the module very easily using:

pip install veeam-easy-connect

The module can be used in a bunch of different ways, so it has several methods to get at various data. For example, to get the request headers (including the access token) and save them in JSON format you can do the following:

header = vec.get_request_header()
# save to variable

vec.save_headers("file_name")
# export to file_name.json

There are other methods for:

  • Getting and saving the access token
  • Getting and setting the API version
  • Getting and setting the API port

There are also some request methods included, which make it easier to send requests without having to pull out or construct the authorisation header.

data = vec.get("https://192.168.0.123:9491/api/v1/objectRestorePoints")

However, even this is a lot of effort, having to remember the address, port and API version, so if you pass “False” as the second argument you can just pass the very end part of the URL (after the v1 in this case).

data = vec.get("objectRestorePoints", False)

There are also PUT and POST request methods which require a data value to be passed as well.

data = {"performActiveFull": False}
job_id_url = "jobs/{id}/start"
resp = vec.post(job_id_url, data=data, full=False)

“full” here indicates whether you are passing the full URL, just using the keyword-argument syntax this time.

Currently the module uses “requests” under the covers, so it is synchronous and therefore blocking. However, you could combine it with aiohttp by getting the headers synchronously and then running multiple requests against an API asynchronously. Though I have to admit that when it comes to concurrency, Golang is probably your better bet (a module for that is also on the cards).
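As a rough sketch of that combination (my own example, not part of the module; the address reuses the earlier example and the endpoint names are only illustrative):

import asyncio
import aiohttp
from veeam_easy_connect import VeeamEasyConnect

async def fetch(session, url, headers):
    # reuse the header that vec built synchronously
    async with session.get(url, headers=headers, ssl=False) as resp:
        return await resp.json()

async def main():
    vec = VeeamEasyConnect("username", "password")
    vec.vbr().login("192.168.0.123")
    headers = vec.get_request_header()
    base = "https://192.168.0.123:9491/api/v1/"
    endpoints = ["jobs", "backupObjects"]  # illustrative endpoint names
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, base + e, headers) for e in endpoints]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())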

The module also includes login with MFA, and I’m working on SSO. I’ve hit some problems with the SSO piece, but it is still on the todo list.

Thanks for reading, and keep coding! 🤓

CKA Exam tips

It has been a while since I posted, I think everyone says that after a long break!

I was motivated to write this post as I know several people that are looking to take their CKA soon.

I’m not going to be providing exact information that is on the exam as I believe that is against the rules. However, I will provide some hints on the resources I used and the areas you need to focus on.

Get the best learning resources 🥇

I watched a bunch of different courses while getting ready for the exam, but by far the best resource is Mumshad Mannambeth’s course on Udemy, “Prepare for the Certified Kubernetes Administrators Certification…”, or you can get the course on KodeKloud. Either way you get access to the labs, which are very good.

Not only does he go over the concepts in detail, he also focuses on helpful tips that will help you pass the exam. Honestly, just going by the K8s documentation online is NOT enough.

Note that when you sign up for the CKA you get access to two free goes at killer.sh, which is designed to be harder than the CKA. I recommend that you give these a go sooner rather than later as they are a great learning tool. But don’t be too disheartened if you don’t complete them in the allotted 120 minutes; they are tough!

Need for speed 🚀

The next thing you need to know is that the 2-hour exam goes past very quickly; your fingers will likely not stop moving the whole time.

Because of this the use of imperative commands is a MUST; there is certainly no time to create YAML files, especially for the easier questions like updating an image or scaling the number of pods in a deployment.

One resource that I found to help with this is:

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

One helpful thing I found, which I believe is a fairly new addition, is the imperative command to create Ingress rules.

https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-ingress-em-

Remember that you are allowed to access the K8s website during the exam, and this includes the above resources. I’ll come back to this point later.

Note that the areas where you do have to create YAML files are: PVs, PVCs, StorageClasses and NetworkPolicies. I recommend spending a lot more time with these, especially NetworkPolicies, as the YAML syntax was the hardest for me.

I found this resource, which helped me play around with different scenarios:

https://github.com/ahmetb/kubernetes-network-policy-recipes
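To give a flavour of the syntax (a minimal sketch along the lines of those recipes; the labels are hypothetical), this policy only allows pods labelled role=frontend to reach the selected pods on port 80:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 80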

You should also study and keep a bookmark of the “cheat sheet”

https://kubernetes.io/docs/reference/kubectl/cheatsheet/

JSON path and Custom Columns

Following on from the above, for speed, familiarity with JSON path and Custom Columns will save you a lot of time when you need to get specific data.

k get nodes -o jsonpath="{.items[*].spec.podCIDR}"

Fortunately Mumshad provides a free course via KodeKloud on this subject:

https://kodekloud.com/courses/json-path-quiz/

There are also useful links on the K8s website:

https://kubernetes.io/docs/reference/kubectl/jsonpath/

When practicing, I recommend piping the output to jq, which makes it easier to navigate the JSON data.
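For example, when working out the path you need (jq may not be available in the exam itself, but it helps while practicing):

k get nodes -o json | jq '.items[].spec.podCIDR'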

Custom columns are a little easier on the syntax:

k get nodes -o custom-columns=NAME:.spec.podCIDR

https://kubernetes.io/docs/reference/kubectl/overview/#custom-columns

Create a bookmark library 📖

As stated before, you have access to the k8s website as a resource (though not the k8s forum). However, it’s not the most organised site, and searching for resources can take time, which you don’t have.

It is important to have a well organised list of references on your bookmarks bar to allow you to quickly access what you need.

I suggest organising it something like this:

  • Cheat Sheet
  • Installation
  • Update K8s
  • ETCD
    • Backup
    • Restore
  • Storage
    • PV
    • PVC
    • Storage Classes

Understanding K8s file locations 🔍

Knowing key file locations is a must; you need to know where static pod manifests are located, as well as certificates. For example:

/etc/kubernetes/manifests/
/etc/kubernetes/pki/

Understanding kubeadm 🤓

It is important to know how the K8s services run in both a kubeadm-deployed environment and one that is installed directly on the hosts (the hard way).

It is also important to know how to access the logs of the services depending on how they have been deployed.

For example:

k logs etcd-k8scontrol -n kube-system | less
journalctl -u kubelet.service -l

You also need to know how to check whether a bare-metal service is running, and how to restart it:

systemctl start my-service

Note that you can list all running services by simply running systemctl.

Modify VIM 🔧

The default settings of VIM can be unhelpful when writing or updating YAML. I update the .vimrc file with the following.

vim ~/.vimrc
set tabstop=2
set shiftwidth=2
set expandtab

Save then reload.

source ~/.vimrc

Upgrade your VIM skills

I watched a few YouTube videos on VIM to help me:

  • Moving around VIM faster, e.g.
    • shift + [ to skip paragraphs
    • W to move forward by word
  • Selecting lines of text to delete and copy/paste (v to enter visual mode, d to cut, p to paste)
  • Find and replace (sed-like commands)

I found upgrading these skills made me more effective and faster when I had to update or create a YAML file.

It is also worth looking into the ‘sed’ command itself, as it comes in handy.

ETCD Backup and Restore

These are two of the most time-consuming tasks, so it is well worth practicing them as much as possible. I can practically recite all the commands by heart now.

These are mainly time consuming because you need to grab the certificate locations for the etcd pod. There are a couple of ways to do this, either via kubectl:

k get pods etcd-k8scontroller -n kube-system -o yaml | grep -i .crt
k get pods etcd-k8scontroller -n kube-system -o yaml | grep -i .key

Or by doing a cat on the manifest file:

sudo cat /etc/kubernetes/manifests/etcd.yaml | grep -i .crt
sudo cat /etc/kubernetes/manifests/etcd.yaml | grep -i .key

If you can memorize the locations, great, but I found it tough; plus there’s no guarantee they are in the usual location (pki).
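With those certificate paths in hand, the backup itself is one long command (shown here with the default kubeadm pki paths):

ETCDCTL_API=3 etcdctl snapshot save /opt/snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key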

I found that ETCD restores are trickier. This is mainly because they are hard to try out without breaking your cluster. One trick I found useful was to do a standard VM-level backup (with Veeam of course), and then simply restore if you mess it up. Note that you cannot use a Kind-based multi-node cluster to test this, as bringing down the API server means you lose access to the Kind cluster (as I found out).

The restore documentation can be found here, but I found it a little sparse on details.

https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster

The documentation only recommends moving the kube-apiserver manifest out of the manifests location, but from the other resources I’ve read, moving everything is actually better as you need to restart all the system pods anyway.

mv /etc/kubernetes/manifests/*.yaml .

Here we are moving the files to the current directory, which will shut down all the static system pods.

Next, run the ETCD restore as per the documentation.
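For reference, the restore command itself looks like this, restoring into the alternative data directory used in the example below:

ETCDCTL_API=3 etcdctl snapshot restore /opt/snapshot.db --data-dir /var/lib/etcd_restore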

It states in the documentation that you need to update the ETCD address, but in all the times I’ve done this it hasn’t been a requirement.

What isn’t stated is that if you have restored the backup to a different location, like /var/lib/etcd_restore, then the etcd manifest file also needs to be updated to point at it:

  - hostPath:
      path: /var/lib/etcd_restore
      type: DirectoryOrCreate
    name: etcd-data

Then move all the manifest files back to the manifests directory.

mv *.yaml /etc/kubernetes/manifests/

Obviously be careful with this command in case you have any other yaml files in the same directory.

To delete or not to delete 🤔

When you need to update a Deployment or StatefulSet you have to first dump out the YAML, as most fields don’t allow in-place editing.

k get deploy my_deploy -o yaml > my_deploy.yaml

The problem with this is that it dumps out a lot of additional information, like “creationTimestamp” and a whole lot under “annotations”.

I have found that this additional information doesn’t cause any issues when redeploying a Deployment or StatefulSet, so I don’t recommend spending time cleaning it up.

Switching Namespaces Context

Constantly having to type the -n flag can get tiresome, so you can change the namespace context to the one you are working in via:

k config set-context --current --namespace=some-namespace

WARNING: remember to switch the namespace context back to “default” before moving on or you may deploy something in the wrong place!

Pick Your Targets 🎯

In the CKA you are free to move back and forward through the questions, and each has a different weight. I recommend having a look through them all before jumping in and tackling the higher-value questions first. That way, if you do run out of time, what you miss is more likely to be the easier, lower-value questions.

Note: make sure that there aren’t any prerequisite steps from a previous question.

It’s pure K8s 😇

Most of the exam is pure K8s without any third-party integration; this means that things like StorageClasses can only relate to hostPath or local variants.

The same applies to CNI providers and Ingress controllers, so I don’t believe knowing how to install one or the other is a requirement (you might want to check this one!).

I think this is due to the fact that these aren’t directly controlled by K8s and they don’t provide installation instructions on the K8s website. But you do need to know how to work with them.

Other areas

Other areas that you could study:

  • Sidecar containers with communication via emptyDir (see the sketch after this list)
  • Basic shell scripting in relation to the CMD directive in the container spec
    • Use the imperative command to add these quickly
    • Remember to add the --command flag so they are added under command, and not args
k run test --image=busybox --command -- sleep 1000
  • Exec into a specific container in a multi-container pod (-c)
  • Running a specific command in a temporary container to do something
k run rest --image=busybox --rm -it --restart=Never -- nslookup some-service
  • Study up on security contexts, and remember that capabilities only work at the container level
  • Practice, practice, practice Network Policies
  • Have a look over the .kube/config file, and get familiar with “kubectl config” options
cat ~/.kube/config
  • Using openssl to decode .crt files
openssl x509 -in something.crt -text -noout
  • Get familiar with the ports that each of the management components uses.
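Here is the sidecar sketch mentioned above (my own minimal example; the image, paths and names are arbitrary): two containers share an emptyDir volume, with one writing a log and the other tailing it.

apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /data/app.log; sleep 5; done"]
      volumeMounts:
        - name: shared-logs
          mountPath: /data
    - name: reader
      image: busybox
      command: ["sh", "-c", "sleep 10 && tail -f /data/app.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /data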

That’s all for now, thanks for reading!

K8s demo cluster using the Rook CSI

For the last few days I have been trying out deploying K8s with CSI-based storage so I can test Kasten K10.

I found this article:

Rook, Ceph, CSI, Kubernetes, and K10: An All-in-One Stateful Experience on your Laptop (kasten.io)

It covers how to do this on Docker Desktop, so I have modified the process slightly to work on a standalone vanilla K8s cluster.

For the cluster I have 4 x Ubuntu 20.04 VMware (6.7) VMs: one controller and three workers. Each has 2 vCPU and 4 GB RAM, with a 16 GB OS disk, plus a 100 GB data disk on the workers, all thin provisioned.

I’ve installed k8s on the cluster using kubeadm with calico networking.

All nodes:

cat << EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables    = 1
net.ipv4.ip_forward                   = 1
net.bridge.bridge-nf-call-ip6tables   = 1
EOF

sudo sysctl --system 

sudo apt-get update && sudo apt-get install -y containerd

sudo mkdir -p /etc/containerd

sudo containerd config default | sudo tee /etc/containerd/config.toml

sudo systemctl restart containerd

# sudo systemctl status containerd

sudo swapoff -a

sudo rm /swap.img

sudo vim /etc/fstab

# Press i
# Delete or comment out the below line
# /swap.img     none    swap    sw      0       0
# Press ESC, :, w, q, ENTER

sudo apt-get update && sudo apt-get install -y apt-transport-https curl

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

sudo apt-get update

# 1.21.3 was most up-to-date version at time of writing
sudo apt-get install -y kubelet=1.21.3-00 kubeadm=1.21.3-00 kubectl=1.21.3-00

sudo apt-mark hold kubelet kubeadm kubectl

Then on the control node, unless otherwise stated:

sudo kubeadm init --pod-network-cidr 192.168.0.0/16 --kubernetes-version 1.21.3

# run the mkdir command shown on the screen

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# check control node is showing ready
kubectl get nodes

# Run the result of this command on each of the workers
kubeadm token create --print-join-command

# This should show all nodes in Ready state
kubectl get nodes

Installing and configuring Rook.

First, make sure that you have an sdb drive attached to the worker nodes.

sudo lsblk

Then follow the next steps; you’ll find most of them in the quick start, but there are some modifications due to this being a multi-node cluster instead of a local one.

Rook Docs

The changes as well as the info on the toolbox came from the Linux Foundation.

Kubernetes – Getting Started With Rook – YouTube

git clone https://github.com/rook/rook.git

cd rook

git checkout v1.6.8

cd cluster/examples/kubernetes/ceph

kubectl create -f crd.yaml -f common.yaml -f operator.yaml

vim cluster.yaml

You will need to make some changes to the cluster.yaml file


Under storage.nodes I have added:

storage:
  useAllNodes: true
  useAllDevices: true
  config:
  nodes:
  - name: "k8s-worker1"
    devices: 
    - name: "sdb"
  - name: "k8s-worker2"
    devices:
    - name: "sdb"
  -name: "k8s-worker3"
    devices:
    - name: "sdb"

Make sure that the node names here match up with your actual node names.

Next run:

kubectl apply -f cluster.yaml

kubectl apply -f csi/rbd/storageclass.yaml

kubectl apply -f toolbox.yaml

# check progress with 

kubectl -n rook-ceph get pods --watch

When you have three (or as many as the worker nodes you have added) of the below:

rook-ceph-osd-X-XXXXXXX-XXXX

Then you know that everything is up and running. But if you want a closer look, you can use the toolbox pod created above:

kubectl -n rook-ceph exec -it <pod_name> -- bash

# Overall status
ceph status

# Information on the OSDs
ceph osd status

exit

Next create the StorageClass

kubectl create -f csi/rbd/storageclass.yaml

I used the one above as opposed to the storageclass-test in the Kasten blog as it suits the multi-node cluster that has been deployed.

Now for some reason that I’m still working out I could not just create the VolumeSnapshotClass via:

kubectl create -f csi/rbd/snapshotclass.yaml

This threw up errors for me. Apparently this was an issue when the snapshot API was still in beta, but my understanding is that snapshot.storage.k8s.io/v1 has been included since version 1.20, and I’m running 1.21.3.

I suggest trying the command above, and if you hit an error these are the steps that I took to fix it. They are also detailed on the https://github.com/kubernetes-csi/external-snapshotter page.

git clone https://github.com/kubernetes-csi/external-snapshotter.git

cd external-snapshotter

kubectl create -f client/config/crd

kubectl create -f deploy/kubernetes/snapshot-controller

Now you should be able to create the VolumeSnapshotClass shown above.

Next we need to make a few changes to the StorageClass and VolumeSnapshotClass so they’ll work in our set up.

# Without it being the default it has caused issues
kubectl patch storageclass rook-ceph-block \
    -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

kubectl annotate volumesnapshotclass csi-rbdplugin-snapclass k10.kasten.io/is-snapshot-class=true

# You can modify the snapshot class before applying but this is just as easy
kubectl patch volumesnapshotclass csi-rbdplugin-snapclass -p '{"deletionPolicy":"Retain"}' --type=merge
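Before moving on, an optional sanity check (my own addition, not part of the original walkthrough) is to create a small test PVC and confirm it binds against the new StorageClass:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF

# STATUS should move to Bound after a few seconds
kubectl get pvc rbd-test-pvc

# tidy up
kubectl delete pvc rbd-test-pvc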

Right, we’re now done with Rook 😀

Kasten Install

Kasten install is well documented on the site but I’m putting the steps below for ease.

# runs a check to see if everything is good to go
curl https://docs.kasten.io/tools/k10_primer.sh | bash

# adds the kasten k10 repo
helm repo add kasten https://charts.kasten.io/

# installs k10
helm install k10 kasten/k10 --namespace=kasten-io --create-namespace

# monitor the progress
kubectl -n kasten-io get pods --watch

Now, the bit that caught me off guard was how to access the K10 dashboard externally, as the Kasten blog assumes you are running it locally so 127.0.0.1 will work.

So you will need to create a NodePort using the following.

# Expose the node port to the Gateway Service
kubectl expose service gateway -n kasten-io --type=NodePort --name=gateway-nodeport

# Get the Node port
kubectl get svc -n kasten-io gateway-nodeport

You can then access the interface externally via:

http://192.168.4.77:31013/k10/#/

Obviously replace the IP and port with your own, or add it to DNS and give it a friendly name.

After that it should just work.

Thanks for reading!

Azure Report

Recently Jorge De La Cruz posted a method of creating and sending a job report from Azure.

Veeam: HTML Daily Report for Veeam Backup for Azure is now available – Community Project – The Blog of Jorge de la Cruz

I followed suit by creating an adaptation of that report using Golang, as I’m a big fan of the language.

https://github.com/shapedthought/veeam_azure_assessment

I’ve not been able to make my version as pretty as Jorge’s, and parsing the strings for some of the outputs has proven to be a bit of a challenge in Go.

However, Jorge’s version has its own issues: as it is a shell script you need a Linux VM, and in my opinion the email functionality needs some work.

Anyway, what I decided to do was combine our efforts by creating a Docker image based on Ubuntu. It has the script (slightly modified) embedded, as well as all the required packages (curl, jq), plus a Golang executable that I wrote which takes advantage of the language’s native SMTP support.

The image is available from: txtxx56/azure_report

You will need to create an .env file with the following:

VEEAMUSER=username
VEEAMPASS=password
VEEAMURL=https://55.55.55.55
SMTPTO=their@emailaddress.co.uk
SMTPFROM=your@emailaddress.co.uk
SMTPSERVER=your.smtp.com
SMTPPASS=sMtpPaSSwOrd

You can then run the container using the following:

docker run --rm --name azure_report --env-file ./.env txtxx56/azure_report:0.5

You should see “Sending Email” then “Email Sent” if it worked correctly.

If you want to get a local copy of the HTML report run the following:

docker run --rm --name azure_report --env-file ./.env -v ${pwd}/target:/home/oper/vba_azure_reports txtxx56/azure_report:0.5

You will need to update the local target name; for example, in the folder I run this from I have a ‘report’ directory, so I map the bind mount to that.

My plan is to take this a step further and create a Kubernetes CronJob which will leverage a Secret and ConfigMap. It is possible to do secrets in Docker Swarm, but as there is no way to schedule the running of a container, I decided that K8s is a better option.
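As a rough sketch of what that might look like (my own example, not something I have finished yet; it assumes the .env values have been loaded into a Secret, here called azure-report-env):

kubectl create secret generic azure-report-env --from-env-file=./.env

Then the CronJob manifest might look something like this:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: azure-report
spec:
  schedule: "0 7 * * *"   # every day at 07:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: azure-report
              image: txtxx56/azure_report:0.5
              envFrom:
                - secretRef:
                    name: azure-report-env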

Veeam SQL Log Copy to cloud

In this post I am exploring connecting Veeam to an Azure-based Linux repository for the purpose of doing a backup copy, specifically a log backup copy using the immediate copy mode launched with v10.

The reason for this is because Veeam Cloud Tier does not allow the copy of VLB files aka Log Backup files.

An overview of my lab environment follows.

I have a lab subnet which holds part of my lab VMs; the networks are routed using a pfSense router, which in turn connects to my home router and then to the internet. Nothing too crazy there.

The first thing I had to do was create a VM in Azure to work as my remote repository. I opted for a B2s VM running Ubuntu 18.04 with an S6 64 GB HDD managed disk attached for the copy data.

In addition to this I used VeeamPN, which is free software from Veeam that allows you to create a VPN to your network. It is available from:

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/veeam.veeampn

Of course the VMs running VeeamPN do have a cost attached; between the two VMs the total cost according to the Azure calculator was $84. Most of that was the compute; upping the capacity didn’t actually cost a lot more.

The installation process is pretty simple, the online guide provides all the information you need to set it up.

https://www.veeam.com/veeam_pn_2_1_user_guide_pg.pdf

You also need to download the VeeamPN hub OVA from the Veeam website; again, this is free. You don’t actually need it if you are only planning to connect a single host to VeeamPN, as you can install and use OpenVPN; however, as the backup server also needs to communicate with the remote repo, it makes sense to install the simple-to-use hub VM.

The one slightly more complicated piece was getting pfSense set up to use the gateway. What I did was download the configuration XML from VeeamPN and do the following:

  1. Add the cert in the CA under “System/Certification/CAs”; you’ll need to change the method to “Import an existing Certification Authority”.
  2. Add the certificates under “System/Certification/Certificates”, just a copy/paste of the private and public keys.
  3. Create a new gateway under “System/Routing/Gateways”; I assigned this to the IP address of the VeeamPN hub.
  4. Create a static route with the destination as the Azure subnet, e.g. 10.1.2.0/24, and assign the new VeeamPN gateway.

Once all that is done you should be able to ping your Azure VM assuming you don’t have any firewall rules that might block access.

The Azure-based VM required me to format the sdc disk with XFS; my friend Jorge De La Cruz has a great post on this subject:

https://jorgedelacruz.uk/2020/03/19/veeam-whats-new-in-veeam-backup-replication-v10-xfs-reflink-and-fast-clone-repositories-in-veeam/

The rest of the process is the same as adding a repository locally through the Veeam GUI; however, remember that you will need to create an SSH key on the repository and VBR servers to be able to connect to the Azure VMs. Those public keys will then need to be added to the authorized_keys file on the Azure VM.

I personally did this manually by doing the following:

nano ~/.ssh/authorized_keys

Then I copied and pasted in the public keys and saved the file.
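If password authentication is still enabled on the Azure VM, ssh-copy-id will do the same job in one step (run from the server whose key you are adding; the user name here is just an example):

ssh-copy-id -i ~/.ssh/id_rsa.pub azureuser@<azure_vm_ip>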

SQL Log Cloud Copies

This section still needs research to confirm certain aspects.

You are able to set the backup copy to keep the absolute minimum, which is two restore points, while setting the log backup retention to far longer.

What this means is that you can dramatically decrease the amount of data held in the cloud for the VM portion of the backup while having extended log backup retention. Cloud Tier is completely separate from Backup Copy, so you could have longer retention using tiering, as well as another copy in the cloud using that option.

The potential process in case of emergency would then be to grab the backup files from the cloud-based repo. In the case of Azure this could be achieved using Veeam FastSCP or any other means of copying over the network.

https://www.veeam.com/fastscp-azure-vm.html

https://www.veeam.com/veeam_fastscp_azure_1_0_user_guide_pg.pdf

Why is this a good thing? Well, you could of course do a straight copy of the logs using the Azure/AWS CLI to copy a folder to a bucket, which is a low-cost option. The problem with this is that it is unsupported by Veeam, so if you have any issues then it’s on you. Not to say it won’t work, and it would be cheaper, but I couldn’t recommend it. Doing it this way keeps it under support.

This of course is another option for copying logs to the cloud and is supported by Veeam. However, as it isn’t part of the main console and requires separate management, it isn’t my recommended option.

Assuming you have also downloaded the main VM backups from the copies that you sent to Azure via Capacity Tier, the VLB files can be placed in the same directory as the VBK and VIB files, and if they cover a matching timescale the logs will ‘reconnect’, allowing granular restore points.

Data Gathering for Veeam NAS Backup

This post is about a possible method of calculating change rate and the archive rate of a given filesystem.

The issue that a lot of people have is calculating the change rate of a filesystem in order to be able to accurately estimate the repository storage for Veeam. In addition to this it can help with estimating what can be archived.

In my search for a method of watching a filesystem using code, I found a NirSoft program that already does what I needed.

https://www.nirsoft.net/utils/folder_changes_view.html

Folder Changes View (FCV) monitors a filesystem and shows the changes in real time, including modifications, created files and deletions. Unfortunately the last of these doesn’t include the file size, but I guess we can’t have it all.

Note that I did look at a Python package called ‘Watchdog’, which was promising but was going to be a lot more work.

The advantage of doing it this way, as opposed to scanning a whole filesystem, is that it doesn’t put any pressure on the filesystem and still gets relevant data. There is still a place for a full file analysis, for example file-type breakdowns; however, this is a lighter-weight option.

Important Note: Though I am confident in the FCV application, you can’t be too careful. I recommend that you run it on a dedicated VM and that the user account has Read Only permissions.

FCV can export an HTML file which holds a table; this is great when it comes to Python, as we can grab that table very easily using Pandas, which turns it into a DataFrame that can be easily manipulated.
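For example, a quick sketch of pulling the exported table into Pandas (the file name is just a placeholder for the FCV export):

import pandas as pd

# read_html returns a list of DataFrames, one per table found in the file
tables = pd.read_html("folder_changes.html")
df = tables[0]
print(df.head())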

However, we don’t really want to be sending potentially sensitive information over the internet, so we really need to anonymise the data as much as possible and save it in a secure state before sending.

I have written a program that will do all the work for us which is up on my Github page.

https://github.com/shapedthought/file_html_report_processor

The program will do the following:

  • Import the HTML data
  • Hash the filenames
  • Remove the Path and File Owner information
  • Export a randomly generated encryption key saved to a file
  • Save the resulting data in a text file in an encrypted string

The file containing the data can be sent via email, with the encryption key sent separately.

Conversely, the application can convert the data back to JSON using the encryption key, which can then be fed back into Pandas to start the analysis.

Note: this program is provided with the MIT licence, please review before using.

Exploring the Veeam API with Python

I have recently been experimenting with and testing the Veeam API; however, I have noticed that there is a lack of information on how to work with it, especially if you are new to the subject.

I love Python, so it made sense to explore the API with it; however, trying to communicate how to use it is a bit of a challenge. Enter Jupyter Notebooks.

If you are not already familiar with these, they essentially allow you to enter both Markdown text and executable Python code into notebook ‘cells’.

The Markdown areas allow you to explain what you are doing in a logical fashion, which makes it ideal for this type of use.

You can install JupyterLab, which is the latest incarnation of Notebooks, using:

pip install jupyterlab

You will need Python installed on your system (with it included in your PATH variables) for this to work. You then just need to run:

jupyter-lab

Alternatively, VS Code has direct support for Jupyter Notebooks, so you can run them in there if you wish. Personally I find JupyterLab looks nicer, but VS Code has IntelliSense, which can be very useful.

I have uploaded a Jupyter Notebook to my GitHub which can be viewed here:

https://github.com/shapedthought/juypter_veeam_api