Category Archives: Data Protection

Enabling Rubrik storage integration using PowerShell

Like other backup products, Rubrik supports SAN-integrated snapshots for backing up virtual machines. In my experience, Rubrik runs each backup as an individual task rather than performing all the required operations on a target at the same time. For example, if you have an SLA policy targeting a SQL database/instance, it will run as a separate process from the VM image-based backup of the same SQL server.

This appears to extend to storage-integrated snapshots: Rubrik doesn't analyse the VMs on each datastore so that multiple VM backups can share a single storage snapshot. This is likely why Rubrik recommends using the storage-integrated snapshot feature for specific workloads only.

For one of my recent clients, I wrote the script below to enable/disable storage snapshots on their Pure Storage FlashArray across their entire VM environment. To use it, you need to configure an API token for Rubrik and then run the script against the local API endpoint of the target Rubrik Brik.
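If you don't already have a token handy, one can typically be requested from the CDM session endpoint. A minimal sketch, assuming Rubrik CDM's `POST /api/v1/session` endpoint (which accepts HTTP basic authentication) and the same placeholder FQDN as the script below:

```powershell
# Request an API token from the Rubrik session endpoint.
# Assumes CDM's POST /api/v1/session with basic authentication;
# the FQDN is a placeholder - substitute your own cluster address.
$cred = Get-Credential   # Rubrik username/password
$session = Invoke-RestMethod -Method Post -Uri "https://rubrik.fqdn.internal/api/v1/session" -Credential $cred
$token = $session.token
```

With `$token` populated in the session, the script below will skip its interactive prompt.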


$rubrikURL = "https://rubrik.fqdn.internal/api/" # update me

# Prompt for the API token if it isn't already set in the session
if ($null -eq $token) {
    $token = Read-Host -Prompt "API Token for $rubrikURL"
}
$headers = @{ "accept" = "application/json"; "Authorization" = "Bearer $token" }

# Connect to Rubrik and get a list of virtual machines
$uri = "$($rubrikURL)v1/vmware/vm"
$data = Invoke-WebRequest -Method Get -Headers $headers -Uri $uri
$vmlist = $data.Content | ConvertFrom-Json | Select-Object -ExpandProperty data

# Get more details about each VM
$vmdetails = @()
foreach ($vm in $vmlist) { # append "| Select-Object -First 2" to $vmlist to trial on a subset
    $uri = "$($rubrikURL)v1/vmware/vm/$($vm.id)"
    $data = Invoke-WebRequest -Method Get -Headers $headers -Uri $uri
    $vmdetails += ($data.Content | ConvertFrom-Json)
}

# List all VMs, then the VMs that require updates
Write-Host "VM Details:"
$vmdetails | Format-Table id, name, isArrayIntegrationPossible, isArrayIntegrationEnabled
Write-Host "VMs capable of array integration (but not yet enabled):"
$vmsrequiringupdate = $vmdetails | Where-Object { $_.isArrayIntegrationPossible -eq $true -and $_.isArrayIntegrationEnabled -eq $false }
$vmsrequiringupdate | Format-Table id, name, isArrayIntegrationPossible, isArrayIntegrationEnabled

# 'Patch' each VM to enable array integration
$patchheaders = $headers + @{ "content-type" = "application/json" }
$patchdetails = @{ "isArrayIntegrationEnabled" = $true } | ConvertTo-Json -Compress # set to $false to disable storage snapshots
foreach ($vmdetail in $vmsrequiringupdate) {
    $uri = [uri]::EscapeUriString("$($rubrikURL)v1/vmware/vm/$($vmdetail.id)")
    Write-Host "Processing $($vmdetail.name) with id $($vmdetail.id) at $uri " -NoNewline

    # Invoke-WebRequest throws on non-2xx responses, so catch failures rather
    # than testing the status code afterwards
    try {
        $data = Invoke-WebRequest -Method Patch -Headers $patchheaders -Uri $uri -Body $patchdetails -ContentType "application/json"
        Write-Host "(Status $($data.StatusCode))" -ForegroundColor Green
    } catch {
        Write-Host "(Request failed: $($_.Exception.Message))" -ForegroundColor Red
    }
}

Veeam Job Status tip

While working with a colleague recently, I mentioned using the left and right arrow keys to move between status reports for jobs within the Veeam console, which they didn't know about.

When you have the job status window open, you can press the left arrow to go to the previous job report or the right arrow to go to the latest/next job report. It saves having to open the job log to see how a job has been performing.

Veeam job status (image taken from https://helpcenter.veeam.com/docs/backup/hyperv/realtime_statistics.html?ver=120 )

The Veeam support article for viewing real-time statistics mentions this in a tip:

Veeam tip for status (image taken from https://helpcenter.veeam.com/docs/backup/hyperv/realtime_statistics.html?ver=120)

Effectively performing initial backup of VMs over throttled network links using Veeam

Recently I’ve been backing up some large virtual machines over a WAN and wanted to detail the way I’ve approached this challenge.

Situation

  • Veeam infrastructure is located mainly in the primary data centre
  • Multiple remote sites with a mix of Hyper-V and vSphere environments
  • High-speed WAN to the majority of remote sites (>= 1 Gbit/sec)
  • Remote sites typically did not support Veeam WAN acceleration with existing hardware
  • Remote sites have several very large virtual machines (10 TB+) in addition to the regular VM workload

Approach

Network throttling was enabled between the Veeam infrastructure in the core data centre and the remote Hyper-V hosts (on-host proxy mode) or a vSphere proxy at each remote site. Each site had specific network throttling requirements, but generally this was somewhere between 300-500 Mbps, applied 24/7.

The Veeam repositories were formatted using ReFS with 64 KB clusters to support block cloning (fast clone) for faster synthetic full backups. Each backup job was then configured for a single large VM in Incremental Forever mode, with synthetic full backups occurring weekly (and no active full backups). As the data was traversing a throttled WAN, the compression mode was configured to Optimal.
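For reference, the repository volume can be formatted this way with the built-in Storage module. A minimal sketch, assuming the new repository disk is already online at drive letter R: (adjust to suit):

```powershell
# Format the repository volume as ReFS with a 64 KB allocation unit size
# so that Veeam synthetic fulls can use ReFS block cloning (fast clone).
# WARNING: destructive - assumes R: is the dedicated repository volume.
Format-Volume -DriveLetter R -FileSystem ReFS -AllocationUnitSize 65536 -NewFileSystemLabel "VeeamRepo"
```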

For each large VM, I then modified the job's disk exclusion list so that only a single disk was processed and ran a backup; once each backup completed, I added additional drives and started another backup. Once all drives were completed, I reconfigured the disk processing back to "All Disks".

Outcome

This approach was quite successful and had the following benefits:

  • It didn't tie up remote proxy tasks for an extended period of time, which could otherwise prevent the backup of other virtual machines at the remote site. Each new disk consumed a single proxy task, and previously backed-up disks were significantly faster as they only required an incremental backup.
  • Using a dedicated backup job for the large VM meant that the long run time didn’t impact other VM backup operations.
  • Each Virtual Disk was not competing with other virtual disks for bandwidth during the initial backup allowing each backup to complete faster. For some disks this still took multiple days.
  • It provided an immediate restore point for a subset of data when the initial & each subsequent backup completed.
  • It allowed stop points between each backup if maintenance was required on the Veeam infrastructure.

Veeam Compression & De-dupe appliances

Veeam offers a number of configuration settings for compression and deduplication when using backup and backup copy jobs, and it's important to configure these correctly when using a deduplicating storage appliance as a backup repository.

On a backup or backup copy job there are a number of options for configuring the compression level, as seen in the image below. If configured, these settings are applied from the first Veeam component involved in the backup (typically a backup proxy). In the case of a Hyper-V host using on-host proxy mode, this is also the source hypervisor.

Veeam backup job compression level settings

Configuring this option as 'Optimal' is generally recommended, as it reduces both network throughput and storage requirements. Veeam provides some feedback on each compression-level option:

Veeam compression level None feedback
Veeam compression level Dedupe-friendly feedback
Veeam compression level Optimal feedback
Veeam compression level High feedback
Veeam compression level Extreme feedback

It's quite rare to utilise the High/Extreme compression levels in a Veeam environment due to the significant increase in CPU utilisation. If you intend to utilise these options, I would strongly recommend targeting very specific workloads with a separate backup job for that purpose.
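The compression level can also be set from PowerShell rather than the console. A sketch, assuming the Veeam.Backup.PowerShell module and a hypothetical job name; to the best of my knowledge Veeam's numeric levels map to 0 = None, 4 = Dedupe-friendly, 5 = Optimal, 6 = High and 9 = Extreme:

```powershell
# Assumes the Veeam.Backup.PowerShell module is loaded on the backup server
# and that a job named "Large VM - FileServer01" (illustrative) exists.
$job = Get-VBRJob -Name "Large VM - FileServer01"

# 5 = Optimal; use 4 (Dedupe-friendly) or 0 (None) when a dedupe
# appliance is handling storage efficiency instead
Set-VBRJobAdvancedStorageOptions -Job $job -CompressionLevel 5
```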

On the backup repository side there is also an option to decompress backup data before storing. This is typically used with deduplicating storage appliances such as the Dell EMC Data Domain or Pure Storage FlashArray//C, where the appliance performs the storage efficiency work. Decompressing first allows the appliance to see the full set of backup data and apply its own compression/deduplication optimally.

In environments with high-speed networking, Hyper-V hosts in on-host proxy mode, and a deduplicating storage appliance, it is worthwhile testing the backup job compression level set to 'Dedupe-friendly' or 'None' with 'decompress backup file data blocks before storing' enabled on the repository. This may reduce the compute workload on the host and allow backups to complete in a more timely manner, as the proxy and repository server will not be compressing and decompressing the data respectively.
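For completeness, the repository-side option can be toggled from PowerShell as well. A sketch, assuming the Veeam.Backup.PowerShell module and a hypothetical repository name for a dedupe appliance:

```powershell
# Assumes a repository named "DataDomain-Repo" (illustrative) already exists.
$repo = Get-VBRBackupRepository -Name "DataDomain-Repo"

# Decompress backup data blocks before storing, so the appliance can
# deduplicate and compress the full data set itself
Set-VBRBackupRepository -Repository $repo -DecompressDataBlocks
```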