Blog: backup

A Windows 8 machine was being backed up with the Windows 7 backup program. The file-level backup completed, but the system image backup failed. I found various articles indicating that the problem was in creating the shadow copy: apparently the backup tries to create the shadow copy on the system partition instead of the larger “C” partition. In this case, the system partition (partition #1 on physical disk 0) was 1GB and the C drive (partition #2 on physical disk 0) was about 450GB.

I used the partition program MiniTool Partition Wizard Home Edition V7.7 (downloaded from http://www.partitionwizard.com/download.html) to shrink the C drive and then shift it so that the system partition could grow contiguously. I increased the system partition size to 2.5GB. After that, the Windows 7 backup program ran to completion and backed up the system image as well.

When using Partition Wizard, I had to remove all USB drives from the system. If USB drives are present, Partition Wizard will error out when it reboots to apply the partition changes. This problem is discussed in the Partition Wizard FAQ found here: http://www.partitionwizard.com/faq.html


 

A system was using GUID partition tables (GPT) instead of MBR and UEFI instead of BIOS. After a restore from backup, trying to enable BitLocker produced the error “Element not found”. Searching Google for this vague message turned up nothing helpful, so I tried enabling BitLocker from the command line. Running the command “manage-bde -on C: -tpmandpin” gave me an error code (0x80070490) to go with the vague message. A Google search for the error code led to a TechNet article that says this is a known issue when moving hard drives between systems that use UEFI boot firmware, and that running the “bcdboot %systemdrive%\Windows” command will fix it. The command did not fix the problem, but it pointed me in the right direction. Some more searching led me to an article that describes how to manually delete the “bootmgfw.efi” file in the UEFI boot partition. After deleting the file and then running the “bcdboot” command from the TechNet article, BitLocker encrypted the drive.


 

A monitoring service reported an Asigra DS-System running low on available disk space. Looking at our storage reports, I was unsure how we had filled up so much space so quickly. The answer was the “Trash” that Asigra creates. When Asigra expires backup data because the maximum generations or retention policies have been met, it doesn’t immediately remove the data from the DS-System. Instead, the data is moved to the Asigra equivalent of the Windows Recycle Bin before being purged. By default, a scheduled job runs monthly to remove data from Trash that is older than 30 days. This means that data could sit in Trash for up to about 2 months before being removed permanently. You can run a job to purge the data manually, or you can alter the schedule to run more frequently if needed.
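The "up to 2 months" figure falls out of simple date arithmetic. The sketch below is only an illustration and does not use any Asigra API: it assumes the purge job runs every 30 days (an approximation of "monthly") and removes anything that has been in Trash for at least 30 days. The function and parameter names are hypothetical.

```python
def days_in_trash(day_trashed, purge_interval=30, min_age=30):
    """Return how many days an item sits in Trash before a purge removes it.

    Assumptions (not taken from Asigra documentation): purge jobs run on
    days 0, purge_interval, 2 * purge_interval, ... and each run removes
    items that have been in Trash for at least min_age days.
    """
    # Earliest day on which the item is old enough to be purged.
    eligible_day = day_trashed + min_age
    # First scheduled run on or after that day (ceiling division).
    runs_needed = -(-eligible_day // purge_interval)
    purge_day = runs_needed * purge_interval
    return purge_day - day_trashed


# An item trashed the day after a purge run waits almost two full cycles.
print(days_in_trash(1))  # 59
# An item trashed on the day of a purge run leaves at the very next run.
print(days_in_trash(0))  # 30
```

Shortening the purge interval is what brings the worst case down, which is why changing the schedule helps when disk space is tight.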


 

Over the years, many people have asked me about backup for home machines. Burning files to DVDs and carrying them to a different location is problematic, and it's a lot of trouble to make frequent offsite backups that way. I recently did some research and decided on using a program called Duplicati for backup and Amazon S3 (Simple Storage Service) for storage. I think keeping the backup program and the storage separate is the best solution. I can even back up to multiple providers in case one of them just goes away without warning.

Duplicati is free and open source and runs on Windows, Linux and macOS. It has a nice GUI plus a rich command line. Duplicati has built-in AES-256 encryption, which means you hold the key and your backups are encrypted before leaving your network. It creates normal zip files and then encrypts them with AES Crypt, so even if Duplicati breaks, you can still download, decrypt, and unzip your backups using other standard tools.

Duplicati will back up to many different cloud providers and protocols (Amazon S3, Rackspace, Google Docs, SkyDrive, Tahoe-LAFS, WebDAV, FTP, SSH) as well as file-based locations.

I chose Amazon S3 for storage because of Amazon's history of reliability. The cost is not much either. You get 5 GB free, and then it’s 12.5¢/GB/month after that. So you can store 50GB for less than $6/month. It is even cheaper if you choose Reduced Redundancy Storage (RRS).
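As a quick check of that arithmetic, here is a small sketch that uses only the figures quoted in this post (5 GB free, then 12.5¢ per GB per month). These are the rates at the time of writing, not current Amazon pricing, and the function name is just for illustration.

```python
def monthly_cost_usd(stored_gb, free_gb=5.0, rate_per_gb=0.125):
    """Estimate the monthly storage bill from the rates quoted in the post."""
    # Only the storage above the free allowance is billed.
    billable_gb = max(0.0, stored_gb - free_gb)
    return billable_gb * rate_per_gb


# 50 GB stored: 45 billable GB at $0.125 each.
print(f"${monthly_cost_usd(50):.3f}")  # $5.625
```

This ignores request and transfer charges, so treat it as a rough floor rather than a bill estimate.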

Get Duplicati here http://www.duplicati.com/.

Sign up for Amazon S3 storage here http://aws.amazon.com/.


 

The other day I had an issue come up with a customer where VSS (Volume Shadow Copy Service) integrated file system backups stopped working for some unknown reason. Usually, a reboot fixes these types of issues, but the backups continued to fail after a reboot. I opened a support call with the backup vendor, and after seeing the error logs, the support tech seemed fairly sure he knew what the problem was: this error is usually caused by a malformed path within the registry. He had me run the following commands on the server and send him the output.

vssadmin list writers >> c:\writers.txt
vssadmin list providers >> c:\providers.txt
vssadmin list volumes >> c:\volumes.txt

diskshadow /L c:\shadow.txt
list writers detailed

After reviewing the text files created, he found the malformed path:

- File List: Path = c:/windows\hpsum_1327455089, Filespec = hpsumserverw32.exe

To correct the issue, I searched the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\ for “c:/windows\hpsum_1327455089” and corrected the path to “c:\windows\hpsum_1327455089”. After doing this, the backup ran fine. Further research uncovered the root cause. During the last maintenance window, HP System Update Manager (HPSUM) was used to update the HP System Management Homepage on these servers, and HPSUM created this malformed registry key during the upgrade.
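Spotting a path like that by eye in a long writer listing is tedious, so a small script can do the scanning. This is a minimal sketch, assuming the log contains lines in the same “Path = ..., Filespec = ...” shape as the one quoted above; the second sample line is made up for contrast, and the function name is hypothetical.

```python
import re

# Captures the value of the "Path = ..." field up to the next comma.
PATH_FIELD = re.compile(r"Path\s*=\s*([^,]+)")


def find_mixed_separator_paths(text):
    """Return paths that contain a forward slash, which Windows paths should not."""
    suspects = []
    for line in text.splitlines():
        match = PATH_FIELD.search(line)
        if match:
            path = match.group(1).strip()
            if "/" in path:
                suspects.append(path)
    return suspects


# The first line is the entry from this post; the second is a made-up good entry.
sample = r"""
- File List: Path = c:/windows\hpsum_1327455089, Filespec = hpsumserverw32.exe
- File List: Path = c:\windows\system32, Filespec = example.dll
"""
print(find_mixed_separator_paths(sample))
```

To use it on a real system, read the contents of the saved output file into a string and pass that string to the function.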


 

When setting up VaultLogix online backup, make sure the server is not configured to apply Windows Automatic Updates and reboot during the backup window. If Windows Automatic Update reboots the server in the middle of a backup, VSS shuts down but may be given enough time to commit a partial backup, which can omit drives that need to be backed up. On the next backup, the agent will think there is new data and begin reseeding those drives. This is a real problem if the network has a slow connection and the original seed had to be done with a mobile vault, because the backups will never be able to catch up.

If this occurs, you must stop the reseeding and purge all of the partial backups made since the error occurred. Then resynchronize the VaultLogix DTA file. After that, the agent will not try to reseed data that is already backed up and will back up only the deltas for the selected drives.


 

The CommVault Exchange Mailbox iDataAgents do not back up mailboxes associated with disabled Windows user accounts. The backup job reports a "success", but when the details of the backup are explored, the backup set does not contain any data. Additionally, requesting a listing of all failed objects for the backup job results in a "no failures" status. According to CommVault, this behavior is by design, as is the "successful" backup status. After all, the job did not technically fail if it is not designed to include mailboxes belonging to disabled user accounts. This is very strange given that, in general, CommVault iDataAgents have an "inclusive by default" behavior. This can become a real problem if you try to restore data for a former employee whose Windows user account was disabled when they left the company. The lesson here is that you should always test your backups. Even if the backup report and all job status notifications indicate you are good... test anyway.


 

One of our customers uses VMware VCB backups integrated with CommVault Simpana. The CommVault job simply calls a pre-backup script to snapshot the VM and copy all the VM files to the VCB proxy, backs up the files from the proxy to the CommVault media server, and then calls a post-backup script that commits the snapshot and purges the VM files from the VCB proxy.

Recently, we upgraded this customer from VMware VI3.5 to VMware vSphere v4 Update 2. Most of the VMs that are backed up with VCB had no issues at all; their backups ran cleanly the weekend following the upgrade. However, none of the VMs that had been secured with the Windows Security Configuration Wizard would back up. These VMs are in the DMZ and are locked down very tightly because they host externally available web applications. Each time a backup was initiated from CommVault, the VCB script would return a non-zero exit code due to a snapshot failure in VMware. VMware’s error was “Cannot create a quiesced snapshot because the create snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine.” This would happen when using the VCB scripts, but I could create a snapshot without error from the VI client.

After much research and testing, I determined that the problem was a holdover from the VMTools upgrade. The new version of VMTools installs a new service called VMware Snapshot Provider when VMTools is upgraded. Its purpose is to help facilitate application-consistent snapshots through VMTools. On the servers that were getting the quiesced snapshot error, this service was not present at all, even though VMTools had already been updated… very strange. Here is where the Security Configuration Wizard comes in. Part of our lockdown policy is to disable a service called COM+ System Application. This service manages the configuration and tracking of COM+ based components. Apparently, without this service enabled, the VMTools upgrade will NOT install the VMware Snapshot Provider service. Without that service, there are no quiesced snapshots, and you get errors when creating snapshots via the VCB integration modules.

So why could I create a snapshot from the VI client? Well, VMware knows that you are using VCB to create snapshots for the purpose of backup. What good would the backup be if it wasn’t app-consistent? The VI client, on the other hand, will first try to create an app-consistent snapshot, but if that fails or times out, it will go ahead and create a “crash-consistent” snapshot without error. VCB is not as forgiving. If the guest quiesce fails, the snapshot fails… end of story. The solution was to uninstall VMTools, reboot, temporarily enable and start the COM+ System Application service, install VMTools, and then disable the COM+ System Application service again. Backups have been running fine ever since.


 

Last week I learned why VMware suggests having service consoles for ESX hosts on at least two distinct networks. I was troubleshooting intermittent backup issues with Veeam on a customer network and couldn’t find any pattern to the failures. Two or three backups in a row would run successfully, then five in a row might fail. The behavior was very random. However, the failures were always on virtual machines associated with a specific ESX host. At first I thought the host was healthy, but after watching the VI client for an extended period of time, I noticed that the ESX host would drop offline (showing as disconnected in the VI client) and then come back online again. This indicated that the problem wasn’t just affecting the management/backup server.

To level-set my troubleshooting efforts, I decided to reboot this ESX host. However, after the reboot, I could not connect to it with the VI client. I could ping the IP assigned to the service console, but couldn’t SSH or connect via the VI client. I logged in via iLO and found that an ifconfig at the command line returned IP = 0.0.0.0… interesting. So what was responding to my pings? I checked the ARP cache on one of the switches and found that a thin client had been plugged in that had the same IP as my LAN service console. What is really odd is that the MAC address for the thin client was all zeros AND the IP I was using for the LAN service console is not even available to be distributed by DHCP. I was not able to connect to the thin client to see how it was configured, but I was able to connect to the ESX host via a second service console port that I had placed on the iSCSI network. The management/backup server has a connection to the iSCSI network to do backups to disk, so I was able to change the LAN-facing service console IP to another IP, and everything started working fine. The backup issue was evidently caused by the ARP entry on the backup server flipping between the thin client and the ESX host. So, be aware that at boot time, if ESX determines that the IP it is using for a service console is already in use, it just rips it out of the configuration and continues to boot with NO WARNINGS or ERRORS on the console.
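A conflict like this shows up as one IP address being seen with more than one MAC address over time. The sketch below illustrates the check on plain (IP, MAC) pairs; how you collect those pairs (switch ARP cache, "arp -a" output, and so on) varies by device and is not shown. The addresses are made-up documentation values and the function name is hypothetical.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Map each IP seen with more than one MAC to the sorted list of those MACs."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalize case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical ARP observations gathered at different times.
observations = [
    ("192.0.2.10", "00:50:56:AA:BB:CC"),
    ("192.0.2.10", "00:00:00:00:00:00"),
    ("192.0.2.11", "00:50:56:DD:EE:FF"),
]
print(find_ip_conflicts(observations))
# {'192.0.2.10': ['00:00:00:00:00:00', '00:50:56:aa:bb:cc']}
```

An all-zero MAC in the result, as in the case above, is a strong hint that some device on the segment is misbehaving.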


 

I had an issue come up with using GUID partition table (GPT) disks in Windows 2008 VMs. The issue involves doing a file-level restore from image-based backups made using 3rd party VMware backup utilities such as Veeam Backup, Vizioncore vRanger, or esXpress. In Windows 2008, the disk containing the system partition is always MBR, but I had been using GPT for disks with non-system partitions. I found that, with Veeam specifically, file-level restore does not work because the partition table cannot be read when the vmdk is mounted to the recovery host during the process. The partitions on the system disk show up fine, but none of the partitions on GPT disks are available. A VERY close look at the Veeam documentation shows that GPT disks are not supported, only MBR disks. So, if one of these products will be used for backup, it would be best just to go with MBR disks.