Data and Backup Management for Architectural Photographers

 

Let’s get one thing out of the way first: We did not become photographers to worry about mundane issues like data management, right?

It is an incredibly boring topic, and I should know. Before my career in photography, I was an IT systems engineer for 15 years. Keeping data safe was half my job. My years in the industry also gave me insight into what happens to businesses (yes, you too are a business) that lose their data. It’s not pretty. 

I am in the lucky position of having a slightly higher aptitude for technical solutions than the average photographer. Even if you feel like most of what I’m writing is going over your head, take it at least as a starting point to review your production data and backups. I challenge you to run a few what-if scenarios in your head, and to see if you still sleep well tonight. Keep in mind that you can always hire an IT professional to set you up with a backup solution that you can manage yourself.

Please note that I will not make explicit product recommendations. Some of you will be on Macs, others on PCs. Some will have a pile of production data hard drives in caddies, others might run a giant NAS. Instead, I will focus on a sample workflow that will hopefully provide inspiration for those of you who lack confidence in your own.

Introduction

While we should think of ourselves as businesses, small freelancers usually do not have the resources to implement enterprise-grade backup solutions. Luckily, that isn’t really necessary these days. Cheap disk drives are abundant, but I would still recommend putting some thought into a proper workflow.

There are many different ways to implement your data’s lifecycle depending on requirements. In this example, I will take you through my strategy, and why I decided to go down that route.

To give you an idea of my average job:

  • Architectural shoots with few compositions, but a large overhead of bracketed shots, or redundant shots of models walking through images. Some compositions consist of only two images, others of 50.
  • I copy all RAW files onto my computer for proofing purposes. I never delete a photo, since I often get requests to edit up new photos from old projects, even years after the project was delivered.
  • From my RAW dumping ground, I copy only the proofs I want to present to my clients into my Lightroom catalog. Once my clients have made their selection, I edit up the images and deliver them through a cloud service. This approach allows me to retain all images, but without bloating my Lightroom catalog.
  • As a result, all images reside in a very large RAW dump, which quickly fills up my computer and my NAS. Then there is a redundant copy of selected RAWs in Lightroom, which remain on my computer during editing, and later on my NAS permanently. There’s also a permanent copy of all delivered images in the cloud. 

Let’s have a look at how to handle your bytes in a secure manner.

Low effort: Have your data in two places

In a bare-minimum approach, your data should always exist in at least two places. While I’m shooting, I’m saving to two separate cards. Once I copy the images off to my computer, they get backed up daily. So if your own setup resembles a computer with an attached USB drive for backups: Congratulations, you have the basics covered.

Here is what this very simple setup protects you against:

  • A failure of your computer’s hard drive or some other calamity that only affects your computer. 

It will not protect you against:

  • A power surge that fries your computer and any attached hard drives
  • A site-loss scenario (think fire, flood, earthquake, meteorite)
  • A burglary
  • Accidental data deletion that you notice after a few days (assuming you take daily backups)
  • Malware

As you can see, the two-places approach leaves you unprotected against a lot of mishaps. You could argue that those scenarios are a lot less likely to occur than a broken computer, and you would be correct. But are you willing to bet your business on them never occurring?

The problem with the two-places approach is that your data is not really in two locations. As long as a device is physically connected to a computer, the two are effectively one device. A power surge can travel down the USB lead to fry the attached drive. A fire would also affect both locations, as would malware that encrypts all devices attached to your computer.

Better yet: Have your data in three places

Larger businesses cover against these issues by having multiple backup media that they rotate through. I’m a small operation of one, so I try to keep my backup media to a minimum. Because I’m lazy, one USB drive (I use that term interchangeably for USB housings as well as USB docking stations with naked SATA drives, pick whichever you like best) is constantly connected to my computer for daily backups.

[Figure: My workflow]

Once a week (unless I forget), I attach a second drive that carries the exact same data as my daily backup. It contains all relevant bits that I would need in case of a site loss. Think RAW images, edited images, catalogs, business documents, etc. Once that backup is finished, and this is very important, this backup leaves my house. You could store your backup in a bank lockbox if you were paranoid, or you could leave it in a detached garage, or at a friend’s house. In my case, I just leave it in my car. The drive is encrypted, so if someone steals it from my car, I’m not losing any personal data. 

Note: Disconnect all cables from your daily backup while running your weekly backup. Remember, if all your backups are connected to your computer simultaneously, they could all be destroyed at the same time. 
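If you want to run those what-if scenarios outside your head, the reasoning so far can be sketched in a few lines of Python. This is only a toy model under my own simplifying assumptions: the scenario rules (for example, that a fire or burglary takes out everything in the office, and that a surge or malware hits everything cabled to the computer) and all names in the code are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    site: str       # physical location, e.g. "office" or "car"
    attached: bool  # True if it is the computer or is cabled to it


# Hypothetical rules: each scenario decides whether a given copy is destroyed.
SCENARIOS = {
    "computer drive failure": lambda c: c.name == "computer",
    "power surge": lambda c: c.attached,
    "malware": lambda c: c.attached,
    "fire or burglary": lambda c: c.site == "office",
}


def survivors(copies, scenario):
    """Return the names of the copies that survive the given scenario."""
    destroyed = SCENARIOS[scenario]
    return [c.name for c in copies if not destroyed(c)]


two_places = [
    Copy("computer", "office", attached=True),
    Copy("daily USB drive", "office", attached=True),
]
# The weekly drive is unplugged after its backup run and leaves the house.
three_places = two_places + [Copy("weekly drive", "car", attached=False)]

if __name__ == "__main__":
    for scenario in SCENARIOS:
        print(scenario)
        print("  two places:  ", survivors(two_places, scenario))
        print("  three places:", survivors(three_places, scenario))
```

An empty list means that scenario leaves you with nothing. In this model, the two-places setup only survives the plain drive failure, while the weekly drive survives every scenario as long as it is unplugged and stored elsewhere.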

But the cloud, they say

My only exception to my three-places policy is the archived part of my RAW dump. It contains by far the greatest volume of data, and grew to a point where I needed a separate approach. Roughly once a year, I move the files due for archiving from the production RAW dump on my PC onto two identical hard drives, so that each drive holds a complete copy. These files are very rarely needed and not business critical, so there’s no need to clutter my daily backups or production systems with them indefinitely. The same principle as with my regular backups applies here: One copy lives in my office, the other copy lives at another location.

Cloud storage provides another level of data security. At the end of each project, I deliver all edited images as full-res 100% JPEGs to my clients. These images sit in the cloud indefinitely, and provide me with another layer of security in case my office, my house, and the places where I store data out-of-house disappear overnight. It is very likely that such an event would also lead to my demise, but if I somehow make it through the apocalypse, I still have a copy of my finished work sitting in an Amazon datacentre somewhere.

People have repeatedly suggested that I keep all my backups in the cloud for convenience. I’m not a huge fan of that approach. Not only does it make me completely dependent on a fast, functioning internet connection, it also exposes me to the policy changes of transnational corporations, who might decide that I do not get my data back for reasons outside my control. At best, I would consider this approach as a backup for my backups.

The way in which companies bill for cheap cloud backups is often quite creative. With some services, storing data can be very cheap, but restoring it will be a lot more expensive. Money aside, even with a fast fibre connection I would rather remain the master of my own data.

Software

I will not go into detail on how to get your data copied or moved onto backups. I will just say that backups benefit from some degree of automation. There are all sorts of data backup and synchronisation solutions on the market, and I blissfully stopped paying attention to any of them when I left the industry. Personally, I use a very simple command-line tool called Robocopy, which is included in Windows. It compares the folders I point it at with my backup drive, and copies only new or changed files over to my daily or weekly backup. It is a very fast tool for the technically minded. If you want something more convenient, ask your friendly computer person or Google for advice.

For those interested in Robocopy, here is how the work-related part of my script looks:

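rem /MIR mirrors the source, which also deletes files from the backup that no longer exist in the source
rem /MT:4 copies with four threads, /R:10 retries a failed file up to ten times, /W:0 waits zero seconds between retries
rem /XD @Recycle skips any folder named @Recycle
robocopy.exe C:\Users\dr\AppData\Roaming\Adobe q:\pc\c\Adobe-Settings /MIR /MT:4 /R:10 /W:0 /XD @Recycle
robocopy.exe d:\lr q:\pc\d\lr /MIR /MT:4 /R:10 /W:0 /XD @Recycle
robocopy.exe d:\ps q:\pc\d\ps /MIR /MT:4 /R:10 /W:0 /XD @Recycle
robocopy.exe d:\raw q:\pc\d\raw /MIR /MT:4 /R:10 /W:0 /XD @Recycle
robocopy.exe w:\ q:\nas\w /MIR /MT:4 /R:10 /W:0 /XD @Recycle
robocopy.exe y:\ q:\nas\y /MIR /MT:4 /R:10 /W:0 /XD @Recycle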

Note: I include the Adobe settings folder, which will allow me to restore settings, calibration profiles, and any other customisation of Adobe products. 

I execute this script manually, since I do not want a scheduled task to kick off and slow down my computer while I’m otherwise busy. This requires some diligence, so if you don’t want to rely on remembering your backup runs, just schedule an automatic task.
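For those not on Windows, or just curious about what "copies only new or changed files" means in practice, here is a minimal Python sketch of the idea. It is not how Robocopy is implemented and not part of my own script. It simply treats a file as changed when its size or modification time differs from the copy on the backup, and the function names and the two-second tolerance are my own assumptions. Unlike /MIR, it never deletes anything from the backup.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """A file needs copying if the backup lacks it or it looks different."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    # Assumed tolerance of 2 seconds, since some file systems store
    # modification times with coarse resolution.
    return s.st_size != d.st_size or abs(s.st_mtime - d.st_mtime) > 2


def incremental_copy(source: Path, backup: Path) -> list[str]:
    """Copy new or changed files from source to backup, return what was copied."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the modification time
            copied.append(rel.as_posix())
    return copied
```

Calling incremental_copy(Path("d:/raw"), Path("q:/pc/d/raw")) twice in a row should copy everything on the first run and nothing on the second, which is exactly why this kind of backup is fast once the initial copy is done.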

Maintenance & Verification

Little maintenance should be required with this type of setup. Every few years, I will run out of production or backup capacity, and I will usually just replace drives with larger options. “Just” is a bit of a misnomer here, since the process requires some level of technical confidence. It can also take a day or two to copy 10TB of data from one medium to the other, times two. Before I make any changes, I consider every step of the process to ensure that my data is always present in three locations, in case I blow up one or two of my media. Creating and ticking off a run sheet helps with staying on top of the process.
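The "day or two" figure is easy to sanity-check with some back-of-the-envelope arithmetic. The sketch below assumes sustained transfer rates of 100 to 200 MB/s, which are my own ballpark assumptions for hard drives rather than measurements, and it uses decimal units the way drive manufacturers do.

```python
def copy_hours(terabytes: float, mb_per_second: float) -> float:
    """Hours needed to copy the given volume at a sustained transfer rate."""
    total_mb = terabytes * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / mb_per_second / 3600


if __name__ == "__main__":
    for speed in (100, 150, 200):  # assumed sustained rates in MB/s
        one_copy = copy_hours(10, speed)
        print(f"{speed} MB/s: {one_copy:.1f} h per copy, {2 * one_copy:.1f} h for two")
```

At an assumed 100 MB/s, a single 10TB copy already takes more than a full day, so plan the migration for a time when you can leave the machine alone.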

Regular verification should also be part of your process. It is lovely that your backup software presents you with a green check mark, but does that actually mean your backed-up data is where it is supposed to be? A few times a year, and every time you make a change to your backup software, have a look through your backup media to see if recently added files are actually present and readable.
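If browsing through folders by hand sounds tedious, a spot check like this can be scripted. The following Python sketch is one possible approach under my own assumptions, not a feature of any particular backup tool: it takes every file changed in the last 30 days, checks that it exists on the backup, and compares SHA-256 checksums, which also proves that the backup copy can be read end to end. The function names and the 30-day default are hypothetical.

```python
import hashlib
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MB chunks so large RAW files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_recent(source: Path, backup: Path, max_age_days: float = 30) -> list[str]:
    """Return a list of problems for source files changed within max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    problems = []
    for src in sorted(source.rglob("*")):
        if not src.is_file() or src.stat().st_mtime < cutoff:
            continue
        rel = src.relative_to(source).as_posix()
        dst = backup / rel
        if not dst.is_file():
            problems.append(f"missing: {rel}")
            continue
        try:
            same = sha256_of(src) == sha256_of(dst)
        except OSError as err:
            problems.append(f"unreadable: {rel} ({err})")
            continue
        if not same:
            problems.append(f"differs: {rel}")
    return problems
```

An empty list is the only result that should let you sleep well. Keep in mind that a file you edited after the last backup run will legitimately show up as "differs", so run the check right after a backup.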

When it comes to backups, paranoia is your friend!

Summary

The benefit of the three-places system is that it relies entirely on cheaply available components: SATA hard drives. Every few years the latest generation of disks adds another few terabytes of capacity, and I can simply replace the drives used in production and in backups. Three-places covers you against the most common data-loss scenarios short of your street disappearing into a sinkhole, while not breaking the bank.

Bonus tip: When investing in a NAS (Network Attached Storage, a mini-server), get one with at least four bays and RAID 5 or similar redundancy. This will allow you to add more disks as your storage requirements grow. Most devices from reputable companies like QNAP, Synology and Netgear allow you to replace old disks with larger ones on the fly. It’s an extremely convenient system that grows with your needs.

Bonus tip 2: For the SSDs (solid state drives, be it SATA or M.2) in your computer, make sure to keep your production files (RAW images, edited images, catalog files) on a separate physical drive from your operating system. This allows you to add a larger drive later, and move all existing data over, without having to worry about moving your operating system. This approach might not be an option with a laptop, but it’s quite easy with a workstation. 

Benefits of the three-places approach with SATA hard drives:

  • Cheap (SATA HDD)
  • Fast (with USB 3.x devices)
  • Flexible (capacity extension)
  • Covers against the most common data-loss scenarios:
        Single or dual drive failure
        Power surges
        Water damage
        Fire
        Malware
        [Insert natural disaster]
        Theft
        Accidental deletion (until you overwrite your weekly backup)
  • Independent from internet connections
  • Independent from the whim of transnational corporations (in a cloud-only backup scenario)
  • Better sleep

Disadvantages:

  • Full responsibility for your own backups requires ongoing maintenance and checks
  • Some technical ability is required