How Oracle (yes, Oracle!) helped Amazon suck the cost out of database backup

Amazon.com’s massive retail operations require a ton of database power and related database backup. All that tape backup along with the robots and specialized software required to run them cost significant cash.

The giant retailer uses Oracle databases and Oracle Recovery Manager (RMAN) to automate backup using robotic tape library systems. The cost of all that backup paraphernalia was non-trivial, so when Oracle brought out Oracle 10g with updates that let RMAN back up directly to Amazon’s S3 storage service, Amazon saw an opportunity to cut out the backup middleman, er, robot.

According to a recently released AWS white paper describing the process:

“We incurred significant capital expenditures over the years for tape hardware, data center space for this hardware, and licensing fees for tape software. In addition, it’s been difficult for us to hire engineers with the requisite experience for operating such hardware. We knew that Amazon S3 could reduce these costs to near zero.”

The problem was that while tape robots perform basic reads and writes, Amazon also needed sophisticated (read: pricey) tape backup software to provide additional capabilities. A move to S3 would obviate that need.

Amazon Web Services, acting as IT supplier to its parent company, tested the new scenario using an Oracle white paper as a starting point, comparing the relative costs, data security, availability and durability of the old and new configurations. The decision was made to make the move.

The transition to S3-based backup started last year and by summer, 30 percent of backups were on S3; three months later it was 50 percent. The company expects the transition to be done by year’s end, except for databases in regions where Amazon S3 is not available.

The diagrams below outline the old vs. new setup.

One plus is that Amazon.com DBAs like the change. As Amazon.com grows, so do its databases. That causes tape backup and restore operations to take much longer. With S3 backups “DBAs don’t have to contend for available tape drives anymore, and in the case of disaster recovery, they don’t have to wait for hours while a backup is restored from tape,” said Dalibor Marceta, database engineering manager for Amazon Merchant Technologies, according to the AWS white paper.

Amazon cautions that cost comparisons are tricky. For tape backup you have to consider overall spending on the hardware, backup software and the tape itself, then factor in depreciation and the cost of adding hardware over time, plus the cost of retaining people to physically manage and maintain the hardware and resolve contention issues. Amazon S3 cost, by comparison, can be found by multiplying the size of database backups by S3’s per-GB price and by the frequency of backups, then adding the AWS bandwidth charges for outbound data.
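For readers who want to plug in their own numbers, the S3 side of that comparison can be sketched in a few lines of Python. This is only a rough illustration of the arithmetic as described above, not AWS's actual billing model; the function name, parameter names and every figure in the example call are hypothetical placeholders rather than real S3 prices or Amazon's backup volumes.

```python
def estimate_s3_backup_cost(backup_size_gb, price_per_gb, backups_per_period,
                            outbound_gb=0.0, outbound_price_per_gb=0.0):
    """Rough S3 backup cost for one billing period.

    Follows the article's description: backup size x per-GB price x
    backup frequency, plus bandwidth charges for outbound data.
    All inputs are assumptions supplied by the caller.
    """
    # Storage: every backup taken in the period is billed at the per-GB rate.
    storage_cost = backup_size_gb * price_per_gb * backups_per_period
    # Bandwidth: only data transferred out of AWS (e.g. restores) is charged here.
    bandwidth_cost = outbound_gb * outbound_price_per_gb
    return storage_cost + bandwidth_cost


if __name__ == "__main__":
    # Placeholder inputs: a 1,000 GB backup taken 4 times in the period,
    # with 50 GB restored out of S3. The prices are made up for illustration.
    cost = estimate_s3_backup_cost(1000, 0.10, 4,
                                   outbound_gb=50, outbound_price_per_gb=0.12)
    print(f"Estimated S3 cost for the period: ${cost:,.2f}")
```

The tape side has no equally tidy formula, which is the point of Amazon's caution: depreciation, staffing and contention costs all have to be estimated separately.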

In this case, Amazon estimates it’s saving more than $1 million a year in hardware and software costs and, perhaps more importantly, it no longer has to hammer out separate contracts with tape hardware and software vendors. AWS also said it takes less than half the time to back up a database to S3 compared to tape.


GigaOM