I've been exploring various aspects of cloud computing lately and opened an Amazon S3 account to check it out. Initially, I used the S3Fox add-on for Firefox to create a couple of buckets and upload a small test file. Then I started exploring S3's SOAP APIs.
Normally I'd use Java, but since I've been learning Ruby I figured I'd try that instead. It turned out to be quicker, smaller, and easier than Java would have been. Why? Ruby ships with easy-to-use SOAP and crypto modules in its standard library. Yes, Java has crypto in its API, but it doesn't have built-in, turnkey SOAP client handling. There are well-known JARs for SOAP functionality, and they're not difficult to use, but they aren't as simple and turnkey as Ruby's.
Here I describe this simple little program. It may help others to get started with S3, Amazon APIs, or the Ruby SOAP or crypto modules.
First I wrote a short Ruby module for common stuff needed for Amazon S3.
It contains constants for the API WSDL URL, the account ID, etc.
S3 requires a specially formatted timestamp and an HMAC-SHA1 signature, computed with the account's secret key, on every API call.
The module also has a few methods to do this dirty work.
Check it out:
S3util.rb
In researching this online, I found several different ways of producing these keys, all of which claimed to work, but only one of which actually did work for me.
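For anyone who wants to see the shape of the signing step, here is a minimal sketch. It is not the S3util.rb module itself. It assumes the S3 SOAP scheme in which the signature is the Base64-encoded HMAC-SHA1 of the string "AmazonS3" + operation name + timestamp, keyed with the secret access key. The module name S3SignSketch, its method names, and the placeholder credentials are made up for illustration, and the parameter names AWSAccessKeyId, Timestamp, and Signature are my understanding of the S3 SOAP API rather than something verified here.

```ruby
require 'openssl'

# Hypothetical helper module for illustration only.
module S3SignSketch
  # UTC timestamp with milliseconds, e.g. 2009-04-10T19:28:43.000Z
  def self.timestamp(now = Time.now)
    now.utc.strftime('%Y-%m-%dT%H:%M:%S.%LZ')
  end

  # Assumed scheme: Base64(HMAC-SHA1(secret, "AmazonS3" + operation + timestamp))
  def self.signature(secret_key, operation, timestamp)
    string_to_sign = "AmazonS3#{operation}#{timestamp}"
    digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_key, string_to_sign)
    [digest].pack('m0') # Base64 without line breaks
  end

  # Authentication fields assumed to accompany each SOAP call.
  def self.auth_params(access_key, secret_key, operation, now = Time.now)
    ts = timestamp(now)
    {
      'AWSAccessKeyId' => access_key,
      'Timestamp'      => ts,
      'Signature'      => signature(secret_key, operation, ts)
    }
  end
end

# Placeholder credentials, not real ones.
params = S3SignSketch.auth_params('MY-ACCESS-KEY', 'my-secret-key', 'ListAllMyBuckets')
puts params.inspect
```

The timestamp used in the signature must be the same string sent in the request, which is why auth_params computes it once and reuses it.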
Also, Amazon's documentation was inadequate. Perhaps I missed something, but if I did then it wasn't easy to find as I spent a couple of hours digging through it. For example:
The next part is using the Ruby SOAP module to call the S3 SOAP APIs.
I chose the simplest operations, "ListAllMyBuckets" and "ListBucket(name)".
Here's the code:
listBuckets.rb
Example session (the warnings come from Ruby's SOAP::WSDLDriverFactory reading the WSDL):
S:\ruby\AmazonS3>ruby listBuckets.rb
USAGE: listBucket [-list] [bucketname]
S:\ruby\AmazonS3>ruby listBuckets.rb -list
ignored attr: {}abstract
warning: peer certificate won't be verified in this SSL session
S3 Buckets for this account
Owner: mrclem3
Bucket: mclements.net-data
Bucket: mclements.net-dnd
S:\ruby\AmazonS3>ruby listBuckets.rb mclements.net-dnd
ignored attr: {}abstract
warning: peer certificate won't be verified in this SSL session
S3 bucket: mclements.net-dnd
Name: mclements.net-dnd
Keys: 1000
Partial?: false
Contents:
Key: AbilScoreAdj.dat
Date: 2009-04-10T19:28:43.000Z
Size: 2905
Owner: mrclem3
Class: STANDARD
Key: build.xml
Date: 2009-04-13T18:22:08.000Z
Size: 1237
Owner: mrclem3
Class: STANDARD
S:\ruby\AmazonS3>
Two things are interesting about this code. First, it shows how easy it is to make SOAP calls from Ruby. Second, you can see in the Ruby code that the structure of the SOAP reply is problematic.
I don't know whether this is a problem in Amazon S3 or in the Ruby SOAP package, but the structure of the reply object differs depending on whether there is exactly one bucket or more than one. According to the WSDL, the reply should contain an array (collection, whatever) of buckets. But if there is only one, in Ruby it has to be accessed as a simple child of the reply, not as a container. If you try to access it as a container, it fails. If there is more than one, they must be accessed as a container. I suspect this is a problem in the Ruby SOAP module, but who knows?
My solution was to print the multiples first (as this is the most common case). If there is only one, this raises a Ruby exception. I catch that exception and print the single. Optimize for the common case and handle the special case - kludgy, but it works.
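A less kludgy alternative might be to normalize the reply before iterating, so that one bucket and many buckets take the same code path. Here is a rough sketch using plain Ruby objects as stand-ins for the SOAP reply. The Bucket struct and the as_list helper are hypothetical, and the sketch assumes the multi-bucket case arrives as a real Ruby Array, which I have not checked against the SOAP module.

```ruby
# Stand-in for one reply element; the real one comes from the Ruby SOAP module.
Bucket = Struct.new(:name)

# Return a list whether the reply held nothing, one bare element, or an array.
def as_list(value)
  return [] if value.nil?
  value.is_a?(Array) ? value : [value]
end

single = Bucket.new('mclements.net-data')
many   = [Bucket.new('mclements.net-data'), Bucket.new('mclements.net-dnd')]

# Both cases now go through the same loop, with no exception handling.
as_list(single).each { |b| puts "Bucket: #{b.name}" }
as_list(many).each   { |b| puts "Bucket: #{b.name}" }
```

The explicit Array check is deliberate: Kernel#Array would call to_a on the element, and a Struct (or another reply object that responds to to_a) would be split into its fields instead of being wrapped.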