In last Friday's post, I briefly introduced Microsoft Azure and talked about Azure Storage. Today I'm going to continue with Azure Storage by looking at security and how it's implemented. If you want to follow along, download a copy of my AzureCommand class.
There are many ways to implement security on the web. The original SQL Server Data Services that Microsoft debuted in 2008 used basic auth. For whatever reason, Microsoft decided that wasn't the way to go, so they implemented two different security schemes for Azure: one for Table Storage and one for both Queues and Blobs. They share some similarities, though, so we'll talk about them both at a high level before we dig into the specifics.
Account and Access Key
The first thing that's required is the account information provided at the Microsoft Azure Provisioning Portal. If you don't have an account, you can sign up for one and get started. When you create a Storage Project, you'll be provided with three key pieces of data: Endpoints, Primary Access Key and Secondary Access Key. The Endpoints are URIs that point to this project's instances of the storage capabilities. These are:
Endpoints:
http://{account}.blob.core.windows.net/
http://{account}.queue.core.windows.net/
http://{account}.table.core.windows.net/
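If you'd rather build these in code than paste them around, here's a minimal sketch that fills in the pattern above. The class and method names are just placeholders I made up for illustration, and it assumes the service name is one of "blob", "queue" or "table".

```csharp
using System;

public static class StorageEndpoints
{
    // Builds the endpoint for one service ("blob", "queue" or "table"),
    // following the http://{account}.{service}.core.windows.net/ pattern.
    // Hypothetical helper: not part of any Microsoft library.
    public static Uri For(string account, string service)
    {
        return new Uri(string.Format("http://{0}.{1}.core.windows.net/", account, service));
    }
}
```

Calling StorageEndpoints.For("myaccount", "table") gives you the Table endpoint for an account named myaccount.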
The second piece is the Access Key, which comes as a Primary and a Secondary one. They look something like this:
+13B6vI9rTISMIYh2MRqLVqEsd41uJzPcX8vGLBvRu0IcxoQTbgbZCD3yqTFF+cD1dIyl04dZhn46anD4egWDg==
On the plus side, you can regenerate these whenever you want (so the one above is no longer valid). On the negative side, you need to share the key with any application that needs access to the storage. That means the key has to live somewhere it can be easily changed, but if it isn't encrypted there, it won't do much good. Also, there's really only one authentication role. The API Documentation makes it sound like the plan is to have more authentication roles available, but they aren't there yet.
Azure uses these two pieces of information, along with a couple of other pieces, to create a hash to determine if you're allowed to access the information.
Verb
The first of these other pieces of information is the HTTP Method (verb) you will be using:
- GET
- HEAD
- POST
- PUT
- DELETE
- TRACE
- MERGE (kind of - this isn't a standard HTTP Method, though there's a push to make it one)
Content-MD5, Content-Type and Date
The next piece of data is the Content-MD5. This is an MD5 hash of the body, if one is being sent. Next is the Content-Type of the data being sent and the Date (in UTC format). The first two can be empty strings.
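Here's a minimal sketch of how those two values could be produced. It assumes the body is UTF-8 text and that the hash is Base64 encoded, which is how the HTTP Content-MD5 header is defined; the class and method names are placeholders of mine, not part of any Azure library.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class RequestPieces
{
    // Base64-encoded MD5 hash of the request body.
    // Returns an empty string when no body is being sent.
    // Assumption: the body is encoded as UTF-8.
    public static string ContentMd5(string body)
    {
        if (string.IsNullOrEmpty(body))
        {
            return string.Empty;
        }
        using (MD5 md5 = MD5.Create())
        {
            return Convert.ToBase64String(md5.ComputeHash(Encoding.UTF8.GetBytes(body)));
        }
    }

    // Formats a UTC timestamp in the RFC 1123 style used by HTTP headers,
    // for example "Sun, 09 Nov 2008 12:00:00 GMT".
    // The "R" format does not convert time zones, so pass a UTC value.
    public static string HttpDate(DateTime utc)
    {
        return utc.ToString("R", CultureInfo.InvariantCulture);
    }
}
```

In a real request you would pass DateTime.UtcNow to HttpDate and the serialized body (if any) to ContentMd5.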
Canonicalization
Then come two of the more interesting items, CanonicalizedHeaders and a CanonicalizedResource.
Canon, as used from church law to the Star Trek Universe, is really little more than the standard for how things are. In the case of CanonicalizedHeaders and the CanonicalizedResource, it's how they should be created in order to be consistent, for the purposes of hashing data. To canonicalize headers, we first take only those headers that start with x-ms-. Next, we lower-case the header names and sort the headers by name. If there's a duplicate name, append the value of the second header entry to the end of the first, separated by a comma. And make sure that all values are trimmed. And... Actually, let's look at the code; that would be easier:
// Namespaces this method relies on
using System;
using System.Collections;
using System.Net;
using System.Text;

/*
Constructing the CanonicalizedHeaders Element
To construct the CanonicalizedHeaders portion of the string required for the signature, follow these steps:
1. Retrieve all headers for the resource that begin with x-ms-, including the x-ms-date header.
2. Convert each HTTP header name to lowercase.
3. Sort the container of headers lexicographically by header name, in ascending order.
4. Combine headers with the same name into one header. The resulting header should be a name-value pair of the format "header-name:comma-separated-value-list", without any white space between values.
   Important: The comma-separated list of headers is not ordered by the header values but by the order in which the headers appear in the request. The list of headers must be in the correct order to properly authenticate the request.
5. Replace any breaking white space with a single space.
6. Trim any white space around the colon in the header.
7. Finally, append a new line character to each canonicalized header in the resulting list.
Construct the CanonicalizedHeaders element by concatenating all headers in this list into a single string.
*/
public string CanonicalizeHeaders(WebHeaderCollection hdrCollection)
{
    StringBuilder retVal = new StringBuilder();

    // Look for header names that start with "x-ms-",
    // then sort them by their lower-cased names.
    ArrayList httpStorageHeaderNameArray = new ArrayList();
    Hashtable ht = new Hashtable();
    foreach (string key in hdrCollection.AllKeys)
    {
        string name = key.ToLowerInvariant();
        if (name.StartsWith("x-ms-", StringComparison.Ordinal))
        {
            // Strip line breaks and surrounding white space from the value.
            string value = hdrCollection[key].Replace("\n", string.Empty).Replace("\r", string.Empty).Trim();
            if (ht.Contains(name))
            {
                // Duplicate name: append the new value after a comma.
                ht[name] = string.Format("{0},{1}", ht[name], value);
            }
            else
            {
                httpStorageHeaderNameArray.Add(name);
                ht.Add(name, value);
            }
        }
    }

    // Ordinal comparison keeps the sort independent of the current culture.
    httpStorageHeaderNameArray.Sort(StringComparer.Ordinal);

    // Now go through each header's values in the sorted order and append them to the canonicalized string.
    foreach (string key in httpStorageHeaderNameArray)
    {
        retVal.AppendFormat("{0}:{1}\n", key.Trim(), ht[key].ToString());
    }
    return retVal.ToString();
}
With the CanonicalizedHeaders done, you are left with the CanonicalizedResource. This is a version of the Uri you are using. The short version is "/{accountname}/{rest of the Uri from the root, except parameters, unless the parameter is 'comp'}". My first reaction to that was Huh? The second was that I was beginning to understand why REST folks were saying this was REST-ful or REST-like instead of a REST interface. My third reaction was, I'll bypass the hard version of this. Which is why the code has some workarounds for getting COMPONENTs of an object, which is what comp stands for. The current library isn't as usable as I'd like, but it works for now and will get improved as I go along.
So, for an example of how to do this, let's look at the Url and CanonicalUrl for a couple of requests. If you want to get a list of tables, you make a GET call to http://{account}.table.core.windows.net/Tables. The CanonicalUrl for this is /{account}/Tables. That's simple enough. But let's look at something a bit more complex, an ATS Query.
If you have a table named Demonstrations and want to query all of the data with a PartitionKey of CalendarEntry, then your Url would be http://{accountname}.table.core.windows.net/demonstrations()?$filter=(PartitionKey eq 'CalendarEntry'). But, since this Url uses parameters that aren't comp parameters, the CanonicalizedUrl is /{accountname}/demonstrations(). I've not implemented anything to canonicalize the Url, but you can expect to find that sometime this week.
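In the meantime, here's a minimal sketch of what that canonicalization could look like, based only on the rule described above: the account name, then the path, then the comp parameter if there is one. The class and method names are hypothetical, and this is not the version that will end up in the AzureCommand class.

```csharp
using System;

public static class ResourceCanonicalizer
{
    // Returns "/{account}{path}", followed by "?comp=value" when the
    // query string contains a comp parameter. Every other query
    // parameter (such as $filter) is dropped.
    // Hypothetical helper, sketched from the rule described in this post.
    public static string Canonicalize(string account, Uri uri)
    {
        string result = "/" + account + uri.AbsolutePath;

        // Uri.Query is either empty or starts with '?'.
        string query = uri.Query.TrimStart('?');
        foreach (string pair in query.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2 && string.Equals(parts[0], "comp", StringComparison.OrdinalIgnoreCase))
            {
                result += "?comp=" + parts[1];
                break;
            }
        }
        return result;
    }
}
```

With an account named myaccount, the Tables request from earlier comes out as /myaccount/Tables, and a blob request ending in ?comp=list keeps that parameter on the end.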
Putting it all Together
In the meantime, we've got all of the pieces to put together the special secret hash key. First we create a string containing the VERB, Content-MD5, Content-Type, Date, CanonicalizedHeaders and CanonicalizedResource, each on a separate line. The easiest way to do this is through use of string.Format, so I start with my formatted string and turn it into an encoded byte array. Then I create a hasher based on the HMACSHA256 algorithm, keyed with the access key, and return a string in the format of "SharedKey {account}:{hashed data}" that needs to be added to the header collection as a value named "authorization".
string fmtStringToSign = "{0}\n{1}\n{2}\n{3:R}\n{4}{5}";

// {4} is the header which always returns at least one \n
string hdr = CanonicalizeHeaders(client.RequestHeaders);
string authValue = string.Format(fmtStringToSign, method, contentMD5, contentType, "", hdr, resource);
byte[] signatureByteForm = System.Text.Encoding.UTF8.GetBytes(authValue);
System.Security.Cryptography.HMACSHA256 hasher = new System.Security.Cryptography.HMACSHA256(Convert.FromBase64String(auth.SharedKey));

// Now build the Authorization header
String authHeader = String.Format(CultureInfo.InvariantCulture, "{0} {1}:{2}", "SharedKey",
    auth.Account, System.Convert.ToBase64String(hasher.ComputeHash(signatureByteForm)));

Now, that only works for Blob Storage and Queue Storage. If you want to use Tables, that's a much simpler path. It doesn't worry about canonicalized Headers and uses a simpler SHA hash method but works in a similar fashion.
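If you want to try the Blob and Queue signing steps without wiring up a whole request, here's a self-contained sketch of the same idea. The class name, method names and parameters are placeholders I chose for illustration, and it assumes the canonicalized headers string already ends with a newline.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class SharedKeySigner
{
    // Builds the string to sign: verb, Content-MD5, Content-Type and Date on
    // their own lines, then the canonicalized headers and resource.
    // Assumption: canonicalizedHeaders already ends with "\n" (or is empty).
    public static string BuildStringToSign(string method, string contentMd5, string contentType,
        string date, string canonicalizedHeaders, string canonicalizedResource)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0}\n{1}\n{2}\n{3}\n{4}{5}",
            method, contentMd5, contentType, date, canonicalizedHeaders, canonicalizedResource);
    }

    // Signs the string with HMAC-SHA256, keyed with the Base64-decoded access
    // key, and formats the value for the authorization header.
    public static string BuildAuthorizationHeader(string account, string base64Key, string stringToSign)
    {
        using (HMACSHA256 hasher = new HMACSHA256(Convert.FromBase64String(base64Key)))
        {
            byte[] hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return string.Format(CultureInfo.InvariantCulture, "SharedKey {0}:{1}",
                account, Convert.ToBase64String(hash));
        }
    }
}
```

To experiment, pass a dummy key such as Convert.ToBase64String(new byte[32]) along with a made-up account name. The result has the "SharedKey {account}:{hashed data}" shape, but it will only authenticate against the real service when the key and every signed value match what is actually sent.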
Final thoughts
I understand how to create the security necessary for accessing Azure Storage, and I hope you do as well. But there are a couple of things to think about.
First, there are only two keys to access the data and both have what amounts to Administrative rights. Access takes the account name and either the primary or secondary key. You can generate a new key, but that will break anything that uses the existing key. Ok, before anyone else points it out, Blob storage does allow for anonymous access to an image, but that's still a far cry from any type of reasonable security. While I could write a .NET WinForm application to use Azure Storage data, there is no good way to secure the access.
I'd love to have someone from Microsoft explain why they are using this form of authentication. I know there has to be some reason behind it but it doesn't make sense to me. I think it's overly complex and doesn't actually provide any extra layer of security. I'm willing to be wrong, but I've not seen any explanation of why it's there.
So, I hope this has been valuable to you. Comment or contact me via Twitter and stay tuned for tomorrow's installment on Azure Table Storage.