Wednesday, February 08, 2012
 
Cloud Computing Thoughts
 
 
 
  
 

Written by: Josef Finsel
6/23/2009 6:35 PM 

So far, this series has covered an Introduction to Microsoft Azure, the way that Azure Storage handles Security, an Introduction to Azure Table Storage, more on Azure Table Storage and Azure Queue Storage. Today I venture into Azure Blob Storage. If you want to follow along, download a copy of my AzureCommand class. You also might want to create a Microsoft Azure account. I should state that I am not looking at the locally hosted development storage, only at the cloud-hosted one.

When we talk about BLOBs, we are talking about Binary Large Objects, not an amoeba-like alien that terrorized Downingtown, PA (although it appears that a blob is a blob and not really a BLOb). BLOBs can generally be thought of as files. These files exist in...

Containers

Containers in Azure Blob Storage (ABS) are defined by URIs. A container can only exist at the root, so the URI for a container is:

http://{account}.blob.core.windows.net/{container}

Container names have some minor requirements. The container name must be a valid DNS name, conforming to the following rules:

  • Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.
  • Every dash (-) character must be immediately preceded and followed by a letter or number.
  • All letters in a container name must be lowercase.
  • Container names must be from 3 through 63 characters long.

Failing to follow these rules will result in getting a status code 400 (Bad Request) or 403 (Forbidden), depending on how malformed the request is. For instance, attempting to create a queue called "-invalid-queue-name" will return a 400 (Bad Request) but "-invalid queue-name" returns 403 (Forbidden). This is an inconsistency that will cause you to tear your hair out trying to track down an authentication bug that isn't!

Yes, I said the same thing in my post on Queues, but it applies here as it's the same code base and the same problem.
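
Because a malformed name can come back as a 403 that looks like an authentication problem, it may be worth checking the name locally before sending anything. The following is a minimal sketch of the four naming rules listed above; the class name ContainerNames and the method IsValid are made up for illustration and are not part of the AzureCommand class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContainerNames
{
    // One or more runs of lowercase letters or digits, separated by single dashes.
    // This covers: starts with a letter or number, only letters/numbers/dashes,
    // every dash surrounded by a letter or number, and lowercase only.
    private static readonly Regex Pattern =
        new Regex(@"^[a-z0-9]+(-[a-z0-9]+)*\z", RegexOptions.CultureInvariant);

    public static bool IsValid(string name)
    {
        // Length rule: from 3 through 63 characters.
        if (name == null || name.Length < 3 || name.Length > 63)
            return false;
        return Pattern.IsMatch(name);
    }
}
```

With this in place, ContainerNames.IsValid("-invalid-queue-name") and ContainerNames.IsValid("-invalid queue-name") both return false before any request is made, so the 400 versus 403 question never comes up.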

There are three basic things you can do to a Container: create a new Container (PUT), delete an existing Container (DELETE) and get some of the information about a Container (GET). In addition, you can save and update Metadata associated with a Container and set whether or not it supports anonymous GET access.

Creating a Container

Creating a Container is easy (as long as it is named correctly). Execute a PUT against http://{account}.blob.core.windows.net/{Container} (CanonicalUrl is /{account}/{Container}). You can include Metadata headers (see below) but they are not required. Successfully creating a Container returns a 201 (Created).
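
Every Container operation in this post pairs a request URI with a CanonicalUrl, so a small helper that builds both from the same inputs can keep them from drifting apart. This is only a sketch: it uses the blob.core.windows.net host from the container URI given earlier, and the names ContainerAddress, RequestUri and CanonicalResource are hypothetical. The optional comp argument stands for the ?comp=... component used by some operations.

```csharp
using System;

public static class ContainerAddress
{
    // Full request URI: http://{account}.blob.core.windows.net/{container}[?comp=...]
    public static Uri RequestUri(string account, string container, string comp = null)
    {
        string query = comp == null ? "" : "?comp=" + comp;
        return new Uri("http://" + account + ".blob.core.windows.net/" + container + query);
    }

    // Canonical form used when signing the request: /{account}/{container}[?comp=...]
    public static string CanonicalResource(string account, string container, string comp = null)
    {
        string query = comp == null ? "" : "?comp=" + comp;
        return "/" + account + "/" + container + query;
    }
}
```

For an account called myaccount and a Container called images, RequestUri gives http://myaccount.blob.core.windows.net/images and CanonicalResource gives /myaccount/images.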

Getting Information About a Container

Getting information about a Container is easy. Execute a GET or a HEAD against http://{account}.blob.core.windows.net/{Container} (CanonicalUrl is /{account}/{Container}). The difference between HEAD and GET is that responses to HEAD calls never return a body and are more efficient, especially since all of the data about a Container is in the headers. In this case that would be any header that begins with x-ms-meta-, which is how ABS stores and returns Metadata.

Deleting a Container

Deleting a Container is easy. Execute a DELETE against http://{account}.blob.core.windows.net/{Container} (CanonicalUrl is /{account}/{Container}). Successfully deleting a Container returns a 202 (Accepted).

NOTE: Deleting a Container deletes all BLOBs in the Container. You are given no warning that BLOBs exist in the Container at the time you delete it!

Container Metadata

Container Metadata is handled through HttpHeaders. Any header you create that begins with x-ms-meta- will be treated as Metadata and returned when you execute a GET or a HEAD against the Container. There is no way to update a single value: all you can do is replace all of the Metadata on the Container with a new set by executing a PUT against http://{account}.blob.core.windows.net/{Container}?comp=metadata (CanonicalUrl is /{account}/{Container}?comp=metadata). My sample application and web interface both show a Delete Metadata option, but really I'm just executing the same request with a blank set of Metadata, which effectively removes all Metadata headers.
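
Since Metadata is nothing more than headers with a fixed prefix, moving between plain name/value pairs and headers is mechanical. The sketch below shows both directions using ordinary dictionaries in place of a real header collection; ContainerMetadata, ToHeaders and FromHeaders are illustrative names, not something the service or the AzureCommand class provides.

```csharp
using System;
using System.Collections.Generic;

public static class ContainerMetadata
{
    private const string Prefix = "x-ms-meta-";

    // Turn name/value pairs into the headers to send with the request.
    // An empty set produces no headers, which is how "delete" works.
    public static Dictionary<string, string> ToHeaders(IDictionary<string, string> metadata)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var pair in metadata)
            headers[Prefix + pair.Key] = pair.Value;
        return headers;
    }

    // Pick the Metadata back out of a set of response headers,
    // ignoring everything that does not start with x-ms-meta-.
    public static Dictionary<string, string> FromHeaders(IDictionary<string, string> headers)
    {
        var metadata = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var pair in headers)
        {
            if (pair.Key.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
                metadata[pair.Key.Substring(Prefix.Length)] = pair.Value;
        }
        return metadata;
    }
}
```

Header names are compared case-insensitively here because HTTP header names are not case sensitive.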

Setting the ACL on a Container

ACL stands for Access Control List. For ABS Containers it really means that you're setting a flag that denotes whether BLOBs in the Container can be downloaded without authentication (thus appearing as standard GET requests to a browser). One example of this is the image below:

This image actually resides on one of my ABS accounts where the ACL has been set to allow the public to download it. Setting and determining the ACL is handled through HttpHeaders, similar to the way Metadata is handled. The specific header is x-ms-prop-publicaccess. Execute a PUT against http://{account}.blob.core.windows.net/{Container}?comp=acl (CanonicalUrl is /{account}/{Container}?comp=acl) with the HttpHeader x-ms-prop-publicaccess set to true to grant access and false to deny it. Executing a GET against the same URI (with the same CanonicalUrl) is the only way to retrieve that header's value, and that request must be authenticated.

Now that we have created a Container, we can store information in it, specifically...

BLOBs

BLOBs have some interesting and curious parameters. For one thing, the largest Blob you can store in ABS is 50 GB. But any attempt to store a Blob larger than 64 MB requires uploading it in increments of up to 4 MB. That means you can upload a file of up to 64 MB with a single request, but anything larger will require you to split it up. You can Create (PUT) a Blob, Delete (DELETE) a Blob, Retrieve (HEAD) information about a Blob, Get (GET) a Blob, List the Blobs in a Container and Retrieve and Set Metadata about a Blob. We'll cover these and then I'll go into special cases like handling Blobs larger than 64 MB using Blocks and Copying a Blob. But first, let's talk about...
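
To make the arithmetic concrete, here is a small sketch that decides whether a Blob fits in a single request and, if not, how many full-size increments it needs. It takes the 64 MB and 4 MB figures above at face value and assumes 1 MB means 1024 * 1024 bytes; the names BlobUploadPlan, NeedsBlocks and BlockCount are made up for this example.

```csharp
using System;

public static class BlobUploadPlan
{
    public const long SinglePutLimit = 64L * 1024 * 1024; // 64 MB, assuming binary megabytes
    public const long BlockSize = 4L * 1024 * 1024;       // 4 MB per increment

    // True when the Blob is too large for a single request.
    public static bool NeedsBlocks(long blobLength)
    {
        return blobLength > SinglePutLimit;
    }

    // Number of 4 MB increments needed to cover the Blob (the last one may be partial).
    public static long BlockCount(long blobLength)
    {
        if (blobLength < 0)
            throw new ArgumentOutOfRangeException(nameof(blobLength));
        return (blobLength + BlockSize - 1) / BlockSize; // integer ceiling
    }
}
```

Under those assumptions a 100 MB file needs 25 increments, and a 65 MB file needs 17 (sixteen full ones plus a 1 MB remainder).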

Naming Blobs

Blobs are not exactly files, Containers are not exactly folders, and Containers cannot be nested. That means that you can't create a folder hierarchy that matches a physical folder hierarchy on your computer. But you can create a virtual one. The Blob name is everything after the Container name and a /, and you can include / in the Blob name. If you create a Blob named 200905/onepic.jpg and another called 200906/onepic.jpg, both within the Container images, then they will appear, to all intents and purposes, to be two images in two separate physical folders. Not only that, but listing Blobs within a Container supports the ability to list directories, kind of.

Listing Blobs in a Container

Getting a list of Blobs requires making a request against a Container. Execute a GET against http://{account}.blob.core.windows.net/{Container}?comp=list (CanonicalUrl is /{account}/{Container}?comp=list). If you use no additional URI parameters, you'll get a list of everything in the Container (up to 5,000 entries). But there are four URI parameters that you can specify, two that help with filtering the Blob list and two that help manage how many Blobs are returned. The first parameter is maxresults, which limits the number of Blobs in the list. Any value more than 5000 will result in a status of 400 (Bad Request). Any number between 1 and 5000 will result in getting at most that number of Blobs returned. If there are more Blobs than the number requested, the response includes a NextMarker element. To handle paging, pass that value as the marker parameter on the next request; it tells ABS where in the list to start returning data. With maxresults and marker you can page through a list of Blobs in a Container.

Filtering of the Blob list is handled with Prefix and Delimiter. Prefix filters the list so that only Blob items that start with the prefix value appear in the list. Delimiter returns all Blobs that start with the specified prefix and end with the delimiter. Given these two options, you might think you can create and walk virtual directories, but it's not that easy. These are a little more complicated, so let's take a concrete example. Let's say your Container has the following Blobs defined in it:
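
The list request is easiest to get right if the query string is assembled in one place. The sketch below builds the URI from the four parameters (maxresults, marker, prefix and delimiter), escapes the values, and rejects a maxresults value outside 1 to 5000 before the service has a chance to answer with a 400. The host is the blob.core.windows.net one from the container URI given earlier, and BlobListUri.Build is a hypothetical helper rather than part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class BlobListUri
{
    public static string Build(string account, string container,
        int? maxResults = null, string marker = null,
        string prefix = null, string delimiter = null)
    {
        if (maxResults.HasValue && (maxResults.Value < 1 || maxResults.Value > 5000))
            throw new ArgumentOutOfRangeException(nameof(maxResults), "Must be between 1 and 5000.");

        // comp=list is always present; the other parameters are optional.
        var parts = new List<string> { "comp=list" };
        if (maxResults.HasValue) parts.Add("maxresults=" + maxResults.Value);
        if (!string.IsNullOrEmpty(marker)) parts.Add("marker=" + Uri.EscapeDataString(marker));
        if (!string.IsNullOrEmpty(prefix)) parts.Add("prefix=" + Uri.EscapeDataString(prefix));
        if (!string.IsNullOrEmpty(delimiter)) parts.Add("delimiter=" + Uri.EscapeDataString(delimiter));

        return "http://" + account + ".blob.core.windows.net/" + container
            + "?" + string.Join("&", parts);
    }
}
```

Paging then amounts to calling Build again with the NextMarker value from the previous response as the marker argument, until a response comes back without one.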

  • 1/CalendarEntry.xml
  • 1/2/CalendarEntry.xml
  • 1/2/3/CalendarEntry.xml

A request for Prefix=1/ would return all three entries but adding delimiter=/ doesn't return anything. If you want to use virtual directories then you really need to add a couple more blobs to act as directory holders. Make them 0 length Blobs and have the Blob end with / and you'll be all set. Now your list of Blobs will look like this:

  • 1/
  • 1/2/
  • 1/2/3/
  • 1/CalendarEntry.xml
  • 1/2/CalendarEntry.xml
  • 1/2/3/CalendarEntry.xml

And you can use the Prefix and Delimiter to get lists of "directories" and then lists of  Blobs in those "directories".
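
As a purely local illustration of the placeholder scheme (this filters a list of names in memory and does not call ABS or reproduce its filtering), the sketch below finds the entries that sit directly under a given "directory". The names VirtualDirectories, Children and IsDirectory are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VirtualDirectories
{
    // Names directly under the prefix: after the prefix is removed, the rest
    // contains no "/" at all (a Blob) or only a trailing one (a placeholder).
    public static List<string> Children(IEnumerable<string> blobNames, string prefix)
    {
        return blobNames
            .Where(n => n.Length > prefix.Length && n.StartsWith(prefix, StringComparison.Ordinal))
            .Where(n =>
            {
                string rest = n.Substring(prefix.Length);
                int slash = rest.IndexOf('/');
                return slash < 0 || slash == rest.Length - 1;
            })
            .ToList();
    }

    // By the convention above, a placeholder "directory" is a Blob whose name ends with "/".
    public static bool IsDirectory(string blobName)
    {
        return blobName.EndsWith("/", StringComparison.Ordinal);
    }
}
```

Given the six names listed above, Children(names, "1/") returns 1/2/ and 1/CalendarEntry.xml, and IsDirectory tells the two apart.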

Getting a Blob

Execute a GET against http://{account}.blob.core.windows.net/{Container}/{Blob} (Canonical URL is /{account}/{Container}/{Blob}). That's really all there is to it. If the Container is public, you don't even need to use authorization. The Blob will be returned in the body of the HttpResponse, just as it is with the image above. The HttpResponseHeaders will contain the Content-Length (how many bytes long the object is), the Content-Type (what the MIME content type is) and any metadata you have associated with the object. If the Container's ACL doesn't allow public access, then you will need to include the authorization header. Returns 200 (OK).

Getting Information about a Blob

Getting information about a Blob is done using HEAD rather than GET so it doesn't return the actual Blob itself.

Deleting a Blob

To delete a Blob, execute a DELETE against http://{account}.blob.core.windows.net/{Container}/{Blob} (Canonical URL is /{account}/{Container}/{Blob}). Success returns a 202 (Accepted).

Creating a Blob

Creating a Blob that's less than 64 MB in size is as simple as executing a PUT against http://{account}.blob.core.windows.net/{Container}/{Blob} (Canonical URL is /{account}/{Container}/{Blob}) and including the Content-Type and any metadata you want to use. The short version looks like this, after handling all of the authorization, metadata and headers:

 

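// req is an HttpWebRequest already set up with Method = "PUT", authorization and headers.
using (Stream requestStream = req.GetRequestStream())
{
    requestStream.Write(Content, 0, (int)ContentLength); // Content is the Blob as a byte[]
}
using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
{
    // Success is a 201 (Created); disposing the response closes the connection.
}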

It takes the Blob in the form of a byte array and passes it to the HttpRequest.RequestStream as a stream. And that's it. Really, it's that simple. If you want to handle any size file, however, you'll need to support Blocks. But we'll talk about that tomorrow, as this is getting long enough already.

1 comment(s) so far...

Re: A REST-ful Look at Azure Blob Storage

Microsoft has sunk to a new low . . . using “trial traps” . . . yes one must contact customer support to cancel a trial. There’s plenty of opportunity to sign up for more services but they make it very hard to cancel. I called Microsoft support . . . all the options for Azure support were available EXCEPT the one to cancel . . . it says they are CLOSED! I’m just going to tell my credit card company to charge back all Microsoft fees and move on to open source . . . there’s plenty of free software out there that works sufficiently well given the cost. Azure is lackluster to say the least and high priced given the limited services. BTW: I was a devoted Microsoft customer for 15 years . . . I’m also an MCSE . . . but I’m giving up on Microsoft because they really are indifferent, at best, to the impact they have on customers.

By not given on  10/18/2010 6:40 AM

Copyright 2009-2010 by Azure-Architect.com