Today I'm going to look deeper at some aspects of Azure Table storage that didn't make it into the last post. If you want to follow along, download a copy of my AzureCommand class. You might also want to create a Microsoft Azure account and load in some data. If you're looking for some data to play with, load a copy of the test data. I should state that I am not looking at the locally hosted development storage, only at the cloud-hosted one.
ETags
HTTP ETags are the way that we address changes to data that we don't know about. Every time we request a discrete bit of data from ATS, the response includes an ETag header. Some requests, such as lists of Entities, include the ETag as an attribute on the entry:
<entry m:etag="W/&quot;datetime'2009-06-02T13%3A26%3A03.814Z'&quot;">
(Note: that etag would actually be W/"datetime'2009-06-02T13:26:03.814Z'" but the necessity of making it safe for the attribute required the changes you see. Keep that in mind, depending on where you get the ETag from.)
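As a minimal sketch of that note, here is one way to turn the attribute form back into the header form in Python. It assumes you are reading the raw attribute text, with &quot; entities and percent-encoded colons; if an XML parser has already decoded the entities for you, only the percent-decoding step does any work. The function name is my own, not part of AzureCommand.

```python
from html import unescape
from urllib.parse import unquote


def etag_from_attribute(raw_attribute: str) -> str:
    """Turn a raw m:etag attribute value back into the ETag used in headers."""
    # First undo the XML entity escaping (&quot; becomes "),
    # then undo the percent-encoding (%3A becomes :).
    return unquote(unescape(raw_attribute))


raw = "W/&quot;datetime'2009-06-02T13%3A26%3A03.814Z'&quot;"
print(etag_from_attribute(raw))
```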
When you go to DELETE, MERGE or UPDATE an entity, ATS compares the ETag you send in with the one that exists. If they match, the command completes. If they don't, ATS rejects the request with a 412 (Precondition Failed). This is important because both person A and person B could get an Entity from ATS. A modifies the data and MERGEs it while B is still editing. When B goes to PUT, it would overwrite everything A has done, except that the very act of A updating the information changed the ETag, so B's request will fail. B can then get the information (and a new ETag) and remodify it, and there is no data loss.
In ATS, ETags are passed in through the HttpRequestHeaders using the If-Match header. If the value in the If-Match header matches the ETag value, everything is good to go. But that's not the only thing you can do. You can pass an If-Match value of "*", which will match any ETag. So, if you wanted to delete an Entity and didn't care whether or not the data in it had changed, you could pass it an ETag of "*" and ATS will delete the Entity. My AzureCommand class overloads the Tables and Entities functions to allow for passing in the If-Match data or for just forcing the Update/Merge/Delete.
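To make the A and B scenario concrete, here is a small in-memory sketch of the same check. TinyTable and ETagMismatch are hypothetical stand-ins I made up for illustration; they are not the ATS API or my AzureCommand class, and the ETag values are just counters. The only point is the comparison: an update goes through when the If-Match value equals the stored ETag or is "*".

```python
import itertools


class ETagMismatch(Exception):
    """Raised when the supplied If-Match value does not match the stored ETag."""


class TinyTable:
    """Hypothetical in-memory stand-in for a table; not the ATS API."""

    def __init__(self):
        self._rows = {}                      # key -> (etag, data)
        self._versions = itertools.count(1)  # source of fresh ETag values

    def insert(self, key, data):
        etag = f'W/"{next(self._versions)}"'
        self._rows[key] = (etag, dict(data))
        return etag

    def get(self, key):
        etag, data = self._rows[key]
        return etag, dict(data)

    def update(self, key, data, if_match):
        current_etag, _ = self._rows[key]
        # "*" matches any ETag; otherwise the caller must hold the latest one.
        if if_match != "*" and if_match != current_etag:
            raise ETagMismatch(key)
        return self.insert(key, data)  # every successful write gets a new ETag


table = TinyTable()
table.insert("1", {"title": "draft"})
etag_a, row_a = table.get("1")  # person A reads
etag_b, row_b = table.get("1")  # person B reads the same version
table.update("1", {"title": "A's edit"}, if_match=etag_a)  # A's write succeeds
try:
    table.update("1", {"title": "B's edit"}, if_match=etag_b)  # stale ETag
except ETagMismatch:
    # B re-reads to pick up A's change and the new ETag, then reapplies the edit.
    etag_b, row_b = table.get("1")
    table.update("1", {**row_b, "note": "B's edit"}, if_match=etag_b)
```

After this runs, the row holds both A's title and B's note. Passing if_match="*" instead would have let B's first write go through and silently replace A's edit.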
Querying ATS Data
It's all well and good to have data, to be able to add, update and delete data, but data is really only useful when you can query it to find something useful.
Get a Specific Entity
The easiest way to query an Entity is to use its PartitionKey and RowKey to get the exact Entity you want. In fact, if you get a list of Entities back, the <id> element of each entry is a formatted example of the GET required for that entity, for example:
<id>http://{account}.table.core.windows.net/{tablename}(PartitionKey='CalendarEntry',RowKey='1')</id>
Execute a GET against http://{account}.table.core.windows.net/{tablename}(PartitionKey='{partitionKey}',RowKey='{rowKey}') (CanonicalUrl is /{account}/{tablename}(PartitionKey='{partitionKey}',RowKey='{rowKey}')). This will return the one Entity that matches, with a status code of 200 (OK). But the real fun comes with building filtered queries, especially if you like challenges.
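Before moving on to filters, here is a rough sketch of building that single-Entity address from the two keys. The function name and the "myaccount" account are placeholders of mine. Doubling single quotes inside a key and percent-encoding the result are assumptions based on the usual rules for quoted literals in URLs, so check them against keys with unusual characters before relying on this.

```python
from urllib.parse import quote


def entity_url(account: str, table: str, partition_key: str, row_key: str) -> str:
    """Build the GET address for one Entity from its PartitionKey and RowKey."""

    def key_literal(value: str) -> str:
        # Double any single quote inside the key, then percent-encode the result
        # (assumed escaping rules, not confirmed by this post).
        return quote(value.replace("'", "''"), safe="")

    return (
        f"http://{account}.table.core.windows.net/{table}"
        f"(PartitionKey='{key_literal(partition_key)}',RowKey='{key_literal(row_key)}')"
    )


print(entity_url("myaccount", "demonstrations", "CalendarEntry", "1"))
```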
Get a Filtered List of Entities
ATS uses a LINQ-like syntax to query data. I say LINQ-like because it doesn't support all of the syntax that LINQ does. While you can find all the details here, the basics are that you can use:
- eq (Equals)
- lt (Less than)
- gt (Greater than)
- le (Less than or equal)
- ge (Greater than or equal)
- ne (not equal)
In addition, you can use and, or and not. And you can only use 15 discrete comparisons. And there's no Ordering of data. But let's work with what we have. If I want to use my Calendar demo data and select all of the calendar dates for one website, I write a LINQ statement that reads:
PartitionKey eq 'CalendarEntry' and websiteid eq 'W7'
This will return all of the Entities I am interested in, which is a good start. If I wanted to get those Entities for a specific period in time, I need to remember that I specified that the eventdate property is a DateTime datatype and specify that in my LINQ statement:
PartitionKey eq 'CalendarEntry' and websiteid eq 'W7' and eventdate ge datetime'2010-01-01' and eventdate le datetime'2010-03-01'
If I forget to specify the datetime to cast the value, it gets ignored. So, while the above LINQ returns 3 Entities, the following returns all Entities with a PartitionKey of CalendarEntry, websiteid of W7 and an eventdate >= January 1, 2010:
PartitionKey eq 'CalendarEntry' and websiteid eq 'W7' and eventdate ge datetime'2010-01-01' and eventdate le '2010-03-01'
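Since the datetime prefix is so easy to forget, one option is to never type it by hand. The sketch below builds the date-range part of the filter from Python datetime values; both function names are mine. It assumes the values are already in UTC and writes the full timestamp form rather than the date-only form used above.

```python
from datetime import datetime


def datetime_literal(value: datetime) -> str:
    """Format a datetime as a typed literal, e.g. datetime'2010-01-01T00:00:00Z'."""
    # Assumption: the value is already expressed in UTC.
    return f"datetime'{value.strftime('%Y-%m-%dT%H:%M:%SZ')}'"


def date_range_filter(prop: str, start: datetime, end: datetime) -> str:
    """Build 'prop ge <start> and prop le <end>' with both ends typed."""
    return f"{prop} ge {datetime_literal(start)} and {prop} le {datetime_literal(end)}"


query = (
    "PartitionKey eq 'CalendarEntry' and websiteid eq 'W7' and "
    + date_range_filter("eventdate", datetime(2010, 1, 1), datetime(2010, 3, 1))
)
print(query)
```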
There are a couple of other items to remember about LINQ and ATS. Numeric values should not be wrapped in quotes, and Boolean properties can be used just by referencing the property name. Let's take a look at the following example Entity:
<m:properties>
  <d:PartitionKey>CalendarEntry</d:PartitionKey>
  <d:RowKey>1530</d:RowKey>
  <d:Timestamp m:type="Edm.DateTime">2009-06-04T12:59:36.335Z</d:Timestamp>
  <d:Counter m:type="Edm.Int32">15000</d:Counter>
  <d:Forsooth m:type="Edm.Boolean">true</d:Forsooth>
  <d:eventdate m:type="Edm.DateTime">2012-04-09T00:00:00Z</d:eventdate>
  <d:eventdescription>Easter Monday</d:eventdescription>
  <d:eventdetails></d:eventdetails>
  <d:websiteid>W7</d:websiteid>
</m:properties>
This Entity would be included in the following LINQ:
Counter eq 15000
Notice that the 15000 is not between single quotes. If it had been, this Entity would not have been returned. Also, since it has a Boolean with a value of true (remember, Boolean values must be entered as true/false or 1/0 and are case sensitive), this Entity would be included with the following LINQ:
Forsooth
I do not recommend writing your LINQ this way, however, because the lack of the property is treated as false. If my entire table contains only a few Entities with the Boolean property "Forsooth", some true and some false, a LINQ for not Forsooth returns every Entity that does not have a Forsooth property set to true, including those with no Forsooth property at all. In other words, the non-existence of the property defines it as false. This is different behavior from what you'll get with Forsooth eq false. So my recommendation is to always specify the value you are looking for. Now that I've covered the basics of LINQ as it applies to ATS, let's look at the mechanics of it.
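One way to follow those rules without thinking about them is to let the value's type decide how it is written. This is only a sketch with names of my own choosing: it covers Booleans, integers and strings, and the doubling of single quotes inside strings is an assumption rather than something covered above.

```python
def literal(value) -> str:
    """Write a Python value the way the filter syntax expects it."""
    # bool must be checked before int, because True is also an int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)  # numbers are written bare, with no quotes
    if isinstance(value, str):
        # Strings go in single quotes; doubling embedded quotes is an assumption.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported filter value: {value!r}")


def equals(prop: str, value) -> str:
    """Build an explicit 'prop eq value' comparison."""
    return f"{prop} eq {literal(value)}"


print(equals("Counter", 15000))
print(equals("Forsooth", True))
print(equals("websiteid", "W7"))
```

Using equals("Forsooth", False) rather than a bare not Forsooth keeps the comparison explicit, which is the recommendation above.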
Execute a GET against http://{account}.table.core.windows.net/{table}()?$filter={LINQ} (CanonicalUrl is /finseldemos/demonstrations()). The {LINQ} section contains the actual LINQ filter you want to execute. The results are returned in the body.
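Because the filter contains spaces and quotes, it needs to be percent-encoded before it goes into the address. Here is a minimal sketch of that step; the function name and the "myaccount" account are placeholders of mine, and signing the request is left out entirely.

```python
from urllib.parse import quote


def query_url(account: str, table: str, filter_expression: str) -> str:
    """Build the GET address for a filtered list of Entities."""
    # Encode everything in the filter, including spaces and single quotes.
    encoded = quote(filter_expression, safe="")
    return f"http://{account}.table.core.windows.net/{table}()?$filter={encoded}"


print(query_url("myaccount", "demonstrations", "PartitionKey eq 'CalendarEntry'"))
```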
$top
ATS returns up to 1000 matching entities at a time. But we may want some control over that. That's what the $top parameter is for. By including $top={number} you can limit the number of Entities returned to you. Executing GET against http://{account}.table.core.windows.net/{table}()?$top=5 will return the first five Entities in the table, sorted in PartitionKey, RowKey order. It will also return Continuation data.
x-ms-continuation-
Whenever your httpResultsHeader contains one or more entries with x-ms-continuation- as the start of the key, then you know that you have more information to get. You could have more information because ATS reached its 1000 Entity limit or because it reached the limit based on the $top parameter. To get the next set of data, you need to append one parameter to the request you just made for each x-ms-continuation- header you have. So http://{account}.table.core.windows.net/{table}()?$top=5 will return two continuation headers, x-ms-continuation-NextPartitionKey and x-ms-continuation-NextRowKey. So our next request will be to http://{account}.table.core.windows.net/{table}()?$top=5&NextPartitionKey={x-ms-continuation-NextPartitionKey}&NextRowKey={x-ms-continuation-NextRowKey}. Every subsequent call will need to remove all parameters that start with Next and add any continuation headers.
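Here is a sketch of that bookkeeping as a single function: given the address you just requested and the response headers, it returns the next address to request, or None when there is nothing more to fetch. The function name is mine and the key values in the example are made up, since real continuation values should be passed back exactly as received. It also assumes the header names arrive with their original casing.

```python
from urllib.parse import parse_qsl, quote, urlsplit, urlunsplit

PREFIX = "x-ms-continuation-"


def next_page_url(url: str, response_headers: dict):
    """Return the address of the next page, or None when there is no more data."""
    # Collect continuation headers, keeping only the part after the prefix,
    # e.g. x-ms-continuation-NextRowKey -> NextRowKey.
    tokens = {
        name[len(PREFIX):]: value
        for name, value in response_headers.items()
        if name.lower().startswith(PREFIX)
    }
    if not tokens:
        return None
    parts = urlsplit(url)
    # Keep the original parameters but drop any Next* left from the last page.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("Next")
    ]
    params.extend(tokens.items())
    query = "&".join(f"{key}={quote(value, safe='')}" for key, value in params)
    return urlunsplit(parts._replace(query=query))


first = "http://myaccount.table.core.windows.net/demonstrations()?$top=5"
headers = {
    "x-ms-continuation-NextPartitionKey": "CalendarEntry",
    "x-ms-continuation-NextRowKey": "6",
}
print(next_page_url(first, headers))
```

Calling it in a loop until it returns None walks the whole result set, five Entities at a time in this example.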
More on tables
I had delayed this a day because MSDN is showing some new functionality for tables, such as Entity Group Transactions, which isn't functional yet. $top is also new, but it works. Finally, you're supposed to be able to use the same authentication pattern on tables as you use on queues and blobs, but that's not working either. So, sometime in the next week or so, you can expect to see an additional post that covers these, once they are functional.
And that wraps up the basics of Azure Table Storage. But we're not through with it yet. There's much more to cover, including some of the Management that an RDBMS does that Azure doesn't supply. Before we can cover that, however, we need to talk about Blob Storage and Worker Roles.
So, tune in next week for part 6 when I'll cover Blob Storage on Monday, Worker Roles on Tuesday and then we can dig into Sagas and Materialized Views and all sorts of other fun.