Cache-Conditional HTTP Requests

For whatever reason, HTTP’s cache-conditional headers confuse me. I think I’ve figured out how to keep them straight in my head though: the trick is to read “If-<Whatever>” as “Do the ‘normal’ thing if <Whatever> is satisfied”. In this context, the “normal” thing means what the server would have done had “If-<Whatever>” not been used. For example, “If-Modified-Since: <X>” in a GET request means “Send me the content if it has been modified since (i.e. after) <X>”.

I think what confused me was “What action should the server take if the condition is met?” The header names do not indicate what action to take; merely what condition must be met in order for some action to be performed. After several readings of the spec (RFC 2616), and some thought, I realized that the implicit action (in the event that the condition is met) is to do whatever the server would have done, had the cache-conditional headers not been present, which usually means just sending the content back to the requester.
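To make the reading rule concrete, here is a minimal Python sketch of how a server might evaluate “If-Modified-Since” on a GET. The function name `respond_to_get` and the example dates are my own inventions for illustration, and a real server has to deal with much more (unparseable dates, dates in the future, and so on).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond_to_get(last_modified, if_modified_since=None):
    """Return the status code for a GET (hypothetical helper, not a real API)."""
    if if_modified_since is None:
        return 200  # no condition at all: do the "normal" thing
    threshold = parsedate_to_datetime(if_modified_since)
    if last_modified > threshold:
        return 200  # condition met: do the "normal" thing anyway
    return 304  # condition not met: Not Modified, no body is sent


# Assumed modification time of the resource, for illustration only.
modified = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc)

print(respond_to_get(modified))                                   # 200
print(respond_to_get(modified, "Tue, 20 Oct 2015 07:28:00 GMT"))  # 200
print(respond_to_get(modified, "Wed, 21 Oct 2015 07:28:00 GMT"))  # 304
```

Notice that the header only ever decides between “the normal response” and “something else”; it never changes what the normal response is.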

Cache-conditionality is based on two values, known as validators: (most recent) modification time, and entity tags. These correspond to the “Last-Modified” and “ETag” response headers. Furthermore, cache-conditional headers may be positive or negative. Here is a table summarizing how we’ve just categorized the cache-conditional headers:

Sense \ Validator   Modification time     Entity tag
Positive            If-Modified-Since     If-Match
Negative            If-Unmodified-Since   If-None-Match

This gives a basic outline of the cache-conditional headers, but in order to really understand them, one must understand what they were designed for, and in what situations you’d generally use them.

Modification Time

The “If-Modified-Since” header is probably the easiest to understand, but what about “If-Unmodified-Since”? Why would a requester be interested in the content if it hasn’t changed? The basic instinct behind this question is right: you would normally only be interested in content if it has been updated. This header is not for normal requests; it is for subrange requests. Subrange requests probably don’t come to mind when we think about a typical request, which is why the use of this header isn’t obvious.

Let’s briefly turn our attention to subrange requests. When a requester makes a subrange request, he presumably already has a portion of the content, and he wants to fill in the rest, without having to download the entire thing. In order to do this, he needs to make sure that the content on the server hasn’t changed; otherwise, the requester will be stitching together pieces from two different versions. In general, the result of such a combination is not going to make sense.

What the requester needs is a way to tell the server, “I want a portion of a particular version of the content found at this address”. This is exactly where “If-Unmodified-Since” comes in. Using the reading technique that I described at the beginning, “If-Unmodified-Since: …” means “Send me the content if it hasn’t been modified after … (since that’s the version that I want to complete)”.
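Here is a rough sketch of that exchange from the server’s side, assuming a single byte range and an in-memory resource. The helper `respond_to_range_get` and its parameters are hypothetical; the status codes (206 for a partial response, 412 when the precondition fails) are the ones the spec assigns to these outcomes.

```python
from datetime import datetime, timedelta, timezone


def respond_to_range_get(content, last_modified, start, end,
                         if_unmodified_since=None):
    """Return (status, body) for a GET carrying 'Range: bytes=start-end'."""
    if if_unmodified_since is not None and last_modified > if_unmodified_since:
        # The content changed after the requester's copy was made,
        # so its partial copy cannot be completed: Precondition Failed.
        return 412, b""
    # Condition met (or absent): the "normal" thing for a range request.
    # Byte ranges are inclusive at both ends.
    return 206, content[start:end + 1]


# Assumed timestamps, for illustration only.
downloaded_at = datetime(2015, 10, 21, tzinfo=timezone.utc)
edited_later = downloaded_at + timedelta(days=1)

print(respond_to_range_get(b"catdog", downloaded_at, 3, 5, downloaded_at))
# (206, b'dog')
print(respond_to_range_get(b"catfish", edited_later, 3, 5, downloaded_at))
# (412, b'')
```

In the second call the requester would otherwise have glued the tail of the new version onto the head of the old one.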

Entity Tags

“If-None-Match” is used in a similar way to “If-Modified-Since”: it is used to detect when changes have been made so that the server only needs to send the content when the requester’s cached copy is no longer valid. There are a couple of odd things about this header that you may have noticed:

  1. Unlike “If-Modified-Since”, this is a negative header. Don’t let that fool you; “If-None-Match” is still about detecting updates so that only updated content is sent back.
  2. As “None” suggests, many entity tags can be passed, whereas “If-Modified-Since” is implicitly singular. The underlying reason for this difference is that (modification) time follows an ordering, whereas entity tags do not.

Notice that in the typical case where only one “If-None-Match” entity tag is passed, the semantics can be stated as “Do the normal thing if the entity does not match…” which is probably less awkward sounding than the more general rendering “Do the normal thing if the entity does not match any of …”. If you prefer, you can think in terms of the former.
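As a sketch of both renderings, the following Python treats the header value as a comma-separated list of entity tags. The function names are made up for illustration, and the parsing is deliberately naive: it assumes no commas inside a tag and ignores weak validators.

```python
def parse_tags(header_value):
    """Split a comma-separated entity-tag list (naive, illustration only)."""
    return [tag.strip() for tag in header_value.split(",")]


def none_match(current_etag, if_none_match):
    """True when the current entity tag matches none of the listed tags."""
    tags = parse_tags(if_none_match)
    if "*" in tags:
        return False  # "*" matches any existing entity
    return current_etag not in tags


def respond_to_etag_get(current_etag, if_none_match=None):
    """Return the status code for a GET (hypothetical helper)."""
    if if_none_match is None or none_match(current_etag, if_none_match):
        return 200  # condition met or absent: do the "normal" thing
    return 304  # one of the requester's cached versions is still current


print(respond_to_etag_get('"v3"', '"v3"'))        # 304
print(respond_to_etag_get('"v3"', '"v1", "v2"'))  # 200
print(respond_to_etag_get('"v3"', '"v1", "v3"'))  # 304
```

Because the check is plain set membership, no ordering between versions is needed, which is the point made above about entity tags.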

The opposite of “If-None-Match” has a simpler name, “If-Match”. Perhaps surprisingly, both headers are plural. A more analogous name would have been “If-Any-Match”. Naming aside, the use of “If-Match” may not be intuitive at first, as with the opposite of “If-Modified-Since”. This header can also be used to make subrange requests conditional, although the spec does not specifically discuss this use. Instead, it focuses on how this header is used in non-subrange requests:

  1. “The purpose of this feature is to allow efficient updates of cached information with a minimum amount of transaction overhead.” Imagine several caches along a request chain: a browser cache, and two shared network caches. Each of them has a stale cache entry, but each cache has a different version. All of the caches can discover whether their respective entries are still valid within the same request.
  2. “It is also used, on updating requests, to prevent inadvertent modification of the wrong version of a resource.” In other words, a requester can use this to make sure that it does not clobber changes that some other requester might have made.
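The second use is the classic “lost update” guard. Here is a minimal sketch, assuming resources live in a plain dictionary that maps a path to an (entity tag, body) pair. The `respond_to_put` helper, the store layout, and the idea that the caller supplies the new entity tag are all assumptions made for the example.

```python
def respond_to_put(store, path, new_etag, new_body, if_match=None):
    """Apply a PUT to an in-memory store and return the status code."""
    current = store.get(path)  # (etag, body), or None if nothing is stored
    if if_match is not None:
        tags = [tag.strip() for tag in if_match.split(",")]
        matched = current is not None and ("*" in tags or current[0] in tags)
        if not matched:
            return 412  # Precondition Failed: someone else got there first
    store[path] = (new_etag, new_body)
    return 204 if current is not None else 201


store = {"/doc": ('"v1"', b"original")}

# Alice and Bob both fetched version "v1" and both try to save their edits.
print(respond_to_put(store, "/doc", '"v2"', b"alice's edit", if_match='"v1"'))  # 204
print(respond_to_put(store, "/doc", '"v3"', b"bob's edit", if_match='"v1"'))    # 412
```

Bob’s request fails instead of silently overwriting Alice’s edit; he has to fetch the new version and try again.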

Interactions

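Now that we know about all the headers that make requests cache-conditional, let’s think about how they work together. What happens when you use more than one of these things at a time? In general, when multiple conditions are specified in a single request, the server’s response has to be consistent with all of them. For the “has not changed” pair below, that means all of the conditions must be met in order for the server to do the normal thing. For the “has changed” pair, the spec (section 13.3.4) only allows a 304 (Not Modified) response when that is consistent with every condition, so the server does the normal thing as soon as either condition is met.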

There are only two combinations that have defined behavior (the behavior of other combinations is not defined):

  • If-Modified-Since + If-None-Match: As noted above, both of these are for detecting when the content has changed.
  • If-Unmodified-Since + If-Match: Analogously, both of these headers are used to detect when the content has not changed.

For the sake of simplicity, we’ll say that the other combinations are not allowed, because their behavior is undefined. Technically, the specification does not forbid requesters from using those combinations, but servers are not required to behave in any particular way when they receive such requests.

Notice that you cannot combine different headers that use the same validator i.e. the following are not allowed:

  • “If-Modified-Since” + “If-Unmodified-Since”: When a request says “If-Modified-Since: <X>” and “If-Unmodified-Since: <Y>”, and X >= Y, it would not be possible to satisfy both conditions. In the other case (i.e. X < Y), it is possible for both conditions to be met, but a cache would never store two versions with different modification times.
  • “If-Match” + “If-None-Match”: This case is similar. If the “If-Match” entity tag is in the set of “If-None-Match” entity tags, then it would be impossible to satisfy both conditions. Otherwise, “If-None-Match” is superfluous.
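The modification-time case is easy to check mechanically. In this small sketch (the function `satisfies_both` and the sample times are made up for illustration), a modification time T satisfies both headers only when X < T <= Y, so no T works once X >= Y.

```python
from datetime import datetime, timedelta, timezone


def satisfies_both(last_modified, x, y):
    """Check 'If-Modified-Since: x' and 'If-Unmodified-Since: y' together."""
    modified_since_x = last_modified > x
    unmodified_since_y = last_modified <= y
    return modified_since_x and unmodified_since_y


y = datetime(2015, 10, 21, tzinfo=timezone.utc)
x = y + timedelta(hours=1)  # X >= Y: the impossible case

# Try modification times from two hours before Y to 3.5 hours after it.
candidates = [y + timedelta(minutes=30 * i) for i in range(-4, 8)]
print(any(satisfies_both(t, x, y) for t in candidates))  # False

# With X < Y there is a window that satisfies both conditions.
print(satisfies_both(y, y - timedelta(hours=1), y))  # True
```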

There are two more combinations that we have not mentioned; this gives us a total of 6 (4 choose 2) combinations:

  • If-Modified-Since + If-Match: Imagine what using these together would mean: what the requester would be telling the server is, “If the content has been modified, and it is this particular version, then do the normal thing.” Well, if the requester already knows the version of the content that it wants to target, specifying a modification time would be superfluous.
  • If-Unmodified-Since + If-None-Match: As with the previous point, this combination would be sending conflicting signals: on the one hand, the implicit requester expectation of “If-Unmodified-Since” is that the content hasn’t been modified, whereas “If-None-Match” has the opposite implicit requester expectation.

The problem with these two combinations is not that they are logically incompatible; the problem is that the implicit intents behind the headers are opposed, which is a good sign that the requester is confused. This is why the specification allows the server to do whatever it wants when it encounters such combinations.

If-Range: The Oddball

The final cache-conditional header is “If-Range”. Unlike the headers we have talked about so far, it does not make a request conditional, even though its name begins with “If-”. Instead, it is used as an accessory to conditional subrange GET requests. Here’s what the spec says:

Informally, its meaning is ‘if the entity is unchanged, send me the part(s) that I am missing; otherwise, send me the entire new entity’.
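That informal meaning translates almost directly into code. The sketch below assumes a single byte range and an “If-Range” value that is an entity tag (the header may also carry a date, which I leave out here); the helper name `respond_with_if_range` is hypothetical.

```python
def respond_with_if_range(content, current_etag, start, end, if_range=None):
    """Return (status, body) for a GET carrying 'Range: bytes=start-end'."""
    if if_range is None or if_range == current_etag:
        # Entity unchanged (or no If-Range given): send the missing part.
        return 206, content[start:end + 1]
    # Entity changed: ignore the Range header and send the whole new entity.
    return 200, content


print(respond_with_if_range(b"catdog", '"v1"', 3, 5, if_range='"v1"'))
# (206, b'dog')
print(respond_with_if_range(b"catfish", '"v2"', 3, 5, if_range='"v1"'))
# (200, b'catfish')
```

Either way the requester gets something usable in a single round trip, which is why this header never produces a 304 or a 412 on its own.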
