[Mono-dev] mono WebResponse caching
Daniel Lo Nigro
lists at dan.cx
Tue Jun 11 01:46:14 UTC 2013
>
> It appears that LastModified is always set to the datetime when the
> response comes back from the server
It's up to the server to set the Last-Modified header correctly.
ASP.NETsets it to the current date by default, but pages can override
this.
What I meant with my "nice to have" was cache revalidation using
Last-Modified. To do this, you save the Last-Modified value along with the
cached response. When your cache is expiring, you send a request to the
server, with an "If-Modified-Since" header containing the cached
Last-Modified value. If the server responds with "304 Not Modified", it
means the cached version is still valid, and you can extend its validity
rather than purging it. This can also be done with ETags by using the
"If-None-Match" request header.
Worth noting that for a standards-compliant caching implementation, if the
Cache-Control header contains "must-revalidate", you must send a request
back to the server once the cache expires, with If-Modified-Since
containing the Last-Modified value (and If-None-Match containing the ETag
value). You can only reuse the cached version if the server responds with
"304 Not Modified".
Caching is tricky to get right, would definitely be worth having some unit
tests around this. This page has a fantastic reference on Cache-Control
headers and what they do. Definitely worth reading:
http://www.mnot.net/cache_docs
On Tue, Jun 11, 2013 at 11:27 AM, Mark Lintner <mlintner at sinenomine.net>wrote:
> Good ideas. The MemoryCache in System.Runtime.Caching seems interesting
> but it is for our purposes inaccessible because the dll it lives in has a
> dependency on the System.dll and System.dll already has 3 cross
> dependencies. Still its not impossible. It seems like everything in the
> Cache-Control header is important. That's where max-age it. Ive already got
> no-cache implemented and it looks like max-age=0 means the same thing. It
> appears that LastModified is always set to the datetime when the response
> comes back from the server. You mention that it would be nice to have. Does
> that mean it is not working correctly. I haven't checked whether the value
> is different on Microsoft. I still don't understand how Microsofts web
> cache works. It is persistent in the sense that it is putting the urls in a
> directory. The urls have illegal file names. What is interesting is that
> the speed with which it returns a cached url is roughly equivalent to a
> Dictionary based cache I was testing.
>
>
> ------------------------------
>
> *From:* daniel at d15.biz [daniel at d15.biz] on behalf of Daniel Lo Nigro [
> lists at dan.cx]
> *Sent:* Sunday, June 09, 2013 7:36 AM
> *To:* Mark Lintner
> *Cc:* Greg Young; mono-devel-list at lists.ximian.com
>
> *Subject:* Re: [Mono-dev] mono WebResponse caching
>
> There'd be some code for output caching in Mono's ASP.NETimplementation, I wonder if any of it could be reused here. They're
> different concepts (ASP.NET is using output caching at the server-side
> whereas this is request caching at the client-side) but maybe some of the
> code could be reused, especially around cache header parsing.
>
> A caching implementation would definitely have to take into account, at
> the minimum:
>
> - Cacheability (public / private)
> - Max age and expiry dates
>
> And revalidation using ETags and Last-Modified dates would be a
> nice-to-have (doing the request with the old ETag and Last-Modified date
> and getting a 304 Not Modified back if the page hasn't been modified) but
> isn't entirely necessary with a basic implementation (don't both
> revalidating, just clear the cache when the max age is met)
>
> I'd personally cache the results in memory using System.Runtime.Caching<http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx>rather than on disk, as on-disk caching can get slow. I haven't checked if
> Mono supports this, though.
>
>
> On Wed, Jun 5, 2013 at 11:32 AM, Mark Lintner <mlintner at sinenomine.net>wrote:
>
>> Hi Greg,
>>
>>
>>
>> Great input. I have not gotten that far yet. Also the cache level is very
>> naïve. Everything caches except Bypass cache which doesn't cache. To check
>> the headers it would check the WebResponse Headers property coming
>> into TryInsert in DirectoryBackedResponseCache.cs . I would combine the
>> result of checking the header with the conditional ShouldCache which also
>> needs work. What we are looking for is comments like this, about what you
>> think needs to be in a usable implementation, gotchas, things to look out
>> for. Also if we are on the wrong track doing it this way I don't want to
>> waste time doing something which will not be useful. Assuming I'm not on
>> the wrong path, I will add the header check next. This is the kind of
>> feedback we need. I only tried to make a prototype of how caching could
>> work and then did some performance measurements. Obviously didn't add any
>> of the details yet. If this is a viable approach I will continue to fill
>> out the implementation guided by research and suggestions. Probably want to
>> limit it to what we really need to have. Any other suggestions, comments,
>> criticism would be greatly appreciated.
>>
>>
>>
>> Thanks,
>>
>> Mark
>> ------------------------------
>> *From:* Greg Young [gregoryyoung1 at gmail.com]
>> *Sent:* Tuesday, June 04, 2013 6:49 PM
>> *To:* Mark Lintner
>> *Cc:* mono-devel-list at lists.ximian.com
>> *Subject:* Re: [Mono-dev] mono WebResponse caching
>>
>> I would expect this to look at http caching headers in response when
>> caching such as expiration, whether something is cachable (public/private) etc
>> but don't see any of that code (or maybe I missed it). Can you point to
>> where it is in your implementation?
>>
>> On Tuesday, June 4, 2013, Mark Lintner wrote:
>>
>>> We have noticed that parts of the Mono Web functionality are not yet
>>> implemented. We are hoping to be able to add some of these missing pieces.
>>> Caching of
>>>
>>> HttpWebResponses is the first one we took a look at. It would appear
>>> that Microsoft takes advantage of the internet explorer caching. Since it
>>> would appear
>>>
>>> that an HttpWebResponse is fully populated when it is returned from the
>>> cache and the only content in the urls that are cached are the bytes that
>>> come in on
>>>
>>> the wire, Microsoft must return things from the cache at a very low
>>> level and and let the response populate the normal way. This did not seem
>>> possible to
>>>
>>> emulate when I looked at mono. I also wanted to cache to a directory so
>>> that the urls can be inspected or deleted.
>>>
>>>
>>> After looking at the code for awhile I realized that I could save
>>> additional information from the response, ie: headercollection,
>>> cookiecontainer, Status
>>>
>>> code etc in the same file as the responsestream. A response can be
>>> serialized but not all of it, most importantly the responsestream is not
>>> serialized. I
>>>
>>> wrote a prototype with some helper functions, the only change to
>>> existing mono code would be in HttpWebRequest.GetResponse. Before calling
>>> BeginGetResponse
>>>
>>> I call a function that tries to retrieve the url from the cache. If it
>>> is not there, BeginGetResponse is called as normal, the response is grabbed
>>> from
>>>
>>> EndGetResponse and passed to a method to insert into the cache. This
>>> insert method will open a filename in a directory designated as mono_cache
>>> with the name
>>>
>>> produced by calling HttpUtility.Encode so it is a legal filename. Also
>>> you can only read from the responsestream once, which means that if read
>>> from it, save
>>>
>>> it to a file, then the responsestream is no longer accessible from the
>>> WebResponse when it is returned to the calling code. If sream copy worked
>>> it could be
>>>
>>> used here. One way to fix the problem is to call GetResponseStream(),
>>> read it to the end, which produces a string, then finally convert the
>>> string to bytes.
>>>
>>> Then you can serialize the bytes to the open file, serialize the
>>> headercollection and ultimately individually add any field you want in the
>>> response when it
>>>
>>> is retrieved from the cache, then close the file. Then create a new
>>> WebConnectionData instance and populate it with headers and all the
>>> relevent fields from
>>>
>>> the old response, finally passing the bytes of data to a memory stream
>>> constructor and assign it to the WebConnectionData's stream member. Then I
>>> construct a
>>>
>>> new HttpWebResponse, passing the WebConnectionData and other arguments
>>> which are taken from the existing request and the recently retrieved
>>> response. The
>>>
>>> response is then returned to getResponse then finally to the calling
>>> code, good as new. Next time the url is requested the cached file is
>>> deserialized, one
>>>
>>> member at a time then a WebConnectionData is setup and finally, a
>>> HttpWebResponse is constructed with the info read from the file, the
>>> WebConnectionData and
>>>
>>> from the members of the new request. I concentrated on prototyping
>>> caching of HttpWebResponses. Only a simple cache algorithm, no Ftp, no File
>>> etc. No cache
>>>
>>> aging etc. There are many other things that would have to be done for
>>> this to be usable.
>>> Using the prototype I requested www.google.com and it took .1601419
>>> sec to retrieve and cache it, including all the mechanics described above.
>>> The
>>>
>>> second time returning it from the cache, building the HttpWebResponse
>>> took .0007671. This is a 288 times speedup. That kind of performance
>>> increase, from a
>>>
>>> prototype that is not yet optimized, is compelling. I must add that I
>>> measured the same experiment with google under the .Net 4.5 implementation
>>> and the
>>>
>>> speedup is 769x. Obviously thats something to work toward. Even falling
>>> short of that level of speedup, some implementation of response caching for
>>> mono
>>>
>>> would greatly benefit the users who need better performance of services,
>>> rest etc.
>>>
>>> So far this is the only mono code that would change to support caching,
>>>
>>>
>>> from HttpWebRequest.cs:
>>>
>>>
>>>
>>> public override WebResponse GetResponse()
>>>
>>> {
>>>
>>> WebResponse response = ResponseCache.TryRetrieve (this);
>>>
>>>
>>>
>>> if (response == null)
>>>
>>> {
>>>
>>> WebAsyncResult result = (WebAsyncResult) BeginGetResponse (null,
>>> null);
>>>
>>> response = EndGetResponse (result);
>>>
>>> response.IsFromCache = false;
>>>
>>> HttpWebResponse response2 = ResponseCache.TryInsert
>>> ((HttpWebRequest)this, (HttpWebResponse)response);
>>>
>>> response.Close ();
>>>
>>> return response2;
>>>
>>> }
>>>
>>>
>>>
>>> return response;
>>>
>>> }
>>>
>>>
>>>
>>> The code can stand some improvement. It is a prototype and it shows. I
>>> have included the file ResponseCache.cs which the above
>>>
>>> code hooks into. We were wondering whether anyone has tried to implement
>>> response caching before and whether there is any intent to in the future.
>>> What are
>>>
>>> your thoughts about the approach described above and shown in the
>>> ResponseCache.cs file and the 3 other files I included. Bear in mind it
>>>
>>> would be refactored so that it fit into the mono architecture a bit
>>> better, it needs to be generalized, different strategies for different
>>> concrete response
>>>
>>> types need to be plugged in, configuration. and a plethora of things I
>>> missed. If you feel this approach has merit I can detail all that
>>> seperatly. If you
>>>
>>> feel a different approach would work better, please let us know, in the
>>> end were looking to make mono more robust in some of the ways that are
>>> expected but
>>>
>>> not yet implemented.
>>> The main thing I want to show here is how it is possible, to wedge a
>>> cache strategy into mono's request pipeline without changing or
>>> destabilizing
>>>
>>> mono itself. The cache implementation could change also, which is one
>>> way that this is code falls short is it totally violates open-closed. Is
>>> this a
>>>
>>> viable approach? There would of course be much more to do, that I have
>>> not even mentioned here. Your comments would be appreciated.
>>>
>>> The files can be found here.
>>>
>>>
>>> http://pastebin.com/FCefsiDx
>>> http://pastebin.com/8qBh8mMW
>>> http://pastebin.com/xVKLaidG
>>> http://pastebin.com/NMVNv30T
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Le doute n'est pas une condition agréable, mais la certitude est absurde.
>>
>> _______________________________________________
>> Mono-devel-list mailing list
>> Mono-devel-list at lists.ximian.com
>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ximian.com/pipermail/mono-devel-list/attachments/20130611/22ed4004/attachment-0001.html>
More information about the Mono-devel-list
mailing list