[Mono-dev] mono WebResponse caching

Mark Lintner mlintner at sinenomine.net
Tue Jun 11 01:27:32 UTC 2013


Good ideas. The MemoryCache in System.Runtime.Caching seems interesting, but for our purposes it is inaccessible because the DLL it lives in depends on System.dll, and System.dll already has 3 cross dependencies. Still, it's not impossible.

It seems like everything in the Cache-Control header is important; that's where max-age is. I've already got no-cache implemented, and it looks like max-age=0 means the same thing. It appears that LastModified is always set to the datetime when the response comes back from the server. You mention that it would be nice to have. Does that mean it is not working correctly? I haven't checked whether the value is different on Microsoft.

I still don't understand how Microsoft's web cache works. It is persistent in the sense that it is putting the URLs in a directory. The URLs have illegal file names. What is interesting is that the speed with which it returns a cached URL is roughly equivalent to a Dictionary-based cache I was testing.
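To make the Cache-Control discussion concrete, here is a minimal sketch of parsing the directives mentioned in this thread. The CacheControlInfo class and its members are hypothetical names, not part of Mono or the prototype, and the comma split is naive (it does not handle quoted values containing commas). It treats max-age=0 like no-cache on the assumption that a response that is immediately stale has to be revalidated anyway.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of Mono or the prototype.
public sealed class CacheControlInfo
{
    public bool NoCache;
    public bool NoStore;
    public bool IsPrivate;
    public TimeSpan? MaxAge;

    // Parses a raw Cache-Control value such as "private, max-age=600".
    public static CacheControlInfo Parse (string headerValue)
    {
        var info = new CacheControlInfo ();
        if (string.IsNullOrEmpty (headerValue))
            return info;

        foreach (string part in headerValue.Split (','))
        {
            string directive = part.Trim ();
            if (directive.Length == 0)
                continue;

            string name = directive;
            string value = null;
            int eq = directive.IndexOf ('=');
            if (eq >= 0)
            {
                name = directive.Substring (0, eq).Trim ();
                value = directive.Substring (eq + 1).Trim ().Trim ('"');
            }

            switch (name.ToLowerInvariant ())
            {
            case "no-cache": info.NoCache = true; break;
            case "no-store": info.NoStore = true; break;
            case "private": info.IsPrivate = true; break;
            case "max-age":
                int seconds;
                // Digits only; a missing or malformed value leaves MaxAge unset.
                if (int.TryParse (value, NumberStyles.None, CultureInfo.InvariantCulture, out seconds))
                    info.MaxAge = TimeSpan.FromSeconds (seconds);
                break;
            }
        }
        return info;
    }

    // True when a stored copy may be returned without asking the server first.
    public bool CanServeWithoutRevalidation
    {
        get
        {
            if (NoCache || NoStore)
                return false;
            return !(MaxAge.HasValue && MaxAge.Value == TimeSpan.Zero);
        }
    }
}
```

A caller would feed it the value of the response's Cache-Control header and consult CanServeWithoutRevalidation before returning a stored copy.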



________________________________

From: daniel at d15.biz [daniel at d15.biz] on behalf of Daniel Lo Nigro [lists at dan.cx]
Sent: Sunday, June 09, 2013 7:36 AM
To: Mark Lintner
Cc: Greg Young; mono-devel-list at lists.ximian.com
Subject: Re: [Mono-dev] mono WebResponse caching

There'd be some code for output caching in Mono's ASP.NET implementation, I wonder if any of it could be reused here. They're different concepts (ASP.NET is using output caching at the server-side whereas this is request caching at the client-side) but maybe some of the code could be reused, especially around cache header parsing.

A caching implementation would definitely have to take into account, at the minimum:

  *   Cacheability (public / private)
  *   Max age and expiry dates

And revalidation using ETags and Last-Modified dates would be a nice-to-have (doing the request with the old ETag and Last-Modified date and getting a 304 Not Modified back if the page hasn't been modified) but isn't entirely necessary with a basic implementation (don't bother revalidating, just clear the cache when the max age is met).
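The freshness and revalidation rules above can be sketched as follows. CachedEntry and its fields are hypothetical names chosen for illustration, and the age calculation is simplified: it measures from the time the entry was stored and ignores the Age and Date response headers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache entry; field names are illustrative only.
public sealed class CachedEntry
{
    public DateTime StoredUtc;
    public TimeSpan? MaxAge;
    public DateTime? ExpiresUtc;
    public string ETag;
    public DateTime? LastModifiedUtc;

    // Fresh entries can be served without contacting the server.
    public bool IsFresh (DateTime nowUtc)
    {
        if (MaxAge.HasValue)                 // max-age takes precedence over Expires
            return nowUtc - StoredUtc < MaxAge.Value;
        if (ExpiresUtc.HasValue)
            return nowUtc < ExpiresUtc.Value;
        return false;                        // no freshness information: treat as stale
    }

    // Headers for a conditional GET; a 304 reply means the cached body is still valid.
    public IDictionary<string, string> RevalidationHeaders ()
    {
        var headers = new Dictionary<string, string> ();
        if (!string.IsNullOrEmpty (ETag))
            headers ["If-None-Match"] = ETag;
        if (LastModifiedUtc.HasValue)
            headers ["If-Modified-Since"] = LastModifiedUtc.Value.ToString ("R");  // RFC 1123 date
        return headers;
    }
}
```

The basic implementation described above would only need IsFresh and would drop the entry once it returns false. Note that with HttpWebRequest, If-Modified-Since is a restricted header and is set through the IfModifiedSince property rather than the Headers collection.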

I'd personally cache the results in memory using System.Runtime.Caching (http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx) rather than on disk, as on-disk caching can get slow. I haven't checked if Mono supports this, though.
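As a rough sketch of the in-memory option, the following uses MemoryCache from System.Runtime.Caching with an absolute expiration derived from max-age. InMemoryResponseStore is a hypothetical wrapper name, storing only the body bytes is a simplification, and the code assumes the System.Runtime.Caching assembly is available to the caller.

```csharp
using System;
using System.Runtime.Caching;

// Hypothetical wrapper; requires a reference to System.Runtime.Caching.
public static class InMemoryResponseStore
{
    static readonly ObjectCache cache = MemoryCache.Default;

    // Stores the body under the URL and lets the cache evict it once maxAge has passed.
    public static void Store (string url, byte[] body, TimeSpan maxAge)
    {
        cache.Set (url, body, DateTimeOffset.UtcNow.Add (maxAge));
    }

    // Returns the cached body, or null when the URL is absent or has expired.
    public static byte[] TryGet (string url)
    {
        return cache.Get (url) as byte[];
    }
}
```

Letting the cache handle expiry matches the "just clear the cache when the max age is met" approach, at the cost of losing entries when the process exits.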


On Wed, Jun 5, 2013 at 11:32 AM, Mark Lintner <mlintner at sinenomine.net> wrote:

Hi Greg,



Great input. I have not gotten that far yet. Also, the cache level handling is very naïve: everything caches except Bypass cache, which doesn't cache. To check the headers, it would check the WebResponse Headers property coming into TryInsert in DirectoryBackedResponseCache.cs. I would combine the result of checking the header with the conditional ShouldCache, which also needs work. What we are looking for is comments like this, about what you think needs to be in a usable implementation, gotchas, and things to look out for. Also, if we are on the wrong track doing it this way, I don't want to waste time doing something which will not be useful. Assuming I'm not on the wrong path, I will add the header check next. This is the kind of feedback we need. I only tried to make a prototype of how caching could work and then did some performance measurements. Obviously I didn't add any of the details yet. If this is a viable approach I will continue to fill out the implementation, guided by research and suggestions. We probably want to limit it to what we really need to have. Any other suggestions, comments, or criticism would be greatly appreciated.



Thanks,

Mark

________________________________
From: Greg Young [gregoryyoung1 at gmail.com]
Sent: Tuesday, June 04, 2013 6:49 PM
To: Mark Lintner
Cc: mono-devel-list at lists.ximian.com
Subject: Re: [Mono-dev] mono WebResponse caching

I would expect this to look at the HTTP caching headers in the response when caching, such as expiration, whether something is cacheable (public/private) etc., but I don't see any of that code (or maybe I missed it). Can you point to where it is in your implementation?

On Tuesday, June 4, 2013, Mark Lintner wrote:

We have noticed that parts of the Mono Web functionality are not yet implemented. We are hoping to be able to add some of these missing pieces. Caching of HttpWebResponses is the first one we took a look at. It would appear that Microsoft takes advantage of the Internet Explorer caching. Since an HttpWebResponse appears to be fully populated when it is returned from the cache, and the only content in the cached URLs is the bytes that come in on the wire, Microsoft must return things from the cache at a very low level and let the response populate the normal way. This did not seem possible to emulate when I looked at Mono. I also wanted to cache to a directory so that the URLs can be inspected or deleted.

After looking at the code for a while I realized that I could save additional information from the response (i.e. the header collection, cookie container, status code, etc.) in the same file as the response stream. A response can be serialized, but not all of it; most importantly, the response stream is not serialized. I wrote a prototype with some helper functions; the only change to existing Mono code would be in HttpWebRequest.GetResponse. Before calling BeginGetResponse I call a function that tries to retrieve the URL from the cache. If it is not there, BeginGetResponse is called as normal, and the response is grabbed from EndGetResponse and passed to a method that inserts it into the cache.

This insert method opens a file in a directory designated as mono_cache, with the name produced by calling HttpUtility.Encode so it is a legal filename. Also, you can only read from the response stream once, which means that if you read from it and save it to a file, the response stream is no longer accessible from the WebResponse when it is returned to the calling code. If stream copy worked it could be used here. One way to fix the problem is to call GetResponseStream(), read it to the end, which produces a string, then finally convert the string to bytes. Then you can serialize the bytes to the open file, serialize the header collection and individually add any other field you want in the response when it is retrieved from the cache, then close the file.

Then create a new WebConnectionData instance and populate it with the headers and all the relevant fields from the old response, finally passing the bytes of data to a MemoryStream constructor and assigning the result to the WebConnectionData's stream member. Then I construct a new HttpWebResponse, passing the WebConnectionData and other arguments which are taken from the existing request and the recently retrieved response. The response is then returned to GetResponse, then finally to the calling code, good as new. The next time the URL is requested, the cached file is deserialized one member at a time, then a WebConnectionData is set up and, finally, an HttpWebResponse is constructed from the info read from the file, the WebConnectionData and the members of the new request.

I concentrated on prototyping caching of HttpWebResponses: only a simple cache algorithm, no Ftp, no File, etc. No cache aging, etc. There are many other things that would have to be done for this to be usable.
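The buffer-then-serialize step described above might look roughly like the sketch below. It is not the code from ResponseCache.cs: CacheFileFormat, its methods and the file layout (status code, headers, then body) are assumptions made for illustration. It reads the one-shot stream as raw bytes with a manual loop, which avoids depending on a stream copy helper and also avoids the string round trip, which can corrupt binary content. Headers are stored as a simple name/value map, so repeated header names are not preserved.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical on-disk layout: status code, header pairs, then the body bytes.
public static class CacheFileFormat
{
    // Reads a one-shot stream fully so the bytes can be both saved and replayed.
    public static byte[] ReadAll (Stream source)
    {
        using (var buffer = new MemoryStream ())
        {
            byte[] chunk = new byte [8192];
            int read;
            while ((read = source.Read (chunk, 0, chunk.Length)) > 0)
                buffer.Write (chunk, 0, read);
            return buffer.ToArray ();
        }
    }

    public static void Write (Stream output, int statusCode,
                              IDictionary<string, string> headers, byte[] body)
    {
        var writer = new BinaryWriter (output);
        writer.Write (statusCode);
        writer.Write (headers.Count);
        foreach (KeyValuePair<string, string> pair in headers)
        {
            writer.Write (pair.Key);
            writer.Write (pair.Value);
        }
        writer.Write (body.Length);
        writer.Write (body);
        writer.Flush ();   // the caller owns and closes the stream
    }

    public static void Read (Stream input, out int statusCode,
                             out Dictionary<string, string> headers, out byte[] body)
    {
        var reader = new BinaryReader (input);
        statusCode = reader.ReadInt32 ();
        int count = reader.ReadInt32 ();
        headers = new Dictionary<string, string> ();
        for (int i = 0; i < count; i++)
        {
            string key = reader.ReadString ();
            headers [key] = reader.ReadString ();
        }
        int length = reader.ReadInt32 ();
        body = reader.ReadBytes (length);
    }
}
```

On a cache hit, the bytes returned by Read can be wrapped in a new MemoryStream and handed to whatever constructs the response, so the caller gets a stream it can read from the beginning.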
Using the prototype I requested www.google.com and it took 0.1601419 sec to retrieve and cache it, including all the mechanics described above. The second time, returning it from the cache and building the HttpWebResponse took 0.0007671 sec. This is roughly a 209 times speedup (0.1601419 / 0.0007671). That kind of performance increase, from a prototype that is not yet optimized, is compelling. I must add that I measured the same experiment with Google under the .NET 4.5 implementation and the speedup is 769x. Obviously that's something to work toward. Even falling short of that level of speedup, some implementation of response caching for Mono would greatly benefit the users who need better performance of services, REST, etc.
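For anyone wanting to repeat the measurement, a small timing helper along these lines would do. CacheTiming and its methods are hypothetical names; the action passed in would be whatever issues the request, and a single run like this is only a rough indication since network latency varies from call to call.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for comparing an uncached and a cached request.
public static class CacheTiming
{
    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure (Action action)
    {
        Stopwatch watch = Stopwatch.StartNew ();
        action ();
        watch.Stop ();
        return watch.Elapsed;
    }

    // Ratio of the uncached time to the cached time.
    public static double Speedup (TimeSpan uncached, TimeSpan cached)
    {
        return uncached.TotalSeconds / cached.TotalSeconds;
    }
}
```

Calling Measure twice around the same request, once with an empty cache and once with the entry present, and passing both results to Speedup gives the ratio quoted above.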

So far this is the only Mono code that would change to support caching, from HttpWebRequest.cs:

  public override WebResponse GetResponse()
  {
   // Serve the response from the cache when it is already there.
   WebResponse response = ResponseCache.TryRetrieve (this);

   if (response == null)
   {
    // Cache miss: perform the request as normal.
    WebAsyncResult result = (WebAsyncResult) BeginGetResponse (null, null);
    response = EndGetResponse (result);
    response.IsFromCache = false;

    // TryInsert writes the response to the cache and returns a rebuilt
    // response whose stream can still be read by the caller.
    HttpWebResponse response2 = ResponseCache.TryInsert ((HttpWebRequest)this, (HttpWebResponse)response);
    response.Close ();
    return response2;
   }

   return response;
  }



The code can stand some improvement. It is a prototype and it shows. I have included the file ResponseCache.cs, which the above code hooks into. We were wondering whether anyone has tried to implement response caching before and whether there is any intent to in the future. What are your thoughts about the approach described above and shown in the ResponseCache.cs file and the 3 other files I included? Bear in mind it would be refactored so that it fits into the Mono architecture a bit better; it needs to be generalized, different strategies for different concrete response types need to be plugged in, it needs configuration, and there is a plethora of things I missed. If you feel this approach has merit I can detail all that separately. If you feel a different approach would work better, please let us know; in the end we're looking to make Mono more robust in some of the ways that are expected but not yet implemented.

The main thing I want to show here is how it is possible to wedge a cache strategy into Mono's request pipeline without changing or destabilizing Mono itself. The cache implementation could change also; one way that this code falls short is that it totally violates open-closed. Is this a viable approach? There would of course be much more to do that I have not even mentioned here. Your comments would be appreciated.
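One possible way to address the open-closed concern is to put the storage strategy behind an interface so that the hook in GetResponse does not change when the implementation does. The sketch below is an assumption about how that could look, not the prototype's design: IResponseCacheStore, DictionaryCacheStore and ResponseCacheRegistry are hypothetical names, and the payload is reduced to a byte array for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the storage strategy.
public interface IResponseCacheStore
{
    bool TryRetrieve (string url, out byte[] payload);
    void Insert (string url, byte[] payload);
}

// Simplest strategy: an in-memory dictionary guarded by a lock.
public sealed class DictionaryCacheStore : IResponseCacheStore
{
    readonly Dictionary<string, byte[]> entries = new Dictionary<string, byte[]> ();
    readonly object gate = new object ();

    public bool TryRetrieve (string url, out byte[] payload)
    {
        lock (gate) { return entries.TryGetValue (url, out payload); }
    }

    public void Insert (string url, byte[] payload)
    {
        lock (gate) { entries [url] = payload; }
    }
}

// Single place where the active strategy is chosen, e.g. from configuration.
public static class ResponseCacheRegistry
{
    static IResponseCacheStore current = new DictionaryCacheStore ();

    public static IResponseCacheStore Current
    {
        get { return current; }
        set
        {
            if (value == null)
                throw new ArgumentNullException ("value");
            current = value;
        }
    }
}
```

A directory-backed store would then be a second implementation of the same interface, and strategies for other response types could be added without touching the request code.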

The files can be found here.

http://pastebin.com/FCefsiDx
http://pastebin.com/8qBh8mMW
http://pastebin.com/xVKLaidG
http://pastebin.com/NMVNv30T






--
Le doute n'est pas une condition agréable, mais la certitude est absurde.

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list at lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list

