[Mono-dev] BufferedStream.ReadByte and WriteByte are extremely inefficient
Kornél Pál
kornelpal at gmail.com
Sun Jan 10 13:03:34 EST 2010
Hi,
BufferedStream should buffer the data in both directions that is
actually implemented for Read and Write methods but not for ReadByte and
WriteByte methods.
The most efficient implementation would be to directly use the same
buffer as Read and Write uses and move the common buffer
filling/flushing functionality into helper methods (or inlined if only a
few lines could be shared).
Also note that assuming a good FileStream (or any other already buffered
stream) implementation BufferedStream cannot perform better, so if you
know that your underlaying stream is already buffered, you shouldn't
wrap it in a BufferedStream.
The idea behind BufferedStream is that you access the stream in small
chunks (smaller than bufferSize) so I believe that the performance
should be optimized for cases when the data being read is already in
buffer and for the cases when the data being written is written to the
buffer. Of course using a fixed one-byte array would speed up the
operation but to maximize the buffered performance ReadByte should not
directly call Read, at least not in the case when data is available in
the internal buffer.
Kornél
Alan McGovern wrote:
> Alternatively you could just allocate and retain a 1 byte array inside
> the BufferedStream class and constantly re-use it. BinaryReader already
> uses this approach. As BufferedStream is sealed, the exact approach used
> doesn't particularly matter as the user can't tell the difference.
>
> Would any of the other Stream subclasses benefit from similar
> optimisations?
>
> Alan.
>
> On Sun, Jan 10, 2010 at 3:35 PM, Tom Philpot <tom.philpot at logos.com
> <mailto:tom.philpot at logos.com>> wrote:
>
> It looks like the implementation of ReadByte and WriteByte in
> BufferedStream follow the default behavior for Stream by allocating
> a 1-byte array and then calling Read() or Write(). This is exactly
> what the MSDN docs for ReadByte and WriteByte ask implementers NOT
> to do:
>
> >From
> http://msdn.microsoft.com/en-us/library/system.io.stream.readbyte.aspx
> Notes to Implementers:
>
> The default implementation on Stream creates a new single-byte array
> and then calls Read. While this is formally correct, it is
> inefficient. Any stream with an internal buffer should override this
> method and provide a much more efficient version that reads the
> buffer directly, avoiding the extra array allocation on every call.
>
> Granted, most people don't really ever read one byte at a time from
> a Stream, but in our case, we need to.
>
> In my simple tests, reading 1.5 GB of data using ReadByte from
> BufferedStream versus FileStream yielded the following the following
> results (Late 2008 MacBook 13.3" 2.0 Ghz, 5400 rpm disk):
> Test #1:
> Using default BufferedStream ReadByte implementation: 763.814 seconds
> Using FileStream ReadByte implementation: 43.53 seconds
>
> Test #2:
> Using default BufferedStream ReadByte implementation: 765.427 seconds
> Using FileStream ReadByte implementation: 42.678 seconds
>
> Obviously the alloc and GC cost of this one byte array is huge. I
> just thought I'd throw this out there in case one of the Mono devs
> (or someone else) wanted to work on a patch before I got a chance to
> submit one sometime on Monday.
>
> Thanks,
> Tom
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> <mailto:Mono-devel-list at lists.ximian.com>
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
More information about the Mono-devel-list
mailing list