[Mono-dev] Should we replace MemoryStream?

Avery Pennarun apenwarr at gmail.com
Tue Nov 10 11:34:23 EST 2009


On Tue, Nov 10, 2009 at 11:24 AM, Robert Jordan <robertj at gmx.net> wrote:
> Right, but MemoryStream is pretty prevalent and one of its frequent
> usage patterns is:
>
> var ms = new MemoryStream (); // or: new MemoryStream (somepredictedsize)
> // fill ms with some stream APIs
> ms.Close ();
> var bytes = ms.GetBuffer ();
> // pass `bytes' to byte[] APIs (e.g. unmanaged world)

But my argument is that your line

  // fill ms with some stream APIs

might or might not result in the array being reallocated even in the
*naive* implementation.  Each reallocation copies the entire buffer.
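
A minimal sketch of that reallocation cost, for illustration only.
NaiveBuffer, its counters, and the capacity-doubling growth policy are
all assumptions made for this example; this is not Mono's MemoryStream
code.

```csharp
using System;

// Hypothetical growable buffer, for illustration only.
public sealed class NaiveBuffer
{
    private byte[] _data;
    private int _length;

    // Bytes moved by reallocations (not counting the writes themselves).
    public long BytesCopied { get; private set; }
    public int Reallocations { get; private set; }
    public int Length { get { return _length; } }

    public NaiveBuffer(int capacity = 16)
    {
        _data = new byte[Math.Max(capacity, 1)];
    }

    public void Write(byte[] src, int offset, int count)
    {
        int needed = _length + count;
        if (needed > _data.Length)
        {
            // Assumed growth policy: double the capacity until the write fits.
            int newCapacity = _data.Length;
            while (newCapacity < needed) newCapacity *= 2;

            var bigger = new byte[newCapacity];
            // Every reallocation copies everything written so far.
            Buffer.BlockCopy(_data, 0, bigger, 0, _length);
            BytesCopied += _length;
            Reallocations++;
            _data = bigger;
        }
        Buffer.BlockCopy(src, offset, _data, _length, count);
        _length += count;
    }

    // Returns the backing array without copying; it may be longer than Length.
    public byte[] GetBuffer()
    {
        return _data;
    }
}
```

Under these assumptions, writing 1000 bytes one at a time into
new NaiveBuffer(16) reallocates 6 times and copies 1008 bytes along the
way, while new NaiveBuffer(1000) never reallocates.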

Conversely, a chunked implementation would reallocate-and-copy the
data at most once, when you call GetBuffer().  So it is strictly
equal-or-better than the naive implementation, in terms of
reallocations and copies.

The only exception is if someone provides a huge somepredictedsize; if
you decide that "gosh, that's way too big for a single chunk!" and
allocate less than the predicted size, and then they use up the whole
predicted size so you allocate more chunks, and then they call
GetBuffer, you will be slower because you do one copy instead of zero.
However, this is avoidable by simply honouring somepredictedsize and
allocating the initial chunk to be the requested size.  If an app does
that and gets tons of fragmentation, well, they can stop requesting
such huge buffers.

>> For example, the first call to GetBuffer() could "coagulate" the
>> chunks into a single big array (perhaps with extra space at the end),
>> and then *keep that array*.  Subsequent calls to GetBuffer() could
>> avoid the copy.
>
> GetBuffer () is usually called only once per instance.

The argument in this thread is that "usually" is not good enough.  If
some programs call GetBuffer() more than once and the chunked stream
is inefficient in that case, it would be unacceptable.  I'm not
endorsing the behaviour of calling GetBuffer over and over, but simply
saying that it's easy to implement a chunked stream where this problem
is avoided (and I've done so in the past; in fact it's the most
obvious way to implement it).
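
A minimal sketch of such a chunked stream, under stated assumptions.
ChunkedBuffer, the 4096-byte chunk size, the CoagulateCopies counter,
and the "double the length" spare-room policy are hypothetical choices
for this example; it is not the implementation referred to above, nor
Mono's code.  It honours the predicted size for the first chunk, never
moves written bytes when growing, and keeps the merged array so that a
repeated GetBuffer() call returns it without copying.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical chunked buffer, for illustration only.
public sealed class ChunkedBuffer
{
    private const int DefaultChunkSize = 4096;   // assumed chunk size

    private readonly List<byte[]> _chunks = new List<byte[]>();
    private int _usedInLast;   // bytes used in the last chunk
    private int _length;       // total bytes written

    // Number of times the chunks were merged into one array.
    public int CoagulateCopies { get; private set; }
    public int Length { get { return _length; } }

    public ChunkedBuffer(int predictedSize = 0)
    {
        // Honour the caller's prediction: the first chunk is exactly that big.
        _chunks.Add(new byte[predictedSize > 0 ? predictedSize : DefaultChunkSize]);
    }

    public void Write(byte[] src, int offset, int count)
    {
        while (count > 0)
        {
            byte[] last = _chunks[_chunks.Count - 1];
            if (_usedInLast == last.Length)
            {
                // Growing adds a chunk; bytes already written are not moved.
                last = new byte[Math.Max(DefaultChunkSize, count)];
                _chunks.Add(last);
                _usedInLast = 0;
            }
            int n = Math.Min(count, last.Length - _usedInLast);
            Buffer.BlockCopy(src, offset, last, _usedInLast, n);
            _usedInLast += n;
            _length += n;
            offset += n;
            count -= n;
        }
    }

    public byte[] GetBuffer()
    {
        if (_chunks.Count > 1)
        {
            // Merge once, with spare room at the end, and keep the result as
            // the only chunk so later calls return it without copying.
            var merged = new byte[Math.Max(_length * 2, DefaultChunkSize)];
            int pos = 0;
            for (int i = 0; i < _chunks.Count; i++)
            {
                // Every chunk except the last one is full.
                int used = (i == _chunks.Count - 1) ? _usedInLast : _chunks[i].Length;
                Buffer.BlockCopy(_chunks[i], 0, merged, pos, used);
                pos += used;
            }
            _chunks.Clear();
            _chunks.Add(merged);
            _usedInLast = _length;
            CoagulateCopies++;
        }
        // Like MemoryStream.GetBuffer(), the array may be longer than Length.
        return _chunks[0];
    }
}
```

In this sketch a stream that stays within its predicted size is never
copied at all, and one that outgrows it is copied once by the first
GetBuffer() call; a second call with no writes in between returns the
same array.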

Have fun,

Avery

