[Mono-bugs] [Bug 574597] New: [Regression?] poor compression in mono 2.6 with System.IO.Compression.GZipStream

bugzilla_noreply at novell.com bugzilla_noreply at novell.com
Wed Jan 27 22:14:15 EST 2010


http://bugzilla.novell.com/show_bug.cgi?id=574597

http://bugzilla.novell.com/show_bug.cgi?id=574597#c0


           Summary: [Regression?] poor compression in mono 2.6 with
                    System.IO.Compression.GZipStream
    Classification: Mono
           Product: Mono: Class Libraries
           Version: 1.2.0
          Platform: x86-64
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: System
        AssignedTo: mono-bugs at lists.ximian.com
        ReportedBy: htl10 at users.sourceforge.net
         QAContact: mono-bugs at lists.ximian.com
          Found By: ---
           Blocker: ---


Created an attachment (id=339278)
 --> (http://bugzilla.novell.com/attachment.cgi?id=339278)
use NO_FLUSH instead of SYNC_FLUSH on compression

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.9.1.6)
Gecko/20100107 Fedora/3.5.6-1.fc12 Firefox/3.5.6

I posted this to mono-devel and thought I should file it properly, with some editing:

--- On Wed, 27/1/10, Hin-Tak Leung <> wrote:


> I have a small application which writes gzip'ed data like
> this:
>
> sw = new BinaryWriter(new GZipStream(
>       new FileStream(filename, FileMode.Create, FileAccess.Write,
>                      FileShare.None),
>       CompressionMode.Compress, true));
> sw.Write(...);
> sw.Write(...);
> sw.Flush();
> sw.Close();
>
> It used to work fine with mono 2.4, and still does, in a way,
> with mono 2.6.

.. What happens is that the output file is now a lot larger.

>
> The uncompressed data is 1234127 bytes, and is still
> recoverable in an identical manner from the output with
> gzip -dc; but the output file is now 8126815 bytes with mono
> 2.6.x instead of 282363 bytes under mono 2.4.x.

....


I found the source of the poor GZipStream compression in mono 2.6: it is
r138254, which rewrites the GZipStream implementation:

------------------
Author: gonzalo <gonzalo at e3ebcda4-bce8-0310-ba0a-eca2169e7518>
Date:   Tue Jul 21 02:25:00 2009 +0000

    2009-07-20 Gonzalo Paniagua Javier <gonzalo at novell.com>

        * Makefile.am: replaced zlib_macros.c with zlib-helper.c
        * zlib_macros.c: Removed file.
        * zlib-helper.c: new interface for DeflateStream. Flush() actually
        does something.
------------------

The problem is that this change makes the zlib code compress on every write
(before the change, it buffered according to the zlib default, which tries to
compress about 32k of input at a time). Most of my little program calculates
and writes 1 byte at a time, so the output is mostly a 5-byte zlib header
overhead plus 1 byte of data for each write, and the file ends up about 6 times
the size of the uncompressed data.

I have tried and tested this patch, which restores the zlib default so that the
compression code buffers data before compressing it. With this patch, I can just
replace libMonoPosixHelper.so and get compression behavior similar to mono
2.4.

So with the mono 2.4 behavior, and with this patch, 1.2MB of data is written
out as 280k; without this patch, mono 2.6 writes out 8.1MB (about x6).
Curiously, the GZipStream implementation in Microsoft's .NET runtime seems to
do something in between: it generates a file of 2MB, which possibly means it
buffers and compresses in 4-byte chunks, not 1 byte and not 32k (5-byte
overhead + 2-3 bytes after compression).

---------------------
diff --git a/support/zlib-helper.c b/support/zlib-helper.c
index 1d61c3d..f8f1587 100644
--- a/support/zlib-helper.c
+++ b/support/zlib-helper.c
@@ -207,7 +207,7 @@ WriteZStream (ZStream *stream, guchar *buffer, gint length)
             zs->next_out = stream->buffer;
             zs->avail_out = BUFFER_SIZE;
         }
-        status = deflate (stream->stream, Z_SYNC_FLUSH);
+        status = deflate (stream->stream, Z_NO_FLUSH);
         if (status != Z_OK && status != Z_STREAM_END)
             return status;

----------------------

Reproducible: Always

Steps to Reproduce:
1. See the skeleton code and description above.

Actual Results:
Data increases in size.

Expected Results:
Data decreases in size.

NO_FLUSH is the default


