[Mono-dev] ASCII Strings Proposal

Jon Purdy jopur at microsoft.com
Wed Jul 27 18:45:46 UTC 2016

For reference, only the following small patches were required to run Xamarin Studio:

libgit2sharp: https://github.com/evincarofautumn/libgit2sharp/commit/4508aa2157448456a6a35733e0040ae2686302dd
Roslyn: https://github.com/evincarofautumn/roslyn/commit/8945af94ece76c54525facb1a2458e5370d56a09
maccore: https://github.com/evincarofautumn/maccore/commit/f67a77d27ae51864e38ebc1857ec58ea7ac23519

These are of course experimental, but I want to give a sense of how much work it is to patch code that depends on the current String representation.

From: Jonathan Purdy <jopur at microsoft.com>
Date: Wednesday, July 27, 2016 at 11:35 AM
To: "mono-devel-list at lists.dot.net" <mono-devel-list at lists.dot.net>
Cc: "dotnet-runtime-dev at lists.dot.net" <dotnet-runtime-dev at lists.dot.net>
Subject: ASCII Strings Proposal

I have written a description of my prototype implementation of adaptive ASCII/UTF-16 strings in Mono:



> For historical reasons, System.String uses the UCS-2 character encoding, that is, UTF-16 without surrogate pairs.

> However, most strings in typical .NET applications consist solely of ASCII characters, leading to wasted space: half of the bytes in a string are likely to be null bytes!

> Since strings are immutable, we can scan the character data when the string is constructed, then dynamically select an encoding, thereby saving 50% of string memory in most cases.
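
To make the idea in the excerpt concrete, here is a minimal sketch in Python (not the Mono prototype's code). It assumes the simplest possible policy: scan the characters once at construction, store one byte per character if all of them are below 0x80, and fall back to UTF-16 otherwise. The names choose_encoding, encode_compact, and decode_compact are hypothetical and exist only for illustration.

```python
from typing import Tuple


def choose_encoding(text: str) -> str:
    """Scan the text once; pick 'ascii' only if every code point is below 0x80."""
    return "ascii" if all(ord(ch) < 0x80 for ch in text) else "utf-16"


def encode_compact(text: str) -> Tuple[str, bytes]:
    """Return (encoding tag, payload), using one byte per character when possible."""
    encoding = choose_encoding(text)
    if encoding == "ascii":
        return encoding, text.encode("ascii")
    # Fallback: two-byte code units, little-endian, no byte-order mark.
    return encoding, text.encode("utf-16-le")


def decode_compact(encoding: str, payload: bytes) -> str:
    """Recover the original text from the tagged payload."""
    codec = "ascii" if encoding == "ascii" else "utf-16-le"
    return payload.decode(codec)


if __name__ == "__main__":
    for sample in ("hello", "h\u00e9llo"):
        tag, payload = encode_compact(sample)
        print(sample, tag, len(payload))
```

Because the string never changes after construction, the tag chosen by the single scan stays valid for the lifetime of the value, which is what makes the per-string choice safe in this sketch.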

I would like to solicit feedback on this proposal from runtime developers and users alike. In particular:

- Specific objections regarding performance characteristics, compatibility issues, etc.

- Questions about unclear or underspecified parts of the proposal

- Real-world use cases that would benefit from this optimization

- Suggestions for suitable real-world benchmarks

Thank you!
