[Mono-dev] ASCII Strings Proposal
jopur at microsoft.com
Wed Jul 27 18:45:46 UTC 2016
For reference, only the following small patches were required to run Xamarin Studio:
These are of course experimental, but I want to give a sense of how much work it is to patch code that depends on the current String representation.
From: Jonathan Purdy <jopur at microsoft.com>
Date: Wednesday, July 27, 2016 at 11:35 AM
To: "mono-devel-list at lists.dot.net" <mono-devel-list at lists.dot.net>
Cc: "dotnet-runtime-dev at lists.dot.net" <dotnet-runtime-dev at lists.dot.net>
Subject: ASCII Strings Proposal
I have written a description of my prototype implementation of adaptive ASCII/UTF-16 strings in Mono:
> For historical reasons, System.String uses the UCS-2 character encoding, that is, UTF-16 without surrogate pairs.
> However, most strings in typical .NET applications consist solely of ASCII characters, leading to wasted space: half of the bytes in a string are likely to be null bytes!
> Since strings are immutable, we can scan the character data when the string is constructed, then dynamically select an encoding, thereby saving 50% of string memory in most cases.
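As a rough illustration of the scan-then-select idea in the excerpt above, here is a minimal Python sketch. It is not the Mono prototype: the names `pack` and `unpack` and the tag values are hypothetical, and a real runtime would keep the encoding flag in the string object header rather than in a tuple.

```python
from typing import Tuple

# Hypothetical tags for the two representations.
ASCII = "ascii"
UTF16 = "utf-16-le"


def pack(text: str) -> Tuple[str, bytes]:
    """Scan the text once at construction time and pick the narrowest encoding."""
    # str.isascii() is True only if every code point is below 0x80.
    encoding = ASCII if text.isascii() else UTF16
    return encoding, text.encode(encoding)


def unpack(encoding: str, payload: bytes) -> str:
    """Decode a payload using the encoding chosen by pack()."""
    return payload.decode(encoding)


def bytes_saved(text: str) -> int:
    """Bytes saved relative to always storing two bytes per UTF-16 code unit."""
    _, payload = pack(text)
    return len(text.encode(UTF16)) - len(payload)
```

Because the string never changes after construction, the scan happens once and the chosen tag stays valid for the string's lifetime. An all-ASCII string takes one byte per character instead of two; a string with any non-ASCII character falls back to UTF-16 and saves nothing.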
I would like to solicit feedback on this proposal from runtime developers and users alike. In particular:
- Specific objections regarding performance characteristics, compatibility issues, &c.
- Questions about unclear or underspecified parts of the proposal
- Real-world use cases that would benefit from this optimization
- Suggestions for suitable real-world benchmarks