[Mono-dev] [PATCH] Use UTF-8 encoding for source files in mcs tree and for ChangeLogs

Miguel de Icaza miguel at ximian.com
Wed Aug 16 11:18:42 EDT 2006


Hello,

     We are in the process of cooking up 1.1.17; I would like to
postpone this patch until 1.1.17 is released, so we have more time for
testing.

     1.1.17 will be done next week.



> Kornél Pál wrote:
> > Hi,
> > 
> > Note that I posted the patch for the mcs tree uncompressed, but it was too
> > large and hasn't been approved for the list yet, so I am resending it zipped.
> > 
> > Currently source files (*.cs; *.vb) use different encodings:
> > - ASCII
> > - Latin 1
> > - UTF-8 (without BOM)
> > - UTF-8 (with BOM)
> > 
> > The same is true for ChangeLogs, some of which also mixed encodings within one file.
> > 
> > Our mcs compiler doesn't recognize UTF-8 without BOM, so those files are
> > compiled incorrectly, as if they were in Latin 1.
> > 
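To illustrate the problem described above, here is a minimal C# sketch (not taken from mcs; the class and method names are hypothetical) of what happens when UTF-8 bytes without a BOM are decoded with the Latin 1 code page 28591, the default in config-default.make before this patch. It also shows the three-byte check that identifies a UTF-8 BOM.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // True when the buffer starts with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    }

    // Encodes the text as UTF-8 (no BOM), then decodes those bytes as
    // Latin 1 (code page 28591), mimicking a compiler that assumes Latin 1.
    public static string MisreadAsLatin1(string text)
    {
        byte[] utf8 = new UTF8Encoding(false).GetBytes(text);
        return Encoding.GetEncoding(28591).GetString(utf8);
    }

    static void Main()
    {
        // U+00E4 is encoded in UTF-8 as the two bytes C3 A4, and Latin 1
        // maps every byte to exactly one character, so this prints 2.
        Console.WriteLine(MisreadAsLatin1("\u00E4").Length);
    }
}
```

Every non-ASCII character in a string literal would be silently replaced by two or more Latin 1 characters in this way, which is why the compiler needs either a BOM or an explicit code page.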
> > All of our source files should use the same encoding for consistency, which
> > makes the code more maintainable as well. UTF-8 without BOM seems to be a good
> > solution as it supports every Unicode character, so this is a long-term
> > solution for the problem.
> > 
> > I used the attached Latin1ToUtf8.cs to convert the encoding of source files,
> > and I reviewed each modified character to make sure that the file was
> > converted from the right encoding to UTF-8.
> > 
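The attached Latin1ToUtf8.cs is not reproduced in this message. As a rough idea of what such a conversion involves, the following is an independent sketch under stated assumptions, not the attached tool: it assumes each input is either valid UTF-8 (with or without BOM) or Latin 1, and all names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

static class Latin1ToUtf8Sketch
{
    static readonly Encoding Latin1 = Encoding.GetEncoding(28591);

    // No BOM on output; throw on invalid input bytes instead of replacing them.
    static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Returns the content re-encoded as UTF-8 without BOM.
    public static byte[] ToUtf8NoBom(byte[] input)
    {
        bool hasBom = input.Length >= 3
            && input[0] == 0xEF && input[1] == 0xBB && input[2] == 0xBF;
        int offset = hasBom ? 3 : 0;

        string text;
        try
        {
            // Valid UTF-8 (pure ASCII included) decodes without error.
            text = StrictUtf8.GetString(input, offset, input.Length - offset);
        }
        catch (ArgumentException)
        {
            // Assumption: anything that is not valid UTF-8 is Latin 1.
            text = Latin1.GetString(input);
        }
        return StrictUtf8.GetBytes(text);
    }

    static void Main(string[] args)
    {
        // Rewrites each file named on the command line in place.
        foreach (string path in args)
            File.WriteAllBytes(path, ToUtf8NoBom(File.ReadAllBytes(path)));
    }
}
```

A heuristic like this can guess wrong, since a Latin 1 file may happen to contain byte sequences that are also valid UTF-8, so checking each changed character by hand, as described above, remains necessary.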
> > Additional modifications:
> > 
> > mcs/build/config-default.make: Use UTF-8 as the default encoding for
> > compilers
> > mcs/class/Managed.Windows.Forms/Makefile: Removed CODEPAGE as UTF-8 is the
> > default
> > mcs/class/Microsoft.VisualBasic/Makefile: Removed CODEPAGE as UTF-8 is the
> > default
> > 
> > Replaced unnecessary non-ASCII characters with visually identical ASCII
> > characters in:
> > mcs/class/Mono.GetOptions/GetOptTest/GetOptTester.cs
> > mcs/class/System.Drawing/Test/DrawingTest/Exocortex.DSP/src/ComplexF.cs
> > mcs/class/System.Drawing/Test/DrawingTest/Exocortex.DSP/src/Complex.cs
> > mcs/class/Microsoft.VisualBasic/Test/standalone/6797.vb
> > 
> > Note that removing the BOM and the above replacements resulted in 129 more ASCII
> > files that were previously non-ASCII because of a BOM or non-ASCII characters.
> > 
> > I think that there are no side effects of these patches but comments are
> > welcome.
> > 
> > Please review and approve the patches.
> 
> Thanks a bunch for the patch! I attached the result of my audit
> as "only meaningful code changes" i.e. I believe I read the
> entire changes ;-)
> 
> If there is no objection within a reasonable number of days, let's check the patch in.
> 
> Atsushi Eno
> plain text document attachment (mcs-utf8-reduced.diff)
> Index: mcs/build/config-default.make
> ===================================================================
> --- mcs/build/config-default.make	(revision 63811)
> +++ mcs/build/config-default.make	(working copy)
> @@ -6,8 +6,8 @@
>  # DO NOT EDIT THIS FILE! Create config.make and override settings
>  # there.
>  
> -# Use ISO-8859-1 (Latin 1) as the default encoding for compilers
> -CODEPAGE = 28591
> +# Use UTF-8 as the default encoding for compilers
> +CODEPAGE = 65001
>  
>  RUNTIME_FLAGS = 
>  TEST_HARNESS = $(topdir)/class/lib/$(PROFILE)/nunit-console.exe
> Index: mcs/class/Mono.Data.SqliteClient/Test/SqliteCommandUnitTests.cs
> ===================================================================
> --- mcs/class/Mono.Data.SqliteClient/Test/SqliteCommandUnitTests.cs	(revision 63811)
> +++ mcs/class/Mono.Data.SqliteClient/Test/SqliteCommandUnitTests.cs	(working copy)
> @@ -19,7 +19,7 @@
>  		readonly static string _uri = "SqliteTest.db";
>  		readonly static string _connectionString = "URI=file://" + _uri + ", version=3";
>  		static SqliteConnection _conn = new SqliteConnection (_connectionString);
> -		readonly static string stringvalue = "my keyboard is better than yours : ";
> +		readonly static string stringvalue = "my keyboard is better than yours : äöüß";
>  
>  		public SqliteCommandUnitTests()
>  		{
> @@ -128,7 +128,7 @@
>  		public void ScalarReturn()
>  		{
>  			// This should return the 1 line that got inserted in CreateTable() Test
> -			SqliteCommand cmd = new SqliteCommand("SELECT COUNT(*) FROM t1 WHERE  t LIKE '%'",_conn);
> +			SqliteCommand cmd = new SqliteCommand("SELECT COUNT(*) FROM t1 WHERE  t LIKE '%äöüß'",_conn);
>  			using(_conn)
>  			{
>  				_conn.Open();
> Index: mcs/class/Mono.GetOptions/GetOptTest/GetOptTester.cs
> ===================================================================
> --- mcs/class/Mono.GetOptions/GetOptTest/GetOptTester.cs	(revision 63811)
> +++ mcs/class/Mono.GetOptions/GetOptTest/GetOptTester.cs	(working copy)
> @@ -38,7 +38,7 @@
>  			return WhatToDoNext.GoAhead; 
>  		}
>  
> -		public override WhatToDoNext DoHelp() // uses parents OptionAttribute as is
> +		public override WhatToDoNext DoHelp() // uses parent's OptionAttribute as is
>  		{
>  			base.DoHelp();
>  			return WhatToDoNext.GoAhead; 
> Index: mcs/class/ByteFX.Data/mysqlclient/parameter.cs
> ===================================================================
> --- mcs/class/ByteFX.Data/mysqlclient/parameter.cs	(revision 63811)
> +++ mcs/class/ByteFX.Data/mysqlclient/parameter.cs	(working copy)
> @@ -358,7 +358,7 @@
>  				}
>  				else 
>  				{
> -					if (b == '\\' || b == '\'' || b == '"' || b == '`' || b == '')
> +					if (b == '\\' || b == '\'' || b == '"' || b == '`' || b == '´')
>  						newbytes[newx++] = (byte)'\\';
>  					newbytes[newx++] = b;
>  				}
> Index: mcs/class/System.Drawing/Samples/System.Drawing/FontDrawingAdv.cs
> ===================================================================
> --- mcs/class/System.Drawing/Samples/System.Drawing/FontDrawingAdv.cs	(revision 63811)
> +++ mcs/class/System.Drawing/Samples/System.Drawing/FontDrawingAdv.cs	(working copy)
> @@ -204,31 +204,31 @@
>  			gr.DrawRectangle( new Pen(Color.Green), rect2);			
>  			gr.DrawRectangle( new Pen(Color.Green), rect7);			
>  			
> -			str = "Ara que tinc &vint anys, ara que encara tinc fora,que no tinc l'nima morta, i em sento bullir la sang. (" + f1.Name + ")";			
> +			str = "Ara que tinc &vint anys, ara que encara tinc força,que no tinc l'ànima morta, i em sento bullir la sang. (" + f1.Name + ")";			
>  			gr.DrawString( str,	f1, new SolidBrush(Color.White), rect1, strfmt1);						
>  			gr.DrawString(flagProcessing(strfmt1), fonttxt, brushtxt, calcRect(rect1), strfmttxt);						                                    
>              		sz =  gr.MeasureString (str, f1, new SizeF (rect1.Width, rect1.Height), strfmt1, out chars, out lines);                             			                                
>  			Console.WriteLine("MeasureString str1 [" + str + "] " + sz + ";chars:" + chars + " lines:" + lines);
>  			
> -			str = "Ara que em sento capa de cantar si un altre canta. Avui que encara tinc veu i encara puc creure en dus (" + f2.Name + ")";
> +			str = "Ara que em sento capaç de cantar si un altre canta. Avui que encara tinc veu i encara puc creure en déus (" + f2.Name + ")";
>  			gr.DrawString(str, f2, new SolidBrush(Color.Red),rect2, strfmt2);														
>  			gr.DrawString(flagProcessing(strfmt2), fonttxt, brushtxt, calcRect(rect2), strfmttxt);						
>  			sz =  gr.MeasureString (str, f2, new SizeF (rect2.Width, rect2.Height), strfmt2, out chars, out lines);                             			                                			
>  			Console.WriteLine("MeasureString str2 [" + str + "] " + sz + ";chars:" + chars + " lines:" + lines);
>  			
> -			str = "Vull cantar a les pedres, la terra, l'aigua, al blat i al cam, que vaig trepitjant. (" + f3.Name + ")";
> +			str = "Vull cantar a les pedres, la terra, l'aigua, al blat i al camí, que vaig trepitjant. (" + f3.Name + ")";
>  			gr.DrawString(str,f3, new SolidBrush(Color.White), rect3, strfmt3);				
>  			gr.DrawString(flagProcessing(strfmt3), fonttxt, brushtxt, calcRect(rect3), strfmttxt);			
>  			sz =  gr.MeasureString (str, f3, new SizeF (rect3.Width, rect3.Height), strfmt3, out chars, out lines);                             			                                			
>  			Console.WriteLine("MeasureString str3 [" + str + "] " + sz + ";chars:" + chars + " lines:" + lines);
>  			
> -			str = "A la nit, al cel i a aquet mar tan nostre i al vent que al mat ve a besar-me el rostre (" + f4.Name + ")";				
> +			str = "A la nit, al cel i a aquet mar tan nostre i al vent que al matí ve a besar-me el rostre (" + f4.Name + ")";				
>  			gr.DrawString(str, f4, new SolidBrush(Color.Red),rect4, strfmt4);
>  			gr.DrawString(flagProcessing(strfmt4), fonttxt, brushtxt, calcRect(rect4), strfmttxt);			
>  			sz =  gr.MeasureString (str, f4, new SizeF (rect4.Width, rect4.Height), strfmt4, out chars, out lines);                             			                                			
>  			Console.WriteLine("MeasureString str4 [" + str + "] " + sz + ";chars:" + chars + " lines:" + lines);			
>  			
> -			str = "Vull cantar a les pedres, la terra, l'aigua, al blat i al cam, que vaig trepitjant. (" + f5.Name + ")";
> +			str = "Vull cantar a les pedres, la terra, l'aigua, al blat i al camí, que vaig trepitjant. (" + f5.Name + ")";
>  			gr.DrawString(str, f5, new SolidBrush(Color.White), rect5, strfmt5);
>  			gr.DrawString(flagProcessing(strfmt5), fonttxt, brushtxt, calcRect(rect5), strfmttxt);			
>  			sz =  gr.MeasureString (str, f5, new SizeF (rect5.Width, rect5.Height), strfmt5, out chars, out lines);                             			                                			
> @@ -240,7 +240,7 @@
>  			sz =  gr.MeasureString (str, f6, new SizeF (rect6.Width, rect6.Height), strfmt6, out chars, out lines);                             			                                			
>  			Console.WriteLine("MeasureString str6 [" + str + "] " + sz + ";chars:" + chars + " lines:" + lines);				
>  			
> -			str = "Vull plorar amb aquells que es troben tots sols, sense cap amor van passant pel mn.. (" + f5.Name + ")";
> +			str = "Vull plorar amb aquells que es troben tots sols, sense cap amor van passant pel món.. (" + f5.Name + ")";
>  			gr.DrawString(str, f5, new SolidBrush(Color.White), rect7, strfmt4);
>  			gr.DrawString(flagProcessing(strfmt4), fonttxt, brushtxt, calcRect(rect7), strfmttxt);			
>  			sz =  gr.MeasureString (str, f5, new SizeF (rect7.Width, rect7.Height), strfmt5, out chars, out lines);                             			                                			
> @@ -313,25 +313,25 @@
>  			strfmt14.HotkeyPrefix = HotkeyPrefix.Show;								
>  			strfmt14.FormatFlags = StringFormatFlags.DirectionRightToLeft;
>  			
> -			str = "Vull alar la veu,per cantar als homes que han nascut dempeus (" + f8.Name + ")";
> +			str = "Vull alçar la veu,per cantar als homes que han nascut dempeus (" + f8.Name + ")";
>  			gr.DrawString(str, f8, new SolidBrush(Color.White), rect8, strfmt8);
>  			gr.DrawString(flagProcessing(strfmt8), fonttxt, brushtxt, calcRect(rect8), strfmttxt);			
>  			sz =  gr.MeasureString (str, f8, new SizeF (rect8.Width, rect8.Height), strfmt8, out chars, out lines);                             			                                			
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect8.X, rect8.Y, (int)sz.Width, (int)sz.Height));			
>  			
> -			str = "I no tinc l'nima morta i  em sento bollir la sang (" + f9.Name + ")";
> +			str = "I no tinc l'ànima morta i  em sento bollir la sang (" + f9.Name + ")";
>  			gr.DrawString(str, f9, new SolidBrush(Color.White), rect9, strfmt9);
>  			gr.DrawString(flagProcessing(strfmt9), fonttxt, brushtxt, calcRect(rect9), strfmttxt);			
>  			sz =  gr.MeasureString (str, f9, new SizeF (rect9.Width, rect9.Height), strfmt9, out chars, out lines);                             			                                			
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect9.X, rect9.Y, (int)sz.Width, (int)sz.Height));			
>  			
> -			str = "I no tinc l'nima morta i  em sento bollir la sang (" + f10.Name + ")";
> +			str = "I no tinc l'ànima morta i  em sento bollir la sang (" + f10.Name + ")";
>  			gr.DrawString(str, f10, new SolidBrush(Color.White), rect10, strfmt10);
>  			gr.DrawString(flagProcessing(strfmt10), fonttxt, brushtxt, calcRect(rect10), strfmttxt);			
>  			sz =  gr.MeasureString (str, f10, new SizeF (rect10.Width, rect10.Height), strfmt10, out chars, out lines);                             			                                			
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect10.X, rect10.Y, (int)sz.Width, (int)sz.Height));			
>  			
> -			str = "I no tinc l'nima morta i  em sento bollir la sang (" + f11.Name + ")";
> +			str = "I no tinc l'ànima morta i  em sento bollir la sang (" + f11.Name + ")";
>  			gr.DrawString(str, f11, new SolidBrush(Color.White), rect11, strfmt11);
>  			gr.DrawString(flagProcessing(strfmt11), fonttxt, brushtxt, calcRect(rect11), strfmttxt);			
>  			sz =  gr.MeasureString (str, f11, new SizeF (rect11.Width, rect11.Height), strfmt11, out chars, out lines);                             			                                			
> @@ -342,12 +342,12 @@
>  			sz =  gr.MeasureString (str, f8, new SizeF (rect12.Width, rect12.Height), strfmt12, out chars, out lines);                             			                                						
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect12.X, rect12.Y, (int)sz.Width, (int)sz.Height));			
>  			
> -			str = "Nom\tCognom\tAdrea";
> +			str = "Nom\tCognom\tAdreça";
>  			gr.DrawString(str, f8, new SolidBrush(Color.White), rect13, strfmt13);
>  			sz =  gr.MeasureString (str, f8, new SizeF (rect13.Width, rect13.Height), strfmt13, out chars, out lines);                             			                                						
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect13.X, rect13.Y, (int)sz.Width, (int)sz.Height));			
>  			
> -			str = "Nom Cognom Adrea";
> +			str = "Nom Cognom Adreça";
>  			gr.DrawString(str, f8, new SolidBrush(Color.White), rect14, strfmt14);
>  			sz =  gr.MeasureString (str, f8, new SizeF (rect14.Width, rect13.Height), strfmt14, out chars, out lines);                             			                                						
>  			gr.DrawRectangle(new Pen(Color.Red), new Rectangle (rect14.X, rect14.Y, (int)sz.Width, (int)sz.Height));			
> Index: mcs/class/Managed.Windows.Forms/System.Windows.Forms.RTF/test.cs
> ===================================================================
> --- mcs/class/Managed.Windows.Forms/System.Windows.Forms.RTF/test.cs	(revision 63811)
> +++ mcs/class/Managed.Windows.Forms/System.Windows.Forms.RTF/test.cs	(working copy)
> @@ -226,32 +226,32 @@
>  				}
>  
>  				case Minor.EmDash: {
> -					Console.Write("");
> +					Console.Write("—");
>  					break;
>  				}
>  
>  				case Minor.EnDash: {
> -					Console.Write("");
> +					Console.Write("–");
>  					break;
>  				}
>  
>  				case Minor.LQuote: {
> -					Console.Write("");
> +					Console.Write("‘");
>  					break;
>  				}
>  
>  				case Minor.RQuote: {
> -					Console.Write("");
> +					Console.Write("’");
>  					break;
>  				}
>  
>  				case Minor.LDblQuote: {
> -					Console.Write("");
> +					Console.Write("“");
>  					break;
>  				}
>  
>  				case Minor.RDblQuote: {
> -					Console.Write("");
> +					Console.Write("”");
>  					break;
>  				}
>  
> Index: mcs/class/Managed.Windows.Forms/Makefile
> ===================================================================
> --- mcs/class/Managed.Windows.Forms/Makefile	(revision 63811)
> +++ mcs/class/Managed.Windows.Forms/Makefile	(working copy)
> @@ -3,9 +3,6 @@
>  
>  LIBRARY = System.Windows.Forms.dll
>  
> -# UTF-8
> -CODEPAGE = 65001
> -
>  LIB_MCS_FLAGS = /unsafe \
>  	/r:$(corlib) /r:System.dll /r:System.Xml.dll \
>  	/r:System.Drawing.dll /r:Accessibility.dll \
> Index: mcs/class/Mono.C5/Test/BasesTest.cs
> ===================================================================
> --- mcs/class/Mono.C5/Test/BasesTest.cs	(revision 63811)
> +++ mcs/class/Mono.C5/Test/BasesTest.cs	(working copy)
> @@ -266,8 +266,8 @@
>        public void CharequalityComparerViaBuilder()
>        {
>          SCG.IEqualityComparer<char> h = EqualityComparer<char>.Default;
> -        char s = '';
> -        char t = '';
> +        char s = 'å';
> +        char t = 'å';
>          char u = 'r';
>  
>          Assert.AreEqual(s.GetHashCode(), h.GetHashCode(s));
> Index: mcs/class/Mono.C5/UserGuideExamples/AnagramStrings.cs
> ===================================================================
> --- mcs/class/Mono.C5/UserGuideExamples/AnagramStrings.cs	(revision 63811)
> +++ mcs/class/Mono.C5/UserGuideExamples/AnagramStrings.cs	(working copy)
> @@ -69,7 +69,7 @@
>  
>      public static SCG.IEnumerable<String> ReadFileWords(String filename)
>      {
> -      Regex delim = new Regex("[^a-zA-Z0-9-]+");
> +      Regex delim = new Regex("[^a-zæøåA-ZÆØÅ0-9-]+");
>        using (TextReader rd = new StreamReader(filename, Encoding.GetEncoding("iso-8859-1")))
>        {
>          for (String line = rd.ReadLine(); line != null; line = rd.ReadLine())
> Index: mcs/class/Mono.C5/UserGuideExamples/AnagramHashBag.cs
> ===================================================================
> --- mcs/class/Mono.C5/UserGuideExamples/AnagramHashBag.cs	(revision 63811)
> +++ mcs/class/Mono.C5/UserGuideExamples/AnagramHashBag.cs	(working copy)
> @@ -64,7 +64,7 @@
>  
>      public static SCG.IEnumerable<String> ReadFileWords(String filename, int n)
>      {
> -      Regex delim = new Regex("[^a-zA-Z0-9-]+");
> +      Regex delim = new Regex("[^a-zæøåA-ZÆØÅ0-9-]+");
>        Encoding enc = Encoding.GetEncoding("iso-8859-1");
>        using (TextReader rd = new StreamReader(filename, enc))
>        {
> Index: mcs/class/Mono.C5/UserGuideExamples/Anagrams.cs
> ===================================================================
> --- mcs/class/Mono.C5/UserGuideExamples/Anagrams.cs	(revision 63811)
> +++ mcs/class/Mono.C5/UserGuideExamples/Anagrams.cs	(working copy)
> @@ -64,7 +64,7 @@
>  
>      public static SCG.IEnumerable<String> ReadFileWords(String filename, int n)
>      {
> -      Regex delim = new Regex("[^a-zA-Z0-9-]+");
> +      Regex delim = new Regex("[^a-zæøåA-ZÆØÅ0-9-]+");
>        Encoding enc = Encoding.GetEncoding("iso-8859-1");
>        using (TextReader rd = new StreamReader(filename, enc))
>        {
> Index: mcs/class/Mono.C5/UserGuideExamples/AnagramTreeBag.cs
> ===================================================================
> --- mcs/class/Mono.C5/UserGuideExamples/AnagramTreeBag.cs	(revision 63811)
> +++ mcs/class/Mono.C5/UserGuideExamples/AnagramTreeBag.cs	(working copy)
> @@ -64,7 +64,7 @@
>  
>      public static SCG.IEnumerable<String> ReadFileWords(String filename, int n)
>      {
> -      Regex delim = new Regex("[^a-zA-Z0-9-]+");
> +      Regex delim = new Regex("[^a-zæøåA-ZÆØÅ0-9-]+");
>        Encoding enc = Encoding.GetEncoding("iso-8859-1");
>        using (TextReader rd = new StreamReader(filename, enc))
>        {
> Index: mcs/class/Microsoft.VisualBasic/Test/standalone/6612.vb
> ===================================================================
> --- mcs/class/Microsoft.VisualBasic/Test/standalone/6612.vb	(revision 63811)
> +++ mcs/class/Microsoft.VisualBasic/Test/standalone/6612.vb	(working copy)
> @@ -23,7 +23,7 @@
>  Public Class TestClass
>      Public Function Test() As String
>          'BeginCode    
> -        Dim c As Char = ""
> +        Dim c As Char = "æ"
>          Return asc(c)
>          'EndCode
>      End Function
> Index: mcs/class/Microsoft.VisualBasic/Makefile
> ===================================================================
> --- mcs/class/Microsoft.VisualBasic/Makefile	(revision 63811)
> +++ mcs/class/Microsoft.VisualBasic/Makefile	(working copy)
> @@ -7,9 +7,6 @@
>  LIBRARY = Microsoft.VisualBasic.dll
>  LIBRARY_NEEDS_POSTPROCESSING = yes
>  
> -# UTF-8
> -CODEPAGE = 65001
> -
>  LIB_MCS_FLAGS = /r:$(corlib) /r:System.dll /r:System.Windows.Forms.dll @Microsoft.VisualBasic.dll.resources
>  TEST_MCS_FLAGS = -nowarn:0618 -nowarn:219 -nowarn:169
>  
-- 
Miguel de Icaza <miguel at ximian.com>
