[Mono-dev] mcs default encoding: Latin1 or not

Rafael Teixeira monoman at gmail.com
Fri Aug 26 09:30:29 EDT 2005


Just to comment a bit.

We have at least two decisions to make coming from this discussion:

- What default encoding should mcs use? 

I prefer to use the current local encoding (Encoding.Default), so that
it works for files edited with the commonly used editors on each
platform (gedit and, I believe, MonoDevelop follow that, for instance)
and read/saved without specifying another encoding. On Windows it would
also work, as notepad/write/vs.net follow the current codepage.
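
To make "current local encoding" concrete, here is a minimal Python
sketch. It is only a rough analogue of what Encoding.Default gives in
.NET, not mcs code; the helper name platform_default_encoding is made up
for illustration, and the result depends on the machine it runs on.

```python
import codecs
import locale

def platform_default_encoding():
    """Return the normalized name of the locale's preferred text encoding.

    Rough analogue of .NET's Encoding.Default (an assumption made for
    illustration); the value varies from one machine to another.
    """
    name = locale.getpreferredencoding(False)
    # codecs.lookup() normalizes aliases such as "UTF-8" or "cp1252".
    return codecs.lookup(name).name

print(platform_default_encoding())
```

A file saved by an editor that follows this setting can be read back
with the same setting, without anyone naming an encoding explicitly.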

- What standard encoding should all of the source files in the mono
repository use, to keep things workable for hackers from all cultures?

Here we should stick to utf-8 (and for those that like to use vim
inside cygwin, I think we should ask the vim hackers to make it support
utf-8 on that platform).  Remember that while most code in mono uses
only ASCII identifiers (as it follows the MS API, or is new code
sticking to the coding guidelines), author names for sure, and even some
other commentary text, may contain non-ASCII characters.
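
A small Python sketch of why the non-ASCII cases are the ones that
matter: pure ASCII text has the same bytes in utf-8 and Latin1, while an
accented author name does not. The sample comment lines and both helper
names are invented for illustration and are not taken from the mono
tree.

```python
# Illustration only: both sample lines are made up.
ascii_line = "// Author: Rafael Teixeira"
accented_line = "// Author: Jos\u00e9"  # hypothetical name ending in e-acute

def encodes_identically(text):
    """True when the UTF-8 and Latin1 byte sequences for text are the same."""
    return text.encode("utf-8") == text.encode("latin-1")

def read_utf8_as_latin1(text):
    """Simulate a Latin1-only tool opening a file that was saved as UTF-8."""
    return text.encode("utf-8").decode("latin-1")

print(encodes_identically(ascii_line))     # pure ASCII: same bytes either way
print(encodes_identically(accented_line))  # non-ASCII: the bytes differ
print(read_utf8_as_latin1(accented_line))  # the accented letter turns into two characters
```

So files that stick to ASCII are safe under either choice; the encoding
decision only bites on names and comments like the second line.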

The derived question is: should the files have the BOM marker? I would
say that to ease mixed mono/.net (mcs/csc) work they should, but
regrettably, while the VS.NET editor does a good job of
detecting/hiding/preserving the BOM marker, AFAIK most linux editors
don't, showing it as the special zero-width no-break space character
(U+FEFF) it is. I do think we can make MonoDevelop mimic VS.NET in that
regard, but many mono hackers use other editors, or a multitude of them
(besides MD, I use gedit extensively when doing quick fixes from the
console). So I don't have a firm opinion on this sub-issue; input is
welcome.
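
For reference, the kind of BOM handling under discussion can be sketched
in a few lines of Python. This is not what mcs or csc actually do; the
function name decode_source is hypothetical, and "latin-1" is only a
placeholder for whatever the platform default encoding would be.

```python
import codecs

def decode_source(data, default_encoding="latin-1"):
    """Decode source bytes: honor a UTF-8 BOM if present, else use the default.

    default_encoding stands in for the platform's current code page;
    "latin-1" is a placeholder chosen for this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        # Drop the three signature bytes (EF BB BF) before decoding.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(default_encoding)

source_text = "class Caf\u00e9 {}"  # made-up source line with one non-ASCII letter
with_bom = codecs.BOM_UTF8 + source_text.encode("utf-8")
without_bom = source_text.encode("utf-8")

print(decode_source(with_bom))              # read as UTF-8, BOM removed
print(decode_source(without_bom))           # misread under the placeholder default
print(decode_source(without_bom, "utf-8"))  # correct once the encoding is given explicitly
```

With the BOM the file identifies itself; without it, the reader has to
be told the encoding or fall back to a default that may be wrong.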

Regards, 

On 8/26/05, Atsushi Eno <atsushi at ximian.com> wrote:
> Hi,
> 
> > If you don't like ISO 28591 because it's foreign, why do you want to use
> > ASCII in source files?:)
> 
> Well, ASCII is not foreign for Japanese. All of iso-2022-jp /
> shift_jis / euc-jp don't contradict ASCII and it is actually
> part of those encodings.
> 
> I know there used to be non-ASCII based encodings such as Indian
> ISCII-7, Arabic ASMO 449, Bangladesh BDS 1520:1995 etc. but I
> don't know any modern encoding that contradicts ASCII (I don't
> think it is possible to publish world-ready applications with
> those encodings).
> 
> So AFAIK ASCII is safe, the GCM for us. Latin1 is not the case.
> 
> > I personally hate the fact of having code pages but this has historical
> > reasons. I think UTF-8 is a good solution as it is international,
> > culture-neutral and ASCII compatible.
> >
> > I think we are living in the age of Unicode. So there is no reason to use
> > ASCII. It's OK to use only ASCII in identifiers and use English in comments
> > and texts but I think we should take advantage of Unicode. We can
> > use it for names for example.
> 
> Can we edit UTF8 files on vim on cygwin? No. This fact simply tells
> that we are not living in the age of Unicode.
> 
> I heard a story - there was a Japanese or Chinese person who used
> Chinese characters in his (or her) blog, which is aggregated somewhere
> (I don't remember the details), and that person got blamed for using
> Chinese, even though it is written in utf-8 encoding.
> 
> > I think mcs should use Encoding.Default as default encoding as I think this
> > is nearest to the user's need and provides compatibility with csc.exe.
> 
> > But we should use UTF-8 without signature (BOM) for our .cs source code
> > files and explicitly specify for mcs to use UTF-8.
> 
> Why? I think we *should* use BOM since, as we discussed before,
> neither mcs nor csc autodetects encoding correctly.
> 
> Here I guess that you think BOM-less UTF8 sources could be edited
> in Latin1 editors. What happens if I put CJK ideographs? Actually
> we all (really all) Japanese hackers said that we feel reluctant
> to edit those files that contain Latin1 letters, because our
> usual editors do not support Latin1 (even as a candidate
> encoding to save the file in).
> 
> Atsushi Eno
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
> 


-- 
Rafael "Monoman" Teixeira
---------------------------------------
I'm trying to become a "Rosh Gadol" before my own eyes. 
See http://www.joelonsoftware.com/items/2004/12/06.html for enlightenment.
It hurts!


