[MonoDevelop] Source files are UTF-8... are we sure?
Steve Deobald
steve@citygroup.ca
Thu, 8 Apr 2004 00:36:29 -0600 (CST)
Hey guys (mostly Todd),
I wrote this tonight (er, yesterday?), with no luck:
// src/Addins/DisplayBindings/SourceEditor/SourceEditorBuffer.cs:
public static SourceEditorBuffer CreateTextBufferFromFile (string filename)
{
    // Compare the first bytes of the file against the UTF-8 preamble (BOM).
    byte[] preamble = Encoding.UTF8.GetPreamble();
    // "using" closes the stream before the buffer reopens the file.
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        for (int j = 0; j < preamble.Length; j++)
        {
            if (preamble[j] != fs.ReadByte())
            {
                System.Console.WriteLine("CreateTextBufferFromFile(): file is not UTF-8. Skipping.");
                return (null);
            }
        }
    }
    System.Console.WriteLine("CreateTextBufferFromFile(): file is UTF-8. Loading into sourcebuffer...");
    SourceEditorBuffer buff = new SourceEditorBuffer ();
    buff.LoadFile (filename);
    return buff;
}
// end
So I weaseled my XP box back from a very cute girl who was over here using
it so I could write a test case of .NET running this code properly. I
wrote `Class.cs' that you can find here:
http://nofeet.com/_garbage/enc_bug/
...and tested it against `blah.txt' and `blah.exe' found in that same
directory.
Just before submitting the System.Text bugreport, however, I tried running
the same test case using those 2 Windows files on this FC1/mono box. Lo
and behold, it recognizes the .txt as UTF-8 (which was set in Notepad)
just fine.
From playing around, I know the MD code above recognizes all text files
(in a 'hello world' GTK# app, or the MD source tree) as Encoding.ASCII -
but it also recognizes binary files this way, unfortunately.
Does anyone have any suggestions? Are the files in the MD tree definitely
UTF-8? (In which case I'll have to file this bug.) Or is it possible that
they are encoded differently?
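For reference, one possibility worth ruling out (nothing in this post confirms it) is that the files are valid UTF-8 but simply carry no preamble. Notepad writes the 3-byte BOM when saving as UTF-8, while many other editors do not, and a preamble-only check rejects those files. Below is a minimal sketch of a check that does not depend on the BOM. Utf8Check, LooksLikeUtf8 and HasUtf8Preamble are hypothetical names, not part of MonoDevelop.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of MonoDevelop or System.Text.
public static class Utf8Check
{
    // True if the bytes decode cleanly as UTF-8, with or without a BOM.
    public static bool LooksLikeUtf8 (byte[] data)
    {
        // Second argument (throwOnInvalidBytes) makes decoding throw
        // instead of silently substituting characters for bad bytes.
        UTF8Encoding strict = new UTF8Encoding (false, true);
        try {
            strict.GetString (data);
            return true;
        } catch (ArgumentException) {
            return false;
        }
    }

    // True only if the data starts with the UTF-8 preamble (EF BB BF).
    public static bool HasUtf8Preamble (byte[] data)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble ();
        if (data.Length < preamble.Length)
            return false;
        for (int i = 0; i < preamble.Length; i++) {
            if (data[i] != preamble[i])
                return false;
        }
        return true;
    }
}
```

Pure ASCII content passes LooksLikeUtf8, since ASCII is a subset of UTF-8. A binary file will usually fail it, but that is a heuristic and not a guarantee.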
Just thought I'd check with you guys before I posted a bug that wasn't.
Thanks!
.steve