[Mono-list] Character coding auto-detection in plain-text files

Pedro Castro mail at pedrocastro.org
Sat Mar 10 07:35:16 EST 2007


Hi,

First, a question: is there currently a way to auto-detect the encoding
of text files / strings?

As far as I can tell there isn't, so I would like to ask whether anyone
is interested in moving forward with this. Mozilla has a great detector,
written in C++, which has been ported to other languages such as Java
(http://jchardet.sourceforge.net/) and Python
(http://chardet.feedparser.org/). A C# port exists but is very outdated
(http://www.conceptdevelopment.net/Localization/NCharDet/).

Such a library would help many applications, especially those that
handle files in different encodings, but really any application that
reads plain-text files.
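
To make the request concrete, here is a rough sketch in Python of the
simplest possible form of detection. It is not Mozilla's algorithm, which
uses statistical analysis of byte sequences; it only checks for a Unicode
byte order mark and then tries a strict UTF-8 decode. The function name
guess_encoding and the latin-1 fallback are assumptions made for the
example, not part of any existing library.

```python
import codecs

# Byte order marks, with the UTF-32 marks listed before UTF-16 because
# the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Return a codec name for data (hypothetical helper, heuristic only)."""
    # 1. A byte order mark, when present, is the strongest hint available.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Bytes that decode cleanly as strict UTF-8 are reported as UTF-8
    #    (pure ASCII input also lands here).
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 3. Otherwise give up and return the caller-supplied fallback.
        return fallback
```

A caller would read the file in binary mode, pass the bytes to
guess_encoding, and decode with the returned codec name. Anything beyond
BOMs and UTF-8 (legacy 8-bit or East Asian encodings, for example) needs
the statistical approach of the detectors linked above.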

-- 
Pedro Castro
http://www.pedrocastro.org

