[Mono-list] mcs compiles on linux. Now what?
Paolo Molaro
lupus@ximian.com
Fri, 8 Mar 2002 16:06:07 +0100
On 03/08/02 Dan Lewis wrote:
> Is there any way to find out how much of this is spent in the lexer? MCS uses a
> custom lexer, and in particular uses a hashtable lookup to recognize keywords.
>
> String.GetHashCode() is computed in C# at the moment. It should definitely have
> an icall (btw I'm not saying that icalls are the way to make things faster --
> but it's such a fundamental operation). Also it is not cached, although strings
> are supposed to be immutable, right? Perhaps change it to:
>
> public override int GetHashCode () {
>         if (!is_hashed) {
>                 // compute hash_code
>                 is_hashed = true;
>         }
>
>         return hash_code;
> }
>
> This may/may not make any difference. As ever, profiling's your best weapon :)
String.GetHashCode() accounts for 1.3% of the total time spent compiling,
so it's not an obvious candidate for optimization :-)
Here is some relevant data:
Method name                                               Total (ms)    Calls
Mono.CSharp.Driver::ProcessFile(1)                            214055       28
Mono.CSharp.Driver::parse(1)                                  214051       28
Mono.CSharp.CSharpParser::parse(0)                            214008       28
Mono.CSharp.CSharpParser::yyparse(1)                          214007       28
Mono.CSharp.Tokenizer::token(0)                               163886   161657
Mono.CSharp.Tokenizer::xtoken(0)                              163273   161657
Mono.CSharp.Tokenizer::peekChar(0)                             25279   888884
Mono.CSharp.Tokenizer::is_number(1)                            24166    19362
Mono.CSharp.Tokenizer::getChar(0)                              17076   888825
Mono.CSharp.Tokenizer::decimal_digits(1)                       13687    19335
Mono.CSharp.Tokenizer::is_punct(2)                              7934   290123
Mono.CSharp.Tokenizer::advance(0)                               4676   161685
Mono.CSharp.Tokenizer::is_keyword(1)                            4247    56199
Mono.CSharp.Tokenizer::handle_preprocessing_directive(0)        2216      410
Mono.CSharp.Tokenizer::get_cmd_arg(2)                           2081      410
Mono.CSharp.Tokenizer::is_identifier_part_character(1)          1960   355818
Mono.CSharp.Tokenizer::escape(1)                                1544    49420
Mono.CSharp.Tokenizer::adjust_int(1)                            1430    19327
Mono.CSharp.Tokenizer::GetKeyword(1)                             980    16811
System.Text.StringBuilder::Append(1)                           77733   453475
System.Char::IsLetter(1)                                        1400   734049
System.Char::IsDigit(1)                                          731   444015
So it looks like StringBuilder::Append() takes a huge chunk (77733 ms,
against 214055 ms for ProcessFile, or roughly 36%), followed by the IO
functions and many small functions that add up. I'd need to add call
graph info to get more precise data, but this should give an idea.
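
To illustrate where the per-character Append() cost comes from, here is a
minimal sketch, not the actual MCS tokenizer code. IdentifierScanner and
both of its methods are hypothetical names, and the sketch assumes the
source text is already in memory as a string, which may not match how the
tokenizer reads its input through peekChar()/getChar().

```csharp
using System;
using System.Text;

// Hypothetical stand-in for an identifier scanner; not taken from MCS.
public static class IdentifierScanner
{
    // Baseline: one StringBuilder.Append call per character consumed.
    public static string ScanWithBuilder(string source, ref int pos)
    {
        StringBuilder sb = new StringBuilder();
        while (pos < source.Length && IsIdentifierPart(source[pos]))
            sb.Append(source[pos++]);
        return sb.ToString();
    }

    // Alternative: locate the end of the identifier first, then copy once.
    public static string ScanWithSlice(string source, ref int pos)
    {
        int start = pos;
        while (pos < source.Length && IsIdentifierPart(source[pos]))
            pos++;
        return source.Substring(start, pos - start);
    }

    // Simplified identifier test (assumption: letters, digits, underscore).
    static bool IsIdentifierPart(char c)
    {
        return Char.IsLetterOrDigit(c) || c == '_';
    }
}
```

Both methods return the same string and leave pos just past the
identifier; the second one replaces N Append() calls with a single
Substring() call. Whether that would help in mcs is something only the
profiler can tell.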
> In general custom lexers are slower than machine generated ones. I did some
> work a long time ago on porting a fast lexer generator to C# -- I could dig it
> up if there's need for it.
This is Miguel's call.
lupus
--
-----------------------------------------------------------------
lupus@debian.org debian/rules
lupus@ximian.com Monkeys do it better