[Monodevelop-devel] Creating SyntaxMode without regular expressions?

Fri Oct 22 02:50:54 EDT 2010

Hi

No - and this may change in the future too. It basically splits up a 
text into tokens.
I have 2 tokenizer runs: spans and chunks. Spans help to reduce the 
number of re-parses.
Spans are comments, strings etc. - they just help to set the correct 
color for a chunk. A chunk is the real token.
It's just a segment (offset/length) that has a color attached to it.
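
To make the chunk idea concrete, here is a minimal sketch in C#. It is 
not the Mono.TextEditor code: SimpleChunk, ChunkSketch and ToChunks are 
hypothetical names made up for illustration, and it assumes zero-based 
columns with an inclusive right column. It only shows how a token with a 
start/end column and a color maps onto an offset/length segment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a chunk: a segment (offset/length) with a color
// attached. This is NOT the real Mono.TextEditor type.
sealed class SimpleChunk
{
    public int Offset { get; }
    public int Length { get; }
    public string Color { get; }

    public SimpleChunk(int offset, int length, string color)
    {
        Offset = offset;
        Length = length;
        Color = color;
    }
}

static class ChunkSketch
{
    // Converts the tokens of one line into chunks.
    // Assumption: columns are zero-based and Right is inclusive.
    // lineOffset is the document offset of the first character of the line.
    public static List<SimpleChunk> ToChunks(
        int lineOffset,
        IEnumerable<(int Left, int Right, string Color)> tokens)
    {
        var chunks = new List<SimpleChunk>();
        foreach (var t in tokens)
        {
            chunks.Add(new SimpleChunk(lineOffset + t.Left,
                                       t.Right - t.Left + 1,
                                       t.Color));
        }
        return chunks;
    }
}
```

A line tokenizer that reports a left column, a right column and a color 
class per token could feed such a conversion directly.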

The span stack is the stack that contains the spans - it comes from 
outside (at the moment it's stored for each line). It's a stack because 
spans can contain each other but aren't allowed to overlap. If you want 
something with a start & end and stuff in between then a span is the 
thing you're looking for and you need to modify the span parser.
ScanSpan scans for the span beginnings and ScanSpanEnd for the endings. 
I would stick with this model.
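
The stack-per-line idea can be sketched roughly like this in C#. This is 
an illustration only, not the SpanParser API: SpanStackSketch and 
ScanLine are hypothetical names, and the scanner only knows about nested 
(* ... *) comments (it ignores strings and special cases such as the (*) 
operator). The point is that each line starts from the stack left behind 
by the previous line, so a change only forces a rescan of the lines whose 
incoming stack changes.

```csharp
using System;
using System.Collections.Generic;

static class SpanStackSketch
{
    // Scans one line and returns the span stack at the end of that line.
    // 'incoming' is the stack stored for the end of the previous line.
    public static Stack<string> ScanLine(string line, Stack<string> incoming)
    {
        // Copy the incoming stack so the stored one stays untouched.
        // ToArray() yields the top element first; reversing it and pushing
        // in that order rebuilds the same stack.
        var items = incoming.ToArray();
        Array.Reverse(items);
        var stack = new Stack<string>(items);

        int i = 0;
        while (i + 1 < line.Length)
        {
            if (line[i] == '(' && line[i + 1] == '*')
            {
                stack.Push("comment");   // span beginning
                i += 2;
            }
            else if (stack.Count > 0 && line[i] == '*' && line[i + 1] == ')')
            {
                stack.Pop();             // span ending
                i += 2;
            }
            else
            {
                i++;
            }
        }
        return stack;
    }
}
```

Because spans may nest but not overlap, a begin pushes and an end pops; 
an empty stack at the start of a line means the line starts in plain 
code.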

I plan to work on the highlighting engine in the future - I've put up 
some code in a private branch, but I don't think I'll have time to do it 
in the medium term. I've only documented the syntax mode XML stuff - 
creating your own highlighting in code is one of the newer features and 
a bit messy (that's why I want to clean up that stuff).

Regards
Mike
> Hi Mike,
> thanks for the reply. I think that in my scenario, using an existing
> tokenizer would be the easiest option (because it is already there,
> efficient and tested). I was looking at CSharpSyntaxMode.cs briefly,
> but I didn't quite understand how it works. Is there any example with
> some comments or a brief overview article about this?
>
> As far as I can see, the C# example implements a custom "SpanParser"
> and overrides two virtual methods "ScanSpan" and "ScanSpanEnd". These
> methods somehow manipulate "spanStack" (where does that come from?),
> but if I understand it correctly, this provides "Span" objects which
> specify start and end using regular expressions. I'd like to generate
> tokens with locations to mark their start/end and some color
> information, so should I create my own "SpanParser"? You also
> mentioned chunk parser - what is the difference between this one and
> span parser? Sorry for so many questions - I'm new to MonoDevelop and
> I couldn't find any documentation on this part (aside from the
> description of XML format) and the existing C# syntax mode contains
> only a few comments...
>
> Below is the existing F# parsing that I'd like to use - to give you an
> idea of what I'd like to map to the MonoDevelop API.
>
> Thanks!
> Tomas
>
>
> class TokenInformation {
>    public int LeftColumn { get; } // Start location at the current line
>    public int RightColumn { get; } // End location at the current line
>    public TokenColorKind ColorClass { get; } // Color as simple 'enum'
> }
>
> class SourceTokenizer {
>    // Takes list of #define symbols and source
>    public SourceTokenizer(string[] defines, string source);
>    // Create parser for a line (passed in as a string)
>    LineTokenizer CreateLineTokenizer(string line);
> }
>
> class LineTokenizer {
>    // Read next token on the line (takes current state of parser and
>    // returns a state after parsing)
>    public Tuple<TokenInformation, State> ScanToken(State state);
> }
>
>
> On Thu, Oct 21, 2010 at 7:00 AM, Mike Krüger <mkrueger at novell.com> wrote:
>> Hi
>>
>> If you really want complex syntax highlighting you need to
>> write a highlighter in C#.
>>
>> See:
>> main/src/addins/CSharpBinding/MonoDevelop.CSharp.Highlighting/CSharpSyntaxMode.cs
>>
>> btw. you could create a custom chunk parser as well as a custom span parser.
>>
>> Regards
>> Mike
>>> Hi,
>>> I have been working on MonoDevelop language binding for F# and I have
>>> one question regarding colorization (the SyntaxMode class). Creating
>>> XML based syntax mode is quite easy, so I'm using that for now, but
>>> there are a few things that would need to be improved (e.g. F# allows
>>> you to have nested multi-line comments and I'd like to eventually
>>> implement support for #if, etc.)
>>>
>>> The F# compiler exposes a very simple tokenizer that I could use - you
>>> give it a line from the source file and it parses the line returning a
>>> sequence of tokens (with location, color information and other
>>> possibly useful things). I was wondering if I could just implement
>>> SyntaxMode by calling the F# tokenizer, but I don't see any way of
>>> creating SyntaxMode that would just return e.g. a list with starting
>>> and ending column & color for each line.
>>>
>>> I found some syntax modes that create custom SpanParser which returns
>>> stack of Span objects, but that still marks the start/end of a span
>>> using Regex. Is there a more direct way of providing colorization?
>>>
>>> Thanks!
>>> Tomas Petricek
>>> _______________________________________________
>>> Monodevelop-devel-list mailing list
>>> Monodevelop-devel-list at lists.ximian.com
>>> http://lists.ximian.com/mailman/listinfo/monodevelop-devel-list