Highlight custom Python-like language for wxStyledTextCtrl

Hi there,

I am writing a very simple code editor for an application I’m working on. It uses a custom programming language that is similar to Python, but has some unique elements to it.

I’ve used LaTeX a lot in the past, where you can define an entirely new language really freaking easily. So easily, in fact, that I can’t even believe how hard this seems to be to do in wxPython.

Here is an entire language definition in LaTeX:

\lstset{%
sensitive=false,
showtabs=false,
showspaces=false,
showstringspaces=false,
breaklines=true,
breakatwhitespace=true,
keepspaces=true,
basicstyle=\ttfamily\small,
backgroundcolor=\color{white}
}

\lstdefinelanguage{Spin}{
classoffset=0,
morekeywords={con, var, obj, pub, pri, dat},
morekeywords={chipver, clkmode, \_clkmode, clkfreq, \_clkfreq, clkset, \_xinfreq, \_stack, \_free, rcfast, rcslow, xinput, xtal1, xtal2, xtal3, pll1x, pll2x, pll4x, pll8x, pll16x},
morekeywords={cogid, cognew, coginit, cogstop, reboot},
morekeywords={locknew, lockret, lockclr, lockset, waitcnt, waitpeq, waitpne, waitvid},
morekeywords={bytefill, wordfill, longfill, bytemove, wordmove, longmove, lookup, lookupz, lookdown, lookdownz, strsize, strcomp},
morekeywords={string, constant, float, round, trunc, file},
morekeywords={dira, dirb, ina, inb, outa, outb, cnt, ctra, ctrb, frqa, frqb, phsa, phsb, vcfg, vscl, par, spr},
morekeywords={true, false, posx, negx, pi, result},
morekeywords={org, fit, res},
keywordstyle={\color{darkblue} \bf},
classoffset=1,
morekeywords={byte, word, long},
keywordstyle={\color{darkblue} \bf},
classoffset=2,
morekeywords={clkset, cogid, coginit, cogstop},
morekeywords={locknew, lockret, lockclr, lockset, waitcnt, waitpeq, waitpne, waitvid},
morekeywords={if\_always, if\_never, if\_e, if\_ne, if\_a, if\_b, if\_ae, if\_be, if\_c, if\_nc, if\_z, if\_nz, if\_c\_eq\_z, if\_c\_ne\_z, if\_c\_and\_z, if\_c\_and\_nz, if\_nc\_and\_z, if\_nc\_and\_nz, if\_c\_or\_z, if\_c\_or\_nz, if\_nc\_or\_z, if\_nc\_or\_nz, if\_z\_eq\_c, if\_z\_ne\_c, if\_z\_and\_c, if\_z\_and\_nc, if\_nz\_and\_c, if\_nz\_and\_nc, if\_z\_or\_c, if\_z\_or\_nc, if\_nz\_or\_c, if\_nz\_or\_nc},
morekeywords={call, djnz, jmp, jmpret, tjnz, tjz, ret, nr, wr, wc, wz},
morekeywords={rdbyte, rdword, rdlong, wrbyte, wrword, wrlong},
morekeywords={abs, absneg, neg, negc, negnc, negz, negnz, min, mins, max, maxs, add, addabs, adds, addx, addsx, sub, subabs, subs, subx, subsx, sumc, sumnc, sumz, sumnz, mul, muls, and, andn, or, xor, ones, enc, rcl, rcr, rev, rol, ror, shl, shr, sar, cmp, cmps, cmpx, cmpsx, cmpsub, test, testn, mov, movs, movd, movi, muxc, muxnc, muxz, muxnz, hubop, nop},
keywordstyle={\color{darkblue} \bf},
classoffset=3,
morekeywords={if, elseif, ifnot, elseifnot, else, case, other, repeat, from, to, step, until, while, next, quit, return, abort},
keywordstyle={\color{darkblue} \bf},
classoffset=4,
alsoletter={+,-,=,:,\^,|,~,?,<,>,!,@,\#},
morekeywords={+,-,--,++,\^\^,||,~,~~,?,|<,>|,!,NOT,@,@@,=,:=,+=,-=,*=,**=,*,**,/,//,/=,//=,\#>,\#>=,<\#,<\#=,~>,~>=,<<,<<=,>>,>>=,<-,<-=,->,->=,><,><=,\&,\&=,|,|=,\^,\^=,AND,AND=,OR,OR=,==,===,<>,<>=,<,<=,>,>=,=<,=<=,=>,=>=},
keywordstyle={\color{darkblue} \bf},
classoffset=0,
morecomment=[l]{'},
morecomment=[n]{\{}{\}},
commentstyle=\color{green},
numberstyle=\color{pink}
}

It seems that to do the same thing in Scintilla, I would have to rewrite the entire Lexer class from scratch, and I don’t understand why that would be necessary. I don’t want to write a brand new lexer. I just want to add custom delimiters and highlight certain words. Also, this is a case-insensitive language, and I can’t find any indication that case-insensitive languages are even supported by Scintilla.

I would love to RTFM, but for the life of me, I can’t find anywhere that actually explains how to do this, other than giant API listings and suggestions that I read the entire Scintilla C++ source code.

Is there anyone out there who can point me in the right direction? Thanks in advance.

Hi,

Have you looked at Editra? It’s included with wxPython. Perhaps it has pointers for what you need. (There is a docs directory with it.)

Maybe ed_style.py is the place to start?

HTH a little. Good luck.

Cheers,

Scott.


On Tue, Jun 17, 2014 at 11:51 PM, Brett Weir brett.m.weir@gmail.com wrote:


There are a few resources around the net that will help you understand what you need to do to get started.
Writing some new code for this is almost inevitable…

Look at Editra's code; it is a good starting point: http://editra.org/

There is also this, which is an interesting approach:
http://www.janthor.com/sketches/index.php?/archives/14-Mixing-HTML-and-TeX-in-a-StyledTextCtrl-in-wxPython.html

And while this is a bit dated (wx 2.8), it has some information to help you figure things out: http://www.yellowbrain.com/stc/

Example: Creating the WizBAIN lexer.
WizBAIN is similar in nature to Python itself, since Python is what it is written in, so I use the Python lexer as the base. But what about comment lines? In WizBAIN, as in ini files, ; starts a comment.
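To make the "Python lexer as the base" idea concrete, here is a minimal sketch, not anyone's actual code from this thread. It assumes the control is a wx.stc.StyledTextCtrl and that lexer_id is a constant such as wx.stc.STC_LEX_PYTHON; SPIN_KEYWORD_CLASSES, keyword_list_string and apply_keyword_sets are names made up for illustration. The two keyword classes are copied from the Spin listing at the top of the thread. wx itself is not imported, so the helper can be tried without a GUI.

```python
# Sketch only: load keyword lists into a built-in Scintilla lexer.
# Nothing here imports wx; `ctrl` is assumed to be a wx.stc.StyledTextCtrl.

# Two of the keyword classes from the Spin listing (classoffset 0 and 1).
SPIN_KEYWORD_CLASSES = {
    0: ["con", "var", "obj", "pub", "pri", "dat"],
    1: ["byte", "word", "long"],
}


def keyword_list_string(words):
    """Return the words lowercased, de-duplicated, sorted, space-separated."""
    return " ".join(sorted(set(w.lower() for w in words)))


def apply_keyword_sets(ctrl, lexer_id, keyword_classes):
    """Select a built-in lexer and load one keyword list per class index.

    lexer_id is assumed to be something like wx.stc.STC_LEX_PYTHON.
    """
    ctrl.SetLexer(lexer_id)
    for index, words in sorted(keyword_classes.items()):
        ctrl.SetKeyWords(index, keyword_list_string(words))
```

In a real editor you would call apply_keyword_sets(editor, wx.stc.STC_LEX_PYTHON, SPIN_KEYWORD_CLASSES) and then give the styles STC_P_WORD and STC_P_WORD2 a look with StyleSetSpec. As far as I know the Python lexer matches keywords case-sensitively, so with this approach PUB would not be highlighted even though pub is. That, and comment characters the lexer does not know about, are where the built-in lexer runs out.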
…So one choice would be to use a dummy lexer and parse two controls' styles into one
or
…Just check whether the ; is inside a string style when the ; character is added to the document, and use stc.EVT_STC_STYLENEEDED or similar with a regex to do the dirty work.
or
…Rewrite the whole lexer from scratch using a base class, similar to the way Editra does things.
or
…Simply ask Neil at Scintilla to add an option to the Python lexer to optionally color ; as a comment line as well, similar to ini/properties files. This option would perform the best of all of them, but may not be accepted into the base Scintilla code.
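The regex-plus-EVT_STC_STYLENEEDED route could look roughly like the sketch below. It is an illustration under assumptions, not Editra's or Scintilla's own code: the style numbers, KEYWORDS, TOKEN_RE, style_runs and apply_runs are all invented here, the keyword set is a small sample from the Spin listing, { } comments are treated as non-nested, strings are ignored, and the text is assumed to be ASCII so that character offsets equal Scintilla's byte positions. Keywords are compared in lowercase, which is one way to get case-insensitive highlighting.

```python
import re

# Hypothetical style numbers for this sketch; any free style indices would do.
STYLE_DEFAULT, STYLE_KEYWORD, STYLE_COMMENT = 0, 1, 2

# A small sample of the Spin keywords, stored in lowercase.
KEYWORDS = frozenset(["con", "var", "obj", "pub", "pri", "dat",
                      "if", "else", "repeat", "while", "return"])

# A token is a ' line comment, a { } block comment (non-nested here), or a word.
TOKEN_RE = re.compile(
    r"(?P<comment>'[^\n]*|\{[^}]*\}?)|(?P<word>\b[A-Za-z_][A-Za-z0-9_]*)")


def style_runs(text):
    """Return contiguous (start, length, style) runs covering all of text."""
    runs = []
    pos = 0
    for match in TOKEN_RE.finditer(text):
        if match.group("comment") is not None:
            style = STYLE_COMMENT
        elif match.group("word").lower() in KEYWORDS:
            style = STYLE_KEYWORD  # case-insensitive keyword match
        else:
            continue  # ordinary identifier: stays in the default run
        if match.start() > pos:
            runs.append((pos, match.start() - pos, STYLE_DEFAULT))
        runs.append((match.start(), match.end() - match.start(), style))
        pos = match.end()
    if pos < len(text):
        runs.append((pos, len(text) - pos, STYLE_DEFAULT))
    return runs


def apply_runs(ctrl, text):
    """Restyle the whole document (simple, not incremental).

    ctrl is assumed to be a wx.stc.StyledTextCtrl. Older wxPython releases
    expect a second mask argument to StartStyling, e.g. StartStyling(0, 0x1f).
    """
    ctrl.StartStyling(0)
    for _start, length, style in style_runs(text):
        ctrl.SetStyling(length, style)

# Wiring, assuming wxPython is installed (not executed in this sketch):
#   ctrl.SetLexer(wx.stc.STC_LEX_CONTAINER)
#   ctrl.Bind(wx.stc.EVT_STC_STYLENEEDED,
#             lambda evt: apply_runs(ctrl, ctrl.GetText()))
```

Restyling the whole buffer on every event is the simplest thing that works and is probably fine for small files; a larger editor would restyle only from the last styled position up to the position reported by the event.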

The path to completion of your custom lexer is your own. Nobody can decide what is best but you.
