Recommending – DEELX “regex” library

DEELX is a simple (Perl compatible) regular expression engine coded in pure C++. It is a research project of RegExLab.

After a long search for a nice, clean, simple, Unicode-compatible C++ regex library, I finally stumbled upon the DEELX library.

There are certainly many very good regex libraries available, but most of them are either too bloated (and/or complicated) or Unicode-unaware.
For example, PCRE++ is a nice library with a clean interface, but it lacks Unicode support. The pcrecpp library (the “official” C++ wrapper for PCRE, provided by Google) is too heavyweight for what I needed, which was a simple “find a match and that’s it” job. The same goes for Regex++, now part of Boost as Boost.Regex […and many more libraries, but you get the picture].

Don’t get me wrong: the libraries mentioned above are extremely useful and well designed, but they’re mainly meant for “serious work”.

The main advantage of the DEELX library (at least from my point of view) is that it is provided as a template, so supporting Unicode via wchar_t is as simple as changing the template argument.
And because the code is a template that gets compiled together with your own, a good compiler can optimize it heavily for the way you actually use the library.

Consider the following code:

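// Text to search; the loop below needs it as a named variable.
const char * txt = "such as 1, 234, 12.5, .78 etc";
CRegexpT <char> regexp("\\d+\\.?\\d*|\\.\\d+");
MatchResult result = regexp.Match(txt);
while(result.IsMatched())
{
    // Print only the matched slice: precision = match length, pointer = match start.
    printf("%.*s\n", result.GetEnd() - result.GetStart(), txt + result.GetStart());
    // Continue searching from the end of the previous match.
    result = regexp.Match(txt, result.GetEnd());
}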

while the Unicode version is:

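// Text to search; the loop below needs it as a named variable.
const wchar_t * txt = L"such as 1, 234, 12.5, .78 etc";
CRegexpT <wchar_t> regexp(L"\\d+\\.?\\d*|\\.\\d+");
MatchResult result = regexp.Match(txt);
while(result.IsMatched())
{
    // %S is the Microsoft spelling for a wide string in printf(); standard C uses %ls.
    printf("%.*S\n", result.GetEnd() - result.GetStart(), txt + result.GetStart());
    // Continue searching from the end of the previous match.
    result = regexp.Match(txt, result.GetEnd());
}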

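All you have to do is change char to wchar_t, add L in front of the string literals, and change s to S in printf() (or change printf() to wprintf()). Note that both of these printf() variants rely on the behavior of Microsoft’s runtime; in standard C the conversion for a wide string is ls in either function.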

And that’s it!
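For comparison, here is a minimal sketch of the same “pick the character type with a template argument” idea, written against the standard library’s std::basic_regex rather than DEELX, so that it compiles without any extra headers. The helper name find_all is my own invention for illustration and is not part of DEELX; the sketch assumes a C++11 compiler.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not a DEELX function): returns every match of
// `pattern` found in `text`. CharT is deduced from the arguments, so the
// same code serves both char and wchar_t.
template <typename CharT>
std::vector<std::basic_string<CharT>> find_all(
    const std::basic_string<CharT>& pattern,
    const std::basic_string<CharT>& text)
{
    using String = std::basic_string<CharT>;
    using Iter = std::regex_iterator<typename String::const_iterator>;

    std::basic_regex<CharT> re(pattern);  // ECMAScript syntax by default
    std::vector<String> out;

    // A default-constructed regex_iterator acts as the end-of-sequence marker.
    for (Iter it(text.begin(), text.end(), re), end; it != end; ++it)
        out.push_back(it->str());

    return out;
}
```

Calling find_all(std::string("\\d+\\.?\\d*|\\.\\d+"), std::string("such as 1, 234, 12.5, .78 etc")) yields the four numbers 1, 234, 12.5 and .78, and the std::wstring call with L"..." literals yields the same four as wide strings. Only the literals change; the function body stays untouched.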

The library’s interface is pretty straightforward; have a look for yourself.

Happy coding! 🙂

2 responses to “Recommending – DEELX “regex” library”

  1. I agree 100%!
    I spent hours looking for a simple regex library for validating “not so simple” user inputs.
    The “good” libraries were huge, sometimes (Boost) came as part of even bigger packages, were complicated to compile, and so on.
    DEELX did the job perfectly within 4 minutes of my dropping the single file into my project. Good job, DEELX!

  2. setlocale(LC_ALL, ""); // follow current system locale
    wchar_t ptn[] = L"('?[0-9]{2,4})([ ]{0,3})([년/,.-]?)([ ]{0,3})(1[0-2]|0?[1-9])([ ]{0,3})([월/,.-]?)([ ]{0,3})([1-2][0-9]|3[0-1]|0?[1-9])([ ]{0,3})([일]?)";
    wchar_t txt[] = L"2013년12월9일00-4";

    Result: 2013년12
    Expected: 2013년12월9일

    Do you know what I did wrong?

    The regex tools below give me the expected result.
    http://gskinner.com/RegExr/
    – Regex Match Tracer (RegExLab.com deelx)
