Javascript RegExp Unicode Character Class Tester

JavaScript developers can test regular expressions against all Unicode characters with my Javascript RegExp Unicode Character Class Tester.

It is very useful if you need to know exactly which characters match a RegExp like /\s/.

Interesting results from Firefox 2:

  • \d matches 159 digit characters from many alphabets (not just Basic Latin)
  • \s matches 22 different types of whitespace: [\t\n\v\f\r \u00a0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000]
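
Counts like these can be reproduced with a short loop. What follows is only a minimal sketch, not the tester's actual source: the helper names matchingCodeUnits and toEscape are made up for illustration, and it assumes the regex is passed without the g flag (so that test() keeps no state between calls). It checks each of the 65,536 UTF-16 code units from U+0000 to U+FFFF one at a time. The numbers depend on the engine: the current ECMAScript specification defines \d as the ASCII digits 0-9 only, so a modern browser will not report the same totals as Firefox 2.

```javascript
// Hypothetical helper: collect every UTF-16 code unit (U+0000 to U+FFFF)
// that, as a one-character string, is matched by the given regex.
// Assumes the regex has no "g" flag, so test() carries no lastIndex state.
function matchingCodeUnits(regex) {
  const matches = [];
  // Inclusive upper bound, so that U+FFFF itself is tested.
  for (let code = 0; code <= 0xffff; code++) {
    if (regex.test(String.fromCharCode(code))) {
      matches.push(code);
    }
  }
  return matches;
}

// Hypothetical helper: format a code unit as a \uXXXX escape for display.
function toEscape(code) {
  return "\\u" + code.toString(16).padStart(4, "0");
}

const whitespace = matchingCodeUnits(/\s/);
console.log(whitespace.length + " whitespace characters:");
console.log(whitespace.map(toEscape).join(" "));

const digits = matchingCodeUnits(/\d/);
console.log(digits.length + " digit characters");
```

Running the same snippet in different browsers is a quick way to see where their character classes disagree.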

Hope you find this useful and feel free to leave comments.

6 Responses to “Javascript RegExp Unicode Character Class Tester”

  1. Steve Says:

    Will, this is cool. Thanks.

  2. Will Moffat Says:

    Cheers ;-)

  3. Steve Says:

    Hi again, Will. I recently found myself wanting to know what certain zero-width assertions like /^/m matched exactly, so I ended up creating a somewhat similar test page, which also allows you to show test results side-by-side.

  4. Marcus Says:

    Nice tool.

    One slight error – the test \uffff fails to find the character \uffff. I’ve not looked at the code to see where the problem is, but I’m guessing that you have a for loop that is something like

    for ( i = 0 ; i < 65535 ; i++ ) {

    }

    when the second condition should read i < 65536 or i <= 65535.

  5. Will Moffat Says:

    @Marcus – thanks for spotting the bug and suggesting the fix. Should work now.

  6. Tomalak Says:

    Thanks for this little gem. Simple and useful, just like good software should be. :-)

    Maybe it’s worth linking the Unicode Block Range RegExp Generator (http://kourge.net/projects/regexp-unicode-block), as it is somewhat related to this tool.

    Cheers!
