JavaScript developers can test regular expressions against every character in the Unicode Basic Multilingual Plane with my JavaScript RegExp Unicode Character Class Tester.
It is very useful if you need to know exactly which characters match a RegExp like /\s/.
Interesting results from Firefox 2:
- \d matches 159 digit characters from many scripts (not just Basic Latin)
- \s matches 22 different types of whitespace: [\t\n\v\f\r \u00a0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000]
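Results like these can be reproduced with a short loop. What follows is only a minimal sketch of the idea, not the tester's actual code: the names matchingCodeUnits and toEscape are made up for illustration, and the counts it prints depend on the JavaScript engine, so they may differ from the Firefox 2 numbers above.

```javascript
// Hypothetical helper (not taken from the tester itself): collect every
// BMP code unit, 0x0000 through 0xFFFF inclusive, whose one-character
// string matches the given regular expression.
function matchingCodeUnits(regex) {
  var matches = [];
  for (var code = 0; code <= 0xFFFF; code++) {
    // Reset lastIndex so a regex created with the g flag does not
    // carry state over from the previous test() call.
    regex.lastIndex = 0;
    if (regex.test(String.fromCharCode(code))) {
      matches.push(code);
    }
  }
  return matches;
}

// Hypothetical helper: format a code unit as a \uXXXX escape.
function toEscape(code) {
  var hex = code.toString(16);
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return "\\u" + hex;
}

var whitespace = matchingCodeUnits(/\s/);
console.log(whitespace.length + " code units match /\\s/");
console.log(whitespace.map(toEscape).join(" "));
```

The same call works for any other pattern, for example matchingCodeUnits(/\d/) to list the digit characters an engine recognises.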
Useful docs:
- The RegExp documentation on the Mozilla Developer Center
- The Unicode Basic Multilingual Plane (chars 0 to 65535)
Hope you find this useful and feel free to leave comments.
June 7, 2007 at 5:13 am |
Will, this is cool. Thanks.
June 7, 2007 at 6:26 am |
Cheers
January 4, 2008 at 1:18 am |
Hi again, Will. I recently found myself wanting to know what certain zero-width assertions like /^/m matched exactly, so I ended up creating a somewhat similar test page, which also allows you to show test results side-by-side.
July 30, 2008 at 8:14 am |
Nice tool.
One slight error – the test \uffff fails to find the character \uffff. I’ve not looked at the code to see where the problem is, but I’m guessing that you have a for loop that is something like:
for ( i = 0 ; i < 65535 ; i++ ) {
…
}
when the second condition should read i < 65536 or i <= 65535.
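To make the off-by-one concrete, here is a small sketch (the function name lastVisited is hypothetical and is not from the tool's code) that reports the last value of i each bound actually reaches:

```javascript
// Hypothetical illustration: return the last index the loop visits
// when it runs while i < limitExclusive.
function lastVisited(limitExclusive) {
  var last = -1;
  for (var i = 0; i < limitExclusive; i++) {
    last = i;
  }
  return last;
}

console.log(lastVisited(65535)); // 65534, so \uffff (65535) is never tested
console.log(lastVisited(65536)); // 65535, so \uffff is included
```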
July 30, 2008 at 9:47 am |
@Marcus – thanks for spotting the bug and suggesting the fix. Should work now.
November 20, 2008 at 8:34 am |
Thanks for this little gem. Simple and useful, just like good software should be.
Maybe it’s worth linking the Unicode Block Range RegExp Generator (http://kourge.net/projects/regexp-unicode-block), as it is somewhat related to this tool.
Cheers!