JavaScript developers can test regular expressions against every Unicode character in the Basic Multilingual Plane with my JavaScript RegExp Unicode Character Class Tester.
It is very useful if you need to know exactly which characters match a RegExp like /\s/.
Interesting results from Firefox 2:
- \d matches 159 digit characters from many alphabets (not just Basic Latin)
- \s matches 22 different types of whitespace: [\t\n\v\f\r \u00a0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u200b\u2028\u2029\u3000]
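For readers who want to reproduce this kind of result in their own browser or in Node, here is a minimal sketch of the idea. It is not the tester's actual code; the helper names `findMatchingCodeUnits` and `toEscape` are made up for illustration. It tests a regular expression against each single-character string from 0 to 65535.

```javascript
// Hypothetical helper (not the tester's actual code): returns the code units
// from 0 to 65535 whose single-character string matches the given RegExp.
function findMatchingCodeUnits(re) {
  const matches = [];
  for (let code = 0; code <= 0xffff; code++) {
    re.lastIndex = 0; // keep /g or /y regexes from carrying state between tests
    if (re.test(String.fromCharCode(code))) {
      matches.push(code);
    }
  }
  return matches;
}

// Format a code unit as a \uXXXX escape for display.
function toEscape(code) {
  return "\\u" + code.toString(16).padStart(4, "0");
}

const whitespace = findMatchingCodeUnits(/\s/);
console.log(whitespace.length, whitespace.map(toEscape).join(" "));
console.log(findMatchingCodeUnits(/\d/).length);
```

Counts depend on the engine and its version, so the Firefox 2 figures above will not necessarily be reproduced elsewhere. In particular, the ECMAScript specification defines \d as the ten ASCII digits, so a conforming engine should report 10 rather than 159.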
Useful docs:
- The RegExp documentation on the Mozilla Developer Center
- The Unicode Basic Multilingual Plane (chars 0 to 65535)
Hope you find this useful and feel free to leave comments.
Will, this is cool. Thanks.
Cheers
Hi again, Will. I recently found myself wanting to know what certain zero-width assertions like /^/m matched exactly, so I ended up creating a somewhat similar test page, which also allows you to show test results side-by-side.
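As a rough illustration of what "matched exactly" means for a zero-width assertion, the sketch below lists the string positions where /^/m succeeds. It is not taken from either test page; the function name `zeroWidthPositions` is hypothetical, and it assumes an engine that supports `String.prototype.matchAll` (ES2020).

```javascript
// Hypothetical helper: returns every index in `text` at which the zero-width
// pattern `re` matches. matchAll requires the global flag, so it is added
// if the caller left it off.
function zeroWidthPositions(re, text) {
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const globalRe = new RegExp(re.source, flags);
  return [...text.matchAll(globalRe)].map((m) => m.index);
}

// "ab\ncd": the start of the string and the position after the newline.
console.log(zeroWidthPositions(/^/m, "ab\ncd"));

// In "a\r\nb", \r and \n are both line terminators, so /^/m also matches
// between them.
console.log(zeroWidthPositions(/^/m, "a\r\nb"));
```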
Nice tool.
One slight error – the test \uffff fails to find the character \uffff. I’ve not looked at the code to see where the problem is, but I’m guessing that you have a for loop that is something like
for ( i = 0 ; i < 65535 ; i++ ) {
…
}
when the second condition should read i < 65536 or i <= 65535.
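To make the off-by-one concrete, here is a small sketch. The tool's real loop is not shown in this thread, so the helper `scanFinds` is only a guess at its shape. It compares a scan that stops at 65535 with one that stops at 65536.

```javascript
// Hypothetical helper: scans code units 0 <= i < end and reports whether
// any single-character string matches `re`.
function scanFinds(re, end) {
  for (let i = 0; i < end; i++) {
    if (re.test(String.fromCharCode(i))) {
      return true;
    }
  }
  return false;
}

const target = /\uffff/;
const buggy = scanFinds(target, 65535); // last value tested is 65534 (U+FFFE)
const fixed = scanFinds(target, 65536); // last value tested is 65535 (U+FFFF)
console.log(buggy, fixed);
```

With the exclusive bound at 65535 the loop never builds the character \uffff, which is why the test cannot find it.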
@Marcus – thanks for spotting the bug and suggesting the fix. Should work now.
Thanks for this little gem. Simple and useful, just like good software should be.
Maybe it’s worth linking the Unicode Block Range RegExp Generator (http://kourge.net/projects/regexp-unicode-block), as it is somewhat related to this tool.
Cheers!
Just thanks. I was having some weird problems trying to regexp some Unicode text. Your site showed me which characters were causing trouble, and now everything works like a charm.