JavaScript native2ascii - Convert Unicode strings to ASCII-compatible \uxxxx notation
Motivation: converting Unicode text so that it contains only ASCII characters. Why? Diacritics (accents) can
cause many problems. Some closed-source browsers process JavaScript files as ISO-8859-1 regardless of the Content-Type
header sent by the server. This makes non-ASCII characters display as beautiful question marks...
This odd behaviour can be circumvented by writing text constants in '\u'-escaped hexadecimal notation in the source file. A perfect command-line tool for this very task is native2ascii, which is part of the Java SDK, but sometimes you need this functionality on the client side. A bit of inspiration and 15 minutes yielded the following dozen lines of JavaScript.
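To make the notation concrete, here is a minimal sketch (the variable name is only for illustration). The source text is pure ASCII, so it should not matter which single-byte encoding the browser assumes for the file:

```javascript
// The literal below contains only ASCII characters,
// yet at runtime the string holds "Grüße".
var greeting = "Gr\u00fc\u00dfe";

console.log(greeting.length);                     // 5
console.log(greeting.charCodeAt(2).toString(16)); // fc (the code unit of ü)
```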
The code
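/*jslint browser: false, white: true, undef: true, nomen: true */
function native2ascii(str) {
    var out = "";
    for (var i = 0; i < str.length; i++) {
        if (str.charCodeAt(i) < 0x80) {
            // Plain ASCII: copy the character unchanged.
            out += str.charAt(i);
        } else {
            // Non-ASCII: emit \u followed by the code unit as four hex digits.
            // Values from 0x80 upwards have 2 to 4 hex digits, so pad to 4.
            var u = str.charCodeAt(i).toString(16);
            out += "\\u" + (u.length === 2 ? "00" + u : u.length === 3 ? "0" + u : u);
        }
    }
    return out;
}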
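A quick way to check the behaviour is to run the function on a few inputs. The sketch below repeats the function in condensed form so that it runs on its own; the padding with slice is a swapped-in shortcut, not the original code, and the sample strings are made up for illustration. Note that charCodeAt works on UTF-16 code units, so a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair).

```javascript
// Condensed copy of native2ascii so this snippet is self-contained.
function native2ascii(str) {
    var out = "";
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        if (code < 0x80) {
            out += str.charAt(i);
        } else {
            // Left-pad the hex value to four digits.
            out += "\\u" + ("0000" + code.toString(16)).slice(-4);
        }
    }
    return out;
}

console.log(native2ascii("plain ASCII"));     // plain ASCII
console.log(native2ascii("Gr\u00fc\u00dfe")); // Gr\u00fc\u00dfe
console.log(native2ascii("\u20ac 5"));        // \u20ac 5
console.log(native2ascii("\ud83d\ude00"));    // \ud83d\ude00
```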
Try it!
Type or paste text containing special characters into the first text box; the script prints the result into the second one.
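A demo like this could be wired up roughly as follows. This is only a sketch: the helper bindConverter and the element ids "source" and "result" are hypothetical, and it assumes that the native2ascii function from above is already loaded on the page.

```javascript
// Hypothetical helper: copy the converted value of `input` into `output`
// whenever the user edits the input. `convert` is any string-to-string function.
function bindConverter(input, output, convert) {
    function update() {
        output.value = convert(input.value);
    }
    input.oninput = update; // fires on typing and pasting in modern browsers
    update();               // show the result for any prefilled text
    return update;
}

// Browser-only part; the ids "source" and "result" are made up for this sketch.
if (typeof document !== "undefined" && typeof native2ascii === "function") {
    bindConverter(
        document.getElementById("source"),
        document.getElementById("result"),
        native2ascii
    );
}
```

Passing the converter in as a parameter keeps the wiring independent of the conversion itself, so the same helper could drive any other string transformation.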