API Reference
Technical reference for the Text Unicode Converter's underlying functionality and integration options.
Core Functions
Text to Unicode Conversion
convertTextToUnicode(text, format)
Converts text to Unicode code points in the specified format.
Parameters:
text
(string): Input text to convert
format
(string): Output format ('decimal', 'hex', 'unicode-escape', 'html-entity')
Returns: (string) Unicode representation
Example:
convertTextToUnicode('Hello', 'decimal');
// Returns: "72 101 108 108 111"
convertTextToUnicode('Hello', 'hex');
// Returns: "U+0048 U+0065 U+006C U+006C U+006F"
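The conversion described above can be sketched as follows. This is a minimal illustration, not the converter's actual source: the name textToUnicodeSketch is hypothetical, and the choice of decimal HTML entities with no separator is an assumption, since the reference does not say which entity form is emitted.

```javascript
// Hypothetical sketch; not the converter's actual implementation.
function textToUnicodeSketch(text, format) {
  // Spreading a string iterates by code point, so surrogate pairs stay together.
  const codePoints = [...text].map((ch) => ch.codePointAt(0));
  const hex = (cp) => cp.toString(16).toUpperCase();

  switch (format) {
    case 'decimal':
      return codePoints.join(' ');
    case 'hex':
      return codePoints.map((cp) => 'U+' + hex(cp).padStart(4, '0')).join(' ');
    case 'unicode-escape':
      // \uXXXX for the BMP, \u{...} for supplementary code points.
      return codePoints
        .map((cp) => (cp <= 0xffff ? '\\u' + hex(cp).padStart(4, '0') : '\\u{' + hex(cp) + '}'))
        .join('');
    case 'html-entity':
      // Assumption: decimal numeric character references.
      return codePoints.map((cp) => '&#' + cp + ';').join('');
    default:
      throw new Error('Unsupported format: ' + format);
  }
}

console.log(textToUnicodeSketch('Hello', 'decimal')); // 72 101 108 108 111
console.log(textToUnicodeSketch('Hi😀', 'hex')); // U+0048 U+0069 U+1F600
```

Iterating by code point rather than by index is the key design choice here: indexing with text[i] would split an emoji into two surrogate halves.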
getCodePoint(char)
Gets the Unicode code point for a single character.
Parameters:
char
(string): Single character
Returns: (number) Unicode code point
Example:
getCodePoint('A');
// Returns: 65
getCodePoint('😀');
// Returns: 128512
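A function like this is presumably a thin wrapper around String.prototype.codePointAt(). The sketch below (codePointOfSketch is a hypothetical name) shows why codePointAt() is needed instead of charCodeAt() for characters outside the Basic Multilingual Plane.

```javascript
// Hypothetical helper: returns the code point of the first character of a string.
function codePointOfSketch(char) {
  const cp = char.codePointAt(0);
  if (cp === undefined) {
    throw new TypeError('Expected a non-empty string');
  }
  return cp;
}

const emoji = '😀';
console.log(emoji.length);             // 2 (UTF-16 code units)
console.log(emoji.charCodeAt(0));      // 55357 (high surrogate 0xD83D only)
console.log(codePointOfSketch(emoji)); // 128512
console.log(codePointOfSketch('A'));   // 65
```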
Unicode to Text Conversion
convertUnicodeToText(unicode, format)
Converts Unicode code points to text.
Parameters:
unicode
(string): Unicode input
format
(string): Input format ('decimal', 'hex', 'unicode-escape', 'html-entity')
Returns: (string) Converted text
Example:
convertUnicodeToText('72 101 108 108 111', 'decimal');
// Returns: "Hello"
convertUnicodeToText('U+0048 U+0065 U+006C U+006C U+006F', 'hex');
// Returns: "Hello"
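The reverse direction amounts to parsing each token into a number and passing the results to String.fromCodePoint(). The following is a rough sketch under that assumption; unicodeToTextSketch is a hypothetical name, and the real converter may tokenize or report errors differently.

```javascript
// Hypothetical sketch; not the converter's actual implementation.
function unicodeToTextSketch(input, format) {
  let codePoints;
  switch (format) {
    case 'decimal':
      codePoints = input.trim().split(/\s+/).map((s) => parseInt(s, 10));
      break;
    case 'hex':
      codePoints = input.trim().split(/\s+/).map((s) => parseInt(s.replace(/^U\+/i, ''), 16));
      break;
    case 'unicode-escape':
      // Matches \u{XXXXXX} (group 1) or \uXXXX (group 2).
      codePoints = Array.from(
        input.matchAll(/\\u\{([0-9a-fA-F]{1,6})\}|\\u([0-9a-fA-F]{4})/g),
        (m) => parseInt(m[1] || m[2], 16)
      );
      break;
    case 'html-entity':
      // Matches hexadecimal (group 1) or decimal (group 2) character references.
      codePoints = Array.from(
        input.matchAll(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g),
        (m) => (m[1] !== undefined ? parseInt(m[1], 16) : parseInt(m[2], 10))
      );
      break;
    default:
      throw new Error('Unsupported format: ' + format);
  }
  // String.fromCodePoint throws a built-in RangeError for NaN or out-of-range values.
  return String.fromCodePoint(...codePoints);
}

console.log(unicodeToTextSketch('72 101 108 108 111', 'decimal')); // Hello
console.log(unicodeToTextSketch('\\u{1F600}', 'unicode-escape')); // 😀
```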
fromCodePoint(code)
Creates a character from a Unicode code point.
Parameters:
code
(number): Unicode code point
Returns: (string) Character
Example:
fromCodePoint(65);
// Returns: "A"
fromCodePoint(128512);
// Returns: "😀"
Format Specifications
Decimal Format
- Pattern: Space-separated decimal numbers
- Range: 0 to 1114111 (0x10FFFF)
- Example:
72 101 108 108 111
Hexadecimal Format
- Pattern: U+ followed by 4-6 hex digits
- Range: U+0000 to U+10FFFF
- Example:
U+0048 U+0065 U+006C U+006C U+006F
Unicode Escape Format
- Pattern: \uXXXX or \u{XXXXXX}
- Range: \u0000-\uFFFF or \u{0}-\u{10FFFF}
- Example:
\u0048\u0065\u006C\u006C\u006F
HTML Entity Format
- Pattern: &#xXXXX; (hexadecimal) or &#DDDD; (decimal)
- Range: &#x0; to &#x10FFFF; or &#0; to &#1114111;
- Example:
&#72;&#101;&#108;&#108;&#111;
Validation Functions
isValidUnicode(code)
Checks whether a number is a valid Unicode code point.
Parameters:
code
(number): Unicode code point
Returns: (boolean) True if valid
Example:
isValidUnicode(65); // true
isValidUnicode(128512); // true
isValidUnicode(1114112); // false (above 0x10FFFF)
isValidUnicode(-1); // false
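A range check of this kind could look like the sketch below. Both function names are hypothetical. The reference does not say whether surrogate code points (0xD800 to 0xDFFF) count as valid, so the second function shows the stricter "Unicode scalar value" variant as an alternative rather than as the tool's behavior.

```javascript
// Hypothetical sketch: any integer in 0..0x10FFFF is a code point.
function isValidCodePointSketch(code) {
  return Number.isInteger(code) && code >= 0 && code <= 0x10ffff;
}

// Stricter variant: also excludes the surrogate range, which cannot
// stand alone as a character in well-formed text.
function isScalarValueSketch(code) {
  return isValidCodePointSketch(code) && !(code >= 0xd800 && code <= 0xdfff);
}

console.log(isValidCodePointSketch(65));       // true
console.log(isValidCodePointSketch(0x10ffff)); // true
console.log(isValidCodePointSketch(0x110000)); // false
console.log(isValidCodePointSketch(-1));       // false
console.log(isScalarValueSketch(0xd800));      // false
```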
isValidFormat(input, format)
Checks whether the input matches the specified format.
Parameters:
input
(string): Input to validate
format
(string): Expected format
Returns: (boolean) True if valid
Example:
isValidFormat('65 66 67', 'decimal'); // true
isValidFormat('U+0041 U+0042', 'hex'); // true
isValidFormat('\\u0041\\u0042', 'unicode-escape'); // true
isValidFormat('&#65;&#66;', 'html-entity'); // true
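One plausible way to implement such a check is a regular expression per format, following the patterns listed under Format Specifications. The sketch below uses hypothetical names (FORMAT_PATTERNS_SKETCH, matchesFormatSketch) and checks syntax only; it does not verify that each value falls within 0 to 0x10FFFF.

```javascript
// Hypothetical sketch: one syntax pattern per supported format.
const FORMAT_PATTERNS_SKETCH = new Map([
  ['decimal', /^\d+(?:\s+\d+)*$/],
  ['hex', /^U\+[0-9A-Fa-f]{4,6}(?:\s+U\+[0-9A-Fa-f]{4,6})*$/],
  ['unicode-escape', /^(?:\\u[0-9A-Fa-f]{4}|\\u\{[0-9A-Fa-f]{1,6}\})+$/],
  ['html-entity', /^(?:&#[xX][0-9A-Fa-f]+;|&#\d+;)+$/],
]);

function matchesFormatSketch(input, format) {
  const pattern = FORMAT_PATTERNS_SKETCH.get(format);
  // Unknown formats never match.
  return pattern !== undefined && pattern.test(input.trim());
}

console.log(matchesFormatSketch('65 66 67', 'decimal'));       // true
console.log(matchesFormatSketch('U+0041 U+0042', 'hex'));      // true
console.log(matchesFormatSketch('&#65;&#66;', 'html-entity')); // true
console.log(matchesFormatSketch('AB', 'html-entity'));         // false
```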
Utility Functions
normalizeUnicode(text, form)
Normalizes Unicode text using the specified normalization form.
Parameters:
text
(string): Input text
form
(string): Normalization form ('NFC', 'NFD', 'NFKC', 'NFKD')
Returns: (string) Normalized text
Example:
normalizeUnicode('é', 'NFC'); // "é" (U+00E9)
normalizeUnicode('é', 'NFD'); // "é" (U+0065 U+0301)
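Because composed and decomposed forms render identically, the difference is easier to see when the code points are listed. The sketch below uses the built-in String.prototype.normalize(), which a function like normalizeUnicode presumably wraps; listCodePoints is a hypothetical helper added only for display.

```javascript
// Hypothetical display helper: lists a string's code points as U+XXXX.
const listCodePoints = (s) =>
  [...s]
    .map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'))
    .join(' ');

const composed = '\u00e9';    // é as a single code point
const decomposed = 'e\u0301'; // e followed by a combining acute accent

console.log(composed === decomposed);                     // false
console.log(listCodePoints(composed.normalize('NFD')));   // U+0065 U+0301
console.log(listCodePoints(decomposed.normalize('NFC'))); // U+00E9
// The compatibility forms also fold characters such as ligatures.
console.log('\ufb01'.normalize('NFKC'));                  // fi
```

Normalizing both sides to the same form before comparing strings avoids false mismatches between visually identical text.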
getUnicodeBlock(code)
Gets the Unicode block name for a code point.
Parameters:
code
(number): Unicode code point
Returns: (string) Block name
Example:
getUnicodeBlock(65); // "Basic Latin"
getUnicodeBlock(128512); // "Emoticons"
getUnicodeBlock(19968); // "CJK Unified Ideographs"
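JavaScript has no built-in block lookup, so a function like this needs a table of block ranges. The sketch below is an assumption about how that might work: BLOCKS_SKETCH holds only four blocks for illustration (a real table would cover every block in the Unicode Character Database), and blockNameSketch is a hypothetical name.

```javascript
// Illustrative excerpt only; sorted by start and non-overlapping.
const BLOCKS_SKETCH = [
  { start: 0x0000, end: 0x007f, name: 'Basic Latin' },
  { start: 0x0080, end: 0x00ff, name: 'Latin-1 Supplement' },
  { start: 0x4e00, end: 0x9fff, name: 'CJK Unified Ideographs' },
  { start: 0x1f600, end: 0x1f64f, name: 'Emoticons' },
];

// Binary search over the sorted ranges; returns null when no block matches.
function blockNameSketch(code) {
  let lo = 0;
  let hi = BLOCKS_SKETCH.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const block = BLOCKS_SKETCH[mid];
    if (code < block.start) {
      hi = mid - 1;
    } else if (code > block.end) {
      lo = mid + 1;
    } else {
      return block.name;
    }
  }
  return null;
}

console.log(blockNameSketch(65));     // Basic Latin
console.log(blockNameSketch(128512)); // Emoticons
console.log(blockNameSketch(19968));  // CJK Unified Ideographs
```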
isSurrogatePair(high, low)
Checks if two code points form a valid surrogate pair.
Parameters:
high
(number): High surrogate code point
low
(number): Low surrogate code point
Returns: (boolean) True if valid pair
Example:
isSurrogatePair(0xd83d, 0xde00); // true (😀)
isSurrogatePair(0x0041, 0x0042); // false
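The check reduces to two range tests: a high surrogate lies in 0xD800 to 0xDBFF and a low surrogate in 0xDC00 to 0xDFFF. The sketch below (hypothetical names) also shows the standard UTF-16 formula for combining a valid pair into a single code point.

```javascript
// Hypothetical sketch of the range test.
function isSurrogatePairSketch(high, low) {
  return high >= 0xd800 && high <= 0xdbff && low >= 0xdc00 && low <= 0xdfff;
}

// Standard UTF-16 decoding: each surrogate contributes 10 bits,
// and the result is offset by 0x10000.
function combineSurrogatesSketch(high, low) {
  if (!isSurrogatePairSketch(high, low)) {
    throw new RangeError('Not a valid surrogate pair');
  }
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

console.log(isSurrogatePairSketch(0xd83d, 0xde00));   // true
console.log(isSurrogatePairSketch(0x0041, 0x0042));   // false
console.log(combineSurrogatesSketch(0xd83d, 0xde00)); // 128512
```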
Error Handling
Error Types
InvalidUnicodeError
Thrown when a Unicode code point is invalid.
Properties:
code
: Invalid code point
message
: Error description
FormatError
Thrown when the input does not match the expected format.
Properties:
input
: Invalid input
format
: Expected format
message
: Error description
RangeError
Thrown when a code point is outside the valid range.
Properties:
code
: Out-of-range code point
min
: Minimum valid value (0)
max
: Maximum valid value (0x10FFFF)
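Error types with these properties could be defined as subclasses of Error, roughly as sketched below. The class names carry a Sketch suffix to mark them as illustrations, and the messages are invented. The range error is given a distinct name here because a class called RangeError would shadow JavaScript's built-in RangeError; whether the tool defines its own class or extends the built-in one is not stated in this reference.

```javascript
// Hypothetical sketches; constructors and messages are assumptions.
class InvalidUnicodeErrorSketch extends Error {
  constructor(code) {
    super(`Invalid Unicode code point: ${code}`);
    this.name = 'InvalidUnicodeError';
    this.code = code;
  }
}

class FormatErrorSketch extends Error {
  constructor(input, format) {
    super(`Input does not match the '${format}' format`);
    this.name = 'FormatError';
    this.input = input;
    this.format = format;
  }
}

class CodePointRangeErrorSketch extends Error {
  constructor(code) {
    super(`Code point ${code} is outside 0..0x10FFFF`);
    this.name = 'RangeError';
    this.code = code;
    this.min = 0;
    this.max = 0x10ffff;
  }
}

const err = new CodePointRangeErrorSketch(0x110000);
console.log(err instanceof Error); // true
console.log(err.max);              // 1114111
```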
Error Handling Example
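try {
  // 1114112 (0x110000) is one above the maximum code point, 0x10FFFF
  const result = convertUnicodeToText('1114112', 'decimal');
  console.log(result);
} catch (error) {
  if (error instanceof InvalidUnicodeError) {
    console.log(`Invalid Unicode code: ${error.code}`);
  } else if (error instanceof FormatError) {
    console.log(`Invalid format: ${error.message}`);
  } else {
    throw error; // Re-throw anything this handler does not recognize
  }
}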
Performance Considerations
Memory Usage
- Text Processing: O(n) memory usage for input text
- Large Inputs: Processed in chunks to prevent memory issues
- History Storage: Limited to 50 entries to manage memory
Processing Speed
- Small Text: < 1ms for typical text (up to 1000 characters)
- Large Text: Linear time complexity O(n)
- Format Conversion: Minimal overhead for format changes
Optimization Tips
- Batch Processing: Process multiple characters together
- Format Selection: Choose the most compact format that meets your needs (decimal output is typically shorter than U+XXXX or \uXXXX output)
- Input Validation: Validate input early to avoid processing errors
- Memory Management: Clear large inputs when not needed
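For the chunked processing mentioned under Memory Usage, one detail matters: a chunk boundary must not fall between the two halves of a surrogate pair. The generator below is a sketch of one way to handle that, not the tool's actual chunking logic; codePointChunksSketch and its chunkSize parameter (measured in UTF-16 code units, at least 1) are hypothetical.

```javascript
// Hypothetical sketch: yields slices of roughly chunkSize code units,
// extending a slice by one unit when the cut would split a surrogate pair.
function* codePointChunksSketch(text, chunkSize) {
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + chunkSize, text.length);
    if (end < text.length) {
      const last = text.charCodeAt(end - 1);
      // A high surrogate at the end of the slice means its partner comes next.
      if (last >= 0xd800 && last <= 0xdbff) {
        end += 1;
      }
    }
    yield text.slice(start, end);
    start = end;
  }
}

const chunks = [...codePointChunksSketch('a😀b😀c', 2)];
console.log(chunks); // [ 'a😀', 'b😀', 'c' ]
```

Each chunk can then be converted independently and the results concatenated, which keeps peak memory proportional to the chunk size rather than the full input.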
Browser Compatibility
Supported Browsers
- Chrome: 41+ (full support)
- Firefox: 29+ (full support)
- Safari: 10+ (full support)
- Edge: 12+ (full support)
Feature Support
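- String.fromCodePoint(): ES2015 feature; not available in Internet Explorer
- String.prototype.codePointAt(): ES2015 feature; not available in Internet Explorer
- Unicode Normalization (String.prototype.normalize()): ES2015 feature; not available in Internet Explorer
- Surrogate Pairs: Handled by all JavaScript engines, since strings are sequences of UTF-16 code units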
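Rather than relying on browser version numbers, the features listed above can be detected at runtime. The following sketch uses only standard typeof checks; the unicodeSupportSketch object is a hypothetical name.

```javascript
// Runtime feature detection for the built-ins this reference relies on.
const unicodeSupportSketch = {
  fromCodePoint: typeof String.fromCodePoint === 'function',
  codePointAt: typeof String.prototype.codePointAt === 'function',
  normalize: typeof String.prototype.normalize === 'function',
};

// Collect the names of any missing features so a polyfill can be loaded.
const missing = Object.keys(unicodeSupportSketch).filter((name) => !unicodeSupportSketch[name]);
console.log(missing.length === 0 ? 'All features available' : 'Missing: ' + missing.join(', '));
```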
Polyfills
For older browsers, consider polyfills:
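// String.fromCodePoint polyfill (simplified)
if (!String.fromCodePoint) {
  String.fromCodePoint = function () {
    var chars = [];
    for (var i = 0; i < arguments.length; i++) {
      var code = Number(arguments[i]);
      // Reject NaN, negative values, non-integers, and values above U+10FFFF
      if (!isFinite(code) || code < 0 || code > 0x10ffff || Math.floor(code) !== code) {
        throw new RangeError('Invalid code point: ' + arguments[i]);
      }
      if (code <= 0xffff) {
        // BMP code point: a single UTF-16 code unit
        chars.push(String.fromCharCode(code));
      } else {
        // Supplementary code point: encode as a surrogate pair
        code -= 0x10000;
        chars.push(
          String.fromCharCode(0xd800 + (code >> 10)),
          String.fromCharCode(0xdc00 + (code & 0x3ff))
        );
      }
    }
    return chars.join('');
  };
}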
Integration Examples
Web Component
// Assumes convertTextToUnicode is available in the current scope.
class UnicodeConverter extends HTMLElement {
  connectedCallback() {
    this.innerHTML = `
      <input type="text" id="input" placeholder="Enter text">
      <select id="format">
        <option value="decimal">Decimal</option>
        <option value="hex">Hexadecimal</option>
      </select>
      <div id="output"></div>
    `;
    const input = this.querySelector('#input');
    const format = this.querySelector('#format');
    const output = this.querySelector('#output');
    // Re-run the conversion whenever the text or the selected format changes
    const update = () => {
      output.textContent = convertTextToUnicode(input.value, format.value);
    };
    input.addEventListener('input', update);
    format.addEventListener('change', update);
  }
}
customElements.define('unicode-converter', UnicodeConverter);
Node.js Module
module.exports = {
  convertTextToUnicode,
  convertUnicodeToText,
  isValidUnicode,
  normalizeUnicode,
};