API Reference

Technical reference for the Text Unicode Converter's underlying functionality and integration options.

Core Functions

Text to Unicode Conversion

`convertTextToUnicode(text, format)`

Converts each character of the input text to its Unicode code point, written in the specified format.

Parameters:

text (string): Input text to convert
format (string): Output format ('decimal', 'hex', 'unicode-escape', 'html-entity')

Returns: (string) Unicode representation

Example:

convertTextToUnicode('Hello', 'decimal');
// Returns: "72 101 108 108 111"

convertTextToUnicode('Hello', 'hex');
// Returns: "U+0048 U+0065 U+006C U+006C U+006F"
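The conversion described above can be approximated with standard string methods. The following is a minimal sketch, not the tool's actual source: `sketchTextToUnicode` is a hypothetical name, and the choice of hexadecimal HTML entities and of `\u{...}` escapes for code points above U+FFFF are assumptions.

```javascript
// Hypothetical re-implementation for illustration only.
function sketchTextToUnicode(text, format) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
  const hex = (cp) => cp.toString(16).toUpperCase();

  switch (format) {
    case 'decimal':
      return codePoints.join(' ');
    case 'hex':
      return codePoints.map((cp) => 'U+' + hex(cp).padStart(4, '0')).join(' ');
    case 'unicode-escape':
      // Assumption: BMP code points use \uXXXX, all others use \u{X...}.
      return codePoints
        .map((cp) => (cp <= 0xffff ? '\\u' + hex(cp).padStart(4, '0') : '\\u{' + hex(cp) + '}'))
        .join('');
    case 'html-entity':
      // Assumption: hexadecimal entities; decimal entities would be equally valid.
      return codePoints.map((cp) => '&#x' + hex(cp) + ';').join('');
    default:
      throw new Error('Unsupported format: ' + format);
  }
}
```

Iterating by code point rather than by UTF-16 code unit is what keeps an emoji such as '😀' as a single value (U+1F600) instead of two surrogate halves.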

`getCodePoint(char)`

Gets the Unicode code point for a single character.

Parameters:

char (string): A single character, meaning one code point (which may occupy two UTF-16 code units, as with '😀')

Returns: (number) Unicode code point

Example:

getCodePoint('A');
// Returns: 65

getCodePoint('😀');
// Returns: 128512
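The emoji example works because the lookup is done per code point, not per UTF-16 code unit. A minimal sketch of that distinction, assuming a hypothetical helper `sketchGetCodePoint` that rejects anything other than exactly one code point (the real function's behavior for longer input is not documented here):

```javascript
// Hypothetical helper for illustration; the error type is an assumption.
function sketchGetCodePoint(char) {
  // Array.from splits by code point, so '😀' yields one element, not two.
  const chars = Array.from(char);
  if (chars.length !== 1) {
    throw new TypeError('Expected exactly one character');
  }
  return chars[0].codePointAt(0);
}

const emoji = '😀';
const unitCount = emoji.length;          // 2 UTF-16 code units
const firstUnit = emoji.charCodeAt(0);   // 55357 (0xD83D, the high surrogate only)
const codePoint = sketchGetCodePoint(emoji); // 128512 (0x1F600)
```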

Unicode to Text Conversion

`convertUnicodeToText(unicode, format)`

Converts Unicode code points to text.

Parameters:

unicode (string): Unicode input
format (string): Input format ('decimal', 'hex', 'unicode-escape', 'html-entity')

Returns: (string) Converted text

Example:

convertUnicodeToText('72 101 108 108 111', 'decimal');
// Returns: "Hello"

convertUnicodeToText('U+0048 U+0065 U+006C U+006C U+006F', 'hex');
// Returns: "Hello"
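The reverse direction amounts to tokenizing the input and feeding each number to `String.fromCodePoint`. The sketch below is an illustration under assumptions: `sketchUnicodeToText` is a hypothetical name, the regular expressions are one plausible reading of the formats on this page, and text that matches no token is silently skipped rather than reported.

```javascript
// Hypothetical re-implementation for illustration only.
function sketchUnicodeToText(input, format) {
  // One global pattern per format; capture groups hold the digits.
  const patterns = {
    decimal: /(\d+)/g,
    hex: /U\+([0-9A-Fa-f]{4,6})/g,
    'unicode-escape': /\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})/g,
    'html-entity': /&#x([0-9A-Fa-f]+);|&#(\d+);/g,
  };
  if (!Object.prototype.hasOwnProperty.call(patterns, format)) {
    throw new Error('Unsupported format: ' + format);
  }

  let text = '';
  for (const match of input.matchAll(patterns[format])) {
    // Digits are decimal for the 'decimal' format and for &#DDDD; entities.
    const isDecimal =
      format === 'decimal' || (format === 'html-entity' && match[2] !== undefined);
    const digits = match[1] !== undefined ? match[1] : match[2];
    // String.fromCodePoint throws the built-in RangeError above 0x10FFFF.
    text += String.fromCodePoint(parseInt(digits, isDecimal ? 10 : 16));
  }
  return text;
}
```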

`fromCodePoint(code)`

Creates a character from a Unicode code point.

Parameters:

code (number): Unicode code point

Returns: (string) Character

Example:

fromCodePoint(65);
// Returns: "A"

fromCodePoint(128512);
// Returns: "😀"

Format Specifications

Decimal Format

Pattern: Space-separated decimal numbers
Range: 0 to 1114111 (0x10FFFF)
Example: 72 101 108 108 111

Hexadecimal Format

Pattern: Space-separated values, each written as U+ followed by 4 to 6 hex digits
Range: U+0000 to U+10FFFF
Example: U+0048 U+0065 U+006C U+006C U+006F

Unicode Escape Format

Pattern: \uXXXX or \u{XXXXXX}
Range: \u0000-\uFFFF or \u{0}-\u{10FFFF}
Example: \u0048\u0065\u006C\u006C\u006F

HTML Entity Format

Pattern: &#xHHHH; (hexadecimal) or &#DDDD; (decimal)
Range: &#x0; to &#x10FFFF; or &#0; to &#1114111;
Example: &#x48;&#x65;&#x6C;&#x6C;&#x6F;

Validation Functions

`isValidUnicode(code)`

Checks whether a number is a valid Unicode code point, that is, an integer from 0 to 0x10FFFF.

Parameters:

code (number): Unicode code point

Returns: (boolean) True if valid

Example:

isValidUnicode(65); // true
isValidUnicode(128512); // true
isValidUnicode(1114112); // false (one past the maximum, 0x10FFFF)
isValidUnicode(-1); // false
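A range check of this kind is short to write by hand. The sketch below uses hypothetical names and shows two possible policies; whether the real `isValidUnicode` accepts surrogate code points (0xD800 to 0xDFFF) is not stated on this page, so the stricter variant is offered only as an option.

```javascript
// Hypothetical check: any integer in the Unicode code space.
function sketchIsValidCodePoint(code) {
  return Number.isInteger(code) && code >= 0 && code <= 0x10ffff;
}

// Stricter hypothetical check: a Unicode scalar value, which excludes
// the surrogate range reserved for UTF-16 encoding.
function sketchIsScalarValue(code) {
  return sketchIsValidCodePoint(code) && !(code >= 0xd800 && code <= 0xdfff);
}
```

Using `Number.isInteger` also rejects NaN, fractions, and non-number arguments such as strings.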

`isValidFormat(input, format)`

Checks whether the input string matches the specified format.

Parameters:

input (string): Input to validate
format (string): Expected format

Returns: (boolean) True if valid

Example:

isValidFormat('65 66 67', 'decimal'); // true
isValidFormat('U+0041 U+0042', 'hex'); // true
isValidFormat('\\u0041\\u0042', 'unicode-escape'); // true
isValidFormat('&#x41;&#x42;', 'html-entity'); // true
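Format validation of this sort is commonly done with one anchored regular expression per format. The following sketch assumes the patterns listed under Format Specifications; `sketchIsValidFormat` and the exact expressions are hypothetical, and the check is purely syntactic (it does not verify that each value is within 0 to 0x10FFFF).

```javascript
// Hypothetical patterns; each must match the whole input string.
const SKETCH_FORMAT_PATTERNS = new Map([
  ['decimal', /^\d+(?: \d+)*$/],
  ['hex', /^U\+[0-9A-Fa-f]{4,6}(?: U\+[0-9A-Fa-f]{4,6})*$/],
  ['unicode-escape', /^(?:\\u[0-9A-Fa-f]{4}|\\u\{[0-9A-Fa-f]{1,6}\})+$/],
  ['html-entity', /^(?:&#x[0-9A-Fa-f]+;|&#\d+;)+$/],
]);

function sketchIsValidFormat(input, format) {
  const pattern = SKETCH_FORMAT_PATTERNS.get(format);
  // Unknown format names are treated as invalid rather than throwing.
  return pattern !== undefined && pattern.test(input);
}
```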

Utility Functions

`normalizeUnicode(text, form)`

Normalizes Unicode text using specified form.

Parameters:

text (string): Input text
form (string): Normalization form ('NFC', 'NFD', 'NFKC', 'NFKD')

Returns: (string) Normalized text

Example:

normalizeUnicode('é', 'NFC'); // "é" (U+00E9)
normalizeUnicode('é', 'NFD'); // "é" (U+0065 U+0301)
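The two results look identical on screen but differ in their underlying code points, which is why comparing unnormalized strings can fail. A minimal sketch built on the standard `String.prototype.normalize` method; the wrapper name `sketchNormalizeUnicode` and its error behavior are assumptions:

```javascript
// Hypothetical wrapper around the built-in normalize method.
function sketchNormalizeUnicode(text, form) {
  const forms = ['NFC', 'NFD', 'NFKC', 'NFKD'];
  if (!forms.includes(form)) {
    throw new Error('Unsupported normalization form: ' + form);
  }
  return text.normalize(form);
}

const composed = '\u00e9';    // é as a single code point
const decomposed = 'e\u0301'; // e followed by a combining acute accent

const rawEqual = composed === decomposed; // false: different code units
const nfcEqual =
  sketchNormalizeUnicode(composed, 'NFC') === sketchNormalizeUnicode(decomposed, 'NFC'); // true
```

The compatibility forms (NFKC, NFKD) additionally fold characters such as the 'ﬁ' ligature (U+FB01) into their plain equivalents.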

`getUnicodeBlock(code)`

Gets the Unicode block name for a code point.

Parameters:

code (number): Unicode code point

Returns: (string) Block name

Example:

getUnicodeBlock(65); // "Basic Latin"
getUnicodeBlock(128512); // "Emoticons"
getUnicodeBlock(19968); // "CJK Unified Ideographs"
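A block lookup is essentially a range search over a table of block boundaries. The sketch below includes only four blocks from the Unicode standard for illustration; the function name and the 'Unknown' fallback are assumptions, and a real table would cover every block.

```javascript
// A small sample of Unicode blocks (ranges are inclusive).
const SKETCH_BLOCKS = [
  { start: 0x0000, end: 0x007f, name: 'Basic Latin' },
  { start: 0x0080, end: 0x00ff, name: 'Latin-1 Supplement' },
  { start: 0x4e00, end: 0x9fff, name: 'CJK Unified Ideographs' },
  { start: 0x1f600, end: 0x1f64f, name: 'Emoticons' },
];

// Hypothetical lookup; returns 'Unknown' for code points outside the sample table.
function sketchGetUnicodeBlock(code) {
  const block = SKETCH_BLOCKS.find((b) => code >= b.start && code <= b.end);
  return block ? block.name : 'Unknown';
}
```

With a complete, sorted table, a binary search would be the natural replacement for the linear `find`.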

`isSurrogatePair(high, low)`

Checks if two code points form a valid surrogate pair.

Parameters:

high (number): High (leading) surrogate, expected in the range 0xD800 to 0xDBFF
low (number): Low (trailing) surrogate, expected in the range 0xDC00 to 0xDFFF

Returns: (boolean) True if valid pair

Example:

isSurrogatePair(0xd83d, 0xde00); // true (😀)
isSurrogatePair(0x0041, 0x0042); // false
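The check comes down to two range tests, and the same ranges give the UTF-16 arithmetic for combining a pair into a code point and splitting it again. A minimal sketch with hypothetical function names:

```javascript
// High surrogates are 0xD800-0xDBFF; low surrogates are 0xDC00-0xDFFF.
function sketchIsSurrogatePair(high, low) {
  return high >= 0xd800 && high <= 0xdbff && low >= 0xdc00 && low <= 0xdfff;
}

// Combine a valid pair into the supplementary code point it encodes.
function sketchCombineSurrogates(high, low) {
  if (!sketchIsSurrogatePair(high, low)) {
    throw new Error('Not a valid surrogate pair');
  }
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

// Split a code point above 0xFFFF into [high, low] surrogates.
function sketchSplitCodePoint(code) {
  const offset = code - 0x10000;
  return [0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff)];
}
```

For example, 0xD83D and 0xDE00 combine to 0x1F600 (128512), the code point of '😀'.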

Error Handling

Error Types

`InvalidUnicodeError`

Thrown when Unicode code point is invalid.

Properties:

code: Invalid code point
message: Error description

`FormatError`

Thrown when input format is invalid.

Properties:

input: Invalid input
format: Expected format
message: Error description

`RangeError`

Thrown when code point is out of valid range.

Properties:

code: Out-of-range code point
min: Minimum valid value (0)
max: Maximum valid value (0x10FFFF)

Error Handling Example

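try {
  // 1114112 is one past the maximum code point (0x10FFFF)
  const result = convertUnicodeToText('1114112', 'decimal');
  console.log(result);
} catch (error) {
  if (error instanceof InvalidUnicodeError) {
    console.log(`Invalid Unicode code: ${error.code}`);
  } else if (error instanceof FormatError) {
    console.log(`Invalid format: ${error.message}`);
  } else {
    throw error; // do not swallow unexpected errors
  }
}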
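For readers who want to see how such error types can carry the listed properties, here is a minimal sketch. The class names mirror the ones documented above, but the constructor signatures, the message wording, and the `sketchParseDecimal` helper are all assumptions made for illustration.

```javascript
// Stand-in error classes; constructor signatures are hypothetical.
class InvalidUnicodeError extends Error {
  constructor(code) {
    super('Invalid Unicode code point: ' + code);
    this.name = 'InvalidUnicodeError';
    this.code = code;
  }
}

class FormatError extends Error {
  constructor(input, format) {
    super('Input does not match the "' + format + '" format');
    this.name = 'FormatError';
    this.input = input;
    this.format = format;
  }
}

// Hypothetical parser for one decimal token, showing where each error fits.
function sketchParseDecimal(token) {
  if (!/^\d+$/.test(token)) {
    throw new FormatError(token, 'decimal');
  }
  const code = Number(token);
  if (code > 0x10ffff) {
    throw new InvalidUnicodeError(code);
  }
  return code;
}
```

Setting `name` explicitly keeps stack traces and logs readable, and the extra properties let callers react without parsing the message text.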

Performance Considerations

Memory Usage

Text Processing: O(n) memory usage for input text
Large Inputs: Processed in chunks to prevent memory issues
History Storage: Limited to 50 entries to manage memory
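One detail matters when text is processed in chunks: a chunk boundary must not fall between the two halves of a surrogate pair. How the tool itself chunks input is not documented here, so the generator below is only a sketch of one safe approach, with a hypothetical name and a chunk size measured in UTF-16 code units.

```javascript
// Yields slices of roughly chunkSize code units without splitting surrogate pairs.
function* sketchChunks(text, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error('chunkSize must be a positive integer');
  }
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + chunkSize, text.length);
    // If the chunk would end on a high surrogate, take the low surrogate too.
    const last = text.charCodeAt(end - 1);
    if (end < text.length && last >= 0xd800 && last <= 0xdbff) {
      end += 1;
    }
    yield text.slice(start, end);
    start = end;
  }
}
```

Because a generator produces one slice at a time, the caller can convert and discard each chunk without holding every intermediate result in memory.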

Processing Speed

Small Text: < 1ms for typical text (up to 1000 characters)
Large Text: Linear time complexity O(n)
Format Conversion: Minimal overhead for format changes

Optimization Tips

Batch Processing: Process multiple characters together
Format Selection: Choose most efficient format for your use case
Input Validation: Validate input early to avoid processing errors
Memory Management: Clear large inputs when not needed

Browser Compatibility

Supported Browsers

Chrome: 41+ (full support)
Firefox: 29+ (full support)
Safari: 10+ (full support)
Edge: 12+ (full support)

Feature Support

String.fromCodePoint(): Modern browsers
String.prototype.codePointAt(): Modern browsers
Unicode Normalization: Modern browsers
Surrogate Pairs: All modern browsers
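Rather than relying on browser version numbers, support for these features can be probed at runtime. A minimal feature-detection sketch; the function name is hypothetical, and the checks only confirm that the methods exist, not that they behave correctly.

```javascript
// Reports which of the built-in Unicode helpers are available.
function sketchDetectUnicodeFeatures() {
  return {
    fromCodePoint: typeof String.fromCodePoint === 'function',
    codePointAt: typeof String.prototype.codePointAt === 'function',
    normalize: typeof String.prototype.normalize === 'function',
  };
}

const features = sketchDetectUnicodeFeatures();
if (!features.fromCodePoint || !features.codePointAt || !features.normalize) {
  console.warn('Some Unicode features are missing; polyfills may be needed.');
}
```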

Polyfills

For older browsers, consider polyfills:

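// String.fromCodePoint polyfill
if (!String.fromCodePoint) {
  String.fromCodePoint = function () {
    var chars = [];
    for (var i = 0; i < arguments.length; i++) {
      var code = Number(arguments[i]);
      // Reject NaN, infinities, negatives, fractions, and values above U+10FFFF
      if (!isFinite(code) || code < 0 || code > 0x10ffff || Math.floor(code) !== code) {
        throw new RangeError('Invalid code point: ' + arguments[i]);
      }
      if (code <= 0xffff) {
        // BMP code point: a single UTF-16 code unit
        chars.push(String.fromCharCode(code));
      } else {
        // Supplementary code point: encode as a surrogate pair
        code -= 0x10000;
        chars.push(
          String.fromCharCode(0xd800 + (code >> 10)),
          String.fromCharCode(0xdc00 + (code & 0x3ff))
        );
      }
    }
    return chars.join('');
  };
}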
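The counterpart operation, reading a code point out of a string, can be covered the same way. The helper below is a sketch written as a standalone function (hypothetical name `sketchCodePointAt`) rather than a patch to `String.prototype`, and it uses ES5 syntax so it can run in the older browsers a polyfill targets.

```javascript
// Returns the code point starting at the given index, or undefined if out of range.
function sketchCodePointAt(str, index) {
  var first = str.charCodeAt(index);
  if (first !== first) {
    return undefined; // charCodeAt returned NaN: index is out of range
  }
  // A high surrogate followed by a low surrogate encodes one supplementary code point.
  if (first >= 0xd800 && first <= 0xdbff && index + 1 < str.length) {
    var second = str.charCodeAt(index + 1);
    if (second >= 0xdc00 && second <= 0xdfff) {
      return 0x10000 + ((first - 0xd800) << 10) + (second - 0xdc00);
    }
  }
  // BMP character or unpaired surrogate: return the code unit itself.
  return first;
}
```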

Integration Examples

Web Component

class UnicodeConverter extends HTMLElement {
  connectedCallback() {
    this.innerHTML = `
      <input type="text" id="input" placeholder="Enter text">
      <select id="format">
        <option value="decimal">Decimal</option>
        <option value="hex">Hexadecimal</option>
      </select>
      <div id="output"></div>
    `;

    this.querySelector('#input').addEventListener('input', (e) => {
      const text = e.target.value;
      const format = this.querySelector('#format').value;
      const result = convertTextToUnicode(text, format);
      this.querySelector('#output').textContent = result;
    });
  }
}

customElements.define('unicode-converter', UnicodeConverter);

Node.js Module

module.exports = {
  convertTextToUnicode,
  convertUnicodeToText,
  isValidUnicode,
  normalizeUnicode,
};
