API Reference

Technical reference for the Text Unicode Converter's underlying functionality and integration options.

Core Functions

Text to Unicode Conversion

convertTextToUnicode(text, format)

Converts text to Unicode code points in the specified format.

Parameters:

  • text (string): Input text to convert
  • format (string): Output format ('decimal', 'hex', 'unicode-escape', 'html-entity')

Returns: (string) Unicode representation

Example:

convertTextToUnicode('Hello', 'decimal');
// Returns: "72 101 108 108 111"

convertTextToUnicode('Hello', 'hex');
// Returns: "U+0048 U+0065 U+006C U+006C U+006F"
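
For orientation, the following is a minimal sketch of how a conversion like this could be written with standard string methods. The name sketchTextToUnicode is hypothetical and only the 'decimal' and 'hex' formats are covered; the converter's actual implementation may differ.

```javascript
// Hypothetical sketch, not the converter's actual source.
function sketchTextToUnicode(text, format) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
  switch (format) {
    case 'decimal':
      return codePoints.map((cp) => String(cp)).join(' ');
    case 'hex':
      // Pad to at least four hex digits, matching the U+XXXX convention.
      return codePoints
        .map((cp) => 'U+' + cp.toString(16).toUpperCase().padStart(4, '0'))
        .join(' ');
    default:
      throw new Error('Format not covered by this sketch: ' + format);
  }
}

console.log(sketchTextToUnicode('Hello', 'decimal')); // "72 101 108 108 111"
console.log(sketchTextToUnicode('Hi 😀', 'hex')); // "U+0048 U+0069 U+0020 U+1F600"
```

Iterating by code point rather than by index is what keeps characters outside the Basic Multilingual Plane from being split into two surrogate values.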

getCodePoint(char)

Gets the Unicode code point for a single character.

Parameters:

  • char (string): Single character

Returns: (number) Unicode code point

Example:

getCodePoint('A');
// Returns: 65

getCodePoint('😀');
// Returns: 128512
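
The emoji case is worth a closer look because JavaScript strings are indexed by UTF-16 code units, not code points. The sketch below uses only built-in string methods; sketchGetCodePoint is a hypothetical stand-in that assumes the function rejects input that is not exactly one code point, which the reference above does not specify.

```javascript
const emoji = '😀';
console.log(emoji.length); // 2 (two UTF-16 code units)
console.log(emoji.charCodeAt(0)); // 55357 (0xD83D, the high surrogate only)
console.log(emoji.codePointAt(0)); // 128512 (the full code point)

// Hypothetical sketch: accept exactly one code point, reject anything else.
function sketchGetCodePoint(char) {
  const chars = Array.from(char); // splits by code point
  if (chars.length !== 1) {
    throw new TypeError('Expected exactly one character');
  }
  return chars[0].codePointAt(0);
}

console.log(sketchGetCodePoint('A')); // 65
console.log(sketchGetCodePoint('😀')); // 128512
```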

Unicode to Text Conversion

convertUnicodeToText(unicode, format)

Converts Unicode code points to text.

Parameters:

  • unicode (string): Unicode input
  • format (string): Input format ('decimal', 'hex', 'unicode-escape', 'html-entity')

Returns: (string) Converted text

Example:

convertUnicodeToText('72 101 108 108 111', 'decimal');
// Returns: "Hello"

convertUnicodeToText('U+0048 U+0065 U+006C U+006C U+006F', 'hex');
// Returns: "Hello"
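
The reverse direction can be sketched in the same way. This is an illustration under assumptions: sketchUnicodeToText is a hypothetical name, only 'decimal' and 'hex' are handled, and tokens are assumed to be whitespace-separated as in the examples above.

```javascript
// Hypothetical sketch, not the converter's actual source.
function sketchUnicodeToText(unicode, format) {
  const tokens = unicode.trim().split(/\s+/);
  const codePoints = tokens.map((token) => {
    if (format === 'decimal' && /^\d+$/.test(token)) {
      return parseInt(token, 10);
    }
    if (format === 'hex' && /^U\+[0-9A-Fa-f]{4,6}$/.test(token)) {
      return parseInt(token.slice(2), 16); // drop the "U+" prefix
    }
    throw new Error('Token "' + token + '" does not match format ' + format);
  });
  // String.fromCodePoint throws a built-in RangeError above 0x10FFFF.
  return String.fromCodePoint(...codePoints);
}

console.log(sketchUnicodeToText('72 101 108 108 111', 'decimal')); // "Hello"
console.log(sketchUnicodeToText('U+1F600', 'hex')); // "😀"
```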

fromCodePoint(code)

Creates a character from a Unicode code point.

Parameters:

  • code (number): Unicode code point

Returns: (string) Character

Example:

fromCodePoint(65);
// Returns: "A"

fromCodePoint(128512);
// Returns: "😀"

Format Specifications

Decimal Format

  • Pattern: Space-separated decimal numbers
  • Range: 0 to 1114111 (0x10FFFF)
  • Example: 72 101 108 108 111

Hexadecimal Format

  • Pattern: U+ followed by 4-6 hex digits
  • Range: U+0000 to U+10FFFF
  • Example: U+0048 U+0065 U+006C U+006C U+006F

Unicode Escape Format

  • Pattern: \uXXXX or \u{XXXXXX}
  • Range: \u0000-\uFFFF or \u{0}-\u{10FFFF}
  • Example: \u0048\u0065\u006C\u006C\u006F
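
The two escape patterns cover different ranges, so an encoder has to choose between them per code point. A minimal sketch, assuming the four-digit form is preferred whenever it fits (the function name sketchToEscape is hypothetical):

```javascript
// Hypothetical sketch: \uXXXX for the BMP, \u{...} for everything above it.
function sketchToEscape(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    const hex = cp.toString(16).toUpperCase();
    return cp <= 0xffff ? '\\u' + hex.padStart(4, '0') : '\\u{' + hex + '}';
  }).join('');
}

console.log(sketchToEscape('Hello')); // \u0048\u0065\u006C\u006C\u006F
console.log(sketchToEscape('😀')); // \u{1F600}
```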

HTML Entity Format

  • Pattern: &#xHHHH; (hexadecimal) or &#DDDD; (decimal)
  • Range: &#x0; to &#x10FFFF; or &#0; to &#1114111;
  • Example: &#72;&#101;&#108;&#108;&#111;
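
As an illustration of the numeric character reference syntax, the sketch below encodes text as decimal or hexadecimal entities and decodes it again. Both function names are hypothetical, and the decoder only handles numeric references, not named entities such as &amp;.

```javascript
// Hypothetical sketch: encode each code point as a numeric character reference.
function sketchToHtmlEntities(text, radix = 10) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return radix === 16
      ? '&#x' + cp.toString(16).toUpperCase() + ';'
      : '&#' + cp + ';';
  }).join('');
}

// Hypothetical sketch: decode decimal and hexadecimal numeric references.
function sketchFromHtmlEntities(input) {
  return input.replace(/&#(x[0-9A-Fa-f]+|\d+);/g, (match, body) => {
    const cp = body[0] === 'x' ? parseInt(body.slice(1), 16) : parseInt(body, 10);
    return String.fromCodePoint(cp);
  });
}

console.log(sketchToHtmlEntities('Hello')); // &#72;&#101;&#108;&#108;&#111;
console.log(sketchToHtmlEntities('A', 16)); // &#x41;
console.log(sketchFromHtmlEntities('&#72;&#x69;')); // Hi
```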

Validation Functions

isValidUnicode(code)

Checks whether a number is a valid Unicode code point.

Parameters:

  • code (number): Unicode code point

Returns: (boolean) True if valid

Example:

isValidUnicode(65); // true
isValidUnicode(128512); // true
isValidUnicode(1114112); // false (0x110000, above the maximum 0x10FFFF)
isValidUnicode(-1); // false
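
A range check of this kind can be sketched in a few lines. The names below are hypothetical. Whether the converter also rejects surrogate code points (0xD800 to 0xDFFF) is not stated in this reference, so the sketch keeps that as a separate, stricter check.

```javascript
// Hypothetical sketch: any integer in the Unicode code space.
function sketchIsValidCodePoint(code) {
  return Number.isInteger(code) && code >= 0 && code <= 0x10ffff;
}

// Stricter variant: also excludes surrogates, leaving Unicode scalar values.
function sketchIsScalarValue(code) {
  return sketchIsValidCodePoint(code) && !(code >= 0xd800 && code <= 0xdfff);
}

console.log(sketchIsValidCodePoint(65)); // true
console.log(sketchIsValidCodePoint(0x110000)); // false
console.log(sketchIsValidCodePoint(-1)); // false
console.log(sketchIsScalarValue(0xd83d)); // false
```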

isValidFormat(input, format)

Checks whether the input matches the specified format.

Parameters:

  • input (string): Input to validate
  • format (string): Expected format

Returns: (boolean) True if valid

Example:

isValidFormat('65 66 67', 'decimal'); // true
isValidFormat('U+0041 U+0042', 'hex'); // true
isValidFormat('\\u0041\\u0042', 'unicode-escape'); // true
isValidFormat('&#65;&#66;', 'html-entity'); // true
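
One plausible way to implement a check like this is a regular expression per format, derived from the patterns in Format Specifications. The table and function below are a hypothetical sketch; they test syntax only and do not verify that each value lies within 0 to 0x10FFFF.

```javascript
// Hypothetical sketch: one syntax pattern per format name.
const SKETCH_PATTERNS = new Map([
  ['decimal', /^\d+(?: \d+)*$/],
  ['hex', /^U\+[0-9A-Fa-f]{4,6}(?: U\+[0-9A-Fa-f]{4,6})*$/],
  ['unicode-escape', /^(?:\\u[0-9A-Fa-f]{4}|\\u\{[0-9A-Fa-f]{1,6}\})+$/],
  ['html-entity', /^(?:&#x[0-9A-Fa-f]+;|&#\d+;)+$/],
]);

function sketchIsValidFormat(input, format) {
  const pattern = SKETCH_PATTERNS.get(format);
  // Unknown format names are treated as invalid rather than throwing.
  return pattern !== undefined && pattern.test(input);
}

console.log(sketchIsValidFormat('65 66 67', 'decimal')); // true
console.log(sketchIsValidFormat('\\u0041\\u{1F600}', 'unicode-escape')); // true
console.log(sketchIsValidFormat('U+41', 'hex')); // false (fewer than 4 digits)
```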

Utility Functions

normalizeUnicode(text, form)

Normalizes Unicode text using the specified normalization form.

Parameters:

  • text (string): Input text
  • form (string): Normalization form ('NFC', 'NFD', 'NFKC', 'NFKD')

Returns: (string) Normalized text

Example:

normalizeUnicode('é', 'NFC'); // "é" as one code point (U+00E9)
normalizeUnicode('é', 'NFD'); // "e" plus a combining acute accent (U+0065 U+0301)
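
Because the composed and decomposed forms render identically, the difference is easier to see with explicit escapes. The sketch below uses the built-in String.prototype.normalize directly; it is assumed, but not confirmed by this reference, that normalizeUnicode wraps it.

```javascript
const composed = '\u00E9'; // é as a single code point
const decomposed = 'e\u0301'; // e followed by a combining acute accent

console.log(composed === decomposed); // false
console.log(composed.length, decomposed.length); // 1 2
console.log(composed.normalize('NFD') === decomposed); // true
console.log(decomposed.normalize('NFC') === composed); // true

// The compatibility forms additionally fold characters such as ligatures.
console.log('\uFB01'.normalize('NFKC')); // "fi"
```

Normalizing both sides to the same form before comparing is the usual way to make visually identical strings compare equal.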

getUnicodeBlock(code)

Gets the Unicode block name for a code point.

Parameters:

  • code (number): Unicode code point

Returns: (string) Block name

Example:

getUnicodeBlock(65); // "Basic Latin"
getUnicodeBlock(128512); // "Emoticons"
getUnicodeBlock(19968); // "CJK Unified Ideographs"
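
A block lookup is essentially a range search. The sketch below is hypothetical and lists only four blocks for illustration; a real table would cover every block defined by the Unicode Standard. Returning null for unlisted code points is an assumption of this sketch.

```javascript
// Hypothetical sketch with a deliberately tiny block table.
const SKETCH_BLOCKS = [
  { start: 0x0000, end: 0x007f, name: 'Basic Latin' },
  { start: 0x0080, end: 0x00ff, name: 'Latin-1 Supplement' },
  { start: 0x4e00, end: 0x9fff, name: 'CJK Unified Ideographs' },
  { start: 0x1f600, end: 0x1f64f, name: 'Emoticons' },
];

function sketchGetBlock(code) {
  const block = SKETCH_BLOCKS.find((b) => code >= b.start && code <= b.end);
  return block ? block.name : null; // null: not in this small table
}

console.log(sketchGetBlock(65)); // "Basic Latin"
console.log(sketchGetBlock(233)); // "Latin-1 Supplement"
console.log(sketchGetBlock(128512)); // "Emoticons"
```

With a complete table, sorting the ranges and using binary search would keep lookups fast.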

isSurrogatePair(high, low)

Checks if two code points form a valid surrogate pair.

Parameters:

  • high (number): High surrogate code point
  • low (number): Low surrogate code point

Returns: (boolean) True if valid pair

Example:

isSurrogatePair(0xd83d, 0xde00); // true (😀)
isSurrogatePair(0x0041, 0x0042); // false
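
The check follows from the UTF-16 ranges: high surrogates are 0xD800 to 0xDBFF and low surrogates are 0xDC00 to 0xDFFF. The sketch below uses hypothetical names and also shows how a valid pair maps back to a single code point.

```javascript
// Hypothetical sketch of the range check.
function sketchIsSurrogatePair(high, low) {
  return high >= 0xd800 && high <= 0xdbff && low >= 0xdc00 && low <= 0xdfff;
}

// Combine a valid pair into the code point it encodes.
function sketchCombineSurrogates(high, low) {
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

console.log(sketchIsSurrogatePair(0xd83d, 0xde00)); // true
console.log(sketchIsSurrogatePair(0xde00, 0xd83d)); // false (wrong order)
console.log(sketchCombineSurrogates(0xd83d, 0xde00)); // 128512
```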

Error Handling

Error Types

InvalidUnicodeError

Thrown when a Unicode code point is invalid.

Properties:

  • code: Invalid code point
  • message: Error description

FormatError

Thrown when the input does not match the expected format.

Properties:

  • input: Invalid input
  • format: Expected format
  • message: Error description

RangeError

Thrown when a code point is outside the valid range.

Properties:

  • code: Out-of-range code point
  • min: Minimum valid value (0)
  • max: Maximum valid value (0x10FFFF)

Error Handling Example

try {
  const result = convertUnicodeToText('1114112', 'decimal'); // 0x110000, above the maximum
} catch (error) {
  if (error instanceof InvalidUnicodeError) {
    console.log(`Invalid Unicode code: ${error.code}`);
  } else if (error instanceof FormatError) {
    console.log(`Invalid format: ${error.message}`);
  }
}
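
For readers who want to see how error types with these properties could be defined, here is a hedged sketch. The class names carry a Sketch suffix to make clear they are hypothetical and not the converter's own classes, and sketchParseDecimal is an illustrative helper. Note that JavaScript already has a built-in RangeError, so a custom class with that name would shadow it in the same scope.

```javascript
// Hypothetical sketch of error classes carrying the documented properties.
class InvalidUnicodeErrorSketch extends Error {
  constructor(code) {
    super('Invalid Unicode code point: ' + code);
    this.name = 'InvalidUnicodeError';
    this.code = code;
  }
}

class FormatErrorSketch extends Error {
  constructor(input, format) {
    super('Input does not match format "' + format + '"');
    this.name = 'FormatError';
    this.input = input;
    this.format = format;
  }
}

// Illustrative helper: parse one decimal token, throwing the errors above.
function sketchParseDecimal(token) {
  if (!/^\d+$/.test(token)) throw new FormatErrorSketch(token, 'decimal');
  const code = parseInt(token, 10);
  if (code > 0x10ffff) throw new InvalidUnicodeErrorSketch(code);
  return code;
}

try {
  sketchParseDecimal('1114112');
} catch (error) {
  if (error instanceof InvalidUnicodeErrorSketch) {
    console.log(error.name + ': ' + error.code); // InvalidUnicodeError: 1114112
  }
}
```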

Performance Considerations

Memory Usage

  • Text Processing: O(n) memory usage for input text
  • Large Inputs: Processed in chunks to prevent memory issues
  • History Storage: Limited to 50 entries to manage memory
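
The chunking and history limits above can be illustrated with a short sketch. Everything here is an assumption about how such behavior might be built: the names, the default chunk size of 4096 code units, and the oldest-first eviction are hypothetical. The one detail that matters in any implementation is that a chunk boundary must not fall between the two halves of a surrogate pair.

```javascript
// Hypothetical sketch: yield chunks without splitting surrogate pairs.
function* sketchChunks(text, chunkSize = 4096) {
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + chunkSize, text.length);
    if (end < text.length) {
      const last = text.charCodeAt(end - 1);
      // If the chunk would end on a high surrogate, include its partner.
      if (last >= 0xd800 && last <= 0xdbff) end += 1;
    }
    yield text.slice(start, end);
    start = end;
  }
}

// Hypothetical sketch: keep at most 50 history entries, dropping the oldest.
const SKETCH_HISTORY_LIMIT = 50;
function sketchPushHistory(history, entry) {
  history.push(entry);
  if (history.length > SKETCH_HISTORY_LIMIT) history.shift();
  return history;
}

console.log(Array.from(sketchChunks('a😀b', 2))); // [ 'a😀', 'b' ]
```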

Processing Speed

  • Small Text: < 1ms for typical text (up to 1000 characters)
  • Large Text: Linear time complexity O(n)
  • Format Conversion: Minimal overhead for format changes

Optimization Tips

  1. Batch Processing: Process multiple characters together
  2. Format Selection: Choose most efficient format for your use case
  3. Input Validation: Validate input early to avoid processing errors
  4. Memory Management: Clear large inputs when not needed

Browser Compatibility

Supported Browsers

  • Chrome: 41+ (full support)
  • Firefox: 29+ (full support)
  • Safari: 10+ (full support)
  • Edge: 12+ (full support)

Feature Support

  • String.fromCodePoint(): Modern browsers
  • String.prototype.codePointAt(): Modern browsers
  • Unicode Normalization: Modern browsers
  • Surrogate Pairs: All modern browsers
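
Rather than checking browser versions, the features listed above can be detected at run time. A minimal sketch (the object name sketchSupport is hypothetical):

```javascript
// Detect the built-ins listed above before relying on them.
const sketchSupport = {
  fromCodePoint: typeof String.fromCodePoint === 'function',
  codePointAt: typeof String.prototype.codePointAt === 'function',
  normalize: typeof String.prototype.normalize === 'function',
};

console.log(sketchSupport);
```

Any entry that is false indicates a method that needs a polyfill or a fallback code path.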

Polyfills

For older browsers, consider polyfills:

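// String.fromCodePoint polyfill (simplified; written in ES5 for older browsers)
if (!String.fromCodePoint) {
  String.fromCodePoint = function () {
    var chars = [];
    for (var i = 0; i < arguments.length; i++) {
      var code = Number(arguments[i]);
      // Reject non-finite, negative, non-integer, and out-of-range values
      if (!isFinite(code) || code < 0 || code > 0x10ffff || Math.floor(code) !== code) {
        throw new RangeError('Invalid code point: ' + arguments[i]);
      }
      if (code <= 0xffff) {
        // BMP code point: a single UTF-16 code unit
        chars.push(String.fromCharCode(code));
      } else {
        // Supplementary code point: encode as a surrogate pair
        code -= 0x10000;
        chars.push(
          String.fromCharCode(0xd800 + (code >> 10)),
          String.fromCharCode(0xdc00 + (code & 0x3ff))
        );
      }
    }
    return chars.join('');
  };
}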
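
The reading direction needs a similar fallback where String.prototype.codePointAt is missing. The sketch below is written as a standalone ES5 function with a hypothetical name rather than as a prototype patch, and it omits some edge-case handling (such as argument coercion) that a full polyfill would include.

```javascript
// Hypothetical ES5 fallback for reading a code point at a code-unit index.
function sketchCodePointAt(str, index) {
  var first = str.charCodeAt(index);
  if (isNaN(first)) {
    return undefined; // index is out of range
  }
  if (first >= 0xd800 && first <= 0xdbff && index + 1 < str.length) {
    var second = str.charCodeAt(index + 1);
    if (second >= 0xdc00 && second <= 0xdfff) {
      // Valid surrogate pair: combine into one supplementary code point
      return (first - 0xd800) * 0x400 + (second - 0xdc00) + 0x10000;
    }
  }
  return first; // BMP code unit or an unpaired surrogate
}

console.log(sketchCodePointAt('A', 0)); // 65
console.log(sketchCodePointAt('😀', 0)); // 128512
```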

Integration Examples

Web Component

class UnicodeConverter extends HTMLElement {
  connectedCallback() {
    this.innerHTML = `
      <input type="text" id="input" placeholder="Enter text">
      <select id="format">
        <option value="decimal">Decimal</option>
        <option value="hex">Hexadecimal</option>
      </select>
      <div id="output"></div>
    `;

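    const input = this.querySelector('#input');
    const format = this.querySelector('#format');
    const output = this.querySelector('#output');

    // Re-run the conversion whenever the text or the selected format changes
    const update = () => {
      output.textContent = convertTextToUnicode(input.value, format.value);
    };
    input.addEventListener('input', update);
    format.addEventListener('change', update);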
  }
}

customElements.define('unicode-converter', UnicodeConverter);

Node.js Module

module.exports = {
  convertTextToUnicode,
  convertUnicodeToText,
  isValidUnicode,
  normalizeUnicode,
};