UTF-8 Byte Inspector — Code Point & Byte Viewer

Example output for the input "Hello 世界 👋":

Char    Code Pt   UTF-8           UTF-16
────    ───────   ─────           ──────
H       U+0048    48              0048
e       U+0065    65              0065
l       U+006C    6C              006C
l       U+006C    6C              006C
o       U+006F    6F              006F
SP      U+0020    20              0020
世       U+4E16    E4 B8 96        4E16
界       U+754C    E7 95 8C        754C
SP      U+0020    20              0020
👋      U+1F44B   F0 9F 91 8B     D83D DC4B

10 code points · 17 UTF-8 bytes · 11 UTF-16 units

About UTF-8 Byte Inspector — Code Point & Byte Viewer

The UTF-8 Byte Inspector splits any string into its code points and shows each one alongside its exact encoding: the UTF-8 bytes and the UTF-16 code units. It's built for developers debugging encoding issues such as mojibake, mismatched byte counts, unexpected emoji widths, or surrogate pairs that throw off string length.

Paste a word, an emoji, or a tricky combining sequence and see precisely how it is stored: which code points it contains, how many UTF-8 bytes each takes, and how characters above U+FFFF split into UTF-16 surrogate pairs.

Everything runs entirely in your browser using the built-in TextEncoder and string APIs. Nothing you type is ever uploaded — the inspection happens offline on your device.
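A per-character breakdown like the sample table can be approximated with nothing more than TextEncoder and the standard string methods. The sketch below is an illustration under assumptions, not the tool's actual source: the Row shape and the names hex and inspect are hypothetical.

```typescript
// Hypothetical row shape; field names are illustrative, not the tool's own.
interface Row {
  char: string;
  codePoint: string; // e.g. "U+4E16"
  utf8: string;      // e.g. "E4 B8 96"
  utf16: string;     // e.g. "4E16" or "D83D DC4B"
}

// Upper-case hex, left-padded with zeros to at least `width` digits.
const hex = (n: number, width: number): string =>
  n.toString(16).toUpperCase().padStart(width, "0");

function inspect(text: string): Row[] {
  const encoder = new TextEncoder();
  const rows: Row[] = [];
  // for...of walks a string by code point, so a surrogate pair stays in one row.
  for (const char of text) {
    const cp = char.codePointAt(0)!;
    const utf8 = Array.from(encoder.encode(char), (b) => hex(b, 2)).join(" ");
    const units: string[] = [];
    for (let i = 0; i < char.length; i++) {
      units.push(hex(char.charCodeAt(i), 4));
    }
    rows.push({ char, codePoint: "U+" + hex(cp, 4), utf8, utf16: units.join(" ") });
  }
  return rows;
}

console.log(inspect("世👋"));
```

Note that TextEncoder replaces a lone (unpaired) surrogate with U+FFFD, so malformed input would show EF BF BD in the UTF-8 column of this sketch.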

Features

Per-character breakdown: code point, UTF-8 bytes, and UTF-16 code units
Correct handling of emoji and supplementary characters (surrogate pairs)
Totals for code points, UTF-8 bytes, and UTF-16 code units at a glance
Readable labels for whitespace and control characters (for example, SP for a space)
Works fully offline
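One way readable labels for invisible characters might be produced is a small lookup table keyed by code point. This is only a sketch: the sample output confirms the label SP for a space, while every other label, the CTRL fallback, and the names LABELS and displayLabel are assumptions made for illustration.

```typescript
// Hypothetical label table: only "SP" is confirmed by the sample output.
const LABELS: Record<number, string> = {
  0x09: "TAB",
  0x0a: "LF",
  0x0d: "CR",
  0x20: "SP",
  0xa0: "NBSP",
  0x200b: "ZWSP",
  0x200d: "ZWJ",
  0xfeff: "BOM",
};

function displayLabel(char: string): string {
  const cp = char.codePointAt(0);
  if (cp === undefined) return ""; // empty string: nothing to label
  const named = LABELS[cp];
  if (named !== undefined) return named;
  // Remaining C0 controls, DEL, and C1 controls get a generic tag (assumed).
  if (cp < 0x20 || (cp >= 0x7f && cp <= 0x9f)) return "CTRL";
  return char; // visible characters are shown as themselves
}
```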

How to use

Type or paste any text into the input pane.
Read the per-character table: each row shows the character, its code point, its UTF-8 bytes, and its UTF-16 code units.
Check the totals line for the code-point count, the overall UTF-8 byte length, and the UTF-16 code-unit count.
Copy the breakdown from the output pane to share or paste into a bug report.
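The three figures on the totals line can be reproduced independently when checking a bug report. A minimal sketch, assuming a hypothetical helper named totals:

```typescript
function totals(text: string): { codePoints: number; utf8Bytes: number; utf16Units: number } {
  return {
    codePoints: Array.from(text).length,               // Array.from splits by code point
    utf8Bytes: new TextEncoder().encode(text).length,  // encoded size in bytes
    utf16Units: text.length,                           // String.length counts UTF-16 code units
  };
}

console.log(totals("Hello 世界 👋"));
// → { codePoints: 10, utf8Bytes: 17, utf16Units: 11 }
```

The three numbers differ because each one counts a different unit, which is the usual source of "mismatched length" bugs between a JavaScript front end and a byte-oriented back end.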

Frequently asked questions

Why does an emoji count as more than one byte?

Most emoji are encoded above U+FFFF, so they take 4 bytes in UTF-8 and a 2-unit surrogate pair in UTF-16. JavaScript's string length counts UTF-16 code units, not characters, which is why a single emoji can report a length of 2. The inspector shows exactly how it splits.
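The surrogate-pair split follows fixed arithmetic defined by UTF-16: subtract 0x10000, then place the top 10 bits in the high surrogate and the bottom 10 bits in the low surrogate. The function name toSurrogatePair below is a hypothetical helper used only for illustration.

```typescript
function toSurrogatePair(codePoint: number): [number, number] {
  if (!Number.isInteger(codePoint) || codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("only code points U+10000..U+10FFFF use surrogate pairs");
  }
  const offset = codePoint - 0x10000;     // 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f44b); // 👋
console.log(high.toString(16), low.toString(16)); // → d83d dc4b
console.log("👋".length);                          // → 2
console.log(Array.from("👋").length);              // → 1
```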

What is the difference between a code point and a byte?

A code point is the abstract Unicode number assigned to a character (e.g. U+00E9 for é). Bytes are the units in which an encoding stores that number. UTF-8 uses 1–4 bytes per code point; UTF-16 uses one or two 16-bit code units (2 or 4 bytes). So one code point can map to several bytes: é is C3 A9 in UTF-8.
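To make the mapping from code point to bytes concrete, here is a hand-written UTF-8 encoder for a single code point. It is a teaching sketch with a hypothetical name, encodeUtf8; in practice TextEncoder does this work.

```typescript
function encodeUtf8(cp: number): number[] {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
    throw new RangeError("not a Unicode scalar value");
  }
  if (cp < 0x80) return [cp]; // 1 byte: 0xxxxxxx
  if (cp < 0x800) {
    // 2 bytes: 110xxxxx 10xxxxxx
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
  }
  if (cp < 0x10000) {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

const toHex = (bytes: number[]): string =>
  bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");

console.log(toHex(encodeUtf8(0x00e9)));  // → C3 A9
console.log(toHex(encodeUtf8(0x4e16)));  // → E4 B8 96
console.log(toHex(encodeUtf8(0x1f44b))); // → F0 9F 91 8B
```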

Why does "é" sometimes show up as two entries?

There are two ways to write é: a single precomposed code point (U+00E9) or the letter e followed by a combining acute accent (U+0301). The inspector iterates code point by code point, so the combining form appears as two separate rows.
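The two spellings can be told apart, and converted into each other, with the standard String.prototype.normalize method. The helper codePointsOf is a hypothetical name used only for this sketch.

```typescript
// List the code points of a string in U+XXXX notation.
function codePointsOf(text: string): string[] {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"));
}

const precomposed = "\u00e9"; // é as one code point
const decomposed = "e\u0301"; // e followed by a combining acute accent

console.log(codePointsOf(precomposed)); // → [ 'U+00E9' ]
console.log(codePointsOf(decomposed));  // → [ 'U+0065', 'U+0301' ]
console.log(precomposed === decomposed);                  // → false
console.log(decomposed.normalize("NFC") === precomposed); // → true
console.log(precomposed.normalize("NFD") === decomposed); // → true
```

Normalizing both sides to the same form (NFC is a common choice) before comparing or storing text avoids treating the two spellings as different strings.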

Is my text sent to a server?

No. All inspection happens locally in your browser using the built-in TextEncoder and string APIs. Your input never leaves your device.
