What this converter does
This Text to Data Converter turns readable text into its UTF-8 bytes, and then displays those bytes in one of three common representations:
- Binary (8-bit) — 8 bits per byte
- Octal — base-8 values per byte
- Hex — base-16 values per byte
It also works the other way around: paste binary / octal / hex bytes and decode them back into UTF-8 text.
This is ideal when you’re dealing with byte-level data (encodings, protocols, debugging, education), not when you’re doing math with big numbers.
How to use it
- Paste your text or your encoded bytes into the left box.
- Choose Encode (text → bytes) or Decode (bytes → text).
- Select the output/input format: Binary, Octal, or Hex.
- Optional switches:
  - Batch by newline: treats each line as a separate conversion
  - Trim lines: removes extra whitespace in batch mode
- Copy the result from the right box.
Nice detail: in batch mode, each line is processed independently, so one bad row won’t stop the entire conversion.
What “UTF-8 bytes” means (in plain English)
Computers don’t store characters directly — they store bytes.
- ASCII characters like A, b, and ! are 1 byte in UTF-8.
- Many other characters (Greek, emoji, symbols) take 2–4 bytes in UTF-8.
So a “character” is not always equal to a single byte.
Example: ASCII text
"Hi" → 48 69 (hex)
Example: non‑ASCII text
"Ω" → ce a9 (hex)
Example: emoji (multi‑byte)
"🙂" → f0 9f 99 82 (hex)
This is why decoding requires the correct byte sequence — UTF‑8 interprets those bytes into the original characters.
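The three examples above can be reproduced with a minimal Python sketch. It uses only the standard library; the helper name `utf8_hex` is made up for illustration and is not part of the converter.

```python
def utf8_hex(text):
    """Return the UTF-8 bytes of `text` as space-separated lowercase hex."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


# One character can map to 1, 2, or 4 bytes depending on its code point.
for sample in ["Hi", "\u03a9", "\U0001F642"]:  # "Hi", Greek capital omega, slightly smiling face
    encoded = sample.encode("utf-8")
    print(f"{len(sample)} character(s) -> {len(encoded)} byte(s): {utf8_hex(sample)}")
```

The character count and the byte count only match for the pure-ASCII sample.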
Format details
Binary (8-bit)
Binary output is shown as 8-bit bytes separated by spaces.
Hello → 01001000 01100101 01101100 01101100 01101111
Decoding rule: each group should be exactly 8 bits. If you paste 7-bit chunks, mixed separators, or incomplete bytes, decoding will fail for those chunks.
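As a rough illustration of the binary format and its decoding rule, here is a small Python sketch. The function names are hypothetical, and rejecting a bad group with an exception is an assumption about behaviour, not a description of the converter's internals.

```python
def encode_binary(text):
    """Encode text as space-separated 8-bit groups of its UTF-8 bytes."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def decode_binary(data):
    """Decode space-separated 8-bit groups back into UTF-8 text."""
    groups = data.split()
    for group in groups:
        # Each group must be exactly eight characters, all of them 0 or 1.
        if len(group) != 8 or set(group) - {"0", "1"}:
            raise ValueError(f"not an 8-bit group: {group!r}")
    return bytes(int(group, 2) for group in groups).decode("utf-8")


print(encode_binary("Hello"))
print(decode_binary("01001000 01101001"))
```

A 7-bit chunk such as `1001000` raises `ValueError` in this sketch, which mirrors the "exactly 8 bits" rule.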
Octal
Octal output uses base‑8 values per byte.
Hello → 110 145 154 154 157
Decoding rule: each value must represent a byte (0–377 in octal). Values outside the byte range cannot be decoded.
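The octal rule can be sketched in Python as follows. The function names are illustrative, and writing values without zero-padding is an assumption; the converter may pad to three digits.

```python
def encode_octal(text):
    """Encode text as space-separated octal values of its UTF-8 bytes."""
    return " ".join(f"{byte:o}" for byte in text.encode("utf-8"))


def decode_octal(data):
    """Decode space-separated octal values back into UTF-8 text."""
    values = []
    for token in data.split():
        value = int(token, 8)  # raises ValueError for digits 8 or 9
        if not 0 <= value <= 0o377:  # 0o377 is 255, the largest byte value
            raise ValueError(f"outside byte range: {token!r}")
        values.append(value)
    return bytes(values).decode("utf-8")


print(encode_octal("Hello"))
print(decode_octal("110 151"))
```

In this sketch a value such as `400` (256 in decimal) is rejected because it no longer fits in one byte.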
Hexadecimal
Hex output is space-separated, typically shown in lowercase.
Hello → 48 65 6c 6c 6f
Decoding rule: each hex byte is exactly 2 hex characters (00–ff). If you paste a long hex string without separators, make sure it has an even number of digits so it splits cleanly into byte pairs, and adjust the trim and batch settings as needed.
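A minimal Python sketch of hex encoding and decoding is shown here. The function names are hypothetical. Stripping all whitespace before pairing digits is a lenient assumption chosen for this example; the converter itself may be stricter about separators.

```python
def encode_hex(text):
    """Encode text as space-separated lowercase hex bytes (UTF-8)."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


def decode_hex(data):
    """Decode hex digits, with or without spaces, back into UTF-8 text."""
    digits = "".join(data.split())  # assumption: whitespace is only a separator
    if len(digits) % 2 != 0:
        raise ValueError("odd number of hex digits; bytes need 2 digits each")
    return bytes.fromhex(digits).decode("utf-8")  # accepts upper or lower case


print(encode_hex("Hello"))
print(decode_hex("48656c6c6f"))
```

An input with an odd digit count, such as `48656`, fails the even-length check instead of being silently truncated.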
Batch mode: convert many values at once
Batch mode is perfect when you have a list of strings or byte sequences.
Encode in batch mode:
Hello
World
Γειά
🙂
You’ll get one encoded output row per input row.
Decode in batch mode: paste one sequence per line, for example hex:
48 65 6c 6c 6f
57 6f 72 6c 64
ce 93 ce b5 ce b9 ce ac
f0 9f 99 82
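One way to picture batch decoding is the Python sketch below, where each line is converted on its own and a failure is recorded without stopping the rest. The names `batch_decode_hex` and the `[error: ...]` marker are assumptions made for this example; the converter's actual flagging format is not specified here.

```python
def batch_decode_hex(block, trim=True):
    """Decode one hex sequence per line; bad lines are flagged, not fatal."""
    results = []
    for line in block.split("\n"):
        if trim:
            line = line.strip()  # mirrors the "Trim lines" switch
        try:
            results.append(bytes.fromhex(line).decode("utf-8"))
        except ValueError as exc:  # also covers UnicodeDecodeError
            results.append(f"[error: {exc}]")
    return results


rows = batch_decode_hex(
    "48 65 6c 6c 6f\n"
    "57 6f 72 6c 64\n"
    "not hex at all\n"
    "f0 9f 99 82"
)
print(len(rows), "rows processed")
```

In this sketch the third row becomes an error marker while the rows before and after it still decode.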
Common use cases
- Learning & teaching: see how text becomes binary, octal, or hex
- Debugging encodings: spot wrong bytes or invalid UTF‑8 sequences
- Protocols & file formats: inspect byte streams in logs or docs
- CTF / security practice: quickly decode byte sequences to readable text
- Data cleaning: normalize mixed whitespace, process many rows with batch mode
Troubleshooting
“My decoded text looks wrong (� characters)”
That replacement character usually means the bytes aren’t valid UTF‑8 (or the byte boundaries are wrong). Check:
- You have complete bytes (8 bits for binary, 2 hex digits for hex)
- Your values are in the byte range (0–255 decimal)
- You didn’t accidentally mix formats (hex pasted while binary is selected)
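To see where the replacement character comes from, the following Python sketch decodes hex bytes in a lenient mode that substitutes U+FFFD for anything that is not valid UTF-8. The helper names are illustrative, and whether the converter uses lenient or strict decoding internally is an assumption.

```python
def decode_lenient(hex_text):
    """Decode hex bytes as UTF-8, replacing invalid sequences with U+FFFD."""
    return bytes.fromhex(hex_text).decode("utf-8", errors="replace")


def is_valid_utf8(hex_text):
    """Return True if the hex bytes form a complete, valid UTF-8 sequence."""
    try:
        bytes.fromhex(hex_text).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "ce a9" is a complete 2-byte sequence; "ce" alone is cut off mid-character.
print(is_valid_utf8("ce a9"), is_valid_utf8("ce"))
```

Dropping the last byte of a multi-byte character is a common way to end up with a replacement character, so checking byte boundaries first is usually worthwhile.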
“It decodes some lines but not others”
That’s expected in batch mode: invalid lines are skipped/flagged while valid lines still convert.