Hyphen / Dash Normalizer

Sort out hyphens, en dashes, em dashes, minus signs, and Japanese long sound marks by role instead of by shape alone.

URLs, dates, versions, CLI flags, and code are protected by default, and everything runs inside your browser.

Your text stays on this device. No upload, no account, no server-side processing.

Normalize punctuation-like dashes without breaking structured data

This tool distinguishes Unicode look-alikes by the role each one plays. It keeps technical tokens intact while helping you standardize editorial text, multilingual content, and Japanese mixed writing.

How to use

  1. Paste the text you want to clean up.
  2. Choose a preset and adjust the confidence level if the text is ambiguous.
  3. Review how many items were changed, protected, or left for manual review.
  4. Copy the result, save the report, or send the result back for another pass.

Examples

Turn ranges and parenthetical breaks into distinct marks

Input
Chapters 10-12 - revised edition
Output
Chapters 10–12 — revised edition
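As a rough illustration of the idea, the sketch below reproduces this example with two regular expressions. The function name and the rules are assumptions made for this page, not the tool's actual logic. It is only safe on plain prose, because it would also rewrite dates and versions if they were not protected first.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

def normalize_ranges_and_breaks(text: str) -> str:
    """Illustrative only: map two hyphen-minus roles to distinct marks."""
    # A hyphen-minus squeezed between two digits reads as a range.
    text = re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)
    # A hyphen-minus with whitespace on both sides reads as a break.
    text = re.sub(r"(?<=\s)-(?=\s)", EM_DASH, text)
    return text

print(normalize_ranges_and_breaks("Chapters 10-12 - revised edition"))
```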

Use the Unicode minus sign for math

Input
Temperature dropped to -5 and x-1 = 0.
Output
Temperature dropped to −5 and x−1 = 0.
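A minimal sketch of how a math-oriented rule might look, assuming two simple patterns: a leading negative number and a single-letter variable followed by a digit. Both patterns and the function name are hypothetical. A token such as A-1 could be a model number rather than subtraction, which is why a real rule set would likely treat it as ambiguous.

```python
import re

MINUS_SIGN = "\u2212"

def normalize_minus(text: str) -> str:
    """Illustrative only: replace hyphen-minus with U+2212 in two math-like contexts."""
    # Negative number: hyphen-minus directly before a digit, and not glued
    # to a preceding word character or another hyphen.
    text = re.sub(r"(?<![\w-])-(?=\d)", MINUS_SIGN, text)
    # Subtraction: a single-letter variable, hyphen-minus, then a digit (x-1).
    text = re.sub(r"\b([A-Za-z])-(?=\d)", lambda m: m.group(1) + MINUS_SIGN, text)
    return text

print(normalize_minus("Temperature dropped to -5 and x-1 = 0."))
```

Dates such as 2026-03-12 and flags such as --dry-run pass through this sketch unchanged, because neither pattern matches them.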

Protect dates, URLs, and CLI flags

Input
Release 2026-03-12, URL https://example.com/my-tool, flag --dry-run
Output
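Release 2026-03-12, URL https://example.com/my-tool, flag --dry-run

Every hyphen in this input sits inside a protected token, so the date, URL, and CLI flag stay untouched. Only dashes in the surrounding prose would be normalized.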
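One common way to get this behavior is to find protected spans first and run the normalizer only on the text between them. The sketch below assumes three simplified patterns (URLs, ISO-style dates, CLI flags). The pattern set and the function name are hypothetical and much narrower than the tool's real protection list.

```python
import re
from typing import Callable

# Simplified, assumed patterns; real protection rules would be broader.
PROTECTED = re.compile(
    r"https?://\S+"                # URLs (greedy up to the next whitespace)
    r"|\b\d{4}-\d{2}-\d{2}\b"      # ISO-style dates
    r"|(?<!\S)--?[A-Za-z][\w-]*"   # CLI flags such as -v or --dry-run
)

def normalize_outside_protected(text: str, normalize: Callable[[str], str]) -> str:
    """Apply `normalize` only to the text between protected spans."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(text):
        out.append(normalize(text[pos:match.start()]))  # unprotected prose
        out.append(match.group(0))                      # protected token, copied verbatim
        pos = match.end()
    out.append(normalize(text[pos:]))
    return "".join(out)

# Even a deliberately blind replacement cannot damage the protected tokens.
blind = lambda s: s.replace("-", "\u2014")
print(normalize_outside_protected("Flag --dry-run - see notes", blind))
```

Because each unprotected segment is normalized on its own, rules that look at neighboring characters lose context at segment boundaries. A fuller implementation would need to account for that.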

Keep Japanese long sound marks separate from dashes

Input
スーパー - A-B - 3-5kg
Output
スーパー ― A‐B ― 3–5kg
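The difference between this example and the English one can be pictured as a preset table: the same roles are detected, but each preset maps them to different characters. The preset names, role names, and patterns below are assumptions for illustration. Note that only the ASCII hyphen-minus is ever matched, so the long sound mark ー is never touched.

```python
import re

# Hypothetical presets: role -> replacement character.
PRESETS = {
    "english": {"break": "\u2014", "range": "\u2013", "hyphen": "-"},
    "japanese": {"break": "\u2015", "range": "\u2013", "hyphen": "\u2010"},
}

# Assumed role patterns; each one matches only the ASCII hyphen-minus.
ROLE_PATTERNS = [
    ("break", re.compile(r"(?<=\s)-(?=\s)")),
    ("range", re.compile(r"(?<=\d)-(?=\d)")),
    ("hyphen", re.compile(r"(?<=[A-Za-z])-(?=[A-Za-z])")),
]

def apply_preset(text: str, preset: str) -> str:
    """Replace each detected role with the character the preset assigns to it."""
    chars = PRESETS[preset]
    for role, pattern in ROLE_PATTERNS:
        text = pattern.sub(chars[role], text)
    return text

print(apply_preset("スーパー - A-B - 3-5kg", "japanese"))
```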

Key terms

Hyphen

A connector inside a word or compound term. Unicode provides dedicated hyphen characters such as U+2010.

En dash / Em dash

Dash characters commonly used for ranges, breaks, or parenthetical pauses, depending on the writing system and style guide. Unicode encodes the en dash as U+2013 and the em dash as U+2014.

Minus sign

The mathematical negative or subtraction symbol. Unicode assigns U+2212 to this role.

Long sound mark

A Japanese character (ー, U+30FC) used to extend vowel sounds in katakana words. It is not the same thing as a dash.

Notes from Unicode reality

  • The ASCII hyphen-minus is convenient to type, but it collapses several different punctuation roles into one character.
  • The Japanese long sound mark ー (U+30FC) and the Japanese-style dash ― (U+2015) can look similar in some fonts while still being different code points.
  • Over-normalizing technical text can break commands, versions, slugs, and URLs even when the output looks nicer.
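If you want to check which character you are actually looking at, Python's standard unicodedata module reports the official name of each code point. The short script below lists the look-alikes discussed on this page. The list is only a selection for illustration.

```python
import unicodedata

# Characters discussed on this page; the selection is illustrative.
LOOKALIKES = ["-", "\u2010", "\u2013", "\u2014", "\u2015", "\u2212", "\u30fc", "\uff70"]

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in LOOKALIKES:
    print(describe(ch))
```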

FAQ

Is my text uploaded anywhere?

No. Protection, classification, and normalization all run only in your browser.

What gets protected automatically?

By default the tool protects URLs, emails, dates, times, versions, IDs, file paths, CLI flags, code blocks, inline code, and basic markup.

Why not replace every hyphen-minus blindly?

Because the ASCII hyphen-minus can mean a word hyphen, a range dash, a parenthetical dash, or a mathematical minus sign. A blind replace often breaks real data.

Will it change Japanese long sound marks?

Not by default. You can optionally normalize the half-width long sound mark to the full-width form, but the tool will not turn long sound marks into dashes.
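The optional widening step can be as small as a single targeted replacement. This sketch assumes the half-width mark is U+FF70 and the full-width mark is U+30FC. The function name is hypothetical. A targeted replace is used here instead of Unicode NFKC normalization, because NFKC would also change other half-width characters in the same text.

```python
HALFWIDTH_LONG_SOUND_MARK = "\uff70"
FULLWIDTH_LONG_SOUND_MARK = "\u30fc"

def widen_long_sound_marks(text: str) -> str:
    """Replace only the half-width long sound mark; leave everything else alone."""
    return text.replace(HALFWIDTH_LONG_SOUND_MARK, FULLWIDTH_LONG_SOUND_MARK)

# Half-width katakana SU (U+FF7D) stays as typed; only the mark is widened.
print(widen_long_sound_marks("\uff7d\uff70"))
```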

Does it work with multilingual or RTL text?

Yes. Inputs and outputs use automatic text direction, and the page is designed so English can be the source for future translations.

What happens to ambiguous cases?

The default behavior is to preserve them and list them for review. You can switch to a stronger mode if your style guide prefers aggressive normalization.
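One way to picture the review list is a confidence threshold: each candidate replacement carries a score, and anything below the threshold is preserved and reported instead of changed. The data shape, the scores, and the 0.8 default below are all assumptions for illustration, not the tool's internals.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    index: int         # position of the hyphen-minus in the text
    replacement: str   # proposed character for that position
    confidence: float  # 0.0 to 1.0, from whatever classifier is in use

def apply_with_review(text: str, candidates: list[Candidate], threshold: float = 0.8):
    """Apply confident replacements; return the rest for manual review."""
    chars = list(text)
    review = []
    for cand in candidates:
        if cand.confidence >= threshold:
            chars[cand.index] = cand.replacement
        else:
            review.append(cand)  # preserved as typed, listed for the user
    return "".join(chars), review

text = "pages 3-5 and A-1"
candidates = [
    Candidate(index=7, replacement="\u2013", confidence=0.95),  # clear range
    Candidate(index=15, replacement="\u2212", confidence=0.4),  # minus or model number?
]
result, review = apply_with_review(text, candidates)
print(result, len(review))
```

Lowering the threshold corresponds to the stronger mode: more candidates are applied and fewer are left for review.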

Notes

  • Protection rules are safety-first heuristics. If you really want to normalize inside a protected token, turn that protection off first.
  • Aggressive normalization can conflict with house style, legal drafting rules, or domain-specific notation.
  • The shared URL contains settings only. It never includes the input text or the output text.
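A settings-only share link can be sketched as a query string built from the option values alone. The base URL and the parameter names here are hypothetical. The point is that the input and output text are never passed to the function, so they cannot end up in the link.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_share_url(base_url: str, settings: dict[str, str]) -> str:
    """Encode settings (and nothing else) into a shareable URL."""
    query = urlencode(sorted(settings.items()))
    return f"{base_url}?{query}"

def read_settings(share_url: str) -> dict[str, str]:
    """Recover the settings from a shared URL."""
    parsed = parse_qs(urlsplit(share_url).query)
    return {key: values[0] for key, values in parsed.items()}

url = build_share_url("https://example.com/normalizer", {"preset": "japanese", "mode": "safe"})
print(url)
print(read_settings(url))
```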