HTML Entity Encoding: Prevent XSS and Display Special Characters

2026-02-28 4 min read
Tags: html, encoding, security, xss, web-development

HTML entities let you display special characters that would otherwise be interpreted as code. Understanding encoding and decoding is essential for web development and security.

What are HTML entities?

HTML entities replace characters that have special meaning in HTML:

Character   Entity    Name
<           &lt;      Less than
>           &gt;      Greater than
&           &amp;     Ampersand
"           &quot;    Double quote
'           &#39;     Single quote

Without encoding, <script> in text would be interpreted as an actual script tag — a classic XSS vulnerability.

Why encoding matters

Security (XSS prevention)

Cross-site scripting (XSS) is one of the most common web vulnerabilities. It happens when user input is rendered as HTML without encoding:

<!-- User input: <script>alert('hacked')</script> -->

<!-- Without encoding (DANGEROUS): -->
<p><script>alert('hacked')</script></p>

<!-- With encoding (SAFE): -->
<p>&lt;script&gt;alert(&#39;hacked&#39;)&lt;/script&gt;</p>

Always encode user-generated content before inserting it into HTML.
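
As a minimal sketch of that rule, the snippet below uses `html.escape` from Python's standard library. The `render_comment` helper is a hypothetical name chosen for illustration, and the example only covers text placed inside an element body; other contexts such as inline scripts or URLs need their own encoding rules.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a <p> element after entity-encoding it."""
    # html.escape replaces &, < and >; with quote=True (the default)
    # it also replaces " and '.
    return f"<p>{html.escape(user_input)}</p>"


print(render_comment("<script>alert('hacked')</script>"))
# <p>&lt;script&gt;alert(&#x27;hacked&#x27;)&lt;/script&gt;</p>
```

Python writes the single quote as the hex entity `&#x27;` rather than `&#39;`; both refer to the same character.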

Displaying code snippets

If you’re writing about HTML, you need entities to show tags as text:

<p>Use the &lt;strong&gt; tag for bold text.</p>

Without entities, the browser would render <strong> as actual bold formatting.

Special characters

Characters that are hard to type, invisible in the source, or outside basic ASCII can also be written as entities:

  • © → &copy;
  • Non-breaking space → &nbsp;

Numeric vs named entities

HTML supports two entity formats:

Named entities: &amp;, &lt;, &copy;. Readable, but only available for common characters.

Numeric entities: &#38;, &#60;, &#169;. These work for any Unicode character using its code point. Also available in hex: &#x26;.
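
To make the relationship between a character and its numeric entities concrete, here is a small Python sketch. The helper names `to_decimal_entity` and `to_hex_entity` are illustrative, not part of any library; `ord` and `html.unescape` are standard.

```python
import html


def to_decimal_entity(ch: str) -> str:
    """Build the decimal numeric entity from the character's code point."""
    return f"&#{ord(ch)};"


def to_hex_entity(ch: str) -> str:
    """Build the hexadecimal numeric entity from the character's code point."""
    return f"&#x{ord(ch):X};"


for ch in "&<©":
    print(ch, to_decimal_entity(ch), to_hex_entity(ch))
# & &#38; &#x26;
# < &#60; &#x3C;
# © &#169; &#xA9;

# Named, decimal, and hex forms all decode to the same character.
print(html.unescape("&copy; &#169; &#xA9;"))  # © © ©
```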

When to encode vs decode

Encode when:

  • Inserting user input into HTML
  • Displaying code examples on a webpage
  • Rendering stored text in an HTML context (store the raw text and encode it at output time)

Decode when:

  • Processing HTML content for plain text display
  • Extracting text from HTML for analysis
  • Converting HTML entities back to readable characters
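
The decoding direction can be sketched with Python's standard `html.parser` module. The `TextExtractor` class and `html_to_text` function are hypothetical names for this example. It is a rough text extractor, not a sanitizer, and it makes no attempt to handle `<script>` or `<style>` content specially.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, dropping tags and decoding entities."""

    def __init__(self) -> None:
        # convert_charrefs=True makes the parser decode entities
        # before passing text to handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(markup: str) -> str:
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


print(html_to_text("<p>Fish &amp; chips &lt; &#163;10</p>"))
# Fish & chips < £10
```

If the input has no tags and only the entities need converting, `html.unescape` on its own is enough.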

Common pitfalls

Double encoding

&amp;amp; → shows as &amp; instead of &

Encoding already-encoded text creates a mess. Always encode from the raw source.
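
A short Python illustration of how double encoding arises, using `html.escape` and `html.unescape` from the standard library. The sample string is made up for the example.

```python
import html

raw = "Tom & Jerry"

once = html.escape(raw)    # correct: encode the raw source exactly once
twice = html.escape(once)  # bug: the & of &amp; gets encoded again

print(once)   # Tom &amp; Jerry
print(twice)  # Tom &amp;amp; Jerry

# A browser decodes one level only, so the reader sees a literal "&amp;".
print(html.unescape(twice))  # Tom &amp; Jerry
```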

Forgetting the semicolon

&lt  <!-- Some browsers handle this, but it's not reliable -->
&lt; <!-- Always include the semicolon -->
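
Python's `html.unescape` is documented as following the HTML5 rules for character references, so it can serve as a rough stand-in for how a parser treats a missing semicolon. The sketch below suggests why omitting it is unreliable: only a fixed set of legacy names is recognized without the semicolon, and a recognized prefix can swallow part of the following text. The sample strings are chosen for illustration.

```python
import html

samples = ["&lt;", "&lt", "&hearts;", "&hearts", "&notit;"]

for s in samples:
    print(repr(s), "->", repr(html.unescape(s)))
# '&lt;' -> '<'
# '&lt' -> '<'              legacy name, accepted without the semicolon
# '&hearts;' -> '♥'
# '&hearts' -> '&hearts'    not a legacy name, left undecoded
# '&notit;' -> '¬it;'       the legacy prefix "&not" matches first
```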

Not encoding ampersands in URLs

<!-- Wrong: -->
<a href="page?a=1&b=2">

<!-- Correct: -->
<a href="page?a=1&amp;b=2">
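
When links are generated in code, one way to get this right is to build the URL first and entity-encode it as a whole when writing the attribute. The sketch below assumes Python's standard `urllib.parse.urlencode` and `html.escape`; `link_tag` is a hypothetical helper.

```python
import html
from urllib.parse import urlencode


def link_tag(path: str, params: dict, label: str) -> str:
    """Build an <a> tag whose href has its ampersands entity-encoded."""
    # Step 1: URL-encode the query string (this is not HTML encoding).
    url = f"{path}?{urlencode(params)}"  # e.g. page?a=1&b=2
    # Step 2: HTML-encode the finished URL for use inside an attribute.
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'


print(link_tag("page", {"a": "1", "b": "2"}, "Next"))
# <a href="page?a=1&amp;b=2">Next</a>
```

The two steps are kept separate on purpose: URL encoding protects the query values, and HTML encoding protects the attribute that contains the URL.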

A browser-based encoder/decoder lets you quickly convert text in both directions — paste HTML to decode entities, or paste plain text to encode special characters.
