HtmlSanitizer
=============
HtmlSanitizer is a class for cleaning HTML fragments of constructs that can lead to [XSS attacks](https://en.wikipedia.org/wiki/Cross-site_scripting).
It uses the excellent C# jQuery port [CsQuery](https://github.com/jamietre/CsQuery) to parse, manipulate, and render HTML and CSS.
To support different use cases, HtmlSanitizer can be customized at several levels:
- Configure allowed HTML tags through the property `AllowedTags`. All other tags will be stripped.
- Configure allowed HTML attributes through the property `AllowedAttributes`. All other attributes will be stripped.
- Configure allowed CSS property names through the property `AllowedCssProperties`. All other styles will be stripped.
- Configure allowed URI schemes through the property `AllowedSchemes`. All other URIs will be stripped.
- Configure HTML attributes that contain URIs (such as "src", "href" etc.) through the property `UriAttributes`.
- Provide a base URI against which relative URIs are resolved.
- Cancelable events are raised before a tag, attribute, or style is removed.
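
As a rough illustration of the customization points listed above, the following sketch adjusts a few of the defaults before sanitizing. It rests on several assumptions that are not confirmed by this README: that the `Ganss.XSS` namespace applies to the installed package version, that the properties expose mutable sets with `Add` and `Remove`, and that the tag-removal event is named `RemovingTag`. Check the API of the version you have installed before relying on any of these.

```C#
using System;
using Ganss.XSS; // assumed namespace; it differs between package versions

class CustomSanitizerSketch
{
    static void Main()
    {
        var sanitizer = new HtmlSanitizer();

        // Assumption: the properties are mutable sets supporting Add/Remove.
        sanitizer.AllowedTags.Remove("form");             // strip <form> elements
        sanitizer.AllowedSchemes.Add("mailto");           // keep mailto: links
        sanitizer.AllowedCssProperties.Remove("content"); // strip the CSS "content" property

        // Assumption: the cancelable event raised before a tag is removed
        // is named RemovingTag. Here it is only used for logging.
        sanitizer.RemovingTag += (sender, e) =>
            Console.WriteLine("A tag is about to be removed.");

        var html = "<form><a href=\"mailto:someone@example.com\">Mail</a></form>";
        var sanitized = sanitizer.Sanitize(html, "http://www.example.com");
        Console.WriteLine(sanitized);
    }
}
```

Narrowing the defaults, as done here for `form` and `content`, is generally the safer direction; each addition to an allow list should be reviewed for its XSS implications.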
### Tags allowed by default
`a, abbr, acronym, address, area, article, aside, b, bdi, big, blockquote, br, button, caption, center, cite, code, col, colgroup, data, datalist, dd, del, details, dfn, dir, div, dl, dt, em, fieldset, figcaption, figure, font, footer, form, h1, h2, h3, h4, h5, h6, header, hr, i, img, input, ins, kbd, keygen, label, legend, li, main, map, mark, menu, menuitem, meter, nav, ol, optgroup, option, output, p, pre, progress, q, rp, rt, ruby, s, samp, section, select, small, span, strike, strong, sub, summary, sup, table, tbody, td, textarea, tfoot, th, thead, time, tr, tt, u, ul, var, wbr`
### Attributes allowed by default
`abbr, accept, accept-charset, accesskey, action, align, alt, autocomplete, autosave, axis, bgcolor, border, cellpadding, cellspacing, challenge, char, charoff, charset, checked, cite, clear, color, cols, colspan, compact, contenteditable, coords, datetime, dir, disabled, draggable, dropzone, enctype, for, frame, headers, height, high, href, hreflang, hspace, ismap, keytype, label, lang, list, longdesc, low, max, maxlength, media, method, min, multiple, name, nohref, noshade, novalidate, nowrap, open, optimum, pattern, placeholder, prompt, pubdate, radiogroup, readonly, rel, required, rev, reversed, rows, rowspan, rules, scope, selected, shape, size, span, spellcheck, src, start, step, style, summary, tabindex, target, title, type, usemap, valign, value, vspace, width, wrap`
### CSS properties allowed by default
`background, background-attachment, background-color, background-image, background-position, background-repeat, border, border-bottom, border-bottom-color, border-bottom-style, border-bottom-width, border-collapse, border-color, border-left, border-left-color, border-left-style, border-left-width, border-right, border-right-color, border-right-style, border-right-width, border-spacing, border-style, border-top, border-top-color, border-top-style, border-top-width, border-width, bottom, caption-side, clear, clip, color, content, counter-increment, counter-reset, cursor, direction, display, empty-cells, float, font, font-family, font-size, font-style, font-variant, font-weight, height, left, letter-spacing, line-height, list-style, list-style-image, list-style-position, list-style-type, margin, margin-bottom, margin-left, margin-right, margin-top, max-height, max-width, min-height, min-width, opacity, orphans, outline, outline-color, outline-style, outline-width, overflow, padding, padding-bottom, padding-left, padding-right, padding-top, page-break-after, page-break-before, page-break-inside, quotes, right, table-layout, text-align, text-decoration, text-indent, text-transform, top, unicode-bidi, vertical-align, visibility, white-space, widows, width, word-spacing, z-index`
### URI schemes allowed by default
`http, https`
### Default attributes that contain URIs
`action, background, dynsrc, href, lowsrc, src`
Usage
-----
Install the HtmlSanitizer NuGet package. Then:
```C#
var sanitizer = new HtmlSanitizer();
// Input containing a script element, an event handler, and a javascript: URL.
var html = @"<script>alert('xss')</script><div onload=""alert('xss')"""
    + @"style=""background-color: test"">Test<img src=""test.gif"""
    + @"style=""background-image: url(javascript:alert('xss')); margin: 10px""></div>";
// The second argument is the base URI used to resolve relative URIs.
var sanitized = sanitizer.Sanitize(html, "http://www.example.com");
Assert.That(sanitized, Is.EqualTo(@"<div style=""background-color: test"">"
    + @"Test<img style=""margin: 10px"" src=""http://www.example.com/test.gif""></div>"));
```
License
-------
[MIT X11](http://en.wikipedia.org/wiki/MIT_License)