Opened 10 years ago

Closed 10 years ago

#2264 closed defect (fixed)

Railsport adds escaped HTML to Atom output

Reported by: avar Owned by: Tom Hughes
Priority: minor Milestone:
Component: website Version:
Keywords: Cc:


A changeset contains a URL the URL-linkifier will clash with the escape routine for the Atom output and the Atom feed will contain escaped HTML for the URL, e.g.:/browse/changeset/2380396

<td>ref = &lt;a href=""&gt;;/a&gt;</td>

This is the changeset that produced it.

Attachments (1)

new-sanitize.patch (2.5 KB) - added by avar 10 years ago.
A proof of concept sanitize patch

Download all attachments as: .zip

Change History (4)

comment:1 Changed 10 years ago by avar

The problem is that the XML builder list.atom.builder is using is escaping the HTML.

However the content can't be made to be un-escaped either. The reason is that Rails' own sanitize function doesn't do a good job of cleaning out HTML. If someone puts an unclosed HTML tag in a changeset, e.g.:

comment = <b>Unclosed

That will break the Atom feed for all XML parsers.

There's <a href="">better sanitizer</a> available on rubygems (<a href="">source</a>) which is whitelist based.

Attached is a patch that uses it. The patch is craptastic because I don't know how to add a utility function to the application so that it'll be seen by the template code. But it's a proof of concept.

With it HTML/XML output will be safe everywhere.

The downside is that if someone literally inserts e.g. "<b><a>" into a changeset tag that'll be rendered as "". I.e. stripped away.

I haven't found out how to make the sanitizer accept links but escape everything else.

Changed 10 years ago by avar

Attachment: new-sanitize.patch added

A proof of concept sanitize patch

comment:2 Changed 10 years ago by avar

Anyway until links can safely be included in our Atom feeds without a stray HTML element blowing the whole feed I've removed the autolinking. See [17621].

comment:3 Changed 10 years ago by tomhughes

Resolution: fixed
Status: newclosed

(In [17626]) Output tags to the feed as raw HTML so that links are not escaped (any real HTML in the tag has already been escaped). Closes #2264.

Note: See TracTickets for help on using tickets.