Opened 7 years ago

Last modified 5 years ago

#4382 new defect

Invalid character in GPX track description or changeset comments produces bad XML in API response

Reported by: don-vip Owned by: rails-dev@…
Priority: minor Milestone: OSM 0.6
Component: api Version:
Keywords: Cc:

Description

Hi,

We have a JOSM bug (#josm7648) that should be fixed on OSM server side as it will impact all applications processing XML API responses.

This track contains an invalid character in its description:
http://www.openstreetmap.org/user/CornyJoe/traces/1209300

(0x001A is not the expected German character). This causes XML parsers to fail when downloading GPX data in this area.
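As a rough illustration of the failure, the sketch below feeds a made-up `<desc>` fragment containing U+001A to Python's standard-library parser. The fragment and the helper name `parses_cleanly` are assumptions for the example; they are not taken from the real API response or from JOSM.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on a GPX <desc> element;
# "\x1a" stands in for the stray control character.
BAD_DESC = "<desc>Schnellstra\x1ae (15.0m)</desc>"
GOOD_DESC = "<desc>Schnellstraße (15.0m)</desc>"


def parses_cleanly(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("with U+001A:", parses_cleanly(BAD_DESC))
    print("with ß:     ", parses_cleanly(GOOD_DESC))
```

A conforming XML 1.0 parser is expected to reject the first fragment, which is why every client consuming the API response is affected rather than JOSM alone.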

Is it possible to ensure that GPX track descriptions returned by the API contain only valid XML characters?

Thanks

Change History (11)

comment:1 Changed 7 years ago by simon04

The linked track does not contain the <desc> element. Instead, consider http://api.openstreetmap.org/api/0.6/trackpoints?bbox=10.038045799999999,49.7493614,10.0412616,49.750925599999995&page=0 and see <desc>Schnellstrae (15.0m)</desc> (around line 1246) and note the U+001A character.

Note that U+001A is not an allowed XML character.
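For reference, XML 1.0 allows only tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD and U+10000 to U+10FFFF. A minimal sketch of a filter based on that rule follows; the function name `strip_invalid_xml_chars` is hypothetical, and this is not the code used by the OSM server or by JOSM.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = "") -> str:
    """Replace every character that XML 1.0 forbids (default: drop it)."""
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    dirty = "Schnellstra\x1ae (15.0m)"
    print(repr(strip_invalid_xml_chars(dirty)))
    # Substituting U+FFFD keeps a visible marker where data was lost.
    print(repr(strip_invalid_xml_chars(dirty, "\ufffd")))
```

Whether to drop the character or substitute U+FFFD is a design choice: dropping keeps the text tidy, while substituting makes the data loss visible to whoever reads the description later.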

comment:2 Changed 6 years ago by bastik

Any update on this one?

comment:3 Changed 6 years ago by TomH

If there was I imagine the person fixing it would have commented here.

comment:4 Changed 6 years ago by stoecker

  • Priority changed from minor to major

We get lots of reports on this issue.

comment:5 follow-up: Changed 6 years ago by TomH

  • Priority changed from major to minor

Bumping the priority is not going to produce a magic code elf to fix it though - if it's important to you then a patch is what will get things changed, not a metadata change on a bug.

comment:6 Changed 6 years ago by don-vip

We have just implemented a workaround, as it appears unlikely that this bug will be fixed any time soon.

comment:7 in reply to: ↑ 5 Changed 6 years ago by stoecker

Replying to TomH:

Bumping the priority is not going to produce a magic code elf to fix it though - if it's important to you then a patch is what will get things changed, not a metadata change on a bug.

Increasing the priority of a bug that gets duplicate reports is actually normal.

We now have a workaround, but this issue is still not minor. The server delivers invalid XML.

comment:8 Changed 6 years ago by TomH

Well it turns out that this is very odd in fact... We are using libxml to return this so I find it hard to believe that it isn't doing the right escaping, but that does seem to be what is happening. Here is our code:

https://github.com/openstreetmap/openstreetmap-website/blob/master/app/controllers/api_controller.rb#L71

which is just invoking << on an XML::Node object, which is documented here:

http://xml4r.github.io/libxml-ruby/rdoc/classes/LibXML/XML/Node.html#method-i-3C-3C

in turn that calls the xmlNodeAddContent function in libxml:

http://xmlsoft.org/html/libxml-tree.html#xmlNodeAddContent

according to which the content being added should be "raw text, so unescaped XML special chars are allowed" which suggests that it should do the escaping.
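One possible explanation, sketched here in Python and not in libxml itself, is that escaping and character validity are separate concerns: escaping rewrites `&`, `<` and `>`, but a forbidden control character has no escaped form in XML 1.0, since even a numeric reference such as `&#x1A;` is rejected. The sample string and the helper `is_well_formed` are assumptions for the example, and it says nothing definite about what `xmlNodeAddContent` does internally.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical description text: XML special characters plus a stray U+001A.
raw = "Fish & Chips <1\x1a>"
escaped = escape(raw)  # rewrites only &, < and >


def is_well_formed(fragment: str) -> bool:
    """Wrap the fragment in a <desc> element and try to parse it."""
    try:
        ET.fromstring(f"<desc>{fragment}</desc>")
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(repr(escaped))
    print("escaped text parses:      ", is_well_formed(escaped))
    print("numeric reference parses: ", is_well_formed("&#x1A;"))
    print("same text without U+001A: ", is_well_formed(escape("Fish & Chips <1>")))
```

If this distinction is indeed the cause, the invalid character would have to be removed or replaced before the text reaches the XML layer, because no amount of escaping can make it legal.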

comment:9 Changed 6 years ago by TomH

Of course I also don't understand why that trace has U+1A there... I assume it is meant to be ß but I can't imagine that any character set would have that at 0x1a.
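A quick check supports that doubt. The sketch below encodes ß in a few common character sets (the selection is an assumption, not an exhaustive list) and confirms that none of them yields the byte 0x1A. U+001A is the ASCII SUBSTITUTE control character, so one guess is that some tool inserted it in place of a character it could not convert, but nothing in this ticket confirms that.

```python
import unicodedata

SUB = "\x1a"  # ASCII SUBSTITUTE control character
CODECS = ("latin-1", "iso8859-15", "cp1252", "cp437", "cp850", "utf-8")


def encodings_of(char: str) -> dict:
    """Map each codec name to the bytes it produces for the character."""
    return {codec: char.encode(codec) for codec in CODECS}


if __name__ == "__main__":
    print("category of U+001A:", unicodedata.category(SUB))
    for codec, data in encodings_of("ß").items():
        print(f"{codec:>10}: {data.hex()}")
```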

comment:10 Changed 5 years ago by don-vip

  • Summary changed from Invalid character in GPX track description produces bad XML in API response to Invalid character in GPX track description or changeset comments produces bad XML in API response

There's another example of bad XML delivered by the OSM API. Try to get the history of node 415524175:

http://www.openstreetmap.org/node/415524175/history

You will see that the comment text of changeset 1424555 (created by Potlatch 1.0) contains exotic characters. As a result, our XML parser is unable to parse it:

https://josm.openstreetmap.de/ticket/9647

Is there any hope of seeing this bug fixed in the OSM API, or do we need to find a workaround again?

comment:11 Changed 5 years ago by don-vip

  • Milestone set to OSM 0.6