Opened 5 years ago

Last modified 5 years ago

#4519 new enhancement

Reverse geocoding gives city of nearest object, not city of current position

Reported by: AlphaTiger
Owned by: geocoding@…
Priority: minor
Milestone:
Component: nominatim
Version:
Keywords: reverse geocoding
Cc:

Description

Hi,

For some ways that pass through several cities, reverse geocoding returns the wrong city: the address it gives contains the city at the nearest object's centroid, not the city of the requested position.

For instance, for this point:
http://www.openstreetmap.org/?mlat=48.503321&mlon=1.702448&zoom=14&layers=M
reverse Nominatim answers:
http://nominatim.openstreetmap.org/reverse?lat=48.503321&lon=1.702448

The point is clearly in Ymeray, not in Bleury-Saint-Symphorien. We get the latter as the answer because the nearest object is "L'Océane" (http://www.openstreetmap.org/browse/way/128525165), whose centroid is in Bleury-Saint-Symphorien.

Reverse geocoding should not "blindly" use the computed address of the nearest object, but should instead (try to) search for the point's current city, region, etc. to give a correct address.
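To make the difference concrete, here is a minimal toy sketch in plain Python. It is not Nominatim's code or schema: the two rectangular cities ("CityA", "CityB"), the way and the query point are all made-up values, and the helper names are hypothetical. It only shows that looking up the city of the way's centroid and looking up the city of the requested point can disagree.

```python
# Illustrative sketch only: toy geometry, not Nominatim's actual code or schema.

def point_in_polygon(pt, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from pt cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def city_of(pt, cities):
    """Return the name of the first city polygon containing pt, or None."""
    for name, polygon in cities.items():
        if point_in_polygon(pt, polygon):
            return name
    return None


def line_centroid(line):
    """Length-weighted centroid of a polyline given as a list of (x, y)."""
    total = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        total += seg
        cx += seg * (x1 + x2) / 2
        cy += seg * (y1 + y2) / 2
    return (cx / total, cy / total)


# Hypothetical data: two adjacent rectangular cities and one long way.
cities = {
    "CityA": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "CityB": [(10, 0), (14, 0), (14, 10), (10, 10)],
}
long_way = [(0, 5), (14, 5)]   # mostly in CityA, ends in CityB
query = (13, 5.1)              # requested position, inside CityB

city_from_centroid = city_of(line_centroid(long_way), cities)
city_from_point = city_of(query, cities)
```

In this toy setup the centroid lookup yields CityA while the point lookup yields CityB, which mirrors the Bleury-Saint-Symphorien / Ymeray case above.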

I don't know enough about how Nominatim works, but couldn't the SQL query in reverse.php (lines 95-102) return the current city among the objects, perhaps by removing the rank considerations?

Thanks!

Change History (6)

comment:1 Changed 5 years ago by Vincent de Phily

FYI, the original report came from a question on help.osm.org. It might make the problem clearer.

comment:2 follow-up: Changed 5 years ago by cquest

The problem lies in using only ST_Distance to order the result.

A better result could be obtained by combining objects selected on two different criteria: ST_Distance to name the nearest object (the road or street), and ST_Distance(ST_Centroid()) to get the city/town details for the given location, instead of using those of the nearest object's centroid, which can actually be quite far away.

To do this, the SQL query needs to return both ST_Distance and ST_Distance(ST_Centroid()) in its results (which should add minimal cost to the query), with some additional post-processing in the PHP code only when the difference between the two distances is significant.
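One way to read this proposal, sketched in plain Python rather than SQL/PHP: compute the distance from the requested point to the object's geometry and to its centroid, and flag the result for extra address processing only when the two differ by more than some threshold. The function names, the threshold value and the sample coordinates are assumptions made for illustration; they are not taken from reverse.php.

```python
import math


def dist_point_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_point_line(p, line):
    """Distance from p to a polyline (analogous in spirit to ST_Distance)."""
    return min(dist_point_segment(p, a, b) for a, b in zip(line, line[1:]))


def line_centroid(line):
    """Length-weighted centroid of a polyline."""
    total = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        total += seg
        cx += seg * (x1 + x2) / 2
        cy += seg * (y1 + y2) / 2
    return (cx / total, cy / total)


def needs_address_recheck(p, line, threshold):
    """Hypothetical heuristic: True when the centroid is much farther
    from p than the geometry itself, so the object's precomputed
    address may belong to a different place than p."""
    d_geom = dist_point_line(p, line)
    cx, cy = line_centroid(line)
    d_centroid = math.hypot(p[0] - cx, p[1] - cy)
    return d_centroid - d_geom > threshold


query = (13.0, 5.1)
long_way = [(0.0, 5.0), (14.0, 5.0)]    # centroid far from the query point
short_way = [(12.0, 5.0), (14.0, 5.0)]  # centroid close to the query point

recheck_long = needs_address_recheck(query, long_way, threshold=2.0)
recheck_short = needs_address_recheck(query, short_way, threshold=2.0)
```

With these made-up values only the long way is flagged, so the extra work would be limited to the cases where the nearest object's centroid is far from the requested position.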

comment:3 in reply to: ↑ 2 ; follow-up: Changed 5 years ago by lonvia

Replying to cquest:

The problem lies in using only ST_Distance to order the result.

Sorry but it is not as simple as that because currently there are no distances to any address objects calculated. Doing so would add a pretty heavy performance penalty and is simply not worth the trouble.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 5 years ago by cquest

Replying to lonvia:

Sorry but it is not as simple as that because currently there are no distances to any address objects calculated. Doing so would add a pretty heavy performance penalty and is simply not worth the trouble.

Well, the code at https://trac.openstreetmap.org/browser/applications/utils/nominatim/website/reverse.php shows an "ORDER BY ST_Distance" at line 102... so distance is already calculated or I'm not looking at the right code.

comment:5 in reply to: ↑ 4 Changed 5 years ago by lonvia

Replying to cquest:

Well, the code at https://trac.openstreetmap.org/browser/applications/utils/nominatim/website/reverse.php shows an "ORDER BY ST_Distance" at line 102... so distance is already calculated or I'm not looking at the right code.

You are not looking at the right code. This is the part where it finds the nearest object. The address parts are requested at line 133ff.

comment:6 Changed 5 years ago by lonvia

  • Priority changed from major to minor
  • Type changed from defect to enhancement

Using the address of the nearest object for the reverse query was a deliberate design decision. Calculating the address involves a lot of voodoo because it has to work world-wide and with different tagging schemas. It would be too expensive to do at each reverse request.

The other thing is that long ways become less frequent as a region gets better mapped. Ways are split for tags like maxspeed, for route relations and for turn restrictions. Once a way is split, the issue goes away by itself because each part can have its own address. So I wouldn't call it a major issue.
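A toy sketch of why splitting helps, again in plain Python with made-up data: the lookup below mimics the behaviour described in this ticket (pick the nearest way, then take the city at that way's centroid). The boundary position, the city names and the function names are hypothetical, and each way is a single straight segment so that its centroid is simply its midpoint.

```python
import math

BOUNDARY_X = 10.0  # hypothetical city boundary: CityA to the west, CityB to the east


def city_of(pt):
    """Toy city lookup based only on the x coordinate."""
    return "CityA" if pt[0] < BOUNDARY_X else "CityB"


def midpoint(seg):
    """Centroid of a single straight segment."""
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def dist_to_segment(p, seg):
    """Shortest distance from point p to a straight segment."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))


def reverse_city(p, ways):
    """Nearest way first, then the city at that way's centroid."""
    nearest = min(ways, key=lambda seg: dist_to_segment(p, seg))
    return city_of(midpoint(nearest))


query = (13.0, 5.1)  # requested position, east of the boundary

# One long way crossing the boundary, versus the same way split at it.
unsplit = [((0.0, 5.0), (14.0, 5.0))]
split = [((0.0, 5.0), (10.0, 5.0)), ((10.0, 5.0), (14.0, 5.0))]

city_unsplit = reverse_city(query, unsplit)
city_split = reverse_city(query, split)
```

Here the unsplit way resolves to CityA, while after the split the nearest part lies entirely east of the boundary and resolves to CityB.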
