This repository has been archived by the owner on Jul 24, 2021. It is now read-only.

dividing of street names (handling of composita) #4827

Open
openstreetmap-trac opened this issue Jul 23, 2021 · 3 comments

Comments

@openstreetmap-trac

Reporter: jotpe
[Submitted to the original trac issue database at 12.47pm, Tuesday, 2nd April 2013]

Nominatim helps me resolve incorrectly split street names like this:
"Main Str., 51149 Köln" to "Mainstr., 51149 Köln"
[http://nominatim.openstreetmap.org/search.php?q=Main+Str.%2C+51149+Kln]

German street names often end in "Weg". Is it possible to apply this correction to "Weg" as well?

Examples:

  1. Real street name is "Ziegeleiweg, 51149 Köln":
    Should be found with "Ziegelei Weg" or "Ziegelei-Weg", if no other hits are available.
  2. Real street name is "Urbacher Weg, 51149 Köln":
    Should be found with "Urbacherweg", if no other hits are available.
  3. To be complete: Real street name is "Theodor-Schnitzler-Weg, 51149 Köln":
    Can be found with "Theodor Schnitzler Weg", but should also be findable with "Theodorschnitzlerweg", if no other hits are available.

Thanx

@openstreetmap-trac

Author: lonvia
[Added to the original trac issue at 9.38pm, Wednesday, 21st August 2013]

Normalization does split off "strasse" from German street names and the same could be done with "weg" and some other common German suffixes. That won't help for the third case, though. So a more general language-independent handling of composita is required here.
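
To illustrate the point, here is a minimal Python sketch of suffix splitting. It is not Nominatim's actual normalization code: the `SUFFIXES` list and the function names are assumptions made for this example, and only "strasse" and "weg" come from this thread.

```python
# Hypothetical suffix list; the thread only names "strasse" and "weg".
SUFFIXES = ("strasse", "weg")


def split_suffix(word: str) -> str:
    """Split a known street-type suffix off a single word (lower-cased)."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        # Only split when something is left in front of the suffix.
        if lowered.endswith(suffix) and len(lowered) > len(suffix):
            return lowered[: -len(suffix)] + " " + suffix
    return lowered


def normalize(name: str) -> str:
    """Treat hyphens as word separators, then split suffixes word by word."""
    words = name.replace("-", " ").split()
    return " ".join(split_suffix(w) for w in words)


# Cases 1 and 2 collapse to a single normalized form ...
print(normalize("Ziegeleiweg"), "|", normalize("Ziegelei-Weg"))
print(normalize("Urbacherweg"), "|", normalize("Urbacher Weg"))
# ... but case 3 does not: the boundary inside "theodorschnitzler" is unknown.
print(normalize("Theodorschnitzlerweg"), "|", normalize("Theodor-Schnitzler-Weg"))
```

With this sketch the first two pairs normalize to the same string, while the third pair still differs, which is why a suffix list alone is not enough.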

See also #4572, #4961

@openstreetmap-trac

Author: florian_rittmeier
[Added to the original trac issue at 11.41pm, Friday, 27th December 2013]

Regarding the third case: Would it be a good solution to modify osm2pgsql/output-gazetteer.c so that it adds an additional alt_name containing the variant of the name without spaces and hyphens? So if the name tag holds a compositum, the variant without separators would be added as an alternative.

The question is whether this should apply only to the name tag or to all name-like tags (int_name, nat_name, loc_name, ...).
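
A rough sketch of the idea, written in Python rather than in the C of osm2pgsql and not taken from any existing code. The helper names `joined_variant` and `extra_alt_names` are hypothetical, and the `name_keys` parameter stands in for the open question of which tags to cover.

```python
import re
from typing import Dict, List, Optional, Sequence


def joined_variant(name: str) -> Optional[str]:
    """Return the name with spaces and hyphens removed, or None if it has none."""
    parts = [p for p in re.split(r"[\s-]+", name) if p]
    if len(parts) < 2:
        return None  # nothing to join, so no extra alt_name is needed
    # Keep only the leading capital, as in "Theodorschnitzlerweg".
    return "".join(parts).capitalize()


def extra_alt_names(tags: Dict[str, str],
                    name_keys: Sequence[str] = ("name",)) -> List[str]:
    """Collect joined variants for the selected name-like tags."""
    variants = []
    for key in name_keys:
        if key in tags:
            variant = joined_variant(tags[key])
            if variant is not None:
                variants.append(variant)
    return variants


tags = {"name": "Theodor-Schnitzler-Weg", "highway": "residential"}
print(extra_alt_names(tags))
```

Passing `name_keys=("name", "int_name", "nat_name", "loc_name")` would extend the same treatment to the other name-like tags.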

@openstreetmap-trac

Author: lonvia
[Added to the original trac issue at 10.55am, Saturday, 28th December 2013]

This could even be done during indexing in SQL by simply adding an unhyphenated version to the search terms, and it would be less of a hack there.

I don't see too much of an issue with removing hyphens (1), but I'm not sure about spaces. It is simply too difficult to distinguish composite-like words (e.g. Freiberger Weg) from true multi-word names (e.g. Auf dem Berg), and doing so would introduce a lot of bad search terms. They probably wouldn't do much harm for searching itself, but we already have issues with DB indexes over the search terms growing too large, so the fewer unnecessary terms the better.

(1) Thinking a bit further, it might even be a good idea to always remove hyphens and full stops from the complete word while still adding the composita parts as partial words.
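
One possible reading of footnote (1), sketched in Python rather than in the SQL used for indexing. The function name `search_terms` and the two-list return value are assumptions for illustration, as is treating each space-separated word as a "complete word"; spaces are deliberately left alone.

```python
import re
from typing import List, Tuple


def search_terms(name: str) -> Tuple[List[str], List[str]]:
    """Return (full terms, partial terms) for a name.

    Hyphens and full stops are removed from each complete word; the pieces
    between them are kept as additional partial terms.
    """
    full_terms: List[str] = []
    partial_terms: List[str] = []
    for word in name.lower().split():
        full = re.sub(r"[-.]", "", word)
        if full:
            full_terms.append(full)
        parts = [p for p in re.split(r"[-.]", word) if p]
        if len(parts) > 1:
            # Only compound words contribute extra partial terms.
            partial_terms.extend(parts)
    return full_terms, partial_terms


print(search_terms("Theodor-Schnitzler-Weg"))
print(search_terms("Auf dem Berg"))
```

In this sketch a hyphenated name yields one joined full term plus its parts, while a true multi-word name such as "Auf dem Berg" produces no joined term at all, so no extra entries are added for it.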
