Opened 3 years ago

Last modified 8 months ago

#5256 new enhancement

Error-tolerant search for house numbers with spaces

Reported by: is-zUser Owned by: geocoding@…
Priority: major Milestone:
Component: nominatim Version:
Keywords: house number search Cc: service@…

Description

In Germany, house numbers with a letter suffix are recorded in different formats in the address data (e.g. 1b or 1 B). Searching for 1b does not find 1 B, and searching for 1 B does not find 1b. Address lookup currently returns results only when the search input matches the mapped spelling exactly. A house number search should therefore try both versions (with and without a space), similar to how a street search works both with and without abbreviations.
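The two-variant lookup requested above could be sketched roughly as follows. This is only an illustration in Python, not Nominatim code: the helper name housenumber_variants is hypothetical, and the sketch assumes the suffix is a single ASCII letter directly after the digits.

```python
import re

# Hypothetical helper for illustration only; not part of Nominatim.
# Assumption: a house number is digits followed by one optional-space-separated letter.
_NUMBER_WITH_SUFFIX = re.compile(r"^(\d+)\s*([A-Za-z])$")


def housenumber_variants(term):
    """Return the spellings to try for a house number search term."""
    term = term.strip()
    match = _NUMBER_WITH_SUFFIX.match(term)
    if match is None:
        # Not a "number plus letter" term: search it unchanged.
        return [term]
    number, letter = match.groups()
    # Try the spelling without a space first, then the one with a space.
    return [f"{number}{letter}", f"{number} {letter}"]


print(housenumber_variants("1b"))
print(housenumber_variants("1 B"))
print(housenumber_variants("12"))
```

Letter case is left untouched in this sketch; it only varies the space.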

Background:
Since there are no uniform rules on the wiki regarding the spelling, both spellings may be in use. In general, most people have used the notation without a space (e.g. 1b). Since 2011, however, the German standard DIN 5008 has prescribed a space between the number and the letter. In Germany this is normally taught and will sooner or later be adopted in practice. Here are a few links (in German):

http://www.sekretaerinnen-service.de/newsletterarticle.asp?his=17.42.55.8177&id=16679

http://www.klaus-kolb.de/DIN5008.pdf

http://www.fachlehrerseite.de/viewtopic.php?t=2975

Change History (5)

comment:1 Changed 3 years ago by is-zUser

Small info: I've opened a blog and seen that this is not only a German problem.

comment:2 Changed 3 years ago by lonvia

  • Summary changed from Introduction of error-tolerant search for house numbers with additional to Error-tolerant search for house numbers with spaces

It would be fairly easy to remove all spaces from house numbers in the OSM data. That would solve the case where '1 b' is mapped but '1b' is typed in the search box. The other way around is not so simple to solve because it involves correctly guessing that the term typed in the search box is a house number.

The drawback of this approach is that a search for '1 b' no longer finds anything, even when '1 b' was mapped. So the question is which notation is used more often, and where.

Just to be clear: this issue is only about spaces in house numbers. There is no issue with different case of the addendum. Changing the ticket subject to reflect that.
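The space-stripping idea and its drawback can be illustrated with a small Python sketch. It is a toy model, not how Nominatim indexes data: normalize_housenumber and the in-memory index dictionary are hypothetical names, and the sample values are made up.

```python
def normalize_housenumber(value):
    """Drop all whitespace from a mapped house number, e.g. '1 b' -> '1b'."""
    return "".join(value.split())


# Made-up sample of mapped addr:housenumber values.
mapped = ["1 b", "1b", "12", "7 A"]

# Toy index: normalized spelling -> original mapped spellings.
index = {}
for raw in mapped:
    index.setdefault(normalize_housenumber(raw), []).append(raw)

# Typing '1b' now finds both mapped spellings.
print(index.get("1b"))

# The drawback: typing '1 b' finds nothing unless the search term
# is also recognised as a house number and normalized the same way.
print(index.get("1 b"))
```

Applying normalize_housenumber to the search term as well would close the gap, but only once the term has been identified as a house number, which is the hard part noted above.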

comment:3 Changed 3 years ago by pnorman

Of the 2.9 million addr:housenumber values on nodes that start with a digit and end with a letter, 144k match '.* [a-z]' case-insensitively.

This doesn't tell you what people are searching for, but I hope it helps with the data.
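A count of this kind could be reproduced on a list of values roughly as follows. This is a sketch under assumptions, not the query that was actually run: count_spaced is a hypothetical name, the pattern '.* [a-z]' is assumed to apply to the whole value, and the sample values are made up.

```python
import re

# Values that start with a digit and end with a letter.
STARTS_DIGIT_ENDS_LETTER = re.compile(r"^[0-9].*[a-z]$", re.IGNORECASE)
# The pattern from the comment above, assumed to match the whole value.
SPACE_BEFORE_LETTER = re.compile(r".* [a-z]", re.IGNORECASE)


def count_spaced(values):
    """Return (candidates, spaced): values with a letter suffix, and those with a space before it."""
    candidates = [v for v in values if STARTS_DIGIT_ENDS_LETTER.match(v)]
    spaced = [v for v in candidates if SPACE_BEFORE_LETTER.fullmatch(v)]
    return len(candidates), len(spaced)


# Made-up sample values, for illustration only.
sample = ["1b", "1 B", "12", "3 a", "4c", "5 bis"]
print(count_spaced(sample))
```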

comment:4 Changed 3 years ago by lonvia

Thanks for the numbers. That's more than I expected.

comment:5 Changed 8 months ago by is-zUser

Here is an example: http://www.openstreetmap.org/node/3010554896
It's not from me. We need a unified solution. Thanks!
