Opened 4 years ago

Last modified 4 years ago

#4895 new defect

In the search for street names in Spanish, stop words should be eliminated.

Reported by: Emilio Gomez Owned by: geocoding@…
Priority: major Milestone:
Component: nominatim Version:
Keywords: geocoding, spanish, french, stop words Cc:

Description

The Royal Spanish Academy indicates that the preposition "de" (of, in English) should never be omitted in the names of streets, avenues and promenades, unless the name is an adjective. Examples of the first case are "Calle de Espronceda", "Plaza de Colón", "Avenida de América" and "Paseo de Gracia"; examples of the second are "Calle Mayor" and "Plaza Nueva".

Although the correct way to name a street is to put the preposition "de" (of) after the type of road (examples: Calle de Alcalá, Avenida de Pérez Galdós, Plaza de España, etc.), it is very common to skip it when shortening the name. Search engines must take this peculiarity into account.

Nominatim does not currently take this into account, so the search engine fails to show many streets that actually exist in OpenStreetMap. Examples:

This also makes Nominatim unhelpful for reverse geocoding addresses in this language.

Note that in Spanish, besides the preposition "de", another common construct is to use "el", "la", "los" or "las" (all of which translate to "the") just after "de": Carretera del faro, Calle de los Caídos de la División Azul, Calle de las Descalzas. Note that "de el" contracts to "del" in most cases.
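To illustrate the requested behaviour, here is a minimal Python sketch, not Nominatim's actual code. It assumes a hypothetical stop-word set built only from the words named in this ticket, and a hypothetical helper `normalize_street_name` that drops those words before comparison, so that the full and the shortened form of a street name produce the same search key.

```python
import re
import unicodedata

# Hypothetical stop-word set, limited to the words mentioned in this ticket.
SPANISH_STOP_WORDS = {"de", "del", "el", "la", "los", "las"}


def normalize_street_name(name, stop_words=SPANISH_STOP_WORDS):
    """Return a lowercase search key with stop words removed."""
    # Compose accented characters so that \w matches them as single letters.
    text = unicodedata.normalize("NFC", name).lower()
    tokens = re.findall(r"\w+", text)
    return " ".join(t for t in tokens if t not in stop_words)


# The full and the shortened form map to the same key.
full_key = normalize_street_name("Calle de Alcalá")
short_key = normalize_street_name("Calle Alcalá")
same_key = full_key == short_key
```

With this kind of normalization applied to both the stored names and the query, "Calle Alcalá" would match "Calle de Alcalá". The sketch ignores names that consist only of stop words and any interaction with other languages.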

This link has a list of stop words that should be ignored in searches, including the prepositions above.

Change History (4)

comment:1 Changed 4 years ago by Emilio Gomez

Sorry, the links were in reverse:

comment:2 Changed 4 years ago by Cyrille37

  • Keywords french added

comment:3 Changed 4 years ago by Cyrille37

Hi,

The same situation occurs with French. Many people stop using osm.org because it cannot find many places, and this "de" (of) problem is among the top 10 reasons.

French example:
"Prieuré de Saint-Cosme, La Riche" => Finds the correct place
"Prieuré Saint-Cosme, La Riche" => No result

Cheers.
Cyrille.

comment:4 Changed 4 years ago by lonvia

Nominatim does eliminate a few common stop words, mostly articles. Extending this list is not simple for two reasons: first, eliminating stop words from one language can cause heaps of trouble in another language (or in the case of Nominatim, it causes tons of problems with ref values). Second, we can't just add new stop words to an existing database without transforming all existing names. Not completely unsolvable but needs some more thinking.
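Both difficulties can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Nominatim's implementation: `normalize` is a hypothetical helper, the stop-word sets are invented for the example, the index is simplified to a dictionary that was built with no stop words at all, and the ref-like strings "DE 1" and "LA 1" are illustrative values only.

```python
import re


def normalize(name, stop_words):
    """Hypothetical normalizer: lowercase, tokenize, drop stop words."""
    tokens = re.findall(r"\w+", name.lower())
    return " ".join(t for t in tokens if t not in stop_words)


OLD_STOP_WORDS = set()          # assumed (simplified) state when the index was built
NEW_STOP_WORDS = {"de", "la"}   # hypothetical extension of the list

# Problem 1: a stop word in one language can be meaningful elsewhere.
# Two different ref-like values collapse to the same key.
refs = ["DE 1", "LA 1"]
ref_keys = {ref: normalize(ref, NEW_STOP_WORDS) for ref in refs}

# Problem 2: names indexed under the old rules no longer match
# queries normalized under the new rules.
names = ["Calle de Alcalá"]
old_index = {normalize(n, OLD_STOP_WORDS): n for n in names}
query_key = normalize("Calle de Alcalá", NEW_STOP_WORDS)
found_before_rebuild = query_key in old_index

# Only after every stored name is transformed again does the lookup succeed.
new_index = {normalize(n, NEW_STOP_WORDS): n for n in names}
found_after_rebuild = query_key in new_index
```

In this sketch both refs end up with the key "1", and the lookup fails against the old index but succeeds once all names have been re-normalized, which is why extending the list requires transforming the existing database.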

If you want to extend the list of stop words for your own installation, extend the list of replacements in the code around here.
