Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#5111 closed defect (wontfix)

'France' relationship probably broken

Reported by: zeenix@… Owned by: geocoding@…
Priority: major Milestone:
Component: nominatim Version:
Keywords: Cc:

Description

Our Nominatim test cases in geocode-glib that search for "bonneville" with the "fr_FR" locale are currently broken because Nominatim is not returning "country" in the address object, only "country_code". The test case also breaks on the description of the place, as we rely on the country to construct a good description for the place.

Change History (9)

comment:1 Changed 5 years ago by Sarah Hoffmann

Component: datasources → nominatim
Owner: changed from mikel_maron@… to geocoding@…

comment:2 Changed 5 years ago by Sarah Hoffmann

It turns out that the relation for Metropolitan France has inconsistent data because its admin_level was changed from 2 to 3. The admin_level has been updated to 3 but rank_address is still at 4. That means it clashes with the actual country relation for France in the place_addressline table and overrides it as an address part, so France is no longer an official part of the address. When computing the address details, admin_level is used instead of rank_address, so Metropolitan France becomes a state and there is no country in the address.
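To make the clash described above concrete, here is a toy model in Python. It is not Nominatim's code: the `AddressLine` class, the rule that the later row wins a rank clash, and the label mapping are all assumptions made for illustration. Only the values (admin_level 2 and 3, rank_address 4) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class AddressLine:
    """Hypothetical stand-in for one candidate address part of a place."""
    name: str
    rank_address: int
    admin_level: int


def pick_address_parts(lines):
    """Keep one entry per rank_address (assumption: the later row wins a clash)."""
    chosen = {}
    for line in lines:
        chosen[line.rank_address] = line
    return list(chosen.values())


def label(line):
    """Derive the address label from admin_level (assumed mapping, not Nominatim's)."""
    if line.admin_level == 2:
        return "country"
    if line.admin_level <= 4:
        return "state"
    return "other"


def address_details(lines):
    """Return (label, name) pairs for the address parts that survive the clash."""
    return [(label(line), line.name) for line in pick_address_parts(lines)]


lines = [
    AddressLine("France", rank_address=4, admin_level=2),
    # admin_level already updated to 3, rank_address still the stale 4
    AddressLine("Metropolitan France", rank_address=4, admin_level=3),
    AddressLine("Rhône-Alpes", rank_address=8, admin_level=4),
]
details = address_details(lines)
```

In this sketch Metropolitan France displaces France at rank 4 and is then labelled by its admin_level, so no entry in `details` carries the "country" label.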

The proper fix would be to replace a place in the placex table when its admin level changes. But in the case of the Metropolitan France relation that means the entire country gets reindexed, which is not really doable. Nor can I think of any way to hack the change of the admin level into the placex and place_addressline tables in a way that would actually fix the problem above.

So I'm thinking of doing the replacement in placex and putting a hard limit of around 10000 on the number of linked objects to be reindexed. @twain47 any better ideas?

comment:3 Changed 5 years ago by Sarah Hoffmann

Resolution: fixed
Status: new → closed

Fixed the admin_level update in 98b93df/nominatim. Remains to be seen if poldi can handle the additional load or if the reindexing needs to be restricted further.

As for Metropolitan France: I've manually fixed the search ranks but it is not possible to reindex all of France to fix the addresses. They will sort themselves out with time when the OSM objects are edited for other reasons.

comment:4 Changed 5 years ago by zeenix@…

Resolution: fixed
Status: closed → reopened

This doesn't seem to be fixed. I still don't get country in the address for 'Bonneville, Rhône-Alpes' (place id: 97213041):

http://nominatim.openstreetmap.org/search?q=bonneville&limit=10&addressdetails=1&accept-language=fr-FR&format=xml

comment:5 Changed 5 years ago by Sarah Hoffmann

Resolution: wontfix
Status: reopened → closed

As I said, the underlying update problem has been fixed but the data that was already wrong has not been corrected. Recomputing all of France would bring down the server for a day or so. So if you prefer the wording: data-wise it is a won't fix until the next full reimport.

comment:6 in reply to: 5 Changed 5 years ago by zeenix@…

Replying to lonvia:

As I said, the underlying update problem has been fixed but the data that was already wrong has not been corrected. Recomputing all of France would bring down the server for a day or so. So if you prefer the wording: data-wise it is a won't fix until the next full reimport.

When is that going to happen? How do I track this issue (so that we can re-enable our failing test cases)? Why not keep this open until that is done? If you claim that the issue has been fixed, why resolve it as 'wontfix'?

comment:7 Changed 5 years ago by Sarah Hoffmann

This is a bug tracker for the software, not for the contents of the database. I've set it to won't fix now so that people don't keep reopening this bug just because there is still some bad data left.

Reimports happen infrequently, I can't tell you when the next one is going to be and we've never explicitly announced them as they happen without any downtime. If your tests rely on our servers always delivering the same data then you might want to reconsider your test strategy. OSM is a live database with lots of editing happening and breaks like this can happen. If you need the country information, you should use country_code from the address details as a fallback.
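A minimal sketch of that fallback in Python, assuming the address details have already been parsed into a dict. The helper name `country_label` is hypothetical; only the "country" and "country_code" keys come from the address details discussed in this ticket.

```python
from typing import Optional


def country_label(address: dict) -> Optional[str]:
    """Prefer the country name; fall back to the upper-cased country code."""
    name = address.get("country")
    if name:
        return name
    code = address.get("country_code")
    # The code is not localized, so this is only a stopgap for display.
    return code.upper() if code else None
```

With an address that only carries `{"country_code": "fr"}` this returns "FR"; once "country" is present again, the name is used instead.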

comment:8 in reply to: 7 Changed 5 years ago by zeenix@…

Replying to lonvia:

This is a bug tracker for the software, not for the contents of the database.

Ah ok, fair enough.

I've set it to won't fix now so that people don't keep reopening this bug just because there is still some bad data left.

I don't think the status/resolution of the bug is about telling people not to reopen it but you can use it however you like.

Reimports happen infrequently, I can't tell you when the next one is going to be and we've never explicitly announced them as they happen without any downtime.

It would be really nice to have a way to track that. I imagine we won't be the only ones finding such bugs.

If your tests rely on our servers always delivering the same data then you might want to reconsider your test strategy. OSM is a live database with lots of editing happening and breaks like this can happen.

It's fine for tests to break every now and then, with us having to update either our test cases or the Nominatim database.

If you need the country information, you should use country_code from the address details as a fallback.

The issue is that we are testing localization in this case and country_code is not localized. Also, the country name (along with the place's name, state, etc.) is used to create a nice user-friendly name for each search result in geocode-glib. We are also testing that code here. The country name is important because the same place name exists in other countries, and this code of ours is therefore supposed to include the country name in the friendly name.

comment:9 Changed 5 years ago by Sarah Hoffmann

The localization test does not sound like it makes sense, because localization happens on the server side. So correct localization of the address parts should be tested by Nominatim internally, not by your library. You can only test that you send the right parameters to trigger localization. Do that on a simple query like 'France' or 'Paris' and check that you get a different display_name back.
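As a rough illustration of testing only the request side, the Python sketch below builds a search URL and checks that the language parameter is present. The parameter names are taken from the search URL quoted in comment 4; the helper `build_search_url` is hypothetical and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://nominatim.openstreetmap.org/search"


def build_search_url(query: str, language: str, base_url: str = BASE_URL) -> str:
    """Build a search URL that asks the server to localize the results."""
    params = {
        "q": query,
        "limit": 10,
        "addressdetails": 1,
        "accept-language": language,
        "format": "xml",
    }
    return base_url + "?" + urlencode(params)


# Check the parameters that were sent, not the localized data that comes back.
sent = parse_qs(urlsplit(build_search_url("Paris", "fr-FR")).query)
assert sent["accept-language"] == ["fr-FR"]
assert sent["q"] == ["Paris"]
```

A separate live check could then compare display_name for two different languages, which stays valid even when the underlying data changes.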

For the tests of user-friendly names, it makes more sense to rely on mock-up answers than to fix your tests every time the Nominatim database changes.
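A sketch of what such a mock-up test could look like, written in Python for brevity rather than in geocode-glib's own language. The canned response is hand-written for this example, not captured from the server, and both the "town" key and the `friendly_name` helper are assumptions.

```python
import json

# Hand-written mock answer; not real server output.
CANNED_RESPONSE = """
[
  {
    "address": {
      "town": "Bonneville",
      "state": "Rhône-Alpes",
      "country": "France",
      "country_code": "fr"
    }
  }
]
"""


def friendly_name(address: dict) -> str:
    """Join the available address parts into a user-facing description."""
    parts = [address.get(key) for key in ("town", "state", "country")]
    return ", ".join(part for part in parts if part)


def test_friendly_name_includes_country():
    results = json.loads(CANNED_RESPONSE)
    name = friendly_name(results[0]["address"])
    assert name == "Bonneville, Rhône-Alpes, France"


test_friendly_name_includes_country()
```

Because the input is fixed, this test keeps passing regardless of edits to the live database, and it fails only when the naming code itself changes.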
