This repository has been archived by the owner on Jul 24, 2021. It is now read-only.

API changeset upload command - performance optimisation #3035

Closed
openstreetmap-trac opened this issue Jul 23, 2021 · 3 comments

Comments

@openstreetmap-trac

Reporter: morb_au
[Submitted to the original trac issue database at 1.29pm, Saturday, 5th June 2010]

(This refers to the rails_port from subversion r19645. While preparing this bug report I noticed that the rails_port is now hosted in git. Since I am not familiar with git, I don't know whether this request is a duplicate or has already been fixed.)

Whenever I do an osmosis (0.32.x) --upload-xml-change to my "playpen" osm api server, it takes an extraordinary amount of time to complete compared to other steps in my data flow. For example I am importing an ogr dataset in chunks of up to 50,000 elements, and other steps such as ogr2ogr and "osmosis --read-pgsql --dataset-dump --read-xml --derive-change --write-xml-change" take a minute or two, but --upload-xml-change takes in the order of 40 minutes on an Amazon Web Services m1.small machine.

On the rails_port server the slow command appears to be of the form /api/0.6/changeset/1/upload
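
For context, the body sent to that endpoint is an osmChange document. A minimal sketch of building one in Python is below; the helper name `build_osmchange`, the generator string and the sample coordinates are illustrative assumptions and are not taken from osmosis or the rails_port.

```python
import xml.etree.ElementTree as ET


def build_osmchange(changeset_id, nodes):
    """Build an osmChange document that creates the given nodes.

    `nodes` is a list of (placeholder_id, lat, lon) tuples. Negative
    placeholder ids mark elements that do not exist on the server yet.
    """
    root = ET.Element("osmChange", version="0.6", generator="example-sketch")
    create = ET.SubElement(root, "create")
    for placeholder_id, lat, lon in nodes:
        ET.SubElement(
            create,
            "node",
            id=str(placeholder_id),
            changeset=str(changeset_id),
            lat=f"{lat:.7f}",
            lon=f"{lon:.7f}",
        )
    return ET.tostring(root, encoding="unicode")


# Sample data only; a real client would POST this payload to
# /api/0.6/changeset/1/upload. No network request is made here.
payload = build_osmchange(1, [(-1, -33.8688, 151.2093), (-2, -33.8690, 151.2100)])
```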

Looking at the rails_port logs, there are several actions related to Changeset Load and Changeset Update, that appear to be particularly slow to complete (2 to 6 seconds at times). See file attachment for an example. I can add more of the appropriate part of the logs if that would help.

Perhaps some optimisation could occur around rails ActiveRecord usage, taking into consideration that the whole changeset upload appears to be wrapped in an atomic PostgreSQL transaction anyway. Intermediate updating of changesets.closed_at and changesets.num_changes seems pointless in the middle of a PostgreSQL transaction.
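
To illustrate the idea, here is a rough sketch of deferring the changeset bookkeeping to a single statement at the end of the transaction. It uses Python's built-in sqlite3 with a drastically simplified, hypothetical two-table schema; it is not the rails_port schema or code, and the function name `upload_deferred` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified stand-ins for the real tables.
conn.execute(
    "CREATE TABLE changesets (id INTEGER PRIMARY KEY, num_changes INTEGER, closed_at TEXT)"
)
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, changeset_id INTEGER)")
conn.execute("INSERT INTO changesets VALUES (1, 0, NULL)")
conn.commit()


def upload_deferred(conn, changeset_id, node_ids, closed_at):
    """Insert all elements, then touch the changeset row exactly once."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO nodes (id, changeset_id) VALUES (?, ?)",
            [(node_id, changeset_id) for node_id in node_ids],
        )
        # Single bookkeeping update instead of one per uploaded element.
        conn.execute(
            "UPDATE changesets SET num_changes = num_changes + ?, closed_at = ? "
            "WHERE id = ?",
            (len(node_ids), closed_at, changeset_id),
        )


upload_deferred(conn, 1, list(range(1, 1001)), "2010-06-05T14:29:00Z")
```

Since nothing outside the transaction can see the intermediate values, only the final counter and timestamp matter to other sessions. Whether ActiveRecord can be persuaded to behave this way is a separate question.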

I'm not a rails programmer though. I don't know if the optimisation would be feasible.

@openstreetmap-trac

Author: TomH
[Added to the original trac issue at 10.58am, Sunday, 6th June 2010]

Something taking a long time is not, per se, a bug. Sometimes things are just hard and take a long time.

Both the statements you have indicated look like they probably only took as long as they did due to lock contention as there is no other reason (on even vaguely sensible hardware) why such simple statements should take so long otherwise.

If you have a concrete bug report and/or fix then please provide it but as it stands there is no interesting information in this bug as far as I can see.

@openstreetmap-trac

Author: morb_au
[Added to the original trac issue at 3.02am, Thursday, 10th June 2010]

Note I labelled this ticket as an enhancement on the wishlist, not a bug. I suppose I could have named the attachment more appropriately, sorry for the confusion.

I am convinced it's a valid topic for enhancement - mind you I don't have any other topological GIS database implementations to compare it with. Maybe all topological database inserts appear equally somnolent to the client.

I'm currently convinced that the slowness of the OSM API implementation is a slave to various rails implementation decisions, and indeed, the rails design philosophy. Therefore I'm going to try an "out of band" method that precomputes the additional database rows such that they may simply be added to an OSM API database using the "psql \copy" paradigm.
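
As a rough sketch of what the precomputation step might produce, the Python below serialises rows into the tab-delimited text format that PostgreSQL's COPY (and therefore psql's `\copy`) reads by default. The column layout and sample values are hypothetical and do not reflect the actual API database schema; the function names are made up for this example.

```python
def copy_escape(value):
    """Encode one value for PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker
    text = str(value)
    # Escape backslashes first so the escapes added below are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_copy_text(rows):
    """Join rows into tab-delimited, newline-terminated COPY input."""
    return "".join("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)


# Hypothetical precomputed rows: (id, latitude, longitude, changeset_id, note)
rows = [
    (1001, -338688000, 1512093000, 1, "name\twith tab"),
    (1002, -338690000, 1512100000, 1, None),
]
copy_text = rows_to_copy_text(rows)
```

The resulting text could be written to a file and loaded with `\copy some_table from 'rows.tsv'` in psql, where `some_table` and the file name are placeholders. The sketch does nothing about allocating row ids consistently with the server, which is the brittle part of this approach.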

For my use case the "premature optimisation is the root of all evil" principle no longer applies. I appear to have exhausted all reasonable in-band optimisation opportunities. I realise my approach is brittle with respect to evolution of the API, and I don't mind. I realise my approach is essentially introducing aspects of multi master replication with respect to the allocation of row ids etc.

My intent is to report back to this ticket as to my success or otherwise using this alternate approach. I'm fine if you assign this ticket back to me.

@openstreetmap-trac

Author: TomH
[Added to the original trac issue at 2.30pm, Friday, 11th June 2010]

Yes, we know rails is a really shitty tool to use for bulk data processing like changeset upload and the map call. That's why we're gradually working towards rewriting that side of things in C++ to avoid all the problems with it.

We will not be accepting any patches that try and change the rails code to not actually look like rails code - that includes anything that tries to add rows to tables using bulk copy.

This trac is not your personal playpen - if you don't have a concrete bug report or reasonable enhancement request for the rails code then please don't raise tickets here. I don't care what crazy code you write for commonmap, but if it's not related to our code then please don't use this tracker for it.
