Opened 10 years ago

Closed 10 years ago

#3035 closed enhancement (wontfix)

API changeset upload command - performance optimisation

Reported by: morb_au
Owned by: morb_au
Priority: minor
Milestone: Wishlist
Component: api
Version:
Keywords: api changeset upload optimisation slow inefficient performance
Cc:


(This refers to the rails_port from subversion r19645. While preparing this report I noticed that the rails_port is now hosted in git. Since I am not familiar with git, I don't know whether this request is a duplicate or has already been fixed.)

Whenever I run an osmosis (0.32.x) --upload-xml-change against my "playpen" osm api server, it takes an extraordinary amount of time to complete compared to the other steps in my data flow. For example, I am importing an ogr dataset in chunks of up to 50,000 elements. Other steps, such as ogr2ogr and "osmosis --read-pgsql --dataset-dump --read-xml --derive-change --write-xml-change", take a minute or two, but --upload-xml-change takes on the order of 40 minutes on an Amazon Web Services m1.small machine.

On the rails_port server the slow command appears to be of the form /api/0.6/changeset/1/upload
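
For readers unfamiliar with this call, here is a minimal Python sketch of what a client sends to it: an osmChange document addressed to the changeset's upload URL. The host name, the generator string and the helper names are assumptions made for illustration. This is not how osmosis builds its requests, and the sketch only constructs the document and URL without sending anything.

```python
import xml.etree.ElementTree as ET


def build_osmchange(changeset_id, points):
    """Build a minimal osmChange document that creates one node per point.

    points is a list of (lat, lon) pairs. New elements carry negative
    placeholder ids; the server assigns the real ids on upload.
    """
    root = ET.Element("osmChange", version="0.6", generator="example-sketch")
    create = ET.SubElement(root, "create")
    for i, (lat, lon) in enumerate(points, start=1):
        ET.SubElement(
            create,
            "node",
            id=str(-i),
            changeset=str(changeset_id),
            lat=f"{lat:.7f}",
            lon=f"{lon:.7f}",
        )
    return ET.tostring(root, encoding="unicode")


def upload_url(base_url, changeset_id):
    """Return the upload URL for a changeset (base_url is hypothetical)."""
    return f"{base_url.rstrip('/')}/api/0.6/changeset/{changeset_id}/upload"


if __name__ == "__main__":
    print(upload_url("http://localhost:3000", 1))
    print(build_osmchange(1, [(-33.8688, 151.2093)]))
```

The whole document is sent in a single POST, so a 50,000 element chunk is one request to this URL.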

Looking at the rails_port logs, several actions related to Changeset Load and Changeset Update appear to be particularly slow to complete (2 to 6 seconds at times). See the file attachment for an example. I can add more of the relevant part of the logs if that would help.

Perhaps some optimisation could be made to the rails ActiveRecord usage, given that the whole changeset upload appears to be wrapped in an atomic PostgreSQL transaction anyway. Intermediate updates of changesets.closed_at and changesets.num_changes seem pointless in the middle of a PostgreSQL transaction.
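
To make the suggestion concrete, here is a minimal Python sketch contrasting the two patterns. It is not the rails_port code: the two-table schema is a simplified stand-in, the function names are hypothetical, and an in-memory SQLite database is used in place of PostgreSQL so the example is self-contained. It shows only that both patterns leave the same final state, and makes no claim about how much time the second one would save.

```python
import sqlite3


def make_db():
    """Create a toy schema with one open changeset (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changesets ("
        "id INTEGER PRIMARY KEY, num_changes INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE nodes ("
        "id INTEGER PRIMARY KEY, changeset_id INTEGER NOT NULL, "
        "lat REAL, lon REAL)"
    )
    conn.execute("INSERT INTO changesets (id) VALUES (1)")
    conn.commit()
    return conn


def upload_per_element(conn, changeset_id, points):
    """Update the changeset row once for every inserted element."""
    with conn:  # one transaction: commit on success, roll back on error
        for lat, lon in points:
            conn.execute(
                "INSERT INTO nodes (changeset_id, lat, lon) VALUES (?, ?, ?)",
                (changeset_id, lat, lon),
            )
            conn.execute(
                "UPDATE changesets SET num_changes = num_changes + 1 "
                "WHERE id = ?",
                (changeset_id,),
            )


def upload_deferred(conn, changeset_id, points):
    """Insert all elements, then update the changeset row once."""
    with conn:  # same single transaction as above
        conn.executemany(
            "INSERT INTO nodes (changeset_id, lat, lon) VALUES (?, ?, ?)",
            [(changeset_id, lat, lon) for lat, lon in points],
        )
        conn.execute(
            "UPDATE changesets SET num_changes = num_changes + ? WHERE id = ?",
            (len(points), changeset_id),
        )


if __name__ == "__main__":
    points = [(-33.0 + i * 0.001, 151.0) for i in range(1000)]
    for upload in (upload_per_element, upload_deferred):
        conn = make_db()
        upload(conn, 1, points)
        print(upload.__name__, conn.execute(
            "SELECT num_changes FROM changesets WHERE id = 1").fetchone()[0])
```

Because nothing outside the transaction can see the intermediate values, the final row is the same either way. The deferred form simply issues one UPDATE instead of one per element.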

I'm not a rails programmer though. I don't know if the optimisation would be feasible.

Attachments (1)

devel.bug.txt (1.6 KB) - added by morb_au 10 years ago.
Sample rails development log of the slow ActiveRecord database actions


Change History (5)

Changed 10 years ago by morb_au

Attachment: devel.bug.txt added

Sample rails development log of the slow ActiveRecord database actions

comment:1 Changed 10 years ago by Tom Hughes

Resolution: invalid
Status: new → closed

Something taking a long time is not, per se, a bug. Sometimes things are just hard and take a long time.

Both of the statements you have indicated probably only took as long as they did because of lock contention. There is no other reason (on even vaguely sensible hardware) why such simple statements should take so long.

If you have a concrete bug report and/or fix then please provide it, but as it stands there is no interesting information in this bug as far as I can see.

comment:2 Changed 10 years ago by morb_au

Resolution: invalid
Status: closed → reopened

Note I labelled this ticket as an enhancement on the wishlist, not a bug. I suppose I could have named the attachment more appropriately, sorry for the confusion.

I am convinced it's a valid topic for enhancement - mind you I don't have any other topological GIS database implementations to compare it with. Maybe all topological database inserts appear equally somnolent to the client.

I'm currently convinced that the slowness of the OSM API implementation is a slave to various rails implementation decisions, and indeed, the rails design philosophy. Therefore I'm going to try an "out of band" method that precomputes the additional database rows such that they may simply be added to an OSM API database using the "psql \copy" paradigm.
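
As a rough illustration of that out-of-band idea, the Python sketch below precomputes rows with pre-allocated ids and writes them in PostgreSQL's tab-separated COPY text format, which is what "psql \copy" reads by default. The column layout, the starting id and the table name in the note after the code are hypothetical. The real OSM API schema has more tables and columns than this, and the sketch does not address how ids are reserved safely against a live server.

```python
import io


def copy_escape(value):
    """Format one value for PostgreSQL's COPY text format.

    None becomes \\N (NULL); backslash, tab, newline and carriage
    return are backslash-escaped.
    """
    if value is None:
        return r"\N"
    text = str(value)
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def write_copy_rows(rows, out):
    """Write rows to out, one tab-separated line per row."""
    for row in rows:
        out.write("\t".join(copy_escape(v) for v in row) + "\n")


def precompute_node_rows(first_id, changeset_id, points):
    """Assign consecutive ids, starting at first_id, to (lat, lon) points.

    The (id, changeset_id, lat, lon) layout is an illustrative assumption.
    """
    return [
        (first_id + i, changeset_id, lat, lon)
        for i, (lat, lon) in enumerate(points)
    ]


if __name__ == "__main__":
    buf = io.StringIO()
    rows = precompute_node_rows(1000, 1, [(-33.8688, 151.2093), (-33.87, 151.21)])
    write_copy_rows(rows, buf)
    print(buf.getvalue(), end="")
```

A file written this way could then be loaded with something like "\copy nodes_example (id, changeset_id, lat, lon) from 'nodes.tsv'" in psql, where nodes_example stands for whatever table the rows were computed for.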

For my use case the "premature optimisation is the root of all evil" principle no longer applies, as I appear to have exhausted all reasonable in-band optimisation opportunities. I realise my approach is brittle with respect to evolution of the API, and I don't mind. I also realise it essentially introduces aspects of multi-master replication, for example in the allocation of row ids.

My intent is to report back to this ticket as to my success or otherwise using this alternate approach. I'm fine if you assign this ticket back to me.

comment:3 Changed 10 years ago by morb_au

Owner: changed from Tom Hughes to morb_au
Status: reopened → new

comment:4 Changed 10 years ago by Tom Hughes

Resolution: wontfix
Status: new → closed

Yes, we know rails is a really shitty tool to use for bulk data processing like changeset upload and the map call. That's why we're gradually working towards rewriting that side of things in C++ to avoid all the problems with it.

We will not be accepting any patches that try and change the rails code to not actually look like rails code - that includes anything that tries to add rows to tables using bulk copy.

This trac is not your personal playpen - if you don't have a concrete bug report or reasonable enhancement request for the rails code then please don't raise tickets here. I don't care what crazy code you write for commonmap, but if it's not related to our code then please don't use this tracker for it.
