Opened 10 years ago

Closed 10 years ago

#1674 closed enhancement (fixed)

[PATCH] Improve performance of large imports

Reported by: Travers Carter
Owned by: Chris Browet
Priority: minor
Milestone:
Component: merkaartor
Version:
Keywords:
Cc:

Description

When importing large OSM XML files, merkaartor gets slower and slower as the parsing progresses. Importing a 66 MB file takes 2:24 on my machine (8 GB Athlon64 x2 5000+). The first 50% or so of the parsing stage completes in around 20 seconds, but the rest gets progressively slower. After the import is complete, the map takes a further ~22 seconds to render.

This may be related to http://trac.openstreetmap.org/ticket/1464 too, but I can't replicate that problem with a random 1.5 MB GPX track.

I ran cachegrind (valgrind) and found some changes that speed it up quite a bit. My main 66 MB test file covers roughly this area: http://www.openstreetmap.org/?lat=-34.023&lon=150.796&zoom=9&layers=B000FTF extracted from australia.osm with osmosis.

Please consider applying the attached patches

merkaartor-osm-xml-import-speed.diff


Improves import time from 2 min 24 seconds to ~21 seconds

1) The linear tagList and tagValues in MapDocument become a bottleneck with several thousand entries. I converted these to a single QHash of QSets, which shows a huge improvement in parse time and makes the progress fairly linear.

2) Updating the progress bar for every object processed is quite slow, so I changed it to update only once for every 1% of progress made.

3) QDateTime::fromString(..., Qt::ISODate) is somewhat faster than QDateTime::fromString(..., "yyyy-MM-ddTHH:mm:ss") since there is no need for Qt to parse the format string on every call.

4) QUuid::toString() is surprisingly slow. I couldn't find a particularly nice solution, but I came up with a static table approach that, while really ugly, is around 10 times faster than the Qt built-in according to my cachegrind results.
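
Items 1) and 2) above can be sketched roughly as follows. This is not the code from the patch: it uses std::unordered_map and std::unordered_set as standard-library stand-ins for QHash and QSet, and the names TagIndex and ProgressThrottle are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the tag bookkeeping: one hash keyed by tag
// name, each entry holding the set of values seen for that name.
class TagIndex {
public:
    // Average O(1) per call, instead of scanning a growing linear list.
    void add(const std::string& key, const std::string& value) {
        values_[key].insert(value);
    }
    std::size_t keyCount() const { return values_.size(); }
    std::size_t valueCount(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::string, std::unordered_set<std::string>> values_;
};

// Hypothetical helper: signals a repaint only when the whole-number
// percentage changes, rather than once per parsed object.
class ProgressThrottle {
public:
    explicit ProgressThrottle(std::size_t total) : total_(total) {}

    // Returns true when the caller should update the progress bar.
    bool advance(std::size_t done) {
        if (total_ == 0) return false;
        int percent = static_cast<int>(done * 100 / total_);
        if (percent == lastPercent_) return false;
        lastPercent_ = percent;
        return true;
    }
    int percent() const { return lastPercent_; }

private:
    std::size_t total_;
    int lastPercent_ = -1;  // -1 means nothing has been reported yet
};
```

In a parse loop, the caller would call add() for every tag and touch the progress widget only when advance() returns true, so the number of widget updates stays bounded regardless of file size.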

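The static table idea in item 4) might look something like the sketch below. It is an illustration only, not the patch's implementation: it assumes the UUID is available as 16 raw bytes in textual order (QUuid stores its fields differently), and the function name uuidToString is hypothetical. The output uses the brace-wrapped, lowercase 8-4-4-4-12 layout.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Formats 16 raw UUID bytes as "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}".
// Each byte becomes two characters by indexing a static table, so no
// general-purpose number formatting runs per call.
std::string uuidToString(const std::array<std::uint8_t, 16>& bytes) {
    static const char kHex[] = "0123456789abcdef";
    std::string out;
    out.reserve(38);  // 32 hex digits + 4 dashes + 2 braces
    out.push_back('{');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // A dash precedes bytes 4, 6, 8 and 10 (the 8-4-4-4-12 grouping).
        if (i == 4 || i == 6 || i == 8 || i == 10) out.push_back('-');
        out.push_back(kHex[bytes[i] >> 4]);    // high nibble
        out.push_back(kHex[bytes[i] & 0x0F]);  // low nibble
    }
    out.push_back('}');
    return out;
}
```
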
merkaartor-tagselector-skipregex.diff


Improves drawing time from ~22 seconds to ~16 seconds

This one substitutes simple string compares for the QRegExp matches where the selector rules don't use any metacharacters, by doing a little more work when first parsing the selector. It also performs isOneOf matches in cost order.
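
A minimal sketch of the regex-skipping idea, assuming whole-value matching. It uses std::regex rather than QRegExp (the two syntaxes differ), and the class name ValueMatcher and its metacharacter list are hypothetical, not taken from the patch.

```cpp
#include <regex>
#include <string>

// Hypothetical matcher: classifies the pattern once at construction and
// compiles a regex only when the pattern contains metacharacters.
class ValueMatcher {
public:
    explicit ValueMatcher(const std::string& pattern)
        : literal_(pattern),
          isLiteral_(pattern.find_first_of("\\^$.|?*+()[]{}") ==
                     std::string::npos) {
        if (!isLiteral_) regex_ = std::regex(pattern);
    }

    // Plain patterns take the cheap string comparison path.
    bool matches(const std::string& value) const {
        return isLiteral_ ? value == literal_
                          : std::regex_match(value, regex_);
    }
    bool usesRegex() const { return !isLiteral_; }

private:
    std::string literal_;
    bool isLiteral_;
    std::regex regex_;
};
```

The extra work happens once per selector rule, while the saving applies to every feature tested against that rule during drawing.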

merkaartor-sort-mapfeatures.diff


Further improves drawing time from ~16 seconds to ~13 seconds

It seems that sorting a large list of QPointers is much more expensive than sorting native pointers, so much so that it's faster to convert the whole list of features to native pointers, sort it, and then convert it back. I separated this from the above patch because I'm not really sure what the risks of not using a guarded pointer for this step are, e.g., could map features be deleted while the sort is running?
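
The convert, sort, convert-back step could be sketched as below. Guarded is a hypothetical, much simplified stand-in for QPointer (it does no destruction tracking), and Feature with its priority field is an invented sort key for illustration. The sketch also assumes no pointer in the list is null and that nothing deletes a feature while the sort runs, which is exactly the open question raised above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Invented feature type; "priority" is a placeholder sort key.
struct Feature {
    int priority;
};

// Hypothetical stand-in for a guarded pointer such as QPointer. The real
// class also tracks object destruction, which is what makes it costly.
template <typename T>
class Guarded {
public:
    Guarded(T* p = nullptr) : ptr_(p) {}
    T* data() const { return ptr_; }

private:
    T* ptr_;
};

void sortFeatures(std::vector<Guarded<Feature>>& features) {
    // 1. Copy the guarded pointers out as plain pointers.
    std::vector<Feature*> raw;
    raw.reserve(features.size());
    for (const auto& g : features) raw.push_back(g.data());

    // 2. Sort the plain pointers; swaps here are simple word copies.
    std::sort(raw.begin(), raw.end(),
              [](const Feature* a, const Feature* b) {
                  return a->priority < b->priority;
              });

    // 3. Write the sorted order back into the guarded list.
    for (std::size_t i = 0; i < raw.size(); ++i) {
        features[i] = Guarded<Feature>(raw[i]);
    }
}
```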

Attachments (3)

merkaartor-xml-import-speed.diff (10.1 KB) - added by Travers Carter 10 years ago.
OSM XML Import performance improvements
merkaartor-tagselector-skipregex.diff (3.1 KB) - added by Travers Carter 10 years ago.
Improve performance of simple tagselector matches
merkaartor-sort-mapfeatures.diff (928 bytes) - added by Travers Carter 10 years ago.
Improve performance of map feature sorting


Change History (5)

Changed 10 years ago by Travers Carter

OSM XML Import performance improvements

Changed 10 years ago by Travers Carter

Improve performance of simple tagselector matches

Changed 10 years ago by Travers Carter

Improve performance of map feature sorting

comment:1 Changed 10 years ago by Chris Browet

Owner: changed from cbro@… to Chris Browet
Status: new → assigned

Hi.

Thanks for all this.

I'll commit the first two patches as-is. They seem like nice optimizations. For the third, I'll remove the QPointer stuff altogether. Generally speaking, QPointers are very inefficient with large lists.

Their only purpose is that there seems to be at least one "black spot" in the code where features are deleted but not removed from the layers. This leads to a segfault when the layer is cleared.

Obviously, at segfault time, it is impossible to backtrack where the original problem is, so I guarded the pointers as a cheap workaround. I have to find something else, or better, find the black spots ;-)

comment:2 Changed 10 years ago by Chris Browet

Resolution: fixed
Status: assigned → closed

(In [14341])
FIX : Do not use guarded pointers (too slow) (closes #1674)
FIX : Styles tag selection speed optimisation (by Trav)
FIX : OSM/GPX import optimisations (by Trav)
FIX : install .desktop file on "make install"
