
Massive GPX uploads can wedge the GPX upload queue for hours, days even #580

Closed
openstreetmap-trac opened this issue Jul 23, 2021 · 3 comments

@openstreetmap-trac

Reporter: Steve Hosgood
[Submitted to the original trac issue database at 10.26am, Wednesday, 31st October 2007]

Surely there should be some sort of "fair scheduler" on GPX uploads. As I write this, a certain user (from Spain) has uploaded about 12 GPX files, the first of which has been processing for 19 hours now; if his other uploads are of similar length, the system is going to be hosed for at least 200 more hours.

This happened last week too, though that jam was "only" around 7 hours, when a user (from Britain) uploaded several 100,000-point GPX files.

One suggestion: reject any single upload larger than X kilobytes. Chances are the user has included hours of logging while parked in a car park and didn't trim that part out!

Another suggestion: queue big files separately and process them only when the "small jobs" queue is empty.
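
A minimal sketch of how that two-queue idea could work, assuming a simple in-process worker loop - `SMALL_LIMIT`, `enqueue` and `process` are illustrative names invented for this sketch, not anything from the actual codebase:

```ruby
# Hypothetical two-queue scheduler: a large trace only runs when no
# small job is waiting, so one huge upload can't wedge everyone else.
SMALL_LIMIT = 1_000_000 # bytes; this threshold is an arbitrary assumption

small_jobs = []
large_jobs = []

def enqueue(job, small_jobs, large_jobs)
  (job[:size] > SMALL_LIMIT ? large_jobs : small_jobs) << job
end

def process(job)
  puts "importing #{job[:file]} (#{job[:size]} bytes)" # stand-in for the real import
end

loop do
  job = small_jobs.shift || large_jobs.shift # small jobs always win
  job ? process(job) : sleep(1)
end
```

A refinement would be to run the two queues on separate daemons, so large jobs never delay small ones at all rather than merely yielding to them.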

@openstreetmap-trac

Author: steve[at]nexusuk.org
[Added to the original trac issue at 12.28pm, Wednesday, 31st October 2007]

I've not yet looked at the source code, but it seems that the XML parser might be loading the entire DOM tree into memory before starting to import the GPS points into the database. The way the GPX files are structured means that this isn't necessary and it would be far more memory efficient to handle the GPX file as a stream, adding each point to the database as it is read and then discarding the data from memory immediately.
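
For illustration, a streaming parse along those lines might look like the sketch below, using REXML's stream listener; the `GpxPointListener` class and the point-handling block are invented for this example, not taken from the site's code:

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Listener that hands each trackpoint off as soon as it is parsed, so
# memory use stays roughly constant regardless of the file's size.
class GpxPointListener
  include REXML::StreamListener

  def initialize(&block)
    @on_point = block
  end

  def tag_start(name, attrs)
    # Each <trkpt lat="..." lon="..."> carries its coordinates as attributes.
    @on_point.call(attrs["lat"].to_f, attrs["lon"].to_f) if name == "trkpt"
  end
end

# Each point is processed (e.g. inserted into the database) and then
# discarded, instead of building a DOM for the whole file first.
listener = GpxPointListener.new do |lat, lon|
  puts "#{lat},#{lon}" # stand-in for a per-point database insert
end
File.open("trace.gpx") { |io| REXML::Document.parse_stream(io, listener) }
```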

I'll raise my hand and admit that I was the British user who uploaded several large GPX files (sorry) - I wasn't expecting it to cause any problems. One of the files failed to parse and returned a Ruby stack trace to me by email after running out of memory, which adds weight to my suspicion that it is loading the whole file into memory before processing it.

As a side note, is it such a good idea, from a security perspective, to send stack traces to users? It seems that would make it easier for a malicious user to "tune" an attack.
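
For what it's worth, the safer pattern is easy to sketch: log the full backtrace server-side and mail the user only a generic failure notice. Everything below (`import_trace`, `notify_user`, the trace hash) is hypothetical scaffolding for this sketch, not the site's actual code:

```ruby
require "logger"

LOGGER = Logger.new($stdout) # in production this would be a server-side log

def import_trace(trace)
  raise "out of memory" # hypothetical job that simulates the failure mode
end

def notify_user(trace, message)
  puts "mail to #{trace[:owner]}: #{message}" # stand-in for a mailer
end

trace = { owner: "user@example.org", file: "big.gpx" }
begin
  import_trace(trace)
rescue StandardError => e
  # The full backtrace goes to the server log for the operators...
  LOGGER.error("import of #{trace[:file]} failed: #{e.message}\n#{e.backtrace.join("\n")}")
  # ...while the user sees only a generic notice, with no stack trace to "tune" against.
  notify_user(trace, "Sorry, your GPX file #{trace[:file]} failed to import.")
end
```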

@openstreetmap-trac

Author: tom[at]compton.nu
[Added to the original trac issue at 12.54pm, Wednesday, 31st October 2007]

I see the department of generalised hand-waving has turned up again...

First up, you might want to apologise to the user from Spain, as this problem has nothing to do with them - the problem traces are the two private ones that you can't see in the public list, and the Spanish traces are in fact quite small.

Secondly, no, we are not stupid enough to load the entire trace into a DOM tree! Please give us some credit for not being complete morons, or at least read the code before engaging in random guesswork.

Thirdly, the only reason I haven't unjammed the system by removing the two large traces is because I'm trying to diagnose and fix the problem!

For the record, the problem has nothing to do with parsing the trace or adding it to the database - that part is working fine. The problem is creating the images, and I'm working on recoding that so it runs in a stable memory footprint.
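
As a rough illustration of the fixed-memory approach (not the actual fix that landed in [5257]), one way is two streaming passes over the points: the first finds the bounding box, the second plots straight into a fixed-size pixel buffer. The chunky_png gem and the `each_point` helper here are assumptions made for this sketch:

```ruby
require "chunky_png" # illustrative PNG library; not what the site actually uses

# Hypothetical point source: yields lat/lon pairs one at a time (here
# from a "lat,lon" text file) without holding the whole trace in memory.
def each_point(path)
  File.foreach(path) do |line|
    lat, lon = line.split(",").map(&:to_f)
    yield lat, lon
  end
end

# Memory use is bounded by the pixel buffer, not by the number of points.
def render_trace(path, out, size = 250)
  min_lat, min_lon = 90.0, 180.0
  max_lat, max_lon = -90.0, -180.0
  each_point(path) do |lat, lon|
    min_lat = lat if lat < min_lat
    max_lat = lat if lat > max_lat
    min_lon = lon if lon < min_lon
    max_lon = lon if lon > max_lon
  end
  lat_span = [max_lat - min_lat, 1e-9].max # avoid dividing by zero
  lon_span = [max_lon - min_lon, 1e-9].max

  image = ChunkyPNG::Image.new(size, size, ChunkyPNG::Color::WHITE)
  each_point(path) do |lat, lon|
    x = ((lon - min_lon) / lon_span * (size - 1)).round
    y = ((max_lat - lat) / lat_span * (size - 1)).round
    image[x, y] = ChunkyPNG::Color::BLACK
  end
  image.save(out)
end

render_trace("trace.csv", "trace.png")
```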

@openstreetmap-trac

Author: tomhughes
[Added to the original trac issue at 1.46pm, Wednesday, 31st October 2007]

(In [5257]) Rework image generation to work in a fixed amount of memory. Closes #580.
