import massively drops data in multi process mode #4651

openstreetmap-trac · 2021-07-23T11:45:14Z

Reporter: Nop
[Submitted to the original trac issue database at 1.49pm, Thursday, 25th October 2012]

osm2pgsql works fine when called with --number-processes=1. If called with a value > 1, it massively drops data. For 4 processes, the DB size is about 40% smaller than for a working import. There is no error; osm2pgsql commits the corrupted tables. Swapping behaviour in Munin looks the same in all cases. Latest SVN version, built on 64-bit Debian.

Call:
gzip -d -c update.osm.gz | \
  osm2pgsql/osm2pgsql -c --slim -d topo -p data -C 5000 \
  --number-processes=1 -S topo_import.style /dev/stdin

openstreetmap-trac · 2021-07-23T19:20:17Z

Author: amm
[Added to the original trac issue at 8.32pm, Saturday, 27th October 2012]

Assuming the bug is caused by what I think it is caused by, this only happens if there is not enough memory available to execute the forks for the helper processes.

The helper processes independently go through the pending ways / relations array with a stride length equal to the number of processes. However, if not all helper processes start up and process their share, then that fraction of the pending ways never gets processed and is missing from the rendering tables.

There was a fallback in the code that should have prevented this, but the changed number of processes was only communicated to the parent process. So the other helper processes still worked with the old number and processed the wrong share of ways / relations.
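As a rough illustration of the striding described above (not the actual osm2pgsql code), the following Python sketch simulates helpers that each take every N-th pending item. The function name `simulate` and the parameters `started` and `stride` are hypothetical and chosen only for this example. It shows how a share goes missing when fewer helpers start than the stride assumes, and how the loss disappears once the helpers use the reduced count.

```python
def simulate(pending, started, stride):
    """Return the set of pending items that get processed.

    pending: list of pending way / relation ids (illustrative data)
    started: number of helper processes that actually started
    stride:  stride length the helpers use (the process count they believe in)
    """
    processed = set()
    for worker_index in range(started):
        # Each helper starts at its own index and steps by the stride.
        processed.update(pending[worker_index::stride])
    return processed


pending = list(range(20))

# All 4 helpers start and stride by 4: everything is processed.
all_started = simulate(pending, started=4, stride=4)

# Only 3 helpers start, but they still stride by 4: one share is skipped.
stale_stride = simulate(pending, started=3, stride=4)
missing = set(pending) - stale_stride

# Only 3 helpers start and they stride by 3: nothing is skipped.
adjusted_stride = simulate(pending, started=3, stride=3)

print(len(all_started), len(stale_stride), sorted(missing), len(adjusted_stride))
```

With one of four helpers missing, a quarter of the pending items is never visited in this toy model, without any error being raised.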

This is now fixed in commit r28864.

Given that the ways were in the ways / relations table of the database, they would have been processed correctly the next time one does an update. However, given that going over pending ways / relations seems orders of magnitude slower in append mode than in initial mode, that would likely have been prohibitively expensive.

openstreetmap-trac · 2021-07-23T19:20:19Z

Author: Nop
[Added to the original trac issue at 8.02am, Monday, 29th October 2012]

Allowing overcommit of swap space with "echo 1 > /proc/sys/vm/overcommit_memory" did not work. Maybe I did not apply it properly, maybe there is a different problem.

openstreetmap-trac · 2021-07-23T19:20:21Z

Author: Nop
[Added to the original trac issue at 11.17am, Sunday, 18th November 2012]

According to Sven, overcommit is enabled on Debian by default, which explains why there was no change.

But this would indicate that the drop of data in multi-process mode is not caused by insufficient swap space as assumed.

openstreetmap-trac · 2021-07-23T19:20:24Z

Author: amm
[Added to the original trac issue at 4.42pm, Saturday, 12th January 2013]

So far I don't think I have been able to reproduce this issue.

Could you post the full log of imports both with number-processes = 1 and > 1? Also, could you do a count on all of the tables to see where the data is lost?
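A minimal sketch of how such a count could be put together, assuming the usual osm2pgsql table naming for a slim import with the prefix `data` taken from the `-p data` option in the original call. The helper `count_queries` is hypothetical and only builds the SQL statements; running them against the `topo` database is left to psql or a similar client.

```python
def count_queries(prefix):
    """Build one COUNT query per table of a slim-mode import.

    The suffix list assumes the default osm2pgsql layout: four rendering
    tables plus the three slim-mode tables.
    """
    suffixes = ["point", "line", "polygon", "roads", "nodes", "ways", "rels"]
    return [f"SELECT count(*) FROM {prefix}_{suffix};" for suffix in suffixes]


queries = count_queries("data")
for query in queries:
    print(query)
```

Comparing the output of these queries between a single-process and a multi-process import would show whether rows are missing from the rendering tables only or from the slim tables as well.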

openstreetmap-trac · 2021-07-23T19:20:26Z

Author: Nop
[Added to the original trac issue at 10.55am, Sunday, 3rd February 2013]

I have built another version from the latest SVN and conducted a series of extended tests. A huge data set was required to provoke the problem. With your fixes, it works with 4 processes and is now live on the server. There is a noticeable difference in the Munin log: the working version shows a huge peak in committed memory (ca. 35GB) during import, which was missing when data was lost before the fix (ca. 12GB).
So I assume that it is fixed now, though for slightly different reasons.

Ticket can be closed.

openstreetmap-trac added Component: osm2pgsql Priority: major Resolution: fixed Type: defect labels Jul 23, 2021

openstreetmap-trac self-assigned this Jul 23, 2021

openstreetmap-trac closed this as completed Jul 23, 2021
