In the map overlay problem, two maps, or planar graphs, are
combined to make a third, each of whose polygons represents the
intersection of one polygon from the first input map with one from
the second. For instance, if the first map's polygons are
countries and the second watersheds, then one output polygon might
be that part of Canada that drains into Hudson Bay. The overlay
problem is one of the more difficult computational issues in GIS
since the algorithm is complex, the data are large, and the
numerical inaccuracy of computers can confound even correct
algorithms, White[
Often the overlay operation is performed only to find some property of the output polygons, such as their area, or even just to list the non-empty ones. The polygons' actual vertices and edges are not used otherwise. Nevertheless, the usual method is to find the overlay polygons first, and then to calculate the desired properties.
For example, assume that we need the population of each hydrographic polygon in the USA, but have only the population of each county. One reasonable method to find the population of one particular hydrographic polygon is areal interpolation, as follows. Find each county that it overlaps, the overlapping area with each county, and then the percentage of each county that is overlapped. Assign the same percentage of the county's population to the hydrographic polygon. This does assume that each county's population is uniformly distributed, which is false, but perhaps approximately correct.
Our algorithm, OVERPROP, presents a simpler method for finding
properties of overlay polygons. It is much more efficient, can
process the largest databases, and is less sensitive to numerical
errors since they cannot cause topological difficulties. It is
based on our simpler, local topological data structures for
representing polygons and polyhedra. Its speed comes from the
uniform grid data structure. These concepts have been described in
the context of computer aided design (CAD) most recently in
Franklin[ ]. A preliminary description of this
algorithm and implementation were presented in
Franklin[
OVERPROP is fast enough to be used dynamically during a data investigation. It is not necessary explicitly to overlay all the data layers at the start of a job, as some commercial GIS packages do, unless the complete overlay is needed for other reasons, such as display, which is often the case.
Monte Carlo methods could be used approximately to calculate overlay areas. However, the accuracy improves with the square root of the number of sample points, which makes high accuracy infeasible. Also Monte Carlo methods do not extend to determining other properties, such as perimeter.
The rest of the paper is as follows. First, we will see the logical data structures, and the theoretical formulæ underlying the algorithm. We give the algorithm in two stages. First, a brief conceptual description ignores efficiency considerations. Then we see a more detailed description. Finally, we see several deeper issues, such as controlling numerical errors, design tradeoffs, and several possible extensions.