Terrain Elevation Data Structure Operations

Wm Randolph Franklin
Michael B Gousie

Goal

Operate on terrain elevation data.
Compare competing representations on large datasets
Integrate into a system

TIN
grid
contours

Operations:

Visibility
Compression
Interpolate contours to grid
Approx grid to TIN
Drainage (future)

How the Parts Fit Together

Theme

Experimental Computational Cartography
Interface between the theoreticians and the computer
Design then implement algorithms and data structures

efficiently
on the largest (not toy sized!) datasets

then

observe results and
suggest new research problems

use general purpose data structures and algorithms,

benefit from larger community effort
like using Open Source Software instead of proprietary packages.
Notable special-purpose failures: Lisp machines, floating point processors, database engines, special graphics engines, and most parallel machines.

What's New?

... since people have converted data for several decades
now.

hardware,
SW packages,
specific algorithms.

Historical Context

Various formats possible for Hypsographic (elevation) data

Triangulated Irregular Networks (TINs),
grids or arrays, and
contour lines.

Which is best?
Criteria:

complexity, size, and accuracy

need for conversion algorithms.
need for a measure of goodness for approx conversions.

simplest: RMS error.
more useful: effect on

visibility, drainage, slope

Research Program Components

Conversion from DEM to TIN
Visibility
DEM Compression
Interpolating from Contours to DEM

Conversion from DEM to TIN

I designed possibly the first TIN program in GIS, in 1973.
Now, can easily completely TIN a complete 1201x1201 DEM.

Example - Lake Champlain West DEM

1201x1201
Can process to over 1,000,000 triangles.
However, max error < 1 after 600,000 triangles.
Here is original, then after 1024, 2048, 4096, 8192, 16384, 32768, 65536 points inserted.

Extensions:

How does error correlate to number of triangles?
Is a Delauney triangulation the best?
Is a constrained triangulation, forcing features to be included, necessary?
Should we alternate inserting and deleting points?

Visibility

viewshed: What points can an observer see?
Visibility indices: How much area can each possible observer see?

Extensions:

Compute confidence intervals.

Unshaded (): almost certainly visible.
Lightly shaded (): probably visible.
Darkly shaded (): probably hidden.
Black (): almost certainly hidden.

Sample viewshed output

Following are two images:

some sample terrain from South Korea, color-coded by elevation, and
the visibility indices of every point, colored similarly.

Sample Terrain; Visibility Indices

Extensions:

siting observers: alternate the above 2 programs.
feature determination
small relation of height to visibility

Siting Observers

Programs viewshed and vix may be used to find a set of observers that jointly can see every point as follows.

Use vix to list all the points by visibility index, and hence, to find the most visible point. Place the first observer, O₁, there.
Use viewshed to find the points that O₁ cannot see.
Filter the sorted list of points to delete points that O₁ can see.
Find the most visible point that O₁ cannot see; that is the second observer, O₂.
Repeat until the set of observers can see every point.

DEM Compression

Hypothesis: The effort spent on image compression might transfer to elevations.
Differences: one channel, 16 bits, different measure of goodness
Uses: Said and Pearlman's wavelet program
Test cases: 24 USGS level-1 1201x1201 DEMs.
Today's example: level-2 30-meter DEM of Bountiful Utah.
Standard deviation of the elevations: 1096 meters,
= RMS error if the file was compressed to two bytes
W/o compression: 11 bits per point (bpp).
Lossless compression: 5.234 bpp
Lossy compression, size vs error:

BPP	RMS Error, m
5.234	0.0
1.0	6.0
0.1	51.7
0.03	148
0.01	535
0.0	1096

Original

Lossy Compression to 1 Bit/Point, RMSE=6.0 meters

Lossy Compression to 0.1 Bit/Point, RMSE=51.7 m

Lossy Compression to 0.03 Bit/Point, RMSE=148 m

Lossy Compression to 0.01 Bit/Point, RMSE=535

Interpolating from Contours to DEM

Prior Art

Interpolate in vertical planes.
PDE - Lagrange (heat flow) or thin-plate
Extend straight lines out in 8 directions to the nearest contour
Voronoi diagram / Delaunay triangulation, e.g. area stealing
Medial axis
Inverse distance weighting.
Detect features, interpolate them, then fill in.

Limits of Existing Methods

Often tested only on synthetic, small, datasets.
May require unbroken contours.
May require contours be thinned.
Generated triangles may be horizontal, or long & thin.
Generated peaks may be flat.
May generate terraces and ringing.

Figure: Interpolating 3 Nested Squares with Inverse Distance Weighting, and 3 Ways with Delaunay Triangles

Interpolation & Approximation Introduction

Desirable properties:

local control
variation minimization
interpolation
conformal
Don't want to see the contours in the result.
Peak inside top contour and valley outside outer contour.

The math is counterintuitive, and the desired properties mutually contradictory.

Goodness Criteria

Ideal:

Maximum likelihood surface fitting the data.
Requires: Formal probability model of terrain elevation.
which is still incompletely solved. (e.g., long range correlations from drainage patterns).

Next best:

What looks good?
Shaded plots
Profiles
Statistically compare curvature at generated points on/near data points against farther points. (future)

Partial differential equations (PDEs)

Laplacian, or heat-flow equation, z_xx+z_yy = 0,

where

z_xx = d²z/dx² solved by iteration on a grid,

4z_ij = z_i-1,j + z_i+1,j + z_i,j-1 + z_i,j+1.

Problem: demonstrates the terracing artifact.

The thin plate model minimizes total curvature.

z_xx² + 2z_xy² + z_yy² = 0;

20z_ij = 8(z_i-1,j + z_i+1,j + z_i,j-1 + z_i,j+1) - 2(z_i-1,j-1+z_i-1,j+1+z_i+1,j-1+z_i+1,j+1) - (z_i-2,j + z_i+2,j + z_i,j-2 + z_i,j+2)

Intermediate Contours

Our first new method repeatedly interpolates new contour lines between the original ones, somewhat similar to the medial axis. If two adjacent contour lines are: A, at elevation a, and B, at b, then we interpolate at elevation (a+b) thus.

Pick a point on A.
Find the closest point on B (approximating the gradient).
The midpoint is on the intermediate contour.

Our contrarian philosophy is that we don't find the elevation at certain points, but rather find points with a certain elevation.

Figure: Crater Lake: Original Contours and Intermediate Contour Interpolation

The Maximum Intermediate Contours (MIC) method iterates the process.

Gradient Lines

like lofting in CAGD.
Calculate gradient lines.
Refine their elevations.

Example: 900x900 piece of the Crater Lake DEM.

Figure 6. Crater Lake: Original Contours	Figure 7. Thin Plate Approx
Figure 8. Intermediate Contour Approx	Figure 9. Gradient Line Approx

Overdetermined Laplacian Solution

The problem with many existing methods is that no information flows across the contours. This causes terracing.
Make the known data points also be the average of their neighbors.
assume that there are N² unknowns; i.e., even the known points are unknown. There are now two types of equations. First, for all points:

4z_ij = z_i-1,j + z_i+1,j + z_i,j-1 + z_i,j+1

Second, for the known points, we make an equation that they are equal to their known values:

z_ij = h_ij

Do a best-fit solution.

Figure: Overdetermined Laplacian PDE, Weighting Different Equations Differently

**Table:** Error versus Weight
R	Max % Error	Mean % Error
0.1	0.27	0.01
1.0	5.5	0.6
3.0	12	2.7

Proposed Future Work

Short Term

Multiple observers covering a region.
Fast graphics cards?
Test the largest DEM datasets available.
Study other serendipitous questions that will arise.

Medium Term

Leverage from array operations tools, like Matlab?
Output sensitivity to input?
If visibility indices and viewsheds are sensitive to small input perturbations, then

Try to establish error bounds on the output.
Faster and less accurate output when input is inaccurate?

Feature recognition: feed into constrained TIN.

Long Term

Investigate the general question of the proper representation of terrain elevation data.
If using TINs, use triangular splines, not just piecewise multilinear flat triangles.
Consider the idea of a conceptually deeper representation, based on the geomorphological forces that created the terrain. Devise a basis set of operators, such as uplift, downcut, etc. Deduce the operators that created the particular terrain under consideration, and store them.
Slightly differently, represent the terrain by the features that people would use to describe it. This idea is many decades old. Can we get it to work? The problem is that you can't just say that there is a hill over there, you have to specify the hill in considerable detail, which would seem to take more space than simply listing all the elevations with a grid or TIN.
Extend to geological data structures. Plan excavation strategy for open pit mine.
The major creative force for terrestrial terrain is water erosion. This does not apply to the moon, Venus, and, to the same extent, to Mars. Therefore those surfaces might be statistically different. What impact, if any, would this have on the speed and output distribution of our algorithms and data structures?

How the Parts Fit Together