Connected Components Test Results 2 D (Research)

Altho connect was designed for 3D data, it can also process 2D data as a trivial special case. Sharad Seth kindly provided an image of a USGS map. The original was about 20 inches square, and was scanned at 1000 dpi, giving an image of size 18573 by 19110.

It's difficult to show such an image. However, here it is, but reduced a factor of ten in each dimension (with pbmreduce). Click on the image to see a larger version.

Connect took 25 CPU seconds on a 1600 MHz Pentium to find the 32858 components. Here is the log. One notable point is the components' complexity; the average number of runs per component is 118.

***** CONNECT starting at Wed Oct 27 10:57:19 1999
*****
nx=1, ny=18573, nz=19110, conn=6, key=0
Cumulative CPU time thru initialization= 0.01
Cumulative CPU time thru processing planes= 132.56
Cumulative CPU time thru check_tree_depth= 134.65
Cumulative CPU time thru assemble_components= 136.64
Component #0 with volume 1483604 is too large to write.
Component #1 with volume 2408617 is too large to write.
Component #78 with volume 2425918 is too large to write.
Component #125 with volume 12501468 is too large to write.
Component #8079 with volume 888103 is too large to write.
Component #14543 with volume 194877 is too large to write.
Component #15480 with volume 265049 is too large to write.
Component #18039 with volume 331616 is too large to write.
Component #19856 with volume 355632 is too large to write.
Component #22526 with volume 781705 is too large to write.
Component #32823 with volume 1460148 is too large to write.
Cumulative CPU time thru sort_components= 142.71
# volume=1 components not written: 397
Cumulative CPU time thru for write_components= 164.03
Universe has 1*18573*19110=354930030 voxels.
Type of connectivity: 6
# empty voxels= 47462485, fraction= 0.133723
# runs= 3889428, average size=12.2029
# components= 32858, average # runs/component=118.371
# times a pair of possibly adjacent runs tested, but they didn't
overlap= 3885023
# times a pair of adjacent runs processed= 3875252
# times two adjacent runs were already in the same component= 18682
# pointers followed when processing adjacent runs= 7969486
Average # pointers followed per run= 2.04901
# times find_root_of_run called= 7750504, average # pointers followed per call= 0.514127
# times update_root called= 7750504, average # pointers followed per call= 0.514127
# times find_roots finds a run not pointing directly to its root=0
# bytes used by one run= 12, total run storage= 48000000
# bytes used by one component= 16, total component storage= 7200000
rows1 storage= 74300
Total storage used by large arrays= 55274300
Cumulative CPU time thru end= 164.05

As usual, the ancillary programs to pre- and postprocess the data take much more time and space, and many refuse to process a 350M pixel image. Here's a 2000 by 2000 section of the output, with each component randomly colored. Click on it to see a larger version.

There may be other fast 2D connected component programs. We are not claiming that our 3D program is also the fastest 2D program, but only that it is as good as the others here.