W Randolph Franklin home page
... (old version)
Research/ home page Login

WR FranklinResearch


Connect is a very efficient implementation on a laptop computer of the union-find algorithm for finding connected components when the input is more than a 1000x1000x1000 3D box of binary voxels. Each voxel may be connected either to its 6 orthogonal neighbors, or to all 26 neighbors. Component properties, such as area, and volume are also computed. This implementation is fast enough to permit experimental studies of random connectivity.

Determining the connected components of a 2D or 3D array of bits is a required operation in domains such as the following:

  1. Investigating the distribution of cracks in concrete as it is stressed to failure (which is what motivated this research),
  2. Extraction of connected components in image processing, and
  3. Determination of porosity of an underground fluid reservoir.

Connect is very space- and time-efficient. The largest test case had a universe of size 1024x1088x1088, which has 1,212,153,856 voxels, about 1/2 empty and found the 4,539,562 components. The implementation environment was a 2GHz IBM T43p laptop with SuSE linux and gcc 4.1.0. The virtual memory used, which is a function of the space preallocated for runs and components, which was sized for this example, was only 340MB. The program's elapsed time was equal to its CPU time, which was 51 CPU seconds. Smaller examples run proportionately faster, in perhaps 0.1 seconds for a 100x100x100 universe. Very complicated cases would run more slowly.


  1. bibtexsummary:[/wrf.bib,wrf-connect-fwcg2006]
    two page abstract and Talk.
  2. bibtexsummary:[/wrf.bib,capri01]
  3. bibtexsummary:[/wrf.bib,acse2003]
  4. bibtexsummary:[/wrf.bib,landis2006]
    Abstract: Concrete and cracking are nearly synonymous despite our best efforts and intentions. Relationships between cracking and the stress states that lead to cracking can be instructive. In an effort to better understand these relationships, X-ray microtomography was used to make high-resolution three-dimensional digital images of small concrete specimens under load. Using 3D image analysis, quantitative measurements of internal crack growth were made that include effects of crack tortuosity, branching and microcracking. Successive images at different levels of cracking and damage provide us with a detailed picture of internal crack progression. When coupled with load-deformation response, bulk material properties such as fracture toughness or damage variables can be quantitatively linked with cracking. Measurements to date have shown distinct fracture regimes linked to crack formation and propagation. In addition, the crack measurements offer a way to provide a physical basis for a scalar damage variable.
    Keywords: Concrete - Fracture - Damage - Tomography

Prior Art

This is a very incomplete list, which will be enlarged at some time. The only reference actually used was a description of the union-find problem in, probably, Aho, Hopcroft and Ullmann.

Searching on 3d connected components produces several refs including implementations, altho apparently always on smaller datasets.


This implements connected component algorithms on parallel machines like the CM-5 and SP-2 for 2-D 1024x1024 images. There are many references.

Implementation Details

  1. Implementation
  2. Test Results (sizes are approx)
    1. 1024 Cubed (approx) - our largest case
    2. 512 Cubed - including distribution of component volumes, and volume-area relationship of components, which suggests their fractal dimension
    3. Random - each voxel is independently 0 or 1, with probability p. How does the number of components vary with p, with either 6 or 26 connectivity?
    4. 2D

2018 update

  1. The executable file that I compiled in 2007 still runs today.
  2. However the source does not compile any more. C++ programs that were legal in 2007 do not always compile in 2018. I've updated my source to compile again.
  3. How does the p1024 test compare then and now?
    1. The output files generated by the 2 executables are identical.
    2. In 2007, that test took about 52 secs.
    3. On a 2.8GHz Lenovo Thinkpad yoga, the p1024 test runs in 11 sec, which is 4x faster than the time seen in 2007. Most of that time is probably still I/O.