Contiguity principle for geographic units: evidence on the quantity, degree, and location of Public Use Microdata Area (PUMA) fragmentation

Issue: Volume 7, Issue 2, 2013


Social scientists investigating how context varies by geographical location, or how macro-level phenomena affect individual outcomes, often use U.S. Census Bureau Public Use Microdata Sample (PUMS) files, in which micro-units can only be geographically located to Public Use Microdata Area (PUMA) polygons. Most spatial analyses that use PUMAs ignore the fact that many of them are multipart polygons: spatially separated polygons that share the same attribute and are stored as a single feature in a vector file. We briefly discuss the theoretical premises of how geographical boundaries are created for macro units and investigate the quantity, degree, and location of PUMA fragmentation. We argue that the basic contiguity principle in spatial dependence analysis (the assumption that spatial analysis uses polygon centroids for solid and contiguous geographic units) is violated by many PUMAs in the U.S. mainland, with Texas, California, Tennessee, and Illinois meriting special attention. Future research should outline a method for handling multipart polygons in spatial and hierarchical analyses.
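
To illustrate why multipart polygons can undermine centroid-based analysis, the following minimal Python sketch computes the area-weighted centroid of a two-part polygon and checks whether it falls inside either part. The coordinates are made-up squares rather than real PUMA boundaries, and the function names (ring_area_centroid, multipart_centroid, point_in_ring) are hypothetical helpers written for this illustration, not part of any GIS package or of the authors' method.

```python
# Illustrative sketch only: the shapes are invented squares, not real PUMAs.

def ring_area_centroid(ring):
    """Return (area, (cx, cy)) for a simple polygon ring (shoelace formula)."""
    a = cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5  # signed area
    return abs(a), (cx / (6.0 * a), cy / (6.0 * a))


def multipart_centroid(parts):
    """Area-weighted centroid of a multipart polygon given as a list of rings."""
    stats = [ring_area_centroid(p) for p in parts]
    total = sum(area for area, _ in stats)
    x = sum(area * c[0] for area, c in stats) / total
    y = sum(area * c[1] for area, c in stats) / total
    return x, y


def point_in_ring(pt, ring):
    """Ray-casting test: True if pt lies strictly inside the ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Two spatially separated parts stored as one hypothetical feature.
west = [(0, 0), (2, 0), (2, 2), (0, 2)]
east = [(10, 0), (12, 0), (12, 2), (10, 2)]
fragmented_unit = [west, east]

centroid = multipart_centroid(fragmented_unit)
centroid_inside_unit = any(point_in_ring(centroid, part) for part in fragmented_unit)
print(centroid, centroid_inside_unit)
```

In this toy case the centroid lands in the gap between the two parts, so a distance or neighbor calculation based on it would refer to a location that belongs to neither part of the unit.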

Authors' Affiliations

Carlos Siordia(a)*, Douglas F. Wunneburger(b)
(a) NRSA Department of Epidemiology at the University of Pittsburgh, USA
(b) Department of Landscape Architecture and Urban Planning at Texas A&M University, USA
* Corresponding author. Email:


