Abstract

Background
The relationship between health services and population outcomes is an important area of public health research that requires bringing together data on outcomes and the relevant service environment. Linking independent, existing datasets geographically is potentially an efficient approach; however, it raises a number of methodological issues that have not been extensively explored. This sensitivity analysis explores the potential misclassification error introduced when a sample rather than a census of health facilities is used and when household survey clusters are geographically displaced for confidentiality.

Methods
Using the 2007 Rwanda Service Provision Assessment (RSPA) of all public health facilities and the 2007–2008 Rwanda Interim Demographic and Health Survey (RIDHS), five health facility samples and five household cluster displacements were created to simulate typical SPA samples and household cluster datasets. Each facility dataset (the census and the five samples) was matched with each cluster dataset (the undisplaced clusters and the five displacements) to create 36 paired datasets. Four geographic techniques were employed to link clusters with facilities in each paired dataset. The links between clusters and facilities were operationalized by creating health service variables from the RSPA and attaching them to the linked RIDHS clusters. Comparing the pairings built from facility samples and displaced clusters against the original pairing of the facility census with undisplaced clusters enabled measurement of the error due to sampling and displacement.

Results
Facility sampling produced larger misclassification errors than cluster displacement, underestimating access to services. Distance to the nearest facility was misclassified for over 50% of the clusters when directly linked, while linking to all facilities within an administrative boundary produced the lowest misclassification error.
Measuring the relative service environment produced equally poor results: over half of the clusters were assigned to the incorrect quintile when linked with a sample of facilities, and more than one-third were misclassified due to displacement.

Conclusions
At low levels of geographic disaggregation, linking independent facility samples and household clusters is not recommended. Linking facility census data with population data at the cluster level is possible, but misclassification errors associated with geographic displacement of clusters will bias estimates of relationships between service environment and health outcomes. The potential need to link facility and population-based data requires consideration when designing a facility survey.
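
The comparison logic described in the Methods can be illustrated with a minimal sketch. It is not the study's implementation: it uses synthetic planar coordinates in kilometres rather than RSPA or RIDHS data, covers only a nearest-facility link rather than all four linking techniques, and assumes a uniform random displacement of up to 5 km and a 50% facility sample. The function names (`displace`, `nearest_facility`, `misclassification_rate`) are hypothetical and chosen for illustration only.

```python
import math
import random


def displace(point, max_km, rng):
    """Move a point a random distance (0..max_km) in a random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist = rng.uniform(0.0, max_km)
    return (point[0] + dist * math.cos(angle), point[1] + dist * math.sin(angle))


def nearest_facility(point, facilities):
    """Return the id of the facility closest to point (straight-line distance)."""
    return min(facilities, key=lambda fid: math.dist(point, facilities[fid]))


def misclassification_rate(ref_clusters, test_clusters, ref_facilities, test_facilities):
    """Share of clusters whose nearest-facility link differs from the reference link."""
    wrong = sum(
        nearest_facility(ref_clusters[cid], ref_facilities)
        != nearest_facility(test_clusters[cid], test_facilities)
        for cid in ref_clusters
    )
    return wrong / len(ref_clusters)


# Synthetic data (assumption): 50 facilities and 200 clusters on a 100 km square.
rng = random.Random(0)
facilities = {f"F{i}": (rng.uniform(0, 100), rng.uniform(0, 100)) for i in range(50)}
clusters = {f"C{i}": (rng.uniform(0, 100), rng.uniform(0, 100)) for i in range(200)}

# One displaced copy of the clusters (assumed 5 km maximum displacement).
displaced = {cid: displace(p, 5.0, rng) for cid, p in clusters.items()}

# One facility sample (assumed 50% simple random sample of the census).
sample_ids = rng.sample(sorted(facilities), 25)
facility_sample = {fid: facilities[fid] for fid in sample_ids}

# Reference pairing: facility census with undisplaced clusters.
err_displacement = misclassification_rate(clusters, displaced, facilities, facilities)
err_sampling = misclassification_rate(clusters, clusters, facilities, facility_sample)
print(f"misclassified by displacement: {err_displacement:.0%}")
print(f"misclassified by sampling:     {err_sampling:.0%}")
```

Repeating the last step over several displacements and several samples, and pairing every facility dataset with every cluster dataset, would give a grid of comparisons analogous to the paired datasets described above. The rates printed for this toy data depend entirely on the assumed densities and radii and should not be read as estimates for Rwanda.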