Determining a cutoff point for identifying the true pairs probabilistic record linkage database

The aim of this study was to propose cut-off points for scores calculated in the probabilistic record linkage process for several cancer topographies. In this study we used the PBCR-SP database composed of 343,306 incident cancer cases from the municipality of São Paulo, registered from 1997 through 2005, aged from less than one to 106 years, of both sexes. PRO-AIM and APAC-SIA/SUS databases were used to probabilistic record linkage using Reclink III software. Area under the curve, sensitivity and specificity values were calculated to determine the cut-off point with the highest accuracy in identifying true matches. In the topography analyses, it was found that the cut-off at score 18 showed good accuracy, with sensitivity ranging from 73.7 to 96.7% and specificity ranging from 98.5 to 99.4%. We concluded that above score 18 nearly all true pairs were found. Whereas, below this cut-off, less than 1% of linked records were true matches.