Similarity Scoring Rules

Description | Q similar | Q identical
Differences in the ‘missing/not applicable’ categories only
Differences in the number of identifier coding levels
Differences in number of coding levels for count values
Imputation indicator in one LS v LS with no imputation indicators used
Variables based on ‘tick all that apply’ census questions where one LS has all answers combined into one variable and another has a separate variable for each applicable category
No equivalent variable match between LSs (either because no equivalent variable was derived or there was no equivalent question on census)
Distance variables rounded to different decimal places
1991 variables coded only to 10% v equivalent variable coded to 100%
Postcodes/output areas etc: enumeration address
Postcodes/output areas etc: previous address/place of work


