The notes below describe features of the dictionary structures that users may find helpful to keep in mind when using the dictionary.
NILS variables
- NILS does not have a separate ‘short description’ and ‘full description’ text field in its data dictionaries. This means that several of the NILS variable descriptions are very short or are truncated due to exceeding the field’s character limit. If this impedes your understanding of the variable, please contact firstname.lastname@example.org for clarification.
- From the 2001 census onwards NILS has combined LS member and non-member data. Because of this the ‘NILS Member indicator’ variable must be used in conjunction with the variable of interest to distinguish one from the other. When looking at similarity scores, this means that one NILS variable is often matched to 2 variables from the other LSs (from their respective member and non-member tables).
- The NILS team do not perform any coding, recoding or variable derivation; the data they make available is in the format provided by Census. ONS LS and SLS carry out some work on derived variables and coding revisions, which in turn has created some inconsistencies. This is why there are no additional coding notes in NILS variable descriptions.
- In 2011, NILS uses the term ‘disability’ instead of ‘limiting long-term illness’ (or LLTI). Researchers are advised to use all of these terms in their variable searches.
- In all Census years, NILS variables can occasionally appear to have far fewer categories than their ONS LS and SLS counterparts despite being based on the same (or a very similar) original Census question. This is because the NILS categories are based on actual responses rather than on all available options, so gaps exist where a particular option received no responses. This is particularly evident in variables with a number range (e.g. number of people in household) or a large classification list (e.g. religion).
- NILS-RSU have produced a thorough guidance document (PDF 2MB) designed to be used in conjunction with their data dictionary.
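The point above about using the ‘NILS Member indicator’ alongside the variable of interest can be sketched in code. This is a minimal illustration only: the field names `member_indicator` and `religion`, the category codes, and the coding of 1 for members and 0 for non-members are all hypothetical and are not taken from the NILS dictionary.

```python
from collections import Counter

# Hypothetical extract, one row per person. Field names and codes are
# assumptions made for illustration, not real NILS names or values.
records = [
    {"member_indicator": 1, "religion": "A"},
    {"member_indicator": 0, "religion": "A"},
    {"member_indicator": 1, "religion": "B"},
    {"member_indicator": 0, "religion": "B"},
    {"member_indicator": 1, "religion": "A"},
]


def tabulate_by_membership(rows, variable, indicator="member_indicator"):
    """Count values of `variable` separately for members and non-members."""
    counts = {"member": Counter(), "non_member": Counter()}
    for row in rows:
        # Assumed coding: 1 marks an LS member, anything else a non-member.
        group = "member" if row[indicator] == 1 else "non_member"
        counts[group][row[variable]] += 1
    return counts


result = tabulate_by_membership(records, "religion")
print(result["member"])
print(result["non_member"])
```

Splitting on the indicator first keeps the two groups separate, which mirrors the separate member and non-member tables held by the other LSs.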
ONS LS variables
- The word ‘gender’ is only used in the ONS LS dictionary from 2011 onwards, so researchers are advised to search for both ‘sex’ and ‘gender’ to avoid missing relevant variables from their search.
- There are some instances in the 1991 ONS LS where Scottish census data is recorded and held in variables that relate specifically to Scotland (such as Gaelic language and Scottish Output Areas). These variables came about when a small number of Scottish census forms were processed in England, e.g. when Scottish residents were visiting households in England on census night with their Scottish census forms.
- Certain variables in the ONS LS dictionary appear to be equivalent to the SLS and NILS variables, but their descriptions state that the variable is only populated if discrepant with the LS member’s CORE file, e.g. sex and date of birth. These have therefore been assigned a similarity score that shows they are incompatible across LSs. In 2011, however, this changed: the ONS LS variables SEX11 and DOBYR11 were as recorded at census and are thus compatible across LSs. For these reasons, researchers are advised to use both Census and CORE variables.
SLS variables
- Variables relating to place of work at 1991 were not coded to 100% in SLS. This was because the SLS recoding from original census form was carried out in 2002-2004, over 10 years after the census. Workplaces were difficult to establish by then due to closures within that period as well as changes to the way records were held on computer.
- For reasons similar to those above, the database required to code qualifications at 1991 (educational institute and study subjects) was not available to SLS. Work on qualification coding is complicated and time consuming, and hence only the highest level of qualification is captured in 1991.
- In 2011, several of the SLS variables report a ‘Known quality issue’ in their description. A link is provided to a fuller explanation of the issue, which researchers are advised to take note of when using these variables in their study. The similarity score assigned to these variables has not been adjusted to account for these issues.
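The advice in these notes to search on several terms at once (‘sex’ and ‘gender’, or ‘disability’ and ‘limiting long-term illness’/LLTI) can be illustrated with a small sketch. Everything here is hypothetical: the entry fields, variable names, table names and descriptions are invented for illustration and do not reflect how the data dictionary is actually stored or queried.

```python
# Hypothetical dictionary entries. Names, tables and descriptions are
# invented for illustration and do not come from any real LS dictionary.
entries = [
    {"ls": "ONS LS", "table": "T1", "name": "VAR_A",
     "description": "Sex of LS member"},
    {"ls": "ONS LS", "table": "T2", "name": "VAR_B",
     "description": "Gender as recorded at census"},
    {"ls": "NILS", "table": "T3", "name": "VAR_C",
     "description": "Disability"},
    {"ls": "SLS", "table": "T4", "name": "VAR_D",
     "description": "Limiting long-term illness (LLTI)"},
    {"ls": "SLS", "table": "T5", "name": "VAR_E",
     "description": "Country of birth"},
]

# Terms that these notes advise searching together.
SYNONYMS = {
    "sex": ["sex", "gender"],
    "disability": ["disability", "limiting long-term illness", "llti"],
}


def search(dictionary_entries, concept):
    """Return names of entries whose description mentions any synonym."""
    terms = SYNONYMS.get(concept, [concept])
    hits = []
    for entry in dictionary_entries:
        text = entry["description"].lower()
        if any(term in text for term in terms):
            hits.append(entry["name"])
    return hits


sex_hits = search(entries, "sex")
disability_hits = search(entries, "disability")
print(sex_hits)
print(disability_hits)
```

The search deliberately ignores the `ls` and `table` fields, so matches are returned from every table and every LS. Plain substring matching is crude and can over-match, so results still need checking by eye.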
Variations across LSs
- At 1991, ONS LS and SLS use different naming conventions for recoded variables, which are marked using ‘9’ and ‘T9’/‘TEN9’ suffixes. The differences arose because the two LSs used different coding methods when fully coding variables that were initially only taken from a 10% sample. See individual variables for guidance, but in general:
  - ONS LS – where there exists both a ‘9’ and a ‘T9’ variable: ONS are currently investigating the differences and recommend that both variables be selected in the meantime.
  - SLS – where there exists both a ‘9’ and a ‘T9’/‘TEN9’ variable: SLS recommend the ‘T9’/‘TEN9’ variables as they have been corrected for greater accuracy.
- SLS variables up until the 2011 census do not contain imputed data. This is why their variables are often found to contain more ‘missing’ categories than in ONS LS or NILS. ONS LS and NILS variables that contain imputed data are often coupled with an imputation indicator variable. Other variations in the number of missing categories arise either because all questions were fully completed for one LS’s member sample but not for the others, or because the LSs differed in how thoroughly they were able to code ‘hard to code’ variables. This is particularly relevant in 1991 and earlier, when forms had to be transcribed by hand.
- For 1991, ONS LS contains variables that refer to both ‘de facto’ and ‘de jure’ households. ‘De facto’ households are those that had no usual residents, whereas ‘de jure’ households contained at least one usual resident. As all SLS members must be usual residents, the SLS only contains ‘de jure’ household variables. This also applies to the 2001 and 2011 census years.
- There are numerous differences in the religion variables across the 3 LSs, largely on account of the original census questions being notably different. It is therefore advised that researchers refer to the original questions when using these variables.
- The 2011 Census introduced several new questions, one of which concerns spoken languages. Whilst the England & Wales and Northern Ireland censuses asked for ‘main language’, the Scottish census did not. For this reason, the 2011 SLS language variables are not directly compatible with those of NILS and ONS LS, and should be compared with caution.
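The general guidance on ‘9’ versus ‘T9’/‘TEN9’ variables at the start of this section can be written down as a small selection rule. This is a sketch under stated assumptions, not a documented procedure: it assumes that a pair of variables shares a common stem followed by one of the suffixes, and the variable names used (`ABC9`, `ABCT9` and so on) are hypothetical. Individual variable guidance should always take precedence.

```python
def split_suffix(name):
    """Split a name into (stem, suffix), assuming a '9'/'T9'/'TEN9' suffix."""
    for suffix in ("TEN9", "T9", "9"):  # test the longest suffix first
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)], suffix
    return name, ""


def recommended(names, ls):
    """Apply the general 1991 guidance for the given LS ('ONS LS' or 'SLS')."""
    if ls not in ("ONS LS", "SLS"):
        raise ValueError("guidance is only given for ONS LS and SLS")
    groups = {}
    for name in names:
        stem, suffix = split_suffix(name)
        groups.setdefault(stem, {})[suffix] = name
    chosen = []
    for variants in groups.values():
        plain = variants.get("9")
        recoded = [variants[s] for s in ("T9", "TEN9") if s in variants]
        if plain and recoded:
            if ls == "ONS LS":
                chosen.append(plain)  # ONS: select both in the meantime
            chosen.extend(recoded)    # SLS: prefer the corrected variables
        else:
            chosen.extend(variants.values())  # no pair, so nothing to choose
    return chosen


# Hypothetical variable names, invented for illustration only.
names = ["ABC9", "ABCT9", "XYZ9", "XYZTEN9", "QRS9"]
sls_choice = recommended(names, "SLS")
ons_choice = recommended(names, "ONS LS")
print(sls_choice)
print(ons_choice)
```

A stem that itself ends in ‘T’ would be ambiguous under this assumed naming pattern, which is another reason to check each variable’s own description.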
Variations across time
- Variables are organised differently for different time periods and across the 3 LSs e.g. in 2001, the household and communal establishment variables are separated into different tables in SLS, but are combined in NILS and ONS LS. Therefore it is advised that when looking for similar variables across LSs, you do not restrict your search to a single table.
- In 1991, students were enumerated at their home address and were classified as ‘usual residents’ of that address even if they were living away from home during term time. They were therefore included in the family classification. In 2001 and 2011, students were enumerated at their term-time address. If they were also included on the form for their home address, they were not classified as usual residents there and were excluded from the family classification.
- The concept of a Household Reference Person (HRP) was introduced in the 2001 Census to replace the traditional concept of the ‘head of household’. Only ONS LS has head of household variables in 2011. Researchers should be aware of the definitions if comparing across time.
- The ‘Similar variables across time within [LS]’ field in the data dictionary will begin to be populated in the near future, initially only for a selection of the most commonly used variables.