Measuring income in the UK Census (SLS)
Tom Clemens, SLS-DSU
Asking about income in UK surveys or in the census is difficult and controversial. People in places like the US and Scandinavia tend to be less protective about how much they earn, but in the UK income is often considered private and sensitive, so many people refuse to answer questionnaires and surveys that ask for it, including the UK census. Despite regular debates about its inclusion in the years leading up to previous censuses, a question about income has never been included. This creates a problem for researchers who want to use the census (and other data sources) for social research, because income is often an extremely important piece of information when trying to understand the effect of poverty on the population. Other measures are often used instead, including area-based measures of deprivation and measures of socio-economic position such as education or social class, but these often measure something different and do not capture the particular effects of living on a low income very well.
At the Scottish Longitudinal Study, we have developed a method to address this problem by calculating a “synthetic” measure that estimates individual weekly wage. The method is based on the detailed occupation information contained in the Standard Occupational Classification (SOC). SOC is a hierarchical variable: it contains descriptions for around 350 different types of job, which are nested within broader job description categories. This approach differs from previous occupation-based measures, such as occupation-based social class, because it uses all of this highly detailed SOC information to calculate a continuous estimate of weekly wage. Other approaches discard this detail by aggregating to higher-level occupation categories. More details about the precise methodology used to derive the estimates, and about their performance against other measures of socio-economic position, can be found in a journal article (open access) and an SLS working paper.
As part of the project, a Stata program has been written so that users can produce the estimates in their own research projects. All you need in your dataset are the following variables, coded as shown:
- Individual single year of age
- Sex (0 for females and 1 for males)
- SOC-coded occupation (the three-digit format for the SOC90 version, which was introduced in 1990, or the four-digit format for the SOC2000 version, which was introduced in 2001)
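Before running the program it can help to confirm that your variables match this coding. The following is a minimal sketch and is not part of the SLS program: the variable names age, sex, sex01 and soc90 are hypothetical, it assumes the source data code sex as 1 for males and 2 for females, and it assumes the SOC90 code is stored as a numeric variable.

```stata
* Hypothetical variable names throughout; replace with those in your dataset.

* Assumed source coding: 1 = male, 2 = female.
* Recode to the required 0 = female, 1 = male; other values are copied unchanged.
recode sex (2 = 0) (1 = 1), generate(sex01)
assert inlist(sex01, 0, 1) if !missing(sex01)

* Age should be a single year of age, i.e. a whole number.
assert age == floor(age) if !missing(age)

* SOC90 codes are assumed to be numeric with three digits.
assert inrange(soc90, 100, 999) if !missing(soc90)
```

Each assert stops with an error if any observation fails its condition, so a clean run means that all non-missing values conform to the expected coding.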
Once you have these variables you will need the Stata programs “salaryest20” and “salaryest90”, which are simply Stata ado files. These will be available from SLS support officers on request. To install them, select the text in each ado file and run it in Stata, which will automatically install the program. Once you have done this, you can run the programs with the following syntax:
For estimating weekly wage based on SOC90:
salaryest90 newvarname, age(age varname) sex(sex varname) soc(soc90 varname)
For estimating weekly wage based on SOC2000:
salaryest20 newvarname, age(age varname) sex(sex varname) soc(soc2000 varname)
Here newvarname, age varname and sex varname should be replaced with, respectively, the name you want to give the new wage variable, the name of your age variable and the name of your sex variable; soc90 varname and soc2000 varname should be replaced with the names of the SOC90 and SOC2000 variables in your dataset. The command will then produce a new variable containing estimated weekly wage and will output descriptive information about this new variable. The commands can be used in any dataset that lacks income but includes information about age, sex and SOC occupation. Anyone who is interested in using these commands in an SLS project should talk to their support officer, who can assist them.
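As an illustration of the SOC90 command, the sketch below assumes that the ado files have already been installed and that the dataset contains variables named age, sex01 and soc90. These names, and the new variable name wage90, are hypothetical. The follow-up commands are standard Stata and are not part of the SLS programs.

```stata
* Hypothetical variable names: age, sex01 (0 = female, 1 = male), soc90.
* Creates wage90, the estimated weekly wage, using the syntax given above.
salaryest90 wage90, age(age) sex(sex01) soc(soc90)

* Inspect the distribution of the new variable.
summarize wage90, detail

* Compare mean and median estimated wage by sex.
tabstat wage90, by(sex01) statistics(mean median n)
```

For SOC2000 data the same pattern would apply with salaryest20 and a four-digit SOC2000 variable in place of soc90.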