SLS migration research to be enhanced with continuous GP registration data
Dawn Everington, SLS-DSU
SLS-DSU have been working on an exciting new data development which will soon be available to researchers. We have been given access to the postcode of residence and the date of each NHS GP registration since 1 January 2000. This provides users with a means of locating their SLS sample continuously rather than only once every 10 years at the time of the censuses.
Besides projects primarily interested in internal migration, these data will also be useful to those investigating how the local environment affects outcomes, such as the recent project which looked at proximity to green space, forests and health services. The length of time spent at each address could be incorporated into analyses, or location might be explored in relation to wider policy measures or events such as the economic recession.
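As a rough illustration of how the length of time spent at each address might be derived from registration records, here is a minimal Python sketch. The record layout, the postcodes, the end-of-follow-up date and the function name `residence_spells` are all assumptions made up for this example; they do not describe the actual SLS variables or processing.

```python
from datetime import date
from itertools import groupby

# Hypothetical registration records for one sample member:
# (registration date, postcode of residence). All values are made up.
registrations = [
    (date(2000, 1, 1), "EH1 1AA"),
    (date(2003, 6, 15), "EH1 1AA"),  # re-registration at the same postcode
    (date(2007, 9, 1), "G12 8QQ"),
    (date(2012, 3, 10), "AB10 1XG"),
]


def residence_spells(records, end_of_follow_up):
    """Collapse registrations into spells of continuous residence at one postcode.

    Returns a list of (postcode, start, end, days) tuples in date order.
    """
    ordered = sorted(records)
    # Keep the first registration date of each run of identical postcodes.
    starts = [
        (next(run)[0], postcode)
        for postcode, run in groupby(ordered, key=lambda rec: rec[1])
    ]
    spells = []
    for i, (start, postcode) in enumerate(starts):
        # A spell ends when the next one begins, or at the end of follow-up.
        end = starts[i + 1][0] if i + 1 < len(starts) else end_of_follow_up
        spells.append((postcode, start, end, (end - start).days))
    return spells


spells = residence_spells(registrations, end_of_follow_up=date(2016, 3, 10))
for postcode, start, end, days in spells:
    print(postcode, start, end, days)
```

The sketch treats a registration date as the start of residence at that postcode, which is itself an assumption: people may register with a GP some time after they move.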
An early test dataset was supplied to project 2016_003 ‘Economic change and internal population dynamics: an innovative study of new residential mobilities in Scotland’. Results from these analyses have been presented at several seminars and conferences (see list at the bottom of the project page) and there are plans to publish papers.
The online data dictionary has now been updated with Table E10, which contains the raw data and some derived variables. Although many of these cannot be accessed by researchers due to the risk of disclosure (marked as restriction level 2), we are in the process of producing further derived variables, such as flags, which users can access. We will soon produce a working paper which will document the data sources and processing of the data, describe the variables in Table E10, compare the enumeration postcodes with the postcodes recorded in the NHS data, and provide other information that will be helpful when using and interpreting these data.
Zengyi Huang, SLS-DSU
The SLS Birth Cohort of 1936 (SLSBC1936) is now available to external researchers. This cohort is structured around the existing SLS. We took the SLS birth date sample from the Scottish Mental Survey of 1947 (a cognitive ability test that included almost all Scottish children born in 1936) and linked it to the 1939 National Register, the NHS Central Register and the SLS. The outcome of the project is a powerful life-course dataset containing information from childhood to old age.
A new SLS Technical Working Paper 7: ‘The Scottish Longitudinal Study 1936 Birth Cohort’ describes the methodology used in creating the SLSBC1936, the quality of linkage and the data included in this cohort. A short description of this data linkage project can be found on the poster: The creation of an administration data based 1936 Birth Cohort Study.
The SLSBC1936 is a general-purpose resource, which is available to researchers via the SLS administration. Anyone interested in accessing this dataset should contact the SLS-DSU at email@example.com.
More detail: SLS Technical Working Paper 7
(download as a PDF 958kB)
The Northern Ireland Longitudinal Study, ONS Longitudinal Study (England and Wales), and Scottish Longitudinal Study include a vast range of data relevant to many different types of research questions. Their combination of administrative, census and health data across time makes them a rich and unique set of resources. Examples of the types of research enabled by these features of the LSs include: The role of subject choices in secondary education on further education studies and labour market outcomes and Population characteristics of stigma, condition disclosure and chronic health conditions.
As an exploration of the many ways in which the LSs have been used, CALLS have conducted an analysis of the journal papers produced by LS researchers.
This citation analysis demonstrates the impressive range of academic fields to which LS-based research has contributed in the last 6 years. Research featured in almost 60 journals, and spanned more than 40 Scopus subject categories.
Research based on the LSs is regularly published in top-quality international peer-reviewed journals such as Demography, the International Journal of Epidemiology and Population, Space and Place. Fifteen papers in the citation analysis were published in journals ranked within the top 5 for their field (articles ranked by SJR Impact Rating for the relevant subject category in the publication year).
|LS||n papers published||Total citation count|
|NILS||29||119 (avg 4.1)|
|ONS LS||51||264 (avg 5.2)|
|SLS||32||259 (avg 8.1)|
|All LSs||106||588 (avg 5.6)|
Papers had excellent citation rates, indicating acknowledgement of the unique contributions LS data offer. Papers published within the last 2-3 years were amongst the most highly cited. Eighteen papers had been cited 10 or more times.
The subject areas of papers using the LSs reflect the strengths of the data that they offer: SLS and NILS had a higher proportion of health-related papers, likely due to their excellent linkages with health data. The subject categories for the LSs also reflect these variations: whilst the categories were very similar, ONS LS’s top 5 included ‘Demography’, whereas those of the SLS and NILS included ‘Health(social science)’.
Overall the analysis shows the valuable contribution of the NILS, ONS LS and SLS to a diverse range of academic fields including medicine, demography, geography, economics, business, psychology, environmental science and more.
Although we only focus on publications in academic journals here, LS research has considerable impact in other formats such as briefing notes, books and presentations to government, and has also formed part of a variety of PhD theses. The full list of outputs can be explored in our Outputs database.
The raw data for the analysis can be downloaded at the bottom of this page.
Using the CALLS Hub outputs database, a total of 106 published papers from the period January 2010 – May 2016 were identified from the three LSs. It should be noted that whilst CALLS and the RSUs actively solicit LS users to record all outputs, and also conduct literature searches to maximise capture, it is possible that some further papers exist.
All papers published in journals or regularly produced official publications – such as ONS Population Trends – were included. We did not include working papers in this analysis. Citation counts were gathered from Scopus, taking the final counts as of 30 June 2016. Impact Factors were taken from the Scopus project SCImago using the SJR2 indicator.
The LSs combined
Of the 106 papers identified, 16 were from non-peer-reviewed journals such as Population Trends. Four papers used more than one LS for their analysis (see figure 1).
figure 1. Number of published papers per LS, Jan 2010 – May 2016. n = 106
Papers from the three LSs were published in a total of 59 different journals, spanning 41 SCImago Subject Categories in 11 Subject Areas (figure 2). SJR Impact Factors for the papers ranged from 0.128 to 9.893, with an average of 1.577.
The 5 most frequent subject categories for LS papers were:
- Public Health, Environmental & Occupational Health (30 papers)
- Medicine(misc) (25 papers)
- Geography, Planning & Development (20 papers)
- Epidemiology (17 papers)
- Health(social science) (16 papers)
The ten most cited papers from the three LSs were:
Northern Ireland Longitudinal Study
During the period January 2010 to May 2016, a total of 29 journal papers were found which had used NILS data, including one paper which had used all 3 LSs. Five NILS publications appeared in journals with a top-5 ranked impact factor.
NILS journal papers were published in 18 different journals, spanning 8 SCImago Subject Areas and 22 Subject Categories (see below). SJR Impact Factors for the papers ranged from 0.219 to 4.381, with an average of 1.632.
The 5 most frequent subject categories for NILS papers were:
- Public Health, Environmental & Occupational Health (11 papers)
- Geography, Planning & Development (7 papers)
- Health(social science) (7 papers)
- Epidemiology (6 papers)
- Medicine(misc) (5 papers)
The 10 most cited NILS papers were:
ONS Longitudinal Study
During the period in question, 51 journal papers were identified as having been produced from ONS LS projects (including 4 papers which also used other LSs). Of these, 14 appeared in non-peer-reviewed journals. Seven papers appeared in top-5 ranked journals.
ONS LS papers appeared in 33 journals, and covered 20 SCImago Subject Categories in 7 Subject Areas. SJR Impact Factors for ONS LS papers ranged from 0.128 to 9.893 with an average of 1.453.
The most frequent subject categories in which ONS LS papers appeared were:
- Medicine(misc) (14 papers)
- Public Health, Environmental & Occupational Health (11 papers)
- Epidemiology (8 papers)
- Geography, Planning & Development (7 papers)
- Demography (7 papers)
The most cited ONS LS papers were:
Scottish Longitudinal Study
During the period January 2010 – May 2016, 32 SLS-based journal papers were identified (including 4 papers which also used other LSs). Of these, 2 appeared in non-peer-reviewed journals. Three papers were published in top-5 ranked journals.
The SLS papers were published in 26 different journals, spanning 23 SCImago Subject Categories in 8 Subject Areas. Impact Factors for the papers ranged from 0.226 to 5.667, with an average of 1.6.
SLS papers appeared most frequently under the following subject categories:
- Public Health, Environmental & Occupational Health (9 papers)
- Medicine(misc) (8 papers)
- Geography, Planning & Development (6 papers)
- Health(social science) (5 papers)
- Epidemiology (3 papers)
The 10 most cited SLS papers were:
Raw data (Excel, 82kB)
On Nov 10th, our UK LS Roadshow moved to Bristol as part of the ESRC Festival of Social Science.
The first part of our Roadshow showcased some of the different types of research that the ONS LS for England & Wales has been used for, and you can download the slides here:
|Family size and educational attainment in England and Wales|
Prof Tak Wing Chan, University of Warwick
|Overall and Cause-specific Mortality differences by Partnership status in 21st Century England and Wales (PDF 645 kB)|
Sebastian Franke, University of Liverpool
|Ethnic differences in intragenerational social mobility between 1971 and 2011|
Dr Saffron Karlsen, University of Bristol
On October 26th and 28th CALLS Hub hosted two exciting roadshow events in Aberdeen and Glasgow to promote the UK Census-based Longitudinal Studies. The events were well attended and feedback from the audience was very enthusiastic! It was great to be able to share our excitement about the potential of the datasets.
The first part of our Roadshows showcased some of the different types of research that the Scottish Longitudinal Study has been used for, and you can download the slides here:
|Protective effects of nurses’ health literacy: evidence from the Scottish Longitudinal Study|
Dr Ian Atherton, Edinburgh Napier University
|NEETs in Scotland: a longitudinal analysis of health effects of NEET experience (PDF 5MB)|
Dr Zhiqiang Feng, University of Edinburgh
|Population Ageing in Scotland: Implications for Healthcare Expenditure Projections (PDF 312kB)|
Dr Claudia Geue, University of Glasgow
|How spatial segregation changes over time: sorting out the sorting processes (PDF 285kB)|
Prof Nick Bailey, University of Glasgow
|Using the Scottish Longitudinal Study to analyse social inequalities in school subject choice (PDF 766kB)|
Prof Cristina Iannelli, University of Edinburgh
|Inequalities in young adults’ access to home-ownership in Scotland: a widening gap? (PDF 1MB)|
Prof Elspeth Graham, University of St Andrews
Tom Clemens, SLS-DSU
Asking questions about income in surveys or in the census is a difficult and controversial issue in the UK. Although people in places like the US and Scandinavia are less protective about the amount of money that they earn, in the UK it is often considered private and sensitive, so many people refuse to answer questionnaires and surveys that ask for income information, including the UK census. Despite regular debates about its inclusion in the years leading up to previous censuses, a question about income has never been included. This creates a problem for researchers interested in using the census (and other data sources) for social research, because income is often an extremely important piece of information when trying to understand the effect of poverty on the population. Other measures have often been used instead, including area-based measures of deprivation or other measures of socio-economic position such as education or social class, but these often measure something different and do not capture the particular effects of living on a low income very well.
At the Scottish Longitudinal Study, we have developed a method to address this problem through the calculation of a “synthetic” measure that estimates individual weekly wage. The method is based on the detailed occupation information contained in the Standard Occupational Classification (SOC). SOC is a hierarchical variable containing descriptions of around 350 different types of job, which are nested within a hierarchy of broader job categories. This approach differs from previous occupation-based measures, such as occupation-based social class, because it uses all of this highly detailed SOC information to calculate a continuous estimate of weekly wage. Other approaches waste this information by aggregating to higher-level occupation categories. More details about the precise methodology used to derive the estimates, and of their performance against other measures of socio-economic position, can be found in a journal article (open access) and an SLS working paper.
As part of the project, a Stata program has been written in order to allow users to produce the estimates in their own research projects; all that is needed in your dataset is the following variables and associated coding:
- Individual single year of age
- Sex (0 for females and 1 for males)
- SOC-coded occupation (three-digit format for the SOC90 version, which was introduced in 1990, or four-digit format for the SOC2000 version, which was introduced in 2001)
Once you have these variables you will need to download the Stata programs “salaryest20” and “salaryest90”, which are simply Stata ado files. These will be available from SLS support officers on request and can be installed by selecting the text in the ado files and running it in Stata, which will automatically install the programs. Once you have done this you can use the following syntax to run the programs:
For estimating weekly wage based on SOC90:
salaryest90 newvarname, age(age varname) sex(sex varname) soc(soc90 varname)
For estimating weekly wage based on SOC2000:
salaryest20 newvarname, age(age varname) sex(sex varname) soc(soc2000 varname)
Here newvarname, age varname, sex varname, soc90 varname and soc2000 varname should be replaced with, respectively, the name that you want to give the new wage variable, the name of your age variable, the name of your sex variable and the names of the SOC90 and SOC2000 variables in your dataset. The command will then produce a new variable containing estimated weekly wage and will output descriptive information about this new variable. The commands can be used in any dataset which is missing income but includes information about age, sex and SOC occupation. Anyone who is interested in using these commands in an SLS project should talk to their support officer, who can assist them.
On Tuesday 4th November 2014, the SLS-DSU (supported by National Records of Scotland and CALLS Hub) held a launch event to announce the linkage of 2011 Census data to the Scottish Longitudinal Study.
The event was held at the Royal College of Physicians, Edinburgh, and around 70 people attended to hear about the new data, as well as examples of how it could be used. The welcome was given by Prof Andrew Morris, Scottish Government Chief Scientist.
UPDATE: You can now download full audio + slide presentations here.
Rachel Stuchbury of CeLSIUS and Kevin Ralston of SLS-DSU share their reflections on this year’s excellent BSPS conference at the University of Winchester.
BSPS 14 from the England & Wales perspective
Rachel Stuchbury, CeLSIUS
Four CeLSIUS staff attended BSPS and promptly dispersed among the six simultaneous sessions available – going to BSPS entails making hard choices among so many tempting possibilities. Did we sneak off and rubber-neck our way round beautiful Winchester at all? If so, we are certainly not going to admit it.
But you’ll be asking – did we give any presentations? No, not one. We claim this is because we are worker bees rather than honey bees (and definitely not queens). OK, we did muster three posters between us. But personally I spent much time listening admiringly to my SLS and NILS colleagues, a dazzlingly bright lot and a shining example to the rest of us. How they get all that research done as well as supporting large numbers of user-led projects is a mystery.
In addition, of course, we CeLSIUS types sigh with envy at the expanding array of interesting data being linked to SLS and NILS, unlike the dear old LS which hasn’t had a new type of data linked for many decades. (But watch this space, there are agents provocateurs at work.)
On the positive side I was also able to admire presentations of many LS projects which have been supported by CeLSIUS. I’ve lost count but there were around eight of them and all of a quality that would make the heart of any worker bee swell with pride. Which is not to say that I always understood them – some of these young PhD students can pronounce three- and even four-syllable words while simultaneously using a screen pointer. They will certainly all be professors one day, and let’s hope they still remember and love the Longitudinal Studies then. But signs are good; it was gratifying to hear the three Studies mentioned so frequently. Apart from in-house events I’ve never been to a gathering where such a high proportion of attenders appeared to know about them.
Will we be going to BSPS next year? You bet. Wherever it is, and whatever the weather’s like, we’ll be there. We might even manage to stand up and say something.
Second time at the BSPS
Kevin Ralston, SLS-DSU
This year saw another successful British Society for Population Studies conference from the point of view of the Scottish Longitudinal Study (SLS) research group. I managed to see a bit of the historic town of Winchester on the afternoon of the conference dinner. A combination of the weather, the beautiful location and the convenient transport links, together with the spellbinding scientific output on show, made this one of the best conferences I’ve ever attended.
As usual many of our team were involved across all strands of activity at the conference. I presented: Assessing the potential impact of markers of social support on levels of ‘excess’ mortality in Scotland and Glasgow compared to elsewhere in the UK, from a project involving Zhiqiang Feng, Chris Dibben and David Walsh at the Scottish Public Health Observatory.
Dr Beata Nowok presented: Generating synthetic microdata to widen access to sensitive data sets: method, software and empirical examples. This project also involved Gillian Raab and Chris Dibben, and Dr Nowok showcased the results of the SYLLS project which, amongst other things, provides and curates the R package ‘synthpop’ that generates anonymous synthetic data. This is particularly useful for anyone involved in teaching and in projects using sensitive data.
Dr Zhiqiang Feng gave a presentation entitled The long‐term impacts of NEET experiences on health: evidence from the Scottish Longitudinal Study. This is part of an ongoing project for the Scottish Government and involves me and Chris Dibben.
There were also two SLS posters on show, with Dr Lee Williamson hosting the Progress and developments of the Digitising Scotland (DS) Project poster. Meanwhile, Susan Carsley presented An Introduction to the Scottish Longitudinal Study (SLS).
In addition to these contributions Dr Zhiqiang Feng, in conjunction with Celia Macintyre from the National Records of Scotland, also ran a training session in How to analyse UK census flow data.
This sample of output illustrates what an outstanding year it has been for the team and we look forward to next year’s conference where the continued development of our work should mean we maintain a large presence.
Note: Information on all SLS, NILS and ONS LS presentations and posters from BSPS 2014 can be found in our outputs database
Susan Carsley, SLS-DSU
The Scottish Longitudinal Study (SLS) has recently received approval to include all SLS members’ GP registered postcodes since 2001 in the SLS database. The inclusion of this data will enable researchers to more accurately link to other environmental and geographical data in the intercensal period.
At present the SLS holds Census postcode data (current address, address one year ago, and workplace or place of study) for all SLS members. Also available are postcodes from registration data (births, deaths and marriages) and a postcode from the School Census data (where applicable). Although researchers do not have direct access to postcode data, postcodes are essential for identifying the different ecological factors associated with SLS members. Using postcodes (and thus the grid references derived from them), researchers are able to link to any higher level geographies via lookup tables, or to geographical and environmental indicators using GIS operations.
The main benefit of having these more frequent data will be the ability to start identifying any changes of address between censuses. This will be particularly useful for studies of mobility, for example how mobility interacts with labour market involvement. The data will also be very useful in studies of environmental exposure, e.g. to pollution. Having more frequent and accurate postcode data will become increasingly beneficial as we continue to add more datasets to the SLS which provide annual data.
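Linking via a lookup table amounts to a join on postcode. The following Python sketch shows the general idea only; the lookup contents, the datazone codes, the record layout and the function names are all hypothetical and are not taken from any SLS lookup file.

```python
# Hypothetical postcode-to-datazone lookup; the codes are made up for illustration.
lookup = {
    "EH1 1AA": "S01000001",
    "G12 8QQ": "S01000002",
}

# Hypothetical person records holding a postcode.
records = [
    {"id": 1, "postcode": "eh1 1aa"},
    {"id": 2, "postcode": "G128QQ"},
    {"id": 3, "postcode": "ZZ99 9ZZ"},  # not present in the lookup
]


def normalise(postcode):
    """Upper-case and strip spaces so formatting differences do not break the join."""
    return postcode.replace(" ", "").upper()


def attach_datazone(records, lookup):
    """Left join: every record is kept; unmatched postcodes get datazone None."""
    keyed = {normalise(pc): dz for pc, dz in lookup.items()}
    return [
        {**rec, "datazone": keyed.get(normalise(rec["postcode"]))}
        for rec in records
    ]


linked = attach_datazone(records, lookup)
for row in linked:
    print(row["id"], row["datazone"])
```

Keeping unmatched records (rather than dropping them) makes it easy to count how many postcodes failed to link, which is usually worth checking before analysis.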
The addition of this data to the SLS will not only open doors to many new projects, it will also be beneficial to some projects currently being investigated. For example:
In this project postcodes of residence (mother’s address from birth registration) and workplace (from census) are used for linking to SIMD (datazone) and air pollution data (1km square). This allows the researchers to explore whether levels of air pollution at residence and workplace are associated with low birth weight. More frequent postcode data could improve the study by identifying whether a member moved to an area with a different level of pollution, and so help to establish more accurately how long a member stayed in a highly polluted area; length of exposure is highly relevant when studying the effects of pollution. Previously the project was only able to compare postcodes from the census and birth registration to see whether a member had moved to an area with a different level of pollution and whether this had an impact on low birth weight. This ignored the possibility of a member moving between the census and the vital registration. The addition of more frequent data will reduce this problem.
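One simple way to take length of exposure into account is a time-weighted average of the pollution level across the addresses a member lived at. The Python sketch below is only an illustration of that idea under assumed inputs; the figures, the units and the function name `time_weighted_exposure` are made up, and this is not the method used in the project described above.

```python
def time_weighted_exposure(spells):
    """Average pollution level weighted by days spent at each address.

    spells: list of (days_at_address, pollution_level) pairs.
    """
    total_days = sum(days for days, _ in spells)
    if total_days == 0:
        raise ValueError("no exposure time recorded")
    return sum(days * level for days, level in spells) / total_days


# Hypothetical example: 180 days at a level of 10.0, then 90 days at 16.0
# (arbitrary units).
exposure = time_weighted_exposure([(180, 10.0), (90, 16.0)])
print(exposure)
```

With only two address observations (census and birth registration) the weights would be unknown, which is the gap that more frequent postcode data could help to fill.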
Adam Dennett (CeLSIUS and CASA, UCL) recently featured in an episode of The Global Lab and discussed his work with Census data and the Synthetic Data Estimation for the UK Longitudinal Studies (SYLLS) project.
You can hear or download the podcast on Soundcloud