Disease, death and UK Biobank

Interest in the causes of death and disease is growing, as is our knowledge and understanding of them. Individuals, healthcare professionals, researchers, health organisations and governments all want to understand more about what might improve or reduce life expectancy, particularly in the middle-aged and elderly.

A large-scale project called UK Biobank was set up and, between 2006 and 2010, collected 655 measurements from nearly half a million UK volunteers (498,103) aged 40-70. These included blood samples, physical and biological measurements, and detailed questionnaires. The large-scale nature of UK Biobank makes it a unique and valuable data resource, which is available to researchers worldwide once their application has been approved.

Researchers Andrea Ganna (Karolinska Institutet and Uppsala University) and Erik Ingelsson (Uppsala University) used this resource to achieve the two aims of their research project, UbbLE (UK Longevity Explorer).

1. Association Explorer

Firstly, this data allowed Andrea and Erik to investigate how closely each of the 655 measurements (demographic, health and lifestyle variables) taken from the UK Biobank participants was associated with death within five years. As UK Biobank only collected data from middle-aged to elderly participants (40-70 years old), Andrea and Erik’s work is also based on this age range.

How did they do this?
In this study, UK Biobank participants were monitored until February 2014. For those who had died, information from the Health & Social Care Information Centre and NHS Central Register was used to determine the cause of death. Deaths were analysed both as ‘all-cause’ (any cause of death) and within six specific categories, for example cancer or diseases of the digestive system. The researchers then used this information and the 655 UK Biobank measurements to see how closely each measurement was associated with death within five years. This association shows how accurately a variable can predict death within five years, and it was calculated using a statistic called the C-index. The higher the C-index, the more accurately the variable can predict death within five years. For example, ‘usual walking pace’ is a more accurate predictor of death within five years (C-index = 0.72) than ‘number of days per week of moderate physical activity’ (C-index = 0.68). A C-index of 0.50-0.60 (50-60%) is considered a poor predictor, 0.60-0.70 moderate, 0.70-0.80 good, 0.80-0.90 very good and above 0.90 excellent.
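
As a rough illustration of what a C-index measures, the Python sketch below computes a concordance index from scratch. Among pairs of people whose outcomes can be ordered, it counts how often the person who died sooner also had the higher risk score. The function name, the input layout and the toy numbers are assumptions made for this example; the exact estimator used by the researchers is not described here, and none of the values come from UK Biobank.

```python
from itertools import combinations


def concordance_index(times, died, scores):
    """Simplified concordance index (hypothetical helper, for illustration).

    times  -- follow-up time for each person
    died   -- True if the person died at that time, False if still alive
    scores -- risk score for each person (higher = predicted higher risk)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped to keep the sketch short
        # 'first' is the person with the shorter follow-up time
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not died[first]:
            continue  # we cannot say who died sooner, so the pair is unusable
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0  # the score ranked the pair correctly
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four people, two of whom died during follow-up.
times = [1, 2, 3, 4]
died = [True, True, False, False]
scores = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, died, scores))  # prints 0.8
```

A value of 0.5 would mean the score ranks pairs no better than chance, and 1.0 would mean every comparable pair is ranked correctly.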

What did they find?
Using UK Biobank data in this way revealed five key results.

  1. Andrea and Erik found that the variables that most accurately predicted death from all causes within five years did not need to be measured by physical examination, but could be reported by individuals in response to a questionnaire. For example, asking people to rate their overall health (self-reported health) and to describe their usual walking pace were two of the strongest predictors in both men and women for different causes of death. Overall, walking pace was a stronger predictor than smoking habits and other lifestyle measurements. In fact, men aged 40-52 who reported their usual walking pace as ‘slow’ had a risk of death within five years that was 3.7 times higher than that of men who answered ‘steady average pace’.
  2. The variables that were most accurate at predicting death differed between men and women. For example, the strongest predictor in men was ‘self-reported health’, whereas in women, it was ‘previous cancer diagnosis’. Additionally, variables were generally more accurate predictors in men. These variations could be because men and women die from different causes, respond differently to measurements and questionnaires or simply because men and women are biologically different.
  3. When Andrea and Erik examined only individuals who didn’t have any major diseases, measurements of smoking habits were the strongest predictors of death within five years.
  4. Psychological and socioeconomic variables were the strongest predictors of external-cause mortality, that is, deaths from causes such as suicide or accidental falls.
  5. They also found that most variables were less accurate at predicting mortality in older individuals than in younger individuals, even within the 40-70 age range.

All of these association findings can be explored in the Association Explorer.

2. Risk Calculator

Secondly, UK Biobank data allowed Andrea and Erik to achieve the second aim of UbbLE: to create a 'Risk Calculator'.

How did they do this?
As questionnaire-based variables were found to be the strongest predictors, the researchers created a calculator that could use questionnaire answers to predict an individual’s risk of dying within five years (‘five-year risk’). To do this, they used a computer-based approach to automatically select the combination of questions from UK Biobank that gave the most accurate prediction of death within five years. The most accurate combination was found to be 13 questions for men, and 11 for women.
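
The text above does not name the selection method the researchers used. Purely as an illustration of how a computer can pick a combination of questions automatically, the Python sketch below uses greedy forward selection, a common technique chosen here as an assumption: it repeatedly adds whichever question improves a scoring function the most, and stops when no question helps. The function names, the `min_gain` threshold and the toy scoring function are all hypothetical, and the numbers are invented rather than taken from the study.

```python
def forward_select(candidates, score, min_gain=0.001):
    """Greedy forward selection (illustrative; not the study's documented method).

    candidates -- list of question names to choose from
    score      -- function taking a list of questions and returning a number
                  (for example a C-index); higher is better
    min_gain   -- stop when the best remaining question adds less than this
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score every way of adding one more question to the current set.
        trials = [(score(selected + [q]), q) for q in remaining]
        top_score, top_question = max(trials, key=lambda t: t[0])
        if top_score - best < min_gain:
            break  # nothing left improves the prediction enough
        selected.append(top_question)
        remaining.remove(top_question)
        best = top_score
    return selected, best


# Toy scoring function with invented contributions (NOT study results).
TOY_GAIN = {
    "walking_pace": 0.20,
    "self_rated_health": 0.15,
    "smoking": 0.05,
    "shoe_size": 0.0,  # deliberately uninformative dummy question
}


def toy_score(questions):
    return 0.5 + sum(TOY_GAIN[q] for q in questions)


chosen, final_score = forward_select(list(TOY_GAIN), toy_score)
print(chosen)  # the uninformative dummy question is left out
```

In practice the scoring function would fit a prediction model on real data and evaluate it on participants not used for fitting, which this sketch leaves out.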

What did they find?
When they tested the Risk Calculator, Andrea and Erik found that it had a C-index of 0.80 for men and 0.79 for women. As a C-index of 0.70-0.80 (70-80%) is considered good, this means their questionnaire-based Risk Calculator gives a reasonably good prediction of the chance of dying within five years.

The Risk Calculator can use this prediction score to calculate an estimate of an individual’s ‘Ubble age’. To do so, it compares the individual’s risk (calculated from the questionnaire responses) with UK life tables, and selects the age at which the average risk of dying is most similar. For example, if you are a woman of any age between 40 and 70 and your individual risk of five-year mortality is 2.4% (calculated from questionnaire responses), the most similar risk in the UK life tables is the average risk for a 56-year-old woman. Therefore, your Ubble age is 56 years.
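
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name is hypothetical, and the small life table contains placeholder values chosen only so that the example in the text (a 2.4% risk matching age 56) works. It is not a real UK life table.

```python
def ubble_age(individual_risk, life_table):
    """Return the age whose average five-year risk is closest to individual_risk.

    life_table maps age -> average five-year risk of death for that age
    (as a fraction, so 0.024 means 2.4%). If two ages are equally close,
    the younger one is returned; that tie-break is an assumption.
    """
    if not life_table:
        raise ValueError("life table is empty")
    return min(
        life_table,
        key=lambda age: (abs(life_table[age] - individual_risk), age),
    )


# Placeholder values for women (illustrative only, NOT real UK life-table data).
toy_life_table_women = {
    54: 0.020,
    55: 0.022,
    56: 0.024,
    57: 0.026,
    58: 0.029,
}

print(ubble_age(0.024, toy_life_table_women))  # prints 56
```

The person's actual age is not an input to the lookup: only the calculated risk and the sex-specific life table determine the Ubble age.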

If your Ubble age is higher than your actual age, you have a higher risk of dying within five years than the average person of your age in the UK. Conversely, if your Ubble age is lower than your actual age, you have a lower risk than the average person of your age in the UK.

This risk calculator gives an estimate of how many people with similar answers will live and die within the next five years. However, it does not predict the future for any one individual; it cannot identify who will live and who will die.

Why is this research important?
Although the association of variables with mortality has been investigated before, previous studies have had various limitations. Many looked at only one variable or one cause of death and were based on a small number of participants, which limits how robust their results are. In contrast, UbbLE is the first study to investigate such a large number of variables, to examine different causes of death and to involve such a large number of participants. Therefore, researchers can now use UbbLE to evaluate the relative importance of each variable in predicting mortality.

How can UbbLE be used?
UbbLE is also unique in its accessibility: all results from the study are publicly available and can be explored on the UbbLE website. Andrea and Erik hope this will not only improve individuals’ awareness of their own health, but will also be used by other scientists and help to inform public health policy and advice. They hope doctors can use this information to identify and care for high-risk patients, and that other medical researchers can use this data as a starting point for future research.

What are the limitations of UbbLE?
Importantly, UbbLE only investigated the associations of variables with death within five years, and does not claim that any variables cause death. When trying to study whether a variable causes death in a population-based study, researchers need to make adjustments to rule out other factors (confounders) that might be influencing its relationship with death. This has not been done in this study. For example, if research found that having hayfever was associated with (or was a good predictor of) increased drowsiness, we cannot conclude that hayfever causes drowsiness. The drowsiness could be caused or influenced by a third factor, such as hayfever medication.
Additionally, as in any statistical analysis, the associations of variables with death have a degree of uncertainty, which is reflected in the ranges of values given for different measures. UbbLE is also limited because the answers given in questionnaires are known to have a degree of error, and the reporting of causes of death may also be inaccurate.