-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathlifespan_variation.Rmd
167 lines (117 loc) · 5.11 KB
/
lifespan_variation.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
---
title: "Lifespan variation and life expectancy in UK nations"
output:
pdf_document: default
word_document: default
html_notebook: default
---
# Introduction
This short notebook will look at how average age of death and lifespan variation have changed in UK nations oer the 20th century. The aim is to make what's meant by the following statement clearer:
[Our aim is] That mortality trends improve (and inequalities narrow), such that there
is (at least) 'catch up' to the previous trends. This would involve improving the
rate of improvement back to the rapid improvement rates seen (for example) during
the 2000s and the rates of improvement in inequalities seen between the
1950s and the 1970s.
Let's just look at England/Wales, and Scotland, (as Northern Ireland is complicated by the Troubles and has a smaller population).
## Measures
* Improvements
* Median age of death
* Life expectancy
* Inequalities
* 80-20 interval in age of death
The 80-20 interval is the difference in years of age between the age by which 20% of deaths have occurred, and the age by which 80% of deaths have occurred. I felt this was easier to convey than (say) variance in age of death.
# Analysis
## Data Prep
```{r,echo=FALSE, message = FALSE, warning = FALSE}
rm(list = ls())
require(tidyverse)
require(plotly)
```
## Load data
```{r,echo=FALSE, message = FALSE, warning = FALSE}
dta <- read_csv("hmd_explorer/data/hmd_data.csv")
dta_e0 <- read_csv("hmd_explorer/data/hmd_e0.csv")
```
# Results
## Life expectancy
```{r,echo=FALSE, message = FALSE, warning = FALSE}
dta_e0 %>%
filter(code %in% c("GBRCENW", "GBR_SCO")) %>%
filter(gender != "total") %>%
filter(year >= 1900) %>%
ggplot(aes(x = year, y = e0, group = paste0(gender, code), colour = gender, linetype = code)) +
geom_line() +
scale_x_continuous(breaks = seq(1900, 2020, by = 10))
```
What was the average rate of improvement in e0 by decade?
```{r,echo=FALSE, message = FALSE, warning = FALSE}
improvements <- dta_e0 %>%
filter(code %in% c("GBRCENW", "GBR_SCO")) %>%
filter(gender != "total") %>%
filter(year >= 1900) %>%
group_by(code, gender) %>%
arrange(year) %>%
mutate(annual_improvement = e0 - lag(e0)) %>%
mutate(per_annual_improvement = 100 * annual_improvement / lag(e0)) %>%
mutate(decade = cut(year, breaks = seq(1900, 2020, by = 10),
labels = paste0(seq(1900, 2010, by = 10), "s"))) %>%
group_by(code, gender, decade) %>%
summarise(
avg_annual_improvement = mean(annual_improvement, na.rm = T),
avg_perc_annual_improvement = mean(per_annual_improvement)
)
```
## Figures
```{r, echo=FALSE, message = FALSE, warning = FALSE}
improvements %>%
filter(!is.na(decade)) %>%
ggplot(aes(x = decade, y = avg_annual_improvement, fill = code, group = code)) +
geom_bar(position = "dodge", stat = "identity", colour = "black") +
facet_wrap(~gender) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_y_continuous(breaks = seq(-0.1, 0.9, by = 0.1)) +
scale_fill_grey() +
labs(x = "Decade", y = "Average increase in life expectancy per year in decade")
```
# Lifespan variation
```{r, echo=FALSE, message = FALSE, warning = FALSE}
interval_data <- dta %>%
filter(code %in% c("GBRCENW", "GBR_SCO")) %>%
filter(gender != "Total") %>%
filter(Year >= 1900) %>%
group_by(gender, code, Year) %>%
arrange(Age) %>%
mutate(
cumulative_deaths = cumsum(num_deaths),
cumulative_proportion_deaths = cumulative_deaths / sum(num_deaths)
) %>%
summarise(
median_age = Age[cumulative_proportion_deaths < 0.5 & lead(cumulative_proportion_deaths > 0.5)],
lower_age = max( 0 , Age[cumulative_proportion_deaths < 0.2 & lead(cumulative_proportion_deaths > 0.2)]),
upper_age = Age[cumulative_proportion_deaths < 0.8 & lead(cumulative_proportion_deaths > 0.8)]
)
```
## Figure
```{r,echo=FALSE, message = FALSE, warning = FALSE}
interval_data %>%
ggplot(aes(x = Year)) +
geom_ribbon(aes(ymin = lower_age, ymax = upper_age), fill = "grey", alpha = 0.5) +
geom_line(aes(y = median_age)) +
facet_grid(gender ~ code) +
labs(x = "year", y = "Age in years", title = "Median and 20-80 intervals of age of death") +
scale_y_continuous(breaks = seq(0, 90, by = 10)) +
scale_x_continuous(breaks = seq(1900, 2020, by = 10))
```
## 20-80 interval over time
```{r,echo=FALSE, message = FALSE, warning = FALSE}
interval_data %>%
mutate(interval = upper_age - lower_age) %>%
ggplot(aes(x = Year, y = interval, group = paste0(code, gender), colour = gender, linetype = code)) +
geom_line() +
scale_y_continuous(limits = c(0, 80), breaks = seq(0, 80, by = 10)) +
scale_x_continuous(breaks = seq(1900, 2020, by = 10)) +
labs(x = "Year", y = "Difference in years", title = "80-20 interval in ages of death",
subtitle = "Difference between age by which 20% of deaths have occurred\nand age by which 80% of deaths have occurred")
```
# Conclusions
We should aim for 80-20 intervals of 'only' around 20 years, and average annual live expectancy improvements of at least 0.15 years/year averaged over ten years.