forked from mrecos/MUSA_800_template
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathPresentation1_Yibing.Rmd
317 lines (276 loc) · 13.3 KB
/
Presentation1_Yibing.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
---
title: "Markdown_Presentation1"
author: "Yibing Zheng"
date: "February 12, 2020"
output: html_document
---
```{r setup, message=FALSE, warning=FALSE, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
library(corrplot)
library(caret)
library(viridis)
library(stargazer)
library(tidyverse)
library(sf)
library(FNN)
library(tigris)
library(spdep)
library(ckanr)
library(grid)
library(gridExtra)
library(ggplot2)
library(RColorBrewer)
```
##City-wide Ridership Change Before and After CapRemap
```{r}
data <- read.csv("C:/Upenn/Practicum/Data/OneDrive_1_1-16-2020/Stop Ridership Aggregated.csv")
table(data$YEAR_ID)
#Subset data in 2018 for further analysis
data.y18 <- data %>%
subset(data$YEAR_ID == "2018")
```
```{r}
#Change June to 0 and make months before CapRemap become negative, after CapRemap become positive
data.y18$MONTH_ID <- as.numeric(data.y18$MONTH_ID)
data.y18$Month <- data.y18$MONTH_ID - 6
#Make month column become factor
#data.y18$Month <- factor( data.y18$Month, levels = c( "-5", "-4", "-3", "-2", "-1", "0", "1", "2", "3", "4", "5", "6") )
data.y18$Before <- ifelse(data.y18$Month < 0, 1, 0)
data.y19 <- data %>%
subset(data$YEAR_ID == 2019)
data.y19$Month <- data.y19$MONTH_ID - 6
data.y19$Before <- ifelse(data.y19$Month < -5, 1, 0)
data.y <- rbind(data.y18, data.y19)
data.y$YEAR_ID <- as.factor(data.y$YEAR_ID)
table(data$YEAR_ID, data$MONTH_ID)
```
Current available data from Capital Metro allows us to observe the trend in ridership change before and after Cap Remap. The first important part of exploratory analysis is to see the city-wide change in ridership brought by CapRemap. With the stop-level data from Janurary 2018 to June 2019, the aggregated city-wide ridership change is shown in the chart below.
The x-axis represents month, and y-axis repredents the average daily ridership in the given month. The solid lines in the chart are 2018 riderships. The yellow solid line is ridership from Janurary to May in 2018 (before Cap Remap happened in June 2018) while the blue solid line is ridership from June to December in 2018 (after CapRemap). The dashed line is the ridership in 2019 from Janurary to June after CapRemap happend the year before.
From the trend in 2018, it is clear that ridership fluctuated between months. Cap Remap didn't bring a rapid increase in ridership after the implementation. On the contrary, the ridership decreased in June and July. In August, the ridership recovers to the previous level before CapRemap, Then in September the ridership almost exploded to 0.1 million. Then it gradually went down in winter but in 2019, the ridership is generally higher than the same month in 2018. This result shows the general positive impact CapRemap brought to ridership change. For the decrease in June and July and following increasing trend, people might need time to adjust to the new bus schedule and get used to it. And after they realize the convinience of the redesign, the ridership increased rapidly. Another explanation is related to university's opening and closing as the low ridership happened in summer break and high ridership happened in the beginning of the new semester.
```{r}
#Plot Average_on first
plot.city<-
as.data.frame(data.y) %>%
group_by(Month, YEAR_ID) %>%
summarize(BOARD_Count = sum(AVERAGE_ON), Time = as.factor(max(Before))) %>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Time, linetype = YEAR_ID)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
ylim(70000,100000) +
labs(title="Ridership by stops on an average weekday among all the stops in Austin",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#E7B800", "#0072B2"), name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))+
# scale_color_brewer(palette = "YlGnBu")
scale_linetype_manual(values=c("solid", "dotted"))
plot.city
```
## Ridership Change in Different Neighborhoods in Austin in 2018
```{r}
#Transform ridership data into geo data
data.y18.sf <-
data.y18 %>%
st_as_sf(coords = c("LONGITUDE", "LATITUDE"), crs = 4326, agr = "constant") %>%
st_transform(2278)
data.y.sf <-
data.y %>%
st_as_sf(coords = c("LONGITUDE", "LATITUDE"), crs = 4326, agr = "constant") %>%
st_transform(2278)
#Load neighborhood geojson
nhood <- st_read("https://data.austintexas.gov/resource/nz5f-3t2e.geojson")%>%
st_set_crs(4326)%>%
st_transform(2278)
ggplot()+
geom_sf(data = nhood)
```
```{r}
#Load the transformed dataset from long format-stop level to long format-neighborhood level
Rider_nhood4 <- read.csv("C:/Upenn/Practicum/Data/Rider_nhood4.csv")
Rider_nhood4$Before <- ifelse(Rider_nhood4$Month == -5 |Rider_nhood4$Month == -4 |Rider_nhood4$Month == -3|Rider_nhood4$Month == -2|Rider_nhood4$Month == -1, "1", "0")
```
```{r}
Rider_nhood <- read.csv("C:/Upenn/Practicum/Data/Rider_nhood.csv")
Rider <- merge(Rider_nhood, nhood, by = "label")
```
After knowing the trend of city-wide ridership change, the next question is how the ridership changed across the city: which area experienced ridership increase and which area exprienced ridership decrease. Neighborhoods in Austin are used here to show the spatial trend here.
As shown in the map, darker blue represents higher ridership increase, darker red presents lower ridership increase or even ridership decrease. As shown in the map, mostly downtown areas experienced ridership increase from June to September while the outskirts of Austin experienced low ridership increase or even ridership decrease.
```{r}
ggplot() +
# geom_sf(data = nhoods, fill = "grey40") +
geom_sf(data = Rider, aes(fill = q5(Dif))) +
scale_fill_brewer(palette = "RdYlBu",
aesthetics = "fill",
labels=qBr(Rider,"Dif"),
name="Quintile\nBreaks") +
labs(title="Ridership Change in Neighborhoods") +
mapTheme()
```
The following charts are the ridership change in each neighborhood in 2018 ranked by the difference in ridership in June and September. There are 12 neighborhoods experienced ridership decrease from June to September. There are several neighborhoods experienced high ridership increase of more than 10,000 from June to September.
```{r,fig.height = 30, fig.width = 15}
plot.nhood <-
as.data.frame(Rider_nhood4) %>%
arrange(desc(Dif))%>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Before)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
facet_wrap(Dif~label,scales="free", ncol=4) +
labs(title="Average Daily Ridership by Neighborhoods in Austin",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership in the Neighborhood")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#0072B2", "#E7B800"),name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))
plot.nhood
```
Among the 78 neighborhoods in Austin, we identified three neighborhoods that represents different characteristics: neighborhoods with expected ridership increase; neighborhoods with unexpected ridership increase; neighborhoods with unexpected ridership decrease.
UT is the neighborhood with expected ridership increase.The location of UT neighborhood is just above downtown neighborhood. With a lot of university students living around here, the bus network is sensitive to school schedule. There is a relatively clear trend in ridership change according to school seasons.
```{r}
UT <- Rider_nhood4 %>%
subset(Rider_nhood4$label == "UT")
plot.UT <-
as.data.frame(UT) %>%
arrange(desc(Dif))%>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Before)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
facet_wrap(Dif~label,scales="free", ncol=4) +
labs(title="Average Daily Ridership in UT Neighborhood",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership in UT Neighborhood")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#0072B2", "#E7B800"),name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))
```
```{r}
mapTheme <- function(base_size = 12) {
theme(
text = element_text( color = "black"),
plot.title = element_text(size = 14,colour = "black"),
plot.subtitle=element_text(face="italic"),
plot.caption=element_text(hjust=0),
axis.ticks = element_blank(),
panel.background = element_blank(),axis.title = element_blank(),
axis.text = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_rect(colour = "black", fill=NA, size=2)
)
}
qBr <- function(df, variable, rnd) {
if (missing(rnd)) {
as.character(quantile(round(df[[variable]],0),
c(.01,.2,.4,.6,.8), na.rm=T))
} else if (rnd == FALSE | rnd == F) {
as.character(formatC(quantile(df[[variable]]), digits = 3),
c(.01,.2,.4,.6,.8), na.rm=T)
}
}
q5 <- function(variable) {as.factor(ntile(variable, 5))}
```
```{r}
nhood <- st_read("https://data.austintexas.gov/resource/nz5f-3t2e.geojson")%>%
st_set_crs(4326)%>%
st_transform(2278)
UT2 <- nhood%>%
subset(nhood$label == "UT")
Map.UT<-
ggplot() +
geom_sf(data = nhood, fill = "grey30") +
geom_sf(data = UT2, fill = "#56B4E9") +
labs(title = "UT Neighborhoods") +
mapTheme() + theme(legend.position = "bottom")
```
Neighborhood with ridership increase as we expected
```{r}
print(plot.UT)
print(Map.UT, vp=viewport(.85,.815,.35,.5))
```
The second neighborhood Govalle is the neighborhood that experiencnig unexpected ridership increase. After CapRemap, the ridership in Govalle nearly increased by 50% to 75%. As Govalle is closer to the outskirts of Austin, this ridership increase might reflects CapRemap'success in strengthening the east-west connection.
```{r}
Govalle <- Rider_nhood4 %>%
subset(Rider_nhood4$label == "Govalle")
plot.Govalle <-
as.data.frame(Govalle) %>%
arrange(desc(Dif))%>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Before)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
facet_wrap(Dif~label,scales="free", ncol=4) +
labs(title="Average Daily Ridership in Govalle Neighborhood",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership in Govalle Neighborhood")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#0072B2", "#E7B800"),name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))
```
```{r}
Govalle2 <- nhood%>%
subset(nhood$label == "Govalle")
Map.Govalle<-
ggplot() +
geom_sf(data = nhood, fill = "grey30") +
geom_sf(data = Govalle2, fill = "#56B4E9") +
# labs(title = "Govalle Neighborhoods") +
mapTheme() + theme(legend.position = "bottom")
```
Neighborhood with unexpected ridership increase
```{r}
print(plot.Govalle)
print(Map.Govalle, vp=viewport(.85,.72,.4,.42))
```
But there are also neighborhoods exepriencing ridership decrease on the east-west direction. Zilker located in the southwest side of Austin's downtown region. Its ridership experienced a gradually slight decrease after CapRemap.
```{r}
Zilker <- Rider_nhood4 %>%
subset(Rider_nhood4$label == "Zilker")
plot.Zilker <-
as.data.frame(Zilker) %>%
arrange(desc(Dif))%>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Before)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
facet_wrap(Dif~label,scales="free", ncol=4) +
labs(title="Average Daily Ridership in Zilker Neighborhood",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership in Zilker Neighborhood")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#0072B2", "#E7B800"),name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))
```
```{r}
Zilker2 <- nhood%>%
subset(nhood$label == "Zilker")
Map.Zilker<-
ggplot() +
geom_sf(data = nhood, fill = "grey30") +
geom_sf(data = Zilker2, fill = "#56B4E9") +
# labs(title = "Zilker Neighborhoods") +
mapTheme() + theme(legend.position = "bottom")
```
```{r}
print(plot.Zilker)
print(Map.Zilker, vp=viewport(.85,.72,.4,.42))
```
```{r}
Galindo <- Rider_nhood4 %>%
subset(Rider_nhood4$label == "Galindo")
plot.Galindo <-
as.data.frame(Galindo) %>%
arrange(desc(Dif))%>%
ggplot(aes(x=Month,y=BOARD_Count, colour = Before)) +
geom_point() + stat_smooth(size=1) +
plotTheme() +
facet_wrap(Dif~label,scales="free", ncol=4) +
labs(title="Average Daily Ridership in Galindo Neighborhood",
subtitle="CapRemap Redesign Implemented Month in Blue", x="Month", y="Average Daily Ridership in Galindo Neighborhood")+
geom_vline(xintercept = 0, color = "blue")+
scale_colour_manual(values = c("#0072B2", "#E7B800"),name="Time Period (Before,After)", breaks=c("0", "1"), labels=c("After", "Before"))
```
```{r}
Galindo2 <- nhood%>%
subset(nhood$label == "Galindo")
Map.Galindo<-
ggplot() +
geom_sf(data = nhood, fill = "grey30") +
geom_sf(data = Galindo2, fill = "#56B4E9") +
# labs(title = "Galindo Neighborhoods") +
mapTheme() + theme(legend.position = "bottom")
```
```{r}
print(plot.Galindo)
print(Map.Galindo, vp=viewport(.85,.72,.4,.42))
```