Geographical participation in the #30DayMapChallenge

I am curious about the geographical participation in the #30DayMapChallenge, so I thought it would be a good opportunity to learn the R package rtweet and see whether I can map where people are tweeting from.

First, I installed rtweet:

if (!require("rtweet")) install.packages("rtweet")
library(rtweet)

I also needed ggmap to geocode users. This requires registering with Google and setting up an API key (in the following link you can see how to do that: link):

if (!require("ggmap")) install.packages("ggmap")
library(ggmap)

Finally, I used the following libraries:

library(tidyverse)
library(sf)
library(glue)
library(rnaturalearth)
library(emo)
library(knitr)

With the required libraries loaded, I followed these steps:

  1. Get all tweets with the hashtag #30DayMapChallenge:

Note that rtweet’s search_tweets() only returns tweets from the past 6-9 days (I didn’t know that beforehand, so I won’t be able to analyse all tweets 🙍). Therefore, I saved the downloaded data and load it from disk to get reproducible results; you would need to uncomment the lines below to download new data.

# NOT RUN
# rt_download <- search_tweets("#30DayMapChallenge",  
#                              n = 18000,
#                              include_rts = FALSE)
# 
# saveRDS(rt_download, "rt_download_2021_11_12.rds") 

rt_download <- readRDS("rt_download_2021_11_12.rds")
date_min <- format(min(rt_download$created_at), "%y-%m-%d")
date_max <- format(max(rt_download$created_at), "%y-%m-%d")
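
The comment/uncomment pattern above can also be wrapped in a small helper that only runs the expensive step when no saved file exists. This is a minimal sketch: `cache_rds()` is a hypothetical function (it is not part of rtweet), and the toy data frame stands in for a real download.

```r
# Hypothetical helper: run `compute()` only if `path` does not exist yet,
# otherwise load the previously saved result from disk.
cache_rds <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Toy stand-in for an expensive download; counts how often it really runs
cache_file <- tempfile(fileext = ".rds")
n_calls <- 0
expensive <- function() {
  n_calls <<- n_calls + 1
  data.frame(location = "Oslo, Norway", lon = 10.75, lat = 59.91)
}

first  <- cache_rds(cache_file, expensive)  # computes and saves
second <- cache_rds(cache_file, expensive)  # loads from disk
```

In the real workflow, `compute` would presumably be a function wrapping the `search_tweets()` call, and deleting the .rds file would force a fresh download.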

  2. Get users:

Unfortunately, my only tweet so far was on Day 2 (lines), so I am not in this dataset; let’s see if I can identify myself next time 😄.

# Total users
user_info <- lookup_users(unique(rt_download$user_id)) 

# Users with a known location (reuses user_info to avoid a second API call)
user_loc <- user_info %>% 
  filter(location != "")
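
Profile locations are free text, so it may be worth checking how many users actually have a usable one before geocoding. The sketch below uses a made-up `toy_users` data frame (not real Twitter data) and assumes that missing or whitespace-only locations can occur, which I have not verified for this dataset.

```r
# Toy stand-in for user_info: only user_id and location are needed here
toy_users <- data.frame(
  user_id  = c("1", "2", "3", "4"),
  location = c("Oslo, Norway", "", NA, "  "),
  stringsAsFactors = FALSE
)

# A location is usable if it is not NA and not empty after trimming spaces
has_location <- !is.na(toy_users$location) &
  nzchar(trimws(toy_users$location))

toy_user_loc  <- toy_users[has_location, ]
share_located <- mean(has_location)  # fraction of users with a location
```

This is slightly stricter than `location != ""`, because it also drops strings that contain only spaces.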

  3. Geocode users:

I did that with the function ggmap::geocode.

# NOT RUN
# # Get coordinates
# coded <- user_loc$location %>%
#   ggmap::geocode()
# 
# # Add coordinates to users
# user_geocode <- user_loc %>%
#   mutate(lon = coded$lon,
#          lat = coded$lat) %>%
#   select(user_id, screen_name, name, location, lon, lat)
#
# saveRDS(user_geocode, "user_geocode_2021-11-12.rds")
user_geocode <- readRDS("user_geocode_2021-11-12.rds")

# Remove users that could not be geocoded (rows with NAs)
user_geocode_na <- user_geocode %>% 
  drop_na()
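
Since many users share the same location string, one way to reduce the number of geocoding requests is to geocode each distinct string once and map the results back. The sketch below only illustrates that idea: `fake_geocode()` is a hypothetical stand-in that returns placeholder numbers, not a real geocoder, and I did not do this in the analysis above.

```r
# Toy stand-in for user_loc$location: several users share a location string
locations <- c("Oslo, Norway", "Madrid", "Oslo, Norway", "Madrid", "Dublin")

# Geocode each distinct string only once
unique_locations <- unique(locations)

# Hypothetical stand-in for the geocoder; the coordinates are placeholders
fake_geocode <- function(x) {
  data.frame(lon = seq_along(x), lat = -seq_along(x))
}
coded_unique <- fake_geocode(unique_locations)

# Map the results back to every user by position
idx   <- match(locations, unique_locations)
coded <- coded_unique[idx, ]
rownames(coded) <- NULL
```

Here five users need only three lookups; with the real data, `fake_geocode()` would be replaced by the actual geocoding call.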

  4. Convert the resulting dataset to an sf object:

user_sf <- user_geocode_na %>% 
  st_as_sf(coords = c("lon", "lat")) %>% 
  st_set_crs(4326)

# saveRDS(user_sf, "user_sf_2021_11_12.rds") #Just if I need it later
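
A common pitfall in this step is the order of the coordinates: `coords` expects x (longitude) first and y (latitude) second. The toy example below, with made-up users and approximate coordinates, is a quick way to check that the conversion behaves as expected; it assumes the sf package is installed.

```r
library(sf)

# Toy stand-in for user_geocode_na (approximate, illustrative coordinates)
toy_geocode <- data.frame(
  screen_name = c("user_a", "user_b"),
  lon = c(10.75, -3.70),
  lat = c(59.91, 40.42)
)

# Longitude first, latitude second; the CRS can be set in the same call
toy_sf <- st_as_sf(toy_geocode, coords = c("lon", "lat"), crs = 4326)

toy_epsg   <- st_crs(toy_sf)$epsg      # EPSG code of the assigned CRS
toy_coords <- st_coordinates(toy_sf)   # matrix with columns X and Y
```

Passing `crs = 4326` directly to `st_as_sf()` should be equivalent to calling `st_set_crs(4326)` afterwards, as in the code above.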

  5. Load the world map (with the rnaturalearth package):

world <- ne_countries(scale = "medium", returnclass = "sf") %>% 
  st_transform(crs = 4326 )

  6. Plot users on the world map:

p <- ggplot() +
  geom_sf(data = world, fill= "antiquewhite", color = "grey", size = 0.05) +
  geom_sf(data = user_sf, size = 0.2, col = "red") +
  labs(title = "Where are we tweeting #30DayMapChallenge from?",
       subtitle = glue("{nrow(user_info)} users have posted {nrow(rt_download)} tweets from {date_min} to {date_max}.
                       However, I was only able to locate {nrow(user_sf)} users (red points)")) +
  theme_bw() + 
  theme(panel.grid.major = element_line(color = gray(.5),
                                        linetype = "dashed",
                                        size = 0.5),
        panel.background = element_rect(fill = "aliceblue"))

ggsave(plot = p, filename = "tweets_30DayMapChallenge_03_12_Nov.png")
include_graphics("tweets_30DayMapChallenge_03_12_Nov.png")

Javier Elío
Associate Professor

My research interests include environmental sciences and data analysis.
