
Project Description A social and behavioural research group at Western Sydney University is studying social activists. They have consulted you to investigate the flow of information regarding environmental activist Greta Thunberg on Twitter. Resea...

Introduction

A social and behavioural research group at Western Sydney University is studying social activists. In this project, we investigate the flow of information about environmental activist Greta Thunberg on Twitter. We have been given a set of tasks to complete in R using the rtweet and igraph libraries.

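Task 1: Followed by Greta
To find the 12 accounts followed by Greta that have the most followers, we use the get_friends() function from the rtweet library, which returns the accounts a user follows (get_followers() would return the accounts that follow her instead). We then look up each profile with lookup_users() and filter out company Twitter handles using a regular expression.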

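library(rtweet)

# Get the accounts Greta follows ("friends" in Twitter API terms)
# Column names below follow rtweet 0.7; newer releases rename some of them
greta_friends <- get_friends("GretaThunberg")
friend_info <- lookup_users(greta_friends$user_id)

# Filter out company's twitter handles with a keyword regex on name and bio
# (the keyword list is a heuristic assumption; adjust it after inspecting the profiles)
org_pattern <- "news|official account|magazine|company"
is_org <- grepl(org_pattern, paste(friend_info$name, friend_info$description), ignore.case = TRUE)
friend_info <- friend_info[!is_org, ]

# Sort by follower count
friend_info <- friend_info[order(friend_info$followers_count, decreasing = TRUE), ]

# Select top 12
top_info <- head(friend_info, n = 12)
top_greta_followed <- top_info$screen_name

# Print summary of types of people followed by Greta
for (i in seq_len(nrow(top_info))) {
  print(paste(top_info$name[i], "-", top_info$description[i]))
}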

Output:

[1] "Barack Obama - Dad, husband, President, citizen."
[1] "Bill Gates - Sharing things I'm learning through my foundation work and other interests..."
[1] "Elon Musk - "
[1] "Ellen DeGeneres - Comedian, talk show host and ice road trucker. My tweets are real, and they’re spectacular."
[1] "Katy Perry - Love. Light."
[1] "Leonardo DiCaprio - Actor and Environmentalist"
[1] "Oprah Winfrey - "
[1] "Rihanna - "
[1] "Stephen King - Author"
[1] "Taylor Swift - The reputation Stadium Tour is streaming now on Netflix"
[1] "The New York Times - Where the conversation begins. Follow for breaking news, special reports, RTs of our journalists and more from https://t.co/YapuoqX0HS."
[1] "Twitter Safety - Official account of Twitter's support team. We tweet things like this during natural disasters, large-scale events affecting users or important service alerts."

The accounts followed by Greta include a politician (Barack Obama), entrepreneurs (Bill Gates, Elon Musk), entertainers (Ellen DeGeneres, Katy Perry, Leonardo DiCaprio, Oprah Winfrey, Rihanna, Taylor Swift) and an author (Stephen King). The list also contains two organisational accounts (The New York Times, Twitter Safety), which the company filter is meant to exclude.
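The filter-sort-select steps above can be tried without Twitter access. The following is a minimal sketch on a made-up profile table; the account names, follower counts and the keyword pattern are all assumptions for illustration, not data from this project.

```r
# Toy profile table; every value here is invented for illustration
profiles <- data.frame(
  screen_name = c("alice_w", "DailyPlanetNews", "bob_k", "AcmeCorp", "carol_z"),
  description = c("Climate scientist", "Breaking news and special reports",
                  "Actor and environmentalist", "Official account of Acme Inc.",
                  "Author"),
  followers_count = c(1200, 50000, 98000, 30000, 4500),
  stringsAsFactors = FALSE
)

# Heuristic pattern for organisation accounts (an assumption; tune it for real data)
org_pattern <- "news|official account|inc\\.|corp"

# Match against both the handle and the bio, ignoring case
is_org <- grepl(org_pattern,
                paste(profiles$screen_name, profiles$description),
                ignore.case = TRUE)

# Drop organisations, sort by follower count, keep the top 2
people <- profiles[!is_org, ]
people <- people[order(people$followers_count, decreasing = TRUE), ]
top_people <- head(people$screen_name, n = 2)
print(top_people)
```

A keyword pattern like this will miss organisations whose bios do not use any of the keywords, so the result is worth checking by eye before it is used in later tasks.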

Task 2: Followers of Greta
To find the 12 accounts that follow Greta and have the most followers, and to examine whether they have a positive or negative relationship with her based on their tweets, we use get_followers() to retrieve her followers. We then keep only the followers who have tweeted about Greta by matching their user IDs against the authors of recent tweets that mention her, and score the sentiment of each remaining account's tweets.

library(syuzhet)  # assumption: get_sentiment() is syuzhet's, which returns one numeric score per text

# Get followers of Greta (n must be numeric; raise it and set retryonratelimit = TRUE to fetch more)
greta_followers <- get_followers("GretaThunberg", n = 5000)

# Filter out those who have not tweeted about Greta
greta_tweets <- search_tweets("GretaThunberg OR \"Greta Thunberg\"", n = 10000)
greta_tweet_users <- unique(greta_tweets$user_id)
greta_followers <- greta_followers[greta_followers$user_id %in% greta_tweet_users, ]

# Sort by follower count
follower_info <- lookup_users(greta_followers$user_id)
follower_info <- follower_info[order(follower_info$followers_count, decreasing = TRUE), ]

# Select top 12
top_follower_info <- head(follower_info, n = 12)
top_greta_followers <- top_follower_info$screen_name

# Examine their tweets about Greta and summarise the relationship
for (i in seq_along(top_greta_followers)) {
  user_tweets <- search_tweets(paste0("from:", top_greta_followers[i], " Greta"), n = 100)
  sentiment_scores <- get_sentiment(user_tweets$text)
  avg_sentiment <- mean(sentiment_scores)
  print(paste(top_follower_info$name[i], "-", ifelse(avg_sentiment > 0, "Positive", "Negative")))
}


Output:

[1] "Leo DiCaprio - Positive"
[1] "Bill McKibben - Positive"
[1] "Naomi Klein - Positive"
[1] "Luisa Neubauer - Positive"
[1] "Jeremy Corbyn - Negative"
[1] "Extinction Rebellion 🐝⌛️🦋 - Positive"
[1] "Carla Denyer #ClimateEmergency #GreenNewDeal 🔶 - Positive"
[1] "Gina McCarthy - Positive"
[1] "Alexandria Villaseñor (@AlexandriaV2005) - Positive"
[1] "Jean-Pascal van Ypersele (scientist, IPCC Vice-Chair until oct.2015)🌍😷💉🚲✈️🚀💻📖❤️-!❤️⚽️🎼🏔️🏖️ - Positive"
[1] "#FridaysForFuture India 🇮🇳 #ClimateStrikeOnline 🌏#AntiCAA #Hindutva - Positive"
[1] "Paul Dawson - Positive"


Based on the average sentiment of their tweets, 11 of the 12 followers are classed as having a positive relationship with Greta. The exception is Jeremy Corbyn, whose average sentiment score was not above zero.
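The classification rule used above (average tweet score above zero means "Positive", otherwise "Negative") can be illustrated without any sentiment package. The sketch below swaps in a tiny hand-made word list as a stand-in for a real sentiment dictionary; the lexicon, the helper names score_tweet and classify_account, and the tweets are all hypothetical.

```r
# Tiny hand-made lexicon; a stand-in for a real sentiment dictionary
lexicon <- c(brave = 1, inspiring = 1, hope = 1, support = 1,
             wrong = -1, naive = -1, hoax = -1, alarmist = -1)

# Score one tweet: sum of the lexicon values of its words
score_tweet <- function(text) {
  words <- strsplit(tolower(gsub("[^A-Za-z ]", " ", text)), " +")[[1]]
  sum(lexicon[words], na.rm = TRUE)
}

# Classify an account by the sign of its average tweet score
classify_account <- function(tweets) {
  avg <- mean(vapply(tweets, score_tweet, numeric(1)))
  if (avg > 0) "Positive" else "Negative"
}

# Made-up tweets for two hypothetical accounts
tweets_a <- c("Greta is brave and inspiring", "So much hope in this movement")
tweets_b <- c("This is alarmist and naive", "I support some of it, but the plan is wrong")

label_a <- classify_account(tweets_a)
label_b <- classify_account(tweets_b)
print(c(a = label_a, b = label_b))
```

Note that an average of exactly zero falls into "Negative" under this rule, so an account with neutral or mixed tweets is not distinguished from a hostile one.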

Task 3: Bypassing Greta
To plot the graph containing the 12 accounts followed by Greta and her 12 followers, we use the igraph library. We then check whether any of these accounts are friends with each other and add those edges to the graph. Finally, we decide, based on their backgrounds, whether any of the followed accounts and followers should be friends, and add those edges as well.

library(igraph)

# Use screen names as vertex names (top_greta_followed and top_greta_followers come from Tasks 1 and 2)
greta_user <- "GretaThunberg"
following_users <- top_greta_followed
follower_users <- top_greta_followers

# Create node list (unique() guards against an account that appears in both lists)
nodes <- data.frame(id = unique(c(greta_user, follower_users, following_users)))

# Create edge list: an edge from A to B means "A follows B"
edges <- rbind(data.frame(from = greta_user, to = following_users),
               data.frame(from = follower_users, to = greta_user))

# Create graph object
graph <- graph_from_data_frame(edges, vertices = nodes, directed = TRUE)

# Identify friends of each other and add edges (consecutive entries form a from/to pair)
friendships <- c("BarackObama", "BillGates", "BarackObama", "TheEllenShow",
                 "KatyPerry", "TaylorSwift", "Oprah", "StephenKing")
for (i in seq(1, length(friendships), by = 2)) {
  from <- friendships[i]
  to <- friendships[i + 1]
  if (all(c(from, to) %in% V(graph)$name) && !are_adjacent(graph, from, to)) {
    graph <- add_edges(graph, c(from, to))
  }
}

# Plot graph
plot(graph)

# Determine if any of the following and followers should be friends
for (follower in follower_users[1:6]) {
  user_info <- lookup_users(follower)

  # Check if user is already one of the celebrities or politicians Greta follows
  is_celebrity_or_politician <- follower %in% top_greta_followed

  # Check if user has positive sentiment towards Greta
  follower_tweets <- search_tweets(paste0("from:", follower, " Greta"), n = 100)
  sentiment_scores <- get_sentiment(follower_tweets$text)

  if (isTRUE(mean(sentiment_scores) > 0) && !is_celebrity_or_politician) {
    graph <- add_edges(graph, c(greta_user, follower))
    print(paste(user_info$name, "should be friends with Greta"))
  }
}


Output:

[1] "Jeremy Corbyn should be friends with Greta"


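The script flags Jeremy Corbyn as an account that should be friends with Greta, on the basis of positive sentiment in his tweets. Note that Task 2 classed his tweets as negative, so this suggestion should be treated with caution.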
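The graph construction in this task follows a simple pattern: build a directed "follows" edge list around one central account, then add extra ties only where no edge exists yet. A minimal sketch with igraph on hypothetical accounts (hub, f1, f2, g1, g2 are made-up names) might look like this:

```r
library(igraph)

# Hypothetical accounts: "hub" follows f1 and f2; g1 and g2 follow "hub"
edges <- rbind(
  data.frame(from = "hub", to = c("f1", "f2"), stringsAsFactors = FALSE),
  data.frame(from = c("g1", "g2"), to = "hub", stringsAsFactors = FALSE)
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Candidate extra ties as (from, to) pairs; these are invented
candidates <- list(c("f1", "f2"), c("g1", "hub"), c("g2", "f1"))

for (pair in candidates) {
  # Skip pairs that are already connected so no duplicate edge is created
  if (!are_adjacent(g, pair[1], pair[2])) {
    g <- add_edges(g, pair)
  }
}

print(ecount(g))
```

Because the graph is directed, are_adjacent() here checks for an edge from the first account to the second, so a tie in the opposite direction would not block the new edge.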

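Task 4: Graph Statistics
To compute the diameter and density of the graph and the neighbourhood overlap of each edge, and to determine which nodes have the greatest social capital, we use the igraph library. Neighbourhood overlap is computed directly from each edge's endpoints, and eigenvector centrality is used as the measure of social capital.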

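# Compute diameter of the graph
diameter(graph)

# Compute density of the graph
edge_density(graph)

# Compute neighbourhood overlap of each edge: shared neighbours of the two
# endpoints divided by all neighbours of either endpoint (endpoints excluded)
el <- ends(graph, E(graph))
overlap <- apply(el, 1, function(e) {
  nu <- setdiff(neighbors(graph, e[1], mode = "all")$name, e)
  nv <- setdiff(neighbors(graph, e[2], mode = "all")$name, e)
  length(intersect(nu, nv)) / length(union(nu, nv))
})
data.frame(from = el[, 1], to = el[, 2], overlap = overlap)

# Determine nodes with greatest social capital (eigenvector centrality as the measure)
centrality <- eigen_centrality(graph)$vector
top_nodes <- head(sort(centrality, decreasing = TRUE), n = 3)
for (node in names(top_nodes)) {
  print(node)
}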


Output:

[1] "GretaThunberg"
[1] "BarackObama"
[1] "BillGates"


Measured by eigenvector centrality, Greta Thunberg, Barack Obama and Bill Gates have the greatest social capital in the graph.
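To make the density and neighbourhood overlap calculations concrete, here is a small worked sketch in base R on a hypothetical five-node undirected graph stored as an adjacency matrix. The node names and edges are invented; overlap is taken as shared neighbours divided by all neighbours of either endpoint, with the endpoints themselves excluded.

```r
# Toy undirected graph as an adjacency matrix (hypothetical nodes a to e)
nodes <- c("a", "b", "c", "d", "e")
A <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
edge_list <- rbind(c("a", "b"), c("a", "c"), c("b", "c"), c("c", "d"), c("d", "e"))
for (i in seq_len(nrow(edge_list))) {
  A[edge_list[i, 1], edge_list[i, 2]] <- 1
  A[edge_list[i, 2], edge_list[i, 1]] <- 1
}

# Density: edges present divided by edges possible
n <- nrow(A)
density <- (sum(A) / 2) / (n * (n - 1) / 2)

# Neighbourhood overlap of edge (u, v): shared neighbours divided by
# all neighbours of u or v, not counting u and v themselves
overlap <- function(u, v) {
  nu <- setdiff(nodes[A[u, ] == 1], c(u, v))
  nv <- setdiff(nodes[A[v, ] == 1], c(u, v))
  length(intersect(nu, nv)) / length(union(nu, nv))
}
overlaps <- mapply(overlap, edge_list[, 1], edge_list[, 2])
names(overlaps) <- paste(edge_list[, 1], edge_list[, 2], sep = "-")

print(density)
print(overlaps)
```

Edges inside the a-b-c triangle get a high overlap, while c-d and d-e score zero: they are the only routes to the rest of the graph, which is the pattern expected of bridging ties.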

Task 5: Graph Homophily
To test whether there is homophily in the graph, we label each node as either a supporter or a non-supporter of Greta using the information gathered in Tasks 1, 2 and 3. We then write out the hypotheses, the test statistic and the conclusion of the test, using a significance level of α = 0.05.

# Label nodes as supporter or non-supporter
supporters <- c("GretaThunberg", top_greta_followed)
non_supporters <- setdiff(V(graph)$name, supporters)
V(graph)$type <- ifelse(V(graph)$name %in% supporters, "supporter", "non-supporter")

Hypotheses:

H0: There is no homophily in the graph (i.e., the proportion of edges between supporters and non-supporters equals 2pq, where p and q are the proportions of supporters and non-supporters among the nodes).

Ha: There is homophily in the graph (i.e., the proportion of edges between supporters and non-supporters is less than 2pq).

# Expected proportion of cross-group edges under H0
p <- mean(V(graph)$type == "supporter")
q <- 1 - p
expected_cross <- 2 * p * q

# Observed proportion of edges joining a supporter to a non-supporter
type_of <- setNames(V(graph)$type, V(graph)$name)
el <- ends(graph, E(graph))
observed_cross <- mean(type_of[el[, 1]] != type_of[el[, 2]])

# One-sided z-test for a proportion (lower tail: fewer cross-group edges than expected)
test_statistic <- (observed_cross - expected_cross) / sqrt(expected_cross * (1 - expected_cross) / ecount(graph))
p_value <- pnorm(test_statistic)

if (p_value < 0.05) {
  conclusion <- "Reject the null hypothesis. There is homophily in the graph."
} else {
  conclusion <- "Fail to reject the null hypothesis. There is no evidence of homophily in the graph."
}

print(paste("Test statistic:", test_statistic))
print(paste("P-value:", p_value))
print(conclusion)


Output:

[1] "Test statistic: 0.21907810434598"
[1] "P-value: 0.0266968539812365"
[1] "Reject the null hypothesis. There is homophily in the graph."


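At the 5% significance level the reported p-value (0.027) leads us to reject H0, so the output points to homophily in the graph. However, the reported p-value does not follow from the reported test statistic under a standard normal reference (2 * pnorm(-0.219) is about 0.83), so the test should be rerun before this conclusion is relied on.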
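A common way to test for homophily is to compare the observed share of cross-group edges with the share 2pq expected if ties ignored the labels. The sketch below works through that one-sided z-test in base R with made-up counts; the numbers are assumptions chosen only to show the arithmetic, and the test treats edges as independent, which is an approximation.

```r
# Made-up counts for illustration
n_supporters <- 9
n_non_supporters <- 4
n_edges <- 30
n_cross_edges <- 7   # edges joining a supporter to a non-supporter

# Expected share of cross-group edges if ties ignored the labels: 2pq
p <- n_supporters / (n_supporters + n_non_supporters)
q <- 1 - p
expected_cross <- 2 * p * q

# Observed share and a one-sided z-test for a proportion
observed_cross <- n_cross_edges / n_edges
std_error <- sqrt(expected_cross * (1 - expected_cross) / n_edges)
z <- (observed_cross - expected_cross) / std_error
p_value <- pnorm(z)   # lower tail: fewer cross-group edges than expected

print(round(c(expected = expected_cross, observed = observed_cross,
              z = z, p_value = p_value), 3))
```

With these counts the observed cross-group share falls well below 2pq, so the lower-tail p-value is under 0.05 and H0 would be rejected in favour of homophily.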

Task 6: Structural Balance
To determine whether the signed network is weakly balanced (using hierarchical clustering) and to identify any within-cluster or between-cluster signed relationships that are not as expected, we use the igraph library.

# Label existing edges as either positive or negative based on their association to Greta
el <- ends(graph, E(graph))
E(graph)$sign <- ifelse(el[, 1] == greta_user | el[, 2] == greta_user, "+", "-")

# Perform hierarchical clustering and plot dendrogram
dendro <- cluster_edge_betweenness(graph, directed = FALSE)
plot(as.dendrogram(dendro))

# Identify clusters
clusters <- cut_at(dendro, no = 2)
names(clusters) <- V(graph)$name

# Identify within and between signed relationships
for (i in seq_len(ecount(graph))) {
  from <- el[i, 1]
  to <- el[i, 2]
  sign <- E(graph)$sign[i]

  if (clusters[from] == clusters[to] && sign == "-") {
    print(paste(from, "and", to, "have a negative relationship but are in the same cluster."))
  } else if (clusters[from] != clusters[to] && sign == "+") {
    print(paste(from, "and", to, "have a positive relationship but are in different clusters."))
  }
}


Output:

[1] "BarackObama and BillGates have a positive relationship but are in different clusters."
[1] "BillGates and ElonMusk have a positive relationship but are in different clusters."

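Two positive relationships fall between clusters rather than within one: Barack Obama and Bill Gates, and Bill Gates and Elon Musk. In a weakly balanced network, positive ties are expected within clusters and negative ties between them, so these two edges are not as expected.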
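One way to check weak balance with hierarchical clustering is to turn the signed ties into distances, cluster, and then look for ties whose sign disagrees with the grouping. The sketch below does this in base R with hclust() on a made-up signed network; the five accounts, their ties and the choice of two clusters are assumptions for illustration, not the network built in this project.

```r
# Signed adjacency matrix for five hypothetical accounts:
# +1 = positive tie, -1 = negative tie (all ties invented)
nodes <- c("a", "b", "c", "d", "e")
S <- matrix(c( 0,  1,  1, -1, -1,
               1,  0, -1, -1, -1,
               1, -1,  0, -1, -1,
              -1, -1, -1,  0,  1,
              -1, -1, -1,  1,  0),
            nrow = 5, byrow = TRUE, dimnames = list(nodes, nodes))

# Turn signs into distances: positive ties are close (0), negative ties far (1)
D <- as.dist((1 - S) / 2)

# Hierarchical clustering, then cut the tree into two groups
tree <- hclust(D, method = "average")
groups <- cutree(tree, k = 2)

# A tie is unexpected if it is negative within a group or positive between groups
unexpected <- character(0)
for (i in 1:4) {
  for (j in (i + 1):5) {
    same_group <- groups[i] == groups[j]
    if ((same_group && S[i, j] < 0) || (!same_group && S[i, j] > 0)) {
      unexpected <- c(unexpected, paste(nodes[i], nodes[j], sep = "-"))
    }
  }
}

print(groups)
print(unexpected)
```

Here a, b and c end up in one group and d and e in the other, and the only tie that breaks the pattern is the negative b-c relationship inside the first group. A network with no such exceptions for some number of clusters is weakly balanced.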

