Network Analysis in R

Ted Laderas (laderast@ohsu.edu)

October 24, 2017

Overview

Network Data Representations useful in R

The igraph package

What you need to run these slides

I used the following packages to generate these slides.

library(igraph)
library(knitr)
library(visNetwork)
library(shiny)

Loading Data into igraph

The igraph package has parsers for most common network file formats. Let’s load the Karate network from Network Example Data. It’s in GML format, so we need to specify that when we call read_graph().

library(igraph)
karate <- read_graph("data/karate.gml", format="gml")
class(karate)
## [1] "igraph"
plot.igraph(karate)
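
If data/karate.gml is not at hand, the same read_graph() call can be tried on a file you write yourself. The sketch below is a stand-in under stated assumptions, not the workflow used for these slides: it takes make_graph("Zachary"), igraph's built-in copy of Zachary's karate club network, writes it to a temporary GML file, and reads it back. The object names are made up for illustration.

```r
library(igraph)

# Assumption: igraph's built-in Zachary karate club graph stands in for data/karate.gml
zach <- make_graph("Zachary")

# Write to a temporary GML file, then read it back with an explicit format
gml_file <- tempfile(fileext = ".gml")
write_graph(zach, gml_file, format = "gml")
karate_copy <- read_graph(gml_file, format = "gml")

# The round trip should preserve the number of vertices and edges
vcount(karate_copy)
ecount(karate_copy)
```

Passing format explicitly matters because read_graph() defaults to the edge list format.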

Adding Labels and Improving the Layout

That’s not very informative, especially since the original data doesn’t have labels. Let’s add some:

get.vertex.attribute(karate, "id")
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31 32 33 34
karate <- set.vertex.attribute(karate, "name", 
                               value=c("Chloe", "Emily", "Aaliyah", "Emma", 
                                       "Jennifer", "Olivia", "Hannah",
                                       "Jessica", "Sarah", "Lily", "Charlotte", 
                                       "Elizabeth", "Abigail", "Rebecca",
                                       "Samantha", "Jacob", "Muhammad", "Shawn", 
                                       "Aaron", "Daniel", "Jonah", "Alex", 
                                       "Michael", "James", "Ryan", "Jordan", 
                                       "Alexander", "Ali", "Tyler", "Kevin", 
                                       "Jack", "Ethan", "Luke", "Harry"))

#V() is a way to programmatically access vertices (or nodes) for igraph
V(karate)
## + 34/34 vertices, named:
##  [1] Chloe     Emily     Aaliyah   Emma      Jennifer  Olivia    Hannah   
##  [8] Jessica   Sarah     Lily      Charlotte Elizabeth Abigail   Rebecca  
## [15] Samantha  Jacob     Muhammad  Shawn     Aaron     Daniel    Jonah    
## [22] Alex      Michael   James     Ryan      Jordan    Alexander Ali      
## [29] Tyler     Kevin     Jack      Ethan     Luke      Harry
#get the names (as a character vector)
V(karate)$name
##  [1] "Chloe"     "Emily"     "Aaliyah"   "Emma"      "Jennifer" 
##  [6] "Olivia"    "Hannah"    "Jessica"   "Sarah"     "Lily"     
## [11] "Charlotte" "Elizabeth" "Abigail"   "Rebecca"   "Samantha" 
## [16] "Jacob"     "Muhammad"  "Shawn"     "Aaron"     "Daniel"   
## [21] "Jonah"     "Alex"      "Michael"   "James"     "Ryan"     
## [26] "Jordan"    "Alexander" "Ali"       "Tyler"     "Kevin"    
## [31] "Jack"      "Ethan"     "Luke"      "Harry"
#look at edges
E(karate)
## + 78/78 edges (vertex names):
##  [1] Chloe   --Emily     Chloe   --Aaliyah   Emily   --Aaliyah  
##  [4] Chloe   --Emma      Emily   --Emma      Aaliyah --Emma     
##  [7] Chloe   --Jennifer  Chloe   --Olivia    Chloe   --Hannah   
## [10] Jennifer--Hannah    Olivia  --Hannah    Chloe   --Jessica  
## [13] Emily   --Jessica   Aaliyah --Jessica   Emma    --Jessica  
## [16] Chloe   --Sarah     Aaliyah --Sarah     Aaliyah --Lily     
## [19] Chloe   --Charlotte Jennifer--Charlotte Olivia  --Charlotte
## [22] Chloe   --Elizabeth Chloe   --Abigail   Emma    --Abigail  
## [25] Chloe   --Rebecca   Emily   --Rebecca   Aaliyah --Rebecca  
## [28] Emma    --Rebecca   Olivia  --Muhammad  Hannah  --Muhammad 
## + ... omitted several edges
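
Once vertices have names, V() and E() can be indexed much like ordinary vectors. The following is a minimal sketch, assuming the built-in Zachary graph and placeholder names ("member1", "member2", ...) rather than the names assigned above. It shows indexing by name and by position, plus neighbors() and incident() for looking around a single vertex.

```r
library(igraph)

# Assumption: built-in Zachary graph with placeholder names, not the names used above
g <- make_graph("Zachary")
V(g)$name <- paste0("member", seq_len(vcount(g)))

# Index vertices by name or by position
V(g)["member1"]
V(g)[1:3]$name

# Vertices adjacent to member1, and the edges that touch member1
nbrs <- neighbors(g, "member1")
inc_edges <- incident(g, "member1")

# In a simple graph, each incident edge leads to exactly one neighbor
length(nbrs) == length(inc_edges)
```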

A More Informative Plot

#layout example from https://rulesofreason.wordpress.com/2012/11/05/network-visualization-in-r-with-the-igraph-package/
plot.igraph(karate, layout=layout_with_fr, # the Fruchterman-Reingold layout method. see the igraph documentation for details
        main='Karate Friends!', #specifies the title
        #vertex.label.dist=0.5,         #puts the name labels slightly off the dots
        vertex.label.color='black',     #the color of the name labels
        vertex.label.font=1,            #the font of the name labels
        vertex.label=V(karate)$name,        #specifies the labels of the vertices. in this case the 'name' attribute is used
        vertex.label.cex=0.75,          #specifies the size of the font of the labels. can also be made to vary
        vertex.size=degree(karate)*1.5, #make node size proportional to number of connections
        edge.arrow.size=2
)

Graph Attributes

Each element of a graph (vertices and edges) can be customized by passing in the appropriate attribute. For more info, check out http://www.inside-r.org/packages/cran/igraph/docs/attributes.

Some properties are automatically mapped, such as V(graph)$color or E(graph)$color. If you add a new attribute, you’ll need to map it to a plotting property in plot.igraph().

Remember that after changing graph properties, such as node size, you need to run your layout_ function again.

#setting node properties
#can also use set.vertex.attribute() here
newKarate <- karate

faction <- c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 
                     1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
V(newKarate)$color <- faction

#Set edge attributes using the E() accessor
#can pass a named list for each node as well
#You can also use set.edge.attribute() here
E(newKarate)$color <- "red"

#Here we instantiate a weight vector using sample
weightvec <- sample(c(1,2,3,4), length(E(newKarate)), replace = TRUE)
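#Try to name each edge; these edges have no name attribute, so E(newKarate)$name is NULL and the vector stays unnamed
names(weightvec) <- E(newKarate)$name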
weightvec
##  [1] 2 3 2 3 2 4 1 3 3 1 3 4 2 2 1 3 4 3 1 1 1 3 3 2 2 2 3 3 1 1 4 3 1 1 4
## [36] 4 2 1 1 1 1 3 1 4 3 1 4 4 4 4 1 4 2 4 4 3 4 1 2 2 1 4 2 4 2 3 3 3 2 3
## [71] 2 4 3 3 3 1 2 3
E(newKarate)$weight <- weightvec

#note: a new attribute such as weight is not mapped automatically, so we map it to edge.width ourselves
plot(newKarate, edge.width=E(newKarate)$weight)
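
The mapping step can be sketched end to end on a small, self-contained example. Everything here is an assumption for illustration: the built-in Zachary graph stands in for the karate data, and the role attribute and the color choices are hypothetical. The layout is computed after the weights are set, since layout_with_fr() uses the weight edge attribute by default when one exists.

```r
library(igraph)

# Assumption: built-in Zachary graph; "role" is a hypothetical attribute for illustration
g <- make_graph("Zachary")
V(g)$role <- ifelse(degree(g) > 4, "hub", "regular")

# Custom attributes are not drawn automatically, so translate them to colors
role_colors <- c(hub = "tomato", regular = "lightblue")
vertex_cols <- unname(role_colors[V(g)$role])

# Random edge weights, seeded so the example is reproducible
set.seed(42)
E(g)$weight <- sample(1:4, ecount(g), replace = TRUE)

# Compute the layout after the weights are in place
lay <- layout_with_fr(g)

# Map the custom attributes to plotting properties explicitly
plot(g, layout = lay, vertex.color = vertex_cols, edge.width = E(g)$weight)
```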

Simple Measures

Degree Distribution and Average Path Length

degree(karate)
##     Chloe     Emily   Aaliyah      Emma  Jennifer    Olivia    Hannah 
##        16         9        10         6         3         4         4 
##   Jessica     Sarah      Lily Charlotte Elizabeth   Abigail   Rebecca 
##         4         5         2         3         1         2         5 
##  Samantha     Jacob  Muhammad     Shawn     Aaron    Daniel     Jonah 
##         2         2         2         2         2         3         2 
##      Alex   Michael     James      Ryan    Jordan Alexander       Ali 
##         2         2         5         3         3         2         4 
##     Tyler     Kevin      Jack     Ethan      Luke     Harry 
##         3         4         4         6        12        17
sort(degree(karate), decreasing = TRUE)
##     Harry     Chloe      Luke   Aaliyah     Emily      Emma     Ethan 
##        17        16        12        10         9         6         6 
##     Sarah   Rebecca     James    Olivia    Hannah   Jessica       Ali 
##         5         5         5         4         4         4         4 
##     Kevin      Jack  Jennifer Charlotte    Daniel      Ryan    Jordan 
##         4         4         3         3         3         3         3 
##     Tyler      Lily   Abigail  Samantha     Jacob  Muhammad     Shawn 
##         3         2         2         2         2         2         2 
##     Aaron     Jonah      Alex   Michael Alexander Elizabeth 
##         2         2         2         2         2         1
hist(degree(karate))

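#mean shortest-path length over all pairs of vertices
average.path.length(karate)
## [1] 2.4082
#shortest paths from Muhammad to every other vertex
shortPaths <- get.shortest.paths(karate, from="Muhammad")

#highlight the vertices on the shortest path from Muhammad to the 10th vertex (Lily)
plot(karate, mark.groups=shortPaths$vpath[10])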
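
To make the two measures a little more concrete, here is a hedged sketch on the built-in Zachary graph (an assumption; vertices are referenced by position, so 17 and 10 correspond to Muhammad and Lily only if the vertex order matches the file used above). It computes the degree distribution as relative frequencies and checks that the length of a single shortest path agrees with distances().

```r
library(igraph)

# Assumption: built-in Zachary graph; vertices are referenced by position, not name
g <- make_graph("Zachary")

# Relative frequency of each degree, starting at degree 0
deg_dist <- degree_distribution(g)

# Shortest path between vertex 17 and vertex 10, as a vertex sequence
sp <- shortest_paths(g, from = 17, to = 10)
path <- sp$vpath[[1]]

# A path through k vertices uses k - 1 edges, which matches distances()
hops <- length(path) - 1
hops == distances(g, v = 17, to = 10)[1, 1]

# Highlight the vertices on that path
plot(g, mark.groups = list(as_ids(path)))
```

mean_distance(g) averages these hop counts over all pairs of vertices, which is the quantity average.path.length() reports.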