Graph creation

Michael Taylor

2018/05/30

library(DiagrammeR)
knitr::opts_chunk$set(cache = TRUE)

Graph Creation

Creating a Graph Object

The create_graph() function creates a graph object. The function also allows for initialization of the graph name, the graph time (as a time with an optional time zone included), and any default attributes for the graph (i.e., graph, node, or edge attributes).

# Create the graph object
graph <- create_graph()

The components of the created graph object are:

  • graph_name — optional character vector with a name for the graph
  • graph_time — optional character vector that’s date and/or time
  • graph_tz — optional character vector with the time zone for graph_time
  • nodes_df — optional data frame with the graph’s nodes (or vertices) and attributes for each
  • edges_df — optional data frame with edges between nodes/vertices and attributes for each
  • graph_attrs — optional character vector of attributes pertaining to the entire graph
  • node_attrs — optional character vector of attributes pertaining to the nodes of the graph
  • edge_attrs — optional character vector of attributes pertaining to the edges of the graph
  • directed — a required logical value stating whether the graph should be considered a directed graph (TRUE, the default) or an undirected graph (FALSE)
  • dot_code — an optional character vector containing the automatically generated Graphviz DOT code for the graph

These components of the dgr_graph object are always present, and always in the specified order. The optional components may have NULL values if they are not set (e.g., an edgeless graph will have edges_df returning NULL). To access any of these components directly for a graph named graph, use the construction graph$[component] (so, enter graph$nodes_df into the R console to examine the graph’s node data frame, or NDF). The forthcoming examples use this type of inspection to reveal the contents of created graph objects, although there are convenience functions (covered later) that return certain graph components directly, without need for the $ operator.

For the nodes_df and edges_df arguments, one can supply a node data frame (NDF) and an edge data frame (EDF), respectively. The dgr_graph object can be initialized without any nodes or edges (by not supplying an NDF or an EDF in the function call), which is a good option when nodes and edges will be supplied later by other functions that modify an existing graph. Here is an example in which an empty graph (initialized as a directed graph) is created. Note that the node and edge data frames returned below have zero rows, signifying an empty graph.

# Get the class of the object
class(graph)
## [1] "dgr_graph"
# It's an empty graph, so no NDF
# or EDF
get_node_df(graph)
## [1] id    type  label
## <0 rows> (or 0-length row.names)
get_edge_df(graph)
## [1] id   from to   rel 
## <0 rows> (or 0-length row.names)
# By default, the graph is
# considered as directed
is_graph_directed(graph)
## [1] TRUE
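
As a small illustration of the graph$[component] construction described above, the sketch below reads the directed component directly and compares it with the value returned by is_graph_directed(). It assumes that the dgr_graph object in the installed DiagrammeR version exposes a component named directed, as in the component list above.

```r
library(DiagrammeR)

# Create an empty graph; it is directed by default
graph <- create_graph()

# Direct access to a component with the `$` operator
# (assumes the component is named `directed`)
directed_direct <- graph$directed

# The convenience function returns the same information
directed_accessor <- is_graph_directed(graph)

# Compare the two values
identical(directed_direct, directed_accessor)
```

Direct access is handy for quick inspection, while the accessor functions are less likely to break if the internal layout of the object changes between package versions.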

It’s possible to include an NDF and not an EDF when calling create_graph(). The result is an edgeless graph (a graph with nodes but no edges between those nodes). Edges can always be defined later with functions such as add_edge(), add_edge_df(), and add_edges_from_table(), which are covered in a subsequent section.

###
# Create a graph with nodes but no edges
###

# Create an NDF
nodes <-
  create_node_df(
    n = 4,
    label = FALSE,
    type = "lower",
    style = "filled",
    color = "aqua",
    shape = c("circle", "circle",
              "rectangle", "rectangle"),
    values = c(3.5, 2.6, 9.4, 2.7))
# Examine the NDF
nodes
##   id  type label  style color     shape values
## 1  1 lower  <NA> filled  aqua    circle    3.5
## 2  2 lower  <NA> filled  aqua    circle    2.6
## 3  3 lower  <NA> filled  aqua rectangle    9.4
## 4  4 lower  <NA> filled  aqua rectangle    2.7
# Create the graph and include the
# `nodes` NDF
graph <- create_graph(nodes_df = nodes)
# Examine the NDF within the graph object
get_node_df(graph)
##   id  type label  style color     shape values
## 1  1 lower  <NA> filled  aqua    circle    3.5
## 2  2 lower  <NA> filled  aqua    circle    2.6
## 3  3 lower  <NA> filled  aqua rectangle    9.4
## 4  4 lower  <NA> filled  aqua rectangle    2.7
# It's the same NDF (outside and inside the graph)
dplyr::all_equal(nodes, graph$nodes_df)
## [1] TRUE
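
To illustrate how edges can be defined after the graph has been created, here is a minimal sketch that starts from an edgeless graph and adds a single edge with add_edge(). The node count, the type value, and the rel value are arbitrary choices made for this illustration.

```r
library(DiagrammeR)

# Create an NDF with four nodes and no edges
nodes <- create_node_df(n = 4, type = "lower")

# Create an edgeless graph from the NDF
graph <- create_graph(nodes_df = nodes)

# Add one edge from node 1 to node 2
# (the `rel` value is an arbitrary example)
graph <- add_edge(graph, from = 1, to = 2, rel = "leading_to")

# Examine the EDF within the graph object
edges_after <- get_edge_df(graph)
edges_after
```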

Alternatively, an EDF can be supplied without need to supply an NDF (in which case the node ID values will be inferred but no node attributes will be available).

Quite often, there will be cases where node or edge attributes should be applied to all nodes or edges in the graph. To achieve this, there’s no need to create columns in NDFs or EDFs for those attributes (where you would repeat attribute values through all rows of those columns). Default graph attributes can be provided for the graph with the graph_attrs, node_attrs, and edge_attrs arguments. To supply these attributes, use vectors of graph, node, or edge attributes.

If you want the graph to be a directed graph, then the value for the directed argument should be set as TRUE (which is the default value). Choose FALSE for an undirected graph.
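
A short sketch of the directed argument: the same call to create_graph() is made twice, once with the default and once with directed = FALSE, and is_graph_directed() is used to check the result.

```r
library(DiagrammeR)

# Directed graph (the default)
graph_directed <- create_graph()

# Undirected graph
graph_undirected <- create_graph(directed = FALSE)

# Check the directedness of each graph
is_graph_directed(graph_directed)
is_graph_directed(graph_undirected)
```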

This next example will include both nodes and edges contained within a graph object. In this case, values for the type and rel attributes for nodes and edges, respectively, were provided. Adding values for those attributes is optional but will be important for any data modelling work.

# Create a node data frame
ndf <-
  create_node_df(
    n = 4,
    label = TRUE,
    type = c("a", "b", "c", "d"),
    style = "filled",
    color = "aqua",
    shape = c("circle", "circle",
              "rectangle", "rectangle"),
    values = c(3.5, 2.6, 9.4, 2.7))

# Create an edge data frame
edf <-
  create_edge_df(
    from = c(1, 2, 3),
    to = c(4, 3, 1),
    rel = "leading_to",
    fontname = "Helvetica",
    color = "blue",
    arrowsize = 2)


# Create the graph object from
# the NDF and the EDF
graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edf
    )
# Examine the NDF within the
# graph object
get_node_df(graph)
##   id type label  style color     shape values
## 1  1    a     1 filled  aqua    circle    3.5
## 2  2    b     2 filled  aqua    circle    2.6
## 3  3    c     3 filled  aqua rectangle    9.4
## 4  4    d     4 filled  aqua rectangle    2.7
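
The edges of the same graph can be inspected in the same way with get_edge_df(). The sketch below rebuilds the graph from the example above (with the purely visual attributes left out for brevity) and looks at the from, to, and rel columns of its EDF.

```r
library(DiagrammeR)

# Rebuild the NDF and EDF from the example above
# (visual attributes omitted for brevity)
ndf <-
  create_node_df(
    n = 4,
    label = TRUE,
    type = c("a", "b", "c", "d"))

edf <-
  create_edge_df(
    from = c(1, 2, 3),
    to = c(4, 3, 1),
    rel = "leading_to")

graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edf)

# Examine the EDF within the graph object
edges_in_graph <- get_edge_df(graph)
edges_in_graph[, c("from", "to", "rel")]
```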

Viewing a Graph Object

With the render_graph() function, it’s possible to view the graph object in the RStudio Viewer or to output the DOT code for the current state of the graph.

graph %>% render_graph()
###
# Create a simple graph
# and display it
###

# Create a simple NDF
ndf <-
  create_node_df(
    n = 4,
    type = "number")

# Create a simple EDF
edf <-
  create_edge_df(
    from = c(1, 1, 3, 1),
    to = c(2, 3, 4, 4),
    rel = "related",
    fontname = "Helvetica",
    color = "gray20",
    layout = "neato")

# Create the graph object,
# incorporating the NDF and
# the EDF
graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edf
    )

# View the graph
render_graph(graph)
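
To look at the DOT code as text rather than as a rendered graph, one option is sketched below. It assumes that the installed DiagrammeR version exports a generate_dot() function that takes a graph object and returns the Graphviz DOT code as a character string; if it does not, this sketch will not run as written.

```r
library(DiagrammeR)

# Create a simple NDF and EDF
ndf <-
  create_node_df(
    n = 4,
    type = "number")

edf <-
  create_edge_df(
    from = c(1, 1, 3, 1),
    to = c(2, 3, 4, 4),
    rel = "related")

# Create the graph object
graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edf)

# Generate the DOT code for the graph
# (assumes `generate_dot()` is available)
dot_code <- generate_dot(graph)

# Print the DOT code to the console
cat(dot_code)
```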

With packages such as magrittr or pipeR, one can conveniently pipe output from create_graph() to render_graph(). The magrittr package provides a forward pipe with the %>% operator. With pipeR, use %>>% instead.

###
# Use magrittr's %>% to create a graph and
# then view it without storing that graph object
###

# Create a simple NDF
ndf <-
  create_node_df(
    n = 4,
    type = "number")

# Create a simple EDF
edges <-
  create_edge_df(
    from = c(1, 1, 3, 1),
    to = c(2, 3, 4, 4),
    rel = "related",
    layout = "neato",
    fontname = "Helvetica",
    color = "gray20")

# Create the graph object,
# incorporating the NDF and
# the EDF
graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edges
    )
# Use the %>% operator between
# `create_graph()` and `render_graph()`
create_graph(
  nodes_df = ndf,
  edges_df = edges) %>% 
  render_graph()

Several options are available for creating random graphs, and the best way to understand them is through examples. In the examples here, an empty graph from create_graph() is passed to add_growing_graph(), and the result is piped with the magrittr package’s %>% operator to render_graph() (with output = "visNetwork") to quickly inspect the graph upon creation.

We can create a not-so-random graph with 2 nodes and 1 edge (the graph is directed, since create_graph() produces directed graphs by default). The argument n is the number of nodes, and m is the number of edges added at each step as the graph grows.

###
# Create a very simple random graph
###

# Create a simple, random graph
# and render with the `visNetwork`
# output option
create_graph() %>% 
  add_growing_graph(n = 2, 
                    m = 1,
                    citation = TRUE,
                    set_seed = 23
                    ) %>% 
  render_graph(output = "visNetwork")

It’s better with more nodes and edges though. Try this again with n = 15 and m = 30:

create_graph() %>% 
  add_growing_graph(n = 15, 
                    m = 30,
                    citation = TRUE,
                    set_seed = 23
                    ) %>% 
  render_graph(output = "visNetwork")
# Use `n = 15` and `m = 105` to
# yield a much denser graph
# with 15 nodes

create_graph() %>% 
  add_growing_graph(n = 15, 
                    m = 105,
                    citation = TRUE,
                    set_seed = 23
                    ) %>% 
  render_graph(output = "visNetwork")
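
The set_seed argument used in these examples makes the random graph reproducible. As a minimal sketch, the helper below (a hypothetical function named make_growing_graph, written only for this illustration) builds the same growing graph twice with the same seed and compares the results without rendering them. The value m = 1 is an arbitrary choice.

```r
library(DiagrammeR)

# Hypothetical helper: build a growing graph
# with 15 nodes from a given seed
make_growing_graph <- function(seed) {
  add_growing_graph(
    create_graph(),
    n = 15,
    m = 1,
    citation = TRUE,
    set_seed = seed)
}

# Build the graph twice with the same seed
graph_a <- make_growing_graph(23)
graph_b <- make_growing_graph(23)

# Count the nodes in the first graph
n_nodes <- nrow(get_node_df(graph_a))
n_nodes

# With the same seed, both graphs should
# contain the same edges
same_edges <- identical(get_edge_df(graph_a), get_edge_df(graph_b))
same_edges
```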
sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17134)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252   
## [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C                   
## [5] LC_TIME=English_Canada.1252    
## 
## attached base packages:
## [1] methods   stats     graphics  grDevices utils     datasets  base     
## 
## other attached packages:
## [1] bindrcpp_0.2.2   DiagrammeR_1.0.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.17       plyr_1.8.4         compiler_3.4.4    
##  [4] pillar_1.2.3       RColorBrewer_1.1-2 influenceR_0.1.0  
##  [7] bindr_0.1.1        viridis_0.5.1      tools_3.4.4       
## [10] digest_0.6.15      jsonlite_1.5       viridisLite_0.3.0 
## [13] gtable_0.2.0       evaluate_0.10.1    tibble_1.4.2      
## [16] rgexf_0.15.3       pkgconfig_2.0.1    rlang_0.2.1       
## [19] igraph_1.2.1       rstudioapi_0.7     yaml_2.1.19       
## [22] blogdown_0.6       xfun_0.1           gridExtra_2.3     
## [25] downloader_0.4     dplyr_0.7.5        stringr_1.3.1     
## [28] knitr_1.20         htmlwidgets_1.2    hms_0.4.2         
## [31] grid_3.4.4         rprojroot_1.3-2    tidyselect_0.2.4  
## [34] glue_1.2.0         R6_2.2.2           Rook_1.1-1        
## [37] XML_3.98-1.11      rmarkdown_1.9      bookdown_0.7      
## [40] ggplot2_2.2.1      tidyr_0.8.1        purrr_0.2.5       
## [43] readr_1.1.1        magrittr_1.5       codetools_0.2-15  
## [46] scales_0.5.0       backports_1.1.2    htmltools_0.3.6   
## [49] assertthat_0.2.0   colorspace_1.3-2   brew_1.0-6        
## [52] stringi_1.1.7      visNetwork_2.0.3   lazyeval_0.2.1    
## [55] munsell_0.4.3
## Adding cites for R packages using knitr
knitr::write_bib(.packages(), "packages.bib")
## Warning in citation(pkg, auto = if (pkg == "base") NULL else TRUE): no date
## field in DESCRIPTION file of package 'DiagrammeR'
## Warning in citation(pkg, auto = if (pkg == "base") NULL else TRUE): could
## not determine year for 'DiagrammeR' from package DESCRIPTION file
