I’m trying to assign colors to the links of a Sankey diagram built with networkD3 in RStudio. Specifically, I would like the links to be colored by their source node group (SOF_Data$Species_Binomial).
The links and nodes already display properly, but when the argument to color links is included (colourScale in sankeyNetwork()), the diagram displays blank. Additionally, the argument to color nodes (NodeGroup in sankeyNetwork()) has no effect: the nodes display in the default blue, regardless of the color input from nodes_SOF_Data$color.
Starting data frame, node list creation:
library(dplyr)
library(networkD3)
library(htmlwidgets)
library(data.table)
SOF_Data <- data.frame(
Species_Binomial = c("C. artedi", "C. artedi", "C. artedi", "C. artedi", "C. artedi", "C. artedi", "C. fera", "C. fera"),
Life_Stage = c("Larva/fry", "Embryotic/egg", "Embryotic/egg", "Embryotic/egg", "Embryotic/egg", "Larva/fry", "Embryotic/egg", "Larva/fry"),
Effect_Category = c("Growth", "Growth", "Survival", "Growth", "Growth", "Growth", "Growth", "Growth"),
Categorical_Effect = c("Growth rate", "Other - Specific", "Survival - Specific", "Development rate", "50% hatching time", "Growth rate", "Development rate", "Otolith growth"))
nodes_SOF_Data <- data.frame(name = unique(c(
SOF_Data$Species_Binomial,
SOF_Data$Life_Stage,
SOF_Data$Effect_Category,
SOF_Data$Categorical_Effect)))
nodes_SOF_Data$color <- "#000"
Links creation:
links1_SOF_Data <- SOF_Data %>%
group_by(Species_Binomial, Life_Stage) %>%
summarize(value = n()) %>%
ungroup() %>%
mutate(source = match(Species_Binomial, nodes_SOF_Data$name) - 1,
target = match(Life_Stage, nodes_SOF_Data$name) - 1,
LinkGroup = Species_Binomial)
links2_SOF_Data <- SOF_Data %>%
group_by(Species_Binomial, Life_Stage, Effect_Category) %>%
summarize(value = n()) %>%
ungroup() %>%
mutate(source = match(Life_Stage, nodes_SOF_Data$name) - 1,
target = match(Effect_Category, nodes_SOF_Data$name) - 1,
LinkGroup = Species_Binomial)
links3_SOF_Data <- SOF_Data %>%
group_by(Species_Binomial, Effect_Category, Categorical_Effect) %>%
summarize(value = n()) %>%
ungroup() %>%
mutate(source = match(Effect_Category, nodes_SOF_Data$name) - 1,
target = match(Categorical_Effect, nodes_SOF_Data$name) - 1,
LinkGroup = Species_Binomial)
links_SOF_Data <- bind_rows(links1_SOF_Data, links2_SOF_Data, links3_SOF_Data)
links_SOF_Data <- links_SOF_Data %>%
mutate(color = case_when(Species_Binomial == "C. artedi" ~ "#66c2a5",
Species_Binomial == "C. fera" ~ "#e78ac3"))
Sankey graph code:
colour_scale_species <- JS("function(d) { return d.color; }")
sankey_SOF_Data <- sankeyNetwork(Links = links_SOF_Data,
Nodes = nodes_SOF_Data,
Source = "source",
Target = "target",
Value = "value",
NodeID = "name",
units = "Count",
fontSize = 12,
nodeWidth = 30,
NodeGroup = "color",
LinkGroup = "LinkGroup",
colourScale = colour_scale_species)
sankey_SOF_Data
I’ve tried embedding a color hex code column in the links_PaperData data frame and calling directly from that, although it doesn’t seem to fix the issue.
2 Answers
The main problem is the way the package parses the coloring function. If you look into the source code you will see that the package simply evals the argument colourScale. Hence, you have to pass a string which evaluates to a function. The easiest way to do so is to use a construct like this:
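A minimal sketch of such a construct, assuming the scale only needs to hand back the group value it receives. The name colour_scale_identity is illustrative and not taken from the original answer.

```r
library(htmlwidgets)  # provides JS(), which marks a string as JavaScript

# A bare "function(d) { ... }" string is a function statement without a name,
# which eval() rejects as a syntax error. Wrapping the function in an outer
# function that is invoked immediately turns the string into an expression
# whose value is the inner function.
colour_scale_identity <- JS(
  "(function() { return function(group) { return group; }; })()"
)
```

The result would be passed as colourScale = colour_scale_identity in the sankeyNetwork() call.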
The outer function encapsulates the inner function and simply returns it. As it is directly invoked (note the brackets), eval will indeed receive a function which it can eventually use (this is a bit of a cumbersome interface, and the package author could simplify that by using a different approach).

The second issue is that the color scale function (once we can correctly pass it to sankeyNetwork) will simply receive the value of the column NodeGroup and LinkGroup, respectively. So what you need to do is create a column with the link color in your link data.

Code speaks a thousand words, so here’s a working example (N.B. I refactored your code a bit to avoid duplication and, to show the intended effect of links having the color of their source node, decided to color each node group separately):
Which produces the following graph:
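The two points above (a colourScale string that evaluates to a function, and a color column on the links) can be sketched as follows. This is not the answerer's code: the toy data, the grey used for the non-species nodes, and the names identity_scale and sankey_coloured are assumptions made for illustration.

```r
library(networkD3)    # sankeyNetwork()
library(htmlwidgets)  # JS()

# Toy nodes (hypothetical): each node carries the hex color it should get
nodes <- data.frame(
  name  = c("C. artedi", "C. fera", "Larva/fry", "Embryotic/egg"),
  color = c("#66c2a5", "#e78ac3", "#999999", "#999999"),
  stringsAsFactors = FALSE
)

# Toy links (hypothetical): source/target are zero-based row indices into nodes
links <- data.frame(
  source = c(0, 0, 1, 1),
  target = c(2, 3, 2, 3),
  value  = c(2, 4, 1, 1)
)

# Each link takes the hex color of its source node
links$color <- nodes$color[links$source + 1]

# Identity scale: the group value is already a color, so return it unchanged
identity_scale <- JS(
  "(function() { return function(group) { return group; }; })()"
)

sankey_coloured <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "source", Target = "target", Value = "value",
  NodeID = "name",
  NodeGroup = "color",   # node color column
  LinkGroup = "color",   # link color column
  colourScale = identity_scale
)
sankey_coloured
```

Plain data.frame objects are used here rather than tibbles so that the column lookups inside sankeyNetwork() return ordinary vectors.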
To process your data into the appropriate nodes and links data frames, here is my suggestion…

With those, if you just want to make sure that the links match whatever color their source nodes are, you can simply set LinkGroup = "source" because the source column in the links data frame should match the name of the source node in the nodes data frame (that is why I made the new numeric source_id column).

If you want to set specific colors for the nodes and have the link colors follow, you can add a color column to the nodes and links data frames and use an identity JavaScript function for the colourScale argument.
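A minimal sketch of the LinkGroup = "source" option, assuming the links data frame keeps the node names in source and target and the zero-based indices in separate source_id and target_id columns. The toy values and the name sankey_by_source are illustrative, and the default color scale is left in place.

```r
library(networkD3)  # sankeyNetwork()

# Toy links (hypothetical) keyed by node name
links <- data.frame(
  source = c("C. artedi", "C. artedi", "C. fera", "Larva/fry"),
  target = c("Larva/fry", "Embryotic/egg", "Embryotic/egg", "Growth"),
  value  = c(2, 4, 1, 3),
  stringsAsFactors = FALSE
)

# One row per distinct node name
nodes <- data.frame(
  name = unique(c(links$source, links$target)),
  stringsAsFactors = FALSE
)

# Zero-based numeric ids for sankeyNetwork(); the name columns stay for grouping
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1

sankey_by_source <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "source_id", Target = "target_id", Value = "value",
  NodeID = "name",
  NodeGroup = "name",    # nodes are grouped by their own name
  LinkGroup = "source"   # same key as the source node's group
)
sankey_by_source
```

Because a link's group and its source node's group are the same string, the color scale should map both to the same color.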