
The MWE below uses artificial data to demonstrate the essence of a script I have written in R. As the graph produced by this code shows, one of my conditions has no “No” value to complete the series.

I have been told that unless I can make this last column, which lacks the second series, as thin as the columns elsewhere in the graph, I won’t be permitted to use these graphs. This is a problem because the script I have written produces hundreds of graphs simultaneously, complete with stats, significance indicators, propagated error bars, and intelligent y-axis adjustments (these features are of course not present in the MWE).

A few other comments:

  • This exception column is not guaranteed to be at the end of the graph, so manually forcing the series to change color and inverting the order, leaving the extra space on the right-hand side, isn’t reliable.

  • I have tried to simulate the missing data as a constant 0 so that the series “is present” but invisible. As would be expected, the series order c(No, Yes) makes this leave a gap, which is also unacceptable. This is how the same question was answered in “Consistent width for geom_bar in the event of missing data” and “Include space for missing factor level used in fill aesthetics in geom_boxplot”, but sadly it doesn’t work for me given my restrictions.

  • I also tried to do this with facets, but numerous issues arose there, including line breaks and errors in the annotations I add to the x-axis.

MWE:

library(ggplot2)

print("Program started")

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
df <- as.data.frame(cbind(x,s,y))

print(df)

gg <- ggplot(data = df, aes_string(x="x", y="y", weight="y", ymin=paste0("y"), ymax=paste0("y"), fill="s"));
dodge_str <- position_dodge(width = NULL)  # height dropped: current position_dodge() has no such argument
gg <- gg + geom_bar(position=dodge_str, stat="identity", size=.3, colour = "black")

print(gg)

print("Program complete - a graph should be visible.")

2 Answers


  1. Yeah, I figured out what happened: you need to be extra careful about factors being factors and numerics being numerics. In my case, with stringsAsFactors = FALSE, I have

    str(df)
    'data.frame':   7 obs. of  3 variables:
     $ x: chr  "1" "2" "3" "1" ...
     $ s: chr  "No" "No" "No" "Yes" ...
     $ y: chr  "1" "2" "3" "2" ...
    
    dput(df)
    structure(list(x = c("1", "2", "3", "1", "2", "3", "4"), s = c("No", 
    "No", "No", "Yes", "Yes", "Yes", "Yes"), y = c("1", "2", "3", 
    "2", "3", "4", "5")), .Names = c("x", "s", "y"), row.names = c(NA, 
    -7L), class = "data.frame")
    

    with no factors, and the numeric column turned into character because of the cbind-ing (sic!). Let us build another data frame:

    dff <- data.frame(x = factor(df$x), s = factor(df$s), y = as.numeric(df$y))
    

    Adding a “dummy” row (manually for your example; check out the expand.grid version in the linked question on how to do this automatically). The row is added as a one-row data frame so that y stays numeric:

    dff <- rbind(dff, data.frame(x = "4", s = "No", y = NA))
    

    Plotting (I removed extra aes):

    ggplot(data = dff, aes(x, y, fill=s)) + 
      geom_bar(position=dodge_str, stat="identity", size=.3, colour="black")
    

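The dummy row that the first answer adds by hand can also be generated for every missing x/s combination. A minimal sketch in base R, assuming the same columns x, s and y as in the question; the names all_combos and df_full are made up for illustration and are not from the original post:

```r
# Same data as the question, built with data.frame() so y stays numeric
df <- data.frame(
  x = factor(c("1", "2", "3", "1", "2", "3", "4")),
  s = factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  y = c(1, 2, 3, 2, 3, 4, 5)
)

# Every x/s combination that should get a slot in the plot
all_combos <- expand.grid(x = levels(df$x), s = levels(df$s))

# Keep all rows of the grid; combinations absent from df get y = NA
df_full <- merge(all_combos, df, by = c("x", "s"), all.x = TRUE)

print(df_full)
```

df_full can then be passed to ggplot in place of dff. The NA row draws nothing but, as in the answer above, it should still reserve its slot when the bars are dodged.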
  2. At the expense of doing your own calculation for the x coordinates of the bars as shown below, you can get a chart which may be close to what you’re looking for.

    library(ggplot2)

    x <- c("1","2","3","1","2","3","4")
    s <- c("No","No","No","Yes","Yes","Yes","Yes")
    y <- c(1,2,3,2,3,4,5)
    # data.frame() directly, not cbind(), so that y stays numeric
    df <- data.frame(x, s, y)
    # running position of each bar, ordered by x and then by series
    df$x_pos[order(df$x, df$s)] <- 1:nrow(df)
    # number of bars per x value and the center of each group
    x_stats <- as.data.frame.table(table(df$x), responseName="x_counts")
    x_stats$center <- tapply(df$x_pos, df$x, mean)
    df <- merge(df, x_stats, by.x="x", by.y="Var1", all=TRUE)
    bar_width <- .7
    # a lone bar keeps its position; paired bars are pulled together until they touch
    # (row columns: x[2] = s, x[4] = x_pos, x[5] = x_counts)
    df$pos <- apply(df, 1, function(x) {
      xpos <- as.numeric(x[4])
      if (x[5] == 1) xpos
      else ifelse(x[2] == "No", xpos + .5 - bar_width/2, xpos - .5 + bar_width/2)
    })
    print(df)
    gg <- ggplot(data=df, aes(x=pos, y=y, fill=s))
    gg <- gg + geom_bar(position="identity", stat="identity", size=.3, colour="black", width=bar_width)
    gg <- gg + scale_x_continuous(breaks=df$center, labels=df$x)
    plot(gg)
    

    Edit: modified to place the labels at the center of the bars.

    Gives the following chart

    [Image: chart produced by the code above]
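As an alternative to both answers, and assuming a reasonably recent ggplot2 (as far as I know, position_dodge2() and its preserve argument were added after these answers were written), the bar width can be kept constant without dummy rows or hand-computed positions. A minimal sketch using the question's data:

```r
library(ggplot2)

# Same data as the question, built with data.frame() so y stays numeric
df <- data.frame(
  x = c("1", "2", "3", "1", "2", "3", "4"),
  s = c("No", "No", "No", "Yes", "Yes", "Yes", "Yes"),
  y = c(1, 2, 3, 2, 3, 4, 5)
)

# preserve = "single" keeps every bar as wide as a bar in a full group,
# instead of stretching a lone bar to fill the whole group
gg <- ggplot(df, aes(x = x, y = y, fill = s)) +
  geom_col(position = position_dodge2(preserve = "single"), colour = "black")

print(gg)
```

position_dodge(preserve = "single") accepts the same argument; the two differ in where the lone bar sits within its group, so it is worth trying both against the existing x-axis annotations.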