skip to Main Content

This is a continuation from a previous question:
Rfast hd.eigen() returns NAs but base eigen() does not

I have been having a problem with .Internal(La_rs((x)) returning different results on different machines.

I suspect it may have something to do with number formatting, because on the same machine, if I save as a CSV and re-open, I don’t get negatives anymore:

On Clear Linux install:

> load("input_to_La_rs.Rdata")
> r <- .Internal(La_rs(as.matrix(x), only.values = FALSE))
> sum(r$values < 0)
[1] 1
> write.csv(x, "test_for_internal.csv", row.names = FALSE)
> x <- read.csv("test_for_internal.csv")
> r <- .Internal(La_rs(as.matrix(x), only.values = FALSE))
> sum(r$values < 0)
[1] 0

However on my Windows install (and on a CentOS based HPC setup), I can open the rdata file directly and don’t get negative values:

> load("input_to_La_rs.Rdata")
> r <- .Internal(La_rs(x, only.values=TRUE))
> sum(r$values < 0)
[1] 0

Is this related to different R builds/library versions? Some setting I don’t know about? A bug?

Edit: here is an updated example. It seems to work inconsistently, even on the this particular install, sometimes I do get zero:

set.seed(123)
bigm <- matrix(rnorm(2000*2000,mean=0,sd = 3), 2000, 2000)
m <- Rfast::colmeans(bigm)
y <- t(bigm) - m
xx <- crossprod(y)
x <- unname(as.matrix(xx))
b <- .Internal(La_rs(x, TRUE))
sum(b$values < 0)
# [1] 1

Yet another update: It turns out that the first difference creeps in with Rfast‘s colmeans producing slightly different results than base colMeans.

    set.seed(123)
    bigm <- matrix(rnorm(2000*2000,mean=0,sd = 3), 2000, 2000)
    m <- colMeans(bigm)
    m <- colmeans(bigm)
    y <- t(bigm) - m
    xx <- crossprod(y)
    x <- unname(as.matrix(xx))
    b <- .Internal(La_rs(x, TRUE))
    sum(b$values < 0)
  # [1] 1

    m <- colMeans(bigm)
    y <- t(bigm) - m
    xx <- crossprod(y)
    x <- unname(as.matrix(xx))
    b <- .Internal(La_rs(x, TRUE))
    sum(b$values < 0)

2

Answers


  1. I think the difference is rather negligible. Average difference of the matrix elements equal to 10^(-12) or less are actually zero.

    Login or Signup to reply.
  2. The hd.eigen function in Rfast works only, only for the case of n < p, i.e. when the rows are less than the columns. In the help page of the hd.eigen function is the reference to the paper that suggested this algorithm. The algorithm I do not think works for any other case. Perhaps that is why you get NAs.

    Rfast2 contains a function called “pca” that works for either case, np. Try that one also. Inside there, an SVD is effectively performed calling “svd” from R.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search