Amazon web services - R random numbers almost the same but not identical

Earlien
November 10, 2024
181 views
0 votes
2 Answers

I am running the same version of R on two machines, one a Linux R server and one on AWS. The RNG output is almost the same, but not always identical. Out of 1 million samples each from the uniform, gamma, and normal distributions:

runif() produces identical results.
rgamma() produces 7 small differences; otherwise identical results.
rnorm() also produces 7 small differences; otherwise identical results.

By small differences, I mean something like 1.4510448921274106 vs 1.4510448921274115.

What would be causing these differences? If a floating point issue, why only some distributions? If an OS/library/software issue, why only different on rare occasions?
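
For reference, a minimal sketch of how such a comparison could be run. The seed, the gamma shape parameter, the file name, and both helper functions are assumptions made for illustration, not details of the setup described above.

```r
# Hypothetical helper: draw the three samples from a fixed seed.
draw_samples <- function(seed = 1, n = 1e6) {
  set.seed(seed)
  list(unif = runif(n), gamma = rgamma(n, shape = 2), norm = rnorm(n))
}

# Hypothetical helper: for each distribution, count the positions that
# differ and report the largest absolute gap between the two machines.
compare_samples <- function(a, b) {
  sapply(names(a), function(k) {
    d <- a[[k]] != b[[k]]
    c(n_diff = sum(d), max_abs_diff = max(abs(a[[k]] - b[[k]])))
  })
}

# On machine A: saveRDS(draw_samples(), "samples_a.rds")
# On machine B: compare_samples(readRDS("samples_a.rds"), draw_samples())
```

saveRDS() stores the doubles in binary form, so the comparison is not affected by how many digits are printed.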

Tags: amazon-web-services r random

Answers

- Bensstats
- November 10, 2024 at 12:34 am
- 0 votes
Besides seed choice, your issue might lie in the choice of pseudo-random number generator (PRNG) that your R environment is using.

R uses the Mersenne Twister by default to generate random integers, which are then scaled to the range between 0 and 1, thus simulating uniform random variables. Other distributions are then derived from these uniforms, for example via the inverse probability transform, which is R's default method for rnorm().
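
As a rough sketch of the idea, the snippet below first prints which generators the session is using (worth running on both machines) and then applies the inverse transform by hand. The seed, sample size, and exponential rate are arbitrary choices for illustration, and the hand-rolled version is not expected to reproduce rnorm()'s exact values, since R's internal routine differs in detail.

```r
# Report the generators in use; run this on both machines and compare.
print(RNGkind())

# Inverse transform by hand: push uniforms through a quantile function.
set.seed(1)
u <- runif(5)
x_norm <- qnorm(u)          # standard normal draws via the normal quantile function
x_exp  <- -log(1 - u) / 2   # exponential draws with rate 2 (closed-form inverse CDF)
```

Because qnorm() and log() involve real floating-point work, this step is where two platforms could in principle round differently, while the uniforms feeding them stay the same.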

For an easier introduction to how PRNGs work, check out this Khan Academy video.

You can also check out the R documentation on this topic:
1. https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Random
2. https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Random.user
If you want to look at implementing different PRNGs to see what happens "under the hood" in R, I am in the process of developing an R package that allows users to implement different PRNGs via functions. Check it out here.

- BenBolker
- November 10, 2024 at 2:41 am
- 0 votes
runif() is not really using floating point; it's doing integer arithmetic (I originally thought it was a tricky/hacky 64-bit integer computation, but the default Mersenne Twister works on 32-bit integers), and only converting to floating point at the last step. So it is not subject to cross-platform/cross-compiler floating-point artifacts.
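
One way to see this, sketched under the assumption that the default Mersenne Twister is in use and that it emits 32-bit integers scaled by 2^-32: every runif() value should then sit on a grid of multiples of 2^-32. The seed and sample size are arbitrary.

```r
# Force the generator assumed in this sketch, then draw uniforms.
set.seed(42, kind = "Mersenne-Twister")
u <- runif(1e5)

# Multiplying by a power of two is exact, so this undoes the final scaling.
scaled <- u * 2^32

# If each draw is an integer times 2^-32, nothing is left after the point.
on_grid <- all(scaled == floor(scaled))
print(on_grid)
```

A value built from an exact integer and a single exact scaling leaves no room for platform-dependent rounding, which fits with runif() matching on both machines.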

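As for "why only different on rare occasions": I assume that the rgamma() and rnorm() implementations are relatively numerically stable, so that the possibilities for floating-point/roundoff error are rare. Such differences typically come from the platform's math library or from compiler choices, which agree on most inputs and only occasionally round the last few bits differently.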
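
To put the size of the discrepancy in perspective, here is a small sketch that measures the gap between the two values quoted in the question in units in the last place (ULPs). It relies on the fact that doubles between 1 and 2 are spaced exactly .Machine$double.eps apart.

```r
a <- 1.4510448921274106
b <- 1.4510448921274115

# Spacing of adjacent doubles in [1, 2); both values lie in that interval.
ulp <- .Machine$double.eps

# The subtraction of two nearby doubles is exact, so this is a whole number.
ulps_apart <- (b - a) / ulp
print(ulps_apart)

# identical() is a bitwise test; all.equal() allows a small tolerance.
print(identical(a, b))
print(isTRUE(all.equal(a, b)))
```

A gap of only a few ULPs is what roundoff in the last bits looks like, so when comparing results across machines a tolerance-based check such as all.equal() is usually more appropriate than identical().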
