R Style guide
R Code Optimization
How do I save my command history to disk?
How do I save specific R objects to disk?
How do you find help for write.exprs? Help for ExpressionSet appears.
Need some comment from more experienced R users.
While using R code for alpha spending, odd behavior in random number generation was noticed: multiple invocations of R from the same directory produced the same set of random numbers. This happens because the first invocation saves the state of the random number generator (initially seeded from the time stamp) in the .RData file. Later invocations from the same directory do NOT seed from the current time stamp; instead they read the previously saved seed from the .RData file, so the same set of random numbers is generated.
For more information: http://tolstoy.newcastle.edu.au/R/devel/04a/1263.html
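One possible workaround, sketched here rather than taken from the thread above, is to discard any generator state that came in with a restored workspace before drawing random numbers. R keeps that state in a global variable named .Random.seed; the variable names had_saved_state and x are only for illustration. Starting R with the --no-restore option should avoid loading the .RData file in the first place.

```r
# Check whether a restored workspace brought a saved RNG state with it.
had_saved_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)

# Drop the restored state so that R seeds itself afresh on the next draw.
if (had_saved_state) {
  rm(list = ".Random.seed", envir = globalenv())
}

# The first draw after removal creates a new .Random.seed.
x <- runif(3)
```

This only affects the current session; if the workspace is saved again on exit, the new state is written to .RData and the same issue can recur.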
How do I guarantee that my simulated data sets are reproducible across clusters when using array jobs?
Setting the seed for random number generation inside your program is the usual first step toward letting another investigator replicate your results. Note, however, that when multiple R processes run across nodes, calling set.seed with the same value in each process generates the same dataset in each one, because one R process is usually not aware of the others in this setting. To avoid this, we have the following suggestions for running multiple copies of R across the nodes on the network:
- Use $SGE_TASK_ID as the seed value in each simulation. Each simulation in an array job has its own task id, so that id can be passed as the argument to set.seed. This makes the simulations easier to reproduce and track.
- Use the current date/time transformed into an integer value. This is an easy way to obtain a seed, but it is harder to reproduce: you must write out the combined date/time value for each run in order to regenerate the same random values later.
- If you have k jobs, create k text files, each containing a unique seed value. When the i-th simulation starts, open the i-th file and read the seed. This alternative gives you more control over the seed values that are used. However, it adds a lot of file I/O.
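The three suggestions above can be sketched in R roughly as follows. This is a minimal illustration, not a prescribed implementation: the function names, the fallback default seed, the log file argument, and the seed_<i>.txt file naming are all assumptions made for the example. Only Sys.getenv, Sys.time, readLines, writeLines, and set.seed are standard R functions.

```r
# Option 1: seed from the array task id.
# Falls back to `default` (a hypothetical choice) when SGE_TASK_ID is
# unset or not a number, e.g. outside an array job.
seed_from_task_id <- function(default = 1L) {
  id <- suppressWarnings(as.integer(Sys.getenv("SGE_TASK_ID", unset = NA)))
  if (is.na(id)) default else id
}

# Option 2: seed from the clock (seconds since the epoch), writing the
# value to `log_file` so that the run can be replayed later.
seed_from_clock <- function(log_file) {
  seed <- as.integer(Sys.time())
  writeLines(as.character(seed), log_file)
  seed
}

# Option 3: read the seed for job i from its own text file.
# The file name pattern seed_<i>.txt is hypothetical.
seed_from_file <- function(i, dir = ".") {
  path <- file.path(dir, sprintf("seed_%d.txt", i))
  as.integer(readLines(path, n = 1L))
}

# Example: seed this task from its task id, then simulate.
seed <- seed_from_task_id()
set.seed(seed)
sim <- rnorm(5)
```

Whichever option is used, saving the seed value alongside each task's output makes it straightforward to regenerate a single simulated data set later.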