Hee: weird error with R when using data.table -

Saturday, 15 May 2010

Weird error with R when using data.table

I'm doing some small calculations and decided to store the results in a data.table, since it's much faster than a data.frame with rbind.

So the code looks like this:

df is a data.frame used in the calculation; its exact contents are not important.

    l = 12000
    dti = 1
    dt = data.table(ni = 0, nj = 0, regerr = 0)
    for (i in seq(1, 12000, 200)) {
        for (j in seq(1, 12000, 200)) {
            for (ind in 1:nrow(df)) {
                if (i + j >= l/2) {
                    df[ind,]$x = df[ind,]$pos * 2
                } else {
                    df[ind,]$x = df[ind,]$pos/l
                }
            }
            for (i in 1:100) { # 100 sample
                sample(df$x, nrow(df), replace=FALSE)
                fit = lm(x ~ gx, df)   # linear regression calculation
                regerror = sum(residuals(fit)^2)
                print(paste(i, j, regerror))
                set(dt, dti, 1L, as.double(i))
                set(dt, dti, 2L, as.double(j))
                set(dt, dti, 3L, regerror)
                dti = dti + 1
            }
        }
    }

The code prints the first few rounds of print(paste(i,j,regerror)) and then quits with this error:

     *** caught segfault ***
    address 0x3ff00008, cause 'memory not mapped'
    Segmentation fault (core dumped)

Edit: here is the dput() output for df:

    structure(list(ax = c(-0.0242214, 0.19770304, 0.01587302, -0.0374415,
        0.05079826, 0.12209738), gx = c(-0.3913043, -0.0242214, -0.4259067,
        -0.725, -0.0374415, 0.01587302), pos = c(11222, 13564, 16532,
        12543, 12534, 14354)), .Names = c("ax", "gx", "pos"), row.names = c(NA,
        -6L), class = "data.frame")

Any ideas appreciated.

Without meaning to sound rude, I think you may benefit from reading a few R tutorials before going forward. This question may also be closed as too localized. Also, seg faults are a bug somewhere, but you can avoid a bunch of headache by understanding what each piece of your code is doing. Since it's Friday, let's walk through some of it:

    if( i+j >= l/2 ){
        data[ind,]$x = df[ind,]$pos * 2
    } else{
        data[ind,]$x = df[ind,]$pos/l
    }

I'll assume data is meant to be df and go from there. We're within two loops over i and j, which both go from 1 to 12000. They will never sum to less than 1/2, so you will always execute the first statement. Also, if you ever expected the false case to occur, you need the else on the same line as the closing brace:

    if (i + j >= 1/2) {
        df$x <- df$pos * 2
    } else {
        df$x <- df$pos
    }

R is vectorized, so doing the above is the same as looping through every value and multiplying it by 2. I also removed the / 1 from the second statement since it doesn't do anything. This whole section can be moved outside of the loop, since it is a constant operation: adding a column x that is double the column pos.

Next, the loop for your fit:
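To make the vectorization point concrete, here is a small sketch comparing a row-by-row fill with a single column assignment. It uses the gx and pos values from the question's dput() output; the helper names fill_loop and fill_vec are made up for this illustration, and l is treated as the variable set to 12000 in the question (an assumption, since the walkthrough above reads the condition as 1/2). Under either reading the test does not depend on the row, so it only needs to be evaluated once.

```r
# Sample rows taken from the question's dput() output.
df <- data.frame(
  gx  = c(-0.3913043, -0.0242214, -0.4259067, -0.725, -0.0374415, 0.01587302),
  pos = c(11222, 13564, 16532, 12543, 12534, 14354)
)

# Row-by-row version, in the spirit of the question's inner loop.
# fill_loop is a hypothetical helper; i, j and l are passed in explicitly.
fill_loop <- function(df, i, j, l) {
  df$x <- NA_real_
  for (ind in seq_len(nrow(df))) {
    if (i + j >= l / 2) {
      df$x[ind] <- df$pos[ind] * 2
    } else {
      df$x[ind] <- df$pos[ind] / l
    }
  }
  df
}

# Vectorized version (also a hypothetical helper): the condition is the same
# for every row, so it is tested once and the whole column is assigned at once.
fill_vec <- function(df, i, j, l) {
  df$x <- if (i + j >= l / 2) df$pos * 2 else df$pos / l
  df
}

# Compare both versions on each side of the threshold.
identical(fill_loop(df, 1, 1, 12000), fill_vec(df, 1, 1, 12000))
identical(fill_loop(df, 6001, 1, 12000), fill_vec(df, 6001, 1, 12000))
```

If the condition really is constant, as the walkthrough assumes, the vectorized version reduces to the single line df$x <- df$pos * 2.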

    for (i in 1:100) { # 100 sample
        sample(df$x, nrow(df), replace=FALSE)
        fit = lm(x ~ gx, df)   # linear regression calculation
        regerror = sum(residuals(fit)^2)
        print(paste(i, j, regerror))
        set(dt, dti, 1L, as.double(i))
        set(dt, dti, 2L, as.double(j))
        set(dt, dti, 3L, regerror)
        dti = dti + 1
    }

Calling sample(df$x, nrow(df), replace=FALSE) will show you a new order, but it doesn't actually assign it to anything. Instead, use df$x <- sample(df$x, nrow(df), replace=FALSE).
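A minimal sketch of that behaviour, using a made-up vector x rather than the question's data: sample() returns a permuted copy and leaves its input alone, so the result has to be assigned to have any effect.

```r
set.seed(1)  # fixed seed so the shuffle is repeatable; any value works
x <- c(10, 20, 30, 40, 50, 60)  # illustrative values, not from the question

# sample() returns a permuted copy of its input.
shuffled <- sample(x, length(x), replace = FALSE)

x                      # still in its original order
setequal(shuffled, x)  # same values, just possibly reordered

# Only an explicit assignment changes x itself.
x <- sample(x)
```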

Now, it looks like you're going to assign to dt (which, like df, is also the name of a function in R and is best avoided as a variable name) at row dti the result of your fit error and the indices? As far as I can tell, nothing depends on i or j. Instead, you're going to perform a randomly ordered fit 60 * 60 * 100 times... If that is what you want to do, by all means go for it! But do it in an efficient way instead:

    df$x <- df$pos * 2
    fit.fun <- function(n, dat) {
        jumble <- sample(nrow(dat))
        dat$x <- dat$x[jumble]
        sum(residuals(lm(x ~ gx, dat))^2)
    }

    sapply(1:10, fit.fun, dat=df)
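If the results still need to end up in a data.table, as in the question, one option is sketched below. The walkthrough above does not diagnose the crash itself, so the following is an assumption: dt is created with a single row while dti keeps growing, which means set() is asked to write to rows that do not exist. set() overwrites existing cells by reference and does not grow the table, so every row is allocated up front here. The names n_runs and dt_res are made up for this example.

```r
library(data.table)

# Values from the question's dput() output.
df <- data.frame(
  gx  = c(-0.3913043, -0.0242214, -0.4259067, -0.725, -0.0374415, 0.01587302),
  pos = c(11222, 13564, 16532, 12543, 12534, 14354)
)
df$x <- df$pos * 2

# Same idea as fit.fun above: shuffle x, refit, return the residual sum of squares.
fit.fun <- function(n, dat) {
  dat$x <- dat$x[sample(nrow(dat))]
  sum(residuals(lm(x ~ gx, dat))^2)
}

n_runs <- 10L  # hypothetical number of repetitions

# Preallocate all rows so that every index passed to set() already exists.
dt_res <- data.table(run = seq_len(n_runs), regerr = numeric(n_runs))

for (k in seq_len(n_runs)) {
  # Overwrite one existing cell by reference; k never exceeds nrow(dt_res).
  set(dt_res, i = k, j = "regerr", value = fit.fun(k, df))
}

dt_res
```

For a loop this small the sapply() call is simpler; preallocating and using set() mainly pays off when there are many iterations.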

r data.table
