Saturday, 15 September 2012

r - ggplot2 stat_summary error message -



I have a data frame with 3 columns. site and provider are strings, and los is continuous. The full dataset has over 1500 rows, so I sampled it with a seed to make the analysis easier. This is the data frame I am using:

heccsv:

site  provider  los
sfh   asand     259
sfh   asand     343
sjh   lei,      203
sfh   harme     182
sjh   khan,     303
emh   deluh     145
sjh   vissa     317
emh   suria     266
sjh   makow     113
sjh   herna     263
frk   smuko     262
emh   bruce     309
sfh   cesar     197
frk   smuko     217
sfh   shall     200
sjh   nash,     258
sjh   dough     391
sjh   lacro     368
frk   richb     196
sjh   donie     208
sjh   monaw     245
sjh   stull     307
sjh   schil     330
sfh   abern     340
sjh   deluh     420
sjh   deluh     160
sfh   kortb     328
sjh   frazi     150
frk   blank     281
sjh   kraus     109
sjh   rosie     279
sjh   murph     200
sfh   engli     231
sfh   smuko     205
sjh   johns     360
sjh   donie     346
emh   maure     102
sjh   manth     205
sjh   frazi     289
sjoc  moran     172
sfh   cesar     112
sjh   herna     111
sjh   lacro     211
sjh   harme     343
sjh   dixon     89
sfh   culle     165
sjh   wilso     239
sjh   culle     200
sfh   smuko     178
sfh   binda     98
sjh   abern     178
emh   maure     352
sfh   bergs     201
sjh   ander     255
sjh   hubba     107
sfh   asand     1102
emh   manth     143
sjh   deluh     213
emh   ruval     258
sjh   vissa     350
sjh   frazi     364
frk   pilla     228
sfh   wenni     335
sfh   wilso     214
sjh   culle     248
sjh   lacro     298
twhh  parri     135
sjh   suria     234
sfh   abern     317
frk   kraus     223
sjh   suria     310
emh   glins     318
sjh   adar,     308
sjh   makow     253
sjh   murph     257
sfh   abern     262
sjh   stull     514
sjh   ander     324
sjh   khan,     117
sjh   lacro     151
emh   pilla     150
sjh   muell     295
sfh   richb     149
sfh   manth     315
frk   herna     218
frk   asand     167
emh   donof     161
emh   swart     243
sjh   frazi     392
frk   donie     213
sfh   smuko     276
sjh   maure     531
frk   maure     241
sjh   lacro     127
emh   ruval     349
emh   donof     346
sjh   culle     399
emh   ander     243
sjh   makow     175
sfh   honno     285

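I'm interested in plotting the median los for each provider in each site. I also want to label the outliers. I chose to use geom_point with jitter. Following online comments from others, I created a second data frame, called datjit. In this new data frame, los is the median los for each site and provider group. I added sm, a per-site cutoff computed from those medians grouped by site (the code uses the 0.96 quantile), xj, the x-axis jitter for ggplot, and out, which is TRUE for the rows I want to label on the plot. The code looks like this: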

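library(plyr)

# Median los per site/provider pair; ddply names the result column V1
data <- ddply(heccsv, .(site, provider), function(x) median(x$los))
# Per-site cutoff: 0.96 quantile of the provider medians
info <- ddply(data, .(site), function(g) { g$sm <- quantile(g$V1, 0.96); g })
colnames(info) <- c("site", "provider", "los", "sm")
datjit <- info
# Numeric site code plus random noise, used as the x position
datjit$xj <- jitter(as.numeric(factor(datjit$site)))
# Flag the rows whose median los reaches the site cutoff
datjit <- ddply(datjit, .(los), .fun = function(g) { g$out <- g$los >= g$sm; g })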

The datjit data frame looks like this:

site  provider  los    sm      xj         out
emh   manth     36.0   312.60  0.8989241  FALSE
emh   ander     62.0   312.60  1.1421376  FALSE
sjh   makow     92.0   402.00  4.1620511  FALSE
frk   baniu     101.0  296.08  1.8130307  FALSE
emh   harme     104.0  312.60  0.9778059  FALSE
sjh   smuko     110.0  402.00  4.0529616  FALSE
sfh   arbit     117.5  571.20  2.9281353  FALSE
sjh   dixon     122.0  402.00  4.0077163  FALSE
sfh   shall     124.0  571.20  2.8466912  FALSE
sfh   felic     135.0  571.20  3.0444518  FALSE
emh   deluh     145.0  312.60  1.0192006  FALSE
emh   pilla     150.0  312.60  0.8234848  FALSE
sjoc  scerp     151.0  206.68  5.0967039  FALSE
sfh   adar,     155.0  571.20  3.1976121  FALSE
sjh   stull     159.5  402.00  3.8986343  FALSE
sjh   donie     165.0  402.00  4.1175304  FALSE
emh   stull     167.0  312.60  0.8981766  FALSE
frk   suria     177.0  296.08  1.8701017  FALSE
emh   rosie     181.0  312.60  0.8141137  FALSE
frk   gedan     182.0  296.08  1.8771275  FALSE
frk   honno     186.0  296.08  2.1625728  FALSE
sjh   hubba     191.5  402.00  3.9899187  FALSE
sjh   suria     199.5  402.00  3.9287032  FALSE
frk   dixon     207.0  296.08  1.8887039  FALSE
sfh   chesk     209.0  571.20  2.9543299  FALSE
sjoc  donie     209.0  206.68  5.0137046  TRUE
sfh   hicks     210.0  571.20  2.8389316  FALSE
frk   donie     213.0  296.08  2.1347646  FALSE
frk   herna     218.0  296.08  2.1701933  FALSE
sjh   wenni     219.0  402.00  4.0968437  FALSE
sjh   harme     220.0  402.00  4.1628144  FALSE
sfh   abern     221.0  571.20  2.9650992  FALSE
sjh   glins     222.5  402.00  4.1643052  FALSE
sjh   herna     228.0  402.00  4.1503148  FALSE
sjh   kortb     231.0  402.00  3.8981447  FALSE
sjh   murph     237.0  402.00  4.0250026  FALSE
sjh   wilso     239.0  402.00  3.9719906  FALSE
frk   maure     241.0  296.08  1.8193441  FALSE
sjh   felic     243.0  402.00  3.8967263  FALSE
sjh   ander     247.0  402.00  3.8460451  FALSE
emh   ruval     247.5  312.60  1.0059763  FALSE
sjh   manth     254.5  402.00  4.1880197  FALSE
sjh   cline     260.0  402.00  4.0725797  FALSE
sfh   manth     262.0  571.20  3.1527713  FALSE
sfh   rosie     270.0  571.20  3.1446932  FALSE
sjh   fula,     271.5  402.00  4.0326154  FALSE
emh   abern     273.0  312.60  1.1911376  FALSE
sjh   deluh     273.0  402.00  4.1119299  FALSE
emh   bruce     276.0  312.60  1.1240272  FALSE
sjh   wolf,     280.0  402.00  4.1061772  FALSE
sjh   bergs     296.0  402.00  3.8062060  FALSE
sfh   nash,     302.0  571.20  2.8628445  FALSE
sfh   khan,     306.0  571.20  3.0914843  FALSE
frk   smuko     322.0  296.08  1.9476877  TRUE
sfh   deluh     333.0  571.20  3.0930637  FALSE
sjh   frazi     333.0  402.00  4.1636525  FALSE
emh   donof     337.0  312.60  1.1003078  TRUE
sfh   fula,     343.5  571.20  2.9226700  FALSE
sjh   pilla     361.0  402.00  3.8357858  FALSE
sfh   cesar     373.0  571.20  3.0936832  FALSE
twhh  kumpr     398.5  398.50  5.8299727  TRUE
sjh   culle     402.0  402.00  4.0666768  TRUE
sjh   lacro     411.0  402.00  3.8389552  TRUE
sfh   donie     500.0  571.20  2.9377945  FALSE
sfh   blum,     678.0  571.20  2.8202060  TRUE
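
A note on the xj column, as a minimal base R sketch rather than part of the original code: as.numeric(factor(site)) turns each site into its alphabetical level code, and jitter() with default arguments adds uniform noise of at most one fifth of the smallest gap between distinct codes. That is why every xj value for emh sits near 1, frk near 2, and so on. The toy site vector and the seed below are assumptions made only for illustration.

```r
# Toy site vector (hypothetical order, same site codes as in the question)
site <- c("emh", "frk", "sfh", "sjh", "sjoc", "twhh", "emh", "sjh")

# factor() sorts the levels alphabetically; as.numeric() returns the level codes
codes <- as.numeric(factor(site))

set.seed(1)  # assumption: fixed seed, only so the noise is reproducible
xj <- jitter(codes)

# The codes are 1 apart, so the default noise stays within 0.2 of each code
print(data.frame(site, codes, xj))
```

The point to take away is that xj is a continuous variable, while site itself is discrete.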

When I plot without stat_summary the plot looks fine, but I keep getting an error message with stat_summary:

p <- ggplot(heccsv, aes(x = site, y = los)) +
  geom_point(data = datjit, alpha = 0.8, aes(x = xj, colour = site), size = 4) +
  stat_summary(fun.y = mean, fun.ymin = min, fun.ymax = max, colour = "red") +
  geom_text(data = subset(datjit, out), aes(x = xj, label = provider), hjust = 1.1, size = 4) +
  labs(title = "los site provider", y = "median los (mins)") +
  ggtitle(expression(atop("los site provider", atop(italic("decemeber 2012"), ""))))
print(p)

Error message:

Error: Discrete value supplied to continuous scale

What am I doing wrong?
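
One likely cause, offered as a sketch and not a confirmed diagnosis: the first layer maps x to the numeric xj, so ggplot2 sets up a continuous x scale, and stat_summary then inherits the discrete x = site from the main ggplot() call. A possible workaround is to give every layer a numeric x and put the site names back as axis labels. The toy data, the xnum column and the summary_range helper below are all assumptions made for illustration. The helper is passed through fun.data in place of the question's fun.y, fun.ymin and fun.ymax.

```r
library(ggplot2)

# Toy stand-in for heccsv (made-up rows, not the real sample)
heccsv <- data.frame(
  site = c("emh", "emh", "frk", "frk", "sfh", "sfh"),
  los  = c(145, 266, 262, 217, 259, 343)
)

# Hypothetical helper column: numeric position of each site, shared by all layers
site_levels <- levels(factor(heccsv$site))
heccsv$xnum <- as.numeric(factor(heccsv$site, levels = site_levels))

set.seed(1)  # assumption: fixed seed so the jitter is reproducible
datjit <- heccsv
datjit$xj <- jitter(datjit$xnum)

# Hypothetical helper: mean with min/max range, in the shape fun.data expects
summary_range <- function(y) {
  data.frame(y = mean(y), ymin = min(y), ymax = max(y))
}

p <- ggplot(heccsv, aes(x = xnum, y = los)) +
  geom_point(data = datjit, aes(x = xj, colour = site), size = 4, alpha = 0.8) +
  stat_summary(fun.data = summary_range, colour = "red") +
  # Continuous x scale everywhere; site names restored as tick labels
  scale_x_continuous(breaks = seq_along(site_levels), labels = site_levels)
print(p)
```

Because every layer now supplies a numeric x, no discrete value reaches the continuous scale. The geom_text layer from the question already uses xj, so it should fit into this plot unchanged.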

r ggplot2
