java.lang.ArrayIndexOutOfBoundsException


  • Reported by [~mlandry]

    03-17 10:44:29.098 172.16.2.20:54321 18207 FJ-2-29 INFO: Computing quantiles for 3 different strata.
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR: java.lang.ArrayIndexOutOfBoundsException: -2
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at hex.tree.DTree.node(DTree.java:83)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at hex.tree.gbm.GBM$GBMDriver.fitBestConstantsQuantile(GBM.java:533)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at hex.tree.gbm.GBM$GBMDriver.buildNextKTrees(GBM.java:420)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at hex.tree.SharedTree$Driver.scoreAndBuildTrees(SharedTree.java:279)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at hex.tree.SharedTree$Driver.compute2(SharedTree.java:225)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at water.H2O$H2OCountedCompleter.compute(H2O.java:1181)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    03-17 10:44:29.173 172.16.2.20:54321 18207 FJ-1-1 ERRR:   at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

    Reproduces immediately with:

    {code}
    library(h2o)
    result<-tryCatch({h2o.shutdown(FALSE)}, error=function(e) {print(e)}, finally={print("error during shutdown")})
    h2o.init(nthreads=-1,max_mem_size = '8G')
    train<-h2o.uploadFile("/users/arno/kaggle/paribas/input/train.csv",destination_frame = "train.hex") ## loads file in parallel
    test<-h2o.uploadFile("/users/arno/kaggle/paribas/input/test.csv",destination_frame = "test.hex") ## loads file in parallel
    #train$target<-as.factor(train$target)
    splits<-h2o.splitFrame(train,0.9,destination_frames = c("trainSplit","validSplit"),seed=111111111)
    print(paste("train dimensions:",dim(train)))
    print(paste("trainSplit dimensions:",dim(splits[[1]])))
    print(paste("validSplit dimensions:",dim(splits[[2]])))
    print(paste("test dimensions:",dim(test)))
    predictors<-colnames(train)
    predictors<-predictors[!(predictors %in% c("ID","target"))]
    gbm<-h2o.gbm(
      x = predictors,
      y="target",
      training_frame = splits[[1]],
      validation_frame = splits[[2]],
      ntrees = 3000,            ## let stopping criteria dictate the number of trees
      stopping_rounds = 1,      ## wait until the last round is worse than the previous
                                ## this seems low because scoring is not on every tree by default
                                ## If that is desired, you can turn on score_each_iteration
                                ## (and then possibly increase stopping)
      stopping_tolerance = 0,
      max_depth = 5,
      distribution="quantile",
      quantile_alpha = 0.4,
      learn_rate = 0.02,
      sample_rate = 0.87,       ## 87% row sampling
      col_sample_rate = 0.7,    ## 70% column sampling
      seed = 222222222,
      model_id = "baseGbm")
    {code}
    via Mark Landry
  • on master : e5b05ffc547b748c0de8051bf1333c114f9f2cdf

    Upload attached datasets

    {code:java}
    buildModel 'gbm', {"model_id":"gbm-fff1767c-a5e7-43ad-ab34-dc84ec6bb4e0","training_frame":"Key_Frame__ntr.hex","validation_frame":"Key_Frame__nts.hex","nfolds":0,"response_column":"C25","ignored_columns":[],"ignore_const_cols":true,"ntrees":50,"max_depth":5,"min_rows":10,"nbins":20,"nbins_cats":1024,"seed":-1,"learn_rate":0.1,"distribution":"laplace","sample_rate":1,"col_sample_rate":1,"col_sample_rate_per_tree":1,"score_each_iteration":false,"r2_stopping":0.999999,"stopping_rounds":0,"stopping_metric":"AUTO","stopping_tolerance":0.001,"max_runtime_secs":0,"build_tree_one_node":false,"quantile_alpha":0.5,"checkpoint":"","nbins_top_level":1024}
    {code}

    {code:java}
    java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 1
      at hex.tree.DTree.node(DTree.java:82)
      at hex.tree.gbm.GBM$GBMDriver.fitBestConstantsQuantile(GBM.java:533)
      at hex.tree.gbm.GBM$GBMDriver.buildNextKTrees(GBM.java:418)
      at hex.tree.SharedTree$Driver.scoreAndBuildTrees(SharedTree.java:280)
      at hex.tree.SharedTree$Driver.compute2(SharedTree.java:226)
      at water.H2O$H2OCountedCompleter.compute(H2O.java:1088)
      at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
      at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
      at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
      at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
      at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    {code}
    via Nidhi Mehta
    • java.lang.ArrayIndexOutOfBoundsException: -2
        at hex.tree.DTree.node(DTree.java:83)
        at hex.tree.gbm.GBM$GBMDriver.fitBestConstantsQuantile(GBM.java:533)
        at hex.tree.gbm.GBM$GBMDriver.buildNextKTrees(GBM.java:420)
        at hex.tree.SharedTree$Driver.scoreAndBuildTrees(SharedTree.java:279)
        at hex.tree.SharedTree$Driver.compute2(SharedTree.java:225)
        at water.H2O$H2OCountedCompleter.compute(H2O.java:1181)
        at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
        at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
        at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
        at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
        at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
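
For readers comparing the two reports: one trace carries the bare message "-2" and the other "Array index out of range: 1", yet both are the same exception type raised from the same `DTree.node` frame. The sketch below is not H2O code. `AioobeDemo`, `lookup` and the `nodes` array are hypothetical names used only to show that plain Java array indexing throws this exception for a negative index as well as for an index equal to the array length, and that the second message form is the one produced by the `ArrayIndexOutOfBoundsException(int)` constructor. It makes no claim about why the index is out of range in the GBM code.

```java
public class AioobeDemo {
    // Hypothetical stand-in for an array-backed node lookup; not the H2O implementation.
    static String lookup(String[] nodes, int nid) {
        return nodes[nid]; // the JVM bounds-checks this access
    }

    // Returns true if indexing nodes[nid] raises ArrayIndexOutOfBoundsException.
    static boolean throwsAioobe(String[] nodes, int nid) {
        try {
            lookup(nodes, nid);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Message produced when the exception is constructed explicitly with an index.
    static String explicitMessage(int index) {
        return new ArrayIndexOutOfBoundsException(index).getMessage();
    }

    public static void main(String[] args) {
        String[] nodes = {"root"};                   // one valid index: 0
        System.out.println(throwsAioobe(nodes, -2)); // negative index
        System.out.println(throwsAioobe(nodes, 1));  // index == length
        System.out.println(throwsAioobe(nodes, 0));  // valid index
        System.out.println(explicitMessage(1));
    }
}
```

The message text for a raw array access depends on the JVM version (older JVMs print only the index), so the sketch checks the exception type rather than that string.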