java.lang.AssertionError: ERROR: got tcp resend with existing in-progress task #, FROM /172.16.2.164:44008 AB: task# 139867 hex.tree.ScoreBuildHistogram CLIENT_UDP

JIRA | Tom Kraljevic | 2 years ago
  1. 0

    tomk@mr-0xb4:/home4/jenkins/jobs/h2o_master_DEV_gradle_build_J8/workspace/h2o-algos/sandbox$ cat out.1
    01-06 18:53:06.060 172.16.2.164:44000 12503 main INFO: ----- H2O started -----
    01-06 18:53:06.150 172.16.2.164:44000 12503 main INFO: Build git branch: (no branch)
    01-06 18:53:06.150 172.16.2.164:44000 12503 main INFO: Build git hash: 66a3b357d410ddbf48bccd3d639efbf3e882b740
    01-06 18:53:06.150 172.16.2.164:44000 12503 main INFO: Build git describe: jenkins-h2o_master_DEV_gradle_build_J8-302
    01-06 18:53:06.150 172.16.2.164:44000 12503 main INFO: Build project version: 0.1.17.99999
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Built by: 'jenkins'
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Built on: '2015-01-06 18:52:32'
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Java availableProcessors: 32
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Java heap totalMemory: 1.92 GB
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Java heap maxMemory: 26.67 GB
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: Java version: Java 1.8.0_25 (from Oracle Corporation)
    01-06 18:53:06.151 172.16.2.164:44000 12503 main INFO: OS version: Linux 3.2.0-74-generic (amd64)
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Possible IP Address: eth0 (eth0), fe80:0:0:0:225:90ff:fec7:2604%eth0
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Possible IP Address: eth0 (eth0), 172.16.2.164
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Possible IP Address: lo (lo), 127.0.0.1
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Internal communication uses port: 44001
    01-06 18:53:06.152 172.16.2.164:44000 12503 main INFO: Listening for HTTP and REST traffic on http://172.16.2.164:44000/
    01-06 18:53:06.153 172.16.2.164:44000 12503 main INFO: H2O cloud name: 'junit_cluster_12488' on /172.16.2.164:44000, discovery address /239.42.196.93:61226
    01-06 18:53:06.153 172.16.2.164:44000 12503 main INFO: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
    01-06 18:53:06.153 172.16.2.164:44000 12503 main INFO: 1. Open a terminal and run 'ssh -L 55555:localhost:44000 jenkins@172.16.2.164'
    01-06 18:53:06.153 172.16.2.164:44000 12503 main INFO: 2. Point your browser to http://localhost:55555
    01-06 18:53:06.174 172.16.2.164:44000 12503 main INFO: Log dir: '/tmp/h2o-jenkins/h2ologs'
    01-06 18:53:06.222 172.16.2.164:44000 12503 main INFO: Cloud of size 1 formed [/172.16.2.164:44000]
    01-06 18:53:09.424 172.16.2.164:44000 12503 FJ-126-15 INFO: Cloud of size 5 formed [/172.16.2.164:44000, /172.16.2.164:44002, /172.16.2.164:44004, /172.16.2.164:44006, /172.16.2.164:44008]
    Exception in thread "TCP-/172.16.2.164:44008-1" java.lang.AssertionError: ERROR: got tcp resend with existing in-progress task #, FROM /172.16.2.164:44008 AB: task# 139867 hex.tree.ScoreBuildHistogram CLIENT_UDP
        at water.RPC.remote_exec(RPC.java:474)
        at water.TCPReceiverThread$TCPReaderThread.run(TCPReceiverThread.java:85)

    JIRA | 2 years ago | Tom Kraljevic
    java.lang.AssertionError: ERROR: got tcp resend with existing in-progress task #, FROM /172.16.2.164:44008 AB: task# 139867 hex.tree.ScoreBuildHistogram CLIENT_UDP
  4. 0

    TestNG testcase : glm_neg_testcase_137
    Test results page : http://172.16.2.161:8080/view/testNG/job/h2o_master_DEV_testng_GLM_testcase/15/testngreports/h2o.testng/TestNG/glm_neg_testcase_137/
    nfolds = 20
    family = gaussian
    solver = irlsm
    Validate Parameters object with testcase: glm_neg_testcase_137
    Create modelParameter object with testcase: glm_neg_testcase_137
    Set _family: gaussian
    Set _standardize:
    Set _lambda_search:
    Set _nfolds: 20
    Set _ignore_const_cols:
    Set _non_negative:
    Set _intercept:
    Create train frame: airquality_train1
    Create validate frame: airquality_train1
    Set train frame
    Set validate frame
    Create success modelParameter object.
    Build model
    Train model
    09-21 17:13:34.957 172.16.2.171:54321 24632 FJ-0-7 INFO: Creating 20 cross-validation splits with random number seed: -5596913177457903046
    09-21 17:13:34.973 172.16.2.171:54321 24632 FJ-0-7 INFO: Building cross-validation model 1 / 20.
    09-21 17:13:34.974 172.16.2.171:54321 24632 FJ-1-5 INFO: Building H2O GLM model with these parameters:
    09-21 17:13:34.974 172.16.2.171:54321 24632 FJ-1-5 INFO: {"_model_id":null,"_train":{"name":"model_cv_1_airquality_train1.hex_train","type":"Key"},"_valid":{"name":"model_cv_1_airquality_train1.hex_valid","type":"Key"},"_nfolds":0,"_keep_cross_validation_predictions":false,"_fold_assignment":"AUTO","_distribution":"AUTO","_tweedie_power":1.5,"_ignored_columns":null,"_ignore_const_cols":false,"_weights_column":"weights","_offset_column":null,"_fold_column":null,"_score_each_iteration":false,"_response_column":"Ozone","_balance_classes":false,"_max_after_balance_size":5.0,"_class_sampling_factors":null,"_max_hit_ratio_k":10,"_max_confusion_matrix_size":20,"_checkpoint":null,"_standardize":false,"_family":"gaussian","_link":"family_default","_solver":"IRLSM","_tweedie_variance_power":0.0,"_tweedie_link_power":1.0,"_alpha":null,"_lambda":null,"_prior":-1.0,"_lambda_search":false,"_nlambdas":100,"_non_negative":false,"_exactLambdas":false,"_lambda_min_ratio":-1.0,"_use_all_factor_levels":false,"_max_iterations":-1,"_intercept":false,"_beta_epsilon":1.0E-5,"_objective_epsilon":1.0E-5,"_gradient_epsilon":1.0E-4,"_beta_constraints":null,"_max_active_predictors":-1}
    java.lang.AssertionError: wrong priority for task GLMSingleLambdaTsk, expected 0, but got 1
        at water.H2O$H2OCountedCompleter.compute(H2O.java:994)
        at jsr166y.CountedCompleter.exec(CountedCompleter.java:429)
        at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
        at jsr166y.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:914)
        at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:979)
        at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
        at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    09-21 17:13:34.977 172.16.2.171:54321 24632 FJ-1-5 INFO: GLM[dest=model_cv_1, iteration=0, lambda = 1877.9]: All 5 coefficients are active likelihood = 100648.0
    09-21 17:13:34.981 172.16.2.171:54321 24632 FJ-0-13 WARN: ADMM solver reached maximum number of iterations (10000)
    09-21 17:13:34.981 172.16.2.171:54321 24632 FJ-0-13 WARN: ADMM solver finished with gerr = 8449.675328571428 > eps = 1.0E-4
    09-21 17:13:34.981 172.16.2.171:54321 24632 FJ-0-13 INFO: GLM[dest=model_cv_1, iteration=1, lambda = 1877.9]: iteration computed in 0 + 3 ms
    09-21 17:13:34.981 172.16.2.171:54321 24632 FJ-0-13 INFO: GLM[dest=model_cv_1, iteration=1, lambda = 1877.9]: converged (reached a fixed point with ~ 1e-2147483648 precision), got 0 nzs
    09-21 17:13:34.983 172.16.2.171:54321 24632 FJ-0-13 INFO: GLM[dest=model_cv_1, iteration=1, lambda = 1877.9]: hold-out set validation = mse = 109.0, explained_dev = 0.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Solution at lambda = 1877.9142857142856 has 0 nonzeros, gradient err = 8449.675328571428
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Model Metrics Type: RegressionGLM
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Description: N/A
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: model id: model_cv_1
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: frame id: model_cv_1_airquality_train1.hex_train
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: MSE: 2875.6572
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: R^2: -1.8281012
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: mean residual deviance: 2875.6572
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: null DOF: 70.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: residual DOF: 70.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: null deviance: 201296.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: residual deviance: 201296.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: AIC: 758.134
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Model Metrics Type: RegressionGLM
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Description: N/A
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: model id: model_cv_1
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: frame id: model_cv_1_airquality_train1.hex_valid
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: MSE: 109.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: R^2: -11.111111
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: mean residual deviance: 109.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: null DOF: 2.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: residual DOF: 2.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: null deviance: 218.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: residual deviance: 218.0
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: AIC: 17.05845
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: GLM Model (summary):
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Family Link Regularization Number of Predictors Total Number of Active Predictors Number of Iterations Training Frame
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: gaussian identity Elastic Net (alpha = 0.5, lambda = 1877.9 ) 6 1 1 model_cv_1_airquality_train1.hex_train
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: Scoring History:
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: timestamp duration iteration log_likelihood objective
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: 2015-09-21 17:13:34 0.000 sec 0 100648.00000 1437.82857
    09-21 17:13:34.984 172.16.2.171:54321 24632 FJ-0-13 INFO: 2015-09-21 17:13:34 0.006 sec 1 100648.00000 1437.82857
    09-21 17:13:34.985 172.16.2.171:54321 24632 FJ-0-7 INFO: Building cross-validation model 2 / 20.
    09-21 17:13:34.986 172.16.2.171:54321 24632 FJ-1-5 INFO: Building H2O GLM model with these parameters:
    09-21 17:13:34.986 172.16.2.171:54321 24632 FJ-1-5 INFO: {"_model_id":null,"_train":{"name":"model_cv_2_airquality_train1.hex_train","type":"Key"},"_valid":{"name":"model_cv_2_airquality_train1.hex_valid","type":"Key"},"_nfolds":0,"_keep_cross_validation_predictions":false,"_fold_assignment":"AUTO","_distribution":"AUTO","_tweedie_power":1.5,"_ignored_columns":null,"_ignore_const_cols":false,"_weights_column":"weights","_offset_column":null,"_fold_column":null,"_score_each_iteration":false,"_response_column":"Ozone","_balance_classes":false,"_max_after_balance_size":5.0,"_class_sampling_factors":null,"_max_hit_ratio_k":10,"_max_confusion_matrix_size":20,"_checkpoint":null,"_standardize":false,"_family":"gaussian","_link":"family_default","_solver":"IRLSM","_tweedie_variance_power":0.0,"_tweedie_link_power":1.0,"_alpha":null,"_lambda":null,"_prior":-1.0,"_lambda_search":false,"_nlambdas":100,"_non_negative":false,"_exactLambdas":false,"_lambda_min_ratio":-1.0,"_use_all_factor_levels":false,"_max_iterations":-1,"_intercept":false,"_beta_epsilon":1.0E-5,"_objective_epsilon":1.0E-5,"_gradient_epsilon":1.0E-4,"_beta_constraints":null,"_max_active_predictors":-1}
    java.lang.AssertionError: wrong priority for task GLMSingleLambdaTsk, expected 0, but got 1
        at water.H2O$H2OCountedCompleter.compute(H2O.java:994)
        at jsr166y.CountedCompleter.exec(CountedCompleter.java:429)
        at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
        at jsr166y.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:914)
        at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:979)
        at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
        at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    09-21 17:13:34.988 172.16.2.171:54321 24632 FJ-1-5 INFO: GLM[dest=model_cv_2, iteration=0, lambda = 1760.6]: All 5 coefficients are active likelihood = 90335.0

    JIRA | 1 year ago | Neeraja Madabhushi
    java.lang.AssertionError: wrong priority for task GLMSingleLambdaTsk, expected 0, but got 1

    Root Cause Analysis

    1. java.lang.AssertionError

      ERROR: got tcp resend with existing in-progress task #, FROM /172.16.2.164:44008 AB: task# 139867 hex.tree.ScoreBuildHistogram CLIENT_UDP

      at water.RPC.remote_exec()
    2. water
      TCPReceiverThread$TCPReaderThread.run
      1. water.RPC.remote_exec(RPC.java:474)
      2. water.TCPReceiverThread$TCPReaderThread.run(TCPReceiverThread.java:85)
      2 frames