java.lang.UnsupportedOperationException: Trying to predict with an unstable model. Job was aborted due to observed numerical instability (exponential growth). Either the weights or the bias values are unreasonably large or lead to large activation values. Try a different initial distribution, a bounded activation function (Tanh), adding regularization (via max_w2, l1, l2, dropout) or learning rate (either enable adaptive_rate or use a smaller learning rate or faster annealing). For more information visit: http://jira.h2o.ai/browse/TN-4

JIRA | Arno Candel | 1 year ago

    http://172.16.2.161:8080/job/h2o_master_DEV_gradle_build/28042/testReport/junit/hex.deeplearning/DeepLearningTest/testCreditProstateTanh/
    {code}
    12-09 15:45:42.144 172.16.2.179:44008 32224 FJ-0-17 INFO: Building H2O DeepLearning model with these parameters:
    12-09 15:45:42.144 172.16.2.179:44008 32224 FJ-0-17 INFO: {"_model_id":{"name":"_9483eb6fab215e8e8ba27ab8d5d4c7d","type":"Key"},"_train":{"name":"_9211ec74219ab28deb4d9f0f42ac4192","type":"Key"},"_valid":null,"_nfolds":0,"_keep_cross_validation_predictions":false,"_fold_assignment":"AUTO","_distribution":"poisson","_tweedie_power":1.5,"_ignored_columns":null,"_ignore_const_cols":true,"_weights_column":null,"_offset_column":null,"_fold_column":null,"_score_each_iteration":false,"_stopping_rounds":5,"_stopping_metric":"AUTO","_stopping_tolerance":0.0,"_response_column":"Cost","_balance_classes":false,"_max_after_balance_size":5.0,"_class_sampling_factors":null,"_max_hit_ratio_k":10,"_max_confusion_matrix_size":20,"_checkpoint":null,"_overwrite_with_best_model":true,"_autoencoder":false,"_use_all_factor_levels":true,"_activation":"Rectifier","_hidden":[10,10,10],"_epochs":100.0,"_train_samples_per_iteration":-2,"_target_ratio_comm_to_comp":0.05,"_seed":11185083,"_adaptive_rate":false,"_rho":0.99,"_epsilon":1.0E-8,"_rate":1.0E-4,"_rate_annealing":1.0E-6,"_rate_decay":1.0,"_momentum_start":0.9,"_momentum_ramp":1000000.0,"_momentum_stable":0.99,"_nesterov_accelerated_gradient":true,"_input_dropout_ratio":0.0,"_hidden_dropout_ratios":null,"_l1":0.0,"_l2":0.0,"_max_w2":10.0,"_initial_weight_distribution":"UniformAdaptive","_initial_weight_scale":1.0,"_loss":"Automatic","_score_interval":5.0,"_score_training_samples":10000,"_score_validation_samples":0,"_score_duty_cycle":0.1,"_classification_stop":0.0,"_regression_stop":1.0E-6,"_quiet_mode":false,"_score_validation_sampling":"Uniform","_diagnostics":true,"_variable_importances":false,"_fast_mode":false,"_force_load_balance":true,"_replicate_training_data":true,"_single_node_mode":false,"_shuffle_training_data":false,"_missing_values_handling":"MeanImputation","_sparse":false,"_col_major":false,"_average_activation":0.0,"_sparsity_beta":0.0,"_max_categorical_features":2147483647,"_reproducible":true,"_export_weights_and_biases":false,"_elastic_averaging":false,"_elastic_averaging_moving_rate":0.9,"_elastic_averaging_regularization":0.001,"_mini_batch_size":1}
    12-09 15:45:42.147 172.16.2.179:44008 32224 FJ-0-17 INFO: _adaptive_rate: Using manual learning rate. Ignoring the following input parameters: rho, epsilon.
    12-09 15:45:42.147 172.16.2.179:44008 32224 FJ-0-17 INFO: _reproducibility: Automatically enabling force_load_balancing, disabling single_node_mode and replicate_training_data
    12-09 15:45:42.147 172.16.2.179:44008 32224 FJ-0-17 INFO: and setting train_samples_per_iteration to -1 to enforce reproducibility.
    12-09 15:45:42.148 172.16.2.179:44008 32224 FJ-0-17 INFO: Model category: Regression
    12-09 15:45:42.148 172.16.2.179:44008 32224 FJ-0-17 INFO: Number of model parameters (weights/biases): 391
    12-09 15:45:42.148 172.16.2.179:44008 32224 FJ-0-17 WARN: Reproducibility enforced - using only 1 thread - can be slow.
    12-09 15:45:42.148 172.16.2.179:44008 32224 FJ-0-17 INFO: ReBalancing dataset into (at least) 1 chunks.
    12-09 15:45:42.157 172.16.2.179:44008 32224 FJ-0-17 INFO: Number of chunks of the training data: 1
    12-09 15:45:42.157 172.16.2.179:44008 32224 FJ-0-17 INFO: Setting train_samples_per_iteration (-1) to one epoch: #rows (20).
    12-09 15:45:42.157 172.16.2.179:44008 32224 FJ-0-17 INFO: Enabling training data shuffling to avoid training rows in the same order over and over (no Hogwild since there's only 1 chunk).
    12-09 15:45:42.157 172.16.2.179:44008 32224 FJ-0-17 INFO: Starting to train the Deep Learning model.
    12-09 15:45:42.162 172.16.2.179:44008 32224 FJ-0-17 INFO: Scoring the model.
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: Status of Neuron Layers (predicting Cost, regression, poisson distribution, Automatic loss, 391 weights/biases, 5.1 KB, 20 training samples, mini-batch size 1):
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: Layer  Units  Type       Dropout  L1        L2        Mean Rate  Rate RMS  Momentum  Mean Weight         Weight RMS         Mean Bias                 Bias RMS
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: 1      15     Input      0.00 %
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: 2      10     Rectifier  0.00 %   0.000000  0.000000  0.000100   0.000000  0.900002  -0.108586           0.815557           -1991477697279010.500000  3812537241960448.000000
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: 3      10     Rectifier  0.00 %   0.000000  0.000000  0.000100   0.000000  0.900002  -21504996.238843    145055296.000000   -2059190118721077.500000  1609435914960896.000000
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: 4      10     Rectifier  0.00 %   0.000000  0.000000  0.000100   0.000000  0.900002  -29014930.585342    133162176.000000   -578534089719839.800000   601639018823680.000000
    12-09 15:45:42.163 172.16.2.179:44008 32224 FJ-0-17 INFO: 5      1      Linear              0.000000  0.000000  0.000100   0.000000  0.900002  -3167488038.400004  6322765824.000000  -1644014701750027.800000  0.000000
    onExCompletion for hex.Model$BigScore@2e3a87f5
    water.DException$DistributedException: from /172.16.2.179:44000; by class hex.Model$BigScore; class java.lang.UnsupportedOperationException: Trying to predict with an unstable model. Job was aborted due to observed numerical instability (exponential growth). Either the weights or the bias values are unreasonably large or lead to large activation values. Try a different initial distribution, a bounded activation function (Tanh), adding regularization (via max_w2, l1, l2, dropout) or learning rate (either enable adaptive_rate or use a smaller learning rate or faster annealing). For more information visit: http://jira.h2o.ai/browse/TN-4
        at hex.deeplearning.DeepLearningModel.score0(DeepLearningModel.java:831)
        at hex.Model.score0(Model.java:852)
        at hex.Model$BigScore.map(Model.java:820)
        at water.MRTask.compute2(MRTask.java:678)
        at water.H2O$H2OCountedCompleter.compute1(H2O.java:1060)
        at hex.Model$BigScore$Icer.compute1(Model$BigScore$Icer.java)
        at water.H2O$H2OCountedCompleter.compute(H2O.java:1056)
        at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
        at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
        at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
        at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
        at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    java.lang.RuntimeException: water.DException$DistributedException: from /172.16.2.179:44008; by class hex.Model$BigScore; class water.DException$DistributedException: from /172.16.2.179:44000; by class hex.Model$BigScore; class java.lang.UnsupportedOperationException:
    {code}
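The "Status of Neuron Layers" rows in the log above already show the instability: bias statistics are on the order of 1e15 in every layer, and weight statistics reach 1e7 to 1e9 from layer 3 onward. The following is a minimal Python sketch of reading those rows for a blow-up. The numbers are copied from the log; the cutoff `MAGNITUDE_LIMIT` and the helper `flag_unstable` are hypothetical and are not H2O's own instability check, which this ticket does not show.

```python
import math

# Weight and bias statistics copied from the layer status rows in the log.
# Layer 1 (Input) has no weights or biases and is omitted.
layers = {
    2: {"mean_weight": -0.108586, "weight_rms": 0.815557,
        "mean_bias": -1991477697279010.5, "bias_rms": 3812537241960448.0},
    3: {"mean_weight": -21504996.238843, "weight_rms": 145055296.0,
        "mean_bias": -2059190118721077.5, "bias_rms": 1609435914960896.0},
    4: {"mean_weight": -29014930.585342, "weight_rms": 133162176.0,
        "mean_bias": -578534089719839.8, "bias_rms": 601639018823680.0},
    5: {"mean_weight": -3167488038.400004, "weight_rms": 6322765824.0,
        "mean_bias": -1644014701750027.8, "bias_rms": 0.0},
}

# Hypothetical cutoff chosen for illustration only.
MAGNITUDE_LIMIT = 1e6


def flag_unstable(stats, limit=MAGNITUDE_LIMIT):
    """Return the sorted names of statistics that are non-finite or exceed the limit."""
    return sorted(
        name for name, value in stats.items()
        if not math.isfinite(value) or abs(value) > limit
    )


for layer_id, stats in layers.items():
    print(layer_id, flag_unstable(stats))
```

With this cutoff, layer 2 is flagged only for its biases, while layers 3 to 5 are flagged for both weights and biases.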


    Root Cause Analysis

    1. java.lang.UnsupportedOperationException

      Trying to predict with an unstable model. Job was aborted due to observed numerical instability (exponential growth). Either the weights or the bias values are unreasonably large or lead to large activation values. Try a different initial distribution, a bounded activation function (Tanh), adding regularization (via max_w2, l1, l2, dropout) or learning rate (either enable adaptive_rate or use a smaller learning rate or faster annealing). For more information visit: http://jira.h2o.ai/browse/TN-4

      at hex.deeplearning.DeepLearningModel.score0()
    2. hex.deeplearning
      DeepLearningModel.score0
      1. hex.deeplearning.DeepLearningModel.score0(DeepLearningModel.java:831)
      1 frame
    3. hex
      Model$BigScore.map
      1. hex.Model.score0(Model.java:852)
      2. hex.Model$BigScore.map(Model.java:820)
      2 frames
    4. water
      H2O$H2OCountedCompleter.compute1
      1. water.MRTask.compute2(MRTask.java:678)
      2. water.H2O$H2OCountedCompleter.compute1(H2O.java:1060)
      2 frames
    5. hex
      Model$BigScore$Icer.compute1
      1. hex.Model$BigScore$Icer.compute1(Model$BigScore$Icer.java)
      1 frame
    6. water
      H2O$H2OCountedCompleter.compute
      1. water.H2O$H2OCountedCompleter.compute(H2O.java:1056)
      1 frame
    7. jsr166y
      ForkJoinWorkerThread.run
      1. jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
      2. jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
      3. jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
      4. jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
      5. jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
      5 frames
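
The remedies named in the exception message correspond to parameters that appear in the logged job configuration. As a rough sketch, the snippet below starts from a subset of the logged values (leading underscores kept as in the log) and applies a few of the suggested changes. The override values and the helper `apply_overrides` are illustrative assumptions, not settings confirmed to fix this particular job.

```python
# Subset of the parameters logged for the failing job.
logged = {
    "_activation": "Rectifier",
    "_adaptive_rate": False,
    "_rate": 1.0e-4,
    "_rate_annealing": 1.0e-6,
    "_l1": 0.0,
    "_l2": 0.0,
    "_max_w2": 10.0,
    "_initial_weight_distribution": "UniformAdaptive",
}

# Illustrative overrides following the hints in the exception message:
# a bounded activation, the adaptive learning rate, and some L2 regularization.
# The specific L2 value is an assumption made for this sketch.
overrides = {
    "_activation": "Tanh",
    "_adaptive_rate": True,
    "_l2": 1.0e-5,
}


def apply_overrides(params, changes):
    """Return a copy of params with changes applied, rejecting unknown keys."""
    unknown = sorted(set(changes) - set(params))
    if unknown:
        raise KeyError(f"unknown parameters: {unknown}")
    return {**params, **changes}


candidate = apply_overrides(logged, overrides)

# Show each changed parameter as (logged value, candidate value).
for name in overrides:
    print(name, (logged[name], candidate[name]))
```

If the manual learning rate is kept instead of `_adaptive_rate`, the message suggests a smaller `_rate` or a larger `_rate_annealing` as the alternative.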