java.lang.IllegalArgumentException: Model is too large For more information visit: http://jira.h2o.ai/browse/TN-5

JIRA | Arno Candel | 1 year ago
  1.

    h1. Problem

    H2O Deep Learning triggers an internal limitation of H2O on the maximum size of an object in the distributed K-V store (the core of H2O). This limit is 256 MB, and once the DL model reaches that size, this condition occurs. The reason is that the Deep Learning model is currently stored as one large piece instead of being split into partial pieces. Cutting it into one piece per hidden layer won't solve the issue either; a single weight matrix would have to be cut into multiple pieces, which is somewhat cumbersome to implement. That said, a model of that size is also going to take a long time to train.

    Note: The memory limit has nothing to do with the number of rows of the training data (only the number of columns matters, since that affects the size of the first hidden-layer matrix), nor with the RAM or the maximum allowed heap memory (that is checked separately). It also has nothing to do with the number of nodes, threads, etc. It is purely a function of the model complexity; see the next section.

    h2. What affects the model size?

    Mainly the total number of weights and biases, multiplied by an overhead factor of x1, x2 or x3, depending on whether momentum_start==0 && momentum_stable==0 (x1), momentum > 0 (x2), or the adaptive learning rate (x3) is used. On top of that there is some small overhead for model metrics, statistics, counters, etc. The total number of weights is directly given by the fully connected layers:
    * The number of input columns (after automatic one-hot encoding of categoricals)
    * The size of the hidden layers
    * The number of output neurons (#classes)

    For example, hidden=c(5000,5000) alone creates a 5000x5000 matrix of 25 million weights between the two hidden layers; at 4 bytes per float and with the x3 overhead of the default adaptive learning rate (ADADELTA), that is roughly 300 MB and exceeds the limit (a rough sizing sketch follows below).

    h2. Failing example (~25M floats * 3 for ADADELTA > 256MB)

    {noformat}
    library(h2o)
    h2o.init()
    h2o.deeplearning(x=1:4, y=5, as.h2o(iris), hidden=c(5000,5000))
    {noformat}

    h2. Working example (~25M floats * 1 without ADADELTA and no momentum < 256MB)

    {noformat}
    library(h2o)
    h2o.init()
    h2o.deeplearning(x=1:4, y=5, as.h2o(iris), hidden=c(5000,5000), adaptive_rate=F)
    {noformat}

    h2. Output:

    {noformat}
    java.lang.IllegalArgumentException: Model is too large For more information visit: http://jira.h2o.ai/browse/TN-5
      at hex.deeplearning.DeepLearningModel.<init>(DeepLearningModel.java:424)
      at hex.deeplearning.DeepLearning$DeepLearningDriver.buildModel(DeepLearning.java:201)
      at hex.deeplearning.DeepLearning$DeepLearningDriver.compute2(DeepLearning.java:171)
      at water.H2O$H2OCountedCompleter.compute(H2O.java:1005)
      at jsr166y.CountedCompleter.exec(CountedCompleter.java:429)
      at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
      at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
      at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
      at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    barrier onExCompletion for hex.deeplearning.DeepLearning$DeepLearningDriver@5205f0fd
    {noformat}

    h1. Solution

    The current solution is to reduce the number of hidden neurons, or to reduce the number of (especially categorical) features.
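    A quick way to check a configuration against the 256 MB limit before training is a back-of-the-envelope size estimate based on the sizing rule above. The sketch below is an approximation, not H2O's exact internal accounting: it ignores the small metadata overhead, assumes 4 bytes per float, and the helper name estimate_dl_model_mb is made up for illustration.

    {noformat}
    # Rough estimate of the H2O Deep Learning model size in MB (a sketch, not
    # H2O's exact accounting): weights + biases of the fully connected layers,
    # times the overhead factor described above, times 4 bytes per float.
    estimate_dl_model_mb <- function(n_input_cols, hidden, n_output,
                                     adaptive_rate = TRUE, momentum = 0) {
      layer_sizes <- c(n_input_cols, hidden, n_output)
      # one weight matrix between each pair of consecutive layers,
      # plus one bias per non-input neuron
      weights <- sum(head(layer_sizes, -1) * tail(layer_sizes, -1))
      biases  <- sum(tail(layer_sizes, -1))
      overhead <- if (adaptive_rate) 3 else if (momentum > 0) 2 else 1
      (weights + biases) * overhead * 4 / 1024^2
    }

    # Failing example above: iris has 4 input columns and 3 classes,
    # hidden=c(5000,5000), default adaptive_rate=TRUE (ADADELTA)
    estimate_dl_model_mb(4, c(5000, 5000), 3)                         # ~287 MB > 256
    # Working example: same topology, adaptive_rate=F and no momentum
    estimate_dl_model_mb(4, c(5000, 5000), 3, adaptive_rate = FALSE)  # ~96 MB < 256
    {noformat}

    For the two examples above this gives roughly 287 MB with the default adaptive learning rate and roughly 96 MB with adaptive_rate=F, which matches why the first call fails and the second one works.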

    JIRA | 1 year ago | Arno Candel
    java.lang.IllegalArgumentException: Model is too large For more information visit: http://jira.h2o.ai/browse/TN-5

    Root Cause Analysis

    1. java.lang.IllegalArgumentException

      Model is too large For more information visit: http://jira.h2o.ai/browse/TN-5

      at hex.deeplearning.DeepLearningModel.<init>()
    2. hex.deeplearning
      DeepLearning$DeepLearningDriver.compute2
      1. hex.deeplearning.DeepLearningModel.<init>(DeepLearningModel.java:424)
      2. hex.deeplearning.DeepLearning$DeepLearningDriver.buildModel(DeepLearning.java:201)
      3. hex.deeplearning.DeepLearning$DeepLearningDriver.compute2(DeepLearning.java:171)
      3 frames
    3. water
      H2O$H2OCountedCompleter.compute
      1. water.H2O$H2OCountedCompleter.compute(H2O.java:1005)
      1 frame
    4. jsr166y
      ForkJoinWorkerThread.run
      1. jsr166y.CountedCompleter.exec(CountedCompleter.java:429)
      2. jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
      3. jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
      4. jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
      5. jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
      5 frames