java.lang.RuntimeException: Failed to create dictionary on DEFAULT.XXXXXXXXXXXXXXX_URL

Apache's JIRA Issue Tracker | Richard Calaba | 9 months ago

    Getting an exception in Step 4 - Build Dimension Dictionary:

    java.lang.IllegalArgumentException: Value not exists!
        at org.apache.kylin.dimension.Dictionary.getIdFromValueBytes(Dictionary.java:160)
        at org.apache.kylin.dict.TrieDictionary.getIdFromValueImpl(TrieDictionary.java:158)
        at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:96)
        at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:76)
        at org.apache.kylin.dict.lookup.SnapshotTable.takeSnapshot(SnapshotTable.java:96)
        at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:106)
        at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:215)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:59)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
        at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:60)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    result code:2

    The code that generates the exception is in org.apache.kylin.dimension.Dictionary.java:

        /**
         * A lower level API, return ID integer from raw value bytes. In case of not found
         * <p>
         * - if roundingFlag=0, throw IllegalArgumentException; <br>
         * - if roundingFlag<0, the closest smaller ID integer if exist; <br>
         * - if roundingFlag>0, the closest bigger ID integer if exist. <br>
         * <p>
         * Bypassing the cache layer, this could be significantly slower than getIdFromValue(T value).
         *
         * @throws IllegalArgumentException
         *             if value is not found in dictionary and rounding is off;
         *             or if rounding cannot find a smaller or bigger ID
         */
        final public int getIdFromValueBytes(byte[] value, int offset, int len, int roundingFlag) throws IllegalArgumentException {
            if (isNullByteForm(value, offset, len))
                return nullId();
            else {
                int id = getIdFromValueBytesImpl(value, offset, len, roundingFlag);
                if (id < 0)
                    throw new IllegalArgumentException("Value not exists!");
                return id;
            }
        }

    The Cube is big - the fact table has 110 million rows, and the largest dimension (customer) has 10 million rows. I have increased the JVM -Xmx to 16 GB and set kylin.table.snapshot.max_mb=2048 in kylin.properties to make sure the Cube build doesn't fail (previously we were getting an exception complaining about the 300 MB limit for the Dimension dictionary size; we require approx. 700 MB).

    Before that we were getting an exception complaining about a Dictionary encoding problem - "Too high cardinality is not suitable for dictionary -- cardinality: 10873977" - which we resolved by changing the Encoding of the affected dimension/row key from "dict" to "int; length=8" in the Advanced Settings of the Cube.

    We have 2 high-cardinality fields, one from the fact table and one from the big dimension (customer - see above). We need to use them in count_distinct measures for our calculations. I wonder whether the "Value not exists!" exception is somehow related? One of those count_distinct measures is defined with return type "bitmap" (exact precision - only for Int columns) and the other with return type "hllc16" (error rate <= 1.22 %).

    I am looking for any clues to debug the cause of this error and for a way to work around it.
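
    To make the roundingFlag contract quoted above easier to follow, here is a minimal sketch of the same lookup semantics over a plain sorted array. It is only an illustration: the class RoundingLookupSketch, its String-based getIdFromValue, and the sample values are hypothetical and are not Kylin's TrieDictionary. With rounding off (roundingFlag=0), any value missing from the dictionary ends in the same "Value not exists!" exception seen in the stack trace.

```java
import java.util.Arrays;

// Hypothetical stand-in for a dictionary; NOT Kylin's TrieDictionary.
// IDs are simply positions in the sorted value array.
class RoundingLookupSketch {
    private final String[] sortedValues;

    RoundingLookupSketch(String... values) {
        sortedValues = values.clone();
        Arrays.sort(sortedValues);
    }

    // Mirrors the Javadoc contract: exact match returns its ID; otherwise
    // roundingFlag < 0 picks the closest smaller ID, roundingFlag > 0 the
    // closest bigger ID, and roundingFlag == 0 always fails.
    int getIdFromValue(String value, int roundingFlag) {
        int pos = Arrays.binarySearch(sortedValues, value);
        if (pos >= 0) {
            return pos;
        }
        int insertionPoint = -pos - 1; // index where the value would be inserted
        int id;
        if (roundingFlag < 0) {
            id = insertionPoint - 1; // becomes -1 when nothing smaller exists
        } else if (roundingFlag > 0) {
            id = insertionPoint < sortedValues.length ? insertionPoint : -1;
        } else {
            id = -1; // rounding is off
        }
        if (id < 0) {
            throw new IllegalArgumentException("Value not exists!");
        }
        return id;
    }

    public static void main(String[] args) {
        RoundingLookupSketch dict = new RoundingLookupSketch("apple", "cherry", "mango");
        System.out.println(dict.getIdFromValue("cherry", 0));  // prints 1 (exact match)
        System.out.println(dict.getIdFromValue("banana", -1)); // prints 0 (rounds down to "apple")
        System.out.println(dict.getIdFromValue("banana", 1));  // prints 1 (rounds up to "cherry")
        try {
            dict.getIdFromValue("banana", 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints Value not exists!
        }
    }
}
```

    In the trace above the lookup is reached from SnapshotTable.takeSnapshot, so one thing worth checking (an assumption, not a confirmed diagnosis) is whether the lookup table contains values that never made it into the dictionary being queried.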


    Root Cause Analysis

    1. java.lang.NegativeArraySizeException

      No message provided

      at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes()
    2. org.apache.kylin
      DictionaryManager.buildDictionary
      1. org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:443)
      2. org.apache.kylin.dict.TrieDictionaryBuilder.build(TrieDictionaryBuilder.java:408)
      3. org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder.build(DictionaryGenerator.java:165)
      4. org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
      5. org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
      6. org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
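
    The trace does not say why the array size in buildTrieBytes went negative. In general, a common way to hit NegativeArraySizeException in Java is an int size computation that overflows past Integer.MAX_VALUE. The sketch below only demonstrates that general mechanism; the class ArraySizeOverflowSketch, its methods, and the numbers are hypothetical and are not taken from Kylin's TrieDictionaryBuilder.

```java
// Hypothetical illustration of int overflow leading to a negative array size.
// None of these names or numbers come from Kylin.
class ArraySizeOverflowSketch {

    // Plain int multiplication silently wraps around on overflow.
    static int unsafeTotal(int nodeCount, int bytesPerNode) {
        return nodeCount * bytesPerNode;
    }

    // Math.multiplyExact throws ArithmeticException instead of wrapping.
    static int checkedTotal(int nodeCount, int bytesPerNode) {
        return Math.multiplyExact(nodeCount, bytesPerNode);
    }

    public static void main(String[] args) {
        int nodeCount = 300_000_000;
        int bytesPerNode = 8;

        // 2,400,000,000 does not fit in an int and wraps to a negative number.
        int size = unsafeTotal(nodeCount, bytesPerNode);
        System.out.println(size); // prints -1894967296

        try {
            byte[] buffer = new byte[size];
            System.out.println(buffer.length);
        } catch (NegativeArraySizeException e) {
            System.out.println("NegativeArraySizeException");
        }

        try {
            checkedTotal(nodeCount, bytesPerNode);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected before allocation");
        }
    }
}
```

    If something like this is what happens here, it would be consistent with a very large, high-cardinality string column, but that remains an assumption until checked against the Kylin source.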