org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location

Apache's JIRA Issue Tracker | Josh Elser | 7 months ago

    Saw an error in some $dayjob testing where, while a RegionServer was going down due to an exception, there was a scary-looking exception about being unable to write to the stats table because an hconnection was closed. Pardon the mismatched line numbers:

    {noformat}
    2016-07-17 07:52:13,229 ERROR [phoenix-update-statistics-0] stats.StatisticsScanner: Failed to update statistics table!
    org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:309)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:152)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:326)
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:301)
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:166)
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:161)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:794)
        at org.apache.hadoop.hbase.client.HTableWrapper.getScanner(HTableWrapper.java:215)
        at org.apache.phoenix.schema.stats.StatisticsUtil.readStatistics(StatisticsUtil.java:136)
        at org.apache.phoenix.schema.stats.StatisticsWriter.deleteStats(StatisticsWriter.java:230)
        at org.apache.phoenix.schema.stats.StatisticsScanner$StatisticsScannerCallable.call(StatisticsScanner.java:117)
        at org.apache.phoenix.schema.stats.StatisticsScanner$StatisticsScannerCallable.call(StatisticsScanner.java:102)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.io.IOException: hconnection-0x5314972b closed
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1153)
        at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1133)
        at org.apache.hadoop.hbase.client.CoprocessorHConnection.relocateRegion(CoprocessorHConnection.java:41)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1338)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1162)
        at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
        ... 17 more
    {noformat}

    Looking into this some more, the async task that updates the stats was still running after the RegionServer was already in the process of shutting down. The RegionServer had already closed all of the "userRegions", but because this task is async, it was still running and still using the RegionServer's CoprocessorHConnection. So the RegionServer thinks all of the user regions are closed and that it is safe to close the HConnection. In reality, there is still code tied to those user regions that might be running (as we can see in the stack trace above). The next time the StatisticsScannerCallable tries to use the HConnection, it fails.

    I think the simple fix is to just use the CoprocessorEnvironment to access the RegionServerServices and use the {{isClosing()}} and {{isClosed()}} methods.

    This is all pretty minor because the RegionServer is already shutting down, but it is likely misleading to less-experienced users, who would think that the last exception in the log is the problem. Will put up a patch shortly.
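The proposed guard can be illustrated with a minimal, self-contained sketch. It does not use HBase or Phoenix types and is not the actual patch: `ShutdownState`, `StatsUpdate`, `FlagState`, `GuardedStatsTask`, and `Outcome` are all hypothetical names introduced here. `ShutdownState` stands in for whatever the coprocessor environment exposes for the `isClosing()` and `isClosed()` checks mentioned above. The idea is that the async task checks the shutdown state before starting and again when an `IOException` arrives, so a closed connection during shutdown is not reported as a genuine failure.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

final class StatsTaskGuardSketch {

    /** Hypothetical view of the server's shutdown state; not an HBase type. */
    interface ShutdownState {
        boolean isClosing();
        boolean isClosed();
    }

    /** Hypothetical stand-in for the stats update that uses the shared connection. */
    interface StatsUpdate {
        void run() throws IOException;
    }

    enum Outcome { UPDATED, SKIPPED_SHUTDOWN, FAILED }

    /** Simple mutable state, used only to drive the example. */
    static final class FlagState implements ShutdownState {
        final AtomicBoolean closing = new AtomicBoolean();
        final AtomicBoolean closed = new AtomicBoolean();
        @Override public boolean isClosing() { return closing.get(); }
        @Override public boolean isClosed() { return closed.get(); }
    }

    /** Async task that treats failures during shutdown as expected, not as errors. */
    static final class GuardedStatsTask implements Callable<Outcome> {
        private final ShutdownState state;
        private final StatsUpdate update;

        GuardedStatsTask(ShutdownState state, StatsUpdate update) {
            this.state = state;
            this.update = update;
        }

        private boolean shuttingDown() {
            return state.isClosing() || state.isClosed();
        }

        @Override
        public Outcome call() {
            // Do not start work if the server is already going away.
            if (shuttingDown()) {
                return Outcome.SKIPPED_SHUTDOWN;
            }
            try {
                update.run();
                return Outcome.UPDATED;
            } catch (IOException e) {
                // The connection may have been closed underneath us mid-task;
                // re-check before reporting this as a real failure.
                return shuttingDown() ? Outcome.SKIPPED_SHUTDOWN : Outcome.FAILED;
            }
        }
    }

    public static void main(String[] args) {
        FlagState state = new FlagState();
        // Simulate the race from the report: shutdown begins while the task runs.
        StatsUpdate failsAfterClose = () -> {
            state.closing.set(true);
            throw new IOException("hconnection closed");
        };
        System.out.println(new GuardedStatsTask(state, failsAfterClose).call());
    }
}
```

In a real coprocessor, the caller would presumably log `SKIPPED_SHUTDOWN` at a low level and reserve the ERROR log for `FAILED`, which avoids leaving a misleading exception as the last entry in the log.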

    Root Cause Analysis

    1. java.io.IOException

      hconnection-0x5314972b closed

      at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion()
    2. HBase - Client
      HTableWrapper.getScanner
      1. org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1153)
      2. org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
      3. org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1133)
      4. org.apache.hadoop.hbase.client.CoprocessorHConnection.relocateRegion(CoprocessorHConnection.java:41)
      5. org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1338)
      6. org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1162)
      7. org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
      8. org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
      9. org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:152)
      10. org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
      11. org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
      12. org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:326)
      13. org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:301)
      14. org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:166)
      15. org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:161)
      16. org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:794)
      17. org.apache.hadoop.hbase.client.HTableWrapper.getScanner(HTableWrapper.java:215)
      17 frames
    3. Phoenix Core
      StatisticsScanner$StatisticsScannerCallable.call
      1. org.apache.phoenix.schema.stats.StatisticsUtil.readStatistics(StatisticsUtil.java:136)
      2. org.apache.phoenix.schema.stats.StatisticsWriter.deleteStats(StatisticsWriter.java:230)
      3. org.apache.phoenix.schema.stats.StatisticsScanner$StatisticsScannerCallable.call(StatisticsScanner.java:117)
      4. org.apache.phoenix.schema.stats.StatisticsScanner$StatisticsScannerCallable.call(StatisticsScanner.java:102)
      4 frames
    4. Java RT
      Thread.run
      1. java.util.concurrent.FutureTask.run(FutureTask.java:266)
      2. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      3. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      4. java.lang.Thread.run(Thread.java:745)
      4 frames