java.net.SocketTimeoutException: Read timed out
We are using MySQL as the metastore database, and the Hive Thrift server connects to it.
The design / architecture is like this:
Oozie --> Hive Action --> ELB (AWS) --> Hive Thrift (2 servers) -->
MySQL (Master) --> MySQL (Slave).
Hive version: 0.10.0
It looks like the Thrift server has trouble responding when the load from
certain operations goes beyond some threshold.
Because Hive jobs sometimes fail due to this issue, we also have an
auto-restart check: if the Thrift server is not responding, it stops or
kills the service and restarts it.
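A minimal sketch of the kind of responsiveness probe such an auto-restart check could use, not our actual script. It only tests whether a TCP connection to the Thrift port opens within a timeout. The function name, the host name in the usage comment, and port 10000 are assumptions for illustration. A server that is stuck under load may still accept connections, so a successful probe does not prove that queries will be answered, and probing through the ELB would test the load balancer rather than an individual Thrift server.

```python
import socket


def thrift_port_is_responsive(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Hypothetical usage, probing one Thrift server directly:
#   if not thrift_port_is_responsive("hive-thrift-1.example.internal", 10000):
#       ...stop or kill the service, then restart it...
```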
Other tuning done:
Set an 11 GB heap and configured the CMS GC algorithm.
Tuned the MySQL innodb_buffer, tmp_table and max_heap parameters.
Can somebody please help me understand what the root cause could be, or
has anybody faced a similar issue?
I found one related JIRA:
That JIRA reports that the Hive Thrift server throws an OOM error, but in
my case I did not see any OOM error.
Full Exception Stack: