java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630

There are no available Samebug tips for this exception. Do you have an idea how to solve this issue? A short tip would help users who saw this issue last week.

  • When JNLP slave detect ping timeout, it tries to reconnect. But if master have not noticed the timeout yet, it rejects the new connection from slave. JNLP slave agent process aborts once the connection is rejected in such a way. STDOUT of JNLP process: {noformat} INFO: Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1456918109582 hasn't completed at 1456918349582 at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) Mar 02, 2016 6:32:29 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run SEVERE: I/O error in channel channel java.net.SocketException: Socket closed at java.net.SocketInputStream.read(SocketInputStream.java:190) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82) at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72) at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103) at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) Mar 02, 2016 6:32:29 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Terminated Mar 02, 2016 6:32:39 AM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect INFO: Restarting slave via jenkins.slaves.restarter.UnixSlaveRestarter@6523ff4a Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main createEngine INFO: Setting up slave: dev127-virt2 Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener <init> INFO: Jenkins agent is running in headless mode. Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://jenkins.acme.com/hudson/, http://hudson.acme.com/hudson/] Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to jenkins.acme.com:37003 Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener error SEVERE: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection. java.lang.Exception: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection. at hudson.remoting.Engine.onConnectionRejected(Engine.java:306) at hudson.remoting.Engine.run(Engine.java:276) {noformat} Slave log on master: {noformat} JNLP agent connected from /10.16.180.145 <===[JENKINS REMOTING CAPACITY]===>ERROR: Connection terminated Connection terminated ha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@131dbee3[name=dev127-virt2] at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211) at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Broken pipe at sun.nio.ch.FileDispatcher.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69) at sun.nio.ch.IOUtil.write(IOUtil.java:40) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336) at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.send(FifoBuffer.java:130) at org.jenkinsci.remoting.nio.FifoBuffer.send(FifoBuffer.java:254) at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:622) ... 7 more Slave.jar version: 2.47 This is a Unix slave Slave successfully connected and online Connection terminated {noformat} Master log: {noformat} 2016-03-02 06:32:38,352 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Clock Difference started on Wed Mar 02 06:32:08 EST 2016) Failed to monitor dev127-virt2 for Clock Difference java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:271) at hudson.remoting.Request$1.get(Request.java:206) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:97) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280) ... (All monitors times out) 2016-03-02 06:32:42,860 INFO [hudson.TcpSlaveAgentListener] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) Accepted connection #41773 from /10.16.180.145:58248 2016-03-02 06:32:42,865 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: dev127-virt2 is already connected to this master. Rejecting this connection. 2016-03-02 06:32:42,866 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: Unrecognized name: dev127-virt2 ... 2016-03-02 06:33:43,630 INFO [hudson.slaves.ChannelPinger] (Ping thread for channel hudson.remoting.Channel@5caca20e:dev127-virt2) Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630 at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) ... 2016-03-02 06:38:26,902 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Free Temp Space started on Wed Mar 02 06:38:26 EST 2016) Failed to monitor dev127-virt2 for Free Temp Space hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:549) at hudson.remoting.Request.callAsync(Request.java:204) at hudson.remoting.Channel.callAsync(Channel.java:778) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280) Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:1105) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) Caused by: java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630 ... 2 more ... (All monitors fails with "channel is already closed") {noformat}
    via by Oliver Gondža,
  • When JNLP slave detect ping timeout, it tries to reconnect. But if master have not noticed the timeout yet, it rejects the new connection from slave. JNLP slave agent process aborts once the connection is rejected in such a way. STDOUT of JNLP process: {noformat} INFO: Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1456918109582 hasn't completed at 1456918349582 at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) Mar 02, 2016 6:32:29 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run SEVERE: I/O error in channel channel java.net.SocketException: Socket closed at java.net.SocketInputStream.read(SocketInputStream.java:190) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82) at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72) at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103) at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) Mar 02, 2016 6:32:29 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Terminated Mar 02, 2016 6:32:39 AM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect INFO: Restarting slave via jenkins.slaves.restarter.UnixSlaveRestarter@6523ff4a Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main createEngine INFO: Setting up slave: dev127-virt2 Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener <init> INFO: Jenkins agent is running in headless mode. Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://jenkins.acme.com/hudson/, http://hudson.acme.com/hudson/] Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to jenkins.acme.com:37003 Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Mar 02, 2016 6:32:42 AM hudson.remoting.jnlp.Main$CuiListener error SEVERE: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection. java.lang.Exception: The server rejected the connection: dev127-virt2 is already connected to this master. Rejecting this connection. at hudson.remoting.Engine.onConnectionRejected(Engine.java:306) at hudson.remoting.Engine.run(Engine.java:276) {noformat} Slave log on master: {noformat} JNLP agent connected from /10.16.180.145 <===[JENKINS REMOTING CAPACITY]===>ERROR: Connection terminated Connection terminated ha:AAAAWB+LCAAAAAAAAP9b85aBtbiIQSmjNKU4P08vOT+vOD8nVc8DzHWtSE4tKMnMz/PLL0ldFVf2c+b/lb5MDAwVRQxSaBqcITRIIQMEMIIUFgAAckCEiWAAAAA=java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@131dbee3[name=dev127-virt2] at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:211) at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:631) at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Broken pipe at sun.nio.ch.FileDispatcher.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69) at sun.nio.ch.IOUtil.write(IOUtil.java:40) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336) at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.send(FifoBuffer.java:130) at org.jenkinsci.remoting.nio.FifoBuffer.send(FifoBuffer.java:254) at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:622) ... 7 more Slave.jar version: 2.47 This is a Unix slave Slave successfully connected and online Connection terminated {noformat} Master log: {noformat} 2016-03-02 06:32:38,352 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Clock Difference started on Wed Mar 02 06:32:08 EST 2016) Failed to monitor dev127-virt2 for Clock Difference java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:271) at hudson.remoting.Request$1.get(Request.java:206) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:97) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280) ... (All monitors times out) 2016-03-02 06:32:42,860 INFO [hudson.TcpSlaveAgentListener] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) Accepted connection #41773 from /10.16.180.145:58248 2016-03-02 06:32:42,865 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: dev127-virt2 is already connected to this master. Rejecting this connection. 2016-03-02 06:32:42,866 WARNING [jenkins.slaves.JnlpSlaveHandshake] (TCP slave agent connection handler #41773 with /10.16.180.145:58248) TCP slave agent connection handler #41773 with /10.16.180.145:58248 is aborted: Unrecognized name: dev127-virt2 ... 2016-03-02 06:33:43,630 INFO [hudson.slaves.ChannelPinger] (Ping thread for channel hudson.remoting.Channel@5caca20e:dev127-virt2) Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630 at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) ... 2016-03-02 06:38:26,902 WARNING [hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor] (Monitoring thread for Free Temp Space started on Wed Mar 02 06:38:26 EST 2016) Failed to monitor dev127-virt2 for Free Temp Space hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:549) at hudson.remoting.Request.callAsync(Request.java:204) at hudson.remoting.Channel.callAsync(Channel.java:778) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280) Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:1105) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86) Caused by: java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630 ... 2 more ... (All monitors fails with "channel is already closed") {noformat}
    via by Oliver Gondža,
  • I have set up Jenkins slave Windows service so that it is restarted by system on failure. It works OK if slave fails for whatever reasons. Unfortunately recently master keeps failing due to out of memory problems and after master reboot some Windows slave do not reconnect. Service has to be restarted in order to connect slave again. {panel:title=jenkins-slave.err.log} Oct 02, 2012 3:59:10 PM hudson.remoting.jnlp.Main$CuiListener <init> INFO: Hudson agent is running in headless mode. Oct 02, 2012 3:59:10 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://server:8080/] Oct 02, 2012 3:59:10 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to server:21234 Oct 02, 2012 3:59:10 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Oct 02, 2012 3:59:11 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Connected Oct 03, 2012 8:11:21 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Terminated Oct 03, 2012 8:12:22 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://server:8080/] Oct 03, 2012 8:12:22 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to server:21234 Oct 03, 2012 8:12:22 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Oct 03, 2012 8:12:22 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connected Oct 03, 2012 8:15:19 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Terminated Oct 03, 2012 8:15:59 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://server:8080/] Oct 03, 2012 8:15:59 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to server:21234 Oct 03, 2012 8:15:59 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Oct 03, 2012 8:15:59 AM hudson.remoting.jnlp.Main$CuiListener status INFO: Connected Oct 03, 2012 9:44:59 PM hudson.slaves.ChannelPinger$1 onDead INFO: Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1349293259718 hasn't completed at 1349293499718 at hudson.remoting.PingThread.ping(PingThread.java:120) at hudson.remoting.PingThread.run(PingThread.java:81) Oct 03, 2012 9:44:59 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run SEVERE: I/O error in channel channel java.net.SocketException: socket closed at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(Unknown Source) at java.net.SocketInputStream.read(Unknown Source) at java.io.BufferedInputStream.fill(Unknown Source) at java.io.BufferedInputStream.read(Unknown Source) at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source) at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source) at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source) at java.io.ObjectInputStream.readObject0(Unknown Source) at java.io.ObjectInputStream.readObject(Unknown Source) at hudson.remoting.Command.readFrom(Command.java:90) at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) Oct 03, 2012 9:44:59 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Terminated Oct 04, 2012 4:20:07 PM hudson.remoting.jnlp.Main$CuiListener <init> INFO: Hudson agent is running in headless mode. Oct 04, 2012 4:20:07 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Locating server among [http://server:8080/] Oct 04, 2012 4:20:07 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Connecting to server:21234 Oct 04, 2012 4:20:07 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Handshaking Oct 04, 2012 4:20:07 PM hudson.remoting.jnlp.Main$CuiListener status INFO: Connected {panel} {panel:title=jenkins-slave.wrapper.log} 2012-10-02 15:59:10 - Starting C:\Program Files\Java\jre7\bin\java.exe -Xrs -jar "C:\Jenkins\slave.jar" -jnlpUrl http://server:8080/computer/slave-1/slave-agent.jnlp 2012-10-02 15:59:10 - Started 5680 2012-10-04 16:20:05 - Stopping jenkinsslave-C__Jenkins 2012-10-04 16:20:05 - ProcessKill 5680 2012-10-04 16:20:05 - Finished jenkinsslave-C__Jenkins 2012-10-04 16:20:06 - Starting C:\Program Files\Java\jre7\bin\java.exe -Xrs -jar "C:\Jenkins\slave.jar" -jnlpUrl http://server:8080/computer/slave-1/slave-agent.jnlp 2012-10-04 16:20:06 - Started 6464 {panel} Note that before exception slave itself reconnected many times to master, but after exception it did not recover until service was restarted.
    via by Krzysztof Malinowski,
  • Slave stops responding for whatever reason. First ping thread notices slave does not respond: Nov 26, 2014 2:00:59 PM INFO hudson.slaves.ChannelPinger$1 onDead Ping failed. Terminating the channel. java.util.concurrent.TimeoutException: Ping started on 1417006619758 hasn't completed at 1417006859758 at hudson.remoting.PingThread.ping(PingThread.java:120) at hudson.remoting.PingThread.run(PingThread.java:81) (would be nice if we could see which channel/slave) A bit later: Nov 26, 2014 2:14:34 PM WARNING hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor monitor Failed to monitor SV-ARG-DEV-D23 for Clock Difference hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:541) at hudson.remoting.Request.callAsync(Request.java:208) at hudson.remoting.Channel.callAsync(Channel.java:766) at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76) at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280) Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:1027) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:120) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1417006619758 hasn't completed at 1417006859758 ... 2 more Yet node is not shown as offline, and still shows as executor busy. Threaddump shows executor thread on master: Executor #0 for DWI01164 : executing OKA.R201501/Team D/_publics/Public - 4 #103 / waiting for hudson.remoting.Channel@3b91e986:DWI01164 "Executor #0 for DWI01164 : executing OKA.R201501/Team D/_publics/Public - 4 #103 / waiting for hudson.remoting.Channel@3b91e986:DWI01164" Id=16260 Group=main TIMED_WAITING on hudson.remoting.UserRequest@e4ca0c4 at java.lang.Object.wait(Native Method) - waiting on hudson.remoting.UserRequest@e4ca0c4 at hudson.remoting.Request.call(Request.java:146) at hudson.remoting.Channel.call(Channel.java:739) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168) at com.sun.proxy.$Proxy56.join(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:956) at hudson.Launcher$ProcStarter.join(Launcher.java:367) at hudson.tasks.Maven.perform(Maven.java:328) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:756) at hudson.model.Build$BuildExecution.build(Build.java:198) at hudson.model.Build$BuildExecution.doRun(Build.java:159) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529) at hudson.model.Run.execute(Run.java:1706) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:232) It looks like ChannelPinger calls channel.close() which sets channel.outClosed , yet Request is stuck in a loop checking for channel.isInClosed(), which was not set. Should Request check for channel.isClosingOrClosed()?
    via by Wannes Sels,
  • Almost on daily basis my Jenkins is shutting down is taking ALL slaves offline. The reasons for this is unknown to me and looks like a severe bug. Can you please help to check this? Based on my observation I notice that connecting new slaves seems to fail with an SSL exception. ---- Sep 22, 2015 8:08:42 AM org.eclipse.jetty.util.log.JavaUtilLog warn WARNING: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(Unknown Source) at sun.nio.ch.SocketChannelImpl.write(Unknown Source) at org.eclipse.jetty.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:293) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:402) at org.eclipse.jetty.io.nio.SslConnection.process(SslConnection.java:337) at org.eclipse.jetty.io.nio.SslConnection.access$900(SslConnection.java:48) at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.flush(SslConnection.java:738) at org.eclipse.jetty.io.nio.SslConnection$SslEndPoint.shutdownOutput(SslConnection.java:641) at org.eclipse.jetty.io.nio.SslConnection.onIdleExpired(SslConnection.java:260) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.onIdleExpired(SelectChannelEndPoint.java:349) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:326) at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Sep 22, 2015 8:08:48 AM org.eclipse.jetty.util.log.JavaUtilLog warn WARNING: handle failed java.lang.IllegalStateException: Internal error at sun.security.ssl.SSLEngineImpl.initHandshaker(Unknown Source) at sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source) at sun.security.ssl.SSLEngineImpl.readNetRecord(Unknown Source) at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) at javax.net.ssl.SSLEngine.unwrap(Unknown Source) at org.eclipse.jetty.io.nio.SslConnection.unwrap(SslConnection.java:536) at org.eclipse.jetty.io.nio.SslConnection.process(SslConnection.java:401) at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:193) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) ---- Shortly afterwards I can see that Jenkins is taking ALL slaves offline ---- Sep 22, 2015 8:20:54 AM hudson.slaves.ChannelPinger$1 onDead INFO: Ping failed. Terminating the channel SLAVE-101051. java.util.concurrent.TimeoutException: Ping started at 1442902614156 hasn't completed by 1442902854206 at hudson.remoting.PingThread.ping(PingThread.java:126) at hudson.remoting.PingThread.run(PingThread.java:85) ---- Afterwards ALL slaves want to register back to Jenkins but Jenkins is rejecting it with ---- INFO: Accepted connection #288 from /10.0.209.109:64213 Sep 22, 2015 8:47:00 AM jenkins.slaves.JnlpSlaveHandshake error WARNING: TCP slave agent connection handler #288 with /10.0.209.109:64213 is aborted: SLAVE-719161 is already connected to this master. Rejecting this connection. Sep 22, 2015 8:47:00 AM hudson.TcpSlaveAgentListener$ConnectionHandler run ---- If Jenkins kicks out all slaves, I would expect Jenkins to allow it get automatically accepted again instead of referring to already existing connection. But that all slaves are being taken offline at once due to PING FAIL is rather a bug. Please find full logs attached as well!
    via by Hans Baer,
    • java.util.concurrent.TimeoutException: Ping started on 1456918183629 hasn't completed at 1456918423630 at hudson.remoting.PingThread.ping(PingThread.java:125) at hudson.remoting.PingThread.run(PingThread.java:86)
    No Bugmate found.