hudson.remoting.ChannelClosedException: channel is already closed

Jenkins JIRA | Johannes Wienke | 2 years ago
  1. 0

    For approximately 9 months we have had constant trouble with our master losing connectivity to our Mac OS Mavericks slave (via SSH). After some time, the master can no longer communicate with the slave, so jobs fail with the following error message:

    {noformat}
    Building remotely on MAC_OS_mavericks_64bit (macos mavericks java7)FATAL: channel is already closed
    hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:541)
        at hudson.remoting.Request.call(Request.java:129)
        at hudson.remoting.Channel.call(Channel.java:739)
        at hudson.EnvVars.getRemote(EnvVars.java:404)
        at hudson.model.Computer.getEnvironment(Computer.java:912)
        at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
        at hudson.model.Run.getEnvironment(Run.java:2221)
        at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:885)
        at hudson.matrix.MatrixRun$MatrixRunExecution.decideWorkspace(MatrixRun.java:175)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:513)
        at hudson.model.Run.execute(Run.java:1706)
        at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:232)
    Caused by: java.io.IOException
        at hudson.remoting.Channel.close(Channel.java:1027)
        at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
        at hudson.remoting.PingThread.ping(PingThread.java:120)
        at hudson.remoting.PingThread.run(PingThread.java:81)
    Caused by: java.util.concurrent.TimeoutException: Ping started on 1412069451954 hasn't completed at 1412069691955
        ... 2 more
    {noformat}

    At some point, the slave is then marked as offline. When trying to reconnect, nothing happens: you see an empty log window with just the circling loading animation, and no output is ever generated. We could not observe any issues with the underlying network connection. Every time I observe this error, ssh-ing to the slave as the jenkins user is possible without any problems. This also only happens for the Mavericks slave. All other Linux and Windows slaves work perfectly.

    What is extremely confusing is that once Jenkins has ended up in this condition, you cannot restart it cleanly. You first have to kill the Java process with SIGKILL, even though it is apparently not completely stuck, since everything apart from the Mavericks slave continues to work perfectly.

    The general log file for Jenkins only shows that the jobs for checking disk space etc. also suffer from the connectivity issue:

    {noformat}
    WARNING: Failed to monitor MAC_OS_mavericks_64bit for Architecture
    hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:541)
        at hudson.remoting.Request.callAsync(Request.java:208)
        at hudson.remoting.Channel.callAsync(Channel.java:766)
        at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:76)
        at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:280)
    Caused by: java.io.IOException
        at hudson.remoting.Channel.close(Channel.java:1027)
        at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
        at hudson.remoting.PingThread.ping(PingThread.java:120)
        at hudson.remoting.PingThread.run(PingThread.java:81)
    Caused by: java.util.concurrent.TimeoutException: Ping started on 1412069451954 hasn't completed at 1412069691955
        ... 2 more
    {noformat}

    Apart from this, no errors are visible for that slave. A thread dump from the situation where the master tries to reconnect to the slave but nothing happens is available here: http://pastebin.com/DxFU8j7C

    Jenkins JIRA | 2 years ago | Johannes Wienke
    hudson.remoting.ChannelClosedException: channel is already closed
  2. 0

    [JENKINS-6817] FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel - Jenkins JIRA

    jenkins-ci.org | 11 months ago
    hudson.util.IOException2: remote file operation failed: /home/hudson/ci/jenkins/slaves/robusta/workspace/dp_10.0_marker_nightly at hudson.remoting.Channel@789e3ac5:robusta
  3. 0

    Jenkins issues - [JIRA] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

    nabble.com | 1 year ago
    org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
  5. 0

    I observe that jobs starting at the same time the slave is performing some sort of cleanup action hang:

    Build Log:

    {noformat}
    17:04:21 Started by upstream project "pipeline-test-2" build number 422
    17:04:21 originally caused by:
    17:04:21 Started by upstream project "master-test" build number 867
    17:04:21 originally caused by:
    17:04:21 Started by upstream project "master-build" build number 5131
    17:04:21 originally caused by:
    17:04:21 Started by an SCM change
    17:04:21 [EnvInject] - Loading node environment variables.
    ... hang for 14h ...
    {noformat}

    Slave Log:

    {noformat}
    ... lots of output ...
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-j2se.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/javassist.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system-jmx.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-system.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-mdr.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    INFO: Deleting /tmp/1408400946520-0/jboss51x_centos5-x64-bld-slave-09.corp.XXX.com_10002/lib/jboss-logging-spi.jar (atime=1408400947, diff=-5424)
    Aug 25, 2014 5:01:12 PM hudson.plugins.tmpcleaner.TmpCleanTask visit
    {noformat}

    After interrupting the job, I get this:

    {noformat}
    0:30:26 ERROR: SEVERE ERROR occurs
    10:30:26 org.jenkinsci.lib.envinject.EnvInjectException: java.lang.InterruptedException
    10:30:26     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:77)
    10:30:26     at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
    10:30:26     at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
    10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:575)
    10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:481)
    10:30:26     at hudson.model.Run.execute(Run.java:1689)
    10:30:26     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    10:30:26     at hudson.model.ResourceController.execute(ResourceController.java:88)
    10:30:26     at hudson.model.Executor.run(Executor.java:231)
    10:30:26 Caused by: java.lang.InterruptedException
    10:30:26     at java.lang.Object.wait(Native Method)
    10:30:26     at hudson.remoting.Request.call(Request.java:146)
    10:30:26     at hudson.remoting.Channel.call(Channel.java:722)
    10:30:26     at hudson.FilePath.act(FilePath.java:1003)
    10:30:26     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
    10:30:26     ... 8 more
    10:30:26 Archiving artifacts
    10:30:26 ERROR: Publisher hudson.tasks.Mailer aborted due to exception
    10:30:26 hudson.remoting.ChannelClosedException: channel is already closed
    10:30:26     at hudson.remoting.Channel.send(Channel.java:524)
    10:30:26     at hudson.remoting.Request.call(Request.java:129)
    10:30:26     at hudson.remoting.Channel.call(Channel.java:722)
    10:30:26     at hudson.EnvVars.getRemote(EnvVars.java:404)
    10:30:26     at hudson.model.Computer.getEnvironment(Computer.java:911)
    10:30:26     at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
    10:30:26     at hudson.model.Run.getEnvironment(Run.java:2202)
    10:30:26     at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:873)
    10:30:26     at hudson.tasks.Mailer.perform(Mailer.java:134)
    10:30:26     at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
    10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:714)
    10:30:26     at hudson.model.Build$BuildExecution.post2(Build.java:182)
    10:30:26     at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:663)
    10:30:26     at hudson.model.Run.execute(Run.java:1714)
    10:30:26     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    10:30:26     at hudson.model.ResourceController.execute(ResourceController.java:88)
    10:30:26     at hudson.model.Executor.run(Executor.java:231)
    10:30:26 Caused by: java.io.IOException
    10:30:26     at hudson.remoting.Channel.close(Channel.java:1007)
    10:30:26     at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
    10:30:26     at hudson.remoting.PingThread.ping(PingThread.java:120)
    10:30:26     at hudson.remoting.PingThread.run(PingThread.java:81)
    10:30:26 Caused by: java.util.concurrent.TimeoutException: Ping started on 1409011309670 hasn't completed at 1409011549671
    10:30:26     ... 2 more
    10:30:26 [BFA] Scanning build for known causes...
    10:30:26
    10:30:26 [BFA] Done. 0s
    10:30:26 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
    {noformat}

    Jenkins JIRA | 2 years ago | Christian Goetze
    hudson.remoting.ChannelClosedException: channel is already closed
  6. 0

    The Jenkins server version is 1.480.1.
    The Jenkins master is not in the same time zone as the slave (the slave is 8 hours behind).
    Slave launch method: Launch slave agents on Unix machines via SSH.

    A job run on this slave failed immediately. The corresponding output is:

    Started by user Dominique Ledit
    Building remotely on vm-hou-bldrh53cFATAL: channel is already closed
    hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:493)
        at hudson.remoting.Request.call(Request.java:129)
        at hudson.remoting.Channel.call(Channel.java:664)
        at hudson.EnvVars.getRemote(EnvVars.java:202)
        at hudson.model.Computer.getEnvironment(Computer.java:844)
        at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
        at hudson.model.Run.getEnvironment(Run.java:1952)
        at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:843)
        at hudson.model.AbstractBuild$AbstractBuildExecution.decideWorkspace(AbstractBuild.java:462)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:482)
        at hudson.model.Run.execute(Run.java:1502)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:236)
    Caused by: java.io.IOException
        at hudson.remoting.Channel.close(Channel.java:902)
        at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
        at hudson.remoting.PingThread.ping(PingThread.java:120)
        at hudson.remoting.PingThread.run(PingThread.java:81)
    Caused by: java.util.concurrent.TimeoutException: Ping started on 1361148569659 hasn't completed at 1361148809659
        ... 2 more

    Jenkins JIRA | 4 years ago | Dominique Ledit
    hudson.remoting.ChannelClosedException: channel is already closed

    Root Cause Analysis

    1. java.util.concurrent.TimeoutException

      Ping started on 1412069451954 hasn't completed at 1412069691955

      at hudson.remoting.PingThread.ping()
    2. Hudson :: Remoting Layer
      PingThread.run
      1. hudson.remoting.PingThread.ping(PingThread.java:120)
      2. hudson.remoting.PingThread.run(PingThread.java:81)
      2 frames
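
The two numbers in each "Ping started on ... hasn't completed at ..." message look like millisecond timestamps, so subtracting them shows how long the ping thread waited before the channel was closed. The following is only a minimal sketch of that arithmetic, applied to the three messages quoted in the reports above. The class name `PingTimeoutGap` and the helper `elapsed` are hypothetical and not part of Jenkins or the remoting library, and reading the values as epoch milliseconds is an assumption.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jenkins: computes the gap between the two
// timestamps in a "Ping started on X hasn't completed at Y" message.
public class PingTimeoutGap {

    // Matches the message text exactly as it appears in the stack traces above.
    private static final Pattern PING_MESSAGE =
            Pattern.compile("Ping started on (\\d+) hasn't completed at (\\d+)");

    // Assumption: both numbers are milliseconds on the same clock.
    static Duration elapsed(String message) {
        Matcher m = PING_MESSAGE.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a ping timeout message: " + message);
        }
        long started = Long.parseLong(m.group(1));
        long gaveUp = Long.parseLong(m.group(2));
        return Duration.ofMillis(gaveUp - started);
    }

    public static void main(String[] args) {
        // The three messages quoted in the reports on this page.
        String[] messages = {
            "Ping started on 1412069451954 hasn't completed at 1412069691955",
            "Ping started on 1409011309670 hasn't completed at 1409011549671",
            "Ping started on 1361148569659 hasn't completed at 1361148809659",
        };
        for (String message : messages) {
            Duration gap = elapsed(message);
            System.out.println(gap.toMillis() + " ms (" + gap.getSeconds() + " s)");
        }
    }
}
```

In all three reports the gap comes out at roughly 240 seconds, which suggests that each failure is the same ping timeout firing rather than three unrelated errors.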