Jenkins master mismatch java version with builders (java 21.0.5 vs 21.0.4) causing java.io.StreamCorruptedException: invalid stream header #9444

Closed
yaronkaikov opened this issue Dec 2, 2024 · 32 comments
@yaronkaikov
Contributor

Seen multiple times today, https://jenkins.scylladb.com/job/scylla-master/job/artifacts/job/artifacts-azure-image-test/696/ and https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci-offline-installer/23/

Failed during checkout:

10:03:17  ERROR: Checkout failed
10:03:17  java.io.StreamCorruptedException: invalid stream header: 636F7272
10:03:17  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
10:03:17  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
10:03:17  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
10:03:17  	at hudson.remoting.Command.readFrom(Command.java:141)
10:03:17  	at hudson.remoting.Command.readFrom(Command.java:127)
10:03:17  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
10:03:17  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
10:03:17  Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to builders-6orui6
10:03:17  		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1923)
10:03:17  		at hudson.remoting.Request.call(Request.java:204)
10:03:17  		at hudson.remoting.Channel.call(Channel.java:1111)
10:03:17  		at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:153)
10:03:17  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
10:03:17  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
10:03:17  		at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:138)
10:03:17  		at PluginClassLoader for git-client/jdk.proxy144/jdk.proxy144.$Proxy899.execute(Unknown Source)
10:03:17  		at PluginClassLoader for git//hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1220)
10:03:17  		at PluginClassLoader for git//hudson.plugins.git.GitSCM._checkout(GitSCM.java:1310)
10:03:17  		at PluginClassLoader for git//hudson.plugins.git.GitSCM.checkout(GitSCM.java:1277)
10:03:17  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
10:03:17  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:101)
10:03:17  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:88)
10:03:17  		at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
10:03:17  		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
10:03:17  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
10:03:17  		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
10:03:17  		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
10:03:17  		at java.base/java.lang.Thread.run(Thread.java:1583)
10:03:17  Caused: hudson.remoting.RequestAbortedException
10:03:17  	at hudson.remoting.Request.abort(Request.java:358)
10:03:17  	at hudson.remoting.Channel.terminate(Channel.java:1196)
10:03:17  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:95)
10:03:17  Retrying after 10 seconds
10:03:27  ERROR: Checkout failed
10:03:27  java.io.StreamCorruptedException: invalid stream header: 636F7272
10:03:27  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
10:03:27  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
10:03:27  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
10:03:27  	at hudson.remoting.Command.readFrom(Command.java:141)
10:03:27  	at hudson.remoting.Command.readFrom(Command.java:127)
10:03:27  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
10:03:27  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
10:03:27  Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@2eb4cd56:builders-6orui6": Remote call on builders-6orui6 failed. The channel is closing down or has closed down
10:03:27  	at hudson.remoting.Channel.call(Channel.java:1105)
10:03:27  	at hudson.FilePath.act(FilePath.java:1207)
10:03:27  	at hudson.FilePath.act(FilePath.java:1196)
10:03:27  	at hudson.FilePath.mkdirs(FilePath.java:1387)
10:03:27  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.createClient(GitSCM.java:843)
10:03:27  	at PluginClassLoader for git//hudson.plugins.git.GitSCM._checkout(GitSCM.java:1299)
10:03:27  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.checkout(GitSCM.java:1277)
10:03:27  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
10:03:27  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:101)
10:03:27  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:88)
10:03:27  	at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
10:03:27  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
10:03:27  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
10:03:27  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
10:03:27  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
10:03:27  	at java.base/java.lang.Thread.run(Thread.java:1583)
10:03:27  Retrying after 10 seconds
10:03:37  ERROR: Checkout failed
10:03:37  java.io.StreamCorruptedException: invalid stream header: 636F7272
10:03:37  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
10:03:37  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
10:03:37  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
10:03:37  	at hudson.remoting.Command.readFrom(Command.java:141)
10:03:37  	at hudson.remoting.Command.readFrom(Command.java:127)
10:03:37  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
10:03:37  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
10:03:37  Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@2eb4cd56:builders-6orui6": Remote call on builders-6orui6 failed. The channel is closing down or has closed down
10:03:37  	at hudson.remoting.Channel.call(Channel.java:1105)
10:03:37  	at hudson.FilePath.act(FilePath.java:1207)
10:03:37  	at hudson.FilePath.act(FilePath.java:1196)
10:03:37  	at hudson.FilePath.mkdirs(FilePath.java:1387)
10:03:37  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.createClient(GitSCM.java:843)
10:03:37  	at PluginClassLoader for git//hudson.plugins.git.GitSCM._checkout(GitSCM.java:1299)
10:03:37  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.checkout(GitSCM.java:1277)
10:03:37  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
10:03:37  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:101)
10:03:37  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:88)
10:03:37  	at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
10:03:37  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
10:03:37  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
10:03:37  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
10:03:37  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
10:03:37  	at java.base/java.lang.Thread.run(Thread.java:1583)
10:03:37  Retrying after 10 seconds
10:03:47  ERROR: Checkout failed
10:03:47  java.io.StreamCorruptedException: invalid stream header: 636F7272
10:03:47  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
10:03:47  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
10:03:47  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
10:03:47  	at hudson.remoting.Command.readFrom(Command.java:141)
10:03:47  	at hudson.remoting.Command.readFrom(Command.java:127)
10:03:47  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
10:03:47  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
10:03:47  Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@2eb4cd56:builders-6orui6": Remote call on builders-6orui6 failed. The channel is closing down or has closed down
10:03:47  	at hudson.remoting.Channel.call(Channel.java:1105)
10:03:47  	at hudson.FilePath.act(FilePath.java:1207)
10:03:47  	at hudson.FilePath.act(FilePath.java:1196)
10:03:47  	at hudson.FilePath.mkdirs(FilePath.java:1387)
10:03:47  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.createClient(GitSCM.java:843)
10:03:47  	at PluginClassLoader for git//hudson.plugins.git.GitSCM._checkout(GitSCM.java:1299)
10:03:47  	at PluginClassLoader for git//hudson.plugins.git.GitSCM.checkout(GitSCM.java:1277)
10:03:47  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
10:03:47  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:101)
10:03:47  	at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:88)
10:03:47  	at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
10:03:47  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
10:03:47  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
10:03:47  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
10:03:47  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
10:03:47  	at java.base/java.lang.Thread.run(Thread.java:1583)
10:03:47  ERROR: Maximum checkout retry attempts reached, aborting
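A note on the `invalid stream header: 636F7272` value in the trace: a Java serialization stream normally begins with the magic bytes `AC ED 00 05`, so the reader received something else at the start of the stream. The sketch below is only an illustration, assuming the logged value is the first four bytes rendered as hex; the helper name `decode_stream_header` is made up for this example and is not part of Jenkins.

```python
# Java's ObjectOutputStream writes STREAM_MAGIC (0xACED) followed by
# STREAM_VERSION (0x0005) at the start of every serialization stream.
JAVA_SERIALIZATION_HEADER = bytes.fromhex("ACED0005")


def decode_stream_header(hex_header: str) -> tuple[bool, str]:
    """Return (is_valid_java_header, printable_ascii) for a logged hex header.

    Hypothetical helper for illustration only.
    """
    raw = bytes.fromhex(hex_header)
    is_valid = raw == JAVA_SERIALIZATION_HEADER
    # Show printable ASCII bytes as characters and anything else as '.'.
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return is_valid, text


if __name__ == "__main__":
    # Header value copied from the stack trace above.
    print(decode_stream_header("636F7272"))
```

The four bytes decode to the ASCII text `corr`, which suggests that plain text reached the channel where a serialized command was expected. On its own this does not say which side wrote that text or why.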
@yaronkaikov yaronkaikov added the P1 Urgent label Dec 2, 2024
@yaronkaikov yaronkaikov assigned fruch and roydahan and unassigned yaronkaikov Dec 2, 2024
@fruch
Contributor

fruch commented Dec 2, 2024

@yaronkaikov

I don't know what got it started

But it's the Jenkins client code failing, which means you'll need your team to help figure it out...

@fruch
Contributor

fruch commented Dec 2, 2024

> @yaronkaikov
>
> I don't know what got it started
>
> But it's the Jenkins client code failing, which means you'll need your team to help figure it out...

This suggests communication issues between the builder and the master:
https://wiki.jenkins.io/display/JENKINS/Remoting+issue

@fruch
Contributor

fruch commented Dec 8, 2024

For now, it seems to be happening more often in the artifacts pipeline:
https://jenkins.scylladb.com/job/enterprise-2024.2/job/artifacts/job/artifacts-ami-arm-test/38/

@vponomaryov
Contributor

vponomaryov commented Dec 10, 2024

A similar problem happens not during checkout but on the stream in long-running CI jobs, and it fails 100% of the time.
It makes every pipeline stage after the test-run stage fail: no resource deletion, no logs...

13:06:04  java.io.StreamCorruptedException: invalid stream header: 636F7272
13:06:04  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
13:06:04  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
13:06:04  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
13:06:04  	at hudson.remoting.Command.readFrom(Command.java:141)
13:06:04  	at hudson.remoting.Command.readFrom(Command.java:127)
13:06:04  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
13:06:04  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
13:06:04  Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to i-07796f3858ecbee63
13:06:04  		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1923)
13:06:04  		at hudson.remoting.Request.call(Request.java:204)
13:06:04  		at hudson.remoting.Channel.call(Channel.java:1111)
13:06:04  		at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1121)
13:06:04  		at hudson.Launcher$ProcStarter.start(Launcher.java:507)
13:06:04  		at PluginClassLoader for durable-task//org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:180)
13:06:04  		at PluginClassLoader for durable-task//org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:134)
13:06:04  		at PluginClassLoader for workflow-durable-task-step//org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:330)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:323)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:196)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:124)
13:06:04  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
13:06:04  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
13:06:04  		at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
13:06:04  		at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
13:06:04  		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
13:06:04  		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
13:06:04  		at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
13:06:04  		at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
13:06:04  		at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
13:06:04  		at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:180)
13:06:04  		at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
13:06:04  		at PluginClassLoader for script-security//org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:163)
13:06:04  		at PluginClassLoader for script-security//org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:148)
13:06:04  		at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:178)
13:06:04  		at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:182)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.LoggingInvoker.methodCall(LoggingInvoker.java:117)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:90)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:116)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:85)
13:06:04  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
13:06:04  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:110)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:85)
13:06:04  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
13:06:04  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.CastBlock$ContinuationImpl.cast(CastBlock.java:47)
13:06:04  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
13:06:04  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.dispatch(CollectionLiteralBlock.java:55)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.CollectionLiteralBlock$ContinuationImpl.item(CollectionLiteralBlock.java:45)
13:06:04  		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
13:06:04  		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Next.step(Next.java:83)
13:06:04  		at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:147)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:17)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:49)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:180)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:422)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$wrap$4(CpsVmExecutorService.java:140)
13:06:04  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
13:06:04  		at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
13:06:04  		at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
13:06:04  		at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
13:06:04  		at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
13:06:04  		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
13:06:04  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
13:06:04  		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
13:06:04  		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:53)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:50)
13:06:04  		at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
13:06:04  		at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
13:06:04  		at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$categoryThreadFactory$0(CpsVmExecutorService.java:50)
13:06:04  		at java.base/java.lang.Thread.run(Thread.java:1583)
13:06:04  Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: a0e40698-1766-46f1-82ed-09c81ec9951a
13:06:04  Caused: hudson.remoting.RequestAbortedException
13:06:04  	at hudson.remoting.Request.abort(Request.java:358)
13:06:04  	at hudson.remoting.Channel.terminate(Channel.java:1196)
13:06:04  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:95)

CI job as an example: https://jenkins.scylladb.com/job/scylla-staging/job/valerii/job/vp-longevity-aws-custom-d2-workload1-multidc-big/17/console

I consider this a serious bug that should be investigated and fixed ASAP.

@fruch
Contributor

fruch commented Dec 10, 2024

@vponomaryov

The builder probably got killed, the question is why, and by whom.

@vponomaryov
Contributor

> @vponomaryov
>
> The builder probably got killed, the question is why, and by whom.

This is consistent behavior: all my long-running Jenkins jobs fail this way.
It doesn't look like manual activity.

Probably some cleanup logic somewhere was changed and now cleans up builders regardless of whether they are in use.

@fruch
Contributor

fruch commented Dec 10, 2024

Looking in AWS CloudTrail -> Event history:

The builder was terminated after the error occurred in your job.

| Event name | Event time | User name | Event source | Resource type | Resource name |
|---|---|---|---|---|---|
| TerminateInstances | December 10, 2024, 13:18:20 (UTC+02:00) | AutoScaling | ec2.amazonaws.com | AWS::EC2::Instance | |
| TerminateInstances | December 10, 2024, 13:17:55 (UTC+02:00) | jenkins2 | ec2.amazonaws.com | AWS::EC2::Instance | |

13:06:04  java.io.StreamCorruptedException: invalid stream header: 636F7272
13:06:04  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
13:06:04  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
13:06:04  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
13:06:04  	at hudson.remoting.Command.readFrom(Command.java:141)
13:06:04  	at hudson.remoting.Command.readFrom(Command.java:127)

so it's not a builder that died
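The ordering argument above can be checked with a small sketch. It assumes the Jenkins console timestamp (13:06:04) and the CloudTrail event times are in the same time zone (UTC+02:00) and on the same date, which the thread does not state explicitly; the function and variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both the Jenkins log and CloudTrail times are UTC+02:00.
TZ = timezone(timedelta(hours=2))


def at(hms: str) -> datetime:
    """Build a timezone-aware datetime on December 10, 2024 from HH:MM:SS."""
    naive = datetime.strptime(f"2024-12-10 {hms}", "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=TZ)


# First StreamCorruptedException line in the job log.
error_time = at("13:06:04")

# TerminateInstances events from CloudTrail, keyed by user name.
terminations = {
    "jenkins2": at("13:17:55"),
    "AutoScaling": at("13:18:20"),
}

# Seconds between the error and each termination; positive means the
# instance was terminated after the error was logged.
delays = {
    user: (when - error_time).total_seconds()
    for user, when in terminations.items()
}

if __name__ == "__main__":
    for user, seconds in delays.items():
        print(f"{user}: terminated {seconds:.0f}s after the error")
```

Under that time-zone assumption both terminations come roughly twelve minutes after the error, which matches the conclusion that the instance termination was a consequence rather than the trigger.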

So the next theories are:

  • interference from something else using the builder; up to 4 jobs can share a builder
  • communication issues between the Jenkins master in GCP and the builders in AWS

So I would start by changing the configuration for this region so that builders are no longer shared.

@Annamikhlin

Annamikhlin commented Dec 11, 2024

Just saw the same error today in
https://jenkins.scylladb.com/job/releng-testing/job/artifacts/job/artifacts-docker-test/13/ that was triggered by scylla-ci-docker

08:36:28  ERROR: Checkout failed
08:36:28  java.io.StreamCorruptedException: invalid stream header: 636F7272
08:36:28  	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:989)
08:36:28  	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:416)
08:36:28  	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
08:36:28  	at hudson.remoting.Command.readFrom(Command.java:141)
08:36:28  	at hudson.remoting.Command.readFrom(Command.java:127)
08:36:28  	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
08:36:28  	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:62)
08:36:28  Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to i-0f63748fd6fd3604f
08:36:28  		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1923)
08:36:28  		at hudson.remoting.Request.call(Request.java:204)
08:36:28  		at hudson.remoting.Channel.call(Channel.java:1111)
08:36:28  		at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:306)
08:36:28  		at PluginClassLoader for git-client/jdk.proxy23/jdk.proxy23.$Proxy102.hasGitRepo(Unknown Source)
08:36:28  		at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.RemoteGitImpl.hasGitRepo(RemoteGitImpl.java:330)
08:36:28  		at PluginClassLoader for git//hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1205)
08:36:28  		at PluginClassLoader for git//hudson.plugins.git.GitSCM._checkout(GitSCM.java:1310)
08:36:28  		at PluginClassLoader for git//hudson.plugins.git.GitSCM.checkout(GitSCM.java:1277)
08:36:28  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:136)
08:36:28  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:101)
08:36:28  		at PluginClassLoader for workflow-scm-step//org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:88)
08:36:28  		at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
08:36:28  		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
08:36:28  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
08:36:28  		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
08:36:28  		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
08:36:28  		at java.base/java.lang.Thread.run(Thread.java:1583)

Found a similar issue opened 11 days ago:
https://stackoverflow.com/questions/79237661/jenkins-ec2-agent-error-invalid-stream-header636f7272

@fruch - could it be related to java version on the builders? Is it updated automatically?

BTW, the current Java version on the Jenkins server is 21.0.5; it has also been updated since the last time:
https://github.com/scylladb/scylla-pkg/issues/4156#issuecomment-2456602851

jenkins-scylladb-com:~$ java --version
openjdk 21.0.5 2024-10-15
OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.04)
OpenJDK 64-Bit Server VM (build 21.0.5+11-Ubuntu-1ubuntu124.04, mixed mode, sharing)

@fruch
Contributor

fruch commented Dec 11, 2024

> @fruch - could it be related to java version on the builders? Is it updated automatically?

No, it's not getting automatic updates.

The Java version on the SCT builders is:

ubuntu@sct-runner-1-8-instance-0bc52ea0:~/sct-results$ java  --version
openjdk 21.0.4 2024-07-16
OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)
OpenJDK 64-Bit Server VM (build 21.0.4+7-Ubuntu-1ubuntu224.04, mixed mode, sharing)

> BTW, the current version of java on Jenkins server is 21.0.5, it was updated as well from the last time scylladb/scylla-pkg#4156 (comment)

jenkins-scylladb-com:~$ java --version
openjdk 21.0.5 2024-10-15
OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.04)
OpenJDK 64-Bit Server VM (build 21.0.5+11-Ubuntu-1ubuntu124.04, mixed mode, sharing)

@fruch

fruch commented Dec 11, 2024

@Annamikhlin

Based on the version differences between our master and builders, and the ones reported on Stack Overflow,

I'd bet this change might be the cause:

Can we downgrade the Java version on the Jenkins master?
(Changing the builders would take much longer, including backporting to all branches.)

@Annamikhlin

Annamikhlin commented Dec 11, 2024

We can't downgrade Java back to version 11, because we already upgraded the Jenkins server to version 2.488, where Java 11 is not supported; in addition, Java 11 has reached end of life.

We can downgrade to Java 17, but I'm not sure it will help.

Would it help to downgrade the Java version to 21.0.4?

@fruch

fruch commented Dec 11, 2024

I did mean downgrading to that one (21.0.4), not to 11.

I think those new HTTP limits aren't aligned between the builders and the master, and that might be the root cause.

@Annamikhlin

@fruch - it seems that openjdk 21.0.4 is not available in the repository. The available versions are:
21.0.5+11-1ubuntu1~24.04
21.0.3+9-1ubuntu1

jenkins-scylladb-com:~$ apt-cache policy openjdk-21-jdk
openjdk-21-jdk:
  Installed: 21.0.5+11-1ubuntu1~24.04
  Candidate: 21.0.5+11-1ubuntu1~24.04
  Version table:
 *** 21.0.5+11-1ubuntu1~24.04 500
        500 http://europe-north1.gce.archive.ubuntu.com/ubuntu noble-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu noble-security/main amd64 Packages
        100 /var/lib/dpkg/status
     21.0.3+9-1ubuntu1 500
        500 http://europe-north1.gce.archive.ubuntu.com/ubuntu noble/main amd64 Packages

I checked the current Java version on all our nodes/agents:

Node: sirenada-builder-1 - Java Version: 21.0.5
...
Node: aws-eu-west-1-qa-builder-v3-1 i-0bb4a069168396b35 - Java Version: 21.0.4
Node: aws-eu-west-1-qa-builder-v3-1 i-0e38c96d7a81c74c8 - Java Version: 21.0.4
...
Node: pkgSpotASG-x86-releng i-0f05e9b8ad5d27c97 - Java Version: 21.0.4
Node: pkgSpotASG-x86-releng i-01580b59d3d09e2c3 - Java Version: 21.0.4
...
Node: DtestCloudSpotASG-releng i-0f6cbf82fbbad4089 - Java Version: 21.0.1
Node: DtestCloudSpotASG-releng i-0f7487c782c48960f - Java Version: 21.0.1
Node: DtestCloudSpotASG-releng i-0f7d37c27f6b1c615 - Java Version: 21.0.1
Node: Dtest4CpuFleetASG-releng i-0f9712fb6c65cbb5a - Java Version: 21.0.1
Node: DtestCloudSpotASG-releng i-0fd120acd62485dec - Java Version: 21.0.1
Node: DtestCloudSpotASG-releng i-0fe0425936d7b7212 - Java Version: 21.0.1
...
Node: godzilla - Java Version: 21.0.4
Node: monster - Java Version: 21.0.4
Node: ran - Java Version: 21.0.4
Node: sif - Java Version: 21.0.4
Node: spider1.cloudius-systems.com - Java Version: 21.0.5
Node: spider2.cloudius-systems.com - Java Version: 21.0.5
Node: spider3.cloudius-systems.com - Java Version: 21.0.5
Node: spider4.cloudius-systems.com - Java Version: 21.0.5
Node: spider5.cloudius-systems.com - Java Version: 21.0.5
Node: spider6.cloudius-systems.com - Java Version: 21.0.5
Node: spider7.cloudius-systems.com - Java Version: 21.0.5
Node: spider8.cloudius-systems.com - Java Version: 21.0.5
Node: thor - Java Version: 21.0.4
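
A report like the one above can be grouped by version to make the mismatches easier to spot. This is a minimal Python sketch, assuming the report is available as plain text in the "Node: <name> - Java Version: <version>" format shown. The function names and the sample input are illustrative only.

```python
import re
from collections import defaultdict

# Matches report lines such as "Node: thor - Java Version: 21.0.4".
NODE_LINE = re.compile(r"^Node: (?P<node>.+?) - Java Version: (?P<version>\S+)$")


def group_by_java_version(report: str) -> dict[str, list[str]]:
    """Map each Java version to the list of nodes reporting it."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:  # skip "..." separators and blank lines
            groups[match["version"]].append(match["node"])
    return dict(groups)


def mismatched_nodes(report: str, controller_version: str) -> dict[str, list[str]]:
    """Return only the groups whose version differs from the controller's."""
    return {
        version: nodes
        for version, nodes in group_by_java_version(report).items()
        if version != controller_version
    }


sample = """\
Node: sirenada-builder-1 - Java Version: 21.0.5
...
Node: godzilla - Java Version: 21.0.4
Node: thor - Java Version: 21.0.4
"""
for version, nodes in mismatched_nodes(sample, "21.0.5").items():
    print(version, nodes)
```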

@fruch

fruch commented Dec 12, 2024

Is any of those builders using multiple executors?

@Annamikhlin

Our nodes, no.

@fruch

fruch commented Dec 15, 2024

@Annamikhlin

  1. Did you downgrade the Java version?
  2. Did you find references to those logs on the Jenkins master, so we can cross-check whether it's still happening?

@fruch

fruch commented Dec 15, 2024

Give it a go with a job known to reproduce it:

@Annamikhlin

  1. No, I didn't downgrade the Jenkins server to 21.0.3. Since we can't downgrade to 21.0.4, I stopped.
  2. Yes, I found the location of the logs on the server. For example, for https://jenkins.scylladb.com/job/scylla-master/job/artifacts/job/artifacts-azure-image-test/696/
    the log on the server is located under:
    /var/lib/jenkins/jobs/scylla-master/jobs/artifacts/jobs/artifacts-azure-image-test/builds/696

@fruch

fruch commented Dec 15, 2024

Can we scan the /var/lib/jenkins/jobs folder for occurrences of it, so we can track it after applying changes?

Can we downgrade Java to 21.0.3?

@Annamikhlin

Java on the Jenkins server was downgraded to version 21.0.3:

jenkins-scylladb-com:~$ java --version
openjdk 21.0.3 2024-04-16
OpenJDK Runtime Environment (build 21.0.3+9-Ubuntu-1ubuntu1)
OpenJDK 64-Bit Server VM (build 21.0.3+9-Ubuntu-1ubuntu1, mixed mode, sharing)

I scanned the /var/lib/jenkins/jobs folder for the error and saved the output to a file. I will scan again next week or the week after and compare the two files.
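
The scan-and-compare step described above could be scripted roughly as follows. This is a sketch only: it assumes each build directory under the jobs folder contains a console log file named "log", and the function names are hypothetical, not the script that was actually used.

```python
from pathlib import Path

ERROR_MARKER = "java.io.StreamCorruptedException: invalid stream header"


def find_affected_builds(jobs_root, marker=ERROR_MARKER):
    """Return sorted build directories whose console log contains the marker."""
    affected = []
    # Assumption: the console log of each build is a file literally named "log".
    for log_path in Path(jobs_root).rglob("log"):
        if not log_path.is_file():
            continue
        # Read line by line so large console logs are not loaded into memory.
        with log_path.open(encoding="utf-8", errors="replace") as handle:
            if any(marker in line for line in handle):
                affected.append(str(log_path.parent))
    return sorted(affected)


def new_occurrences(previous_scan, current_scan):
    """Return build directories present in the current scan but not the previous one."""
    return sorted(set(current_scan) - set(previous_scan))
```

Running find_affected_builds("/var/lib/jenkins/jobs") once before a change and once afterwards, then passing both results to new_occurrences, would list only the builds that failed with this error after the change.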

@fruch fruch changed the title from "SCT builders failed on checkout" to "Jenkins master mismatch java version with builders (java 21.0.5 vs 21.0.4) causing java.io.StreamCorruptedException: invalid stream header" Dec 16, 2024
fruch added a commit to fruch/scylla-cluster-tests that referenced this issue Dec 18, 2024
cause of issue cause by missmatch of java version between
master and sct builder, we are building a new sct runner image
that has up-to-date java version

Ref: scylladb#9444
@fruch

fruch commented Dec 18, 2024

Started the process of updating the SCT builders in:
#9586

@fruch

fruch commented Dec 22, 2024

@Annamikhlin

I've run into it again:
https://jenkins.scylladb.com/job/scylla-enterprise/job/perf-regression/job/scylla-enterprise-perf-regression-latency-650gb-elasticity/25/pipeline-console/?selected-node=19

So we might be barking up the wrong tree, i.e. it's not the Java version but something else...

@fruch

fruch commented Dec 24, 2024

@Annamikhlin

I'll move forward with upgrading the JVM on the builder side, and if that doesn't help, I'll limit the SCT builders to 1 executor for the time being.

@Annamikhlin

OK, let me know once you're done. I will update the server side as well.

@fruch

fruch commented Dec 30, 2024

Limiting the executors to 1 didn't help; the issue is still happening across the board.

@fruch

fruch commented Jan 2, 2025

@Annamikhlin

I updated the SCT builders with a newer Java version 2 days ago.

Can you check in the logs whether this issue has happened since?

fruch added a commit that referenced this issue Jan 2, 2025
cause of issue cause by missmatch of java version between
master and sct builder, we are building a new sct runner image
that has up-to-date java version

Ref: #9444
@Annamikhlin

Checking... the scan is taking time.

BTW, the Jenkins server is also on Java version 21.0.5.

@Annamikhlin

Annamikhlin commented Jan 2, 2025

For now there are no new "invalid stream header: 636F7272" exceptions (checked for 2025*, i.e. January 1-2).

@Annamikhlin

Did another log scan; clean for now (no new "invalid stream header: 636F7272" exceptions).

@fruch

fruch commented Jan 6, 2025

Good, so I think it's safe so far to assume the issue was on the builders' side, and that upgrading eliminated it
(sadly, it raised another issue: #9645)

@fruch

fruch commented Jan 6, 2025

Closing for now; it seems this one was solved.

@fruch fruch closed this as completed Jan 6, 2025