String endianness is incorrect: big-endian strings cannot be read on little-endian machines #5595

Open
BloCamLimb opened this issue Feb 26, 2024 · 3 comments

@BloCamLimb

According to the specification:

The character set is Unicode in the UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed four per word, following the little-endian convention (i.e., the first octet is in the lowest-order 8 bits of the word).
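To make the quoted packing rule concrete, here is a minimal sketch of how a string literal could be packed into 32-bit words. `packString` is a hypothetical helper written for illustration, not a SPIRV-Tools function; it assumes the string is nul-terminated and the last word is zero-padded, as in the byte listing later in this report.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of SPIRV-Tools): pack UTF-8 octets four per
// word, with octet i stored in bits [8*(i%4), 8*(i%4)+7] of word i/4.
// The result is a sequence of word VALUES, independent of host byte order.
std::vector<uint32_t> packString(const std::string& utf8) {
  // Reserve room for the terminating nul, then round up to whole words.
  const std::size_t wordCount = (utf8.size() + 1 + 3) / 4;
  std::vector<uint32_t> words(wordCount, 0u);
  for (std::size_t i = 0; i < utf8.size(); ++i) {
    const uint32_t octet = static_cast<unsigned char>(utf8[i]);
    words[i / 4] |= octet << (8 * (i % 4));
  }
  return words;
}
```

For "GLSL.std.450" this gives four words, the first being 0x4C534C47 with 'G' (0x47) in the lowest-order 8 bits. How those word values are laid out as bytes in a file is a separate question, decided by the endianness of the file.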

This means that the bytes of each word must be swapped before reading a big-endian spv file on a little-endian machine, and vice versa. For example, glslang outputs big-endian encoded spv binary files on big-endian machines. But when such a file is disassembled with spirv-dis on a little-endian machine, only words and operands are handled properly via spvFixWord; strings are read in host endianness, which is not correct.
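As a rough illustration of the word-level fix-up described here, the sketch below detects whether a module's words need swapping by checking the first word against the SPIR-V magic number (0x07230203), then swaps words on demand. The names `byteSwap`, `detectOrder`, `fixWord` and `WordOrder` are assumptions made for this example; they only loosely mirror what spvFixWord does and are not the SPIRV-Tools API.

```cpp
#include <cstdint>

// SPIR-V magic number, the first word of every module.
constexpr uint32_t kSpirvMagic = 0x07230203u;

// Reverse the four bytes of a 32-bit word.
constexpr uint32_t byteSwap(uint32_t w) {
  return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
         ((w << 8) & 0x00FF0000u) | (w << 24);
}

// Hypothetical result type: does the file's byte order match the host's?
enum class WordOrder { Same, Swapped, Unknown };

// firstWord is the module's first word as loaded in host byte order.
inline WordOrder detectOrder(uint32_t firstWord) {
  if (firstWord == kSpirvMagic) return WordOrder::Same;
  if (byteSwap(firstWord) == kSpirvMagic) return WordOrder::Swapped;
  return WordOrder::Unknown;
}

// Convert a word from file order to host order.
inline uint32_t fixWord(uint32_t w, WordOrder order) {
  return order == WordOrder::Swapped ? byteSwap(w) : w;
}
```

The point of the report is that this same fix-up has to be applied to the words that carry string literals, not only to opcodes and numeric operands.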

More specifically:
For "GLSL.std.450" in big-endian encoding (or on big-endian machines), the first octet 'G' should be in the lowest-order byte, which is the fourth byte of a word. So in the file (or memory), from the first byte to the last, the bytes are 'L''S''L''G' 'd''t''s''.' '0''5''4''.' '\0''\0''\0''\0', 16 bytes and 4 words in total.
When reading this big-endian encoded file:
On big-endian machines, reinterpret each consecutive 4 bytes as a uint32_t and use a bit shift to obtain the first octet, like (word >> 0) & 0xFF. We get the fourth byte, which is 'G', and this is correct.
On little-endian machines, the result of the same bit operation is the first byte, which is 'L'. This is not correct, because there is no call to spvFixWord.
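The two cases above can be reproduced with a small sketch. It simulates a little-endian host loading the 16 big-endian file bytes listed above, then extracts the string octets low-order first, with and without a byte swap. All function names here are hypothetical and written for this example; this is not the SPIRV-Tools decoding code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simulates a little-endian host reinterpreting four file bytes as a
// uint32_t: the first byte in the file lands in the lowest-order 8 bits.
uint32_t loadAsLittleEndianHost(const unsigned char* p) {
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Reverse the four bytes of a 32-bit word.
uint32_t byteSwap(uint32_t w) {
  return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
         ((w << 8) & 0x00FF0000u) | (w << 24);
}

// Read octets from the lowest-order byte upward, stopping at the first nul.
// swapBytes = true applies the word fix-up before extracting octets.
std::string decodeString(const std::vector<uint32_t>& words, bool swapBytes) {
  std::string out;
  for (uint32_t w : words) {
    if (swapBytes) w = byteSwap(w);
    for (int k = 0; k < 4; ++k) {
      const char c = static_cast<char>((w >> (8 * k)) & 0xFFu);
      if (c == '\0') return out;
      out.push_back(c);
    }
  }
  return out;
}

// The big-endian file bytes for "GLSL.std.450", as listed in the report.
std::vector<uint32_t> loadBigEndianExample() {
  static const unsigned char fileBytes[16] = {
      'L', 'S', 'L', 'G', 'd', 't', 's', '.', '0', '5', '4', '.', 0, 0, 0, 0};
  std::vector<uint32_t> words;
  for (std::size_t i = 0; i < sizeof(fileBytes); i += 4) {
    words.push_back(loadAsLittleEndianHost(fileBytes + i));
  }
  return words;
}
```

Calling `decodeString(loadBigEndianExample(), false)` yields the scrambled name from the error message, while passing `true` recovers "GLSL.std.450".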

I'm writing a compiler in Java myself that can output spv binary files in either little-endian or big-endian (the default is host endianness). I encountered this issue when running spirv-dis:

; SPIR-V
; Version: 1.5
; Generator: Khronos; 0
; Bound: 25
; Schema: 0
               OpCapability Shader
error: 2: Invalid extended instruction import 'LSLGdts.054.'

Issue #149 and PR #4622 are related, but they do not fix this issue.

My spirv-dis version: SPIRV-Tools v2023.6 v2023.6.rc1-50-gdc667644
Here is my spv binary file for testing purposes. git describes this file as "Khronos SPIR-V binary, big-endian, version 0x010500, generator 00000000". My CPU is little-endian.
test_shader.zip

@awilfox

awilfox commented Nov 22, 2024

Does #5302 fix this for you?

On 2024.4 rc1 "vanilla" using an x86_64 computer I see:

awilcox on lab-x86_64-lin-1 ~/Code/awilfox/wl-next/user/spirv-tools % spirv-val test_shader.spv 
error: line 2: Invalid extended instruction import 'LSLGdts.054.'
awilcox on lab-x86_64-lin-1 ~/Code/awilfox/wl-next/user/spirv-tools % spirv-dis test_shader.spv 
; SPIR-V
; Version: 1.5
; Generator: Khronos; 0
; Bound: 25
; Schema: 0
               OpCapability Shader
error: 2: Invalid extended instruction import 'LSLGdts.054.'

If I then apply #5302 I see:

awilcox on lab-x86_64-lin-1 ~/Code/awilfox/wl-next/user/spirv-tools % spirv-val test_shader.spv
awilcox on lab-x86_64-lin-1 ~/Code/awilfox/wl-next/user/spirv-tools % spirv-dis test_shader.spv
; SPIR-V
; Version: 1.5
; Generator: Khronos; 0
; Bound: 25
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %2 "main" %16 %12 %14 %9 %3
               OpExecutionMode %2 OriginLowerLeft
               OpMemberDecorate %_struct_7 0 Offset 0
               OpMemberDecorate %_struct_7 0 ColMajor
               OpMemberDecorate %_struct_7 0 MatrixStride 16
               OpMemberDecorate %_struct_7 1 Offset 64
               OpMemberDecorate %_struct_7 1 ColMajor
               OpMemberDecorate %_struct_7 1 MatrixStride 16
               OpMemberDecorate %_struct_7 2 Offset 128
               OpDecorate %_struct_7 Block
               OpDecorate %3 Binding 0
               OpDecorate %3 DescriptorSet 0
               OpDecorate %9 Location 0
               OpDecorate %12 Location 1
               OpDecorate %14 Location 0
               OpDecorate %14 Index 0
               OpDecorate %16 Location 0
               OpDecorate %16 Index 1
      %float = OpTypeFloat 32
    %v4float = OpTypeVector %float 4
%mat4v4float = OpTypeMatrix %v4float 4
  %_struct_7 = OpTypeStruct %mat4v4float %mat4v4float %v4float
%_ptr_Uniform__struct_7 = OpTypePointer Uniform %_struct_7
          %3 = OpVariable %_ptr_Uniform__struct_7 Uniform
    %v2float = OpTypeVector %float 2
%_ptr_Input_v2float = OpTypePointer Input %v2float
          %9 = OpVariable %_ptr_Input_v2float Input
%_ptr_Input_v4float = OpTypePointer Input %v4float
         %12 = OpVariable %_ptr_Input_v4float Input
%_ptr_Output_v4float = OpTypePointer Output %v4float
         %14 = OpVariable %_ptr_Output_v4float Output
         %16 = OpVariable %_ptr_Output_v4float Output
       %void = OpTypeVoid
         %18 = OpTypeFunction %void
        %int = OpTypeInt 32 1
      %int_2 = OpConstant %int 2
%_ptr_Uniform_v4float = OpTypePointer Uniform %v4float
          %2 = OpFunction %void None %18
         %19 = OpLabel
         %22 = OpAccessChain %_ptr_Uniform_v4float %3 %int_2
         %24 = OpLoad %v4float %22
               OpStore %14 %24
               OpReturn
               OpFunctionEnd

It seems to also help my Power9 (which is a big endian PPC64) handle little-endian examples from the SPIRV-LLVM-Translator test suite.

@BloCamLimb
Author

Cool, the fix works fine.

@dneto0
Collaborator

dneto0 commented Nov 26, 2024

FYI: This kind of issue was originally discussed in 2016: https://www.khronos.org/members/login/bugzilla-public/show_bug.cgi?id=1474
(the bug database had been down but it's now restored)
