s = s.decode("utf16") - UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 20-21: illegal UTF-16 surrogate #77

Open
wit0k opened this issue Apr 8, 2017 · 2 comments

wit0k commented Apr 8, 2017

Hello,

Calling value.value() fails while decoding the data below (the registry hive was taken from Windows 7).

  • Faulty value:b'\x00\x00\xd1w\x03\x00\x00\x00\xec\xa8\xcdw\x089l\x00\x88\xef\xd1\x01\xa9\xdb\xcfw\x089l\x00\x8cEl\x00\x96\x00\x00\x00l\xf2\xd1\x01\x00\x00\x00\x00\x00\x00\x00\x00\x9e\x00\x00\x00\xf0\xf1\xd1\x01{k\xcfw\xd0\xef\xd1\x01\x90\xf2\xd1\x01\x00\x00\x00\x00l\xf2\xd1\x01\x80El\x00\xd7\xa8\xcdw.\x00\x00\x00\x01\x00\x00\x00@\x96l\x00\xcc\xef\xd1\x01p\xe7l\x00\x01\x00\x00\x00\x88\xe8l\x00\xdc\xef\xd1\x01\x00\x00\x00\x00\x98\xf5l\x00\x9e\x00\xa0\x00\x84El\x00\x90\xf2\xd1\x01\xf0\xef\xd1\x01\xad\x14\xf8u\x00\x00f\x00\x00\x00\x00\x00\x90\xe8l\x00\x04\xf0\xd1\x01Z\x12\x19v\x00\x00f\x00\x00\x00\x00\x00\x90\xe8l\x00\xa1\xfb\xcbw\xb0\x0f2v\x84\x02\x00\x00\x00\x00\x00\x00 \xf0\xd1\x01\x8f!\x19v\x84\x02\x00\x00\xbc\xf2\xd1\x01\x16"\x19v\x80\'\x19v\xae|0\xf1(2k\x00(2k\x004\x03\xe8u\x00\x00\x00\x00(\xf0\xd1\x01\x00\x00\x00\x00\x01\x00\x00\x00\xb8\x96l\x00r\xca\xf8\x86\x00\x00\x00\x00\x00\x00\x00\x00\x90\xe8l\x00\x00\x00\x00\x008\xe8l\x00\xc0\xf0\xd1\x01\x8c\x98\xddu\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\xf0\xdel\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe8\xafm\x00\xf0\xdel\x00\x01\x00\x00\x00\xa44ak\x00\x00\x00\x00g\x90gw\x08\xf5\xd1\x01[\xeagw\xf8\xf4\xd1\x01U\x00S\x00B\x00\\\x00V\x00I\x00D\x00_\x008\x000\x008\x007\x00&\x00P\x00I\x00D\x00_\x000\x007\x00D\x00C\x00\\\x005\x00&\x00]\x01\xccw\x00\x00f\x00\xf88l\x00\x00\x00\x00\x00\x07\x00\x00\x07\x00\x00\x00\x00\xa89l\x00\xb81k\x00\xd0\x96m\x008\x00\x00\x00\x00\x00f\x00\x00\x00\x00\x10\xf88l\x00\x14\xf2\xd1\x01\xce8\xcdw8\x01f\x00\xaa8\xcdw\x11\xa2\nv\x00\x00\x00\x00\x00\x00f\x00\x009l\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1b\x00\x00\x00\x00\x00'
Traceback (most recent call last):
  ...
    self.content = Value.value()
  File "/frameworks/virtualenvs/.../Registry.py", line 156, in value
    return self._vkrecord.data()
  File "/frameworks/virtualenvs/.../lib/python3.6/site-packages/Registry/RegistryParse.py", line 748, in data
    s = decode_utf16le(s)
  File "/frameworks/virtualenvs/.../lib/python3.6/site-packages/Registry/RegistryParse.py", line 555, in decode_utf16le
    s = s.decode("utf16")
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 20-21: illegal UTF-16 surrogate
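
For anyone reading the traceback later, here is a minimal sketch of what the decoder seems to be objecting to at position 20-21. It uses only the first 24 bytes of the faulty value quoted above and assumes they are read as little-endian UTF-16; `first_bad_unit` is a hypothetical helper written for this illustration, not part of python-registry.

```python
# First 24 bytes of the faulty value quoted in this report.
prefix = (
    b"\x00\x00\xd1w\x03\x00\x00\x00\xec\xa8\xcdw"
    b"\x089l\x00\x88\xef\xd1\x01\xa9\xdb\xcfw"
)


def first_bad_unit(data):
    """Return (byte offset, 16-bit unit, following unit) at the first
    UTF-16-LE decoding failure, or None if the data decodes cleanly.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError as exc:
        unit = int.from_bytes(data[exc.start:exc.start + 2], "little")
        following = int.from_bytes(data[exc.start + 2:exc.start + 4], "little")
        return exc.start, unit, following
    return None


offset, unit, following = first_bad_unit(prefix)

# A high surrogate (0xD800-0xDBFF) must be followed by a low surrogate
# (0xDC00-0xDFFF); here it is not, so the pair is rejected.
unit_is_high_surrogate = 0xD800 <= unit <= 0xDBFF
following_is_low_surrogate = 0xDC00 <= following <= 0xDFFF
```

If the assumptions hold, the bytes at offset 20 form a high surrogate with no low surrogate after it, which would suggest the data is binary rather than text.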

Is it expected?

Thank you in advance.

P.S. python-registry is awesome!

@williballenthin
Owner

looks to me like some data stored in a string value is not valid UTF-16. while we've occasionally found issues with unicode decoding in python-registry, the above doesn't look much like a string to me (unless it's in a very non-English language). so i wonder if this hive simply contains some strange data that you'll have to note and continue along.
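
A rough sketch of what "note and continue along" could look like in calling code. It relies only on the `value()` method shown in the traceback; `safe_value` and the `skipped` list are hypothetical names introduced for this example and are not part of python-registry.

```python
def safe_value(value, skipped):
    """Return value.value(), or None when the stored data is not valid UTF-16.

    `value` is any object with a .value() method, as in the traceback above.
    `skipped` is a caller-supplied list that collects the error messages so
    the odd values can be reviewed later. Hypothetical helper.
    """
    try:
        return value.value()
    except UnicodeDecodeError as exc:
        skipped.append(str(exc))
        return None
```

Catching only `UnicodeDecodeError` keeps other parsing problems visible, and the collected messages preserve the byte positions for later inspection.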

@wit0k
Author

wit0k commented Apr 9, 2017

OK, thank you.
