Skip to content
This repository has been archived by the owner on Feb 8, 2021. It is now read-only.

more upstream fixes for CVE-2016-9602 #2

Open
wants to merge 2 commits into
base: 2.4.1-template
Choose a base branch
from

Conversation

bergwolf
Copy link
Member

@bergwolf bergwolf commented Apr 7, 2017

upstream commit qemu@b003fc0 (9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()) and qemu@918112c (9pfs: fix O_PATH build break with older glibc versions)

gkurz added 2 commits April 7, 2017 17:12
When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <[email protected]>

(cherry picked from commit 918112c)
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
(cherry picked from commit b003fc0)
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
It turns out qemu is calling exit() in various places from various
threads without taking much care of resources state. The atexit()
cleanup handlers cannot easily destroy resources that are in use (by
the same thread or other).

Since c1111a2, TCG arm guests run into the following abort() when
running tests, the chardev mutex is locked during the write, so
qemu_mutex_destroy() returns an error:

 #0  0x00007fffdbb806f5 in raise () at /lib64/libc.so.6
 #1  0x00007fffdbb822fa in abort () at /lib64/libc.so.6
 #2  0x00005555557616fe in error_exit (err=<optimized out>, msg=msg@entry=0x555555c38c30 <__func__.14622> "qemu_mutex_destroy")
     at /home/drjones/code/qemu/util/qemu-thread-posix.c:39
 qemu#3  0x0000555555b0be20 in qemu_mutex_destroy (mutex=mutex@entry=0x5555566aa0e0) at /home/drjones/code/qemu/util/qemu-thread-posix.c:57
 qemu#4  0x00005555558aab00 in qemu_chr_free_common (chr=0x5555566aa0e0) at /home/drjones/code/qemu/qemu-char.c:4029
 qemu#5  0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4038
 qemu#6  0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4044
 qemu#7  0x00005555558b062c in qemu_chr_cleanup () at /home/drjones/code/qemu/qemu-char.c:4557
 qemu#8  0x00007fffdbb851e8 in __run_exit_handlers () at /lib64/libc.so.6
 qemu#9  0x00007fffdbb85235 in  () at /lib64/libc.so.6
 qemu#10 0x00005555558d1b39 in testdev_write (testdev=0x5555566aa0a0) at /home/drjones/code/qemu/backends/testdev.c:71
 qemu#11 0x00005555558d1b39 in testdev_write (chr=<optimized out>, buf=0x7fffc343fd9a "", len=0) at /home/drjones/code/qemu/backends/testdev.c:95
 qemu#12 0x00005555558adced in qemu_chr_fe_write (s=0x5555566aa0e0, buf=buf@entry=0x7fffc343fd98 "0q", len=len@entry=2) at /home/drjones/code/qemu/qemu-char.c:282

Instead of using a atexit() handler, only run the chardev cleanup as
initially proposed at the end of main(), where there are less chances
(hic) of conflicts or other races.

Signed-off-by: Marc-André Lureau <[email protected]>
Reported-by: Andrew Jones <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
This avoids a segfault like the following for at least some 4.8 versions
of gcc when configured with --static if avx2 instructions are also
enabled:

	Program received signal SIGSEGV, Segmentation fault.
	buffer_find_nonzero_offset_ifunc () at ./util/cutils.c:333
	333     {
	(gdb) bt
	#0  buffer_find_nonzero_offset_ifunc () at ./util/cutils.c:333
	#1  0x0000000000939c58 in __libc_start_main ()
	#2  0x0000000000419337 in _start ()

Signed-off-by: Aaron Lindsay <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
When VNC is started with '-vnc none' there will be no
listener socket present. When we try to populate the
VncServerInfo we'll crash accessing a NULL 'lsock'
field.

 #0  qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7ffd5b8aa0f0) at io/channel-socket.c:33
 #1  0x00007f4b9a297d6f in vnc_init_basic_info_from_server_addr (errp=0x7ffd5b8aa0f0, info=0x7f4b9d425460, ioc=<optimized out>)  at ui/vnc.c:146
 #2  vnc_server_info_get (vd=0x7f4b9e858000) at ui/vnc.c:223
 qemu#3  0x00007f4b9a29d318 in vnc_qmp_event (vs=0x7f4b9ef82000, vs=0x7f4b9ef82000, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:279
 qemu#4  vnc_connect (vd=vd@entry=0x7f4b9e858000, sioc=sioc@entry=0x7f4b9e8b3a20, skipauth=skipauth@entry=true, websocket=websocket @entry=false) at ui/vnc.c:2994
 qemu#5  0x00007f4b9a29e8c8 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/v nc.c:3825
 qemu#6  0x00007f4b9a18d8a1 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7ffd5b8aa230) at qmp-marsh al.c:123
 qemu#7  0x00007f4b9a0b53f5 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/mon itor.c:3922
 qemu#8  0x00007f4b9a348580 in json_message_process_token (lexer=0x7f4b9c78dfe8, input=0x7f4b9c7350e0, type=JSON_RCURLY, x=111, y=5 9) at qobject/json-streamer.c:94
 qemu#9  0x00007f4b9a35cfeb in json_lexer_feed_char (lexer=lexer@entry=0x7f4b9c78dfe8, ch=125 '}', flush=flush@entry=false) at qobj ect/json-lexer.c:310
 qemu#10 0x00007f4b9a35d0ae in json_lexer_feed (lexer=0x7f4b9c78dfe8, buffer=<optimized out>, size=<optimized out>) at qobject/json -lexer.c:360
 qemu#11 0x00007f4b9a348679 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at q object/json-streamer.c:114
 qemu#12 0x00007f4b9a0b3a1b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/deb ug/qemu-2.6.0/monitor.c:3938
 qemu#13 0x00007f4b9a186751 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7f4b9c7add40) at qemu-char.c:2895
 qemu#14 0x00007f4b92b5c79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
 qemu#15 0x00007f4b9a2bb0c0 in glib_pollfds_poll () at main-loop.c:213
 qemu#16 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 qemu#17 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 qemu#18 0x00007f4b9a0835cf in main_loop () at vl.c:1934
 qemu#19 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4667

Do an upfront check for a NULL lsock and report an error to
the caller, which matches behaviour from before

  commit 04d2529
  Author: Daniel P. Berrange <[email protected]>
  Date:   Fri Feb 27 16:20:57 2015 +0000

    ui: convert VNC server to use QIOChannelSocket

where getsockname() would be given a FD value -1 and thus report
an error to the caller.

Signed-off-by: Daniel P. Berrange <[email protected]>
Message-id: [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
The vnc_server_info_get will allocate the VncServerInfo
struct and then call vnc_init_basic_info_from_server_addr
to populate the basic fields. If this returns an error
though, the qapi_free_VncServerInfo call will then crash
because the VncServerInfo struct instance was not properly
NULL-initialized and thus contains random stack garbage.

 #0  0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6
 #1  0x00007f1987c902fa in abort () at /lib64/libc.so.6
 #2  0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6
 qemu#3  0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6
 qemu#4  0x00007f1987cdb2ac in free () at /lib64/libc.so.6
 qemu#5  0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0
 qemu#6  0x0000559193cdcf54 in visit_type_str (v=v@entry=
     0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80)
     at qapi/qapi-visit-core.c:255
 qemu#7  0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry=
     0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307
 qemu#8  0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry=
     0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632
 qemu#9  0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry=
     0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658
 qemu#10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970
 qemu#11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233
 qemu#12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284
 qemu#13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=tru e, websocket=websocket@entry=false) at ui/vnc.c:3039
 qemu#14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>)
     at ui/vnc.c:3877
 qemu#15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90)
     at qmp-marshal.c:105
 qemu#16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3971
 qemu#17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=1 2) at qobject/json-streamer.c:105
 qemu#18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false)
     at qobject/json-lexer.c:319
 qemu#19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-lexer.c:369
 qemu#20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-streamer.c:124
 qemu#21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3987
 qemu#22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900)
     at qemu-char.c:2895
 qemu#23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
 qemu#24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213
 qemu#25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 qemu#26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 qemu#27 0x0000559193964c55 in main () at vl.c:1908
 qemu#28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603

This was introduced in

  commit 98481bf
  Author: Eric Blake <[email protected]>
  Date:   Mon Oct 26 16:34:45 2015 -0600

    vnc: Hoist allocation of VncBasicInfo to callers

which added error reporting for vnc_init_basic_info_from_server_addr
but didn't change the g_malloc calls to g_malloc0.

Signed-off-by: Daniel P. Berrange <[email protected]>
Message-id: [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
ahci-test /x86_64/ahci/io/dma/lba28/retry triggers the following leak:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fc4b2a25e20 in malloc (/lib64/libasan.so.3+0xc6e20)
    #1 0x7fc4993bce58 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee58)
    #2 0x556a187d4b34 in ahci_populate_sglist hw/ide/ahci.c:896
    qemu#3 0x556a187d8237 in ahci_dma_prepare_buf hw/ide/ahci.c:1367
    qemu#4 0x556a187b5a1a in ide_dma_cb hw/ide/core.c:844
    qemu#5 0x556a187d7eec in ahci_start_dma hw/ide/ahci.c:1333
    qemu#6 0x556a187b650b in ide_start_dma hw/ide/core.c:921
    qemu#7 0x556a187b61e6 in ide_sector_start_dma hw/ide/core.c:911
    qemu#8 0x556a187b9e26 in cmd_write_dma hw/ide/core.c:1486
    qemu#9 0x556a187bd519 in ide_exec_cmd hw/ide/core.c:2027
    qemu#10 0x556a187d71c5 in handle_reg_h2d_fis hw/ide/ahci.c:1204
    qemu#11 0x556a187d7681 in handle_cmd hw/ide/ahci.c:1254
    qemu#12 0x556a187d168a in check_cmd hw/ide/ahci.c:510
    qemu#13 0x556a187d0afc in ahci_port_write hw/ide/ahci.c:314
    qemu#14 0x556a187d105d in ahci_mem_write hw/ide/ahci.c:435
    qemu#15 0x556a1831d959 in memory_region_write_accessor /home/elmarco/src/qemu/memory.c:525
    qemu#16 0x556a1831dc35 in access_with_adjusted_size /home/elmarco/src/qemu/memory.c:591
    qemu#17 0x556a18323ce3 in memory_region_dispatch_write /home/elmarco/src/qemu/memory.c:1262
    qemu#18 0x556a1828cf67 in address_space_write_continue /home/elmarco/src/qemu/exec.c:2578
    qemu#19 0x556a1828d20b in address_space_write /home/elmarco/src/qemu/exec.c:2635
    qemu#20 0x556a1828d92b in address_space_rw /home/elmarco/src/qemu/exec.c:2737
    qemu#21 0x556a1828daf7 in cpu_physical_memory_rw /home/elmarco/src/qemu/exec.c:2746
    qemu#22 0x556a183068d3 in cpu_physical_memory_write /home/elmarco/src/qemu/include/exec/cpu-common.h:72
    qemu#23 0x556a18308194 in qtest_process_command /home/elmarco/src/qemu/qtest.c:382
    qemu#24 0x556a18309999 in qtest_process_inbuf /home/elmarco/src/qemu/qtest.c:573
    qemu#25 0x556a18309a4a in qtest_read /home/elmarco/src/qemu/qtest.c:585
    qemu#26 0x556a18598b85 in qemu_chr_be_write_impl /home/elmarco/src/qemu/qemu-char.c:387
    qemu#27 0x556a18598c52 in qemu_chr_be_write /home/elmarco/src/qemu/qemu-char.c:399
    qemu#28 0x556a185a2afa in tcp_chr_read /home/elmarco/src/qemu/qemu-char.c:2902
    qemu#29 0x556a18cbaf52 in qio_channel_fd_source_dispatch io/channel-watch.c:84

Follow John Snow recommendation:
  Everywhere else ncq_err is used, it is accompanied by a list cleanup
  except for ncq_cb, which is the case you are fixing here.

  Move the sglist destruction inside of ncq_err and then delete it from
  the other two locations to keep it tidy.

  Call dma_buf_commit in ide_dma_cb after the early return. Though, this
  is also a little wonky because this routine does more than clear the
  list, but it is at the moment the centralized "we're done with the
  sglist" function and none of the other side effects that occur in
  dma_buf_commit will interfere with the reset that occurs from
  ide_restart_bh, I think

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: John Snow <[email protected]>
laijs pushed a commit that referenced this pull request Jun 12, 2017
Since aa5cb7f, the chardevs are being cleaned up when leaving
qemu. However, the monitor has still references to them, which may
lead to crashes when running atexit() and trying to send monitor
events:

 #0  0x00007fffdb18f6f5 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
 #1  0x00007fffdb1912fa in __GI_abort () at abort.c:89
 #2  0x0000555555c263e7 in error_exit (err=22, msg=0x555555d47980 <__func__.13537> "qemu_mutex_lock") at util/qemu-thread-posix.c:39
 qemu#3  0x0000555555c26488 in qemu_mutex_lock (mutex=0x5555567a2420) at util/qemu-thread-posix.c:66
 qemu#4  0x00005555558c52db in qemu_chr_fe_write (s=0x5555567a2420, buf=0x55555740dc40 "{\"timestamp\": {\"seconds\": 1470041716, \"microseconds\": 989699}, \"event\": \"SPICE_DISCONNECTED\", \"data\": {\"server\": {\"port\": \"5900\", \"family\": \"ipv4\", \"host\": \"127.0.0.1\"}, \"client\": {\"port\": \"40272\", \"f"..., len=240) at qemu-char.c:280
 qemu#5  0x0000555555787cad in monitor_flush_locked (mon=0x5555567bd9e0) at /home/elmarco/src/qemu/monitor.c:311
 qemu#6  0x0000555555787e46 in monitor_puts (mon=0x5555567bd9e0, str=0x5555567a44ef "") at /home/elmarco/src/qemu/monitor.c:353
 qemu#7  0x00005555557880fe in monitor_json_emitter (mon=0x5555567bd9e0, data=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:401
 qemu#8  0x00005555557882d2 in monitor_qapi_event_emit (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:472
 qemu#9  0x000055555578838f in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0, errp=0x7fffffffca88) at /home/elmarco/src/qemu/monitor.c:497
 qemu#10 0x0000555555c15541 in qapi_event_send_spice_disconnected (server=0x5555571139d0, client=0x5555570d0db0, errp=0x5555566c0428 <error_abort>) at qapi-event.c:1038
 qemu#11 0x0000555555b11bc6 in channel_event (event=3, info=0x5555570d6c00) at ui/spice-core.c:248
 qemu#12 0x00007fffdcc9983a in adapter_channel_event (event=3, info=0x5555570d6c00) at reds.c:120
 qemu#13 0x00007fffdcc99a25 in reds_handle_channel_event (reds=0x5555567a9d60, event=3, info=0x5555570d6c00) at reds.c:324
 qemu#14 0x00007fffdcc7d4c4 in main_dispatcher_self_handle_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:175
 qemu#15 0x00007fffdcc7d5b1 in main_dispatcher_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:194
 qemu#16 0x00007fffdcca7674 in reds_stream_push_channel_event (s=0x5555570d9910, event=3) at reds-stream.c:354
 qemu#17 0x00007fffdcca749b in reds_stream_free (s=0x5555570d9910) at reds-stream.c:323
 qemu#18 0x00007fffdccb5dad in snd_disconnect_channel (channel=0x5555576a89a0) at sound.c:229
 qemu#19 0x00007fffdccb9e57 in snd_detach_common (worker=0x555557739720) at sound.c:1589
 qemu#20 0x00007fffdccb9f0e in snd_detach_playback (sin=0x5555569fe3f8) at sound.c:1602
 qemu#21 0x00007fffdcca3373 in spice_server_remove_interface (sin=0x5555569fe3f8) at reds.c:3387
 qemu#22 0x00005555558ff6e2 in line_out_fini (hw=0x5555569fe370) at audio/spiceaudio.c:152
 qemu#23 0x00005555558f909e in audio_atexit () at audio/audio.c:1754
 qemu#24 0x00007fffdb1941e8 in __run_exit_handlers (status=0, listp=0x7fffdb5175d8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
 qemu#25 0x00007fffdb194235 in __GI_exit (status=<optimized out>) at exit.c:104
 qemu#26 0x00007fffdb17b738 in __libc_start_main (main=0x5555558d7874 <main>, argc=67, argv=0x7fffffffcf48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffcf38) at ../csu/libc-start.c:323

Add a monitor_cleanup() functions to remove all the monitors before
cleaning up the chardev. Note that we are "losing" some events that
used to be sent during atexit().

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
Running postcopy-test with ASAN produces the following error:

QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64  tests/postcopy-test
...
=================================================================
==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0
READ of size 8 at 0x7f1556600000 thread T6
    #0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528
    #1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665
    #2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044
    qemu#3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976
    qemu#4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
    qemu#5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e)

0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000)
allocated by thread T0 here:
    #0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980)
    #1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106
    #2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122
    qemu#3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214
    qemu#4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261
    qemu#5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697
    qemu#6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679
    qemu#7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400)

Thread T6 created by T0 here:
    #0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465
    #2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096
    qemu#3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500
    qemu#4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87
    qemu#5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142
    qemu#6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88
    qemu#7 0x7f15823e38e6  (/lib64/libglib-2.0.so.0+0x468e6)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass

index seems to be wrongly incremented, unless I miss something that
would be worth a comment.

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: David Gibson <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:

    qmp_block_commit
      commit_active_start
        bdrv_reopen
          bdrv_reopen_multiple
            bdrv_reopen_prepare
              bdrv_flush
                aio_poll
                  aio_bh_poll
                    aio_bh_call
                      block_job_defer_to_main_loop_bh
                        stream_complete
                          bdrv_reopen

block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.

Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well.  To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.

As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:

    #0  0x0000555555aa79f3 in bdrv_drain_recurse
    #1  0x0000555555aa825d in bdrv_drained_begin
    #2  0x0000555555aa8449 in bdrv_drain
    qemu#3  0x0000555555a9c356 in blk_drain
    qemu#4  0x0000555555aa3cfd in mirror_drain
    qemu#5  0x0000555555a66e11 in block_job_detach_aio_context
    qemu#6  0x0000555555a62f4d in bdrv_detach_aio_context
    qemu#7  0x0000555555a63116 in bdrv_set_aio_context
    qemu#8  0x0000555555a9d326 in blk_set_aio_context
    qemu#9  0x00005555557e38da in virtio_blk_data_plane_stop
    qemu#10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
    qemu#11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
    qemu#12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
    qemu#13 0x00005555559f6a18 in virtio_pci_reset
    qemu#14 0x00005555559139a9 in qdev_reset_one
    qemu#15 0x0000555555916738 in qbus_walk_children
    qemu#16 0x0000555555913318 in qdev_walk_children
    qemu#17 0x0000555555916738 in qbus_walk_children
    qemu#18 0x00005555559168ca in qemu_devices_reset
    qemu#19 0x000055555581fcbb in pc_machine_reset
    qemu#20 0x00005555558a4d96 in qemu_system_reset
    qemu#21 0x000055555577157a in main_loop_should_exit
    qemu#22 0x000055555577157a in main_loop
    qemu#23 0x000055555577157a in main

The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.

Reported-by: Jeff Cody <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Tested-by: Jeff Cody <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
ASAN detects an "unknown-crash" when running pxe-test:

/ppc64/pxe/spapr-vlan: =================================================================
==7143==ERROR: AddressSanitizer: unknown-crash on address 0x7f6dcd298d30 at pc 0x55e22218830d bp 0x7f6dcd2989e0 sp 0x7f6dcd2989d0
READ of size 128 at 0x7f6dcd298d30 thread T2
    #0 0x55e22218830c in tftp_session_allocate /home/elmarco/src/qq/slirp/tftp.c:73
    #1 0x55e22218a1f8 in tftp_handle_rrq /home/elmarco/src/qq/slirp/tftp.c:289
    #2 0x55e22218b54c in tftp_input /home/elmarco/src/qq/slirp/tftp.c:446
    qemu#3 0x55e2221833fe in udp6_input /home/elmarco/src/qq/slirp/udp6.c:82
    qemu#4 0x55e222137b17 in ip6_input /home/elmarco/src/qq/slirp/ip6_input.c:67

Address 0x7f6dcd298d30 is located in stack of thread T2 at offset 96 in frame
    #0 0x55e222182420 in udp6_input /home/elmarco/src/qq/slirp/udp6.c:13

  This frame has 3 object(s):
    [32, 48) '<unknown>'
    [96, 124) 'lhost' <== Memory access at offset 96 partially overflows this variable
    [160, 200) 'save_ip' <== Memory access at offset 96 partially underflows this variable

The sockaddr_storage pointer is the sockaddr_in6 lhost on the
stack. Copy only the source addr size.

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Samuel Thibault <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
If trace backend is set to TRACE_NOP, trace_get_vcpu_event_count
returns 0, cause bitmap_new call abort.

The abort can be triggered as follows:

  $ ./configure --enable-trace-backend=nop --target-list=x86_64-softmmu
  $ gdb ./x86_64-softmmu/qemu-system-x86_64 -M q35,accel=kvm -m 1G
  (gdb) bt
  #0  0x00007ffff04e25f7 in raise () from /lib64/libc.so.6
  #1  0x00007ffff04e3ce8 in abort () from /lib64/libc.so.6
  #2  0x00005555559de905 in bitmap_new (nbits=<optimized out>)
      at /home/root/git/qemu2.git/include/qemu/bitmap.h:96
  qemu#3  cpu_common_initfn (obj=0x555556621d30) at qom/cpu.c:399
  qemu#4  0x0000555555a11869 in object_init_with_type (obj=0x555556621d30, ti=0x55555656bbb0) at qom/object.c:341
  qemu#5  0x0000555555a11869 in object_init_with_type (obj=0x555556621d30, ti=0x55555656bd30) at qom/object.c:341
  qemu#6  0x0000555555a11efc in object_initialize_with_type (data=data@entry=0x555556621d30, size=76560,
      type=type@entry=0x55555656bd30) at qom/object.c:376
  qemu#7  0x0000555555a12061 in object_new_with_type (type=0x55555656bd30) at qom/object.c:484
  qemu#8  0x0000555555a121c5 in object_new (typename=typename@entry=0x555556550340 "qemu64-x86_64-cpu")
      at qom/object.c:494
  qemu#9  0x00005555557f6e3d in pc_new_cpu (typename=typename@entry=0x555556550340 "qemu64-x86_64-cpu", apic_id=0,
      errp=errp@entry=0x5555565391b0 <error_fatal>) at /home/root/git/qemu2.git/hw/i386/pc.c:1101
  qemu#10 0x00005555557fa33e in pc_cpus_init (pcms=pcms@entry=0x5555565f9690)
      at /home/root/git/qemu2.git/hw/i386/pc.c:1184
  qemu#11 0x00005555557fe0f6 in pc_q35_init (machine=0x5555565f9690) at /home/root/git/qemu2.git/hw/i386/pc_q35.c:121
  qemu#12 0x000055555574fbad in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4562

Signed-off-by: Anthony Xu <[email protected]>
Message-id: [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
…taging

make display updates thread safe, batch #2

# gpg: Signature made Thu 11 May 2017 03:41:51 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-vga-20170511-1:
  vga: fix display update region calculation
  sm501: make display updates thread safe
  tcx: make display updates thread safe
  cg3: make display updates thread safe

Signed-off-by: Stefan Hajnoczi <[email protected]>
laijs pushed a commit that referenced this pull request Mar 31, 2018
This commit fixes a bug which causes the guest to hang. The bug was
observed upon a "receive overrun" (bit qemu#6 of the ICR register)
interrupt which could be triggered post migration in a heavy traffic
environment. Even though the "receive overrun" bit (qemu#6) is masked out
by the IMS register (refer to the log below) the driver still receives
an interrupt as the "receive overrun" bit (qemu#6) causes the "Other" -
bit qemu#24 of the ICR register - bit to be set as documented below. The
driver handles the interrupt and clears the "Other" bit (qemu#24) but
doesn't clear the "receive overrun" bit (qemu#6) which leads to an
infinite loop. Apparently the Windows driver expects that the "receive
overrun" bit and other ones - documented below - to be cleared when
the "Other" bit (qemu#24) is cleared.

So to sum that up:
1. Bit qemu#6 of the ICR register is set by heavy traffic
2. As a results of setting bit qemu#6, bit qemu#24 is set
3. The driver receives an interrupt for bit 24 (it doesn't receieve an
   interrupt for bit qemu#6 as it is masked out by IMS)
4. The driver handles and clears the interrupt of bit qemu#24
5. Bit qemu#6 is still set.
6. 2 happens all over again

The Interrupt Cause Read - ICR register:

The ICR has the "Other" bit - bit qemu#24 - that is set when one or more
of the following ICR register's bits are set:

LSC - bit #2, RXO - bit qemu#6, MDAC - bit qemu#9, SRPD - bit qemu#16, ACK - bit
qemu#17, MNG - bit qemu#18

This bug can occur with any of these bits depending on the driver's
behaviour and the way it configures the device. However, trying to
reproduce it with any bit other than RX0 is challenging and came to
failure as the drivers don't implement most of these bits, trying to
reproduce it with LSC (Link Status Change - bit #2) bit didn't succeed
too as it seems that Windows handles this bit differently.

Log sample of the storm:

[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
[email protected]:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)

* This bug behaviour wasn't observed with the Linux driver.

This commit solves:
https://bugzilla.redhat.com/show_bug.cgi?id=1447935
https://bugzilla.redhat.com/show_bug.cgi?id=1449490

Cc: [email protected]
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants