As noted by qemu-plugins.h, enum qemu_plugin_cb_flags is
currently unused -- plugins can neither read nor write
guest registers.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add function that transforms named fd inside SocketAddress structure
into number representation. This way it may be then used in a context
where current monitor is not available.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu_co_queue_next() and qemu_co_queue_restart_all() just call
aio_co_wake() which works well in non-coroutine context. So these
functions can be called from non-coroutine context as well. And
actually qemu_co_queue_restart_all() is called from
nbd_cancel_in_flight(), which is called from non-coroutine context.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
* avoid deprecation warnings for SASL on macOS 10.11 or newer
* fix -readconfig when config blocks have an id (like [chardev "qmp"])
* Error* initialization fixes
* Improvements to ESP emulation (Mark)
* Allow creating noreserve memory backends (David)
* Improvements to query-memdev (David)
* Bump compiler to C11 (Richard)
* First round of SVM fixes from GSoC project (Lara)
# gpg: Signature made Wed 16 Jun 2021 16:37:49 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini-gitlab/tags/for-upstream: (45 commits)
configure: Remove probe for _Static_assert
qemu/compiler: Remove QEMU_GENERIC
include/qemu/lockable: Use _Generic instead of QEMU_GENERIC
util: Use unique type for QemuRecMutex in thread-posix.h
util: Pass file+line to qemu_rec_mutex_unlock_impl
util: Use real functions for thread-posix QemuRecMutex
softfloat: Use _Generic instead of QEMU_GENERIC
configure: Use -std=gnu11
target/i386: Added Intercept CR0 writes check
target/i386: Added consistency checks for CR0
target/i386: Added consistency checks for VMRUN intercept and ASID
target/i386: Refactored intercept checks into cpu_svm_has_intercept
configure: map x32 to cpu_family x86_64 for meson
hmp: Print "reserve" property of memory backends with "info memdev"
qmp: Include "reserve" property of memory backends
hmp: Print "share" property of memory backends with "info memdev"
qmp: Include "share" property of memory backends
qmp: Clarify memory backend properties returned via query-memdev
hostmem: Wire up RAM_NORESERVE via "reserve" property
util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Change to -std=gnu11.
Replace QEMU_GENERIC with _Generic.
Remove configure detect of _Static_assert.
# gpg: Signature made Wed 16 Jun 2021 02:32:32 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth-gitlab/tags/pull-c11-20210615:
configure: Remove probe for _Static_assert
qemu/compiler: Remove QEMU_GENERIC
include/qemu/lockable: Use _Generic instead of QEMU_GENERIC
util: Use unique type for QemuRecMutex in thread-posix.h
util: Pass file+line to qemu_rec_mutex_unlock_impl
util: Use real functions for thread-posix QemuRecMutex
softfloat: Use _Generic instead of QEMU_GENERIC
configure: Use -std=gnu11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently the ARM SVE helper code defines locally some utility
functions for swapping 16-bit halfwords within 32-bit or 64-bit
values and for swapping 32-bit words within 64-bit values,
parallel to the byte-swapping bswap16/32/64 functions.
We want these also for the ARM MVE code, and they're potentially
generally useful for other targets, so move them to bitops.h.
(We don't put them in bswap.h with the bswap* functions because
they are implemented in terms of the rotate operations also
defined in bitops.h, and including bitops.h from bswap.h seems
better avoided.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210614151007.4545-17-peter.maydell@linaro.org
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.
Linux man page:
"MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
space is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get SIGSEGV
upon a write if no physical memory is available. See also the discussion
of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
2.6, this flag had effect only for private writable mappings."
Note that the "guarantee" part is wrong with memory overcommit in Linux.
Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).
The rough behavior is [1]:
a) !Hugetlbfs:
1) Without MAP_NORESERVE *or* with memory overcommit under Linux
disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
accounting/reservation happens:
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
2) With MAP_NORESERVE, no accounting/reservation happens.
b) Hugetlbfs:
1) Without MAP_NORESERVE, huge pages are reserved.
2) With MAP_NORESERVE, no huge pages are reserved.
Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:
"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"
Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()
... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.
We expose only flags that are currently supported by qemu_ram_mmap().
Maybe, we'll see qemu_mmap() in the future as well that can implement these
flags.
Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can create shared anonymous memory via
"-object memory-backend-ram,share=on,..."
which is, for example, required by PVRDMA for mremap() to work.
Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all
relevant page table entries of the current process, the backend storage
will not get removed, resulting in no reduced memory consumption and
a repopulation of previous content on next access.
Shared anonymous memory is internally really just shmem, but without a
fd exposed. As we cannot use fallocate() without the fd to discard the
backing storage, MADV_REMOVE gets the same job done without a fd as
documented in "man 2 madvise". Removing backing storage implicitly
invalidates all page table entries with relevant mappings - an additional
MADV_DONTNEED is not required.
Fixes: 06329ccecf ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change the parser to put the values into a QDict and pass them
to a callback. qemu_config_parse's QemuOpts creation is
itself turned into a callback function.
This is useful for -readconfig to support keyval-based options;
getting a QDict from the parser removes a roundtrip from
QDict to QemuOpts and then back to QDict.
Unfortunately there is a disadvantage in that semantic errors will
point to the last line of the group, because the entries of the QDict
do not have a location attached.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 15e8699f00 ("atomics: convert to reStructuredText") converted
docs/devel/atomics.txt to docs/devel/atomics.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/atomics.txt/atomics.rst/ $(git grep -l docs/devel/atomics.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-3-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Right now the SPICE module is special cased to be loaded when processing
of the -spice command line option. However, the spice option group
can also be brought in via -readconfig, in which case the module is
not loaded.
Add a generic hook to load modules that provide a QemuOpts group,
and use it for the "spice" and "iscsi" groups.
Fixes: #194
Fixes: https://bugs.launchpad.net/qemu/+bug/1910696
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These headers are also included from softmmu/vl.c, so they should be
in include/. Remove qemu-options-wrapper.h, since elsewhere
we include "template" headers directly and #define the parameters in
the including file; move qemu-options.h to include/.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow using QemuCoSleep to sleep forever until woken by qemu_co_sleep_wake.
This makes the logic of qemu_co_sleep_ns_wakeable easy to understand.
In the future we will introduce an API that can work even if the
sleep and wake happen from different threads. For now, initializing
w->to_wake after timer_mod is fine because the timer can only fire in
the same AioContext.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-7-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Right now, users of qemu_co_sleep_ns_wakeable are simply passing
a pointer to QemuCoSleepState by reference to the function. But
QemuCoSleepState really is just a Coroutine*; making the
content of the struct public is just as efficient and lets us
skip the user_state_pointer indirection.
Since the usage is changed, take the occasion to rename the
struct to QemuCoSleep.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reorg FloatParts to use QEMU_GENERIC.
Begin replacing the Berkeley float128 routines with FloatParts128.
- includes a new implementation of float128_muladd
- includes the snan silencing that was missing from
float{32,64}_to_float128 and float128_to_float{32,64}.
- does not include float128_min/max* (written but not yet reviewed).
# gpg: Signature made Sun 16 May 2021 13:27:10 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth-gitlab/tags/pull-fp-20210516: (46 commits)
softfloat: Move round_to_int_and_pack to softfloat-parts.c.inc
softfloat: Move round_to_int to softfloat-parts.c.inc
softfloat: Convert float-to-float conversions with float128
softfloat: Split float_to_float
softfloat: Move div_floats to softfloat-parts.c.inc
softfloat: Introduce sh[lr]_double primitives
softfloat: Tidy mul128By64To192
softfloat: Use add192 in mul128To256
softfloat: Use mulu64 for mul64To128
softfloat: Move muladd_floats to softfloat-parts.c.inc
softfloat: Move mul_floats to softfloat-parts.c.inc
softfloat: Implement float128_add/sub via parts
softfloat: Move addsub_floats to softfloat-parts.c.inc
softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
softfloat: Move round_canonical to softfloat-parts.c.inc
softfloat: Move sf_canonicalize to softfloat-parts.c.inc
softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
softfloat: Move pick_nan to softfloat-parts.c.inc
softfloat: Move return_nan to softfloat-parts.c.inc
softfloat: Convert float128_default_nan to parts
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These builtins came in clang 3.8, but are not present in gcc through
version 11. Even in clang the optimization is only ideal on x86_64,
but never worse than the hand-coding that we currently do.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These builtins came in with gcc 5 and clang 3.8, which are
slightly newer than our supported minimum compiler versions.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If mirror is READY than cancel operation is not discarding the whole
result of the operation, but instead it's a documented way get a
point-in-time snapshot of source disk.
So, we should not cancel any requests if mirror is READ and
force=false. Let's fix that case.
Note, that bug that we have before this commit is not critical, as the
only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight()
and it cancels only requests waiting for reconnection, so it should be
rare case.
Fixes: 521ff8b779
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210421075858.40197-1-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
qemu_add_child_watch is not called anywhere since commit 2bdb920ece
("slirp: simplify fork_exec()", 2019-01-14), remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
overall, all devices' realize functions take an Error **errp, but return void.
hw/core/qdev.c code, which realizes devices, therefore does:
local_err = NULL;
dc->realize(dev, &local_err);
if (local_err != NULL) {
goto fail;
}
However, we can improve at least accel_cpu to return a meaningful bool value.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210322132800.7470-9-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make bswap.h handle being included outside an 'extern "C"' block:
all system headers are included first, then all declarations are
put inside an 'extern "C"' block.
This requires a little rearrangement as currently we have an ifdef
ladder that has some system includes and some local declarations
or definitions, and we need to separate those out.
We want to do this because dis-asm.h includes bswap.h, dis-asm.h
may need to be included from C++ files, and system headers should
not be included within 'extern "C"' blocks.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Both os-win32.h and os-posix.h include system header files. Instead
of having osdep.h include them inside its 'extern "C"' block, make
these headers handle that themselves, so that we don't include the
system headers inside 'extern "C"'.
This doesn't fix any current problems, but it's conceptually the
right way to handle system headers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Right now, rate limiting is protected by the AioContext mutex, which is
taken for example both by the block jobs and by qmp_block_job_set_speed
(via find_block_job).
We would like to remove the dependency of block layer code on the
AioContext mutex, since most drivers and the core I/O code are already
not relying on it. However, there is no existing lock that can easily
be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
especially because the latter might run in coroutine context (and
therefore under a CoMutex) but the former will not.
Since concurrent calls to ratelimit_calculate_delay are not possible,
one idea could be to use a seqlock to get a snapshot of slice_ns and
slice_quota. But for now keep it simple, and just add a mutex to the
RateLimit struct; block jobs are generally not performance critical to
the point of optimizing the clock cycles spent in synchronization.
This also requires the introduction of init/destroy functions, so
add them to the two users of ratelimit.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add simple transaction API to use in further update of block graph
operations.
Supposed usage is:
- "prepare" is main function of the action and it should make the main
effect of the action to be visible for the following actions, keeping
possibility of roll-back, saving necessary things in action state,
which is prepended to the action list (to do that, prepare func
should call tran_add()). So, driver struct doesn't include "prepare"
field, as it is supposed to be called directly.
- commit/rollback is supposed to be called for the list of action
states, to commit/rollback all the actions in reverse order
- When possible "commit" should not make visible effect for other
actions, which make possible transparent logical interaction between
actions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-9-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
System headers may include templates if compiled with a C++ compiler,
which cause the compiler to complain if qemu/osdep.h is included
within a C++ source file's 'extern "C"' block. Add
an 'extern "C"' block directly to qemu/osdep.h, so that
system headers can be kept out of it.
There is a stray declaration early in qemu/osdep.h, which needs
to be special cased. Add a definition in qemu/compiler.h to
make it look nice.
config-host.h, CONFIG_TARGET, exec/poison.h and qemu/compiler.h
are included outside the 'extern "C"' block; that is not
an issue because they consist entirely of preprocessor directives.
This allows us to move the include of osdep.h in our two C++
source files outside the extern "C" block they were previously
using for it, which in turn means that they compile successfully
against newer versions of glib which insist that glib.h is
*not* inside an extern "C" block.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-3-peter.maydell@linaro.org
[PMM: Moved disas/arm-a64.cc osdep.h include out of its extern "C" block;
explained in commit message why we're doing this]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The pseries machines introduced the concept of 'unplug timeout' for CPU
hotunplugs. The idea was to circunvent a deficiency in the pSeries
specification (PAPR), that currently does not define a proper way for
the hotunplug to fail. If the guest refuses to release the CPU (see [1]
for an example) there is no way for QEMU to detect the failure.
Further discussions about how to send a QAPI event to inform about the
hotunplug timeout [2] exposed problems that weren't predicted back when
the idea was developed. Other QEMU machines don't have any type of
hotunplug timeout mechanism for any device, e.g. ACPI based machines
have a way to make hotunplug errors visible to the hypervisor. This
would make this timeout mechanism exclusive to pSeries, which is not
ideal.
The real problem is that a QAPI event that reports hotunplug timeouts
puts the management layer (namely Libvirt) in a weird spot. We're not
telling that the hotunplug failed, because we can't be 100% sure of
that, and yet we're resetting the unplug state back, preventing any
DEVICE_DEL events to reach out in case the guest decides to release the
device. Libvirt would need to inspect the guest itself to see if the
device was released or not, otherwise the internal domain states will be
inconsistent. Moreover, Libvirt already has an 'unplug timeout'
concept, and a QEMU side timeout would need to be juggled together with
the existing Libvirt timeout.
All this considered, this solution ended up creating more trouble than
it solved. This patch reverts the 3 commits that introduced the timeout
mechanism for CPU hotplugs in pSeries machines.
This reverts commit 4515a5f786
"qemu_timer.c: add timer_deadline_ms() helper"
This reverts commit d1c2e3ce3d
"spapr_drc.c: add hotunplug timeout for CPUs"
This reverts commit 51254ffb32
"spapr_drc.c: introduce unplug_timeout_timer"
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911414
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg04682.html
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210401000437.131140-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>