When something tries to run one of the spawn syscalls (eg clone),
our seccomp deny filter is set to cause a fatal trap which kills
the process.
This is found to be unhelpful when QEMU has loaded the nvidia
GL library. This tries to spawn a process to modprobe the nvidia
kmod. This is a dubious thing to do, but at the same time, the
code will gracefully continue if this fails. Our seccomp filter
rightly blocks the spawning, but prevent the graceful continue.
Switching to reporting EPERM will make QEMU behave more gracefully
without impacting the level of protect we have.
https://gitlab.com/qemu-project/qemu/-/issues/2116
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code. migrate_mode() in particular will be needed by
multiple clients.
Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If the access is bigger than the MemoryRegion supports,
flatview_read/write_continue() will attempt to update the Memory Region.
but the address passed to flatview_translate() is relative to the cache, not
to the FlatView.
On arm/virt with interleaved CXL memory emulation and virtio-blk-pci this
lead to the first part of descriptor being read from the CXL memory and the
second part from PA 0x8 which happens to be a blank region
of a flash chip and all ffs on this particular configuration.
Note this test requires the out of tree ARM support for CXL, but
the problem is more general.
Avoid this by adding new address_space_read_continue_cached()
and address_space_write_continue_cached() which share all the logic
with the flatview versions except for the MemoryRegion lookup which
is unnecessary as the MemoryRegionCache only covers one MemoryRegion.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240307153710.30907-5-Jonathan.Cameron@huawei.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The calls to flatview_read/write[_continue]() have parameters addr and
addr1 but the names give no indication of what they are addresses of.
Rename addr1 to mr_addr to reflect that it is the translated address
offset within the MemoryRegion returned by flatview_translate().
Similarly rename the parameter in address_space_read/write_cached_slow()
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20240307153710.30907-2-Jonathan.Cameron@huawei.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Misc HW patch queue
- hmp: Shorter 'info qtree' output (Zoltan)
- qdev: Add a granule_mode property (Eric)
- Some ERRP_GUARD() fixes (Zhao)
- Doc & style fixes in docs/interop/firmware.json (Thomas)
- hw/xen: Housekeeping (Phil)
- hw/ppc/mac99: Change timebase frequency 25 -> 100 MHz (Mark)
- hw/intc/apic: Memory leak fix (Paolo)
- hw/intc/grlib_irqmp: Ensure ncpus value is in range (Clément)
- hw/m68k/mcf5208: Add support for reset (Angelo)
- hw/i386/pc: Housekeeping (Phil)
- hw/core/smp: Remove/deprecate parameter=0,1 adapting test-smp-parse (Zhao)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmXstpMACgkQ4+MsLN6t
# wN6XBw//dNItFhf1YX+Au4cNoQVDgHE9RtzEIGOnwcL1CgrA9rAQQfLRE5KWM6sN
# 1qiPh+T6SPxtiQ2rw4AIpsI7TXjO72b/RDWpUUSwnfH39eC77pijkxIK+i9mYI9r
# p0sPjuP6OfolUFYeSbYX+DmNZh1ONPf27JATJQEf0st8dyswn7lTQvJEaQ97kwxv
# UKA0JD5l9LZV8Zr92cgCzlrfLcbVblJGux9GYIL09yN78yqBuvTm77GBC/rvC+5Q
# fQC5PQswJZ0+v32AXIfSysMp2R6veo4By7VH9Lp51E/u9jpc4ZbcDzxzaJWE6zOR
# fZ01nFzou1qtUfZi+MxNiDR96LP6YoT9xFdGYfNS6AowZn8kymCs3eo7M9uvb+rN
# A2Sgis9rXcjsR4e+w1YPBXwpalJnLwB0QYhEOStR8wo1ceg7GBG6zHUJV89OGzsA
# KS8X0aV1Ulkdm/2H6goEhzrcC6FWLg8pBJpfKK8JFWxXNrj661xM0AAFVL9we356
# +ymthS2x/RTABSI+1Lfsoo6/SyXoimFXJJWA82q9Yzoaoq2gGMWnfwqxfix6JrrA
# PuMnNP5WNvh04iWcNz380P0psLVteHWcVfTRN3JvcJ9iJ2bpjcU1mQMJtvSF9wBn
# Y8kiJTUmZCu3br2e5EfxmypM/h8y29VD/1mxPk8Dtcq3gjx9AU4=
# =juZH
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 09 Mar 2024 19:20:51 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'hw-misc-20240309' of https://github.com/philmd/qemu: (43 commits)
hw/m68k/mcf5208: add support for reset
tests/unit/test-smp-parse: Test "parameter=0" SMP configurations
tests/unit/test-smp-parse: Test smp_props.has_clusters
tests/unit/test-smp-parse: Test the full 7-levels topology hierarchy
tests/unit/test-smp-parse: Test "drawers" and "books" combination case
tests/unit/test-smp-parse: Test "drawers" parameter in -smp
tests/unit/test-smp-parse: Test "books" parameter in -smp
tests/unit/test-smp-parse: Make test cases aware of the book/drawer
tests/unit/test-smp-parse: Bump max_cpus to 4096
tests/unit/test-smp-parse: Use CPU number macros in invalid topology case
tests/unit/test-smp-parse: Drop the unsupported "dies=1" case
hw/core/machine-smp: Calculate total CPUs once in machine_parse_smp_config()
hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations
hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations
docs/interop/firmware.json: Fix doc for FirmwareFlashMode
docs/interop/firmware.json: Align examples
hw/intc/grlib_irqmp: abort realize when ncpus value is out of range
mac_newworld: change timebase frequency from 100MHz to 25MHz for mac99 machine
hmp: Add option to info qtree to omit details
qdev: Add a granule_mode property
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
trivial patches for 2024-03-09
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmXshtIPHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZMFUIAKTd1rYzRs6x4wXitaWbYIIs2d6UB/HLbzz2
# BVHZwoYqsW3TuNFJp4njHhexZ76nFlT8xMuOKB5tAm4KOmqOdxS/mfThuSGsWGP7
# CAk35ENOMQbii/jp6tqawop+H0rVMSJjBrkU4vLRAtQ7g1ISnX6tJi3wiyS+FtHq
# 9eIfgJgM77tvq6RLPZTUrUBevMWQfjMcvXmMnYqL4Z1dnibIb5/R3RKAnEc4CUoS
# hMw94wBcq+ZOQNPnY7d+WioKq7JcSWX7UW5NuHo+C+G83nq1/5vE8Oe2kNwzFyDL
# 9sIqL8bz6v8iiqcVMIBykSAZhYH9QEuVRJso18UE5w0B8k4CQcM=
# =dIAF
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 09 Mar 2024 15:57:06 GMT
# gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg: issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@debian.org>" [full]
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
docs/acpi/bits: add some clarity and details while also improving formating
hw/mem/cxl_type3: Fix problem with g_steal_pointer()
hw/pci-bridge/cxl_upstream: Fix problem with g_steal_pointer()
hw/cxl/cxl-cdat: Fix type of buf in ct3_load_cdat()
qerror: QERR_DEVICE_IN_USE is no longer used, drop
blockdev: Fix block_resize error reporting for op blockers
char: Slightly better error reporting when chardev is in use
make-release: switch to .xz format by default
hw/scsi/lsi53c895a: Fix typo in comment
hw/vfio/pci.c: Make some structure static
replay: Improve error messages about configuration conflicts
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Improve
Record/replay feature is not supported for '-rtc base=localtime'
Record/replay feature is not supported for 'smp'
Record/replay feature is not supported for '-snapshot'
to
Record/replay is not supported with -rtc base=localtime
Record/replay is not supported with multiple CPUs
Record/replay is not supported with -snapshot
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Original goal of addition of drain_call_rcu to qmp_device_add was to cover
the failure case of qdev_device_add. It seems call of drain_call_rcu was
misplaced in 7bed89958b what led to waiting for pending RCU callbacks
under happy path too. What led to overall performance degradation of
qmp_device_add.
In this patch call of drain_call_rcu moved under handling of failure of
qdev_device_add.
Signed-off-by: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Message-ID: <20231103105602.90475-1-ds-gavr@yandex-team.ru>
Fixes: 7bed89958b ("device_core: use drain_call_rcu in in qmp_device_add", 2020-10-12)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a fd-bootchk property to PC machine types, so that -no-fd-bootchk
returns an error if the machine does not support booting from floppies
and checking for boot signatures therein.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently the qemu_register_reset() API permits the reset handler functions
registered with it to remove themselves from within the callback function.
This is fine with our current implementation, but is a bit odd, because
generally reset is supposed to be idempotent, and doesn't fit well in a
three-phase-reset world where a resettable object will get multiple
callbacks as the system is reset.
We now have only one user of qemu_register_reset() which makes use of
the ability to unregister itself within the callback:
restore_boot_order(). We want to change our implementation of
qemu_register_reset() to something where it would be awkward to
maintain the "can self-unregister" feature. Rather than making that
reimplementation complicated, change restore_boot_order() so that it
doesn't unregister itself but instead returns doing nothing for any
calls after it has done the "restore the boot order" work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240220160622.114437-4-peter.maydell@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Input grab key should be Ctrl-Alt-g, not just Ctrl-Alt.
Fixes: f8d2c9369b ("sdl: use ctrl-alt-g as grab hotkey")
Signed-off-by: Tianlan Zhou <bobby825@126.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Arguments `ram_block` are reassigned to local declarations `block`
without further use. Remove re-assignment to reduce noise.
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
allow to enable or disable their SuperI/O functions. Add a convenience function
for implementing this in the VIA south bridges.
The naming of the functions is inspired by its memory_region_set_enabled()
pendant.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
are able to relocate their SuperI/O functions. Add a convenience function for
implementing this in the VIA south bridges.
This convenience function relies on previous simplifications in exec/ioport
which avoids some duplicate synchronization of I/O port base addresses. The
naming of the function is inspired by its memory_region_set_address() pendant.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
portio_list_add_1() creates a MemoryRegionPortioList instance which holds a
MemoryRegion `mr` and an array of MemoryRegionPortio elements named `ports`.
Each element in the array gets assigned the same value for its .base attribute.
The same value also ends up as the .addr attribute of `mr` due to the
memory_region_add_subregion() call. This means that all .base attributes are
the same as `mr.addr`.
The only usages of MemoryRegionPortio::base were in portio_read() and
portio_write(). Both functions get above MemoryRegionPortioList as their
opaque parameter. In both cases find_portio() can only return one of the
MemoryRegionPortio elements of the `ports` array. Due to above observation any
element will have the same .base value equal to `mr.addr` which is also
accessible.
Hence, `mrpio->mr.addr` is equivalent to `mrp->base` and
MemoryRegionPortio::base is redundant and can be removed.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For a long time now, libvirt has pre-created the monitor connection
socket and passed the pre-opened FD into QEMU during startup. Thus
libvirt does not have any timeouts waiting for the monitor socket
to appear, it is immediately connected.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In many configurations, e.g. multiple vNICs with multiple queues or
with many Ceph OSDs, the default soft limit of 1024 is not enough.
QEMU is supposed to work fine with file descriptors >= 1024 and does
not use select() on POSIX. Bump the soft limit to the allowed hard
limit to avoid issues with the aforementioned configurations.
Of course the limit could be raised from the outside, but the man page
of systemd.exec states about 'LimitNOFILE=':
> Don't use.
> [...]
> Typically applications should increase their soft limit to the hard
> limit on their own, if they are OK with working with file
> descriptors above 1023,
If the soft limit is already the same as the hard limit, avoid the
superfluous setrlimit call. This can avoid a warning with a strict
seccomp filter blocking setrlimit if NOFILE was already raised before
executing QEMU.
Buglink: https://bugzilla.proxmox.com/show_bug.cgi?id=4507
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
QEMU initializes preallocated backend memory as the objects are parsed from
the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in series.
Allow the initialization to occur in parallel (asynchronously). In order to
ensure optimal thread placement, asynchronous initialization requires prealloc
context threads to be in use.
Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Message-ID: <20240131165327.3154970-2-mark.kanda@oracle.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Also remove the stale declaration of host_net_devices; the actual
definition was removed long ago in commit 7cc28cb061 ("net: Remove
the deprecated 'host_net_add' and 'host_net_remove' HMP commands")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Currently if the user passes multiple -serial options on the command
line, we mostly treat those as applying to the different serial
devices in order, so that for example
-serial stdio -serial file:filename
will connect the first serial port to stdio and the second to the
named file.
The exception to this is the '-serial none' serial device type. This
means "don't allocate this serial device", but a bug means that
following -serial options are not correctly handled, so that
-serial none -serial stdio
has the unexpected effect that stdio is connected to the first serial
port, not the second.
This is a very long-standing bug that dates back at least as far as
commit 998bbd74b9 from 2009.
Make the 'none' serial type move forward in the indexing of serial
devices like all the other serial types, so that any subsequent
-serial options are correctly handled.
Note that if your commandline mistakenly had a '-serial none' that
was being overridden by a following '-serial something' option, you
should delete the unnecessary '-serial none'. This will give you the
same behaviour as before, on QEMU versions both with and without this
bug fix.
Cc: qemu-stable@nongnu.org
Reported-by: Bohdan Kostiv <bohdan.kostiv@tii.ae>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240122163607.459769-2-peter.maydell@linaro.org
Fixes: 998bbd74b9 ("default devices: core code & serial lines")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>