virtio, vhost, pc: fixes
Most notably this fixes a regression with vhost introduced by the pull before
last.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 18 Nov 2016 03:51:55 PM GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* mst/tags/for_upstream:
acpi: Use apic_id_limit when calculating legacy ACPI table size
ipmi: fix qemu crash while migrating with ipmi
ivshmem: Fix 64 bit memory bar configuration
virtio: set ISR on dataplane notifications
virtio: access ISR atomically
virtio: introduce grab/release_ioeventfd to fix vhost
virtio-crypto: fix virtio_queue_set_notification() race
Message-id: 1479484366-7977-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The code that calculates the legacy ACPI table size for migration
compatibility uses max_cpus when calculating legacy_aml_len (the size of
the DSDT and SSDT tables). However, the SSDT grows according to APIC ID
limit, not max_cpus.
The bug is not triggered very often because of the 4k alignment on the
table size. But it can be triggered if you are unlucky enough to cross a
4k boundary.
Change the legacy_aml_len calculation to use apic_id_limit, to calculate
the right size.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PC will use this field in other way, so move it outside the common
code so PC could set a different value, i.e. all CPUs
regardless of where they are coming from (-smp X | -device cpu...).
It's quick and dirty hack as it could be implemented in more generic
way in MashineClass. But do it in simple way since only PC is affected
so far.
Later we can generalize it when another affected target gets support
for -device cpu.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-3-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We should not use cpu_to_le16() here, instead each of device/function
value is stored in a 8 byte field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.
It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
The buffer is used to save the FIT info for all the presented nvdimm
devices which is updated after the nvdimm device is plugged or
unplugged. In the later patch, it will be used to construct NVDIMM
ACPI _FIT method which reflects the presented nvdimm devices after
nvdimm hotplug
As FIT buffer can not completely mapped into guest address space,
OSPM will exit to QEMU multiple times, however, there is the race
condition - FIT may be changed during these multiple exits, so that
some rules are introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.
@dirty is cleared for the first time OSPM gets fit buffer, if
dirty is detected in the later access, OSPM will restart the
access
As fit should be updated after nvdimm device is successfully realized
so that a new hotplug callback, post_hotplug, is introduced
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For each NVDIMM present or intended to be supported by platform,
platform firmware also exposes an ACPI Namespace Device under
the root device
So it builds nvdimm devices for all slots to support vNVDIMM hotplug
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Base patches for MTTCG enablement.
# gpg: Signature made Mon 31 Oct 2016 14:01:41 GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-mttcg:
tcg: move locking for tb_invalidate_phys_page_range up
*_run_on_cpu: introduce run_on_cpu_data type
cpus: re-factor out handle_icount_deadline
tcg: cpus rm tcg_exec_all()
tcg: move tcg_exec_all and helpers above thread fn
target-arm/arm-powerctl: wake up sleeping CPUs
tcg: protect translation related stuff with tb_lock.
translate-all: Add assert_(memory|tb)_lock annotations
linux-user/elfload: ensure mmap_lock() held while setting up
tcg: comment on which functions have to be called with tb_lock held
cpu-exec: include cpu_index in CPU_LOG_EXEC messages
translate-all: add DEBUG_LOCKING asserts
translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH
cpus: make all_vcpus_paused() return bool
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Anand J <anand.indukala@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It would prevent starting guest with incorrect configs
where interrupts couldn't be delivered to CPUs with
APIC IDs > 255.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently firmware uses 1 byte at 0x5F offset in RTC CMOS
to get number of CPUs present at boot. However 1 byte is
not enough to handle more than 255 CPUs. So add a new
fw_cfg file that would allow QEMU to tell it.
For compat reasons add file only for machine types that
support more than 255 CPUs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
That's enough to make old code that depends on it
to prevent QEMU starting with more than 255 CPUs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Implement SUSE specific unplug protocol for emulated PCI devices
in PVonHVM guests. Its a simple 'outl(1, (ioaddr + 4));'.
This protocol was implemented and used since Xen 3.0.4.
It is used in all SUSE/SLES/openSUSE releases up to SLES11SP3 and
openSUSE 12.3.
In addition old (pre-2011) VMDP versions are handled as well.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Using 'vdev=sd[a-o]' will create an emulated LSI controller, which can
be used by the emulated BIOS to boot from disk. If the HVM domU has also
PV driver the disk may appear twice in the guest. To avoid this an
unplug of the emulated hardware is needed, similar to what is done for
IDE and NIC drivers already.
Since the SCSI controller provides only disks the entire controller can
be unplugged at once.
Impact of the change for classic and pvops based guest kernels:
vdev=sda:disk0
before: pvops: disk0=pv xvda + emulated sda
classic: disk0=pv sda + emulated sdq
after: pvops: disk0=pv xvda
classic: disk0=pv sda
vdev=hda:disk0, vdev=sda:disk1
before: pvops: disk0=pv xvda
disk1=emulated sda
classic: disk0=pv hda
disk1=pv sda + emulated sdq
after: pvops: disk0=pv xvda
disk1=not accessible by blkfront, index hda==index sda
classic: disk0=pv hda
disk1=pv sda
vdev=hda:disk0, vdev=sda:disk1, vdev=sdb:disk2
before: pvops: disk0=pv xvda
disk1=emulated sda
disk2=pv xvdb + emulated sdb
classic: disk0=pv hda
disk1=pv sda + emulated sdq
disk2=pv sdb + emulated sdr
after: pvops: disk0=pv xvda
disk1=not accessible by blkfront, index hda==index sda
disk2=pv xvdb
classic: disk0=pv hda
disk1=pv sda
disk2=pv sda
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.
There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC; unconditionally, because making the
option depend on KVM version would be a maintanance burden.
Old QEMUs would enable eim whenever intremap was on, which would trick
guests into thinking that they can enable cluster x2APIC even if any
interrupt destination would get clamped to 8 bits.
Depending on your configuration, QEMU could notice that the destination
LAPIC is not present and report it with a very non-obvious:
KVM: injection failed, MSI lost (Operation not permitted)
Or the guest could say something about unexpected interrupts, because
clamping leads to aliasing so interrupts were being delivered to
incorrect VCPUs.
KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.
QEMU 2.7 allowed EIM whenever interrupt remapping was enabled. In order
to keep backward compatibility, we again allow guests to misbehave in
non-obvious ways, and make it the default for old machine types.
A user can enable the buggy mode it with "x-buggy-eim=on".
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The default (auto) emulates the current behavior.
A user can now control EIM like
-device intel-iommu,intremap=on,eim=off
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* there no point in configuring the device if realization is going to
fail, so move the check to the beginning,
* create a separate function for the check,
* use error_setg() instead error_report().
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The MMIO interface to APIC only allowed 8 bit addresses, which is not
enough for 32 bit addresses from EIM remapping.
Intel stored upper 24 bits in the high MSI address, so use the same
technique. The technique is also used in KVM MSI interface.
Other APICs are unlikely to handle those upper bits.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The MMIO based interface to APIC doesn't work well with MSIs that have
upper address bits set (remapped x2APIC MSIs). A specialized interface
is a quick and dirty way to avoid the shortcoming.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* Thread Sanitizer fixes (Alex)
* Coverity fixes (David)
* test-qht fixes (Emilio)
* QOM interface for info irq/info pic (Hervé)
* -rtc clock=rt fix (Junlian)
* mux chardev fixes (Marc-André)
* nicer report on death by signal (Michal)
* qemu-tech TLC (Paolo)
* MSI support for edu device (Peter)
* qemu-nbd --offset fix (Tomáš)
# gpg: Signature made Fri 07 Oct 2016 17:25:10 BST
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (39 commits)
qemu-doc: merge qemu-tech and qemu-doc
qemu-tech: rewrite some parts
qemu-tech: reorganize content
qemu-tech: move TCG test documentation to tests/tcg/README
qemu-tech: move user mode emulation features from qemu-tech
qemu-tech: document lazy condition code evaluation in cpu.h
qemu-tech: move text from qemu-tech to tcg/README
qemu-doc: drop installation and compilation notes
qemu-doc: replace introduction with the one from the internals manual
qemu-tech: drop index
test-qht: perform lookups under rcu_read_lock
qht: fix unlock-after-free segfault upon resizing
qht: simplify qht_reset_size
qemu-nbd: Shrink image size by specified offset
qemu_kill_report: Report PID name too
util: Introduce qemu_get_pid_name
char: update read handler in all cases
char: use a fixed idx for child muxed chr
i8259: give ISA device when registering ISA ioports
.travis.yml: add gcc sanitizer build
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Trigger Mode field of IOAPIC must match the Trigger Mode in
the IRTE according to VT-d Spec 5.1.5.1.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Replace repeated pattern
for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(idx, numa_info[i].node_cpu)) {
...
break;
with a helper function to lookup numa node index for cpu.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Block layer patches
# gpg: Signature made Thu 29 Sep 2016 14:11:30 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
oslib-posix: add a configure switch to debug stack usage
coroutine-sigaltstack: use helper for allocating stack memory
coroutine-ucontext: use helper for allocating stack memory
coroutine: add a macro for the coroutine stack size
coroutine-sigaltstack: rename coroutine struct appropriately
oslib-posix: add helpers for stack alloc and free
block: Remove qemu_root_bds_opts
block: Move 'discard' option to bdrv_open_common()
block: Use 'detect-zeroes' option for 'blockdev-change-medium'
block: Parse 'detect-zeroes' in bdrv_open_common()
block/qapi: Move 'aio' option to file driver
block/qapi: Use separate options type for curl driver
block: Drop aio/cache consistency check from qmp_blockdev_add()
block: Fix error path in qmp_blockdev_change_medium()
block-backend: remove blk_flush_all
qemu: use bdrv_flush_all for vm_stop et al
block: reintroduce bdrv_flush_all
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.
The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>