Switch the .bdrv_read method implementation from using bdrv_pread() to
bdrv_read() on the underlying file, since the latter is subject to i/o
throttling while the former is not.
Besides, since bdrv_read() operates in sectors rather than bytes, adjust
the helper functions to do so too.
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1430207220-24458-4-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QMP pull request
# gpg: Signature made Mon May 11 14:15:19 2015 BST using RSA key ID E24ED5A7
# gpg: Good signature from "Luiz Capitulino <lcapitulino@gmail.com>"
* remotes/qmp-unstable/tags/for-upstream:
scripts: qmp-shell: Add verbose flag
scripts: qmp-shell: add transaction subshell
scripts: qmp-shell: Expand support for QMP expressions
scripts: qmp-shell: refactor helpers
MAINTAINERS: New maintainer for QMP and QAPI
json-parser: Accept 'null' in QMP
qobject: Add a special null QObject
qobject: Clean up around qtype_code
QJSON: Use OBJECT_CHECK
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QTYPE_NONE is a sentinel value. No QObject has this type code.
Document it properly.
Fix dump_qobject() to abort() on QTYPE_NONE, just like for any other
invalid type code.
Fix to_json() to abort() on all invalid type codes, not just
QTYPE_MAX.
Clean up Property member qtype's type: it's a qtype_code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The block.c file has grown to over 6000 lines. It is time to split this
file so there are fewer conflicts and the code is easier to maintain.
Extract I/O request processing code:
* Read
* Write
* Zero writes and making the image empty
* Flush
* Discard
* ioctl
* Tracked requests and queuing
* Throttling and copy-on-read
* Block status and allocated functions
* Refreshing block limits
* Reading/writing vmstate
* qemu_blockalign() and friends
The patch simply moves code from block.c into block/io.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Coverity spotted this.
The field is 32 bits, but if it's possible to overflow in 32 bit
left shift.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The mirror block job is trying to take a clever shortcut if delay_ns is
0 and skips block_job_sleep_ns() in that case. But that function must be
called in every block job iteration, because otherwise it is for example
impossible to pause the job.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The new command pair is added to manage a user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.
The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K to mirror how the 'mirror' code was
already choosing granularity. If we do not have cluster size info
available, we choose 64K. This code has been factored out into a helper
shared with block/mirror.
This patch also introduces the 'block_dirty_bitmap_lookup' helper,
which takes a device name and a dirty bitmap name and validates the
lookup, returning NULL and setting errp if there is a problem with
either field. This helper will be re-used in future patches in this
series.
The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}'
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This field will be set for user created dirty bitmap. Also pass in an
error pointer to bdrv_create_dirty_bitmap, so when a name is already
taken on this BDS, it can report an error message. This is not global
check, two BDSes can have dirty bitmap with a common name.
Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will
be used later when other QMP commands want to reference dirty bitmap by
name.
Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
the allocationmap has only a hint character. The driver always
double checks that blocks marked unallocated in the cache are
still unallocated before taking the fast path and return zeroes.
So using the allocationmap is migration safe and can
also be enabled with cache.direct=on.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The idea is that a command is retried in a BUSY condition
up a time of approx. 60 seconds before it is failed. This should
be far higher than any command timeout in the guest.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-7-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SCSI allowes to tell the target to not return from a write command
if the date is not written to the disk. Use this so called FUA
bit if it is supported to optimize WRITE commands if writeback is
not allowed.
In this case qemu always issues a WRITE followed by a FLUSH. This
is 2 round trip times. If we set the FUA bit we can ignore the
following FLUSH.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-6-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The image field in BlockDeviceInfo is supposed to contain an ImageInfo
object. However that is being filled in by bdrv_query_info(), not by
bdrv_block_device_info(), which is where BlockDeviceInfo is actually
created.
Anyone calling bdrv_block_device_info() directly will get a null image
field. As a consequence of this, the HMP command 'info block -n -v'
crashes QEMU.
This patch moves the code that fills in that field from
bdrv_query_info() to bdrv_block_device_info().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.
In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch changes block_job_pause to increase the pause counter and
block_job_resume to decrease it.
The counter will allow calling block_job_pause/block_job_resume
unconditionally on a job when we need to suspend the IO temporarily.
From now on, each block_job_resume must be paired with a block_job_pause
to keep the counter balanced.
The user pause from QMP or HMP will only trigger block_job_pause once
until it's resumed, this is achieved by adding a user_paused flag in
BlockJob.
One occurrence of block_job_resume in mirror_complete is replaced with
block_job_enter which does what is necessary.
In block_job_cancel, the cancel flag is good enough to instruct
coroutines to quit loop, so use block_job_enter to replace the unpaired
block_job_resume.
Upon block job IO error, user is notified about the entering to the
pause state, so this pause belongs to user pause, set the flag
accordingly and expect a matching QMP resume.
[Extended doc comments as suggested by Paolo Bonzini
<pbonzini@redhat.com>.
--Stefan]
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fix the length of the zero-fill for the back, which was accidentally
using the same value as for the front. This is caught by qemu-iotests
033.
For consistency, change the code for the front as well to use the length
stored in the iov (it is the same value, copied four lines above).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Jeff Cody <jcody@redhat.com>
This is, amongst others, required for qemu-iotests 033 to run as
intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file
when allocating new blocks.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
This commit was generated mechanically by coccinelle from the following
semantic patch:
@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)
The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc). Therefore we
don't need to worry about the fact that ctz32(0) == 32.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The command "virsh create" will fail in such condition: vm has two
disks: vda and vdb. vda has snapshot s1 with id "1", vdb doesn't have
s1 but has snapshot s2 with id "1". When we want to run command "virsh
create s1", del_existing_snapshots() only deletes s1 in vda, and
bdrv_snapshot_create() tries to create vdb's snapshot s1 with id "1",
but id "1" alreay exists in vdb with name "s2"!
The simplest way is call find_new_snapshot_id() unconditionally.
Signed-off-by: Yi Wang <up2wing@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
newer libiscsi versions may return zero events from iscsi_which_events.
In this case iscsi_service will return immediately without any progress.
To avoid busy waiting for iscsi_which_events to change we deregister all
read and write handlers in this case and schedule a timer to periodically
check iscsi_which_events for changed events.
Next libiscsi version will introduce async reconnects and zero events
are returned while libiscsi is waiting for a reconnect retry.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1428437295-29577-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In recent qemu versions, it is possible to override the backing file
name and format that is stored in the image file with values given at
runtime. In such cases, the temporary override could end up in the
image header if the qcow2 header was updated, while obviously correct
behaviour would be to leave the on-disk backing file path/format
unchanged.
Fix this and add a test case for it.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1428411796-2852-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block patches for 2.3.0-rc1
# gpg: Signature made Thu Mar 19 15:03:26 2015 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream:
block: Fix blockdev-backup not to use funky error class
raw-posix: Deprecate aio=threads fallback without O_DIRECT
raw-posix: Deprecate host floppy passthrough
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, if the user requests aio=native, but forgets to choose a
cache mode that sets O_DIRECT, that request is silently ignored and raw
falls back to aio=threads.
Deprecate that behaviour so we can make it an error in future qemu
versions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Raise your hand if you have a physical floppy drive in a computer
you've powered on in 2015. Okay, I see we got a few weirdos in the
audience. That's okay, weirdos are welcome here.
Kidding aside, media change detection doesn't fully work, isn't going
to be fixed, and floppy passthrough just isn't earning its keep
anymore.
Deprecate block driver host_floppy now, so we can drop it after a
grace period.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sparse reports this warning:
block/qapi.c:417:47: warning:
too long initializer-string for array of char(no space for nul char)
Replacing the string by an array of characters fixes this warning.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>