RdmaProtectedQList provides a thread-safe queue of int64_t on top of a
QList.
rdma_protected_qlist_destroy() calls qlist_destroy_obj() directly.
qlist_destroy_obj() is actually for use by qobject_destroy() only.
The next commit will make that obvious.
The minimal fix would be calling qobject_unref() instead. But QList
is actually a bad fit here. It's designed for representing JSON
arrays. We're better off with a GQueue here. Replace.
Cc: Yuval Shaia <yuval.shaia.ml@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201211171152.146877-8-armbru@redhat.com>
clang's C11 atomic_fetch_*() functions only take a C11 atomic type
pointer argument. QEMU uses direct types (int, etc) and this causes a
compiler error when a QEMU code calls these functions in a source file
that also included <stdatomic.h> via a system header file:
$ CC=clang CXX=clang++ ./configure ... && make
../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid)
Avoid using atomic_*() names in QEMU's atomic.h since that namespace is
used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h
and <stdatomic.h> can co-exist. I checked /usr/include on my machine and
searched GitHub for existing "qatomic_" users but there seem to be none.
This patch was generated using:
$ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \
sort -u >/tmp/changed_identifiers
$ for identifier in $(</tmp/changed_identifiers); do
sed -i "s%\<$identifier\>%q$identifier%g" \
$(git grep -I -l "\<$identifier\>")
done
I manually fixed line-wrap issues and misaligned rST tables.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
With the change made in commit 68b89aee71 ("Utilize ibv_reg_mr_iova for
memory registration") the MR emulation is no longer needed in order to
translate the guest addresses into host addresses.
With that, the next obvious step is to skip entirely the processing in
data-path.
To accomplish this, return the backend's lkey to driver so we will not
need to do the emulated mr_id to backend mr_id translation in data-path.
The function build_host_sge_array is still called in data-path but only
for backward computability with statistics collection.
While there, as a cosmetic change to make the code cleaner - make one
copy of the function rdma_backend_create_mr and leave the redundant
guest_start argument in the legacy code.
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20200320143429.9490-3-yuval.shaia.ml@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
The virtual address that is provided by the guest in post_send and
post_recv operations is related to the guest address space. This address
space is unknown to the HCA resides on host so extra step in these
operations is needed to adjust the address to host virtual address.
This step, which is done in data-path affects performances.
An enhanced verion of MR registration introduced here
https://patchwork.kernel.org/patch/11044467/ can be used so that the
guest virtual address space for this MR is known to the HCA in host.
This will save the data-path adjustment.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190818132107.18181-3-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
This parameter has no effect, fix it.
The function init_dev_caps sets the front-end's max-sge to MAX_SGE. Then
it checks backend's max-sge and adjust it accordingly (we can't send
more than what the device supports).
On send and recv we need to make sure the num_sge in the WQE does not
exceeds the backend device capability.
This check is done in pvrdma level so check on rdma level is deleted.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Message-Id: <20190109194123.3468-1-yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
With commit 4481985c (rdma: check num_sge does not exceed MAX_SGE)
macro VENDOR_ERR_NO_SGE is no longer in use - delete it.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
rdma back-end has scatter/gather array ibv_sge[MAX_SGE=4] set
to have 4 elements. A guest could send a 'PvrdmaSqWqe' ring element
with 'num_sge' set to > MAX_SGE, which may lead to OOB access issue.
Add check to avoid it.
Reported-by: Saar Amar <saaramar5@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
The control over the RDMA device's GID table is done by updating the
device's Ethernet function addresses.
Usually the first GID entry is determined by the MAC address, the second
by the first IPv6 address and the third by the IPv4 address. Other
entries can be added by adding more IP addresses. The opposite is the
same, i.e. whenever an address is removed, the corresponding GID entry
is removed.
The process is done by the network and RDMA stacks. Whenever an address
is added the ib_core driver is notified and calls the device driver
add_gid function which in turn update the device.
To support this in pvrdma device we need to hook into the create_bind
and destroy_bind HW commands triggered by pvrdma driver in guest.
Whenever a change is made to the pvrdma port's GID table a special QMP
message is sent to be processed by libvirt to update the address of the
backend Ethernet device.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
MAD (Management Datagram) packets are widely used by various modules
both in kernel and in user space for example the rdma_* API which is
used to create and maintain "connection" layer on top of RDMA uses
several types of MAD packets.
For more information please refer to chapter 13.4 in Volume 1
Architecture Specification, Release 1.1 available here:
https://www.infinibandta.org/ibta-specifications-download/
To support MAD packets the device uses an external utility
(contrib/rdmacm-mux) to relay packets from and to the guest driver.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
There are certain operations that are well considered as part of device
configuration while others are needed only when "start" command is
triggered by the guest driver. An example of device initialization step
is msix_init and example of "device start" stage is the creation of a CQ
completion handler thread.
Driver expects such distinction - implement it.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20180805153518.2983-2-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
To avoid compilation warnings on 32-bit machines:
rdma_backend.c: In function 'rdma_backend_create_mr':
rdma_backend.c:409:37: error: cast to pointer from integer of different
size [-Werror=int-to-pointer-cast]
mr->ibmr = ibv_reg_mr(pd->ibpd, (void *)addr, length, access);
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180322095220.9976-2-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Our rule right now is to use <> for external headers only.
RDMA code violates that, fix it up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
This layer is composed of two sub-modules, backend and resource manager.
Backend sub-module is responsible for all the interaction with IB layers
such as ibverbs and umad (external libraries).
Resource manager is a collection of functions and structures to manage
RDMA resources such as QPs, CQs and MRs.
Reviewed-by: Dotan Barak <dotanb@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>