Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Improve stop_machine wait logic: replace cpu_relax_yield call in
generic stop_machine function with a weak stop_machine_yield
function. This is overridden on s390, which yields the current cpu to
the neighbouring cpu after a couple of retries, instead of blindly
giving up the cpu to the hipervisor. This significantly improves
stop_machine performance on s390 in overcommitted scenarios.
This includes common code changes which have been Acked by Peter
Zijlstra and Thomas Gleixner.
- Improve jump label transformation speed: transform jump labels
without using stop_machine.
- Refactoring of the vfio-ccw cp handling, simplifying the code and
avoiding unneeded allocating/copying.
- Various vfio-ccw fixes (ccw translation, state machine).
- Add support for vfio-ap queue interrupt control in the guest. This
includes s390 kvm changes which have been Acked by Christian
Borntraeger.
- Add protected virtualization support for virtio-ccw.
- Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to
remove some code which most likely isn't working at all, besides that
s390 didn't even compile for !CONFIG_SMP.
- Support for special flagged EP11 CPRBs for zcrypt.
- Handle PCI devices with no support for new MIO instructions.
- Avoid KASAN false positives in reworked stack unwinder.
- Couple of fixes for the QDIO layer.
- Convert s390 specific documentation to ReST format.
- Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if
hardware is missing. This way our modules behave like most other
modules and which is also what systemd's systemd-modules-load.service
expects.
- Replace defconfig with performance_defconfig, so there is one config
file less to maintain.
- Remove the SCLP call home device driver, which was never useful.
- Cleanups all over the place.
* tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
docs: s390: s390dbf: typos and formatting, update crash command
docs: s390: unify and update s390dbf kdocs at debug.c
docs: s390: restore important non-kdoc parts of s390dbf.rst
vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1
s390/pci: correctly handle MIO opt-out
s390/pci: deal with devices that have no support for MIO instructions
s390: ap: kvm: Enable PQAP/AQIC facility for the guest
s390: ap: implement PAPQ AQIC interception in kernel
vfio: ap: register IOMMU VFIO notifier
s390: ap: kvm: add PQAP interception for AQIC
s390/unwind: cleanup unused READ_ONCE_TASK_STACK
s390/kasan: avoid false positives during stack unwind
s390/qdio: don't touch the dsci in tiqdio_add_input_queues()
s390/qdio: (re-)initialize tiqdio list entries
s390/dasd: Fix a precision vs width bug in dasd_feature_list()
s390/cio: introduce driver_override on the css bus
vfio-ccw: make convert_ccw0_to_ccw1 static
vfio-ccw: Remove copy_ccw_from_iova()
vfio-ccw: Factor out the ccw0-to-ccw1 transition
vfio-ccw: Copy CCW data outside length calculation
...
|
|
When processing Format-0 CCWs, we use the "len" variable as the
number of CCWs to convert to Format-1. But that variable
contains zero here, and is not a meaningful CCW count until
ccwchain_calc_length() returns. Since that routine requires and
expects Format-1 CCWs to identify the chaining behavior, the
format conversion must be done first.
Convert the 2KB we copied even if it's more than we need.
Fixes: 7f8e89a8f2fd ("vfio-ccw: Factor out the ccw0-to-ccw1 transition")
Reported-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190702180928.18113-1-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
We register a AP PQAP instruction hook during the open
of the mediated device. And unregister it on release.
During the probe of the AP device, we allocate a vfio_ap_queue
structure to keep track of the information we need for the
PQAP/AQIC instruction interception.
In the AP PQAP instruction hook, if we receive a demand to
enable IRQs,
- we retrieve the vfio_ap_queue based on the APQN we receive
in REG1,
- we retrieve the page of the guest address, (NIB), from
register REG2
- we retrieve the mediated device to use the VFIO pinning
infrastructure to pin the page of the guest address,
- we retrieve the pointer to KVM to register the guest ISC
and retrieve the host ISC
- finaly we activate GISA
If we receive a demand to disable IRQs,
- we deactivate GISA
- unregister from the GIB
- unpin the NIB
When removing the AP device from the driver the device is
reseted and this process unregisters the GISA from the GIB,
and unpins the NIB address then we free the vfio_ap_queue
structure.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
To be able to use the VFIO interface to facilitate the
mediated device memory pinning/unpinning we need to register
a notifier for IOMMU.
While we will start to pin one guest page for the interrupt indicator
byte, this is still ok with ballooning as this page will never be
used by the guest virtio-balloon driver.
So the pinned page will never be freed. And even a broken guest does
so, that would not impact the host as the original page is still
in control by vfio.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
We prepare the interception of the PQAP/AQIC instruction for
the case the AQIC facility is enabled in the guest.
First of all we do not want to change existing behavior when
intercepting AP instructions without the SIE allowing the guest
to use AP instructions.
In this patch we only handle the AQIC interception allowed by
facility 65 which will be enabled when the complete interception
infrastructure will be present.
We add a callback inside the KVM arch structure for s390 for
a VFIO driver to handle a specific response to the PQAP
instruction with the AQIC command and only this command.
But we want to be able to return a correct answer to the guest
even there is no VFIO AP driver in the kernel.
Therefor, we inject the correct exceptions from inside KVM for the
case the callback is not initialized, which happens when the vfio_ap
driver is not loaded.
We do consider the responsibility of the driver to always initialize
the PQAP callback if it defines queues by initializing the CRYCB for
a guest.
If the callback has been setup we call it.
If not we setup an answer considering that no queue is available
for the guest when no callback has been setup.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Current code sets the dsci to 0x00000080. Which doesn't make any sense,
as the indicator area is located in the _left-most_ byte.
Worse: if the dsci is the _shared_ indicator, this potentially clears
the indication of activity for a _different_ device.
tiqdio_thinint_handler() will then have no reason to call that device's
IRQ handler, and the device ends up stalling.
Fixes: d0c9d4a89fff ("[S390] qdio: set correct bit in dsci")
Cc: <stable@vger.kernel.org>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
When tiqdio_remove_input_queues() removes a queue from the tiq_list as
part of qdio_shutdown(), it doesn't re-initialize the queue's list entry
and the prev/next pointers go stale.
If a subsequent qdio_establish() fails while sending the ESTABLISH cmd,
it calls qdio_shutdown() again in QDIO_IRQ_STATE_ERR state and
tiqdio_remove_input_queues() will attempt to remove the queue entry a
second time. This dereferences the stale pointers, and bad things ensue.
Fix this by re-initializing the list entry after removing it from the
list.
For good practice also initialize the list entry when the queue is first
allocated, and remove the quirky checks that papered over this omission.
Note that prior to
commit e521813468f7 ("s390/qdio: fix access to uninitialized qdio_q fields"),
these checks were bogus anyway.
setup_queues_misc() clears the whole queue struct, and thus needs to
re-init the prev/next pointers as well.
Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
The "len" variable is the length of the option up to the next option or
to the end of the string which ever first. We want to print the invalid
option so we want precision "%.*s" but the format is width "%*s" so it
prints up to the end of the string.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Sometimes, we want to control which of the matching drivers
binds to a subchannel device (e.g. for subchannels we want to
handle via vfio-ccw).
For pci devices, a mechanism to do so has been introduced in
782a985d7af2 ("PCI: Introduce new device binding path using
pci_dev.driver_override"). It makes sense to introduce the
driver_override attribute for subchannel devices as well, so
that we can easily extend the 'driverctl' tool (which makes
use of the driver_override attribute for pci).
Note that unlike pci we still require a driver override to
match the subchannel type; matching more than one subchannel
type is probably not useful anyway.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Reported by sparse.
Fixes: 7f8e89a8f2fd ("vfio-ccw: Factor out the ccw0-to-ccw1 transition")
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190624090721.16241-1-cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
Refactoring of the vfio-ccw cp handling, simplifying the
code and avoiding unneeded allocating/copying.
* tag 'vfio-ccw-20190621' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw:
vfio-ccw: Remove copy_ccw_from_iova()
vfio-ccw: Factor out the ccw0-to-ccw1 transition
vfio-ccw: Copy CCW data outside length calculation
vfio-ccw: Skip second copy of guest cp to host
vfio-ccw: Move guest_cp storage into common struct
s390/cio: Combine direct and indirect CCW paths
vfio-ccw: Rearrange IDAL allocation in direct CCW
vfio-ccw: Remove pfn_array_table
vfio-ccw: Adjust the first IDAW outside of the nested loops
vfio-ccw: Rearrange pfn_array and pfn_array_table arrays
s390/cio: Use generalized CCW handler in cp_init()
s390/cio: Generalize the TIC handler
s390/cio: Refactor the routine that handles TIC CCWs
s390/cio: Squash cp_free() and cp_unpin_free()
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Just to keep things tidy.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-6-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
This is a really useful function, but it's buried in the
copy_ccw_from_iova() routine so that ccwchain_calc_length()
can just work with Format-1 CCWs while doing its counting.
But it means we're translating a full 2K of "CCWs" to Format-1,
when in reality there's probably far fewer in that space.
Let's factor it out, so maybe we can do something with it later.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-5-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
It doesn't make much sense to "hide" the copy to the channel_program
struct inside a routine that calculates the length of the chain.
Let's move it to the calling routine, which will later copy from
channel_program to the memory it allocated itself.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-4-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
We already pinned/copied/unpinned 2K (256 CCWs) of guest memory
to the host space anchored off vfio_ccw_private. There's no need
to do that again once we have the length calculated, when we could
just copy the section we need to the "permanent" space for the I/O.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-3-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Rather than allocating/freeing a piece of memory every time
we try to figure out how long a CCW chain is, let's use a piece
of memory allocated for each device.
The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect
the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ
event that accesses/uses this space, so there should be no race
concerns with another CPU attempting an (unexpected) SSCH for the
same device.
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190618202352.39702-2-farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
This allows device drivers (eg. qeth) to use the struct when processing
information retrieved via RCD.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
This feature has never been used, so remove it.
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
With both the direct-addressed and indirect-addressed CCW paths
simplified to this point, the amount of shared code between them is
(hopefully) more easily visible. Move the processing of IDA-specific
bits into the direct-addressed path, and add some useful commentary of
what the individual pieces are doing. This allows us to remove the
entire ccwchain_fetch_idal() routine and maintain a single function
for any non-TIC CCW.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-10-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
This is purely deck furniture, to help understand the merge of the
direct and indirect handlers.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-9-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Now that both CCW codepaths build this nested array:
ccwchain->pfn_array_table[1]->pfn_array[#idaws/#pages]
We can collapse this into simply:
ccwchain->pfn_array[#idaws/#pages]
Let's do that, so that we don't have to continually navigate two
nested arrays when the first array always has a count of one.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-8-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Now that pfn_array_table[] is always an array of 1, it seems silly to
check for the very first entry in an array in the middle of two nested
loops, since we know it'll only ever happen once.
Let's move this outside the loops to simplify things, even though
the "k" variable is still necessary.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-7-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
While processing a channel program, we currently have two nested
arrays that carry a slightly different structure. The direct CCW
path creates this:
ccwchain->pfn_array_table[1]->pfn_array[#pages]
while an IDA CCW creates:
ccwchain->pfn_array_table[#idaws]->pfn_array[1]
The distinction appears to state that each pfn_array_table entry
points to an array of contiguous pages, represented by a pfn_array,
um, array. Since the direct-addressed scenario can ONLY represent
contiguous pages, it makes the intermediate array necessary but
difficult to recognize. Meanwhile, since an IDAL can contain
non-contiguous pages and there is no logic in vfio-ccw to detect
adjacent IDAWs, it is the second array that is necessary but appearing
to be superfluous.
I am not aware of any documentation that states the pfn_array[] needs
to be of contiguous pages; it is just what the code does today.
I don't see any reason for this either, let's just flip the IDA
codepath around so that it generates:
ch_pat->pfn_array_table[1]->pfn_array[#idaws]
This will bring it in line with the direct-addressed codepath,
so that we can understand the behavior of this memory regardless
of what type of CCW is being processed. And it means the casual
observer does not need to know/care whether the pfn_array[]
represents contiguous pages or not.
NB: The existing vfio-ccw code only supports 4K-block Format-2 IDAs,
so that "#pages" == "#idaws" in this area. This means that we will
have difficulty with this overlap in terminology if support for
Format-1 or 2K-block Format-2 IDAs is ever added. I don't think that
this patch changes our ability to make that distinction.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-6-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
It is now pretty apparent that ccwchain_handle_ccw()
(nee ccwchain_handle_tic()) does everything that cp_init()
wants to do.
Let's remove that duplicated code from cp_init() and let
ccwchain_handle_ccw() handle it itself.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-5-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Refactor ccwchain_handle_tic() into a routine that handles a channel
program address (which itself is a CCW pointer), rather than a CCW pointer
that is only a TIC CCW. This will make it easier to reuse this code for
other CCW commands.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-4-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Extract the "does the target of this TIC already exist?" check from
ccwchain_handle_tic(), so that it's easier to refactor that function
into one that cp_init() is able to use.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-3-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The routine cp_free() does nothing but call cp_unpin_free(), and while
most places call cp_free() there is one caller of cp_unpin_free() used
when the cp is guaranteed to have not been marked initialized.
This seems like a dubious way to make a distinction, so let's combine
these routines and make cp_free() do all the work.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190606202831.44135-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The hypervisor needs to interact with the summary indicators, so these
need to be DMA memory as well (at least for protected virtualization
guests).
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Before virtio-ccw could get away with not using DMA API for the pieces of
memory it does ccw I/O with. With protected virtualization this has to
change, since the hypervisor needs to read and sometimes also write these
pieces of memory.
The hypervisor is supposed to poke the classic notifiers, if these are
used, out of band with regards to ccw I/O. So these need to be allocated
as DMA memory (which is shared memory for protected virtualization
guests).
Let us factor out everything from struct virtio_ccw_device that needs to
be DMA memory in a satellite that is allocated as such.
Note: The control blocks of I/O instructions do not need to be shared.
These are marshalled by the ultravisor.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
This will come in handy soon when we pull out the indicators from
virtio_ccw_device to a memory area that is shared with the hypervisor
(in particular for protected virtualization guests).
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
The flag AIRQ_IV_CACHELINE was recently added to airq_iv_create(). Let
us use it! We actually wanted the vector to span a cacheline all along.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because the hypervisor needs to write these bits.
Let us make sure we allocate DMA memory for the notifier bit vectors by
replacing the kmem_cache with a dma_cache and kalloc() with
cio_dma_zalloc().
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
As virtio-ccw devices are channel devices, we need to use the
dma area within the common I/O layer for any communication with
the hypervisor.
Note that we do not need to use that area for control blocks
directly referenced by instructions, e.g. the orb.
It handles neither QDIO in the common code, nor any device type specific
stuff (like channel programs constructed by the DASD driver).
An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
To support protected virtualization cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.
Let us introduce one global pool for cio.
Our DMA pools are implemented as a gen_pool backed with DMA pages. The
idea is to avoid each allocation effectively wasting a page, as we
typically allocate much less than PAGE_SIZE.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
systemd-modules-load.service automatically tries to load the pkey module
on systems that have MSA.
Pkey also requires the MSA3 facility and a bunch of subfunctions.
Failing with -EOPNOTSUPP makes "systemd-modules-load.service" fail on
any system that does not have all needed subfunctions. For example,
when running under QEMU TCG (but also on systems where protected keys
are disabled via the HMC).
Let's use -ENODEV, so systemd-modules-load.service properly ignores
failing to load the pkey module because of missing HW functionality.
While at it, also convert the -EOPNOTSUPP in pkey_clr2protkey() to -ENODEV.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Free the vfio_ccw_cmd_region on module exit.
Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions")
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <c0f39039d28af39ea2939391bf005e3495d890fd.1559576250.git.alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Convert all text files with s390 documentation to ReST format.
Tried to preserve as much as possible the original document
format. Still, some of the files required some work in order
for it to be visible on both plain text and after converted
to html.
The conversion is actually:
- add blank lines and identation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Pull networking fixes from David Miller:
1) Free AF_PACKET po->rollover properly, from Willem de Bruijn.
2) Read SFP eeprom in max 16 byte increments to avoid problems with
some SFP modules, from Russell King.
3) Fix UDP socket lookup wrt. VRF, from Tim Beale.
4) Handle route invalidation properly in s390 qeth driver, from Julian
Wiedmann.
5) Memory leak on unload in RDS, from Zhu Yanjun.
6) sctp_process_init leak, from Neil HOrman.
7) Fix fib_rules rule insertion semantic change that broke Android,
from Hangbin Liu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
pktgen: do not sleep with the thread lock held.
net: mvpp2: Use strscpy to handle stat strings
net: rds: fix memory leak in rds_ib_flush_mr_pool
ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
ipv6: use READ_ONCE() for inet->hdrincl as in ipv4
Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
net: aquantia: fix wol configuration not applied sometimes
ethtool: fix potential userspace buffer overflow
Fix memory leak in sctp_process_init
net: rds: fix memory leak when unload rds_rdma
ipv6: fix the check before getting the cookie in rt6_get_cookie
ipv4: not do cache for local delivery if bc_forwarding is enabled
s390/qeth: handle error when updating TX queue count
s390/qeth: fix VLAN attribute in bridge_hostnotify udev event
s390/qeth: check dst entry before use
s390/qeth: handle limited IPv4 broadcast in L3 TX path
net: fix indirect calls helpers for ptype list hooks.
net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set
udp: only choose unbound UDP socket for multicast when not in a VRF
net/tls: replace the sleeping lock around RX resync with a bit lock
...
|
|
When a CQ-enabled device uses QEBSM for SBAL state inspection,
get_buf_states() can return the PENDING state for an Output Queue.
get_outbound_buffer_frontier() isn't prepared for this, and any PENDING
buffer will permanently stall all further completion processing on this
Queue.
This isn't a concern for non-QEBSM devices, as get_buf_states() for such
devices will manually turn PENDING buffers into EMPTY ones.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Add missing parameter description to fix the following warning:
drivers/s390/cio/qdio_thinint.c:183: warning:
Function parameter or member 'floating' not described in 'tiqdio_thinint_handler'
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
Within an EP11 cprb there exists a byte field flags. Bit 0x20
of this field indicates a special cprb. A special cprb triggers
special handling in the firmware below the OS layer.
However, a special cprb also needs to have the S bit in GPR0
set when NQAP is called. This was not the case for EP11 cprbs
and this patch now introduces the code to support this.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
netif_set_real_num_tx_queues() can return an error, deal with it.
Fixes: 73dc2daf110f ("s390/qeth: add TX multiqueue support for OSA devices")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enabling sysfs attribute bridge_hostnotify triggers a series of udev events
for the MAC addresses of all currently connected peers. In case no VLAN is
set for a peer, the device reports the corresponding MAC addresses with
VLAN ID 4096. This currently results in attribute VLAN=4096 for all
non-VLAN interfaces in the initial series of events after host-notify is
enabled.
Instead, no VLAN attribute should be reported in the udev event for
non-VLAN interfaces.
Only the initial events face this issue. For dynamic changes that are
reported later, the device uses a validity flag.
This also changes the code so that it now sets the VLAN attribute for
MAC addresses with VID 0. On Linux, no qeth interface will ever be
registered with VID 0: Linux kernel registers VID 0 on all network
interfaces initially, but qeth will drop .ndo_vlan_rx_add_vid for VID 0.
Peers with other OSs could register MACs with VID 0.
Fixes: 9f48b9db9a22 ("qeth: bridgeport support - address notifications")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While qeth_l3 uses netif_keep_dst() to hold onto the dst, a skb's dst
may still have been obsoleted (via dst_dev_put()) by the time that we
end up using it. The dst then points to the loopback interface, which
means the neighbour lookup in qeth_l3_get_cast_type() determines a bogus
cast type of RTN_BROADCAST.
For IQD interfaces this causes us to place such skbs on the wrong
HW queue, resulting in TX errors.
Fix-up the various call sites to first validate the dst entry with
dst_check(), and fall back accordingly.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When selecting the cast type of a neighbourless IPv4 skb (eg. on a raw
socket), qeth_l3 falls back to the packet's destination IP address.
For this case we should classify traffic sent to 255.255.255.255 as
broadcast.
This fixes DHCP requests, which were misclassified as unicast
(and for IQD interfaces thus ended up on the wrong HW queue).
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
various vfio-ccw fixes (ccw translation, state machine)
|
|
Formatting of Kconfig files doesn't look so pretty, so just
take damp cloth and clean it up.
Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|
If the CCW being processed is a No-Operation, then by definition no
data is being transferred. Let's fold those checks into the normal
CCW processors, rather than skipping out early.
Likewise, if the CCW being processed is a "test" (a category defined
here as an opcode that contains zero in the lowest four bits) then no
special processing is necessary as far as vfio-ccw is concerned.
These command codes have not been valid since the S/370 days, meaning
they are invalid in the same way as one that ends in an eight [1] or
an otherwise valid command code that is undefined for the device type
in question. Considering that, let's just process "test" CCWs like
any other CCW, and send everything to the hardware.
[1] POPS states that a x08 is a TIC CCW, and that having any high-order
bits enabled is invalid for format-1 CCWs. For format-0 CCWs, the
high-order bits are ignored.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-4-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
It is possible that a guest might issue a CCW with a length of zero,
and will expect a particular response. Consider this chain:
Address Format-1 CCW
-------- -----------------
0 33110EC0 346022CC 33177468
1 33110EC8 CF200000 3318300C
CCW[0] moves a little more than two pages, but also has the
Suppress Length Indication (SLI) bit set to handle the expectation
that considerably less data will be moved. CCW[1] also has the SLI
bit set, and has a length of zero. Once vfio-ccw does its magic,
the kernel issues a start subchannel on behalf of the guest with this:
Address Format-1 CCW
-------- -----------------
0 021EDED0 346422CC 021F0000
1 021EDED8 CF240000 3318300C
Both CCWs were converted to an IDAL and have the corresponding flags
set (which is by design), but only the address of the first data
address is converted to something the host is aware of. The second
CCW still has the address used by the guest, which happens to be (A)
(probably) an invalid address for the host, and (B) an invalid IDAW
address (doubleword boundary, etc.).
While the I/O fails, it doesn't fail correctly. In this example, we
would receive a program check for an invalid IDAW address, instead of
a unit check for an invalid command.
To fix this, revert commit 4cebc5d6a6ff ("vfio: ccw: validate the
count field of a ccw before pinning") and allow the individual fetch
routines to process them like anything else. We'll make a slight
adjustment to our allocation of the pfn_array (for direct CCWs) or
IDAL (for IDAL CCWs) memory, so that we have room for at least one
address even though no guest memory will be pinned and thus the
IDAW will not be populated with a host address.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-3-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The skip flag of a CCW offers the possibility of data not being
transferred, but is only meaningful for certain commands.
Specifically, it is only applicable for a read, read backward, sense,
or sense ID CCW and will be ignored for any other command code
(SA22-7832-11 page 15-64, and figure 15-30 on page 15-75).
(A sense ID is xE4, while a sense is x04 with possible modifiers in the
upper four bits. So we will cover the whole "family" of sense CCWs.)
For those scenarios, since there is no requirement for the target
address to be valid, we should skip the call to vfio_pin_pages() and
rely on the IDAL address we have allocated/built for the channel
program. The fact that the individual IDAWs within the IDAL are
invalid is fine, since they aren't actually checked in these cases.
Set pa_nr to zero when skipping the pfn_array_pin() call, since it is
defined as the number of pages pinned and is used to determine
whether to call vfio_unpin_pages() upon cleanup.
The pfn_array_pin() routine returns the number of pages that were
pinned, but now might be skipped for some CCWs. Thus we need to
calculate the expected number of pages ourselves such that we are
guaranteed to allocate a reasonable number of IDAWs, which will
provide a valid address in CCW.CDA regardless of whether the IDAWs
are filled in with pinned/translated addresses or not.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20190516161403.79053-2-farman@linux.ibm.com>
Acked-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|