Age | Commit message (Collapse) | Author |
|
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This avoids fallbacks to explicit zeroing in (__)blkdev_issue_zeroout if
the caller doesn't want them.
Also clean up the convoluted check for the return condition that this
new flag is added to.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
If this flag is set logical provisioning capable device should
release space for the zeroed blocks if possible, if it is not set
devices should keep the blocks anchored.
Also remove an out of sync kerneldoc comment for a static function
that would have become even more out of data with this change.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Turn the existing discard flag into a new BLKDEV_ZERO_UNMAP flag with
similar semantics, but without referring to diѕcard.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Copy & paste from the REQ_OP_WRITE_SAME code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Make life easy for implementations that needs to send a data buffer
to the device (e.g. SCSI) by numbering it as a data out command.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Schedulers need to be informed when a hardware queue is added or removed
at runtime so they can allocate/free per-hardware queue data. So,
replace the blk_mq_sched_init_hctx_data() helper, which only makes sense
at init time, with .init_hctx() and .exit_hctx() hooks.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We've added a considerable amount of fixes for stalls and issues
with the blk-mq scheduling in the 4.11 series since forking
off the for-4.12/block branch. We need to do improvements on
top of that for 4.12, so pull in the previous fixes to make
our lives easier going forward.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
To improve scalability, if hardware queues are shared, restart
a single hardware queue in round-robin fashion. Rename
blk_mq_sched_restart_queues() to reflect the new semantics.
Remove blk_mq_sched_mark_restart_queue() because this function
has no callers. Remove flag QUEUE_FLAG_RESTART because this
patch removes the code that uses this flag.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Introduce a function that runs a hardware queue unconditionally
after a delay. Note: there is already a function that stops and
restarts a hardware queue after a delay, namely blk_mq_delay_queue().
This function will be used in the next patch in this series.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Long Li <longli@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently only dm and md/raid5 bios trigger
trace_block_bio_complete(). Now that we have bio_chain() and
bio_inc_remaining(), it is not possible, in general, for a driver to
know when the bio is really complete. Only bio_endio() knows that.
So move the trace_block_bio_complete() call to bio_endio().
Now trace_block_bio_complete() pairs with trace_block_bio_queue().
Any bio for which a 'queue' event is traced, will subsequently
generate a 'complete' event.
There are a few cases where completion tracing is not wanted.
1/ If blk_update_request() has already generated a completion
trace event at the 'request' level, there is no point generating
one at the bio level too. In this case the bi_sector and bi_size
will have changed, so the bio level event would be wrong
2/ If the bio hasn't actually been queued yet, but is being aborted
early, then a trace event could be confusing. Some filesystems
call bio_endio() but do not want tracing.
3/ The bio_integrity code interposes itself by replacing bi_end_io,
then restoring it and calling bio_endio() again. This would produce
two identical trace events if left like that.
To handle these, we introduce a flag BIO_TRACE_COMPLETION and only
produce the trace event when this is set.
We address point 1 above by clearing the flag in blk_update_request().
We address point 2 above by only setting the flag when
generic_make_request() is called.
We address point 3 above by clearing the flag after generating a
completion event.
When bio_split() is used on a bio, particularly in blk_queue_split(),
there is an extra complication. A new bio is split off the front, and
may be handle directly without going through generic_make_request().
The old bio, which has been advanced, is passed to
generic_make_request(), so it will trigger a trace event a second
time.
Probably the best result when a split happens is to see a single
'queue' event for the whole bio, then multiple 'complete' events - one
for each component. To achieve this was can:
- copy the BIO_TRACE_COMPLETION flag to the new bio in bio_split()
- avoid generating a 'queue' event if BIO_TRACE_COMPLETION is already set.
This way, the split-off bio won't create a queue event, the original
won't either even if it re-submitted to generic_make_request(),
but both will produce completion events, each for their own range.
So if generic_make_request() is called (which generates a QUEUED
event), then bi_endio() will create a single COMPLETE event for each
range that the bio is split into, unless the driver has explicitly
requested it not to.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The comment for the 'flags' field of 'bio' mentions
"command" which is no longer stored there, and doesn't
mention the bvec pool number, which is.
BIO_RESET_BITS is set in such a way that it would need to be
updated if new bits were added, which is easy to miss.
BVEC_POOL_BITS is larger than needed. The BVEC_POOL_IDX()
ranges from 0 to 6, so 3 bits are sufficient.
This patch make improvements in each of these areas.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall
back to the original scheduler. However, at this point, we've already
torn down the original scheduler's tags, so this causes a crash. Doing
the fallback like the legacy elevator path is much harder for mq, so fix
it by just falling back to none, instead.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
After commit 64c7f1d1572c, we went from 1 to 2 holes in my
test setup. If we move the timeout field a bit, we remove
both of those holes and shrink struct request by 8 bytes.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Instead of bloating the generic struct request with it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The block layer core sets blk_mq_queue_data.list but no block
drivers read that member. Hence remove it and also the code that
is used to set this member.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
As Dan Carpenter pointed out: mixing 16-bit nvme status with 32-bit
error status from driver. Corrected comment on fcp request struct
status field, and converted done routine to explicitly set nvme status
codes for nvme status.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Update FC-NVME definitions to match FC-NVME r1.14 (16-020vB) plus
change voted in by 2/22 FC-NVME Adhoc (see HOSTID below).
Includes the following:
- Addition of "status_code" field to ERSP IU
- Addition of FC-NVME LS RJT reason_codes and reason_explanations
- CreateAssociation payload, HostID field shortened to 16 bytes
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Several locations in the stack need to handle ipv4/ipv6
(with scope) and port strings conversion to sockaddr.
Add a helper that takes either AF_INET, AF_INET6 or
AF_UNSPEC (for wildcard) to centralize this handling.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The enum values for QPTYPE, PRTYPE and CMS are off by 1 from the
values defined in figure 42 of the NVM Express over Fabrics 1.0:
http://www.nvmexpress.org/wp-content/uploads/NVMe_over_Fabrics_1_0_Gold_20160605-1.pdf
Fix our enums to match the final spec.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
|
I inadvertently applied the v5 version of this patch, whereas
the agreed upon version was v5. Revert this one so we can apply
the right one.
This reverts commit 7fc6b87a9ff537e7df32b1278118ce9c5bcd6788.
|
|
Now that the remaining drivers have been converted to one request queue
per gendisk, let's warn if a request queue gets registered more than
once. This will catch future drivers which might do it inadvertently or
any old drivers that I may have missed.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
As the .q_usage_counter is used by both legacy and
mq path, we need to block new I/O if queue becomes
dead in blk_queue_enter().
So rename it and we can use this function in both
paths.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
blkg_conf_prep() currently calls blkg_lookup_create() while holding
request queue spinlock. This means allocating memory for struct
blkcg_gq has to be made non-blocking. This causes occasional -ENOMEM
failures in call paths like below:
pcpu_alloc+0x68f/0x710
__alloc_percpu_gfp+0xd/0x10
__percpu_counter_init+0x55/0xc0
cfq_pd_alloc+0x3b2/0x4e0
blkg_alloc+0x187/0x230
blkg_create+0x489/0x670
blkg_lookup_create+0x9a/0x230
blkg_conf_prep+0x1fb/0x240
__cfqg_set_weight_device.isra.105+0x5c/0x180
cfq_set_weight_on_dfl+0x69/0xc0
cgroup_file_write+0x39/0x1c0
kernfs_fop_write+0x13f/0x1d0
__vfs_write+0x23/0x120
vfs_write+0xc2/0x1f0
SyS_write+0x44/0xb0
entry_SYSCALL_64_fastpath+0x18/0xad
In the code path above, percpu allocator cannot call vmalloc() due to
queue spinlock.
A failure in this call path gives grief to tools which are trying to
configure io weights. We see occasional failures happen shortly after
reboots even when system is not under any memory pressure. Machines
with a lot of cpus are more vulnerable to this condition.
Update blkg_create() function to temporarily drop the rcu and queue
locks when it is allowed by gfp mask.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Pull KVM fixes from Paolo Bonzini:
"All x86-specific, apart from some arch-independent syzkaller fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: cleanup the page tracking SRCU instance
KVM: nVMX: fix nested EPT detection
KVM: pci-assign: do not map smm memory slot pages in vt-d page tables
KVM: kvm_io_bus_unregister_dev() should never fail
KVM: VMX: Fix enable VPID conditions
KVM: nVMX: Fix nested VPID vmx exec control
KVM: x86: correct async page present tracepoint
kvm: vmx: Flush TLB when the APIC-access address changes
KVM: x86: use pic/ioapic destructor when destroy vm
KVM: x86: check existance before destroy
KVM: x86: clear bus pointer when destroyed
KVM: Documentation: document MCE ioctls
KVM: nVMX: don't reset kvm mmu twice
PTP: fix ptr_ret.cocci warnings
kvm: fix usage of uninit spinlock in avic_vm_destroy()
KVM: VMX: downgrade warning on unexpected exit code
|
|
User configures latency target, but the latency threshold for each
request size isn't fixed. For a SSD, the IO latency highly depends on
request size. To calculate latency threshold, we sample some data, eg,
average latency for request size 4k, 8k, 16k, 32k .. 1M. The latency
threshold of each request size will be the sample latency (I'll call it
base latency) plus latency target. For example, the base latency for
request size 4k is 80us and user configures latency target 60us. The 4k
latency threshold will be 80 + 60 = 140us.
To sample data, we calculate the order base 2 of rounded up IO sectors.
If the IO size is bigger than 1M, it will be accounted as 1M. Since the
calculation does round up, the base latency will be slightly smaller
than actual value. Also if there isn't any IO dispatched for a specific
IO size, we will use the base latency of smaller IO size for this IO
size.
But we shouldn't sample data at any time. The base latency is supposed
to be latency where disk isn't congested, because we use latency
threshold to schedule IOs between cgroups. If disk is congested, the
latency is higher, using it for scheduling is meaningless. Hence we only
do the sampling when block throttling is in the LOW limit, with
assumption disk isn't congested in such state. If the assumption isn't
true, eg, low limit is too high, calculated latency threshold will be
higher.
Hard disk is completely different. Latency depends on spindle seek
instead of request size. Currently this feature is SSD only, we probably
can use a fixed threshold like 4ms for hard disk though.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently there is no way to know the request size when the request is
finished. Next patch will need this info. We could add extra field to
record the size, but blk_issue_stat has enough space to record it, so
this patch just overloads blk_issue_stat. With this, we will have 49bits
to track time, which still is very long time.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
A cgroup gets assigned a low limit, but the cgroup could never dispatch
enough IO to cross the low limit. In such case, the queue state machine
will remain in LIMIT_LOW state and all other cgroups will be throttled
according to low limit. This is unfair for other cgroups. We should
treat the cgroup idle and upgrade the state machine to lower state.
We also have a downgrade logic. If the state machine upgrades because of
cgroup idle (real idle), the state machine will downgrade soon as the
cgroup is below its low limit. This isn't what we want. A more
complicated case is cgroup isn't idle when queue is in LIMIT_LOW. But
when queue gets upgraded to lower state, other cgroups could dispatch
more IO and this cgroup can't dispatch enough IO, so the cgroup is below
its low limit and looks like idle (fake idle). In this case, the queue
should downgrade soon. The key to determine if we should do downgrade is
to detect if cgroup is truely idle.
Unfortunately it's very hard to determine if a cgroup is real idle. This
patch uses the 'think time check' idea from CFQ for the purpose. Please
note, the idea doesn't work for all workloads. For example, a workload
with io depth 8 has disk utilization 100%, hence think time is 0, eg,
not idle. But the workload can run higher bandwidth with io depth 16.
Compared to io depth 16, the io depth 8 workload is idle. We use the
idea to roughly determine if a cgroup is idle.
We treat a cgroup idle if its think time is above a threshold (by
default 1ms for SSD and 100ms for HD). The idea is think time above the
threshold will start to harm performance. HD is much slower so a longer
think time is ok.
The patch (and the latter patches) uses 'unsigned long' to track time.
We convert 'ns' to 'us' with 'ns >> 10'. This is fast but loses
precision, should not a big deal.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"A smattering of different small fixes for some random driver
subsystems. Nothing all that major, just resolutions for reported
issues and bugs.
All have been in linux-next with no reported issues"
* tag 'char-misc-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
extcon: int3496: Set the id pin to direction-input if necessary
extcon: int3496: Use gpiod_get instead of gpiod_get_index
extcon: int3496: Add dependency on X86 as it's Intel specific
extcon: int3496: Add GPIO ACPI mapping table
extcon: int3496: Rename GPIO pins in accordance with binding
vmw_vmci: handle the return value from pci_alloc_irq_vectors correctly
ppdev: fix registering same device name
parport: fix attempt to write duplicate procfiles
auxdisplay: img-ascii-lcd: add missing sentinel entry in img_ascii_lcd_matches
Drivers: hv: vmbus: Don't leak memory when a channel is rescinded
Drivers: hv: vmbus: Don't leak channel ids
Drivers: hv: util: don't forget to init host_ts.lock
Drivers: hv: util: move waiting for release to hv_utils_transport itself
vmbus: remove hv_event_tasklet_disable/enable
vmbus: use rcu for per-cpu channel list
mei: don't wait for os version message reply
mei: fix deadlock on mei reset
intel_th: pci: Add Gemini Lake support
intel_th: pci: Add Denverton SOC support
intel_th: Don't leak module refcount on failure to activate
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Greg KH:
"Here are some small IIO driver fixes for 4.11-rc4 that resolve a
number of tiny reported issues. All of these have been in linux-next
for a while with no reported issues"
* tag 'staging-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: imu: st_lsm6dsx: fix FIFO_CTRL2 overwrite during watermark configuration
iio: adc: ti_am335x_adc: fix fifo overrun recovery
iio: sw-device: Fix config group initialization
iio: magnetometer: ak8974: remove incorrect __exit markups
iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.11-rc4.
Nothing major here, just an bunch of small fixes, and a handfull of
good fixes from Johan for devices with crazy descriptors. There are a
few new device ids in here as well.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits)
usb: gadget: f_hid: fix: Don't access hidg->req without spinlock held
usb: gadget: udc: remove pointer dereference after free
usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed
usb: gadget: f_uvc: Fix SuperSpeed companion descriptor's wBytesPerInterval
usb: gadget: acm: fix endianness in notifications
usb: dwc3: gadget: delay unmap of bounced requests
USB: serial: qcserial: add Dell DW5811e
usb: hub: Fix crash after failure to read BOS descriptor
ACM gadget: fix endianness in notifications
USB: usbtmc: fix probe error path
USB: usbtmc: add missing endpoint sanity check
USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems
usb: musb: fix possible spinlock deadlock
usb: musb: dsps: fix iounmap in error and exit paths
usb: musb: cppi41: don't check early-TX-interrupt for Isoch transfer
usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk
uwb: i1480-dfu: fix NULL-deref at probe
uwb: hwa-rc: fix NULL-deref at probe
USB: wusbcore: fix NULL-deref at probe
USB: uss720: fix NULL-deref at probe
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt
Pull fscrypto fixes from Ted Ts'o:
"A code cleanup and bugfix for fs/crypto"
* tag 'fscrypt-for-linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: eliminate ->prepare_context() operation
fscrypt: remove broken support for detecting keyring key revocation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- bug fixes in asus_atk0110, it87 and max31790 drivers
- added missing API definition to hwmon core
* tag 'hwmon-for-linus-v4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (asus_atk0110) fix uninitialized data access
hwmon: Add missing HWMON_T_ALARM
hwmon: (it87) Avoid registering the same chip on both SIO addresses
hwmon: (max31790) Set correct PWM value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"This has been a slow -rc cycle for the RDMA subsystem. We really
haven't had a lot of rc fixes come in. This pull request is the first
of this entire rc cycle and it has all of the suitable fixes so far
and it's still only about 20 patches. The fix for the minor breakage
cause by the dma mapping patchset is in here, as well as a couple
other potential oops fixes, but the rest is more minor.
Summary:
- fix for dma_ops change in this kernel, resolving the s390, powerpc,
and IOMMU operation
- a few other oops fixes
- the rest are all minor fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/qib: fix false-postive maybe-uninitialized warning
RDMA/iser: Fix possible mr leak on device removal event
IB/device: Convert ib-comp-wq to be CPU-bound
IB/cq: Don't process more than the given budget
IB/rxe: increment msn only when completing a request
uapi: fix rdma/mlx5-abi.h userspace compilation errors
IB/core: Restore I/O MMU, s390 and powerpc support
IB/rxe: Update documentation link
RDMA/ocrdma: fix a type issue in ocrdma_put_pd_num()
IB/rxe: double free on error
RDMA/vmw_pvrdma: Activate device on ethernet link up
RDMA/vmw_pvrdma: Dont hardcode QP header page
RDMA/vmw_pvrdma: Cleanup unused variables
infiniband: Fix alignment of mmap cookies to support VIPT caching
IB/core: Protect against self-requeue of a cq work item
i40iw: Receive netdev events post INET_NOTIFIER state
|
|
blk_integrity_profile's are never modified, so mark them 'const' so that
they are placed in .rodata and benefit from memory protection.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Consistently use types from linux/types.h to fix the following
rdma/mlx5-abi.h userspace compilation errors:
/usr/include/rdma/mlx5-abi.h:69:25: error: 'u64' undeclared here (not in a function)
MLX5_LIB_CAP_4K_UAR = (u64)1 << 0,
/usr/include/rdma/mlx5-abi.h:69:29: error: expected ',' or '}' before numeric constant
MLX5_LIB_CAP_4K_UAR = (u64)1 << 0,
Include <linux/if_ether.h> to fix the following rdma/mlx5-abi.h
userspace compilation error:
/usr/include/rdma/mlx5-abi.h:286:12: error: 'ETH_ALEN' undeclared here (not in a function)
__u8 dmac[ETH_ALEN];
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Avoid that the following error message is reported on the console
while loading an RDMA driver with I/O MMU support enabled:
DMAR: Allocating domain for mlx5_0 failed
Ensure that DMA mapping operations that use to_pci_dev() to
access to struct pci_dev see the correct PCI device. E.g. the s390
and powerpc DMA mapping operations use to_pci_dev() even with I/O
MMU support disabled.
This patch preserves the following changes of the DMA mapping updates
patch series:
- Introduction of dma_virt_ops.
- Removal of ib_device.dma_ops.
- Removal of struct ib_dma_mapping_ops.
- Removal of an if-statement from each ib_dma_*() operation.
- IB HW drivers no longer set dma_device directly.
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Fixes: commit 99db9494035f ("IB/core: Remove ib_device.dma_device")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: parav@mellanox.com
Tested-by: parav@mellanox.com
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
- a couple of OMAP 4.11 regression fixes, including a boot regression
for SmartReflex, hypervisor mode in thumb2 mode, and reference
counting of device nodes
- a fix for cpu_idle on at91
- minor DT fixes on across several platforms: sunxi, bcm53xx, at91,
nsp, ns2, ux500, omap
- a fix to correct an API change in the reset controllers
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (22 commits)
arm64: dts: NS2: Add dma-coherent to relevant DT entries
reset: fix optional reset_control_get stubs to return NULL
ARM: sun8i: a23/a33: drop bl_en_pin GPIO pinmux in reference design DTSI
ARM: dts: sun7i: lamobo-r1: Fix CPU port RGMII settings
ARM: dts: NSP: GPIO reboot open-source
ARM: at91: pm: cpu_idle: switch DDR to power-down mode
ARM: dts: add the AB8500 clocks to the device tree
ARM: dts: imx6sx-udoo-neo: Fix reboot hang
ARM: sun8i: Fix the mali clock rate
ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
ARM: dts: BCM5301X: Fix memory start address
ARM: dts: BCM5301X: Fix UARTs on bcm953012k
Revert "ARM: at91/dt: sama5d2: Use new compatible for ohci node"
ARM: OMAP2+: Release device node after it is no longer needed.
ARM: OMAP2+: Fix device node reference counts
ARM: OMAP2+: Remove legacy gpmc-nand.c
ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure
ARM: dts: am335x-pcm953: Fix legacy wakeup source binding
ARM: omap2plus_defconfig: Enable INPUT_MOUSEDEV as loadable modules
ARM: dts: am57xx-idk: tpic2810 is on I2C bus, not SPI
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's a kaslr fix and then two patches to update our native and
compat syscall tables. Arnd asked that we take the addition of statx
to the asm-generic unistd.h via arm64, as he didn't have anything
queued in the asm-generic tree.
Summary:
- Fix mapping of kernel image under certain kaslr offsets
- Hook up new statx syscall in asm-generic syscall table
- Update compat syscall table to match arch/arm/ (pkeys and statx)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kaslr: Fix up the kernel image alignment
arm64: compat: Update compat syscalls
generic syscalls: Wire up statx syscall
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes regressions in the crypto ccp driver and the hwrng drivers
for amd and geode"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: geode - Revert managed API changes
hwrng: amd - Revert managed API changes
crypto: ccp - Assign DMA commands to the channel's CCP
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"A few fixes piled up:
- fix a NULL-ptr dereference that happens in VT-d on some platforms
- a fix for ARM MSI region reporting, so that a sane interface makes
it to a released kernel
- fixes for leaf-checking in ARM io-page-table code
- two fixes for IO/TLB flushing code on ARM Exynos platforms"
* tag 'iommu-fixes-v4.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Disambiguate MSI region types
iommu/exynos: Workaround FLPD cache flush issues for SYSMMU v5
iommu/exynos: Block SYSMMU while invalidating FLPD cache
iommu/vt-d: Fix NULL pointer dereference in device_to_iommu
iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
|
|
git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
- one core drm/fbdev regression fix
- a set of i915 fixes including a few GVT related fixes, along with
some reset fixes
- one new PCI id for amdgpu, and some minor workaround regression
fixes
- .. and a set of exynos fixes, dropping support for an old unsupported
SoC, some vblank timing fixes, and an info leak fix
* tag 'drm-fixes-for-v4.11-rc4' of git://people.freedesktop.org/~airlied/linux: (34 commits)
drm/fb-helper: Allow var->x/yres(_virtual) < fb->width/height again
drm/i915: make context status notifier head be per engine
drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)
drm/exynos/dsi: make te-gpios optional
drm/exynos: Print kernel pointers in a restricted form
drm/exynos/decon5433: fix software trigger mask
drm/exynos/fimd: signal frame done interrupt at front porch
drm/exynos/decon5433: signal frame done interrupt at front porch
drm/exynos/decon5433: fix vblank event handling
drm/exynos: move crtc event handling to drivers callbacks
drm/exynos: Remove support for Exynos4415 (SoC not supported anymore)
drm/exynos/decon5433: & vs | typo
drm/amd/amdgpu: add POLARIS12 PCI ID
drm/i915/gvt: Fix gvt scheduler interval time
drm/i915/gvt: GVT pin/unpin shadow context
drm/i915/gvt: scan shadow indirect context image when valid
drm/i915/kvmgt: fix suspicious rcu dereference usage
drm/i915/gvt: add enable_execlists check before enable gvt
drm/i915/gvt: Remove bogus retry around i915_wait_request
drm/i915/gvt: correct the ggtt valid bit check in pipe control command
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
Just several fixups,
- fix page fault and vblank timeout issues due to delayed vblank handling.
- fix panel driver probing to fail without te-gpios property.
- fix potential security hole by using "%pK" format.
- fix wrong if statement condition.
And one cleanup which removes Exynos4415 SoC support which is not supported
anymore.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos/dsi: make te-gpios optional
drm/exynos: Print kernel pointers in a restricted form
drm/exynos/decon5433: fix software trigger mask
drm/exynos/fimd: signal frame done interrupt at front porch
drm/exynos/decon5433: signal frame done interrupt at front porch
drm/exynos/decon5433: fix vblank event handling
drm/exynos: move crtc event handling to drivers callbacks
drm/exynos: Remove support for Exynos4415 (SoC not supported anymore)
drm/exynos/decon5433: & vs | typo
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus
Felipe writes:
usb: fixes for v4.11-rc4
f_acm got an endianness fix by Oliver Neukum. This has been around for a
long time but it's finally fixed.
f_hid learned that it should never access hidg->req without first
grabbing the spinlock.
Roger Quadros fixed two bugs in the f_uvc function driver.
Janusz Dziedzic fixed a very peculiar bug with EP0, one that's rather
difficult to trigger. When we're dealing with bounced EP0 requests, we
should delay unmap until after ->complete() is called.
UDC class got a use-after-free fix.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Zygo tracked down a very old bug with inline compressed extents.
I didn't tag this one for stable because I want to do individual
tested backports. It's a little tricky and I'd rather do some extra
testing on it along the way"
* 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: add missing memset while reading compressed inline extents
Btrfs: fix regression in lock_delalloc_pages
btrfs: remove btrfs_err_str function from uapi/linux/btrfs.h
|
|
Pull networking fixes from David Miller:
1) Several netfilter fixes from Pablo and the crew:
- Handle fragmented packets properly in netfilter conntrack, from
Florian Westphal.
- Fix SCTP ICMP packet handling, from Ying Xue.
- Fix big-endian bug in nftables, from Liping Zhang.
- Fix alignment of fake conntrack entry, from Steven Rostedt.
2) Fix feature flags setting in fjes driver, from Taku Izumi.
3) Openvswitch ipv6 tunnel source address not set properly, from Or
Gerlitz.
4) Fix jumbo MTU handling in amd-xgbe driver, from Thomas Lendacky.
5) sk->sk_frag.page not released properly in some cases, from Eric
Dumazet.
6) Fix RTNL deadlocks in nl80211, from Johannes Berg.
7) Fix erroneous RTNL lockdep splat in crypto, from Herbert Xu.
8) Cure improper inflight handling during AF_UNIX GC, from Andrey
Ulanov.
9) sch_dsmark doesn't write to packet headers properly, from Eric
Dumazet.
10) Fix SCM_TIMESTAMPING_OPT_STATS handling in TCP, from Soheil Hassas
Yeganeh.
11) Add some IDs for Motorola qmi_wwan chips, from Tony Lindgren.
12) Fix nametbl deadlock in tipc, from Ying Xue.
13) GRO and LRO packets not counted correctly in mlx5 driver, from Gal
Pressman.
14) Fix reset of internal PHYs in bcmgenet, from Doug Berger.
15) Fix hashmap allocation handling, from Alexei Starovoitov.
16) nl_fib_input() needs stronger netlink message length checking, from
Eric Dumazet.
17) Fix double-free of sk->sk_filter during sock clone, from Daniel
Borkmann.
18) Fix RX checksum offloading in aquantia driver, from Pavel Belous.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
net:ethernet:aquantia: Fix for RX checksum offload.
amd-xgbe: Fix the ECC-related bit position definitions
sfc: cleanup a condition in efx_udp_tunnel_del()
Bluetooth: btqcomsmd: fix compile-test dependency
inet: frag: release spinlock before calling icmp_send()
tcp: initialize icsk_ack.lrcvtime at session start time
genetlink: fix counting regression on ctrl_dumpfamily()
socket, bpf: fix sk_filter use after free in sk_clone_lock
ipv4: provide stronger user input validation in nl_fib_input()
bpf: fix hashmap extra_elems logic
enic: update enic maintainers
net: bcmgenet: remove bcmgenet_internal_phy_setup()
ipv6: make sure to initialize sockc.tsflags before first use
fjes: Do not load fjes driver if extended socket device is not power on.
fjes: Do not load fjes driver if system does not have extended socket device.
net/mlx5e: Count LRO packets correctly
net/mlx5e: Count GSO packets correctly
net/mlx5: Increase number of max QPs in default profile
net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps
net/mlx5e: Use the proper UAPI values when offloading TC vlan actions
...
|
|
No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting at least used again, when the iobus gets teared down on
kvm_destroy_vm() - leading to use after free errors.
There is nothing the callers could do, except retrying over and over
again.
So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).
Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
Cc: stable@vger.kernel.org # 3.4+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There isn't a bug here, but Smatch is not smart enough to know that
"nr_iovecs" can't be negative so it complains about underflows.
Really, it's slightly cleaner to make this parameter unsigned.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This flag was never used since it was introduced.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Make the function available for outside use and fortify it against NULL
kobject.
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|