Age | Commit message | Author |
|
Not used any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397087/?series=83051&rev=1
|
|
It makes no difference to kmalloc if the structure
is 48 or 64 bytes in size.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396950/
|
|
This is not related to allocating the backing store in any way.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396947/
|
|
Neither page allocation backend nor the driver should mess with that.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Link: https://patchwork.freedesktop.org/patch/396948/
|
|
All drivers can determine the TT caching state at creation time;
there is no need to do this on the fly during every validation.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394253/
|
|
This is not something drivers should use.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/393430/
|
|
This adds 2 getters and 4 setters. Unbound and populated are currently
the same thing, but this will change. It also drops a BUG_ON that does
not seem that useful.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com
|
|
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().
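For illustration, a minimal sketch of the safe-append pattern (the buffer
and the formatted values, pool_name and npages, are placeholders, not
taken from the patch):

    char buf[128];
    size_t off = 0;

    /* scnprintf() returns the number of characters actually written
     * (excluding the trailing NUL), so 'off' never runs past 'buf' */
    off += scnprintf(buf + off, sizeof(buf) - off, "pool %s: ", pool_name);
    off += scnprintf(buf + off, sizeof(buf) - off, "%u pages\n", npages);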
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/357174/
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Drivers like vmwgfx may want to test whether the dma page pool is present
or not. Since it's activated by default by TTM if compiled-in, define a
hidden configuration option that the driver can test for.
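As a sketch of how a driver might test for such an option at compile time
(the config symbol and the driver field below are assumed for illustration,
not quoted from the patch):

    /* IS_ENABLED() from <linux/kconfig.h> covers both =y and =m */
    if (IS_ENABLED(CONFIG_DRM_TTM_DMA_PAGE_POOL))
            dev_priv->use_dma_pool = true;   /* hypothetical driver field */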
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
|
|
As the name says, global memory and BO accounting is global, so it doesn't
make too much sense to have pointers to global structures all around the code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/332879/
|
|
In function __ttm_dma_alloc_page(), d_page->vaddr is allocated
by dma_alloc_attrs() but freed with dma_free_coherent() in
__ttm_dma_free_page().
Use the correct dma_free_attrs() to free d_page->vaddr.
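For reference, a minimal sketch of the matching DMA API pair (dev, size
and the attrs value here are placeholders):

    #include <linux/dma-mapping.h>

    dma_addr_t dma_handle;
    void *vaddr;

    vaddr = dma_alloc_attrs(dev, size, &dma_handle, GFP_KERNEL,
                            DMA_ATTR_WRITE_COMBINE);
    /* ... use the buffer ... */
    /* free with the attrs variant that matches the allocation */
    dma_free_attrs(dev, size, vaddr, dma_handle, DMA_ATTR_WRITE_COMBINE);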
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This code is not used.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch fixes the build error hit when CONFIG_X86 is not configured;
otherwise, the error below is encountered.
All errors (new ones prefixed by >>):
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: In function 'ttm_set_pages_caching':
>> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:272:7: error: implicit declaration of function 'set_pages_array_uc'; did you mean 'ttm_set_pages_array_uc'? [-Werror=implicit-function-declaration]
r = set_pages_array_uc(pages, cpages);
^~~~~~~~~~~~~~~~~~
ttm_set_pages_array_uc
cc1: some warnings being treated as errors
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Every set_pages_array_wb call resulted in cross-core
interrupts and TLB flushes. Merge more of them for
less overhead.
This reduces the time needed to free a 1.6 GiB GTT WC
buffer as part of Vulkan CTS from ~2 sec to < 0.25 sec.
(Allocation still takes more than 2 sec though)
(v2): use set_pages_wb instead of set_memory_wb.
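Schematically, the idea is to collect pages and issue one array call per
batch instead of one call per page (names here are illustrative, not the
actual TTM code):

    struct page *batch[NPAGES];                           /* hypothetical */
    int count = collect_pages_to_release(batch, NPAGES);  /* hypothetical */

    /* one call covers the whole batch, so the cross-core IPIs and the
     * TLB flush happen once instead of once per page */
    set_pages_array_wb(batch, count);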
Signed-off-by: Bas Nieuwenhuizen <basni@chromium.org>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_page_alloc_dma.c.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:
kmalloc(a * b, gfp)
with:
kmalloc_array(a, b, gfp)
as well as handling cases of:
kmalloc(a * b * c, gfp)
with:
kmalloc(array3_size(a, b, c), gfp)
as it's slightly less ugly than:
kmalloc_array(array_size(a, b), c, gfp)
This does, however, attempt to ignore constant size factors like:
kmalloc(4 * 1024, gfp)
though any constants defined via macros get caught up in the conversion.
Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.
The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().
The Coccinelle script used for this was:
// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@
(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)
// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@
(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)
// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@
(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)
// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@
- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)
// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@
(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)
// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@
(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)
// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@
(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)
// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@
(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)
// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@
(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
GFP_TRANSHUGE tries very hard to allocate huge pages, which can result
in long delays with high memory pressure. I have observed firefox
freezing for up to around a minute due to this while restic was taking
a full system backup.
Since we don't really need huge pages, use GFP_TRANSHUGE_LIGHT |
__GFP_NORETRY instead, in order to fail quickly when there are no huge
pages available.
Set __GFP_KSWAPD_RECLAIM as well, in order for huge pages to be freed
up in the background if necessary.
With these changes, I'm no longer seeing freezes during a restic backup.
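Illustratively, the resulting flag combination looks like this (a sketch,
not a quote of the diff):

    /* try huge pages opportunistically: fail fast under memory pressure
     * instead of retrying, but still let kswapd reclaim in the background */
    gfp_t huge_flags = GFP_TRANSHUGE_LIGHT | __GFP_NORETRY |
                       __GFP_KSWAPD_RECLAIM;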
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The free memory space and the lower limit both include two parts:
system memory and swap space.
For an OOM triggered by TTM, the sequence is as follows:
first the swap space fills up with swapped-out pages, then system
memory also fills up with TTM pages, and then any memory allocation
request runs into OOM.
To cover this, two cases are handled:
a. if there is no swap disk at all, or free swap space is under the swap
   mem limit but available system memory is bigger than the sys mem limit,
   allow TTM allocation;
b. if available system memory is less than the sys mem limit but free
   swap space is bigger than the swap mem limit, allow TTM allocation.
v2: merge two memory limit(swap and system) into one
v3: keep original behavior except ttm_opt_ctx->flags with
TTM_OPT_FLAG_FORCE_ALLOC
v4: always set force_alloc as tx->flags & TTM_OPT_FLAG_FORCE_ALLOC
v5: add an attribute for lower_mem_limit
v6: set lower_mem_limit as 0 to keep original behavior
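A hedged sketch of the kind of check this adds (function and parameter
names are assumed for illustration):

    #include <linux/mm.h>     /* si_mem_available() */
    #include <linux/swap.h>   /* get_nr_swap_pages() */

    /* allow the allocation unless free system memory plus free swap,
     * taken together, would drop below the configured lower limit */
    static bool ttm_under_lower_limit(uint64_t num_pages, int64_t lower_limit,
                                      bool force_alloc)
    {
            int64_t available;

            if (force_alloc)
                    return false;

            available = get_nr_swap_pages() + si_mem_available();
            return available - (int64_t)num_pages < lower_limit;
    }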
Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The pointer is available as ttm->bdev->glob as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove redundant store of return code.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add missing {} braces.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Flip the logic of the comparison and remove
the redundant variable for the pool address.
(v2): Remove {} bracing.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correct missing {} style.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This allows drivers to choose to avoid OOM invocation and handle
page allocation failures instead.
v2:
Remove extra new lines.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add this to correctly update the global memory count in ttm_mem_zone.
Before this, when ttm_mem_global_alloc_page() fails, we would update the
global memory count for every dma_page in ttm_dma->pages_list, but we
should actually not update it for the last dma_page.
v2: only the update of last dma_page is not right
v3: use lower bits of dma_page vaddr
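A sketch of the low-bit tagging idea (the flag name and its use here are
illustrative):

    /* DMA buffer addresses are page aligned, so the low bits of the stored
     * vaddr are free to carry per-page bookkeeping flags */
    #define VADDR_FLAG_UPDATED_COUNT 1UL            /* assumed flag name */

    d_page->vaddr |= VADDR_FLAG_UPDATED_COUNT;      /* accounting was done */

    if (d_page->vaddr & VADDR_FLAG_UPDATED_COUNT)   /* undo only if done */
            account_freed = true;                   /* hypothetical */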
Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This fixes the build warning:
"ignoring return value of 'register_shrinker', declared with
attribute warn_unused_result [-Wunused-result]"
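For illustration, the shape of the fix (the shrinker object name is
assumed):

    int ret;

    ret = register_shrinker(&_manager->mm_shrink);  /* assumed field name */
    if (ret)
            pr_err("Failed registering page pool shrinker: %d\n", ret);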
Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Suppress warning messages when allocating huge pages fails since we can
always fall back to normal pages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Forward the operation context to ttm_tt_populate() as well;
the ultimate goal is swapout enablement for reserved BOs.
v2: squash in fix for vboxvideo
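Roughly, the populate path gains an explicit context parameter; a sketch of
the shape of the change (exact prototypes may differ):

    /* before */
    int ttm_tt_populate(struct ttm_tt *ttm);

    /* after: callers hand the operation context down the populate path */
    int ttm_tt_populate(struct ttm_tt *ttm, struct ttm_operation_ctx *ctx);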
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Forward the operation context to ttm_mem_global_alloc_page() as well;
the ultimate goal is swapout enablement for reserved BOs.
Here, reserved BOs refer to all the BOs which share the same reservation object.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make the object a bit smaller by using a simple string instead
of a format string and array of char *.
$ size drivers/gpu/drm/ttm/ttm_page_alloc_dma.o*
text data bss dec hex filename
8820 216 4136 13172 3374 drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.defconfig.new
8910 216 4136 13262 33ce drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.defconfig.old
25383 5044 4384 34811 87fb drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.allyesconfig.new
25797 5428 4384 35609 8b19 drivers/gpu/drm/ttm/ttm_page_alloc_dma.o.allyesconfig.old
Miscellanea:
o The h array had more entries than were emitted, all are now removed
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Memory allocation failure should generally be handled gracefully by
callers. In particular, with transparent hugepage support, attempts
to allocate huge pages can fail under memory pressure, but the callers
fall back to allocating individual pages instead. In that case, there
would be spurious
[TTM] Unable to get page %u
error messages in dmesg.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We need to figure out first how to correctly map them into the CPU page tables.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=103138
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Try to allocate huge pages when it makes sense.
v2: fix comment and use ifdef
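A hedged sketch of the opportunistic allocation (order and fallback shown
schematically; gfp_flags is a placeholder):

    struct page *p = NULL;

    #ifdef CONFIG_TRANSPARENT_HUGEPAGE
    /* purely opportunistic: a failure here is not an error */
    p = alloc_pages(gfp_flags | __GFP_NOWARN, HPAGE_PMD_ORDER);
    #endif
    if (!p)
            p = alloc_pages(gfp_flags, 0);   /* fall back to a normal page */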
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correctly handle different page sizes in the memory accounting.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Nobody is actually using that, remove it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove unused defines and variables. Also stop computing the
gfp_flags when they aren't used.
No intended functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
set_memory_* functions have moved to set_memory.h. Switch to this
explicitly.
[akpm@linux-foundation.org: track drivers/gpu/drm/i915/i915_gem_gtt.c linux-next changes]
Link: http://lkml.kernel.org/r/1488920133-27229-8-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch 3d50d4dcb0 exposed the CPU address of DMA-allocated pages as
returned by dma_alloc_coherent because Nouveau on Tegra needed it.
This is not required anymore - as there were no other users for it,
remove it and save some memory for everyone.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It tries to do fancy things with excluding AGP support if TTM is
built-in but AGP isn't. Instead, just express this dependency like drm
does and use CONFIG_AGP everywhere.
Also use the neat Makefile magic to make the entire ttm_agp_backend
file optional.
v2: Use IS_ENABLED(CONFIG_AGP) as suggested by Ville
v3: Review from Emil.
v4: Actually get it right as spotted by 0-day.
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1459337046-25882-1-git-send-email-daniel.vetter@ffwll.ch
|
|
Calls to set_memory_wb() incur heavy TLB flush and IPI cost. To
minimize those, wait until the pool grows beyond the batch size before
draining the pool.
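Schematically (constant and field names are illustrative):

    /* only shrink back down once the pool exceeds its target by at least
     * one full batch, so each set_memory_wb()/TLB flush covers a batch */
    if (pool->npages > max_size + NUM_PAGES_TO_ALLOC) {
            unsigned int nr_free = pool->npages - max_size;

            ttm_page_pool_free(pool, nr_free);   /* names assumed */
    }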
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The current code never allowed the page pool to actually fill up anyway.
This fixes it, so that we only start freeing pages from the pool when
we go over the pool size.
Changed since v1:
- Move the page batching optimization to its separate patch.
Changed since v2:
- Do not remove code part of the batching optimization with
this patch.
- Better commit message.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
dma_alloc_coherent() can return memory in the vmalloc range.
virt_to_page() cannot handle such addresses and crashes. This
patch detects such cases and obtains the struct page * using
vmalloc_to_page() instead.
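A minimal sketch of the distinction (the helper name is made up for
illustration):

    #include <linux/mm.h>        /* is_vmalloc_addr(), virt_to_page() */
    #include <linux/vmalloc.h>   /* vmalloc_to_page() */

    /* dma_alloc_coherent() may hand back a vmalloc-range address, which
     * virt_to_page() cannot translate */
    static struct page *addr_to_page(void *vaddr)
    {
            if (is_vmalloc_addr(vaddr))
                    return vmalloc_to_page(vaddr);
            return virt_to_page(vaddr);
    }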
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Andrew Morton wrote:
> On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> > Andrew Morton wrote:
> > > Poor ttm guys - this is a bit of a trap we set for them.
> >
> > Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid deadlock.")
> > changed to use sc->gfp_mask rather than GFP_KERNEL.
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> > - GFP_KERNEL);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> >
> > But this bug is caused by sc->gfp_mask containing some flags which are not
> > in GFP_KERNEL, right? Then, I think
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL);
> >
> > would hide this bug.
> >
> > But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag)
>
> Well no - ttm_page_pool_free() should stop calling kmalloc altogether.
> Just do
>
> struct page *pages_to_free[16];
>
> and rework the code to free 16 pages at a time. Easy.
Well, ttm code wants to process 512 pages at a time for performance.
The memory footprint increase from a 512 * sizeof(struct page *) buffer
is only 4096 bytes. What about using a static buffer like below?
----------
From d3cb5393c9c8099d6b37e769f78c31af1541fe8c Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 13 Nov 2014 22:21:54 +0900
Subject: [PATCH] drm/ttm: Avoid memory allocation from shrinker functions.
Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid
deadlock.") caused BUG_ON() due to sc->gfp_mask containing flags
which are not in GFP_KERNEL.
https://bugzilla.kernel.org/show_bug.cgi?id=87891
Changing from sc->gfp_mask to (sc->gfp_mask & GFP_KERNEL) would
avoid the BUG_ON(), but avoiding memory allocation from the shrinker
function is a better and more reliable fix.
The shrinker function is already serialized by a global lock, and the
cleanup function is called after the shrinker function is unregistered.
Thus, we can use a static buffer when called from the shrinker and
cleanup functions.
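A simplified sketch of the static-buffer approach (names assumed; the real
function also walks the pool's page list and frees the pages):

    #define NUM_PAGES_TO_ALLOC 512

    static struct page **get_scratch_buffer(bool use_static)
    {
            /* safe: the shrinker path is serialized by a global lock */
            static struct page *static_buf[NUM_PAGES_TO_ALLOC];

            if (use_static)
                    return static_buf;
            return kmalloc_array(NUM_PAGES_TO_ALLOC, sizeof(struct page *),
                                 GFP_KERNEL);
    }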
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.35+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Pages allocated using the DMA API have a coherent memory mapping. Make
this mapping visible to drivers so they can decide to use it instead of
creating their own redundant one.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added
deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free()
are currently doing GFP_KERNEL allocation.
But these functions did not get updated to receive a gfp_t argument.
This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions,
and removes the deadlock warning.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.35+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
I can observe that a RHEL7 environment stalls with 100% CPU usage when a
certain type of memory pressure is applied. While the shrinker functions
are called by shrink_slab() before the OOM killer is triggered, the stall
lasts for many minutes.
One of the reasons for this stall is that
ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
_manager->lock held causes someone (including kswapd) to deadlock when
these functions are called due to memory pressure. This patch changes
"mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
avoid deadlock.
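Schematically, in the shrinker callbacks (a sketch; shrink_pools() is a
hypothetical helper):

    unsigned long freed;

    /* never sleep on the pool manager lock from reclaim context */
    if (!mutex_trylock(&_manager->lock))
            return SHRINK_STOP;               /* back off, try again later */
    freed = shrink_pools(sc->nr_to_scan);
    mutex_unlock(&_manager->lock);
    return freed;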
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We can use "unsigned int" instead of "atomic_t" by updating the start_pool
variable under _manager->lock. This patch makes it possible to avoid
skipping when choosing a pool to shrink in round-robin style, after the next
patch changes mutex_lock(_manager->lock) to !mutex_trylock(_manager->lock).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
list_empty(&_manager->pools) being false before taking _manager->lock
does not guarantee that _manager->npools != 0 after taking _manager->lock
because _manager->npools is updated under _manager->lock.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Used by the vmwgfx driver
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Convert the driver shrinkers to the new API. Most changes are compile
tested only because I either don't have the hardware or it's staging
stuff.
FWIW, the md and android code is pretty good, but the rest of it makes me
want to claw my eyes out. The amount of broken code I just encountered is
mind boggling. I've added comments explaining what is broken, but I fear
that some of the code would be best dealt with by being dragged behind the
bike shed, buried in mud up to its neck, and then run over repeatedly
with a blunt lawn mower.
Special mention goes to the zcache/zcache2 drivers. They can't co-exist
in the build at the same time, they are under different menu options in
menuconfig, they only show up when you've got the right set of mm
subsystem options configured and so even compile testing is an exercise in
pulling teeth. And that doesn't even take into account the horrible,
broken code...
[glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|