<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/kernel/cgroup.c, branch v3.7-rc3</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<id>https://git.kobert.dev/pm24.git/atom?h=v3.7-rc3</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom?h=v3.7-rc3'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2012-10-19T21:09:35Z</updated>
<entry>
<title>Revert "cgroup: Remove task_lock() from cgroup_post_fork()"</title>
<updated>2012-10-19T21:09:35Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-10-19T00:40:30Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=d87838321124061f6c935069d97f37010fa417e6'/>
<id>urn:sha1:d87838321124061f6c935069d97f37010fa417e6</id>
<content type='text'>
This reverts commit 7e3aa30ac8c904a706518b725c451bb486daaae9.

The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().

threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.

Revert the incorrect optimization.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
LKML-Reference: &lt;20121008020000.GB2575@localhost&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>Revert "cgroup: Drop task_lock(parent) on cgroup_fork()"</title>
<updated>2012-10-19T21:08:49Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-10-19T00:52:07Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=9bb71308b8133d643648776243e4d5599b1c193d'/>
<id>urn:sha1:9bb71308b8133d643648776243e4d5599b1c193d</id>
<content type='text'>
This reverts commit 7e381b0eb1e1a9805c37335562e8dc02e7d7848c.

The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().

threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.

Revert the incorrect optimization.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
LKML-Reference: &lt;20121008020000.GB2575@localhost&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Bitterly-Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>cgroup: notify_on_release may not be triggered in some cases</title>
<updated>2012-10-17T00:09:36Z</updated>
<author>
<name>Daisuke Nishimura</name>
<email>nishimura@mxp.nes.nec.co.jp</email>
</author>
<published>2012-10-04T07:37:16Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=1f5320d5972aa50d3e8d2b227b636b370e608359'/>
<id>urn:sha1:1f5320d5972aa50d3e8d2b227b636b370e608359</id>
<content type='text'>
notify_on_release must be triggered when the last process in a cgroup is
move to another. But if the first(and only) process in a cgroup is moved to
another, notify_on_release is not triggered.

	# mkdir /cgroup/cpu/SRC
	# mkdir /cgroup/cpu/DST
	#
	# echo 1 &gt;/cgroup/cpu/SRC/notify_on_release
	# echo 1 &gt;/cgroup/cpu/DST/notify_on_release
	#
	# sleep 300 &amp;
	[1] 8629
	#
	# echo 8629 &gt;/cgroup/cpu/SRC/tasks
	# echo 8629 &gt;/cgroup/cpu/DST/tasks
	-&gt; notify_on_release for /SRC must be triggered at this point,
	   but it isn't.

This is because put_css_set() is called before setting CGRP_RELEASABLE
in cgroup_task_migrate(), and is a regression introduce by the
commit:74a1166d(cgroups: make procs file writable), which was merged
into v3.0.

Cc: Ben Blum &lt;bblum@andrew.cmu.edu&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v3.0.x and later
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2012-10-02T17:52:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-10-02T17:52:28Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=68d47a137c3bef754923bccf73fb639c9b0bbd5e'/>
<id>urn:sha1:68d47a137c3bef754923bccf73fb639c9b0bbd5e</id>
<content type='text'>
Pull cgroup hierarchy update from Tejun Heo:
 "Currently, different cgroup subsystems handle nested cgroups
  completely differently.  There's no consistency among subsystems and
  the behaviors often are outright broken.

  People at least seem to agree that the broken hierarhcy behaviors need
  to be weeded out if any progress is gonna be made on this front and
  that the fallouts from deprecating the broken behaviors should be
  acceptable especially given that the current behaviors don't make much
  sense when nested.

  This patch makes cgroup emit warning messages if cgroups for
  subsystems with broken hierarchy behavior are nested to prepare for
  fixing them in the future.  This was put in a separate branch because
  more related changes were expected (didn't make it this round) and the
  memory cgroup wanted to pull in this and make changes on top."

* 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them
</content>
</entry>
<entry>
<title>cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them</title>
<updated>2012-09-14T19:01:16Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-09-13T19:20:58Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=8c7f6edbda01f1b1a2e60ad61f14fe38023e433b'/>
<id>urn:sha1:8c7f6edbda01f1b1a2e60ad61f14fe38023e433b</id>
<content type='text'>
Currently, cgroup hierarchy support is a mess.  cpu related subsystems
behave correctly - configuration, accounting and control on a parent
properly cover its children.  blkio and freezer completely ignore
hierarchy and treat all cgroups as if they're directly under the root
cgroup.  Others show yet different behaviors.

These differing interpretations of cgroup hierarchy make using cgroup
confusing and it impossible to co-mount controllers into the same
hierarchy and obtain sane behavior.

Eventually, we want full hierarchy support from all subsystems and
probably a unified hierarchy.  Users using separate hierarchies
expecting completely different behaviors depending on the mounted
subsystem is deterimental to making any progress on this front.

This patch adds cgroup_subsys.broken_hierarchy and sets it to %true
for controllers which are lacking in hierarchy support.  The goal of
this patch is two-fold.

* Move users away from using hierarchy on currently non-hierarchical
  subsystems, so that implementing proper hierarchy support on those
  doesn't surprise them.

* Keep track of which controllers are broken how and nudge the
  subsystems to implement proper hierarchy support.

For now, start with a single warning message.  We can whine louder
later on.

v2: Fixed a typo spotted by Michal. Warning message updated.

v3: Updated memcg part so that it doesn't generate warning in the
    cases where .use_hierarchy=false doesn't make the behavior
    different from root.use_hierarchy=true.  Fixed a typo spotted by
    Glauber.

v4: Check -&gt;broken_hierarchy after cgroup creation is complete so that
    -&gt;create() can affect the result per Michal.  Dropped unnecessary
    memcg root handling per Michal.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@ghostprotocols.net&gt;
Cc: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>cgroup: Assign subsystem IDs during compile time</title>
<updated>2012-09-14T16:57:43Z</updated>
<author>
<name>Daniel Wagner</name>
<email>daniel.wagner@bmw-carit.de</email>
</author>
<published>2012-09-12T14:12:07Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=8a8e04df4747661daaee77e98e102d99c9e09b98'/>
<id>urn:sha1:8a8e04df4747661daaee77e98e102d99c9e09b98</id>
<content type='text'>
WARNING: With this change it is impossible to load external built
controllers anymore.

In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is
set, corresponding subsys_id should also be a constant. Up to now,
net_prio_subsys_id and net_cls_subsys_id would be of the type int and
the value would be assigned during runtime.

By switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN
to IS_ENABLED, all *_subsys_id will have constant value. That means we
need to remove all the code which assumes a value can be assigned to
net_prio_subsys_id and net_cls_subsys_id.

A close look is necessary on the RCU part which was introduces by
following patch:

  commit f845172531fb7410c7fb7780b1a6e51ee6df7d52
  Author:	Herbert Xu &lt;herbert@gondor.apana.org.au&gt;  Mon May 24 09:12:34 2010
  Committer:	David S. Miller &lt;davem@davemloft.net&gt;  Mon May 24 09:12:34 2010

  cls_cgroup: Store classid in struct sock

  Tis code was added to init_cgroup_cls()

	  /* We can't use rcu_assign_pointer because this is an int. */
	  smp_wmb();
	  net_cls_subsys_id = net_cls_subsys.subsys_id;

  respectively to exit_cgroup_cls()

	  net_cls_subsys_id = -1;
	  synchronize_rcu();

  and in module version of task_cls_classid()

	  rcu_read_lock();
	  id = rcu_dereference(net_cls_subsys_id);
	  if (id &gt;= 0)
		  classid = container_of(task_subsys_state(p, id),
					 struct cgroup_cls_state, css)-&gt;classid;
	  rcu_read_unlock();

Without an explicit explaination why the RCU part is needed. (The
rcu_deference was fixed by exchanging it to rcu_derefence_index_check()
in a later commit, but that is a minor detail.)

So here is my pondering why it was introduced and why it safe to
remove it now. Note that this code was copied over to net_prio the
reasoning holds for that subsystem too.

The idea behind the RCU use for net_cls_subsys_id is to make sure we
get a valid pointer back from task_subsys_state(). task_subsys_state()
is just blindly accessing the subsys array and returning the
pointer. Obviously, passing in -1 as id into task_subsys_state()
returns an invalid value (out of lower bound).

So this code makes sure that only after module is loaded and the
subsystem registered, the id is assigned.

Before unregistering the module all old readers must have left the
critical section. This is done by assigning -1 to the id and issuing a
synchronized_rcu(). Any new readers wont call task_subsys_state()
anymore and therefore it is safe to unregister the subsystem.

The new code relies on the same trick, but it looks at the subsys
pointer return by task_subsys_state() (remember the id is constant
and therefore we allways have a valid index into the subsys
array).

No precautions need to be taken during module loading
module. Eventually, all CPUs will get a valid pointer back from
task_subsys_state() because rebind_subsystem() which is called after
the module init() function will assigned subsys[net_cls_subsys_id] the
newly loaded module subsystem pointer.

When the subsystem is about to be removed, rebind_subsystem() will
called before the module exit() function. In this case,
rebind_subsys() will assign subsys[net_cls_subsys_id] a NULL pointer
and then it calls synchronize_rcu(). All old readers have left by then
the critical section. Any new reader wont access the subsystem
anymore.  At this point we are safe to unregister the subsystem. No
synchronize_rcu() call is needed.

Signed-off-by: Daniel Wagner &lt;daniel.wagner@bmw-carit.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: Kamezawa Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
</content>
</entry>
<entry>
<title>cgroup: Do not depend on a given order when populating the subsys array</title>
<updated>2012-09-14T16:57:40Z</updated>
<author>
<name>Daniel Wagner</name>
<email>daniel.wagner@bmw-carit.de</email>
</author>
<published>2012-09-12T14:12:06Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=80f4c87774721e864d5a5a1f7aca3e95fd90e194'/>
<id>urn:sha1:80f4c87774721e864d5a5a1f7aca3e95fd90e194</id>
<content type='text'>
The *_subsys_id will be used as index to access the subsys. Therefore
we need to care we populate the subsystem at the correct position by
using designated initialization.

With this change we are able to interleave builtin and modules in the subsys
array.

Signed-off-by: Daniel Wagner &lt;daniel.wagner@bmw-carit.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
</content>
</entry>
<entry>
<title>cgroup: Wrap subsystem selection macro</title>
<updated>2012-09-14T16:57:37Z</updated>
<author>
<name>Daniel Wagner</name>
<email>daniel.wagner@bmw-carit.de</email>
</author>
<published>2012-09-12T14:12:05Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=5fc0b02544b3b9bd3db5a8156b5f3e7350f8e797'/>
<id>urn:sha1:5fc0b02544b3b9bd3db5a8156b5f3e7350f8e797</id>
<content type='text'>
Before we are able to define all subsystem ids at compile time we need
a more fine grained control what gets defined when we include
cgroup_subsys.h. For example we define the enums for the subsystems or
to declare for struct cgroup_subsys (builtin subsystem) by including
cgroup_subsys.h and defining SUBSYS accordingly.

Currently, the decision if a subsys is used is defined inside the
header by testing if CONFIG_*=y is true. By moving this test outside
of cgroup_subsys.h we are able to control it on the include level.

This is done by introducing IS_SUBSYS_ENABLED which then is defined
according the task, e.g. is CONFIG_*=y or CONFIG_*=m.

Signed-off-by: Daniel Wagner &lt;daniel.wagner@bmw-carit.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
</content>
</entry>
<entry>
<title>cgroup: Remove CGROUP_BUILTIN_SUBSYS_COUNT</title>
<updated>2012-09-14T16:57:32Z</updated>
<author>
<name>Daniel Wagner</name>
<email>daniel.wagner@bmw-carit.de</email>
</author>
<published>2012-09-13T07:50:55Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=be45c900fdc2c66baad5a7703fb8136991d88aeb'/>
<id>urn:sha1:be45c900fdc2c66baad5a7703fb8136991d88aeb</id>
<content type='text'>
CGROUP_BUILTIN_SUBSYS_COUNT is used as start index or stop index when
looping over the subsys array looking either at the builtin or the
module subsystems. Since all the builtin subsystems have an id which
is lower then CGROUP_BUILTIN_SUBSYS_COUNT we know that any module will
have an id larger than CGROUP_BUILTIN_SUBSYS_COUNT. In short the ids
are sorted.

We are about to change id assignment to happen only at compile time
later in this series. That means we can't rely on the above trick
since all ids will always be defined at compile time. Furthermore,
ordering the builtin subsystems and the module subsystems is not
really necessary.

So we need a different way to know which subsystem is a builtin or a
module one. We can use the subsys[]-&gt;module pointer for this. Any
place where we need to know if a subsys is module we just check for
the pointer. If it is NULL then the subsystem is a builtin one.

With this we are able to drop the CGROUP_BUILTIN_SUBSYS_COUNT
enum. Though we need to introduce a temporary placeholder so that we
don't get a compilation error when only CONFIG_CGROUP is selected and
no single controller. An empty enum definition is not valid. Later in
this series we are able to remove the placeholder again.

And with this change we get a fix for this:

kernel/cgroup.c: In function ‘cgroup_load_subsys’:
kernel/cgroup.c:4326:38: warning: array subscript is below array bounds [-Warray-bounds]

when CONFIG_CGROUP=y and no built in controller was enabled.

Signed-off-by: Daniel Wagner &lt;daniel.wagner@bmw-carit.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
</content>
</entry>
<entry>
<title>cgroup: rename subsys_bits to subsys_mask</title>
<updated>2012-08-24T22:55:33Z</updated>
<author>
<name>Aristeu Rozanski</name>
<email>aris@redhat.com</email>
</author>
<published>2012-08-23T20:53:31Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=a1a71b45a66fd3c3c453b55fbd180f8fccdd1daa'/>
<id>urn:sha1:a1a71b45a66fd3c3c453b55fbd180f8fccdd1daa</id>
<content type='text'>
In a previous discussion, Tejun Heo suggested to rename references to
subsys_bits (added_bits, removed_bits, etc) by something more meaningful.

Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Hillf Danton &lt;dhillf@gmail.com&gt;
Cc: Lennart Poettering &lt;lpoetter@redhat.com&gt;
Signed-off-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
