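These notes are based on the 5.x kernel; the code lives in drivers/devfreq/. The source describes devfreq as: Generic Dynamic Voltage and Frequency Scaling (DVFS) Framework for Non-CPU Devices. Dynamic frequency scaling for CPUs lives in drivers/cpufreq, and devfreq is modeled on cpufreq.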

The commit that first introduced the feature describes it as follows:

PM: Introduce devfreq: generic DVFS framework with device-specific OPPs

With OPPs, a device may have multiple operable frequency and voltage
sets. However, there can be multiple possible operable sets and a system
will need to choose one from them. In order to reduce the power
consumption (by reducing frequency and voltage) without affecting the
performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
scheme may be used.

This patch introduces the DVFS capability to non-CPU devices with OPPs.
DVFS is a techique whereby the frequency and supplied voltage of a
device is adjusted on-the-fly. DVFS usually sets the frequency as low
as possible with given conditions (such as QoS assurance) and adjusts
voltage according to the chosen frequency in order to reduce power
consumption and heat dissipation.

The generic DVFS for devices, devfreq, may appear quite similar with
/drivers/cpufreq. However, cpufreq does not allow to have multiple
devices registered and is not suitable to have multiple heterogenous
devices with different (but simple) governors.

Normally, DVFS mechanism controls frequency based on the demand for
the device, and then, chooses voltage based on the chosen frequency.
devfreq also controls the frequency based on the governor's frequency
recommendation and let OPP pick up the pair of frequency and voltage
based on the recommended frequency. Then, the chosen OPP is passed to
device driver's "target" callback.

OPP stands for Operating Performance Point, defined as follows:

Complex SoCs of today consists of a multiple sub-modules working in conjunction.
In an operational system executing varied use cases, not all modules in the SoC
need to function at their highest performing frequency all the time. To
facilitate this, sub-modules in a SoC are grouped into domains, allowing some
domains to run at lower voltage and frequency while other domains run at
voltage/frequency pairs that are higher.

The set of discrete tuples consisting of frequency and voltage pairs that
the device will support per domain are called Operating Performance Points or
OPPs.

Put simply, suppose DDR has two OPPs, {1.8 GHz, 1.3 V} and {1 GHz, 1 V}; the system picks whichever one fits.
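To make the DDR example concrete, here is a minimal user-space sketch in C. It is not kernel code: `struct opp`, `ddr_opps` and `pick_opp()` are hypothetical names invented for illustration, and the selection rule (lowest OPP whose frequency covers the request, else the highest one) is an assumption about what a simple policy might look like.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one OPP: a frequency/voltage pair. */
struct opp {
	unsigned long freq_hz;
	unsigned long volt_uv;
};

/* The two DDR OPPs from the example, sorted by ascending frequency. */
const struct opp ddr_opps[] = {
	{ 1000000000UL, 1000000UL },	/* 1 GHz,   1.0 V */
	{ 1800000000UL, 1300000UL },	/* 1.8 GHz, 1.3 V */
};

/*
 * Return the lowest OPP whose frequency is at least want_hz.
 * If the request exceeds every entry, fall back to the highest OPP.
 * The table must be non-empty and sorted by ascending frequency.
 */
const struct opp *pick_opp(const struct opp *table, size_t n,
			   unsigned long want_hz)
{
	size_t i;

	assert(table != NULL && n > 0);
	for (i = 0; i < n; i++)
		if (table[i].freq_hz >= want_hz)
			return &table[i];
	return &table[n - 1];
}
```

Calling `pick_opp(ddr_opps, 2, 800000000UL)` selects the 1 GHz entry, so the voltage can drop to 1.0 V; any request above 1 GHz selects the 1.8 GHz entry.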

OK, building on OPPs, DVFS exists to cut power consumption without costing too much performance.

The difference between devfreq and cpufreq is:

cpufreq does not allow to have multiple devices registered and is not suitable to
have multiple heterogenous devices with different (but simple) governors.

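devfreq supports heterogeneous kinds of devices, and each device can have its own governor. cpufreq cannot do this: it only deals with cpuX devices, which are all of one kind and are governed the same way.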

In addition, devfreq ships four basic governors: simple_ondemand, performance, powersave and userspace. Note that they are only examples; drivers can implement their own governors.

Note that these are given only as basic examples for governors and any devices
with devfreq may implement their own governors with the drivers and use them.

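The passive governor was added later, for a device whose frequency should follow the decision made for a parent devfreq device. Its commit message points out that the governors that already existed each work in isolation: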

The following governors are independently used for one device driver which don't
give the influence to other device drviers and also don't receive the effect from
other device drivers.

OK, now let's look at the governor-related parts of devfreq.c:

struct devfreq_governor {
	struct list_head node;

	const char name[DEVFREQ_NAME_LEN];
	const unsigned int immutable;
	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
	int (*event_handler)(struct devfreq *devfreq,
				unsigned int event, void *data);
};

If immutable is 1, the governor of a device can no longer be switched at runtime.
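The role of the get_target_freq callback is easier to see in a small stand-alone model. The sketch below is user-space C, not the kernel's implementation: `struct model_devfreq`, `struct model_governor`, the two callbacks and `ask_governor()` are hypothetical names, and the min/max policies are simplified stand-ins for what powersave-like and performance-like governors do.

```c
#include <assert.h>

#define MODEL_NAME_LEN 16

/* Hypothetical stand-in for struct devfreq: only the frequency limits. */
struct model_devfreq {
	unsigned long min_freq;
	unsigned long max_freq;
};

/* Hypothetical stand-in for struct devfreq_governor. */
struct model_governor {
	const char name[MODEL_NAME_LEN];
	int (*get_target_freq)(struct model_devfreq *df, unsigned long *freq);
};

/* Powersave-like policy: always recommend the lowest frequency. */
int powersave_target(struct model_devfreq *df, unsigned long *freq)
{
	*freq = df->min_freq;
	return 0;
}

/* Performance-like policy: always recommend the highest frequency. */
int performance_target(struct model_devfreq *df, unsigned long *freq)
{
	*freq = df->max_freq;
	return 0;
}

const struct model_governor model_powersave = {
	.name = "powersave",
	.get_target_freq = powersave_target,
};

const struct model_governor model_performance = {
	.name = "performance",
	.get_target_freq = performance_target,
};

/* The framework side: ask whichever governor is attached for a frequency. */
unsigned long ask_governor(const struct model_governor *gov,
			   struct model_devfreq *df)
{
	unsigned long freq = 0;

	assert(gov && gov->get_target_freq && df);
	if (gov->get_target_freq(df, &freq))
		return 0;	/* the callback reported an error */
	return freq;
}
```

The point of the design is that the caller never needs to know which policy is attached; swapping the governor pointer changes the recommendation without touching the device code.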

/* The list of all device-devfreq governors */
static LIST_HEAD(devfreq_governor_list);

Governors are tracked on devfreq_governor_list. Adding a governor, for example:

int devfreq_add_governor(struct devfreq_governor *governor)
{
	struct devfreq_governor *g;
	struct devfreq *devfreq;
	int err = 0;

	...
	mutex_lock(&devfreq_list_lock);
	g = find_devfreq_governor(governor->name);
	if (!IS_ERR(g)) {
		pr_err("%s: governor %s already registered\n", __func__,
			g->name);
		err = -EINVAL;
		goto err_out;
	}

	list_add(&governor->node, &devfreq_governor_list); //tj: add governor to list

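If find_devfreq_governor() shows that the governor is already registered, the function bails out with -EINVAL (is an error return really needed here?); otherwise the new governor is added to the governor list. Continuing: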

	list_for_each_entry(devfreq, &devfreq_list, node) {
		int ret = 0;
		struct device *dev = devfreq->dev.parent;

		if (!strncmp(devfreq->governor_name, governor->name,
				DEVFREQ_NAME_LEN)) {
			/* The following should never occur */
			if (devfreq->governor) {
				dev_warn(dev,
					"%s: Governor %s already present\n",
					__func__, devfreq->governor->name);
				ret = devfreq->governor->event_handler(devfreq,
						DEVFREQ_GOV_STOP, NULL);
				if (ret) {
					dev_warn(dev,
						"%s: Governor %s stop = %d\n",
						__func__,
						devfreq->governor->name, ret);
				}
				/* Fall through */
			}
			devfreq->governor = governor;
			ret = devfreq->governor->event_handler(devfreq,
					DEVFREQ_GOV_START, NULL);
			if (ret) {
				dev_warn(dev, "%s: Governor %s start=%d\n",
					__func__, devfreq->governor->name,
					ret);
			}
		}
	}

/* The list of all device-devfreq */
static LIST_HEAD(devfreq_list);

It walks devfreq_list and, for every device whose governor_name matches, starts this governor. This ties into adding a device, so let's look at that:

add devfreq to device

struct devfreq {
	struct list_head node;

	struct mutex lock;
	struct device dev;
	struct devfreq_dev_profile *profile;
	const struct devfreq_governor *governor;
	char governor_name[DEVFREQ_NAME_LEN];
	...

struct devfreq has a governor_name[] member as well as a governor pointer that decides how the frequency is chosen. But struct devfreq_governor already carries a name, so could governor_name be dropped? Let's look at the related code:

struct devfreq *devfreq_add_device(struct device *dev,
				struct devfreq_dev_profile *profile,
				const char *governor_name,
				void *data)
{
	...
	devfreq->dev.class = devfreq_class;
	devfreq->dev.release = devfreq_dev_release;
	devfreq->profile = profile;
	strncpy(devfreq->governor_name, governor_name, DEVFREQ_NAME_LEN);
	...
	mutex_lock(&devfreq_list_lock);

	governor = try_then_request_governor(devfreq->governor_name);
	if (IS_ERR(governor)) {
		dev_err(dev, "%s: Unable to find governor for the device\n",
			__func__);
		err = PTR_ERR(governor);
		goto err_init;
	}

	devfreq->governor = governor;
	err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START,
						NULL);
	if (err) {
		dev_err(dev, "%s: Unable to start governor for the device\n",
			__func__);
		goto err_init;
	}

	list_add(&devfreq->node, &devfreq_list);

	mutex_unlock(&devfreq_list_lock);

	return devfreq;
	...

The passed-in governor_name is copied straight into devfreq->governor_name. Once try_then_request_governor() finds a governor, that governor is started through its event_handler callback.

static struct devfreq_governor *try_then_request_governor(const char *name)
{
	struct devfreq_governor *governor;
	int err = 0;

	if (IS_ERR_OR_NULL(name)) {
		pr_err("DEVFREQ: %s: Invalid parameters\n", __func__);
		return ERR_PTR(-EINVAL);
	}
	WARN(!mutex_is_locked(&devfreq_list_lock),
		"devfreq_list_lock must be locked.");

	governor = find_devfreq_governor(name);
	if (IS_ERR(governor)) {
		mutex_unlock(&devfreq_list_lock);

		if (!strncmp(name, DEVFREQ_GOV_SIMPLE_ONDEMAND,
				DEVFREQ_NAME_LEN))
			err = request_module("governor_%s", "simpleondemand");
		else
			err = request_module("governor_%s", name);
		/* Restore previous state before return */
		mutex_lock(&devfreq_list_lock);
		if (err)
			return (err < 0) ? ERR_PTR(err) : ERR_PTR(-EINVAL);

		governor = find_devfreq_governor(name);
	}

	return governor;
}

static struct devfreq_governor *find_devfreq_governor(const char *name)
{
	struct devfreq_governor *tmp_governor;

	if (IS_ERR_OR_NULL(name)) {
		pr_err("DEVFREQ: %s: Invalid parameters\n", __func__);
		return ERR_PTR(-EINVAL);
	}
	WARN(!mutex_is_locked(&devfreq_list_lock),
		"devfreq_list_lock must be locked.");

	list_for_each_entry(tmp_governor, &devfreq_governor_list, node) {
		if (!strncmp(tmp_governor->name, name, DEVFREQ_NAME_LEN))
			return tmp_governor;
	}

	return ERR_PTR(-ENODEV);
}

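So a governor must already be on the governor list before add device can succeed, right? Yes, but only when the governor is built in. If the governor is built as a module, devfreq_add_device() is called first, and devfreq_add_governor() runs later, when the module is loaded through request_module().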
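The "look it up, load the module if missing, look it up again" ordering can be sketched in user space. This is a simplified model under stated assumptions, not the kernel code: every `model_*` name and `passive_module` are hypothetical, locking is left out entirely, NULL replaces the kernel's ERR_PTR() convention, and `model_request_module()` merely imitates a module whose init function registers its governor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NAME_LEN 16

struct model_governor {
	struct model_governor *next;
	char name[MODEL_NAME_LEN];
};

/* Registered governors, standing in for devfreq_governor_list. */
struct model_governor *governor_list;

/* A governor that only exists as a "module" until it is requested. */
struct model_governor passive_module = { NULL, "passive" };

struct model_governor *model_find_governor(const char *name)
{
	struct model_governor *g;

	for (g = governor_list; g; g = g->next)
		if (!strncmp(g->name, name, MODEL_NAME_LEN))
			return g;
	return NULL;
}

/* Register a governor; a duplicate name is rejected. */
int model_add_governor(struct model_governor *gov)
{
	if (model_find_governor(gov->name))
		return -1;
	gov->next = governor_list;
	governor_list = gov;
	return 0;
}

/* Imitates request_module(): loading runs the init that registers the governor. */
int model_request_module(const char *name)
{
	if (!strncmp(name, passive_module.name, MODEL_NAME_LEN))
		return model_add_governor(&passive_module);
	return -1;	/* no such module */
}

/* Try the list first; on a miss, request the module and try once more. */
struct model_governor *model_try_then_request(const char *name)
{
	struct model_governor *gov;

	assert(name != NULL);
	gov = model_find_governor(name);
	if (!gov && model_request_module(name) == 0)
		gov = model_find_governor(name);
	return gov;
}
```

In this model a first lookup of "passive" misses, the simulated module load registers it, and the second lookup succeeds, which mirrors the module case described above. A name with no matching module simply yields NULL.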

OK, back to the remaining list_for_each_entry() in devfreq_add_governor(). Here it is again:

	list_for_each_entry(devfreq, &devfreq_list, node) {
		int ret = 0;
		struct device *dev = devfreq->dev.parent;

		if (!strncmp(devfreq->governor_name, governor->name,
				DEVFREQ_NAME_LEN)) {
			/* The following should never occur */
			if (devfreq->governor) {
				dev_warn(dev,
					"%s: Governor %s already present\n",
					__func__, devfreq->governor->name);
				ret = devfreq->governor->event_handler(devfreq,
						DEVFREQ_GOV_STOP, NULL);
				if (ret) {
					dev_warn(dev,
						"%s: Governor %s stop = %d\n",
						__func__,
						devfreq->governor->name, ret);
				}
				/* Fall through */
			}
			devfreq->governor = governor;
			ret = devfreq->governor->event_handler(devfreq,
					DEVFREQ_GOV_START, NULL);
			if (ret) {
				dev_warn(dev, "%s: Governor %s start=%d\n",
					__func__, devfreq->governor->name,
					ret);
			}
		}
	}

So the governor start here is meant for devices for which add device did not find a governor, right? This code looks problematic; I plan to submit a patch…
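What that loop does can be modeled in a few lines of user-space C. This is only a sketch under assumptions: the `model_*` names are hypothetical, the device list is a plain array instead of devfreq_list, and the event handler just records whether the governor is running. It mirrors the structure of the loop and does not settle whether the kernel code is correct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NAME_LEN 16

enum { MODEL_GOV_START, MODEL_GOV_STOP };

struct model_devfreq;

struct model_governor {
	char name[MODEL_NAME_LEN];
	int (*event_handler)(struct model_devfreq *df, unsigned int event);
};

struct model_devfreq {
	char governor_name[MODEL_NAME_LEN];	/* the name the device asked for */
	const struct model_governor *governor;	/* bound governor, or NULL */
	int running;
};

/* Toy handler: remember whether the governor was started or stopped. */
int model_event_handler(struct model_devfreq *df, unsigned int event)
{
	df->running = (event == MODEL_GOV_START);
	return 0;
}

/* Bind a newly registered governor to every device that asked for it by name. */
void model_bind_governor(struct model_devfreq *devs, size_t n,
			 const struct model_governor *gov)
{
	size_t i;

	assert(devs != NULL && gov != NULL && gov->event_handler != NULL);
	for (i = 0; i < n; i++) {
		struct model_devfreq *df = &devs[i];

		if (strncmp(df->governor_name, gov->name, MODEL_NAME_LEN))
			continue;	/* this device wants a different governor */
		if (df->governor)	/* the "should never occur" branch */
			df->governor->event_handler(df, MODEL_GOV_STOP);
		df->governor = gov;
		df->governor->event_handler(df, MODEL_GOV_START);
	}
}
```

Given two devices that requested "userspace" and "powersave", registering a "userspace" governor binds and starts only the first; the second stays unbound until a governor with its name shows up.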

Done.