22 用户自定义函数
User-Defined Functions
本节译者:杜丽英、李君竹
初次校审:李君竹
二次校审:李君竹(Claude 辅助)
This chapter explains functions from a user perspective with examples; see the language reference for a full specification. User-defined functions allow computations to be encapsulated into a single named unit and invoked elsewhere by name. Similarly, functions allow complex procedures to be broken down into more understandable components. Writing modular code using descriptively named functions is easier to understand than a monolithic program, even if the latter is heavily commented.1
本章从用户角度通过示例解释函数;完整规范请参阅语言参考手册。用户自定义函数允许将计算封装到单个命名单元中,并在其他地方通过名称调用。同样,函数允许将复杂过程分解为更易理解的组件。使用描述性命名函数编写的模块化代码比单体程序更易理解,即使后者有大量注释。2
22.1 Basic functions
基本函数
Here’s an example of a skeletal Stan program with a user-defined relative difference function employed in the generated quantities block to compute the relative difference between two parameters.
以下是一个 Stan 程序框架示例,其中包含一个用户自定义的相对差异函数,在生成量块中用于计算两个参数之间的相对差异。
functions {
  real relative_diff(real x, real y) {
    real abs_diff;
    real avg_scale;
    abs_diff = abs(x - y);
    avg_scale = (abs(x) + abs(y)) / 2;
    return abs_diff / avg_scale;
  }
}
// ...
generated quantities {
  real rdiff;
  rdiff = relative_diff(alpha, beta);
}
The function is named relative_diff, and is declared to have two real-valued arguments and return a real-valued result. It is used the same way a built-in function would be used in the generated quantities block.
该函数名为 relative_diff,声明为具有两个实值参数并返回一个实值结果。它的使用方式与生成量块中的内置函数相同。
User-defined functions block
用户自定义函数块
All functions are defined in their own block, which is labeled functions and must appear before all other program blocks. The user-defined functions block is optional.
所有函数都在自己的块中定义,该块标记为 functions,必须出现在所有其他程序块之前。用户自定义函数块是可选的。
Function bodies
函数体
The body (the part between the curly braces) contains ordinary Stan code, including local variables. The new function is used in the generated quantities block just as any of Stan’s built-in functions would be used.
函数体(花括号之间的部分)包含普通的 Stan 代码,包括局部变量。新函数在生成量块中的使用方式与 Stan 的任何内置函数相同。
Return statements
返回语句
Return statements, such as the one on the last line of the definition of relative_diff above, are only allowed in the bodies of function definitions. Return statements may appear anywhere in a function, but functions with non-void return types must end in a return statement.
返回语句,如上面 relative_diff 定义的最后一行,只允许在函数定义的函数体中使用。返回语句可以出现在函数的任何地方,但具有非 void 返回类型的函数必须以返回语句结尾。
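As a minimal sketch of these rules, the hypothetical function below (sign_or_zero is an illustrative name, not part of Stan) returns early from inside conditional branches and still ends in a return statement, as a non-void return type requires.

```stan
functions {
  // Hypothetical helper: returns 1, -1, or 0 according to the sign of x.
  real sign_or_zero(real x) {
    if (x > 0) {
      return 1;    // early return from inside a conditional
    } else if (x < 0) {
      return -1;
    }
    return 0;      // the body still ends in a return statement
  }
}
```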
Reject and error statements
拒绝和错误语句
The Stan reject statement provides a mechanism to report errors or problematic values encountered during program execution. It accepts any number of quoted string literals or Stan expressions as arguments. This statement is typically embedded in a conditional statement in order to detect bad or illegal outcomes of some processing step.
Stan 的 reject 语句提供了一种在程序执行期间报告错误或问题值的机制。它接受任意数量的带引号字符串字面量或 Stan 表达式作为参数。该语句通常嵌入在条件语句中,以检测某些处理步骤的错误或非法结果。
If an error is indicative of a problem from which it is not expected to be able to recover, Stan provides a fatal_error statement.
如果错误表明存在无法恢复的问题,Stan 提供了 fatal_error 语句。
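The two statements might be used as in the following sketch. The helper names checked_log and check_size are hypothetical, and the sketch assumes that fatal_error accepts the same kind of argument list as reject.

```stan
functions {
  // Hypothetical helper: log of a strictly positive value.
  real checked_log(real x) {
    if (!(x > 0)) {
      // arguments may mix quoted string literals and expressions
      reject("checked_log(x): x must be positive; found x = ", x);
    }
    return log(x);
  }

  // Hypothetical helper: a negative size is treated as unrecoverable.
  void check_size(int N) {
    if (N < 0) {
      fatal_error("check_size(N): N must be non-negative; found N = ", N);
    }
  }
}
```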
Catching errors
捕获错误
Rejection is used to flag errors that arise in inputs or in program state. It is far better to fail early with a localized informative error message than to run into problems much further downstream (as in rejecting a state or failing to compute a derivative).
拒绝用于标记输入或程序状态中出现的错误。尽早失败并提供局部化的信息性错误消息,远比在下游遇到问题(如拒绝状态或无法计算导数)要好得多。
The most common error check is to test that all of the arguments to a function are legal. The following function takes the square root of its input, so it requires non-negative inputs; it is coded to guard against illegal inputs.
最常见的错误检查是测试函数的所有参数是否合法。以下函数对其输入取平方根,因此需要非负输入;它被编码以防止非法输入。
real dbl_sqrt(real x) {
  if (!(x >= 0)) {
    reject("dbl_sqrt(x): x must be non-negative; found x = ", x);
  }
  return 2 * sqrt(x);
}
The negation of the non-negative test is important, because it also catches the case where x is a not-a-number value. If the condition had been coded as (x < 0) it would not catch the not-a-number case, though it could be written as (x < 0 || is_nan(x)). The positive infinite case is allowed through, but could also be checked with the is_inf(x) function. The square root function does not itself reject, but some downstream consumer of dbl_sqrt(-2) would be likely to raise an error, at which point finding the origin of the illegal input requires detective work. Or even worse, as Matt Simpson pointed out in the GitHub comments, the function could go into an infinite loop if it starts with an infinite value and tries to reduce it by arithmetic, likely consuming all available memory and crashing an interface. It is much better to catch errors early and report on their origin.
非负测试的否定很重要,因为它还捕获了 x 是非数字值的情况。如果条件被编码为 (x < 0),它将不会捕获非数字情况,尽管可以写成 (x < 0 || is_nan(x))。正无穷大的情况被允许通过,但也可以使用 is_inf(x) 函数检查。平方根函数本身不拒绝,但 dbl_sqrt(-2) 的某些下游使用者可能会引发错误,此时需要追查非法输入的来源。或者更糟的是,正如 Matt Simpson 在 GitHub 评论中指出的,如果函数从无穷大值开始并尝试通过算术减少它,可能会进入无限循环,可能会消耗所有可用内存并使接口崩溃。尽早捕获错误并报告其来源要好得多。
The effect of rejection depends on the program block in which the rejection is executed. In transformed data, rejections cause the program to fail to load. In transformed parameters or in the model block, rejections cause the current state to be rejected in the Metropolis sense.3
拒绝的效果取决于执行拒绝的程序块。在转换数据中,拒绝会导致程序无法加载。在转换参数或模型块中,拒绝会导致当前状态在 Metropolis 意义上被拒绝。4
In generated quantities there is no way to recover and generate the remaining parameters, so rejections cause subsequent values to be reported as NaNs. Extra care should be taken in calling functions which may reject in the generated quantities block.
在生成量中没有办法恢复并生成剩余的参数,因此拒绝会导致后续值报告为 NaN。在生成量块中调用可能拒绝的函数时应格外小心。
Type declarations for functions
函数的类型声明
Function argument and return types for vector and matrix types are not declared with their sizes, unlike type declarations for variables. Function argument type declarations may not include constraints, whether lower or upper bounds or structured constraints such as forming a simplex or correlation matrix (as is also the case for local variables); see the table of types in the reference manual for full details.
向量和矩阵类型的函数参数和返回类型不声明其大小,这与变量的类型声明不同。函数参数类型声明不能带有约束,无论是下界或上界,还是形成单纯形或相关矩阵等结构化约束(局部变量也是如此);完整详情请参阅参考手册中的类型表。
For example, here’s a function to compute the entropy of a categorical distribution with simplex parameter theta.
例如,这是一个计算具有单纯形参数 theta 的分类分布熵的函数。
real entropy(vector theta) {
  // entropy is the negated sum of theta[k] * log(theta[k])
  return -sum(theta .* log(theta));
}
Although theta must be a simplex, only the type vector is used.5
尽管 theta 必须是单纯形,但只使用 vector 类型。6
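Because the simplex constraint cannot be declared on the argument, it can be checked in the function body instead. The following is only a sketch: the name checked_entropy and the tolerance tol are assumptions made for illustration.

```stan
functions {
  // Hypothetical variant of entropy() that validates its argument.
  real checked_entropy(vector theta) {
    real tol = 1e-8;  // assumed tolerance for the sum-to-one check
    if (!(min(theta) >= 0)) {
      reject("checked_entropy(theta): entries must be non-negative; ",
             "found min(theta) = ", min(theta));
    }
    if (!(abs(sum(theta) - 1) <= tol)) {
      reject("checked_entropy(theta): entries must sum to 1; ",
             "found sum(theta) = ", sum(theta));
    }
    return -sum(theta .* log(theta));
  }
}
```

Both tests are written in negated form so that not-a-number entries are also rejected. An entry that is exactly zero still yields a not-a-number result, because 0 * log(0) is not defined in floating point.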
Upper or lower bounds on values or constrained types are not allowed as return types or argument types in function declarations.
值的上界或下界或约束类型不允许作为函数声明中的返回类型或参数类型。
Array types for function declarations
函数声明的数组类型
Array arguments have their own syntax, which follows that used in this manual for function signatures. For example, a function that operates on a two-dimensional array to produce a one-dimensional array might be declared as follows.
数组参数有自己的语法,遵循本手册中用于函数签名的语法。例如,对二维数组进行操作以产生一维数组的函数可能声明如下。
array[] real baz(array[,] real x);
The notation [ ] is used for one-dimensional arrays (as in the return above), [ , ] for two-dimensional arrays, [ , , ] for three-dimensional arrays, and so on.
符号 [ ] 用于一维数组(如上面的返回),[ , ] 用于二维数组,[ , , ] 用于三维数组,依此类推。
Functions support arrays of any type, including matrix and vector types. As with other types, no constraints are allowed.
函数支持任何类型的数组,包括矩阵和向量类型。与其他类型一样,不允许约束。
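A definition matching this kind of signature might look like the following sketch, in which the function name row_sums is hypothetical. The argument and return types carry no sizes, whereas the local array s is declared with one.

```stan
functions {
  // Hypothetical example: sum each row of a two-dimensional array.
  array[] real row_sums(array[,] real x) {
    int M = size(x);      // number of rows (first dimension)
    array[M] real s;      // local variables are declared with sizes
    for (m in 1:M) {
      s[m] = sum(x[m]);   // x[m] is a one-dimensional array of reals
    }
    return s;
  }
}
```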
Data-only function arguments
纯数据函数参数
A function argument which is a real-valued type or a container of a real-valued type, i.e., not an integer type or integer array type, can be qualified using the prefix qualifier data. The following is an example of a data-only function argument.
作为实值类型或实值类型容器的函数参数(即不是整数类型或整数数组类型)可以使用前缀限定符 data 进行限定。以下是一个纯数据函数参数的示例。
real foo(real y, data real mu) {
  return -0.5 * (y - mu)^2;
}
This qualifier restricts this argument to being invoked with expressions which consist only of data variables, transformed data variables, literals, and function calls. A data-only function argument cannot involve real variables declared in the parameters, transformed parameters, or model block. Attempts to invoke a function using an expression which contains parameter, transformed parameters, or model block variables as a data-only argument will result in an error message from the parser.
此限定符将此参数的调用限制为仅包含数据变量、转换数据变量、字面量和函数调用的表达式。纯数据函数参数不能涉及在参数、转换参数或模型块中声明的实变量。尝试使用包含参数、转换参数或模型块变量的表达式作为纯数据参数调用函数将导致解析器产生错误消息。
Use of the data qualifier must be consistent between the forward declaration and the definition of a function.
data 限定符的使用必须在函数的前向声明和定义之间保持一致。
This qualifier should be used when writing functions that call the built-in ordinary differential equation (ODE) solvers, algebraic solvers, or map functions. These higher-order functions have strictly specified signatures in which some arguments must be data-only expressions. (See the ODE solver chapter for more usage details and the functions reference manual for full definitions.) When writing a function which calls the ODE or algebraic solver, arguments to that function which are passed into the call to the solver, either directly or indirectly, should have the data prefix qualifier. This allows for compile-time type checking and increases overall program understandability.
在编写调用内置常微分方程(ODE)求解器、代数求解器或映射函数的函数时应使用此限定符。这些高阶函数具有严格指定的签名,其中某些参数仅为数据表达式。(有关更多使用详细信息,请参阅 ODE 求解器章节,完整定义请参阅函数参考手册。)在编写调用 ODE 或代数求解器的函数时,传递给求解器调用的该函数的参数(直接或间接)应具有 data 前缀限定符。这允许编译时类型检查并提高整体程序的可理解性。
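The restriction can be illustrated with a small program built around the foo function defined earlier. The variable names mu_obs and mu_par are hypothetical; the commented-out call is the kind of use the parser is described as rejecting.

```stan
functions {
  real foo(real y, data real mu) {
    return -0.5 * (y - mu)^2;
  }
}
data {
  real mu_obs;
}
parameters {
  real y;
  real mu_par;
}
model {
  target += foo(y, mu_obs);     // accepted: mu_obs is a data variable
  target += foo(y, 2.5);        // accepted: a literal is a data-only expression
  // target += foo(y, mu_par);  // not accepted: mu_par is a parameter
  mu_par ~ std_normal();
}
```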
22.2 Functions as statements
函数作为语句
In some cases, it makes sense to have functions that do not return a value. For example, a routine to print the lower-triangular portion of a matrix can be defined as follows.
在某些情况下,不返回值的函数是有意义的。例如,可以定义一个打印矩阵下三角部分的例程,如下所示。
functions {
  void pretty_print_tri_lower(matrix x) {
    if (rows(x) == 0) {
      print("empty matrix");
      return;
    }
    print("rows=", rows(x), " cols=", cols(x));
    for (m in 1:rows(x)) {
      for (n in 1:m) {
        print("[", m, ",", n, "]=", x[m, n]);
      }
    }
  }
}
The special symbol void is used as the return type. This is not a type itself in that there are no values of type void; it merely indicates the lack of a value. As such, return statements for void functions are not allowed to have arguments, as in the return statement in the body of the previous example.
特殊符号 void 用作返回类型。这本身不是一种类型,因为没有 void 类型的值;它仅表示缺少值。因此,void 函数的返回语句不允许有参数,如前面示例主体中的返回语句。
Void functions applied to appropriately typed arguments may be used on their own as statements. For example, the pretty-print function defined above may be applied to a covariance matrix being defined in the transformed parameters block.
应用于适当类型参数的 void 函数可以单独用作语句。例如,上面定义的美化打印函数可以应用于在转换参数块中定义的协方差矩阵。
transformed parameters {
  cov_matrix[K] Sigma;
  // ... code to set Sigma ...
  pretty_print_tri_lower(Sigma);
  // ...
}
22.3 Functions accessing the log probability accumulator
访问对数概率累加器的函数
Functions whose names end in _lp are allowed to use sampling statements and target += statements; other functions are not. Because of this access, their use is restricted to the transformed parameters and model blocks.
名称以 _lp 结尾的函数允许使用采样语句和 target += 语句;其他函数则不允许。由于这种访问权限,它们的使用仅限于转换参数和模型块。
Here is an example of a function that assigns standard normal priors to a vector of coefficients, assigns priors to a center and scale, and returns the translated and scaled coefficients; see the reparameterization section for more information on efficient non-centered parameterizations.
以下是一个函数示例,为系数向量分配标准正态先验,以及中心和尺度,并返回平移和缩放后的系数;有关高效非中心参数化的更多信息,请参阅重参数化部分。
functions {
  vector center_lp(vector beta_raw, real mu, real sigma) {
    beta_raw ~ std_normal();
    sigma ~ cauchy(0, 5);
    mu ~ cauchy(0, 2.5);
    return sigma * beta_raw + mu;
  }
  // ...
}
parameters {
  vector[K] beta_raw;
  real mu_beta;
  real<lower=0> sigma_beta;
  // ...
}
transformed parameters {
  vector[K] beta;
  // ...
  beta = center_lp(beta_raw, mu_beta, sigma_beta);
  // ...
}
22.4 Functions implementing change-of-variable adjustments
实现变量变换调整的函数
Functions whose names end in _jacobian can use the jacobian += statement. This can be used to implement a custom change of variables for arbitrary parameters.
名称以 _jacobian 结尾的函数可以使用 jacobian += 语句。这可以用来实现任意参数的自定义变量变换。
For example, this function recreates the built-in <upper=ub> transform on real numbers:
例如,这个函数重现了实数上内置的 <upper=ub> 变换:
real my_upper_bound_jacobian(real x, real ub) {
  // log absolute derivative of ub - exp(x) with respect to x is x
  jacobian += x;
  return ub - exp(x);
}
It can be used as a replacement for real<upper=ub> as follows:
它可以用来替代 real<upper=ub>,如下所示:
functions {
  // my_upper_bound_jacobian as above
}
data {
  real ub;
}
parameters {
  real b_raw;
}
transformed parameters {
  real b = my_upper_bound_jacobian(b_raw, ub);
}
model {
  b ~ lognormal(0, 1);
  // ...
}
22.5 Functions acting as random number generators
作为随机数生成器的函数
A user-specified function can be declared to act as a (pseudo) random number generator (PRNG) by giving it a name that ends in _rng. Giving a function a name that ends in _rng allows it to access built-in functions and user-defined functions that end in _rng, which includes all the built-in PRNG functions. Only functions ending in _rng are able to access the built-in PRNG functions. The use of functions ending in _rng must therefore be restricted to transformed data and generated quantities blocks like other PRNG functions; they may also be used in the bodies of other user-defined functions ending in _rng.
用户指定的函数可以通过给它一个以 _rng 结尾的名称来声明为(伪)随机数生成器(PRNG)。给函数一个以 _rng 结尾的名称允许它访问以 _rng 结尾的内置函数和用户定义函数,其中包括所有内置的 PRNG 函数。只有以 _rng 结尾的函数才能访问内置的 PRNG 函数。因此,以 _rng 结尾的函数的使用必须限制在转换数据和生成量块中,就像其他 PRNG 函数一样;它们也可以用在其他以 _rng 结尾的用户定义函数的主体中。
For example, the following function generates an \(N \times K\) data matrix, the first column of which is filled with 1 values for the intercept and the remaining entries of which have values drawn from a standard normal PRNG.
例如,以下函数生成一个 \(N \times K\) 数据矩阵,其第一列填充为截距的值 1,其余条目的值从标准正态 PRNG 中抽取。
matrix predictors_rng(int N, int K) {
  matrix[N, K] x;
  for (n in 1:N) {
    x[n, 1] = 1.0;  // intercept
    for (k in 2:K) {
      x[n, k] = normal_rng(0, 1);
    }
  }
  return x;
}
The following function defines a simulator for regression outcomes based on a data matrix x, coefficients beta, and noise scale sigma.
以下函数定义了基于数据矩阵 x、系数 beta 和噪声尺度 sigma 的回归结果的模拟器。
vector regression_rng(vector beta, matrix x, real sigma) {
  vector[rows(x)] y;
  vector[rows(x)] mu;
  mu = x * beta;
  for (n in 1:rows(x)) {
    y[n] = normal_rng(mu[n], sigma);
  }
  return y;
}
These might be used in a generated quantities block to simulate some fake data from a fitted regression model as follows.
这些可以在生成量块中用来从拟合的回归模型模拟一些虚假数据,如下所示:
parameters {
  vector[K] beta;
  real<lower=0> sigma;
  // ...
}
generated quantities {
  matrix[N_sim, K] x_sim;
  vector[N_sim] y_sim;
  x_sim = predictors_rng(N_sim, K);
  y_sim = regression_rng(beta, x_sim, sigma);
}
A more sophisticated simulation might fit a multivariate normal to the predictors x and use the resulting parameters to generate multivariate normal draws for x_sim.
更复杂的模拟可能会对预测器 x 拟合多元正态分布,并使用结果参数为 x_sim 生成多元正态抽取。
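One possible shape for such a generator is sketched below. The function name mvn_predictors_rng and its arguments mu_x and Sigma_x are hypothetical; they stand for a mean vector and covariance matrix assumed to have been estimated for the non-intercept predictors.

```stan
functions {
  // Hypothetical generator: an intercept column followed by
  // multivariate normal draws for the remaining predictors.
  matrix mvn_predictors_rng(int N, vector mu_x, matrix Sigma_x) {
    int K = num_elements(mu_x);  // number of non-intercept predictors
    matrix[N, K + 1] x;
    for (n in 1:N) {
      x[n, 1] = 1.0;                                       // intercept
      x[n, 2:(K + 1)] = multi_normal_rng(mu_x, Sigma_x)';  // one draw per row
    }
    return x;
  }
}
```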
22.6 User-defined probability functions
用户定义的概率函数
Probability functions are distinguished in Stan by names ending in _lpdf for density functions and _lpmf for mass functions; in both cases, they must have real return types.
概率函数在 Stan 中通过以 _lpdf 结尾的密度函数和以 _lpmf 结尾的质量函数名称来区分;在这两种情况下,它们都必须具有 real 返回类型。
Suppose a model uses several standard normal distributions and, for the sake of illustration, that the built-in std_normal distribution were not available. Rather than writing out the location of 0 and scale of 1 for each of them, a new density function may be defined and reused.
假设一个模型使用了几个标准正态分布,并且为了便于说明,假定没有内置的 std_normal 分布可用。与其为所有分布都写出位置 0 和尺度 1,不如定义并重用一个新的密度函数。
functions {
  real unit_normal_lpdf(real y) {
    return normal_lpdf(y | 0, 1);
  }
}
// ...
model {
  alpha ~ unit_normal();
  beta ~ unit_normal();
  // ...
}
The ability to use the unit_normal function as a density is keyed off its name ending in _lpdf (names ending in _lpmf for probability mass functions work the same way).
将 unit_normal 函数用作密度的能力源于其以 _lpdf 结尾的名称(概率质量函数以 _lpmf 结尾的名称工作方式相同)。
In general, if foo_lpdf is defined to consume \(N + 1\) arguments, then
一般来说,如果 foo_lpdf 被定义为消耗 \(N + 1\) 个参数,那么
y ~ foo(theta1, ..., thetaN);
can be used as shorthand for
可以用作以下内容的简写:
target += foo_lpdf(y | theta1, ..., thetaN);
As with the built-in functions, the suffix _lpdf is dropped and the first argument moves to the left of the tilde symbol (~) in the distribution statement.
与内置函数一样,后缀 _lpdf 被去掉,第一个参数移动到分布语句中的波浪号符号(~)的左侧。
Functions ending in _lpmf (for probability mass functions) behave exactly the same way. The difference is that the first argument of a density function (_lpdf) must be continuous (not an integer or integer array), whereas the first argument of a mass function (_lpmf) must be discrete (integer or integer array).
以 _lpmf 结尾的函数(用于概率质量函数)的行为方式完全相同。区别在于密度函数(_lpdf)的第一个参数必须是连续的(不是整数或整数数组),而质量函数(_lpmf)的第一个参数必须是离散的(整数或整数数组)。
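A mass function can be sketched in the same way as the density example. The distribution shifted_poisson below is hypothetical and chosen only for illustration; its first argument is an integer, as required for the _lpmf suffix.

```stan
functions {
  // Hypothetical mass function: a Poisson shifted to start at 1.
  real shifted_poisson_lpmf(int y, real lambda) {
    return poisson_lpmf(y - 1 | lambda);
  }
}
data {
  int<lower=1> y;
}
parameters {
  real<lower=0> lambda;
}
model {
  // shorthand for: target += shifted_poisson_lpmf(y | lambda);
  y ~ shifted_poisson(lambda);
}
```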
22.7 Overloading functions
函数重载
As described in the reference manual, function overloading is permitted in Stan, beginning in version 2.29.
如参考手册中所述,从 2.29 版本开始,Stan 中允许函数重载。
This means multiple functions can be defined with the same name as long as they accept different numbers or types of arguments. User-defined functions can also overload Stan library functions.
这意味着只要函数接受不同数量或类型的参数,就可以定义多个同名函数。用户定义函数也可以重载 Stan 库函数。
Warning on usage
使用警告
Overloading is a powerful productivity tool in programming languages, but it can also lead to confusion. In particular, it can be unclear at first glance which version of a function is being called at any particular call site, especially with type promotion allowed between scalar types. Because of this, it is a programming best practice that overloaded functions maintain the same meaning across definitions.
重载是编程语言中强大的生产力工具,但也可能导致混乱。特别是,乍一看可能不清楚在任何特定调用站点调用的是哪个版本的函数,特别是在标量类型之间允许类型提升的情况下。因此,编程最佳实践是重载函数在所有定义中保持相同的含义。
For example, consider a function triple which has the following three signatures:
例如,考虑一个具有以下三个签名的函数 triple:
real triple(real x);
complex triple(complex x);
array[] real triple(array[] real x);
One should expect that all overloads of this function perform the same basic task. This should lead to definitions of these functions that satisfy the following assumptions, which someone reading the program would expect:
人们应该期望这个函数的所有重载执行相同的基本任务。这应该导致这些函数的定义满足阅读程序的人期望的以下假设:
// The function does what it says
triple(3.0) == 9.0
// It is defined reasonably for different types
triple(to_complex(3.0)) == to_complex(triple(3.0))
// A container version of this function works by element
// (Stan arrays are indexed from 1)
triple({3.0, 4.0})[1] == triple({3.0, 4.0}[1])
Note that none of these properties are enforced by Stan; they are mentioned merely to warn against uses of overloading which cause confusion.
请注意,Stan 不强制执行这些属性,提及它们仅是为了警告避免使用会引起困惑的重载。
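One set of definitions consistent with these expectations is sketched below. It is only one reasonable choice, not the only one; the complex overload is written with to_complex, get_real, and get_imag so that it does not rely on promotion inside arithmetic.

```stan
functions {
  real triple(real x) {
    return 3 * x;
  }
  complex triple(complex z) {
    // scale the real and imaginary parts separately
    return to_complex(3 * get_real(z), 3 * get_imag(z));
  }
  array[] real triple(array[] real x) {
    array[size(x)] real y;
    for (i in 1:size(x)) {
      y[i] = triple(x[i]);  // x[i] is real, so the real overload is called
    }
    return y;
  }
}
```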
Function resolution
函数解析
Stan resolves overloaded functions by the number and type of arguments passed to the function. This can be subtle when multiple signatures with the same number of arguments are present.
Stan 通过传递给函数的参数的数量和类型来解析重载函数。当存在具有相同参数数量的多个签名时,这可能很微妙。
Consider the following function signatures.
考虑以下函数签名
real foo(int a, real b);
real foo(real a, real b);
Given these, the function call foo(1.5, 2.5) is unambiguous - it must resolve to the second signature. But, the function call foo(1, 1.5) could be valid for either under Stan’s promotion rules, which allow integers to be promoted to real numbers.
鉴于此,函数调用 foo(1.5, 2.5) 是明确的——它必须解析为第二个签名。但是,函数调用 foo(1, 1.5) 在 Stan 的提升规则下可能对_任一_都有效,该规则允许整数提升为实数。
To resolve this, Stan selects the signature which requires the fewest promotions for a given function call. In the above case, this means the call foo(1, 1.5) would select the first signature, because it requires 0 promotions (the second signature would require 1 promotion).
为了解决这个问题,Stan 选择给定函数调用所需提升次数最少的签名。在上述情况下,这意味着调用 foo(1, 1.5) 将选择第一个签名,因为它需要 0 次提升(第二个签名需要 1 次提升)。
Furthermore, there must be only one such signature, i.e., the minimum number of promotions must be a unique minimum. This requirement forbids certain kinds of overloading. For example, consider the function signatures
此外,必须只有一个这样的签名,即给定函数调用的最小提升次数必须是唯一的。这个要求限制了某些类型的重载。例如,考虑函数签名
real bar(int x, real y);
real bar(real x, int y);
These signatures do not have a unique minimum number of promotions for the call bar(1, 2). Both signatures require one int to real promotion, and so it cannot be determined which is correct. Stan will produce a compilation error in this case.
这些签名对于调用 bar(1, 2) 没有唯一的最小提升次数。两个签名都需要一个 int 到 real 的提升,因此无法确定哪个是正确的。在这种情况下,Stan 将产生编译错误。
Promotion from integers to complex numbers is considered to be two separate promotions, first from int to real, then from real to complex. This means that integer arguments will “prefer” a signature with real types over complex types.
从整数到复数的提升被认为是两个独立的提升,首先从 int 到 real,然后从 real 到 complex。这意味着整数参数将“偏好”具有实数类型而不是复数类型的签名。
For example, consider the function signatures
例如,考虑函数签名
real pop(real x);
real pop(complex x);
Stan will select the first signature when pop is called with an integer argument such as pop(0).
当使用整数参数(如 pop(0))调用 pop 时,Stan 将选择第一个签名。
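The resolution rules can be traced in a small program. In this sketch the two foo overloads are given hypothetical bodies that return 1 and 2, so the value returned shows which signature was selected; the comments count the promotions each call would need.

```stan
functions {
  real foo(int a, real b) {
    return 1;  // marks the (int, real) signature
  }
  real foo(real a, real b) {
    return 2;  // marks the (real, real) signature
  }
}
transformed data {
  // (int, real): 0 promotions; (real, real): 1 promotion
  real r1 = foo(1, 1.5);    // selects the first signature
  // only (real, real) is applicable
  real r2 = foo(1.5, 2.5);  // selects the second signature
  // (int, real): 1 promotion; (real, real): 2 promotions
  real r3 = foo(1, 2);      // selects the first signature
}
```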
22.8 Documenting functions
函数文档
Functions will ideally be documented at their interface level. The Stan style guide for function documentation follows the same format as used by the Doxygen (C++) and Javadoc (Java) automatic documentation systems. Such specifications indicate the variables and their types and the return value, prefaced with some descriptive text.
理想情况下,函数应在其接口级别进行文档化。Stan 函数文档的样式指南遵循 Doxygen(C++)和 Javadoc(Java)自动文档系统使用的相同格式。这些规范指示变量及其类型和返回值,并以一些描述性文本作为前言。
For example, here’s some documentation for the prediction matrix generator.
例如,这是预测矩阵生成器的一些文档。
/**
 * Return a data matrix of specified size with rows
 * corresponding to items and the first column filled
 * with the value 1 to represent the intercept and the
 * remaining columns randomly filled with unit-normal draws.
 *
 * @param N Number of rows corresponding to data items
 * @param K Number of predictors, counting the intercept, per
 *          item.
 * @return Simulated predictor matrix.
 */
matrix predictors_rng(int N, int K) {
  // ...
The comment begins with /**, ends with */, and has an asterisk (*) on each line. It uses @param followed by the argument’s identifier to document a function argument. The tag @return is used to indicate the return value. Stan does not (yet) have an automatic documentation generator like Javadoc or Doxygen, so this just looks like a big comment starting with /* and ending with */ to the Stan parser.
注释以 /** 开始,以 */ 结束,每行都有一个星号(*)。它使用 @param 后跟参数的标识符来记录函数参数。标签 @return 用于指示返回值。Stan(尚)没有像 Javadoc 或 Doxygen 那样的自动文档生成器,所以对于 Stan 解析器来说,这只是一个以 /* 开始并以 */ 结束的大注释。
For functions that raise exceptions, exceptions can be documented using @throws.7
对于引发异常的函数,可以使用 @throws 记录异常。8
For example,
例如,
/** ...
 * @param theta
 * @throws If any of the entries of theta is negative.
 */
real entropy(vector theta) {
  // ...
}
Usually an exception type would be provided, but these are not exposed as part of the Stan language, so there is no need to document them.
通常会提供异常类型,但这些不作为 Stan 语言的一部分公开,因此无需记录它们。
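Putting the three tags together, the dbl_sqrt function from the section on catching errors might be documented as in the following sketch. The wording of the documentation comment is illustrative only.

```stan
functions {
  /**
   * Return twice the square root of the argument.
   *
   * @param x Value whose square root is taken; must be non-negative.
   * @return Twice the square root of x.
   * @throws If x is negative or not-a-number.
   */
  real dbl_sqrt(real x) {
    if (!(x >= 0)) {
      reject("dbl_sqrt(x): x must be non-negative; found x = ", x);
    }
    return 2 * sqrt(x);
  }
}
```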
22.9 Summary of function types
函数类型总结
Functions may have a void or non-void return type and they may or may not have one of the special suffixes, _lpdf, _lpmf, _lp, or _rng.
函数可能具有 void 或非 void 返回类型,它们可能有也可能没有特殊后缀之一,_lpdf、_lpmf、_lp 或 _rng。
Void vs. non-void return
Void 与非 void 返回
Only functions declared to return void may be used as statements. These are also the only functions that use return statements with no arguments.
只有声明返回 void 的函数才能用作语句。这些也是唯一使用没有参数的 return 语句的函数。
Only functions declared to return non-void values may be used as expressions. These functions require return statements with arguments of a type that matches the declared return type.
只有声明返回非 void 值的函数才能用作表达式。这些函数需要具有与声明的返回类型匹配的类型的参数的 return 语句。
Suffixed or non-suffixed
带后缀或不带后缀
Only functions ending in _lpmf or _lpdf and with return type real may be used as probability functions in distribution statements.
只有以 _lpmf 或 _lpdf 结尾且返回类型为 real 的函数才能用作分布语句中的概率函数。
Only functions ending in _lp may access the log probability accumulator through distribution statements or target += statements. Such functions may only be used in the transformed parameters or model blocks.
只有以 _lp 结尾的函数才能通过分布语句或 target += 语句访问对数概率累加器。此类函数只能在转换参数或模型块中使用。
Only functions ending in _rng may access the built-in pseudo-random number generators. Such functions may only be used in the generated quantities block or transformed data block, or in the bodies of other user-defined functions ending in _rng.
只有以 _rng 结尾的函数才能访问内置的伪随机数生成器。此类函数只能在生成量块或转换数据块中使用,或在以 _rng 结尾的其他用户定义函数的主体中使用。
22.10 Recursive functions
递归函数
Stan supports recursive function definitions, which can be useful for some applications. For instance, consider the matrix power operation, \(A^n\), which is defined for a square matrix \(A\) and non-negative integer \(n\) by
Stan 支持递归函数定义,这对某些应用可能很有用。例如,考虑矩阵幂运算 \(A^n\),它对于方阵 \(A\) 和非负整数 \(n\) 定义为
\[ A^n = \begin{cases} \textrm{I} & \quad\text{if } n = 0, \text{ and} \\ A \, A^{n-1} & \quad\text{if } n > 0. \end{cases} \]
where \(\textrm{I}\) is the identity matrix. This definition can be directly translated to a recursive function definition.
其中 \(\textrm{I}\) 是单位矩阵。这个定义可以直接转换为递归函数定义。
matrix matrix_pow(matrix a, int n) {
  if (n == 0) {
    return diag_matrix(rep_vector(1, rows(a)));
  } else {
    return a * matrix_pow(a, n - 1);
  }
}
It would be more efficient to stop the recursion before it reaches the base case by adding the following conditional clause.
为避免递归深入到基本情况,可以添加以下条件子句以提高效率。
else if (n == 1) {
  return a;
}
22.11 Truncated random number generation
截断随机数生成
Generation with inverse CDFs
使用逆 CDF 生成
To generate random numbers, it is often sufficient to invert their cumulative distribution functions. This is built into many of the random number generators. For example, to generate a standard logistic variate, first generate a uniform variate \(u \sim \textsf{uniform}(0, 1)\), then run through the inverse cumulative distribution function, \(y = \textrm{logit}(u)\). If this were not already built in as logistic_rng(0, 1), it could be coded in Stan directly as
要生成随机数,通常只需反转其累积分布函数即可。这已内置在许多随机数生成器中。例如,要生成标准 logistic 变量,首先生成均匀变量 \(u \sim \textsf{uniform}(0, 1)\),然后通过逆累积分布函数运行,\(y = \textrm{logit}(u)\)。如果这还没有内置为 logistic_rng(0, 1),可以在 Stan 中直接编码为
real standard_logistic_rng() {
  real u = uniform_rng(0, 1);
  real y = logit(u);
  return y;
}
Following the same pattern, a standard normal RNG could be coded as
按照相同的模式,标准正态 RNG 可以编码为
real standard_normal_rng() {
  real u = uniform_rng(0, 1);
  real y = inv_Phi(u);
  return y;
}
that is, \(y = \Phi^{-1}(u)\), where \(\Phi^{-1}\) is the inverse cumulative distribution function for the standard normal distribution, implemented in the Stan function inv_Phi.
即 \(y = \Phi^{-1}(u)\),其中 \(\Phi^{-1}\) 是标准正态分布的逆累积分布函数,在 Stan 函数 inv_Phi 中实现。
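The same pattern applies to any distribution whose CDF can be inverted in closed form. As a further sketch, an exponential distribution with rate lambda has CDF \(F(y) = 1 - \exp(-\lambda y)\), so that \(F^{-1}(u) = -\log(1 - u) / \lambda\). The function name my_exponential_rng is hypothetical; Stan already provides exponential_rng.

```stan
functions {
  // Hypothetical inverse-CDF generator for an exponential variate.
  real my_exponential_rng(real lambda) {
    real u = uniform_rng(0, 1);
    real y = -log1m(u) / lambda;  // inverse CDF: -log(1 - u) / lambda
    return y;
  }
}
```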
In order to generate non-standard variates of the location-scale variety, the variate is scaled by the scale parameter and shifted by the location parameter. For example, to generate \(\textsf{normal}(\mu, \sigma)\) variates, it is enough to generate a uniform variate \(u \sim \textsf{uniform}(0, 1)\), then convert it to a standard normal variate, \(z = \Phi^{-1}(u)\), where \(\Phi^{-1}(\cdot)\) is the inverse cumulative distribution function for the standard normal, and then, finally, scale and translate it, \(y = \mu + \sigma \times z\). In code,
为了生成位置-尺度类型的非标准变量,变量通过尺度参数缩放并通过位置参数移动。例如,要生成 \(\textsf{normal}(\mu, \sigma)\) 变量,只需生成均匀变量 \(u \sim \textsf{uniform}(0, 1)\),然后将其转换为标准正态变量,\(z = \Phi^{-1}(u)\),其中 \(\Phi^{-1}(\cdot)\) 是标准正态的逆累积分布函数,然后最后缩放和平移它,\(y = \mu + \sigma \times z\)。在代码中,
real my_normal_rng(real mu, real sigma) {
  real u = uniform_rng(0, 1);
  real z = inv_Phi(u);
  real y = mu + sigma * z;
  return y;
}
A robust version of this function would test that the arguments are finite and that sigma is non-negative, e.g.,
此函数的健壮版本将测试参数是有限的,且 sigma 是非负的,例如,
if (is_nan(mu) || is_inf(mu)) {
  reject("my_normal_rng: mu must be finite; ",
         "found mu = ", mu);
}
if (is_nan(sigma) || is_inf(sigma) || sigma < 0) {
  reject("my_normal_rng: sigma must be finite and non-negative; ",
         "found sigma = ", sigma);
}
Truncated variate generation
截断变量生成
Often truncated variates are needed, as in survival analysis when a time of death is censored beyond the end of the observations. To generate a truncated random variate, the cumulative distribution function is used to find the probabilities corresponding to the truncation points, a uniform variate is generated within that range, and then the inverse CDF translates it back.
经常需要截断的变量,如在生存分析中,当死亡时间在观察结束后被删失时。要生成截断的随机变量,先用累积分布函数求出截断点对应的概率,在该范围内生成均匀变量,然后用逆 CDF 将其转换回来。
Truncating below
下截断
For example, the following code generates a \(\textsf{Weibull}(\alpha, \sigma)\) variate truncated below at a time \(t\),9
例如,以下代码生成在时间 \(t\) 处下截断的 \(\textsf{Weibull}(\alpha, \sigma)\) 变量,10
real weibull_lb_rng(real alpha, real sigma, real t) {
  real p = weibull_cdf(t | alpha, sigma);   // cdf for lb
  real u = uniform_rng(p, 1);               // unif in bounds
  real y = sigma * (-log1m(u))^inv(alpha);  // inverse cdf
  return y;
}
Truncating above and below
上下截断
If there is a lower bound and upper bound, then the CDF trick is used twice to find a lower and upper bound. For example, to generate a \(\textsf{normal}(\mu, \sigma)\) truncated to a region \((a, b)\), the following code suffices,
如果有下界和上界,则 CDF 技巧被使用两次以找到下界和上界。例如,要生成截断到区域 \((a, b)\) 的 \(\textsf{normal}(\mu, \sigma)\),以下代码就足够了,
real normal_lub_rng(real mu, real sigma, real lb, real ub) {
  real p_lb = normal_cdf(lb | mu, sigma);
  real p_ub = normal_cdf(ub | mu, sigma);
  real u = uniform_rng(p_lb, p_ub);
  real y = mu + sigma * inv_Phi(u);
  return y;
}
To make this more robust, all variables should be tested for finiteness, sigma should be tested for positiveness, and lb and ub should be tested to ensure the upper bound is greater than the lower bound. While it may be tempting to compress lines, the variable names serve as a kind of chunking of operations and naming for readability; compare the multiple statement version above with the single statement
为了使其更健壮,应测试所有变量的有限性,应测试 sigma 的正性,应测试 lb 和 ub 以确保上界大于下界。虽然压缩行可能很诱人,但变量名称用作操作分块和命名以提高可读性;将上面的多语句版本与单语句进行比较,可以明显看出多语句版本的可读性更好。
return mu + sigma * inv_Phi(uniform_rng(normal_cdf(lb | mu, sigma),
                                        normal_cdf(ub | mu, sigma)));
for readability. The names like p indicate probabilities, and p_lb and p_ub indicate the probabilities of the bounds. The variable u is clearly named as a uniform variate, and y is used to denote the variate being generated itself.
以提高可读性。像 p 这样的名称表示概率,p_lb 和 p_ub 表示边界的概率。变量 u 明确命名为均匀变量,y 用于表示正在生成的变量本身。
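The robustness checks described for the doubly truncated normal can be sketched as follows. The function name normal_lub_checked_rng and the wording of the messages are hypothetical; the tests are written in negated form where that also catches not-a-number arguments.

```stan
functions {
  // Hypothetical checked variant of normal_lub_rng.
  real normal_lub_checked_rng(real mu, real sigma, real lb, real ub) {
    if (is_nan(mu) || is_inf(mu)) {
      reject("normal_lub_checked_rng: mu must be finite; ",
             "found mu = ", mu);
    }
    if (!(sigma > 0) || is_inf(sigma)) {
      reject("normal_lub_checked_rng: sigma must be finite and positive; ",
             "found sigma = ", sigma);
    }
    if (is_inf(lb) || is_inf(ub) || !(lb < ub)) {
      reject("normal_lub_checked_rng: bounds must be finite with lb < ub; ",
             "found lb = ", lb, ", ub = ", ub);
    }
    real p_lb = normal_cdf(lb | mu, sigma);  // probability of the lower bound
    real p_ub = normal_cdf(ub | mu, sigma);  // probability of the upper bound
    real u = uniform_rng(p_lb, p_ub);        // uniform within the bounds
    real y = mu + sigma * inv_Phi(u);        // inverse cdf, then scale and shift
    return y;
  }
}
```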
The main problem with comments is that they can be misleading, either due to misunderstandings on the programmer’s part or because the program’s behavior is modified after the comment is written. The program always behaves the way the code is written, which is why refactoring complex code into understandable units is preferable to simply adding comments.
注释的主要问题是它们可能具有误导性,要么是由于程序员的误解,要么是因为在编写注释后程序的行为被修改。程序始终按照代码编写的方式运行,这就是为什么将复杂代码重构为可理解的单元比简单地添加注释更可取。
Just because this makes it possible to code a rejection sampler does not make it a good idea. Rejections break differentiability and the smooth exploration of the posterior. In Hamiltonian Monte Carlo, it can cause the sampler to be reduced to a diffusive random walk.
仅仅因为这使得编码拒绝采样器成为可能并不意味着这是个好主意。拒绝会破坏可微性和后验的平滑探索。在哈密顿蒙特卡罗中,它可能导致采样器退化为扩散随机游走。
A range of built-in validation routines is coming to Stan soon! Alternatively, the reject statement can be used to check constraints on the simplex.
Stan 即将推出一系列内置验证例程!或者,可以使用 reject 语句来检查单纯形上的约束。
As of Stan 2.9.0, the only way a user-defined producer will raise an exception is if a function it calls (including distribution statements) raises an exception via the reject statement.
截至 Stan 2.9.0,用户定义的生产者引发异常的唯一方法是它调用的函数(包括分布语句)通过拒绝语句引发异常。
The original code and impetus for including this in the manual came from a Stan forums post by user lcomm, who also explained truncation above and below.