20  多重索引和范围索引

Multiple Indexing and Range Indexing

本节译者:王嘉宁

本节校审:张梓源(ChatGPT辅助)

Stan allows multiple indexes to be provided for containers (i.e., arrays, vectors, and matrices) in a single position, using either an array of integer indexes or range bounds. In many cases, there are functions that provide similar behavior.

Stan 支持在单个位置使用整数数组或范围边界对容器(即数组、向量和矩阵)进行多重索引。此外,还有一些函数可以实现类似的功能。

Allowing multiple indexes supports inline vectorization of models. For instance, consider the data model for a varying-slope, varying-intercept hierarchical linear regression, which could be coded as

使用多重索引有助于模型内联向量化。例如,下述代码展示了一个变斜率、变截距的层次线性回归模型的似然函数:

for (n in 1:N) {
  y[n] ~ normal(alpha[ii[n]] + beta[ii[n]] * x[n], sigma);
}

With multiple indexing, this can be coded in one line, leading to more efficient vectorized code.

通过多重索引,这可以用一行代码实现,从而产生更高效的向量化代码。

y ~ normal(alpha[ii] + rows_dot_product(beta[ii], x), sigma);

This latter version is faster than the loop version; it is equivalent in speed to the clunky assignment to a local variable.

后一种写法比循环写法更为高效,其性能与使用局部变量进行赋值的版本相当。

{
  vector[N] mu;
  for (n in 1:N) {
    mu[n] = alpha[ii[n]] + beta[ii[n]] * x[n];
  }
  y ~ normal(mu, sigma);
}

The boost in speed compared to the original version is because the single call to the normal log density in the distribution statement will be much more memory efficient than the original version.

性能提升的原因在于:后一版本中仅调用一次正态分布的对数密度函数,因此内存使用效率远高于循环版本。

20.1 Multiple indexing

多重索引

The following is the simplest concrete example of multiple indexing with an array of integers; the ellipses stand for code defining the variables as indicated in the comments.

以下示例展示了如何使用整数数组进行最简单的多重索引操作,其中省略号表示变量定义的代码部分:

array[3] int c;
// ... define: c == (5, 9, 7)
array[4] int idxs;
// ... define: idxs == (3, 3, 1, 2)
array[4] int d;
d = c[idxs];    // result: d == (7, 7, 5, 9)

In general, the multiple indexed expression c[idxs] is defined as follows, assuming idxs is of size K.

一般来说,假设 idxs 的大小为 K,则多重索引表达式 c[idxs] 定义如下。

c[idxs] = ( c[idxs[1]], c[idxs[2]], ..., c[idxs[K]] )

Thus c[idxs] is of the same size as idxs, which is K in this example.

因此,在此示例中,c[idxs] 的大小与 idxs 相同,即为 K

Multiple indexing can also be used with multi-dimensional arrays. For example, consider the following.

多重索引也可用于多维数组。例如,请考虑以下情况。

array[2, 3] int c;
// ... define: c = ((1, 3, 5), ((7, 11, 13))
array[4] int idxs;
// ... define: idxs = (2, 2, 1, 2)
array[4, 3] int d
d = c[idxs];    // result: d = ((7, 11, 13), (7, 11, 13),
                //              (1, 3, 5), (7, 11, 13))

That is, putting an index in the first position acts exactly the same way as defined above. The fact that the values are themselves arrays makes no difference—the result is still defined by c[idxs][j] == c[idxs[j]].

也就是说,将索引放在第一个位置的行为与上面定义的完全相同。值本身是数组这一事实并没有什么区别——结果仍然由 c[idxs][j] == c[idxs[j]] 定义。

Multiple indexing may also be used in the second position of a multi-dimensional array. Continuing the above example, consider a single index in the first position and a multiple index in the second.

多重索引也可用于多维数组的第二个位置。继续上面的例子,考虑在第一个位置使用单个索引,在第二个位置使用多重索引。

array[4] int e;
e = c[2, idxs]; // result:  c[2] = (7, 11, 13)
                // result:  e = (11, 11, 7, 11)

The single index is applied, the one-dimensional result is determined, then the multiple index is applied to the result. That is, c[2,idxs] evaluates to the same value as c[2][idxs].

首先应用单个索引,得到一个一维的结果;随后再对该结果进行多重索引。也就是说,c[2,idxs] 的计算结果与 c[2][idxs] 相同。

Multiple indexing can apply to more than one position of a multi-dimensional array. For instance, consider the following

多重索引可以应用于多维数组的多个位置。例如,考虑以下情况。

array[2, 3] int c;
// ... define: c = ((1, 3, 5), (7, 11, 13))
array[3] int idxs1;
// ... define: idxs1 = (2, 2, 1)
array[2] int idxs2;
// ... define: idxs2 = (1, 3)
array[3, 2] int d;
d = c[idxs1, idxs2];  // result: d = ((7, 13), (7, 13), (1, 5))

With multiple indexes, we no longer have c[idxs1, idxs2] being the same as c[idxs1][idxs2]. Rather, the entry d[i, j] after executing the above is given by

当同时在多个位置使用多重索引时,c[idxs1, idxs2] 的结果将不再等同于 c[idxs1][idxs2]。相反,在执行上述操作后,条目 d[i, j] 的值为:

d[i, j] == c[idxs1, idxs2][i, j] = c[idxs1[i], idxs2[j]]

This example illustrates the operation of multiple indexing in the general case: a multiple index like idxs1 converts an index i used on the result (here, c[idxs1, idxs2]) to index idxs1[i] in the variable being indexed (here, c). In contrast, a single index just returns the value at that index, thus reducing dimensionality by one in the result.

此示例说明了多重索引在一般情况下的操作方式:像 idxs1 这样的多重索引将用于结果(此处为 c[idxs1, idxs2])的索引 i 转换为用于被索引变量(此处为 c)的索引 idxs1[i]。相反,单个索引仅返回该索引处的值,从而在结果中将维数减少一个。

20.2 Slicing with range indexes

使用范围索引进行切片

Slicing returns a contiguous slice of a one-dimensional array, a contiguous sub-block of a two-dimensional array, and so on. Semantically, it is just a special form of multiple indexing.

切片操作返回一维数组的连续片段、二维数组的连续子块,以此类推。从语义上讲,它只是多重索引的一种特殊形式。

Lower and upper bound indexes

下界和上界索引

For instance, consider supplying an upper and lower bound for an index.

例如,考虑为索引提供上界和下界。

array[7] int c;
// ...
array[4] int d;
d = c[3:6];  // result: d == (c[3], c[4], c[5], c[6])

The range index 3:6 behaves semantically just like the multiple index (3, 4, 5, 6). In terms of implementation, the sliced upper and/or lower bounded indices are faster and use less memory because they do not explicitly create a multiple index, but rather use a direct loop. They are also easier to read, so should be preferred over multiple indexes where applicable.

范围索引 3:6 从语义上与多重索引 (3, 4, 5, 6) 一样。在实现方面,使用上界和/或下界的切片索引更快速且占用更少内存,因为它们不会显式地创建多重索引,而是使用直接循环。它们也更易于阅读,因此应在适用的情况下优先选择范围索引而不是多重索引。

Lower or upper bound indexes

下界或上界索引

It is also possible to supply just a lower bound, or just an upper bound. Writing c[3:] is just shorthand for c[3:size(c)]. Writing c[:5] is just shorthand for c[1:5].

在 Stan 中,也可以仅指定范围索引的下界或上界。c[3:] 就是 c[3:size(c)] 的简写形式。c[:5] 就是 c[1:5] 的简写形式。

Full range indexes

全范围索引

Finally, it is possible to write a range index that covers the entire range of an array, either by including just the range symbol (:) as the index or leaving the index position empty. In both cases, c[] and c[:] are equal to c[1:size(c)], which in turn is just equal to c.

最后,可以编写一个覆盖数组整个范围的范围索引,方法是仅包含范围符号(:)作为索引,或者将索引位置留空。在这两种情况下,c[]c[:] 都等于 c[1:size(c)],而后者又等于 c

Slicing functions

切片函数

Stan provides head and tail functions that pull out prefixes or suffixes of vectors, row vectors, and one-dimensional arrays. In each case, the return type is the same as the argument type. For example,

Stan 提供了 headtail 函数,用于提取向量、行向量和一维数组的前若干个元素或后若干个元素。在每种情况下,返回类型与参数类型相同。例如,

vector[M] a = ...;
vector[N] b = head(a, N);

assigns b to be a vector equivalent to the first N elements of the vector a. The function tail works the same way for suffixes, with

将向量 a 的前 N 个元素赋值给向量 b。函数 tail 以相同的方式作用于末尾部分,可以提取向量、行向量和一维数组的后若干个元素。在每种情况下,返回类型与参数类型相同。

array[M] a = ...;
array[N] b = tail(a, N);

Finally, there is a segment function, which specifies a first element and number of elements. For example,

最后,还有一个 segment 函数,它可以指定第一个元素和元素数量。例如,

array[15] a = ...;
array[3] b = segment(a, 5, 3);

will set b to be equal to { a[5], a[6], a[7] }, so that it starts at element 5 of a and includes a total of 3 elements.

上式设 b 等于 { a[5], a[6], a[7] },因此它从 a 的第5个元素开始,总共包括了3个元素。

20.3 Multiple indexing on the left of assignments

赋值语句左侧的多重索引

Multiple expressions may be used on the left-hand side of an assignment statement, where they work exactly the same way as on the right-hand side in terms of picking out entries of a container. For example, consider the following.

多重索引表达式也可以用于赋值语句的左侧,其操作方式与右侧完全相同,用于选取容器中的对应元素进行赋值。例如,考虑下面的例子。

array[3] int a;
array[2] int c;
array[2] int idxs;
// ... define: a == (1, 2, 3);  c == (5, 9)
               //         idxs = (3,2)
a[idxs] = c;   // result: a == (1, 9, 5)

The result above can be worked out by noting that the assignment sets a[idxs[1]] (a[3]) to c[1] (5) and a[idxs[2]] (a[2]) to c[2] (9).

通过分析赋值过程可以得出上述结果:赋值操作将 c[1](值为5)赋给了 a[idxs[1]](即 a[3]),将 c[2](值为 9)赋给了 a[idxs[2]](即 a[2])。

The same principle applies when there are many multiple indexes, as in the following example.

当存在多个多重索引时,同样的原理也适用,如下例所示。

array[5, 7] int a;
array[2, 2] int c;
// ...
a[2:3, 5:6] = c;  // result: a[2, 5] == c[1, 1];  a[2, 6] == c[1, 2]
                  //         a[3, 5] == c[2, 1];  a[3, 6] == c[2, 2]

As in the one-dimensional case, the right-hand side is written into the slice, block, or general chunk picked out by the left-hand side.

与一维情况一样,右侧的值会被写入到左侧所指定的切片、块或通用片段中。

Usage on the left-hand side allows the full generality of multiple indexing, with single indexes reducing dimensionality and multiple indexes maintaining dimensionality while rearranging, slicing, or blocking. For example, it is valid to assign to a segment of a row of an array as follows.

在左侧使用时,可以充分发挥多重索引的通用性:单一索引会降低维度,而多重索引则在重新排列、切片或分块的同时保持维度不变。例如,可以像下面这样对数组某一行的片段进行赋值。

array[10, 13] int a;
array[2] int c;
// ...
a[4, 2:3] = c;  // result:  a[4, 2] == c[1];  a[4, 3] == c[2]

Assign-by-value and aliasing

按值赋值与别名问题

Aliasing issues arise when there are references to the same data structure on the right-hand and left-hand side of an assignment. For example, consider the array a in the following code fragment.

当赋值语句的左侧和右侧同时引用同一数据结构时,会引发别名(aliasing)问题。例如,请考虑以下代码片段中的数组 a

array[3] int a;
// ... define: a == (5, 6, 7)
a[2:3] = a[1:2];
// ... result: a == (5, 5, 6)

The reason the value of a after the assignment is \((5,5,6)\) rather than \((5,5,5)\) is that Stan behaves as if the right-hand side expression is evaluated to a fresh copy. As another example, consider the following.

赋值后 a 的值是 \((5,5,6)\) 而不是 \((5,5,5)\) 的原因是,Stan 会将右侧表达式求值为一个新的副本。再看下面这个例子。

array[3] int a;
array[3] int idxs;
// ... define idxs = (2, 1, 3)
a[idxs] = a;

In this case, it is evident why the right-hand side needs to be copied before the assignment.

这种情况下,为什么在赋值之前需要复制右侧表达式是显而易见的。

It is tempting (but wrong) to think of the assignment a[2:3] = a[1:2] as executing the following assignments.

人们可能直观地(但错误地)认为赋值语句 a [2:3] = a [1:2] 等同于顺序执行以下的单独赋值操作:

// ... define: a = (5, 6, 7)
a[2] = a[1];      // result: a = (5, 5, 7)
a[3] = a[2];      // result: a = (5, 5, 5)!

This produces a different result than executing the assignment because a[2]’s value changes before it is used.

这会产生与执行赋值操作不同的结果,因为 a[2] 的值在被使用前就发生了改变。

20.4 Multiple indexes with vectors and matrices

向量和矩阵的多重索引

Multiple indexes can be supplied to vectors and matrices as well as arrays of vectors and matrices.

多重索引可以应用于向量和矩阵,以及向量和矩阵的数组。

Vectors

向量

Vectors and row vectors behave exactly the same way as arrays with multiple indexes. If v is a vector, then v[3] is a scalar real value, whereas v[2:4] is a vector of size 3 containing the elements v[2], v[3], and v[4].

向量和行向量在多重索引方面的行为与数组完全相同。如果 v 是一个向量,那么 v[3] 是一个标量实数值,而 v[2:4] 是一个包含元素 v[2], v[3], 和 v[4] 的大小为3的向量。

The only subtlety with vectors is in inferring the return type when there are multiple indexes. For example, consider the following minimal example.

向量唯一的细微差别在于当存在多重索引时如何推断返回类型。例如,考虑下面这个简单的例子。

array[3] vector[5] v;
array[7] int idxs;
// ...
vector[7] u;
u = v[2, idxs];

array[7] real w;
w = v[idxs, 2];

The key is understanding that a single index always reduces dimensionality, whereas a multiple index never does. The dimensions with multiple indexes (and unindexed dimensions) determine the indexed expression’s type. In the example above, because v is an array of vectors, v[2, idxs] reduces the array dimension but doesn’t reduce the vector dimension, so the result is a vector. In contrast, v[idxs, 2] does not reduce the array dimension, but does reduce the vector dimension (to a scalar), so the result type for w is an array of reals. In both cases, the size of the multiple index (here, 7) determines the size of the result.

关键是要理解,单个索引始终会降低维度,而多重索引则不会。具有多重索引(和未索引的维度)的维度决定了索引表达式的类型。在上面的示例中,因为 v 是向量数组,v[2, idxs] 减少了数组的维数,但并没有减少向量的维数,因此结果是一个向量。相比之下,v[idxs, 2] 不会减少数组的维数,但会将向量的维数减少(到一个标量),因此 w 的结果类型是实数数组。在上述两种情况下,最终结果的大小均由多重索引的大小决定(在本例中为7)。

Matrices

矩阵

Matrices are a bit trickier because they have two dimensions, but the underlying principle of type inference is the same—multiple indexes leave dimensions in place, whereas single indexes reduce them. The following code shows how this works for multiple indexing of matrices.

矩阵稍微有些棘手,因为它们有两个维度,但类型推断的基本原理相同——多重索引保留维度,而单个索引减少维度。下面的代码展示了如何在矩阵中进行多重索引操作。

matrix[5, 7] m;
// ...
row_vector[3] rv;
rv = m[4, 3:5];    // result is 1 x 3
// ...
vector[4] v;
v = m[2:5, 3];     // result is 3 x 1
// ...
matrix[3, 4] m2;
m2 = m[1:3, 2:5];  // result is 3 x 4

The key is realizing that any position with a multiple index or bounded index remains in play in the result, whereas any dimension with a single index is replaced with 1 in the resulting dimensions. Then the type of the result can be read off of the resulting dimensionality as indicated in the comments above.

关键在于认识到任何使用多重索引或有界索引的位置在结果中都会保留,而任何使用单一索引的维度在结果维度中都会被替换为 1。然后可以根据结果的维度来确定结果的类型,如上面注释所示。

Matrices with one multiple index

具有一个多重索引的矩阵

If matrices receive a single multiple index, the result is a matrix. So if m is a matrix, so is m[2:4]. In contrast, supplying a single index, m[3], produces a row vector result. That is, m[3] produces the same result as m[3, ] or m[3, 1:cols(m)].

当对矩阵仅使用一个多重索引时,返回结果仍然是矩阵类型。举例来说,若 m 是矩阵,则 m[2:4] 同样也是一个矩阵。相反,提供单个索引 m[3] 会产生一个行向量结果。也就是说,m[3] 的结果与 m[3, ]m[3, 1:cols(m)] 相同。

Arrays of vectors or matrices

向量或矩阵的数组

With arrays of matrices, vectors, and row vectors, the basic access rules remain exactly the same: single indexes reduce dimensionality and multiple indexes redirect indexes. For example, consider the following example.

对于向量或矩阵的数组,索引的基本规则仍然相同:单个索引会减少维度,多重索引则会重新调整各维度的索引顺序。例如,考虑以下示例。

array[5, 7] matrix[3, 4] m;
// ...
array[2] matrix[3, 4] a;
a = m[1, 2:3];  // knock off first array dimension
a = m[3:4, 5];  // knock off second array dimension

In both assignments, the multiple index knocks off an array dimension, but it’s different in both cases. In the first case, a[i] == m[1, i + 1], whereas in the second case, a[i] == m[i + 2, 5].

在这两个赋值语句中,多重索引会减少数组的维度,但在示例中两种情况下是不同的。在第一种情况下,a[i] == m[1, i + 1],而在第二种情况下,a[i] == m[i + 2, 5]

Continuing the previous example, consider the following.

继续上面的例子,考虑以下情况。

// ...
vector[2] b;
b = a[1, 3, 2:3, 2];

Here, the two array dimensions are reduced as is the column dimension of the matrix, leaving only a row dimension index, hence the result is a vector. In this case, b[j] == a[1, 3, 1 + j, 2].

这里,两个数组维度和矩阵的列维度都被降维了,只留下了行维度索引,因此结果是一个向量。在这种情况下,b[j] == a[1, 3, 1 + j, 2]

This last example illustrates an important point: if there is a lower-bounded index, such as 2:3, with lower bound 2, then the lower bound minus one is added to the index, as seen in the 1 + j expression above.

这个最后的例子阐明了一个重要的点:如果存在下界索引,比如 2:3,其下界为2,那么在索引中需要加上下界减一的值,如上面的 1 + j 表达式所示。

Continuing further, consider continuing with the following.

接下来,我们继续考虑以下内容。

// ...
array[2] row_vector[3] c;
c = a[4:5, 3, 1, 2: ];

Here, the first array dimension is reduced, leaving a single array dimension, and the row index of the matrix is reduced, leaving a row vector. For indexing, the values are given by c[i, j] == a[i + 3, 3, 1, j + 1]

这里,第一个数组维度被降维了,留下一个单独的数组维度,矩阵的行索引也被降维了,留下一个行向量。对于索引,值由 c[i, j] == a[i + 3, 3, 1, j + 1] 给出。

Block, row, and column extraction for matrices

矩阵的块,行,列提取

Matrix slicing can also be performed using the block function. For example,

我们可以使用 block 函数来对矩阵进行切片,例如:

matrix[20, 20] a = ...;
matrix[3, 2] b = block(a, 5, 9, 3, 2);

will set b equal to the submatrix of a starting at index [5, 9] and extending 3 rows and 2 columns. Thus block(a, 5, 9, 3, 2) is equivalent to b[5:7, 9:10].

上面例子设 b 等于从索引 [5, 9] 开始并延伸3行2列的子矩阵。因此 block(a, 5, 9, 3, 2) 相当于 a[5:7, 9:10]

The sub_col function extracts a slice of a column of a matrix as a vector. For example,

sub_col 函数将矩阵的某一列切片提取为向量。例如,

matrix[10, 10] a = ...;
vector b = sub_col(a, 2, 3, 5);

will set b equal to the vector a[2:6, 3], taking the element starting at [2, 3], then extending for a total of 5 rows. The function sub_row works the same way for extracting a slice of a row as a row vector. For example, sub_row(a, 2, 3, 5) is equal to the row vector a[2, 3:7], which also starts at position [2, 3] then extends for a total of 5 columns.

上面例子设 b 为向量 a[2:6, 3],取元素起始位置为 [2,3],然后扩展5行。 函数 sub_row 也可以用于提取行的子序列作为行向量。例如,sub_row(a,2,3,5) 等于行向量 a[2,3:7],它也从位置 [2,3] 开始,然后扩展了5列。

20.5 Matrices with parameters and constants

具有参数和常数的矩阵

Suppose you have a \(3\times 3\) matrix and know that two entries are zero but the others are parameters. Such a situation arises in missing data situations and in problems with fixed structural parameters.

假设有一个 \(3\times 3\) 的矩阵,已知其中两个元素为零,其他元素是参数。这种情况在缺失数据情况和具有固定结构参数的问题中会出现。

Suppose a \(3 \times 3\) matrix is known to be zero at indexes \([1,2]\) and \([1,3]\). The indexes for parameters are included in a “melted” data-frame or database format.

假设一个 \(3\times 3\) 的矩阵在索引 \([1,2]\)\([1,3]\) 处都为零。在“融合”的数据帧或数据库格式中包含参数的索引。

transformed data {
  array[7, 2] int<lower=1, upper=3> idxs
    = { {1, 1},
        {2, 1}, {2, 2}, {2, 3},
        {3, 1}, {3, 2}, {3, 3} };
  // ...

The seven remaining parameters are declared as a vector.

剩下的七个参数被声明为一个向量。

parameters {
  vector[7] A_raw;
  // ...
}

Then the full matrix A is constructed in the model block as a local variable.

然后在模型块中将完整矩阵 A 构造为一个局部变量。

model {
  matrix[3, 3] A;
  for (i in 1:7) {
    A[idxs[i, 1], idxs[i, 2]] = A_raw[i];
  }
  A[1, 2] = 0;
  A[1, 3] = 0;
  // ...
}

This may seem like overkill in this setting, but in more general settings, the matrix size, vector size, and the idxs array will be too large to code directly. Similar techniques can be used to build up matrices with ad-hoc constraints, such as a handful of entries known to be positive.

这在这种情况下可能看起来有些过度,但在更一般的情况下,矩阵大小、向量大小和 idxs 数组会太大而无法直接编码。类似的技术可以用于构建具有特定约束条件的矩阵,例如一些已知为正数的条目。