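Apply function

The Apply function queries one or more time series and applies a user-specified SQL expression or function to the selected time series elements.

Syntax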

Apply(sql_express  lvarchar, 
     ts           TimeSeries, ...) 
returns TimeSeries;

Apply(sql_express  lvarchar, 
     multiset_ts  multiset(TimeSeries)) 
returns TimeSeries;

Apply(sql_express  lvarchar, 
     filter       lvarchar, 
     ts           TimeSeries, ...) 
returns TimeSeries;

Apply(sql_express lvarchar, 
     filter      lvarchar, 
     multiset_ts multiset(TimeSeries)) 
returns TimeSeries;

Apply(sql_express lvarchar, 
     begin_stamp datetime year to fraction(5), 
     end_stamp   datetime year to fraction(5), 
     ts          TimeSeries, ...) 
returns TimeSeries with (handlesnulls);

Apply(sql_express lvarchar, 
     begin_stamp datetime year to fraction(5), 
     end_stamp   datetime year to fraction(5), 
     multiset_ts multiset(TimeSeries)) 
returns TimeSeries with (handlesnulls);

Apply(sql_express lvarchar, 
     filter      lvarchar, 
     begin_stamp datetime year to fraction(5), 
     end_stamp   datetime year to fraction(5), 
     ts          TimeSeries, ...) 
returns TimeSeries with (handlesnulls);

Apply(sql_express lvarchar, 
     filter      lvarchar, 
     begin_stamp datetime year to fraction(5), 
     end_stamp   datetime year to fraction(5), 
     multiset_ts multiset(TimeSeries)) 
returns TimeSeries with (handlesnulls);

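sql_express: The SQL expression or function to evaluate.
filter: The filter expression used to select time series elements.
begin_stamp: The begin point of the range. See the Clip function for more detail about range specifications.
end_stamp: The end point of the range. See the Clip function for more detail about range specifications.
ts: The first ts argument is the first series, the second ts argument is the second series, and so on. This function can take up to eight ts arguments. The order of the arguments must match the order expected by the SQL expression or function. There is no limit on the number of $ parameters in the expression.
multiset_ts: A multiset of time series.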

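Description

This function runs a user-specified SQL expression on the given time series and produces a new time series that contains the result of the expression at each qualifying element of the input time series.

You can qualify the elements of the input time series by specifying a time period to clip and by using a filter expression.

The sql_express argument is a comma-separated list of expressions to run for each selected element. There is no limit on the number of expressions. The results of the expressions must match the corresponding columns of the result time series, excluding its first (time stamp) column. Do not specify the time stamp as the first expression; the time stamp is generated for each expression result.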

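The arguments of an expression can be an input element or any column of an input time series. To reference an element, use $ followed by the position of the time series in the input list; to reference a column, add a period and the column number. Both the position number and the column number start at zero.

For example, $0 is the element of the first input time series, $0.0 is its time stamp column, and $0.1 is the column that follows the time stamp column. Alternatively, you can reference a column by its name instead of its number. If the second time series has a column named high, you can refer to that column as $1.high. If high is the second column in the element, $1.high is equivalent to $1.1.

If Apply has only one time series argument, you can omit the time series position when you reference a column by name; therefore, $0.high is equivalent to $high. Note that $0 always means the whole element of the first time series. It never means the first column of the time series, even when there is only one time series argument.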
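The three reference forms can be compared side by side. The following sketch assumes the daily_stocks table, its stock_data column of type TimeSeries(stock_bar), and the one_real row type used in the examples in this topic. It further assumes that high and low are columns 1 and 2 of stock_bar, which this topic does not state explicitly. Under those assumptions, each call should produce the same series:

```sql
-- By name, position omitted (valid only with a single time series argument).
select Apply('$high - $low', stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';

-- By name, with the explicit series position.
select Apply('$0.high - $0.low', stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';

-- By zero-based column number (assumes high is column 1 and low is column 2).
select Apply('$0.1 - $0.2', stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';
```

Referencing columns by name is usually the more robust choice, because it keeps working if the column order of the subtype differs from what the expression assumes.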

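If you use a function as the expression, the function must take the subtype of each input time series as arguments, in argument order, and return a row type that corresponds to the subtype of the Apply result time series. In most cases, evaluating a function is faster than evaluating a general expression. If performance is critical, perform the calculation in a function and use the function syntax. See the examples for how to do this.

The following examples show valid expressions for Apply. Assume that both time series arguments have the same subtype daybar(t DATETIME YEAR TO FRACTION(5), high REAL, low REAL, close REAL, vol REAL). The expression can be any of the following:

"($0.high + $1.high)/2, ($0.low + $1.low)/2"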
"($0.1 + $1.1)/2, ($0.2 + $1.2)/2"
"$0.high, $1.high"
"avghigh"

The signature of avghigh is:

"avghigh(arg1 daybar, arg2 daybar) returns (one_real)"

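A function with this signature might be written in SPL along the following lines. This is only a sketch: the body (averaging the two high columns) is inferred from the function name, and the layout of one_real as a time stamp column followed by a single REAL column is an assumption, since this topic shows neither. It also assumes that both series have elements at the same time points, so that neither argument is NULL.

```sql
-- Hypothetical body for avghigh; only the signature comes from this topic.
create function avghigh(arg1 daybar, arg2 daybar)
    returns one_real;
    -- The time stamp field is left NULL because Apply generates it.
    return row(null::datetime year to fraction(5),
       ((arg1.high + arg2.high) / 2)::real)::one_real;
end function;
```

The NULL time stamp follows the same pattern as the ts_sum function in the examples.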
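The syntax of the filter argument is similar to that of the expression, except that it must evaluate to a single Boolean value. Only the elements for which the filter evaluates to TRUE are selected. For example: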

"$0.vol > $1.vol and $0.close > ($0.high - $0.low)/2"

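Apply with the multiset_ts argument assigns parameter numbers by fetching TimeSeries values from the set and processing them in the order in which the set management code returns them. Because sets are unordered, parameters might not be numbered predictably. Apply with the multiset_ts argument is useful only if you can guarantee that the TimeSeries values are returned in a fixed order. There are two ways to guarantee this:

Write a C function that creates the set, and use that function as the multiset_ts argument to Apply. The C function can return the TimeSeries values in whatever order you need.
Use ORDER BY in the multiset_ts expression.

Apply with the multiset_ts argument evaluates the expression once for every time point in the resulting union of time series values. When all the data in the clipped period has been exhausted, Apply returns the resulting series.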

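Apply uses the optional clip time range to restrict the data to a particular time period. If the begin point is NULL, Apply uses the earliest valid time point of all the input time series. If the end point is NULL, Apply uses the latest valid time point of all the input time series. Omitting the optional clip time range is equivalent to setting both the begin point and the end point to NULL: Apply considers all elements.

If both a clip time range and a filter expression are given, clipping is done before filtering.

If you use a string literal or NULL for the clip time range, cast at least the begin point to DATETIME YEAR TO FRACTION(5) to avoid ambiguity in function resolution.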
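A call that combines these rules might look as follows. The sketch reuses the daily_stocks table, the stock_data column, and the one_real row type from the examples in this topic; the filter threshold and the end date are illustrative values only. The begin point is an explicitly cast NULL, so Apply should start at the earliest valid time point, clip at the end point, and then filter the clipped elements:

```sql
select Apply('$high - $low',
      '$vol > 1000000',                    -- filter, applied after clipping
      NULL::datetime year to fraction(5),  -- cast NULL to avoid ambiguous resolution
      '2011-01-06 00:00:00.00000'::datetime year to fraction(5),
      stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';
```

Casting both points, as shown here, goes slightly beyond the stated minimum of casting the begin point, but it keeps the intended signature unambiguous to a reader as well.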

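If multiple input time series are specified, a union of all the input time series is performed to produce the source of data that Apply filters and evaluates. Apply therefore acts as a union function with additional filtering and processing of the union result. For details about how the Union function works, see the Union function.

Returns

A new time series that contains the result of evaluating the expression on every selected element of the source time series.

Examples

The following example uses Apply on a single time series with a clipped range and no filter argument: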

select Apply('$high-$low',
      datetime(2011-01-01) year to day,
      datetime(2011-01-06) year to day,
      stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';

The following example applies a list of expressions to two time series, with a clipped range but without a filter:

select Apply(
   '($0.high+$1.high)/2, ($0.low+$1.low)/2, ($0.final+$1.final)/2,
($0.vol+$1.vol)/2',
   datetime(2011-01-04) year to day, 
   datetime(2011-01-05) year to day,
   t1.stock_data, t2.stock_data)
   ::TimeSeries(stock_bar)
from daily_stocks t1, daily_stocks t2
where t1.stock_name = 'GBase' and t2.stock_name = 'HWP';

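A filter can also be used without a clipped range, for example to build a time series that contains the closing prices of the days on which the trading range exceeded 10% of the low price.

The following example uses an SPL function, ts_sum, as the expression, with a clipped range that covers a single day and no filter: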

create function ts_sum(a stock_bar)
    returns one_real;
    return row(null::datetime year to fraction(5),
   (a.high + a.low + a.final + a.vol))::one_real;
end function;

select Apply('ts_sum',
   '2011-01-03 00:00:00.00000'::datetime year 
      to fraction(5),
   '2011-01-03 00:00:00.00000'::datetime year 
      to fraction(5),
   stock_data)::TimeSeries(one_real) 
   from daily_stocks
      where stock_id = 901;
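For a filter without a clipped range, such as selecting the closing price on days when the trading range exceeded 10% of the low price, a call might look like the following sketch. It assumes that final is the closing-price column of stock_bar, as the column names in the two-series example suggest; the stock name is reused from the earlier examples.

```sql
select Apply('$0.final',
      '$0.high - $0.low > 0.10 * $0.low',  -- keep days whose range exceeds 10% of the low
      stock_data)::TimeSeries(one_real)
   from daily_stocks
   where stock_name = 'GBase';
```

Because no clip range is given, Apply considers every element of the series before the filter is evaluated.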

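The following example uses a function as the expression to improve performance. The first step is to compile the following C function into applyfunc.so: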

/* begin applyfunc.c */
#include "mi.h"

/*
 * Returns a one_real row whose value column is high - low for the
 * given input element. The time stamp column is left NULL.
 */
MI_ROW *
high_low_diff(MI_ROW *row, MI_FPARAM *fp)
{
    MI_ROW_DESC    *rowdesc;
    MI_ROW         *result;
    void           *values[2];
    mi_boolean      nulls[2];
    mi_real        *high, *low;
    mi_real         r;
    mi_integer      len;
    MI_CONNECTION  *conn;
    mi_integer      rc;

    /* Column 0 (time stamp) is NULL; column 1 carries the result. */
    values[0] = NULL;
    nulls[0] = MI_TRUE;
    nulls[1] = MI_FALSE;
    conn = mi_open(NULL, NULL, NULL);

    /* Fetch column 1 of the input row (high). */
    if ((rc = mi_value(row, 1, (MI_DATUM *) &high,
            &len)) == MI_ERROR)
        mi_db_error_raise(conn, MI_EXCEPTION,
            "high_low_diff: corrupted argument row");
    if (rc == MI_NULL_VALUE)
        goto retisnull;

    /* Fetch column 2 of the input row (low). */
    if ((rc = mi_value(row, 2, (MI_DATUM *) &low,
            &len)) == MI_ERROR)
        mi_db_error_raise(conn, MI_EXCEPTION,
            "high_low_diff: corrupted argument row");
    if (rc == MI_NULL_VALUE)
        goto retisnull;

    r = *high - *low;
    values[1] = (void *) &r;
    rowdesc = mi_row_desc_create(mi_typestring_to_id(conn,
        "one_real"));
    result = mi_row_create(conn, rowdesc, (MI_DATUM *)
        values, nulls);
    mi_close(conn);
    return (result);

retisnull:
    /* A NULL input column yields a NULL result. */
    mi_close(conn);
    mi_fp_setreturnisnull(fp, 0, MI_TRUE);
    return (MI_ROW *) NULL;
}
/* end of applyfunc.c */

Then create the following SQL function:

create function HighLowDiff(arg stock_bar) returns one_real
external name '/tmp/applyfunc.so(high_low_diff)'
language C;

You can then pass the function name to Apply as the expression:

select stock_name, Apply('HighLowDiff', 
        stock_data)::TimeSeries(one_real)
from daily_stocks;

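The following query is equivalent to the previous one, but it does not have the performance benefit of using a function as the expression: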

select stock_name, Apply('$high - $low', 
   stock_data)::TimeSeries(one_real)
from daily_stocks;