[opt](catalog) add iceberg branch/tag #2445

Merged: 2 commits, Jun 12, 2025
22 changes: 22 additions & 0 deletions docs/lakehouse/catalogs/iceberg-catalog.md
@@ -327,6 +327,28 @@ SELECT * FROM iceberg_table FOR TIME AS OF '2023-01-01 00:00:00';
SELECT * FROM iceberg_table FOR VERSION AS OF 123456789;
```

### Branch and Tag

> This feature is supported since version 3.1.0

Doris can read a specific branch or tag of an Iceberg table.

Several syntax forms are supported for compatibility with systems such as Spark and Trino.

```sql
-- BRANCH
SELECT * FROM iceberg_tbl@branch(branch1);
SELECT * FROM iceberg_tbl@branch("name" = "branch1");
SELECT * FROM iceberg_tbl FOR VERSION AS OF 'branch1';

-- TAG
SELECT * FROM iceberg_tbl@tag(tag1);
SELECT * FROM iceberg_tbl@tag("name" = "tag1");
SELECT * FROM iceberg_tbl FOR VERSION AS OF 'tag1';
```

For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether the parameter is a timestamp or a Branch/Tag name.
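
As a minimal sketch of how these forms can be combined, the query below compares row counts between a tag and a branch of the same table. It assumes the `@branch(...)` and `@tag(...)` forms shown above and reuses the placeholder names `iceberg_tbl`, `tag1`, and `branch1`; the column aliases `ref` and `row_cnt` are illustrative only.

```sql
-- Compare the state frozen at a tag with the current head of a branch.
-- iceberg_tbl, tag1, and branch1 are placeholder names from the examples above.
SELECT 'tag1' AS ref, COUNT(*) AS row_cnt FROM iceberg_tbl@tag(tag1)
UNION ALL
SELECT 'branch1' AS ref, COUNT(*) AS row_cnt FROM iceberg_tbl@branch(branch1);
```

Each side of the `UNION ALL` reads the table through a different reference, so a difference in the counts points to rows written to the branch after the tag was created.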

## Write Operations

### INSERT INTO
6 changes: 6 additions & 0 deletions docs/lakehouse/sql-convertor/sql-convertor-overview.md
@@ -214,6 +214,12 @@ The following table shows how various data types are displayed in different seri
| `enable_sql_convertor_features` | `set enable_sql_convertor_features="ctas"` | Session variable that enables specific special features of SQL Convertor. `ctas`: allows conversion of the `SELECT` part of a `CTAS` statement. (This variable is supported since Doris 3.0.6 and SQL Convertor 1.0.8.10)|
| `sql_convertor_config` | `set sql_convertor_config = '{"ignore_udf": ["func1", "func2", "func3"]}'` | Session variable that tells SQL Convertor to ignore certain UDFs. SQL Convertor does not convert the functions in the list; without this setting, it may report an "Unknown Function" error. (This variable is supported since Doris 3.0.6 and SQL Convertor 1.0.8.10)|

## Best Practices

- Specify functions that do not need to be converted

In some cases, Doris has no function that exactly matches one in the source system, or a converted function behaves differently from the original for certain special arguments. In that case, first implement a UDF that matches the source system's behavior exactly and register it in Doris. Then add the UDF to `ignore_udf` in `sql_convertor_config`. SQL Convertor will leave that function unconverted, so the UDF controls its behavior.
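
A minimal sketch of this workflow, assuming a UDF has already been registered in Doris. The function name `my_date_format`, the table `orders`, and the column `order_date` are hypothetical and used only for illustration.

```sql
-- Hypothetical UDF name; assumes my_date_format is already registered in Doris.
SET sql_convertor_config = '{"ignore_udf": ["my_date_format"]}';

-- SQL Convertor is expected to pass my_date_format through unconverted,
-- so the call resolves to the registered UDF.
-- orders and order_date are placeholder names.
SELECT my_date_format(order_date) FROM orders;
```

Because `sql_convertor_config` is a session variable, the setting applies only to the current session.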

## Release Notes

[SQL Convertor Release Notes](https://docs.selectdb.com/docs/ecosystem/sql-converter/sql-converter-release-node)
@@ -337,6 +337,28 @@ SELECT * FROM iceberg_tbl FOR TIME AS OF "2022-10-07 17:20:37";
SELECT * FROM iceberg_tbl FOR VERSION AS OF 868895038966572;
```

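### Branch and Tag

> This feature is supported since version 3.1.0

Doris can read a specific branch or tag of an Iceberg table.

Several syntax forms are supported for compatibility with systems such as Spark and Trino.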

```sql
-- BRANCH
SELECT * FROM iceberg_tbl@branch(branch1);
SELECT * FROM iceberg_tbl@branch("name" = "branch1");
SELECT * FROM iceberg_tbl FOR VERSION AS OF 'branch1';

-- TAG
SELECT * FROM iceberg_tbl@tag(tag1);
SELECT * FROM iceberg_tbl@tag("name" = "tag1");
SELECT * FROM iceberg_tbl FOR VERSION AS OF 'tag1';
```

For the `FOR VERSION AS OF` syntax, Doris automatically determines from the parameter that follows whether it is a timestamp or a Branch/Tag name.

## Write Operations

### INSERT INTO
@@ -214,6 +214,12 @@ SET serde_diactor=<dialect>;
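| `enable_sql_convertor_features` | `set enable_sql_convertor_features="ctas"` | Session variable that enables specific special features of SQL Convertor. `ctas`: allows conversion of the `SELECT` part of a `CTAS` statement. (This variable is supported since Doris 3.0.6 and SQL Convertor 1.0.8.10)|
| `sql_convertor_config` | `set sql_convertor_config = '{"ignore_udf": ["func1", "func2", "func3"]}'` | Session variable that tells SQL Convertor to ignore certain UDFs. SQL Convertor does not convert the functions in the list; without this setting, it may report an "Unknown Function" error. (This variable is supported since Doris 3.0.6 and SQL Convertor 1.0.8.10)|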

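## Best Practices

- Specify functions that do not need to be converted

In some cases, Doris has no function that exactly matches one in the source system, or a converted function behaves differently from the original for certain special arguments. In that case, first implement a UDF that matches the source system's behavior exactly and register it in Doris. Then add the UDF to `ignore_udf` in `sql_convertor_config`. SQL Convertor will leave that function unconverted, so the UDF controls its behavior.

## Release Notes

[SQL Convertor Release Notes](https://docs.selectdb.com/docs/ecosystem/sql-converter/sql-converter-release-node)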