-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Join side aliases should be optional #862
Conversation
Signed-off-by: Lantao Jin <ltjin@amazon.com>
} | ||
|
||
/** Optional alias name for the relation. */ | ||
@Setter @Getter private String alias; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alias
in Relation could be removed since we have the SubqueryAlias
in AST: https://github.com/opensearch-project/opensearch-spark/blob/main/ppl-spark-integration/src/main/java/org/opensearch/sql/ast/tree/SubqueryAlias.java
Remaining this member causes potential issue if conflicts. All table with alias in PPL should add a SubqueryAlias such as #862 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LantaoJin can u give an example for a union query ?
source = a as table_a, b as table_b, c as table_c | ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LantaoJin can u give an example for a union query ? source = a as table_a, b as table_b, c as table_c | ...
No, we didn't support this pattern, see
: tableSource (COMMA tableSource)* (AS alias = qualifiedName)? |
But we supported the pattern:
source = table1, table2, table3 as t | where t.name = 'Molly'
The plan of above is
'Union true, true
:- 'Project [*]
: +- 'Filter ('t.name = Molly)
: +- 'SubqueryAlias t
: +- 'UnresolvedRelation [table1], [], false
:- 'Project [*]
: +- 'Filter ('t.name = Molly)
: +- 'SubqueryAlias t
: +- 'UnresolvedRelation [table2], [], false
+- 'Project [*]
+- 'Filter ('t.name = Molly)
+- 'SubqueryAlias t
+- 'UnresolvedRelation [table3], [], false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add an UT and IT of the above case in latest commit.
* | ||
* @return table name | ||
*/ | ||
public List<String> getTableName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never used.
| | inner JOIN left = l right = r ON l.id = r.id $testTable2 | ||
| | left JOIN left = l right = r ON l.name = r.name $testTable3 | ||
| | cross JOIN left = l right = r $testTable4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comma is optional, I am trying to reduce this usage. But keep the comma won't fail query.
? new Relation(ctx.tableSource().stream().map(this::internalVisitExpression).collect(Collectors.toList())) | ||
: new Relation(ctx.tableSource().stream().map(this::internalVisitExpression).collect(Collectors.toList()), ctx.alias.getText()); | ||
Relation relation = new Relation(ctx.tableSource().stream().map(this::internalVisitExpression).collect(Collectors.toList())); | ||
return ctx.alias != null ? new SubqueryAlias(ctx.alias.getText(), relation) : relation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add SubqueryAlias
ast node instead of adding the alias string into Relation
.
* | ||
* @return table name | ||
*/ | ||
public List<String> getTableName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never used.
Without this fixing, q21 in TPCH couldn't be rewritten to PPL. Link #830 |
@ToString | ||
@Getter | ||
@EqualsAndHashCode(callSuper = false) | ||
@RequiredArgsConstructor | ||
public class Relation extends UnresolvedPlan { | ||
private static final String COMMA = ","; | ||
|
||
private final List<UnresolvedExpression> tableName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LantaoJin lates change the name to indicate it may contain multi-part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
|
||
/** Optional alias name for the relation. */ | ||
@Setter @Getter private String alias; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LantaoJin can u give an example for a union query ?
source = a as table_a, b as table_b, c as table_c | ...
public Expression resolveJoinCondition( | ||
UnresolvedExpression expr, | ||
BiFunction<UnresolvedExpression, CatalystPlanContext, Expression> transformFunction) { | ||
isResolvingJoinCondition = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u make this an atomic bool ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CatalystPlanContext
is generated by each query. All members in CatalystPlanContext
should be thread-safe. There is no multi-threads need to check and change the boolean and other structures of CatalystPlanContext
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add this note on the CatalystPlanContext
class description for better clarity
UnresolvedPlan rightRelation = visit(ctx.tableOrSubqueryClause()); | ||
UnresolvedPlan right = new SubqueryAlias(rightAlias, rightRelation); | ||
UnresolvedPlan right; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some comment explaining the condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Lantao Jin <ltjin@amazon.com>
@LantaoJin also plz resolve conflicts... |
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Conflicts is related to imports. CI tests succeed with the related files. Merging to main. |
* Join side aliases should be optional Signed-off-by: Lantao Jin <ltjin@amazon.com> * address comments Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com>
Description
Imaging a case as following: Assume table1, table2, and table3 all contain a column
id
.To rewrite above SQL query to PPL query, we will get
The PPL query throws an exception with message:
It because the required left side alias
l1
overrides the table aliast1
andt2
.Its logical plan looks
Check #857 for details
Related Issues
Resolves #857
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.