Expand ppl command #868

Merged: 15 commits, Nov 7, 2024

Conversation

YANG-DB
Member

@YANG-DB YANG-DB commented Nov 5, 2024

Description

Add support for the PPL expand command.

Related Issues

#657

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • New added source code should include a copyright header
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
# Conflicts:
#	ppl-spark-integration/src/main/antlr4/OpenSearchPPLLexer.g4
#	ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
# Conflicts:
#	ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4
#	ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/CatalystQueryPlanVisitor.java
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
@YANG-DB YANG-DB added Lang:PPL Pipe Processing Language support 0.6 labels Nov 5, 2024
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
#### **expand**
[See additional command details](ppl-expand-command.md)
```sql
source = table | expand field_with_array as array_list
```

Could we add this example here:

  • source= table | expand json_array(1, 2, 3) as uid | fields uid (returns 3 rows with values 1, 2 and 3)


Please also add a similar test case to the integration tests (IT).


@YANG-DB YANG-DB Nov 5, 2024


@LantaoJin question:
Currently both flatten and expand only support fieldExpression:

expandCommand
    : EXPAND fieldExpression (AS alias = qualifiedName)?
    ;
    
flattenCommand
    : FLATTEN fieldExpression
    ;

Maybe we could use this slightly different version of the expand query instead?
source = table | eval array=json_array(1, 2, 3) | expand array as uid | fields name, occupation, uid
It would give a functionally similar result without changing the grammar.
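
To make the suggestion concrete, here is a minimal plain-Java sketch of the row semantics that the eval-then-expand query is meant to have. It does not use Spark or the PPL parser. `ExpandPipelineSketch`, `eval`, and `expand` are hypothetical names for illustration only, and dropping the source column when an alias is given is an assumption taken from the visitor's code comment in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of `eval array=... | expand array as uid`; not the PR's implementation.
final class ExpandPipelineSketch {

    // eval: add a computed column holding the same value to every row.
    static List<Map<String, Object>> eval(
            List<Map<String, Object>> rows, String column, Object value) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.put(column, value);
            out.add(copy);
        }
        return out;
    }

    // expand: emit one row per array element under `alias`.
    // Assumption: with an alias, the original array column is dropped.
    static List<Map<String, Object>> expand(
            List<Map<String, Object>> rows, String field, String alias) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Object value = row.get(field);
            if (!(value instanceof List<?>)) {
                continue; // null or non-array values produce no rows in this sketch
            }
            for (Object element : (List<?>) value) {
                Map<String, Object> copy = new LinkedHashMap<>(row);
                copy.remove(field);
                copy.put(alias, element);
                out.add(copy);
            }
        }
        return out;
    }
}
```

In this model, a single input row with `name` and `occupation` columns becomes three rows after `eval` followed by `expand`, each carrying one of the values 1, 2 and 3 in `uid`. The grammar only ever sees a plain field reference, which is the point of the suggestion.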

Optional<Expression> alias = node.getAlias().map(aliasNode -> visitExpression(aliasNode, context));
context.retainAllNamedParseExpressions(p -> (NamedExpression) p);
Explode explodeGenerator = new Explode(field);
scala.collection.mutable.Seq seq = alias.isEmpty() ? seq() : seq(alias.get());

Could you rename the variable seq to outputs? It is confusing.

Explode explodeGenerator = new Explode(field);
scala.collection.mutable.Seq seq = alias.isEmpty() ? seq() : seq(alias.get());
if(alias.isEmpty())
return context.apply(p -> new Generate(new GeneratorOuter(explodeGenerator), seq(), true, (Option) None$.MODULE$, seq, p));

@LantaoJin LantaoJin Nov 5, 2024


IMO, there is no need for GeneratorOuter here. How about simplifying to:

return context.apply(p -> new Generate(explodeGenerator, seq(), false, (Option) None$.MODULE$, seq, p));

Note: the third parameter, outer, should be false as well.


@LantaoJin the issue here is that when a row's array value is null, the row is not output at all unless GeneratorOuter is used, as with the last row here:

 ( 1, array(STRUCT("1_one", 1), STRUCT(null, 11), STRUCT("1_three", null)) ),
 ( 2, array(STRUCT("2_Monday", 2), null) ),
 ( 3, array(STRUCT("3_third", 3), STRUCT("3_4th", 4)) ),
 ( 4, null )

For the query:
source = $multiValueTable | expand multi_value AS exploded_multi_value | fields exploded_multi_value
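
The difference under discussion can be sketched in plain Java without Spark. This is only a model of the two behaviours: the inner form skips a row whose array is null or empty, while the outer form emits one row with a null value for it. `ExplodeSketch`, `Row`, and `explode` are hypothetical names, and the struct elements from the sample data are simplified to strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of inner vs. outer explode; not Spark's Generate operator.
final class ExplodeSketch {

    // One output row: the source row id plus one exploded element (may be null).
    static final class Row {
        final int id;
        final String element;

        Row(int id, String element) {
            this.id = id;
            this.element = element;
        }

        @Override
        public String toString() {
            return "(" + id + ", " + element + ")";
        }
    }

    // outer == false: a null or empty array contributes no output rows.
    // outer == true:  a null or empty array contributes one row with a null element.
    static List<Row> explode(int id, List<String> array, boolean outer) {
        List<Row> out = new ArrayList<>();
        if (array == null || array.isEmpty()) {
            if (outer) {
                out.add(new Row(id, null));
            }
            return out;
        }
        for (String element : array) {
            out.add(new Row(id, element)); // null elements inside the array are still emitted
        }
        return out;
    }
}
```

Applied to the sample data, row 2 yields two output rows in both modes, because a null element inside a non-null array is still emitted. Row 4 yields nothing in the inner mode and a single null-valued row in the outer mode, which is the case the comment above is about.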

return context.apply(p -> new Generate(new GeneratorOuter(explodeGenerator), seq(), true, (Option) None$.MODULE$, seq, p));
else {
//in case an alias does appear - remove the original field from the returning columns
context.apply(p -> new Generate(new GeneratorOuter(explodeGenerator), seq(), true, (Option) None$.MODULE$, seq, p));

ditto

@YANG-DB YANG-DB marked this pull request as ready for review November 5, 2024 22:20
remove outer generator

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
remove outer generator

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
@YANG-DB YANG-DB requested a review from LantaoJin November 6, 2024 18:14
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
@LantaoJin LantaoJin merged commit 4303057 into opensearch-project:main Nov 7, 2024
4 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 12, 2024
* add expand command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add expand command with visitor

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* create unit / integration tests

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update expand tests

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add tests

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update doc

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update docs with examples

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update scala style

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update with additional test case
remove outer generator

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update with additional test case
remove outer generator

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update documentation

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

---------

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
(cherry picked from commit 4303057)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
YANG-DB pushed a commit that referenced this pull request Nov 13, 2024
* add expand command
* add expand command with visitor
* create unit / integration tests
* update expand tests
* add tests
* update doc
* update docs with examples
* update scala style
* update with additional test case
remove outer generator
* update with additional test case
remove outer generator
* update documentation

---------

(cherry picked from commit 4303057)

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kenrickyap pushed a commit to Bit-Quill/opensearch-spark that referenced this pull request Dec 11, 2024
Labels
0.6 backport 0.6 Lang:PPL Pipe Processing Language support
3 participants