Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix aggregate queries with case expressions #354

Merged
merged 6 commits into from
Mar 11, 2024
Merged

Conversation

dauinsight
Copy link
Contributor

Addresses issue: #316

Params in the SELECT clause are interpreted differently than the params in the GROUP BY clause.

Example:
The following query fails

SELECT count(1) , case when name = ? then 1 else 2 end from TABLE group by case when name = ? then 1 else 2 end;

Results:

[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Column 'name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (8120) (SQLExecDirectW)"

However, grouping these params avoids the error:

DECLARE @var1 NVARCHAR(64);
SET @var1 = ?
SELECT count(1) , case when name = @var1 then 1 else 2 end 
FROM TABLE 
GROUP BY case when name = @var1 then 1 else 2 end;

@dauinsight dauinsight linked an issue Feb 28, 2024 that may be closed by this pull request
@federicoemartinez
Copy link
Contributor

I would add a test like this one:

    def test_group_by_case(self):
        annotated_queryset = Book.objects.annotate(age=Case(
            When(id__gt=1000, then=Value("new")),
            default=Value("old"),
            output_field=CharField())).values('age').annotate(sum=Sum('id'))
        self.assertEqual(list(annotated_queryset.all()), [])

@@ -616,6 +664,8 @@ def format_params(self, params):

def execute(self, sql, params=None):
self.last_sql = sql
if 'GROUP BY' in sql:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do the same for executemany ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we'll have to update every set of params

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be difficult to support this logic for multiple queries because we group params if they are the same (and different queries have different params).

In the future we could tighten the constraints for grouping params so that it only applies to case/when expressions.

length = len(value)
if length == 0:
return 'NVARCHAR'
return 'NVARCHAR(%s)' % len(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if value is too long, we should use nvarchar(max), shouldn't we?

mssql/base.py Outdated
@@ -571,6 +573,36 @@ def __init__(self, cursor, connection):
self.last_sql = ''
self.last_params = ()

def _as_sql_type(self, typ, value):
if value is None:
return 'INT'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will generate errors.
For example, if the column where I use this value is of type datetime, it will fail as seen in #340.
Maybe we have to replace None with nulls in our query?

Copy link
Contributor

@federicoemartinez federicoemartinez Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried with sql_variant and it fails. I didn't try it with a group by query, but I think we can get the same result

1> declare @test sql_variant;
2> set @test = null
3> update TABLE_NAMe set datetime_column = @test;
4> go
Msg 257, Level 16, State 3, Server sqlhost, Line 3
Implicit conversion from data type sql_variant to datetimeoffset is not allowed. Use the CONVERT function to run this query.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not been able to generate this with the ORM, but using raw i was able to make it fail:

query = Comment.objects.raw("SELECT * from testapp_comment where created_at = %s GROUP BY id", params=(None,))
print(list(query))

The generated query is:

DECLARE @var0 INT = ?  
SELECT * from testapp_comment where created_at = @var0 GROUP BY id

And it fails:
pyodbc.DataError: ('22018', '[22018] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Operand type clash: datetime2 is incompatible with int (206) (SQLExecDirectW)')

It is a silly example, but I wonder what else we can break

@federicoemartinez
Copy link
Contributor

I think it is ok now

@dauinsight
Copy link
Contributor Author

I think it is ok now

Thanks @federicoemartinez , we will have an additional team member review before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Would you consider client side param substitution?
3 participants