Backstory:
Currently source-postgres captures array columns as a JSON object with dimensions and elements properties, each of which is an array. This is because PostgreSQL arrays are inherently multidimensional, and we're trying to preserve them as faithfully as possible.
(The JSON schema for a recursive "array whose elements can each be type X or another nested array" type is so incredibly terrible that it was never a realistic option, so "a flat array of values plus a list of array dimensions" was the best we could do to represent the original value faithfully.)
But it turns out that nobody ever wants that. They just want simple, boring one-dimensional arrays of values to translate into a JSON array of equivalent values. So we should do that.
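To make the two shapes concrete, here is a minimal Python sketch of the "flat array of values plus a list of array dimensions" representation described above. The helper name `to_dimensioned` is hypothetical, and the assumption that `dimensions` holds one length per axis while `elements` holds the values in row-major order is an illustration, not the connector's actual code.

```python
def to_dimensioned(value):
    """Convert a rectangular nested list into a dimensions/elements object.

    Hypothetical helper for illustration; assumes every sub-list at a given
    depth has the same length, as PostgreSQL requires of its arrays.
    """
    dimensions = []
    level = value
    # Walk down the first element at each depth to measure the array shape.
    while isinstance(level, list):
        dimensions.append(len(level))
        if not level:
            break
        level = level[0]

    elements = []

    def collect(node, depth):
        # Below the last dimension every node is a scalar element.
        if depth == len(dimensions):
            elements.append(node)
            return
        for child in node:
            collect(child, depth + 1)

    collect(value, 0)
    return {"dimensions": dimensions, "elements": elements}


one_d = to_dimensioned([1, 2, 3])
# one_d == {"dimensions": [3], "elements": [1, 2, 3]}
two_d = to_dimensioned([[1, 2, 3], [4, 5, 6]])
# two_d == {"dimensions": [2, 3], "elements": [1, 2, 3, 4, 5, 6]}
```

For the common one-dimensional column, what users expect is just `[1, 2, 3]`, which is exactly the `elements` list.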
Plan:
1. Add a feature flag flattened_arrays.
   a. When flattened_arrays is set, value translation will return just the elements array instead of the array-with-dimensions object.
   b. When flattened_arrays is set, discovered schemas will of course reflect that change in behavior.
2. Set no_flattened_arrays on all source-postgres captures in production.
3. Merge a followup PR which toggles the default for new captures.
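The plan can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the connector's implementation: the function names are hypothetical, the flag setting is assumed to be a comma-separated string in which a `no_` prefix turns a flag off, and the object schema shown for the current behavior is a guess at its general shape.

```python
def parse_feature_flags(setting, defaults):
    """Parse a comma-separated flag list; a "no_" prefix turns a flag off.

    The string format is an assumption made for this sketch.
    """
    flags = dict(defaults)
    for name in setting.split(","):
        name = name.strip()
        if not name:
            continue
        if name.startswith("no_"):
            flags[name[len("no_"):]] = False
        else:
            flags[name] = True
    return flags


def translate_array(dimensions, elements, flags):
    """Step 1a: return the bare elements list when flattened_arrays is set."""
    if flags.get("flattened_arrays", False):
        return list(elements)
    return {"dimensions": list(dimensions), "elements": list(elements)}


def array_schema(element_schema, flags):
    """Step 1b: the discovered JSON schema mirrors the translated value."""
    flat = {"type": "array", "items": element_schema}
    if flags.get("flattened_arrays", False):
        return flat
    # Assumed shape of the current array-with-dimensions schema.
    return {
        "type": "object",
        "required": ["dimensions", "elements"],
        "properties": {
            "dimensions": {"type": "array", "items": {"type": "integer"}},
            "elements": flat,
        },
    }


# Step 2: existing captures opt out explicitly, so step 3 can flip the
# default for new captures without changing the output of existing ones.
new_default = {"flattened_arrays": True}
existing = parse_feature_flags("no_flattened_arrays", new_default)
fresh = parse_feature_flags("", new_default)
```

With the default flipped, `translate_array([2, 2], [1, 2, 3, 4], fresh)` yields the flat list, while the same call with `existing` still yields the object with `dimensions` and `elements`. That is why step 2 has to land before step 3.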
willdonnelly changed the title from "source-postgres: use flat arrays rather than array objections w/ dimensions" to "source-postgres: use flat arrays rather than array objects w/ dimensions" on Feb 27, 2025