Feature auto infer orient as table or split in read_json #60969

chandra-teajunkie

Example:

This example shows how, with this change, read_json infers the orient when none is passed, for files written with orient='table' or orient='split'.

Saving DataFrame to JSON with orient='table':

import pandas as pd

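# Create a sample DataFrame
df = pd.DataFrame({"A": [1, 2, 3]})

# Write with orient='table', then read back without passing orient;
# with this PR, read_json is expected to infer it.
df.to_json('test.json', indent=1, orient='table')
read = pd.read_json('test.json')
print(read)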

   A
0  1
1  2
2  3

Saving DataFrame to JSON with orient='split':

df = pd.DataFrame({"A": [1, 2, 3]})

# Same round trip with orient='split'; orient is again omitted on read.
df.to_json('test.json', indent=1, orient='split')
read = pd.read_json('test.json')
print(read)

   A
0  1
1  2
2  3

Member

@WillAyd WillAyd left a comment

I am generally -1 on this feature, especially if it will require fully loading the document twice

    with open(path_or_buf, encoding="utf-8") as f:
        json_data = json.load(f)
else:
    json_data = json.load(path_or_buf)
Member

So this is loading the entire JSON document twice now? Isn't that going to at least double the runtime?

Author

Hi @WillAyd

Thanks a lot for your feedback. Much appreciated.
Yeah, I did not fully consider the performance aspect of this.

So, if the user does not pass orient explicitly, this will read the document one extra time, as you mentioned.

But I couldn't think of any other way to validate the schema of the JSON file and automatically infer an appropriate orient.
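
For illustration only, here is a minimal sketch of what key-based inference could look like. It is not the code from this PR and not a pandas API: guess_orient is a hypothetical helper, and it assumes the document has already been parsed into a Python object, which is exactly the extra pass being questioned above.

```python
import json


def guess_orient(doc):
    """Hypothetical helper (not part of pandas): guess orient from top-level keys."""
    if isinstance(doc, dict):
        keys = set(doc)
        # to_json(orient='table') writes top-level "schema" and "data" keys.
        if {"schema", "data"} <= keys:
            return "table"
        # to_json(orient='split') writes "columns" and "data" (plus "index" by default).
        if {"columns", "data"} <= keys:
            return "split"
    # Anything else is left for the caller to decide.
    return None


table_doc = json.loads('{"schema": {"fields": []}, "data": []}')
split_doc = json.loads('{"columns": ["A"], "index": [0], "data": [[1]]}')
other_doc = json.loads('{"A": {"0": 1}}')

print(guess_orient(table_doc))
print(guess_orient(split_doc))
print(guess_orient(other_doc))
```

A check like this is ambiguous by construction: a frame written with orient='columns' whose columns happen to be named "columns" and "data" would be misread as 'split'.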

@WillAyd WillAyd added the IO JSON read_json, to_json, json_normalize label Feb 21, 2025
@mroeschke
Member

I am also generally -1 on this feature. We've tried to move toward explicit behaviors over time, and inferencing goes against explicitness.
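
As a point of comparison, a short sketch of the explicit form: the orient used when writing is passed again when reading. It uses in-memory strings via StringIO instead of the test.json file from the example above, purely to keep the snippet self-contained.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Serialize to JSON strings with two different orients.
table_json = df.to_json(orient="table")
split_json = df.to_json(orient="split")

# Read each one back, stating the orient explicitly instead of inferring it.
from_table = pd.read_json(StringIO(table_json), orient="table")
from_split = pd.read_json(StringIO(split_json), orient="split")

print(from_table)
print(from_split)
```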

@chandra-teajunkie
Author

Hi @mroeschke
Oh alright.

@WillAyd
Member

WillAyd commented Feb 25, 2025

Thanks for the interest in a PR @chandra-teajunkie, but from the discussion I don't think this is one we will move forward with.

@WillAyd WillAyd closed this Feb 25, 2025
@chandra-teajunkie
Author

Thanks for the feedback!

Labels
IO JSON read_json, to_json, json_normalize
Development

Successfully merging this pull request may close these issues.

ENH: automatically detect orient= for read_json if not supplied
3 participants