Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix notion nested bulleted list import error #2806

Closed
wants to merge 2 commits into from

Conversation

RED-ROSE515
Copy link
Collaborator

This PR resolves the issue : #2747

This issue is occured because strip function removes the nested list and return a list.

So, in this PR, I added a new processHtmlContent function that checks if the HTML has nested list structure.
If it does, preserves the structure while only cleaning up unnecessary artifacts
If not, falls back to the original stripping behavior.

I use this function instead of directly calling strip on the HTML content.

@RED-ROSE515 RED-ROSE515 marked this pull request as draft February 5, 2025 21:55
@RED-ROSE515 RED-ROSE515 marked this pull request as ready for review February 5, 2025 22:27
Copy link
Contributor

@raineorshine raineorshine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the new submission.

Here are my thoughts on this approach:

  • strip does a lot of work, including converting
    tags, decimal codes, formatting tags, html sanitization, colors, and empty formatting attributes. processHtmlContent does not do any of this, and there is no test coverage for these cases. It is assuming that Notion-like HTML will not need of this extra handling, which I don't think is safe.
  • Two separate code pathways to maintain
  • Lacking historical context: Notion HTML worked at one point. When was the regression introduced, and what caused it? Identifying the regression should probably have been the first thing to happen, long before a proposed solution.
  • Lastly, as we recently discovered, HTML importing in general is broken in main right now. It perhaps broke when importFiles was added, but further research is needed. That is a blocker for this issue. It doesn't make sense to fix a subset of HTML imports when the general case is not working.

So, unfortunately I think there are bigger questions to answer before we can propose a solution to Notion HTML.

I'm going to put this PR on hold until we have a better understanding of the original regression and consider how this may or may not be fixed within the existing strip function.

@raineorshine raineorshine added the hold Pause development label Feb 6, 2025
@raineorshine
Copy link
Contributor

Closing in favor of #2814. It restored general HTML import by removing the early strip, at the expense of some other import cases. See PR for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hold Pause development
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants