Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the check for misleading links #1901

Merged
merged 7 commits into from
Feb 14, 2025

Conversation

ed-kung
Copy link
Contributor

@ed-kung ed-kung commented Feb 13, 2025

Description

Fix/address issue #323

Currently, isMisleadingLink in lib/rehype-sn.js checks whether links are misleading by comparing the link text to the href. If the link text is URL-like (contains letters, numbers, underscores, and dots), the function adds http:// to the link-text and then checks whether the URL's origin has the same origin as href.

The issue is that if the link text contains other characters, like : or /, which are common in URLs, isMisleadingLink will return false (doesn't identify the link as misleading). This PR fixes the issue by checking if the link text is any valid URL. If it is, it compares the origin of the link text to that of the href, and if the two are different, isMisleadingLink will return true (is misleading).

Before:

After:

Additional Context

This patch does not fix (potentially) misleading links where the domain of the link text and the href are the same. For example, [https://stacker.news/items/123](https://stacker.news/items/321) will render the link text as https://stacker.,news/items/123 while linking to https://stacker.news/items/321. It was considered that there may be legitimate use cases of having a link text and the href be different while sharing the same origin, while not being able to do phishing attacks by this vector. See #323

Checklist

Are your changes backwards compatible? Please answer below:

Some link texts that weren't caught and replaced by isMisleadingLink before will now be caught and replaced. However, I've never observed such a link while using stacker news

On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:

7: I've tested it on local instance with various posts. I'm not sure how to test effects on performance / impact on existing SN posts.

Also, here is a small unit test of isMisleadingLink that I did manually:

function isMisleadingLink (text, href) {
  let misleading = false

  try {
    const hrefUrl = new URL(href)

    try {
      const textUrl = new URL(text)
      if (textUrl.origin !== hrefUrl.origin) {
        misleading = true
      }         
    } catch {}

    if (/^\s*(\w+\.)+\w+/.test(text)) {
      if (new URL(hrefUrl.protocol + text).origin !== hrefUrl.origin) {
        misleading = true
      }
    }
  } catch {}
  return misleading
}

let tests = [
  ["https://stacker.news/items/1234", "https://stacker.news/items/1234"],
  ["innocent text", "https://stacker.news/items/1234"],
  ["innocenttext", "https://stacker.news/items/1234"],
  ["innocent.text", "https://stacker.news/items/1234"],
  ["https://stacker.news/items/1235", "https://stacker.news/items/1234"],
  ["https://google.com", "https://bing.com"],
  ["www.google.com", "https://bing.com"],
  ["www.google.com", "https://www.google.com"],
  ["stacker.news", "https://stacker.news"]
];

for (let [text, href] of tests) {
  console.log(`text: ${text}`);
  console.log(`href: ${href}`);
  console.log(`misleading: ${isMisleadingLink(text, href)}`);
}

Output:

text: https://stacker.news/items/1234
href: https://stacker.news/items/1234
misleading: false
text: innocent text
href: https://stacker.news/items/1234
misleading: false
text: innocenttext
href: https://stacker.news/items/1234
misleading: false
text: innocent.text
href: https://stacker.news/items/1234
misleading: true
text: https://stacker.news/items/1235
href: https://stacker.news/items/1234
misleading: false
text: https://google.com
href: https://bing.com
misleading: true
text: www.google.com
href: https://bing.com
misleading: true
text: www.google.com
href: https://www.google.com
misleading: false
text: stacker.news
href: https://stacker.news
misleading: false

For frontend changes: Tested on mobile, light and dark mode? Please answer below:

not frontend change

Did you introduce any new environment variables? If so, call them out explicitly here:

no

@ed-kung ed-kung marked this pull request as ready for review February 13, 2025 05:21
@ekzyis ekzyis self-requested a review February 13, 2025 11:22
@ekzyis
Copy link
Member

ekzyis commented Feb 13, 2025

Hi, thanks for the PR! Will try to review today.

Also, here is a small unit test of isMisleadingLink that I did manually:

Can you move function isMisleadingLink to lib/url.js and put the unit test into lib/url.spec.js? You'd have to move the existing test cases into the existing describe first to avoid confusion.

It's our only unit test (was created as a proof of concept) and I think the unit test you created would fit well in there!

@ed-kung ed-kung marked this pull request as draft February 13, 2025 19:24
@ed-kung ed-kung marked this pull request as ready for review February 13, 2025 20:01
@ekzyis
Copy link
Member

ekzyis commented Feb 14, 2025

Sorry, didn't get to the review today. Will do it first thing in the morning!

Copy link
Member

@ekzyis ekzyis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looks good!

I only added comments to explain the test cases and realized that URLs can contain hyphens but those were nitpicks so I just pushed the changes myself. I hope you don't mind.

@ed-kung
Copy link
Contributor Author

ed-kung commented Feb 14, 2025

no problem at all, thanks!

@huumn
Copy link
Member

huumn commented Feb 14, 2025

Thanks ed! Do you have a lightning address I can send sats to?

@huumn huumn merged commit 15bd1c3 into stackernews:master Feb 14, 2025
6 checks passed
@ed-kung
Copy link
Contributor Author

ed-kung commented Feb 14, 2025

Yes, you can send to simplestacker@getalby.com. Thanks so much! Hope to keep contributing to this great project :)

@ed-kung ed-kung deleted the check-misleading-links branch February 15, 2025 00:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants