Fuzzier Quote Search #14

Open
nstringham opened this issue May 6, 2022 · 2 comments
nstringham commented May 6, 2022

User Story

As a user of the compsci bot, I would like to be able to find quotes without having to type a piece of the quote exactly.

Bug Report

Steps to Reproduce

  1. type !quote Tehranipour
    • observe that the bot returns a quote that contains "Tehranipour"
  2. type !quote tehranipour

Expected Behavior

The bot returns a quote that contains "Tehranipour", just as it does in step 1.

Actual Behavior

The bot returns an error.

Code Analysis

This line performs an exact, case-sensitive substring search for the query in each quote:
https://github.com/Nanosplitter/CompsciBot/blob/b8b60cd7b8082fee6cfaa38150f0fd7c285893f9/noncommands/quotes.py#L29
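
The effect is easy to reproduce in isolation. The snippet below is a minimal sketch, not the bot's actual code: `keywords` and `line` follow the names used in this issue, and the quote text is a made-up placeholder.

```python
# Hypothetical quote text; only the name "Tehranipour" comes from this issue.
line = "Tehranipour: remember to comment your code"

# Python's `in` operator on strings is an exact, case-sensitive substring test.
keywords = "Tehranipour"
exact_case_found = keywords in line   # True: same capitalisation as the quote

keywords = "tehranipour"
lower_case_found = keywords in line   # False: the lower-case "t" does not match
```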

Suggested Improvements

I have three proposals, each of which fixes this problem.

Option A - Case Insensitive Match

Replace `keywords in line` with a case-insensitive comparison such as `keywords.lower() in line.lower()`.
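
As a rough sketch of Option A, assuming the quotes are available as a plain list of strings (the function name `find_quotes` and the sample quotes are hypothetical, not taken from the repository):

```python
def find_quotes(keywords, quotes):
    """Return every quote that contains `keywords`, ignoring case."""
    needle = keywords.lower()  # lower-case the query once, not once per quote
    return [quote for quote in quotes if needle in quote.lower()]


# Hypothetical sample data for illustration only.
sample_quotes = [
    "Tehranipour: remember to comment your code",
    "Someone else: tabs or spaces",
]
option_a_result = find_quotes("tehranipour", sample_quotes)
```

`str.casefold()` could be used in place of `str.lower()` if the quotes may contain non-English text, since it applies more aggressive case normalisation.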

Option B - Fast Fuzzy Search

Replace `keywords in line` with a primitive fuzzy search algorithm or library, so that users can make typos and spelling mistakes.
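
One way Option B might look, using only the standard library's `difflib`. This is a sketch under assumptions: the function names, the 0.8 threshold, and the sample quotes are all hypothetical, and a dedicated fuzzy-matching library could be swapped in instead.

```python
import re
from difflib import SequenceMatcher


def fuzzy_score(keywords, line):
    """Best similarity (0.0 to 1.0) between the query and any run of words
    in the line that has the same number of words as the query."""
    query_words = re.findall(r"\w+", keywords.lower())
    line_words = re.findall(r"\w+", line.lower())
    if not query_words or not line_words:
        return 0.0
    query = " ".join(query_words)
    width = len(query_words)
    best = 0.0
    # Slide a window of `width` words across the line and keep the best ratio.
    for start in range(max(1, len(line_words) - width + 1)):
        window = " ".join(line_words[start:start + width])
        best = max(best, SequenceMatcher(None, query, window).ratio())
    return best


def fuzzy_find(keywords, quotes, threshold=0.8):
    """Return quotes scoring at least `threshold`, best match first."""
    scored = [(fuzzy_score(keywords, quote), quote) for quote in quotes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [quote for score, quote in scored if score >= threshold]


# Hypothetical sample data; note the misspelled query.
sample_quotes = [
    "Tehranipour: remember to comment your code",
    "Someone else: tabs or spaces",
]
option_b_result = fuzzy_find("tehranipur", sample_quotes)
```

Every search compares the query against every window of every quote, which is why this option is expected to be noticeably slower than a plain substring test.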

Option C - Fuzzy Search with index

Use a proper fuzzy search algorithm with an index that is generated once and reused for future searches.

Note: this solution works well if the quotes are stored in a database.
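
To illustrate the "build once, search many times" idea behind Option C, here is a small in-memory trigram index. It is only a sketch: the class name `QuoteIndex`, the 0.6 overlap threshold, and the sample quotes are assumptions, and trigram overlap is just one of several possible fuzzy-matching techniques.

```python
from collections import Counter, defaultdict


def trigrams(text):
    """Return the set of overlapping 3-character pieces of the lower-cased text."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


class QuoteIndex:
    """Maps each trigram to the ids of the quotes that contain it."""

    def __init__(self, quotes):
        self.quotes = list(quotes)
        self.index = defaultdict(set)
        # The expensive pass over all quotes happens once, at build time.
        for quote_id, quote in enumerate(self.quotes):
            for gram in trigrams(quote):
                self.index[gram].add(quote_id)

    def search(self, keywords, min_overlap=0.6):
        """Return quotes sharing at least `min_overlap` of the query's trigrams."""
        grams = trigrams(keywords)
        if not grams:  # queries shorter than three characters have no trigrams
            return []
        hits = Counter()
        for gram in grams:
            for quote_id in self.index.get(gram, ()):
                hits[quote_id] += 1
        ranked = sorted(
            ((count / len(grams), quote_id) for quote_id, count in hits.items()),
            reverse=True,
        )
        return [self.quotes[qid] for share, qid in ranked if share >= min_overlap]


# Hypothetical sample data; the first query is misspelled.
sample_quotes = [
    "Tehranipour: remember to comment your code",
    "Someone else: tabs or spaces",
]
quote_index = QuoteIndex(sample_quotes)
option_c_result = quote_index.search("tehranipur")
```

A search only touches the quotes that share at least one trigram with the query, rather than scanning every quote. If the quotes live in a database, its own text-indexing features may be a better fit than a hand-built index like this one.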

Comparison

|  | Case Insensitive | Misspellings and Typos | Performance Impact | Complexity |
| --- | --- | --- | --- | --- |
| Option A | ✅ yes | ❌ no | 🟨 slightly slower | 🟩 not complex |
| Option B | ✅ yes | ✅ small mistakes | 🟥 much slower | 🟨 moderately complex |
| Option C | ✅ yes | ✅ medium mistakes | 🟩 much faster | 🟥 complex |

coxender added a commit that referenced this issue May 30, 2022
- compare in lower case when in quote search
- fix Issue #14  with solution A, the simplest solution
@coxender

I think that, in the long run, option C is better saved until after we migrate the quotes to a database (unless that has already happened and I am unaware). This issue should stay open even if option A gets merged into main.

nstringham pushed a commit that referenced this issue Aug 30, 2022
- compare in lower case when in quote search
- fix Issue #14  with solution A, the simplest solution
@Nanosplitter

The quotes are now in a DB, so option C is much easier if you want to do it.
