Data quality error: 8 consonants or more #11302

Open
CharlesNepote opened this issue Jan 27, 2025 · 0 comments
Labels
brands (Not taxonomized yet. https://wiki.openfoodfacts.org/Brands), 🧽 Data quality (https://wiki.openfoodfacts.org/Quality), ingredients, Product name

Comments

@CharlesNepote
Member

Words with 6 or 7 consonants in a row legitimately occur in some fields:

  • product name (rare): Hirschschnitzel
  • brands: some brands contain 6 consonants in a row, such as knightsbridge and feldschlosschen
  • ingredients: some ingredient lists contain at least "durchschnittliche" (a German word), with 7 consonants in a row

But I was not able to find real words with 8 or more consonants in a row, except:

  • the product names "Frutchtschnitte" and "Borschtsch" (8 consonants)
  • names with Roman numerals, e.g. MDLCCXXXVI contains 9 consonants in a row.

In Mirabelle, I searched for products containing 8 or more consonants in a row:

  • in the fields product_name, brands_tags, ingredients_tags
  • excluding Roman numerals, "Frutchtschnitte" and "Borschtsch"

The result is 410 products as of today (2025-01-27).

Many of these entries are vandalism; many others are simply bad OCR.
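
For illustration, here is a minimal Python sketch of what such a check could look like. It is not the query used in Mirabelle and not existing server code. The function name `suspicious_words`, the choice to treat "y" as a vowel, and the loose Roman-numeral test (any word made only of the letters I, V, X, L, C, D, M) are all assumptions made for this example.

```python
import re

# Assumption: a consonant is any ASCII letter other than a, e, i, o, u, y.
# Accented letters are not in this class, so they break a run.
CONSONANT_RUN = re.compile(r"[bcdfghjklmnpqrstvwxz]{8,}", re.IGNORECASE)

# Assumption: a loose Roman-numeral test, i.e. a word made only of numeral letters.
ROMAN_NUMERAL = re.compile(r"[ivxlcdm]+", re.IGNORECASE)

# The two exceptions named in this issue (compared in lowercase).
KNOWN_EXCEPTIONS = {"frutchtschnitte", "borschtsch"}


def suspicious_words(text):
    """Return the words in `text` that contain 8 or more consonants in a row,
    skipping the known exceptions and Roman-numeral-like words."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    flagged = []
    for word in words:
        if word.lower() in KNOWN_EXCEPTIONS:
            continue
        if ROMAN_NUMERAL.fullmatch(word):
            continue
        if CONSONANT_RUN.search(word):
            flagged.append(word)
    return flagged
```

Under this definition of a consonant, "Hirschschnitzel" is also flagged, because "rschschn" is a run of 8, so it would probably need to be added to the exception list. "durchschnittliche", with a run of 7, is not flagged.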

TODO: look at other fields.

@github-project-automation github-project-automation bot moved this to To discuss and validate in 🍊 Open Food Facts Server issues Jan 27, 2025
@CharlesNepote CharlesNepote added brands Not taxonomized yet. https://wiki.openfoodfacts.org/Brands 🧽 Data quality https://wiki.openfoodfacts.org/Quality ingredients Product name labels Jan 27, 2025