checkfreetext does not detect ascii(160) #177

Open
dustymc opened this issue Feb 4, 2025 · 0 comments

dustymc commented Feb 4, 2025

Source

https://arctos.database.museum/agent/21358024

Summary

An agent was loaded with what looks like a leading space.




arctosprod@arctos>>  select '|'||preferred_agent_name from agent where agent_id=21358024;
            ?column?            
--------------------------------
 | Tri-County Health Department
(1 row)



arctosprod@arctos>>  select checkfreetext(preferred_agent_name) from agent where agent_id=21358024;
 checkfreetext 
---------------
 t


arctosprod@arctos>>  select dump(preferred_agent_name) from agent where agent_id=21358024;
NOTICE:   ---->160
NOTICE:  T---->84
NOTICE:  r---->114
NOTICE:  i---->105
NOTICE:  ----->45
NOTICE:  C---->67
NOTICE:  o---->111
NOTICE:  u---->117
NOTICE:  n---->110
NOTICE:  t---->116
NOTICE:  y---->121
NOTICE:   ---->32
NOTICE:  H---->72
NOTICE:  e---->101
NOTICE:  a---->97
NOTICE:  l---->108
NOTICE:  t---->116
NOTICE:  h---->104
NOTICE:   ---->32
NOTICE:  D---->68
NOTICE:  e---->101
NOTICE:  p---->112
NOTICE:  a---->97
NOTICE:  r---->114
NOTICE:  t---->116
NOTICE:  m---->109
NOTICE:  e---->101
NOTICE:  n---->110
NOTICE:  t---->116

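ASCII(160) is a non-breaking space (U+00A0, byte 0xA0 in Windows-1252 and Latin-1), not a regular space (32). It displays like a leading space, and checkfreetext currently accepts it.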

Task 1

Find some way to catch this with checkfreetext.
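
One possible shape for the check, sketched in Python rather than in the database function itself, since the current checkfreetext logic is not shown here. The function name `find_suspect_chars` and the choice of which Unicode categories to flag are assumptions for illustration: it reports any separator character other than a plain space, plus invisible format characters, and leaves control characters alone.

```python
import unicodedata


def find_suspect_chars(text):
    """Return (index, code point, Unicode name) for each suspect character.

    Hypothetical helper: flags separators other than the plain space
    (category Z*, which includes U+00A0) and format characters (Cf),
    such as the zero-width space.
    """
    suspects = []
    for index, ch in enumerate(text):
        if ch == " ":
            continue  # the ordinary space (32) is allowed
        category = unicodedata.category(ch)
        if category.startswith("Z") or category == "Cf":
            suspects.append((index, ord(ch), unicodedata.name(ch, "UNKNOWN")))
    return suspects


if __name__ == "__main__":
    # The agent name from this issue, with the leading U+00A0.
    print(find_suspect_chars("\u00a0Tri-County Health Department"))
```

A non-empty result would correspond to checkfreetext returning f. The same idea could presumably be expressed in the database as a test for chr(160) and similar code points, but that is untested against the real function.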

Task 2

The data (ArctosDB/arctos#8495) had multiple obvious encoding problems. https://handbook.arctosdb.org/documentation/encoding.html should probably gain guidance along the lines of 'if there are problems, back all the way out and force-encode to UTF-8 before doing anything else'.
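
As a rough illustration of what 'force-encode' could mean in practice, here is a minimal Python sketch. The helper name `to_utf8` and the Windows-1252 fallback are assumptions; the actual source encoding of the data in ArctosDB/arctos#8495 is not stated here and would need to be confirmed before converting anything.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw re-encoded as UTF-8.

    Hypothetical helper: bytes that are already valid UTF-8 pass through
    unchanged; anything else is decoded with the fallback encoding
    (assumed to be Windows-1252), replacing undecodable bytes with U+FFFD.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    # A lone 0xA0 byte is not valid UTF-8, so the fallback path is taken.
    print(to_utf8(b"\xa0Tri-County Health Department"))
```

Note that this only normalizes the encoding. A Windows-1252 non-breaking space becomes a valid UTF-8 non-breaking space, so the value would still need to be caught or cleaned by a separate check.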

User Interface Changes

none

Data Structure Changes

none

dustymc added a commit to ArctosDB/documentation-wiki that referenced this issue Feb 4, 2025
dustymc added this to the pre-release milestone Feb 4, 2025