Google Document AI OCR - "Input Document too large" #267

Open
napopa opened this issue Mar 5, 2025 · 0 comments

Hi,

I'm using Google's Document AI API for OCR. It mostly works, but a few documents fail with an "Input Document too large" error even though they don't seem large at all: they are one-page scans of about 300 KB each.

I also uploaded the same document manually through the Document AI processor web interface, and it was OCR'd without any problem.

Any clues about what could be going on?

Here's the full output from the PaperlessGPT logs at debug level:

time="2025-03-05T02:42:35Z" level=debug msg="Found at least 25 remaining documents with tag paperless-gpt-ocr-auto"

time="2025-03-05T02:42:35Z" level=info msg="Processing document for OCR" document_id=839

time="2025-03-05T02:42:35Z" level=info msg="Starting OCR processing" document_id=839

time="2025-03-05T02:42:35Z" level=debug msg="Making HTTP request" headers="map[Authorization:[Token ****]]" method=GET url="http://192.168.178.136:8777/api/documents/839/download/"

time="2025-03-05T02:42:39Z" level=debug msg="Downloaded document images" document_id=839 page_count=1

time="2025-03-05T02:42:39Z" level=debug msg="Processing page" document_id=839 page=1

time="2025-03-05T02:42:42Z" level=error msg="Failed to process document" error="rpc error: code = InvalidArgument desc = Input Document too large" location=us processor_id=***** project_id=*****

time="2025-03-05T02:42:42Z" level=error msg="Error in processAutoTagDocuments: error in processAutoOcrTagDocuments: error processing OCR for document 839: error performing OCR for document 839, page 1: error processing document: rpc error: code = InvalidArgument desc = Input Document too large"
