
Ways of telling by how much a too-large input is too large #5

Closed
domenic opened this issue Aug 30, 2024 · 5 comments · Fixed by #43

domenic commented Aug 30, 2024

It's possible an input fed to these APIs is too large. This will be signaled by a rejected promise, probably a "QuotaExceededError" DOMException.

However, @uskay points out that this does not allow you to give informative error messages, telling the user or web developer by how much the input is too large.

There are two possible APIs one could imagine here:

Measure, then summarize

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

// First round trip: tokenize the input, without summarizing.
const tokenCount = await summarizer.countTokens(input);
if (tokenCount > summarizerCapabilities.maxTokens) {
  console.error(`Too large! You tried to summarize ${tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
} else {
  // Second round trip: tokenize again, plus summarize.
  console.log(await summarizer.summarize(input));
}

This API is probably bad because it requires two round-trips to the language model, one to tokenize, and then a second one to tokenize-plus-summarize.

More informative errors

This would probably look something like:

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

try {
  console.log(await summarizer.summarize(input));
} catch (e) {
  if (e.name === "TooManyTokensError") {
    // The rejection itself carries the measured token count.
    console.error(`Too large! You tried to summarize ${e.tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
  } else {
    throw e;
  }
}

This is probably better since it only has one round trip.

domenic commented Nov 26, 2024

In a separate thread, @andreban pointed out that you probably need both. Above I described situations where "more informative errors" is better, but his scenario is:

A developer may want to provide feedback to the user about how close they are to reaching the limit before the user submits the content to be summarized, so you'd need to know the max tokens and be able to count tokens, without actually summarizing

which cannot really be handled without the "measure, then summarize" approach. We'll just have to be sure to put appropriate warnings on the countTokens() API.
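
For concreteness, that flow might look something like the sketch below, reusing the countTokens()/maxTokens names from the example above (textarea, meter, and submitButton are assumed DOM elements for illustration, not part of any proposal):

const summarizer = await ai.summarizer.create();
const { maxTokens } = await ai.summarizer.capabilities();

// Re-measure on every edit so the user can see how close they are to the limit.
textarea.addEventListener("input", async () => {
  const tokenCount = await summarizer.countTokens(textarea.value);
  meter.textContent = `${tokenCount} / ${maxTokens} tokens`;
  submitButton.disabled = tokenCount > maxTokens;
});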

domenic commented Jan 22, 2025

I've put up an initial draft for this in #31.

The prompt API will change to align with this.

domenic commented Feb 26, 2025

Let's assume that whatwg/webidl#1465 works out. Then we will have a QuotaExceededError with properties quota and requested.
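
For illustration, the summarizer example from earlier might then look something like this (a sketch assuming that error shape lands as proposed; nothing here is final):

try {
  console.log(await summarizer.summarize(input));
} catch (e) {
  // Assumes whatwg/webidl#1465's QuotaExceededError, with `quota` and
  // `requested` properties, replaces the hypothetical TooManyTokensError.
  if (e instanceof QuotaExceededError) {
    console.error(`Too large! You requested ${e.requested}, but the quota is ${e.quota}.`);
  } else {
    throw e;
  }
}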

Should we rename other properties and methods to align with this?

Considerations:

  • It might be good to move away from "tokens" since that is somewhat specific to the LLM implementation strategy.
  • Similarly, it might be good to move away from the current prompt API's oncontextoverflow, since the word "context" doesn't appear anywhere else in the API.
  • Probably reusing the "quota" language is a good idea?
  • We should probably align prompt API and writing assistance APIs to some degree.
    • Right now the prompt API has countPromptTokens(), maxTokens, tokensSoFar, and tokensLeft.
    • Writing assistance APIs don't need a tokensSoFar/tokensLeft, since they're not stateful, but they would benefit from a count method and a max value.

Proposal 1:

  • maxTokens => quota
  • tokensSoFar => consumed (or used?)
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => consumption() (or usage()?)
  • oncontextoverflow => onquotaoverflow

These are nice and short. However, I'm a bit worried that these names are too generic. Compare ai.languageModel.consumption("string") to ai.languageModel.countTokens("string"), or ai.languageModel.tokensSoFar to ai.languageModel.consumed.

So proposal 2 would be to add a prefix, and change a few names to fit better with that prefix:

  • maxTokens => inputQuota
  • tokensSoFar => inputUsage
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => measureInputUsage()
  • oncontextoverflow => oninputquotaoverflow
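
For concreteness, here is a hypothetical sketch of proposal 2 on a prompt API session (the session shape and promptText are assumed for illustration; none of these names are final):

const session = await ai.languageModel.create();

// measureInputUsage() replaces countPromptTokens(); inputUsage and inputQuota
// replace tokensSoFar and maxTokens. tokensLeft is just the difference.
const usage = await session.measureInputUsage(promptText);
if (session.inputUsage + usage > session.inputQuota) {
  console.warn(`This prompt would exceed the input quota of ${session.inputQuota}.`);
}

// oninputquotaoverflow replaces the prompt API's oncontextoverflow.
session.oninputquotaoverflow = () => {
  console.warn("The session's input overflowed its quota.");
};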

@dslee414

Agreed on all the points mentioned under "Considerations".

As you pointed out about the statefulness difference between the prompt API and the writing assistance APIs, I think proposal (2) covers both APIs better. For example, with proposal (1), it may not be clear what consumption or usage is exactly for (is it for the whole session?). Also, measureInputUsage() seems clearer than just consumption() or usage(), which sound like they refer to the "current" state. (Alternatively, we could use measureUsage().) That being said, we could do a hybrid of proposals (1) and (2), something like:

  • quota
  • usage
  • measureInputUsage() (or measureUsage())
  • onquotaoverflow

Or, just sticking with proposal (2) also seems good, for consistent use of the prefix. And between "consumption" and "usage", I prefer "usage"; to me, "consumption" also sounds like it assumes an LLM implementation. :)

domenic commented Mar 3, 2025

Thanks for your thoughts!

I really like your hybrid proposal. But I think I slightly prefer (2) with the consistent input prefix everywhere. What helped was remembering that we are already using the "input" language for expectedInputLanguages.

I will update the relevant PRs now.
