CHROME | Update #1
Built-in AI Early Preview Program | Welcome and heads-up about the Prompt API
Contact | See this section
Last updated | Jul 25, 2024. See changelog.
Welcome, and thank you for participating in our Early Preview Program for built-in AI capabilities (article, talk at Google I/O 2024). Your involvement is invaluable as we explore opportunities to improve or augment web experiences with AI!
The Built-in AI Early Preview Program has several goals:
📣 | Know of other folks who would love to join this program? Or perhaps you got access to this document from a friend? Sign up to get the latest updates directly in your inbox. |
In this first update, we're excited to provide details about the upcoming exploratory Prompt API, designed to facilitate the discovery of AI use cases through local prototyping. More concretely, this API will let you interact with Gemini Nano, on-device, directly in your development environment.
📝 | Exploratory APIs are intended for local prototyping only. With these exploratory APIs, we intend to solicit feedback, confirm assumptions, and determine what task APIs (such as a Translation API) we build in the future. Consequently, the exploratory APIs may never launch. |
As we explore the possibilities of this technology together, it’s critical that we prioritize responsible AI development. To help guide our efforts, take a moment to review Google's Generative AI Prohibited Uses Policy. This policy outlines some key considerations for ethical and safe deployment of AI.
Let's learn and build together ✨!
The Prompt API is provided for local experimentation, to facilitate the discovery of use cases for built-in AI. With this API, you can send natural language instructions to an instance of Gemini Nano in Chrome.
While the Prompt API offers the most flexibility, it won’t necessarily deliver the best results and may not deliver sufficient quality in some cases. That’s why we believe that task-specific APIs (such as a Translation API), paired with fine-tuning or expert models, will deliver significantly better results. Our hope for the Prompt API is that it helps accelerate the discovery of compelling use cases to inform a roadmap of task-specific APIs.
The Prompt API is available, behind an experimental flag, from Chrome 127+ for desktop.
Our Built-in AI program is currently focused on desktop platforms. In addition, the following conditions are required for Chrome to download and run Gemini Nano.
Aspect | Windows | macOS | Linux
OS version | 10, 11 | ≥ 13 (Ventura) | Not specified
Storage | At least 22 GB of free space on the volume that contains your Chrome profile. Note that the model itself requires far less storage; the requirement leaves an ample margin.
GPU | Integrated GPU, or discrete GPU (e.g. video card).
Video RAM | 4 GB (minimum)
Network connection | A non-metered connection
🚧 | These are not necessarily the final requirements for Gemini Nano in Chrome. We are interested in feedback about performance, so we can further improve the user experience and adjust the requirements.
Not yet supported:
Follow these steps to enable Gemini Nano and the Prompt API flags for local experimentation:

1. Open chrome://flags/#optimization-guide-on-device-model and select "Enabled BypassPerfRequirement".
2. Open chrome://flags/#prompt-api-for-gemini-nano and select "Enabled".
3. Relaunch Chrome.
// Start by checking if it's possible to create a session based on the availability of the model, and the characteristics of the device.
const canCreate = await window.ai.canCreateTextSession();
// canCreate will be one of the following:
// * "readily": the model is available on-device and so creating will happen quickly
// * "after-download": the model is not available on-device, but the device is capable,
// so creating the session will start the download process (which can take a while).
// * "no": the model is not available for this device.
if (canCreate !== "no") {
const session = await window.ai.createTextSession();
// Prompt the model and wait for the whole result to come back.
const result = await session.prompt("Write me a poem");
console.log(result);
}
const canCreate = await window.ai.canCreateTextSession();
if (canCreate !== "no") {
const session = await window.ai.createTextSession();
// Prompt the model and stream the result:
const stream = session.promptStreaming("Write me an extra-long poem");
for await (const chunk of stream) {
console.log(chunk);
}
}
Each session can be customized with topK and temperature. The default values for these parameters are returned by window.ai.defaultTextSessionOptions().
const defaults = await window.ai.defaultTextSessionOptions();
const session = await window.ai.createTextSession({
  temperature: 0.6,
  topK: defaults.topK
});
Call destroy() to free resources if you no longer need a session. When a session is destroyed, it can no longer be used, and any ongoing execution will be aborted. If you intend to prompt the model often, you may want to keep the session around, since creating a session can take some time.
await session.prompt(`
You are a friendly, helpful assistant specialized in clothing choices.
`);
session.destroy();
// The promise will be rejected with an error explaining that the session is destroyed.
await session.prompt(`
What should I wear today? It's sunny and I'm unsure between a t-shirt and a polo.
`);
The sequence of characters <ctrl23> is Gemini Nano’s control sequence. It helps the model understand rounds of simulated interactions and when it’s supposed to respond. The sequence is meant to be used directly in the prompt (see the Prompt 101 section for details).
💡 | Currently, the control sequence <ctrl23> is automatically added by the Prompt API at the end of the prompt. |
However, for one-shot* or few-shot** prompting, it’s better to add the control sequence at the end of every simulated round of interaction (shot). See these examples.

*One-shot: a prompt where the model is provided with a single example to learn from and generate a response.
**Few-shot: a prompt where the model is provided with a few examples, typically ranging from two to five, to help it generate a more accurate response.
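Assembling shots by hand is error-prone, so it can help to build the prompt programmatically. A minimal sketch, assuming the prompt shapes described in this document; the helper name and the `{input, output}` shot shape are hypothetical, not part of the API:

```javascript
// Hypothetical helper: builds a few-shot prompt where each simulated
// round (shot) ends with Gemini Nano's <ctrl23> control sequence.
// The final round is left open so the model completes it.
function buildFewShotPrompt(shots, finalInput) {
  const rounds = shots.map(
    ({ input, output }) => `${input}\n${output}<ctrl23>`
  );
  return [...rounds, finalInput].join("\n");
}
```

The resulting string would then be passed to `session.prompt()` or `session.promptStreaming()` as usual.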
The Prompt API may receive errors from the AI runtime. See this section for a list of possible errors, and how they are mapped into DOMExceptions.
The “after-download” state and behavior is not supported. The API doesn’t trigger a download of the model. Instead, Chrome will trigger the download, either as part of the chrome://flags state change, or because of another on-device AI feature.
Currently, promptStreaming() returns a ReadableStream whose chunks successively build on each other.
For example, the following code logs a sequence such as "Hello", "Hello world", "Hello world I am", "Hello world I am an AI".
for await (const chunk of stream) {
console.log(chunk);
}
This is not the desired behavior. We intend to align with other streaming APIs on the platform, where the chunks are successive pieces of a single long stream. This means the output would be a sequence like "Hello", " world", " I am", " an AI".
For now, to achieve the intended behavior, you can implement the following:
let result = '';
let previousLength = 0;
for await (const chunk of stream) {
const newContent = chunk.slice(previousLength);
console.log(newContent);
previousLength = chunk.length;
result += newContent;
}
console.log(result);
🆕 | New: session persistence (Chrome 128.0.6606.0 and above)
From Chrome 128.0.6606.0, information sent to the model during a session is retained.
Session cloning is not yet implemented; stay tuned.
Only for Chrome versions strictly prior to 128.0.6606.0:
Although there is an object called AITextSession, persistence has not yet been implemented. Any information sent to the model is not retained, which means you must re-send the whole conversation each time. In other words, each call to prompt() or promptStreaming() is independent.
Once persistence is implemented, we intend to allow session cloning. This means it will be possible to set a session with a baseline context and prompt the model without needing to reiterate.
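On versions without persistence, one workaround is to keep your own transcript and resend it with each call. A minimal sketch; the helper name and the `User:`/`Assistant:` framing are illustrative assumptions, not an API requirement:

```javascript
// Sketch: maintain the conversation yourself and resend the whole
// transcript on every prompt, leaving the last assistant turn open.
function buildTranscriptPrompt(turns, nextUserMessage) {
  const lines = [];
  for (const { user, model } of turns) {
    lines.push(`User: ${user}`, `Assistant: ${model}`);
  }
  lines.push(`User: ${nextUserMessage}`, "Assistant:");
  return lines.join("\n");
}
```

After each response comes back, you would append the `{user, model}` pair to your `turns` array before the next call.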
Our implementation of the Prompt API doesn’t currently support Incognito mode or guest mode. This is due to a dependency in the AI runtime layer and is not intended to be a permanent limitation.
This API will not work if GenAILocalFoundationalModelSettings is set to “Do not download model”.
As of Jun 18, 2024, the Prompt API is available in Dedicated Workers, Shared Workers, and Service Workers. For the last two types of workers, you will need a recent Chrome Canary or Dev version.
To write the best prompts, we recommend reading the following:
In addition, here are our recommendations for Gemini Nano in Chrome.
DOs | DON'Ts
Include examples to guide your prompt. One example (one-shot prompting) is better than no examples. Few-shot prompting is best at guiding the model to return your intended results. | Avoid use cases with right or wrong answers.
Add rules. You can improve the quality of the output by adding rules in your prompt. Examples: “Respond in a friendly tone”, “Use the category ‘ambiguous’ if it’s unclear”, “Do not provide any explanation.” | Avoid use cases that bypass the user. LLMs may not always have a satisfying response, so it’s better to position your AI feature as a tool that supports the user in their tasks (e.g. generating a list of potential keywords that the user can quickly review and adjust).
Add a persona. This can help the model return something more aligned with your needs. Example: “You just bought a product, and are writing a review for said product. Please rephrase [...]” | Avoid customizing the parameters. While it’s possible to change the parameters (such as temperature and topK), we strongly recommend keeping the default values unless you’ve run out of ideas with the prompt or you need the model to behave differently.
(Non-English) Specify the output language. If you prompt the model in English but need a non-English response, specify the output language in the prompt. Example: “Please rephrase the following sentence in Spanish. Hola, cómo estás”. |
Use AI capabilities responsibly. Generative AI models can help you or your users be more productive, and enhance creativity and learning. We expect you to use and engage with this technology in accordance with Google’s Generative AI Prohibited Uses Policy. |
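The "add rules" and "add a persona" recommendations above can be combined mechanically. A small sketch; the helper and its option names are hypothetical, not part of the API:

```javascript
// Hypothetical helper: prefixes a persona and an explicit rule list
// before the user's task, per the DOs above.
function buildGuidedPrompt({ persona, rules, task }) {
  const ruleLines = rules.map((rule) => `- ${rule}`).join("\n");
  return `${persona}\nRules:\n${ruleLines}\n${task}`;
}
```

For example, `buildGuidedPrompt({ persona: "You are a friendly assistant.", rules: ["Respond in a friendly tone"], task: "Rephrase: ..." })` yields a single string to pass to `session.prompt()`.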
The control sequence helps the model understand rounds of simulated interactions and when it’s supposed to respond. It consists of the regular characters < c t r l 2 3 >. It’s not a special code that needs to be inserted by hitting various modifier keys.
For example, let’s say you want to retrieve a summary of an article. You can share an example of the full text of one article and the expected output, along with the control sequence. Then, include your full article and the requested output:
“[Full text of example article]
A paraphrased summary of this article is: [example summary]<ctrl23>
[Full text of article to be summarized]
A paraphrased summary of this article is:”
With more examples, the model is more likely to return a result closer to your expectations. Here is a few-shot example to summarize an article:
“[Full text of example article #1]
A paraphrased summary of this article is: [example summary #1]<ctrl23>
[Full text of example article #2]
A paraphrased summary of this article is: [example summary #2]<ctrl23>
[Full text of example article #3]
A paraphrased summary of this article is: [example summary #3]<ctrl23>
[Full text of article to be summarized]
A paraphrased summary of this article is:”
We’ll send surveys on an ongoing basis to get a sense of how the exploratory APIs and the task-specific API approach are working, and to collect signals about popular use cases, issues, etc.
If you experience quality or technical issues, consider sharing details. Your reports will help us refine and improve our models, APIs, and components in the AI runtime layer, to ensure safety and responsible use.
If you want to report bugs or other issues related to Chrome’s behavior / implementation of the Prompt API, provide as many details as possible (e.g. repro steps) in a public chromium bug report.
If you want to report ergonomic issues or other problems related to one of the built-in AI APIs, see if there is any related issue first and if not then file a public spec issue:
For other questions or issues, reach out directly by sending an email to the mailing list owners (chrome-ai-dev-preview+owners@chromium.org). We’ll do our best to be as responsive as possible or update existing documents when more appropriate (such as adding to the FAQ section).
To opt-out from the Early Preview Program, simply send an email to:
If you know someone who would like to join the program, ask them to fill out this form and that they communicate their eagerness to provide feedback when answering the last question of the survey!
The use of Rosetta to run the x64 version of Chromium on ARM is neither tested nor maintained, and unexpected behavior will likely result. Please check that all tools that spawn Chromium are ARM-native.
The current implementation of the Prompt API is primarily for experimentation and may not reflect the final output quality when integrated with task APIs we intend to ship.
That said, if you see the model generates harmful content or problematic responses, we encourage you to share feedback on output quality. Your reports are invaluable in helping us refine and improve our models, APIs, and components in the AI runtime layer, to ensure safety and responsible use.
Not at the moment. We acknowledge that this is inconvenient. For Latin languages, consider using this rule of thumb: one token is about four characters long.
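The rule of thumb above translates directly into a rough estimator. A sketch only, valid for Latin-script text; the function name is hypothetical:

```javascript
// Rough token estimate using the rule of thumb above:
// for Latin-script text, one token is about four characters.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}
```

Treat the result as an approximation for budgeting prompts, not an exact count.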
The default context window is set to 1024 tokens. For Gemini Nano, the theoretical maximum is 32k, but there is a tradeoff between context size and performance.
The current design only considers the last N tokens of a given input. Consequently, providing too much text in the prompt may result in the model ignoring its beginning. For example, if the prompt begins with "Translate the following text to English: ..." and is followed by thousands of lines of text, the model may never see the instruction and thus fail to provide a translation.
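One workaround is to trim the start of over-long inputs yourself and place the instruction at the end, where it survives truncation. A sketch using the default context window and the Latin-script rule of thumb mentioned above; the constants and helper name are assumptions for illustration:

```javascript
const CONTEXT_TOKENS = 1024;  // default context window (see above)
const CHARS_PER_TOKEN = 4;    // Latin-script rule of thumb

// Sketch: because only the trailing tokens are considered, trim the
// *start* of an over-long text and keep the instruction at the end.
function fitPrompt(text, instruction) {
  const budget = CONTEXT_TOKENS * CHARS_PER_TOKEN - instruction.length - 1;
  const trimmed = text.length > budget ? text.slice(-budget) : text;
  return `${trimmed}\n${instruction}`;
}
```

Note this still drops the start of the text; for tasks like translation you would typically chunk the input instead.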
Let us know if you can reproduce the issue: prompt, session options, details about the device, etc. In the meantime, restart Chrome and try again.
Some participants have reported that the following steps helped them get the component to show up:
The browser may not start downloading the model right away. If your computer fulfills all the requirements and you don't see the model download start on chrome://components after calling window.ai.createTextSession(), and Optimization Guide On Device Model shows version 0.0.0.0 / New, leave the browser open for a few minutes to wait for the scheduler to start the download.
If everything fails:
If you encounter a crash error message such as “the model process crashed too many times for this version”, then we’ll need a crash ID to investigate the root cause.
In the meantime, you’ll need to wait for a new Chrome version to get another chance of trying the Prompt API.
The full API surface is described below. See Web IDL for details on the language.
partial interface WindowOrWorkerGlobalScope {
  readonly attribute AI ai;
};
[Exposed=(Window,Worker)]
interface AI {
Promise<AIModelAvailability> canCreateTextSession();
Promise<AITextSession> createTextSession(
optional AITextSessionOptions options = {});
Promise<AITextSessionOptions> defaultTextSessionOptions();
};
[Exposed=(Window,Worker)]
interface AITextSession {
Promise<DOMString> prompt(DOMString input);
ReadableStream promptStreaming(DOMString input);
undefined destroy();
AITextSession clone(); // Not yet implemented (see persistence and cloning for context)
};
dictionary AITextSessionOptions {
[EnforceRange] unsigned long topK;
float temperature;
};
enum AIModelAvailability { "readily", "after-download", "no" };
Methods | DOMExceptions | Error messages | Comments
All methods | InvalidStateError | The execution context is not valid. | The JS context is invalid (e.g. a detached iframe). Remedy: ensure the API is called from a valid JS context.
createTextSession, defaultTextSessionOptions | OperationError | Model execution service is not available. | Remedy: try again, possibly after relaunching Chrome.
When streaming a response | UnknownError | An unknown error occurred: <error code> | Remedy: retry, possibly after relaunching Chrome. Report a technical issue if you are stuck and/or can easily reproduce the problem.
When streaming a response | NotSupportedError | The request was invalid. |
When streaming a response | UnknownError | Some other generic failure occurred. |
When streaming a response | NotReadableError | The response was disabled. |
When streaming a response | AbortError | The request was canceled. |
prompt, promptStreaming | InvalidStateError | The model execution session has been destroyed. | This happens when calling prompt or promptStreaming after the session has been destroyed. Remedy: create a new session.
createTextSession | NotSupportedError | Initializing a new session must either specify both topK and temperature, or neither of them. | This happens when the optional parameters are partially specified when calling createTextSession. Remedy: specify both topK and temperature, or neither, when calling createTextSession. Use the values from defaultTextSessionOptions if you only want to change one parameter.
createTextSession | InvalidStateError | The session cannot be created. | If canCreateTextSession returned "readily", this should not happen. Remedy: retry, possibly after relaunching Chrome. Report a technical issue if you are stuck and/or can easily reproduce the problem.
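For logging or retry logic, the DOMException names in the table above can be mapped to a coarse remedy category. A sketch only; the helper name and category strings are hypothetical:

```javascript
// Hypothetical helper: maps DOMException names from the table above
// to a coarse remedy hint for logging or retry logic.
function classifyPromptApiError(err) {
  switch (err.name) {
    case "InvalidStateError":
      return "recreate-session"; // invalid context or destroyed session
    case "OperationError":
    case "UnknownError":
      return "retry";            // transient runtime failure
    case "NotSupportedError":
      return "fix-request";      // invalid request or partial options
    case "NotReadableError":
      return "response-disabled";
    case "AbortError":
      return "canceled";
    default:
      return "unknown";
  }
}
```

You might call it from a `try { await session.prompt(...) } catch (e) { console.warn(classifyPromptApiError(e)); }` wrapper.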
Links to all previous updates and surveys we’ve sent can be found in The Context Index, also available via goo.gle/chrome-ai-dev-preview-index.
Date | Changes
Jun 28, 2024 |
May 29, 2024 |
May 30, 2024 |
May 31, 2024 |
Jun 1, 2024 |
Jun 3, 2024 |
Jun 6, 2024 |
Jun 9, 2024 |
Jun 18, 2024 |
Jun 25, 2024 |
Jun 26, 2024 |
Jun 27, 2024 |
Jun 28, 2024 |
Jun 29, 2024 |
Jul 1, 2024 |
Jul 9, 2024 |
Jul 11, 2024 |
Jul 19, 2024 |
Jul 25, 2024 |