Getting Started with the Gemini Pro API
On December 6, 2023, Google unveiled its long-awaited GPT-4 rival, Gemini. The model comes in three variants: Gemini Nano, which runs on edge devices (laptops, phones, etc.); Gemini Pro, which powers Google's Bard chatbot; and the yet-to-be-released Gemini Ultra, which reportedly tops GPT-4 on industry benchmarks.
Gemini Pro also comes with an API, which means it can be integrated into any AI-powered product. Let's explore how to do that.
Getting an API key
Google provides access to the Gemini Pro API as part of its Google AI Studio product. You will need to sign up for this free product to get an API key.
Individual Google account
If you have an individual Google account (i.e. you are not part of an organization's workspace), you can simply go to ai.google.dev and click "Get API key in Google AI Studio".
Then click "Get API key".
Then, depending on whether you already have a GCP project, click the corresponding option. If you don't have a project, Google will create one for you automatically.
An API key will be created. Copy it and paste it in a safe location, as you will not be able to see it again later.
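A common pattern is to keep the key out of your source code entirely, for instance in an environment variable. A minimal sketch in Python, assuming you have exported a GOOGLE_API_KEY variable in your shell:

import os

# Assumes you have run: export GOOGLE_API_KEY="your-key-here"
# Reading the key from the environment keeps it out of source control.
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
if GOOGLE_API_KEY is None:
    raise RuntimeError("GOOGLE_API_KEY environment variable is not set")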
Google Workspace account
If your Google account is part of a Workspace account (e.g. you are using a work email), your Workspace admin will need to activate Early Access Apps before you can follow the above steps.
Workspace admins should navigate to the Admin Console and find "Additional Google Services" in the left menu.
Then scroll down to "Early Access Apps" and click on it.
Enable it for everyone:
And also enable Core Data Access Permissions. You will not be able to get an API key without this enabled.
Once this is complete, you should be able to follow the steps in the Individual Account section to get your API key.
Integrating Gemini Pro in your application
Once you have obtained an API key and saved it in a safe place, you can start integrating the Gemini Pro API in your application.
There are multiple options depending on your stack.
Raw HTTP requests
The most versatile way to query the Gemini Pro API is via direct HTTP requests:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{"contents":[{"parts":[{"text":"Say this is a test."}]}]}'
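If you would rather send the same request from code, here is a minimal sketch using Python's requests library (the URL and payload are the ones above; the response parsing follows the candidates structure shown later in this post):

import os
import requests

API_KEY = os.environ["GOOGLE_API_KEY"]
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/gemini-pro:generateContent?key={API_KEY}"
)

# POST the JSON payload; requests sets the Content-Type header for us.
payload = {"contents": [{"parts": [{"text": "Say this is a test."}]}]}
resp = requests.post(URL, json=payload)
resp.raise_for_status()

# The generated text lives under candidates -> content -> parts.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])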
Python SDK
If your app is in Python, Google offers a handy Python SDK to interact with the API. The documentation can be found here.
Install the SDK:
$ pip install -q -U google-generativeai
And use it as such:
import os
import google.generativeai as genai

# Authenticate with the API key obtained earlier (read here from an environment variable)
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Is Gemini Pro really that good?")
print(response.text)
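The SDK also supports multi-turn conversations through start_chat, which tracks message history for you. A quick sketch:

# Start a chat session; the SDK accumulates the conversation history.
chat = model.start_chat(history=[])

first = chat.send_message("Write the first line of a story about a magic backpack.")
print(first.text)

# Follow-up messages automatically include the prior turns.
followup = chat.send_message("Can you set it in a quiet village in 1600s France?")
print(followup.text)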
TypeScript SDK
If your poison of choice is TypeScript, Google's got you covered too. Find the documentation here.
Install the library:
$ npm install @google/generative-ai
Then use it as such:
const { GoogleGenerativeAI } = require("@google/generative-ai");

// Authenticate with your API key
const genAI = new GoogleGenerativeAI(GOOGLE_API_KEY);

async function run() {
  const model = genAI.getGenerativeModel({ model: "gemini-pro" });
  const prompt = "Write a story about a magic backpack.";
  const result = await model.generateContent(prompt);
  // The generated text is available on the response object
  console.log(result.response.text());
}

run();
Other languages
If you're using other languages such as Go or Swift, you can find all of Google's SDKs here.
API differences with OpenAI
Note that the query and response payloads for OpenAI's and Gemini Pro's APIs are different.
Query payloads
OpenAI's query payload looks like this:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Write the first line of a story about a magic backpack."
      },
      {
        "role": "assistant",
        "content": "In the bustling city of Meadowbrook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."
      },
      {
        "role": "user",
        "content": "Can you set it in a quiet village in 1600s France?"
      }
    ],
    "temperature": 0.7
  }'
while Gemini Pro's looks like this:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{
          "text": "Write the first line of a story about a magic backpack."
        }]
      },
      {
        "role": "model",
        "parts": [{
          "text": "In the bustling city of Meadowbrook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."
        }]
      },
      {
        "role": "user",
        "parts": [{
          "text": "Can you set it in a quiet village in 1600s France?"
        }]
      }
    ]
  }'
Notice three main differences, which the conversion sketch below illustrates:
- OpenAI expects the model name in the request payload, whereas Gemini Pro uses a different endpoint per model
- OpenAI calls the conversation turns "messages", whereas Gemini calls them "contents"
- OpenAI refers to the model's role as "assistant", while Gemini uses "model"
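These mappings are mechanical, so translating an OpenAI-style message list into Gemini's format takes only a few lines. A minimal sketch in Python (openai_to_gemini is a hypothetical helper, not part of either SDK, and it only covers user/assistant turns):

def openai_to_gemini(messages):
    # Hypothetical helper: maps OpenAI-style chat messages to Gemini "contents".
    contents = []
    for message in messages:
        # OpenAI's "assistant" role becomes Gemini's "model" role.
        role = "model" if message["role"] == "assistant" else message["role"]
        # OpenAI's "content" string becomes a list of "parts".
        contents.append({"role": role, "parts": [{"text": message["content"]}]})
    return contents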
Response payloads
Response payloads are also quite different between OpenAI's and Gemini's APIs.
OpenAI's response payload reads:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-3.5-turbo-1106",
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "In the quaint village of Fleur-de-Lys ..."
      },
      "logprobs": null,
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
whereas Gemini's reads:
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "In the quaint village of Fleur-de-Lys ..."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "promptFeedback": {
    "safetyRatings": [
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "probability": "NEGLIGIBLE"
      }
    ]
  }
}
Note that Gemini returns a set of safety ratings with each response, and you can configure blocking thresholds for these categories at query time.
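With the Python SDK, for instance, thresholds can be passed via safety_settings. A minimal sketch (the threshold values shown are documented ones, but check Google's safety settings docs for the full list of categories and levels):

# Adjust blocking thresholds per harm category at query time.
response = model.generate_content(
    "Is Gemini Pro really that good?",
    safety_settings=[
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    ],
)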