Getting Started with the Gemini Pro API

Emmanuel Turlay · January 5, 2024

On December 6, 2023, Google unveiled its long-awaited GPT-4 rival, Gemini. The model comes in three variants: Gemini Nano, which runs on edge devices (laptops, phones, etc.); Gemini Pro, which powers Google's Bard chatbot; and the yet-to-be-released Gemini Ultra, which allegedly tops GPT-4 on industry benchmarks.

Gemini Pro also comes with an API, which means it can be integrated inside any AI-powered product. Let's explore how to do that.

Getting an API key

Google gives access to the Gemini Pro API as part of their Google AI Studio product. You will need to sign up for this free product to get an API key.

Individual Google account

If you have an individual Google account (i.e. you are not part of an organization's workspace), you can simply go to ai.google.dev and click "Get API key in Google AI Studio".

[Screenshot: the Gemini landing page at ai.google.dev]

Then click "Get API key".

[Screenshot: the "Get API key" button in Google AI Studio]

Then, depending on whether you already have a GCP project, click one of the two options. If you don't have a project, Google will create one for you automatically.

[Screenshot: the API key creation options]

An API key will be created. Copy it and paste it in a safe location, as you will not be able to see it again later.

Google Workspace account

If your Google account is part of a Workspace account (e.g. you are using a work email), your Workspace admin will need to activate Early Access Apps before you can follow the above steps.

Workspace admins should navigate to the Admin Console and find "Additional Google Services" in the left menu.

[Screenshot: the Google Workspace Admin Console]

Then scroll down to "Early Access Apps" and click on it.

[Screenshot: the "Early Access Apps" section]

Enable it for everyone:

[Screenshot: enabling Early Access Apps for everyone]

Also enable Core Data Access Permissions; you will not be able to get an API key without it.

[Screenshot: the Core Data Access Permissions setting]

Once this is complete, you should be able to follow the steps in the Individual Account section to get your API key.

Integrating Gemini Pro in your application

Once you have obtained an API key and saved it in a safe place, you can start integrating the Gemini Pro API in your application.

There are multiple options depending on your stack.

Raw HTTP requests

The most versatile way to query the Gemini Pro API is via direct HTTP requests:

curl \
  -H 'Content-Type: application/json' \
  -d '{"contents":[{"parts":[{"text":"Say this is a test."}]}]}' \
  -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=YOUR_API_KEY'
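
The same request can be issued from any HTTP client. Here is a minimal sketch in Python using the requests library, assuming the key is stored in a GEMINI_API_KEY environment variable (the variable name is our choice, not Google's):

import os

import requests

# Build the REST endpoint URL; the API key is passed as a query parameter.
API_KEY = os.environ["GEMINI_API_KEY"]
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/gemini-pro:generateContent?key={API_KEY}"
)

payload = {"contents": [{"parts": [{"text": "Say this is a test."}]}]}

response = requests.post(URL, json=payload, timeout=30)
response.raise_for_status()

# The generated text lives under candidates -> content -> parts.
print(response.json()["candidates"][0]["content"]["parts"][0]["text"])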

Python SDK

If your app is in Python, Google offers a handy Python SDK to interact with the API. The documentation can be found on ai.google.dev.

Install the SDK:

$ pip install -q -U google-generativeai

And use it as such:

import os

import google.generativeai as genai

# Authenticate with the API key created in Google AI Studio.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Instantiate the Gemini Pro model.
model = genai.GenerativeModel('gemini-pro')

response = model.generate_content("Is Gemini Pro really that good?")
print(response.text)
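
The SDK also supports multi-turn conversations through a chat session object that accumulates the turn history for you. A short sketch:

# Start a chat session; the SDK tracks the conversation history.
chat = model.start_chat(history=[])

first = chat.send_message("Write the first line of a story about a magic backpack.")
print(first.text)

# Follow-up messages automatically include the prior turns.
second = chat.send_message("Can you set it in a quiet village in 1600s France?")
print(second.text)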

Typescript SDK

If your poison of choice is TypeScript, Google's got you covered too. Find the documentation on ai.google.dev.

Install the library:

$ npm install @google/generative-ai

Then use it as such:

const { GoogleGenerativeAI } = require("@google/generative-ai");

// Authenticate with the API key created in Google AI Studio.
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);

async function run() {
  const model = genAI.getGenerativeModel({ model: "gemini-pro" });

  const prompt = "Write a story about a magic backpack.";

  const result = await model.generateContent(prompt);
  console.log(result.response.text());
}

run();

Other languages

If you're using other languages such as Go or Swift, you can find all of Google's SDKs on ai.google.dev.

API differences with OpenAI

Note that the query and response payloads for OpenAI and Gemini Pro's APIs are different.

Query payloads

OpenAI's query payload looks like this:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [
       {
         "role": "user",
         "content": "Write the first line of a story about a magic backpack."
       },
       {
         "role": "assistant",
         "content": "In the bustling city of Meadow brook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."
       },
       {
         "role": "user",
         "content": "Can you set it in a quiet village in 1600s France?"
       }
     ],
     "temperature": 0.7
   }'

while Gemini Pro's looks like this:

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {
          "role":"user",
          "parts":[{
            "text": "Write the first line of a story about a magic backpack."
          }]
        },
        {
          "role": "model",
          "parts":[{
            "text": "In the bustling city of Meadow brook, lived a young girl named Sophie. She was a bright and curious soul with an imaginative mind."
          }]
        },
        {
          "role": "user",
          "parts":[{
            "text": "Can you set it in a quiet village in 1600s France?"
          }]
        }
      ]
    }'

Notice these three main differences:

  • OpenAI expects the model name as part of the payload whereas Gemini Pro has different endpoints for different models
  • OpenAI refers to messages as messages whereas Gemini calls them contents
  • OpenAI roles refer to the model as assistant while Gemini uses model
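
If you are migrating from one API to the other, translating between the two message formats is mechanical. Here is a hypothetical helper sketched in Python; the function name and structure are our own, not part of either SDK:

def openai_messages_to_gemini_contents(messages):
    """Convert an OpenAI-style message list into Gemini's contents format."""
    # Gemini calls the assistant role "model"; "user" stays the same.
    role_map = {"user": "user", "assistant": "model"}
    return [
        {"role": role_map[m["role"]], "parts": [{"text": m["content"]}]}
        for m in messages
        # Note: OpenAI "system" messages have no direct equivalent here
        # and would need separate handling.
        if m["role"] in role_map
    ]

contents = openai_messages_to_gemini_contents([
    {"role": "user", "content": "Write the first line of a story about a magic backpack."},
])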

Response payloads

Response payloads are also quite different between OpenAI and Gemini's APIs.

OpenAI's response payload reads:

{
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1677858242,
    "model": "gpt-3.5-turbo-1106",
    "usage": {
        "prompt_tokens": 13,
        "completion_tokens": 7,
        "total_tokens": 20
    },
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "In the quaint village of Fleur-de-Lys ..."
            },
            "logprobs": null,
            "finish_reason": "stop",
            "index": 0
        }
    ]
}

whereas Gemini's reads:

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "In the quaint village of Fleur-de-Lys ..."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "promptFeedback": {
    "safetyRatings": [
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "probability": "NEGLIGIBLE"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "probability": "NEGLIGIBLE"
      }
    ]
  }
}
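
In practice, this means the generated text lives at a different path in each payload. In Python, assuming each JSON response above has been parsed into a dict (the variable names are illustrative):

# OpenAI: choices -> message -> content
openai_text = openai_response["choices"][0]["message"]["content"]

# Gemini: candidates -> content -> parts -> text
gemini_text = gemini_response["candidates"][0]["content"]["parts"][0]["text"]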

Note that Gemini returns a set of safety ratings with every response; you can also configure blocking thresholds for these categories at query time.
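
With the Python SDK, for example, thresholds can be passed per category through the safety_settings argument. A minimal sketch, with an illustrative category and threshold:

from google.generativeai.types import HarmBlockThreshold, HarmCategory

response = model.generate_content(
    "Is Gemini Pro really that good?",
    safety_settings={
        # Only block harassment content flagged with high probability.
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
)
print(response.text)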
