Recallr integrates with Google Gemini by acting as a forward proxy. Point your Gemini client at the Recallr proxy URL, and Recallr injects relevant context from the user’s memory into each request.

Quick Start

from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',  # Your Google Gemini API key
    http_options={
        'api_version': 'v1beta',
        'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
        'headers': {
            'X-Recallr-API-Key': 'rai-...',
            'X-Recallr-Project-Id': 'your-project-id',
            'X-Recallr-User-Id': 'user-123',
            'X-Recallr-Allow-New-User-Creation': 'true',
            'X-Recallr-Session-Timeout-Seconds': '600',
            'X-Recallr-Recall-Strategy': 'low_latency',  # Optional
        }
    }
)

# Use normally - memory is automatically injected
response = client.models.generate_content(
    model='gemini-2.5-pro',
    contents='My name is Alice',
    config=types.GenerateContentConfig(
        system_instruction='You are a helpful assistant.',
    )
)

print(response.text)

Supported APIs

Required Headers

These headers must be included via the http_options configuration:
X-Recallr-API-Key
string
required
Your Recallr API key. Get it from the dashboard.
X-Recallr-Project-Id
string
required
Your Recallr Project ID. Get it from the dashboard.
X-Recallr-User-Id
string
required
Unique identifier for the user. Used to maintain separate memory graphs per user.
Must be passed in the headers configuration when initializing the Gemini client.
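As a minimal sketch of the requirement above, the helper below assembles the three required headers and fails early if any value is empty. The function name `build_required_headers` and the client-side check are illustrative assumptions, not part of the Recallr API; only the header names come from this page.

```python
# Header names are taken from this page; the helper itself is hypothetical.
REQUIRED_HEADERS = (
    "X-Recallr-API-Key",
    "X-Recallr-Project-Id",
    "X-Recallr-User-Id",
)


def build_required_headers(api_key: str, project_id: str, user_id: str) -> dict[str, str]:
    """Return the three required Recallr headers, rejecting empty values."""
    headers = {
        "X-Recallr-API-Key": api_key,
        "X-Recallr-Project-Id": project_id,
        "X-Recallr-User-Id": user_id,
    }
    # Catch blank values locally instead of waiting for the proxy to reject them.
    missing = [name for name in REQUIRED_HEADERS if not headers[name].strip()]
    if missing:
        raise ValueError(f"Missing required Recallr headers: {', '.join(missing)}")
    return headers
```

The returned dictionary can be passed as the 'headers' entry of http_options when creating the Gemini client, merged with any optional headers.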

Optional Headers

Session Management

X-Recallr-Allow-New-User-Creation
boolean
default:"false"
Automatically creates a new user if the given X-Recallr-User-Id doesn’t exist yet. Set to true to avoid errors for first-time users.
X-Recallr-Session-Timeout-Seconds
integer
default:"600"
Inactivity period (in seconds) after which a new session is created. Minimum value is 600 (10 minutes).
Messages within a session are always passed directly to the LLM. Only memories from previous sessions are retrieved and injected as context.
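The session rule above can be sketched roughly as follows. Both functions are hypothetical illustrations: `session_timeout_header` formats the header value and enforces the documented 600-second minimum on the client side, while `starts_new_session` is only a mental model of the inactivity rule. How Recallr treats a gap exactly equal to the timeout is not documented, so the strict comparison here is an assumption.

```python
MIN_SESSION_TIMEOUT_SECONDS = 600  # minimum stated on this page


def session_timeout_header(seconds: int = 600) -> dict[str, str]:
    """Format X-Recallr-Session-Timeout-Seconds as a string header value."""
    if seconds < MIN_SESSION_TIMEOUT_SECONDS:
        raise ValueError(
            f"Session timeout must be at least {MIN_SESSION_TIMEOUT_SECONDS} seconds"
        )
    return {"X-Recallr-Session-Timeout-Seconds": str(seconds)}


def starts_new_session(seconds_since_last_message: float, timeout_seconds: int = 600) -> bool:
    """Illustrative model only: assume a gap longer than the timeout starts a new session."""
    return seconds_since_last_message > timeout_seconds
```

Under this model, a message sent 30 seconds after the previous one stays in the current session, while one sent after 15 minutes with the default timeout would begin a new session whose context comes from retrieved memories.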

Recall Configuration

X-Recallr-Recall-Strategy
string
default:"balanced"
Controls the recall method used for retrieving memories. Affects latency and accuracy. Accepted values: low_latency, balanced, deep.
low_latency
Best for: Voice agents and real-time applications
  • Fastest response time
  • Retrieves more memories to compensate for reduced accuracy
  • Use when sub-second latency is critical
X-Recallr-Min-Top-K
integer
default:"10"
Minimum number of memories to retrieve from the knowledge graph.
X-Recallr-Max-Top-K
integer
default:"50"
Maximum number of memories to retrieve from the knowledge graph.
X-Recallr-Memories-Threshold
float
default:"0.7"
Similarity threshold for retrieving individual memories (0.0 to 1.0). Lower values retrieve more memories.
X-Recallr-Summaries-Threshold
float
default:"0.6"
Similarity threshold for retrieving session summaries (0.0 to 1.0). Lower values retrieve more summaries.
X-Recallr-Last-N-User-Messages
integer
Include the last N user messages from past sessions when building context.
X-Recallr-Last-N-Summaries
integer
Include the last N session summaries when building context.
X-Recallr-Timezone
string
User’s timezone for temporal context (e.g., “America/New_York”). Helps with time-based memories.
X-Recallr-Include-System-Prompt
boolean
default:"true"
Whether to include Recallr AI’s system prompt (about 3k tokens) in the context. This prompt tells the model how to use the injected memories. Set to false if your own system prompt already contains those instructions.
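A hedged sketch of how the recall options above might be assembled into header values. The function `build_recall_headers` is hypothetical, and its sanity checks (known strategy, min not above max, thresholds within 0.0 to 1.0) are client-side assumptions; this page does not say how the proxy itself validates these values. Booleans are sent as lowercase strings, following the 'true' used in the Quick Start.

```python
from typing import Optional

RECALL_STRATEGIES = ("low_latency", "balanced", "deep")


def build_recall_headers(
    strategy: str = "balanced",
    min_top_k: int = 10,
    max_top_k: int = 50,
    memories_threshold: float = 0.7,
    summaries_threshold: float = 0.6,
    timezone: Optional[str] = None,
    include_system_prompt: bool = True,
) -> dict[str, str]:
    """Build optional recall headers; defaults mirror the documented ones."""
    if strategy not in RECALL_STRATEGIES:
        raise ValueError(f"Unknown recall strategy: {strategy!r}")
    if min_top_k > max_top_k:
        raise ValueError("min_top_k must not exceed max_top_k")
    for name, value in (
        ("memories_threshold", memories_threshold),
        ("summaries_threshold", summaries_threshold),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")

    # HTTP header values are strings, so every setting is converted explicitly.
    headers = {
        "X-Recallr-Recall-Strategy": strategy,
        "X-Recallr-Min-Top-K": str(min_top_k),
        "X-Recallr-Max-Top-K": str(max_top_k),
        "X-Recallr-Memories-Threshold": str(memories_threshold),
        "X-Recallr-Summaries-Threshold": str(summaries_threshold),
        "X-Recallr-Include-System-Prompt": str(include_system_prompt).lower(),
    }
    if timezone is not None:
        headers["X-Recallr-Timezone"] = timezone
    return headers
```

For a voice agent, for example, `build_recall_headers(strategy="low_latency", timezone="America/New_York")` yields a dictionary that can be merged into the 'headers' entry of http_options.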

Response Headers

Recallr returns these headers in the response for debugging and session tracking:
X-Recallr-Session-Id
string
The internal session ID used by Recallr. Use this to continue the same session in future requests.
X-Recallr-User-Id
string
Unique identifier for the user. Matches the X-Recallr-User-Id sent in the request.
X-Recallr-Request-Id
string
Unique identifier for this request. Use for debugging and tracing.
X-Recallr-Process-Time
string
Time taken to process the request on Recallr’s side (in milliseconds).
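The sketch below shows one way to collect these response headers into a small record for logging. The names `RecallrResponseInfo` and `parse_recallr_headers` are hypothetical, and how you obtain the raw header mapping depends on your SDK version or HTTP client, which is not shown here. Because the exact format of X-Recallr-Process-Time is not documented beyond being a string in milliseconds, the parser falls back to None when the value is not a plain number.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RecallrResponseInfo:
    session_id: Optional[str]
    user_id: Optional[str]
    request_id: Optional[str]
    process_time_ms: Optional[float]


def parse_recallr_headers(headers: Mapping[str, str]) -> RecallrResponseInfo:
    """Extract Recallr debugging headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_time = lowered.get("x-recallr-process-time")
    try:
        process_time = float(raw_time) if raw_time is not None else None
    except ValueError:
        process_time = None  # unexpected format; keep parsing lenient
    return RecallrResponseInfo(
        session_id=lowered.get("x-recallr-session-id"),
        user_id=lowered.get("x-recallr-user-id"),
        request_id=lowered.get("x-recallr-request-id"),
        process_time_ms=process_time,
    )
```

Logging the request ID alongside your own trace IDs makes it easier to reference a specific call when debugging.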

Examples

Generate Content - Non-Streaming

from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',  # Your Google Gemini API key
    http_options={
        'api_version': 'v1beta',
        'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
        'headers': {
            'X-Recallr-API-Key': 'rai-...',
            'X-Recallr-Project-Id': 'project-id',
            'X-Recallr-User-Id': 'alice-123',
            'X-Recallr-Allow-New-User-Creation': 'true',
            'X-Recallr-Session-Timeout-Seconds': '600',
            'X-Recallr-Recall-Strategy': 'low_latency',  # Optional
        }
    }
)

# Store user information in memory
response = client.models.generate_content(
    model='gemini-2.5-pro',
    contents='My name is Alice and I love Python programming.',
    config=types.GenerateContentConfig(
        system_instruction='You are a helpful assistant.',
    )
)

print(response.text)

Generate Content - Streaming

from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',  # Your Google Gemini API key
    http_options={
        'api_version': 'v1beta',
        'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
        'headers': {
            'X-Recallr-API-Key': 'rai-...',
            'X-Recallr-Project-Id': 'project-id',
            'X-Recallr-User-Id': 'alice-123',
            'X-Recallr-Allow-New-User-Creation': 'true',
            'X-Recallr-Session-Timeout-Seconds': '600',
            'X-Recallr-Recall-Strategy': 'low_latency',  # Optional
        }
    }
)

# Recall stored information about the user with streaming
response = client.models.generate_content_stream(
    model='gemini-2.5-pro',
    contents='What do you know about me and my interests?',
    config=types.GenerateContentConfig(
        system_instruction='You are a helpful assistant.',
    )
)

for chunk in response:
    if chunk.text:
        print(chunk.text, end='', flush=True)

How It Works

Need Help?

Contact our support team for assistance with the Gemini integration.