Recallr seamlessly integrates with Google Gemini by acting as a forward proxy. Configure your Gemini client to use our proxy URL and we’ll inject relevant context from user memory into each request.
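The proxy URL is simply the Recallr forward endpoint with the upstream provider's full base URL appended; a minimal sketch of the composition (the helper name is illustrative, not part of any SDK):

```python
# The forward-proxy base URL is the Recallr endpoint followed by the
# upstream provider's full base URL.
RECALLR_FORWARD_PREFIX = 'https://api.recallrai.com/api/v1/forward/'

def forward_base_url(upstream_base_url: str) -> str:
    """Compose the base_url to hand to the Gemini client (illustrative helper)."""
    return RECALLR_FORWARD_PREFIX + upstream_base_url

print(forward_base_url('https://generativelanguage.googleapis.com'))
# → https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com
```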
Quick Start
from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',
    http_options=types.HttpOptions(
        client_args={
            'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
            'headers': {
                'X-Recallr-API-Key': 'rai-...',
                'X-Recallr-Project-Id': 'your-project-id',
                'X-Recallr-User-Id': 'alice-123',
                'X-Recallr-Allow-New-User-Creation': 'true',
                'X-Recallr-Session-Timeout-Seconds': '600',
            }
        }
    )
)

# Use the client normally - memory is automatically injected
response = client.models.generate_content(
    model='gemini-2.5-pro',
    contents='My name is Alice and I love Python programming.',
    config=types.GenerateContentConfig(
        system_instruction='You are a helpful assistant.',
        thinking_config=types.ThinkingConfig(thinking_budget=0),  # Optional
    ),
)
print(response.text)
Supported APIs
Generate Content: standard, non-streaming text generation
Generate Content Stream: real-time streaming responses for interactive experiences
Required Headers
These headers must be included via the client_args configuration:
X-Recallr-API-Key
Your Recallr API key. Get it from the dashboard.
X-Recallr-Project-Id
Your Recallr Project ID. Get it from the dashboard.
X-Recallr-User-Id
Unique identifier for the user. Used to maintain separate memory graphs per user. Can also be passed as the user field in the request body for OpenAI compatibility.
Session Management
X-Recallr-Allow-New-User-Creation
Automatically create a new user if the specified User-ID doesn’t exist. Set to true to avoid errors for new users.
X-Recallr-Session-Timeout-Seconds
Inactivity period (in seconds) before creating a new session. Minimum value is 600 (10 minutes). Messages within a session are always passed directly to the LLM. Only memories from previous sessions are retrieved and injected as context.
X-Recallr-Session-Id
Continue a specific past session by providing its ID. Get session IDs from response headers.
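Session continuation can be wired up per request; a sketch that assembles the relevant headers (the request-header name X-Recallr-Session-Id mirrors the response header and is an assumption here; the helper is illustrative):

```python
def session_continuation_headers(session_id: str, user_id: str) -> dict:
    # Session IDs come from the X-Recallr-Session-Id response header of a
    # previous request; echoing the ID back continues that session.
    return {
        'X-Recallr-User-Id': user_id,
        'X-Recallr-Session-Id': session_id,  # assumed request-header name
    }

headers = session_continuation_headers('sess-abc', 'alice-123')
```

With the google-genai SDK, per-request headers like these can be supplied via config=types.GenerateContentConfig(http_options=types.HttpOptions(headers=...)).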
Recall Configuration
X-Recallr-Recall-Strategy
Controls the recall method used for retrieving memories; affects latency and accuracy. Accepted values: low_latency, balanced, deep.
low_latency
Best for voice agents and real-time applications: fastest response time, and retrieves more memories to compensate for the reduced accuracy. Use when sub-second latency is critical.
Minimum number of memories to retrieve from the knowledge graph.
Maximum number of memories to retrieve from the knowledge graph.
X-Recallr-Memories-Threshold
Similarity threshold for retrieving individual memories (0.0 to 1.0). Lower values retrieve more memories.
X-Recallr-Summaries-Threshold
Similarity threshold for retrieving session summaries (0.0 to 1.0). Lower values retrieve more summaries.
X-Recallr-Last-N-User-Messages
Include only the last N messages from past sessions when building context.
X-Recallr-Last-N-Summaries
Include only the last N session summaries when building context.
User’s timezone for temporal context (e.g., “America/New_York”). Helps with time-based memories.
X-Recallr-Include-System-Prompt
Whether to include Recallr AI’s system prompt (~3k tokens) in the context. This prompt includes instructions for how to use the injected memories. Set to false if you already have those instructions in your system prompt.
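Since every HTTP header value must be a string and the thresholds are bounded to [0.0, 1.0], it can help to assemble the optional recall headers in one place; a hedged sketch (the helper and its defaults are illustrative, the header names are the ones documented above):

```python
from typing import Optional

def recall_headers(strategy: str = 'balanced',
                   memories_threshold: Optional[float] = None,
                   summaries_threshold: Optional[float] = None,
                   last_n_user_messages: Optional[int] = None,
                   last_n_summaries: Optional[int] = None,
                   include_system_prompt: bool = True) -> dict:
    # HTTP header values must be strings, so everything is str()-ed.
    headers = {'X-Recallr-Recall-Strategy': strategy}
    for name, value in (('X-Recallr-Memories-Threshold', memories_threshold),
                        ('X-Recallr-Summaries-Threshold', summaries_threshold)):
        if value is not None:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f'{name} must be between 0.0 and 1.0')
            headers[name] = str(value)
    if last_n_user_messages is not None:
        headers['X-Recallr-Last-N-User-Messages'] = str(last_n_user_messages)
    if last_n_summaries is not None:
        headers['X-Recallr-Last-N-Summaries'] = str(last_n_summaries)
    if not include_system_prompt:
        headers['X-Recallr-Include-System-Prompt'] = 'false'
    return headers

recall_headers('low_latency', memories_threshold=0.4)
```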
Response Headers
Recallr returns these headers in the response for debugging and session tracking:
X-Recallr-Session-Id
The internal session ID used by Recallr. Use this to continue the same session in future requests.
X-Recallr-User-Id
Unique identifier for the user. Matches the X-Recallr-User-Id sent in the request.
X-Recallr-Request-Id
Unique identifier for this request. Use for debugging and tracing.
Time taken to process the request on Recallr’s side (in milliseconds).
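For logging, the debugging headers can be pulled off any response's header mapping; a minimal sketch (the helper is illustrative, and the lookup is case-insensitive because HTTP header casing is not guaranteed):

```python
def recallr_debug_info(headers: dict) -> dict:
    # Normalize keys, since proxies and clients may change header casing.
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        'session_id': lowered.get('x-recallr-session-id'),
        'user_id': lowered.get('x-recallr-user-id'),
        'request_id': lowered.get('x-recallr-request-id'),
    }

info = recallr_debug_info({'X-Recallr-Session-Id': 'sess-1',
                           'X-Recallr-Request-Id': 'req-9'})
# info['session_id'] == 'sess-1'; missing headers come back as None
```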
Examples
Generate Content - Non-Streaming
Response headers are read from the raw HTTP response, which recent google-genai releases expose as sdk_http_response.

from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',
    http_options=types.HttpOptions(
        client_args={
            'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
            'headers': {
                'X-Recallr-API-Key': 'rai-...',
                'X-Recallr-Project-Id': 'project-id',
                'X-Recallr-User-Id': 'alice-123',
                'X-Recallr-Allow-New-User-Creation': 'true',
                'X-Recallr-Session-Timeout-Seconds': '600',
            }
        }
    )
)

response = client.models.generate_content(
    model='gemini-2.5-pro',
    contents='My name is Alice and I love Python.',
    config=types.GenerateContentConfig(
        system_instruction='You are a helpful assistant.',
        # Per-request Recallr headers can be passed here:
        http_options=types.HttpOptions(
            headers={'X-Recallr-Recall-Strategy': 'low_latency'}  # Optional
        ),
    ),
)

# Access response headers for session tracking and debugging
session_id = response.sdk_http_response.headers.get('X-Recallr-Session-Id')
request_id = response.sdk_http_response.headers.get('X-Recallr-Request-Id')
print(response.text)
Generate Content - Streaming

from google import genai
from google.genai import types

client = genai.Client(
    api_key='YOUR_GEMINI_API_KEY',
    http_options=types.HttpOptions(
        client_args={
            'base_url': 'https://api.recallrai.com/api/v1/forward/https://generativelanguage.googleapis.com',
            'headers': {
                'X-Recallr-API-Key': 'rai-...',
                'X-Recallr-Project-Id': 'project-id',
                'X-Recallr-User-Id': 'alice-123',
                'X-Recallr-Allow-New-User-Creation': 'true',
                'X-Recallr-Session-Timeout-Seconds': '600',  # Optional
            }
        }
    )
)

# Stream chunks as they arrive
for chunk in client.models.generate_content_stream(
    model='gemini-2.5-pro',
    contents='Tell me a joke about programming',
):
    print(chunk.text, end='', flush=True)
How It Works
Need Help? Contact our support team for assistance with Gemini integration