---
name: perplexity-known-pitfalls
description: 'Identify and avoid Perplexity anti-patterns and common integration mistakes.
Use when reviewing Perplexity code, onboarding new developers,
or auditing existing integrations for best practices violations.
Trigger with phrases like "perplexity mistakes", "perplexity anti-patterns",
"perplexity pitfalls", "perplexity code review", "perplexity gotchas".
'
allowed-tools: Read, Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
tags:
- saas
- perplexity
- audit
compatibility: Designed for Claude Code, also compatible with Codex and OpenClaw
---
# Perplexity Known Pitfalls
## Overview
Real gotchas when integrating the Perplexity Sonar API. Perplexity exposes an OpenAI-compatible chat endpoint but performs a live web search for each request, which makes it fundamentally different from a standard LLM completion. Most of these pitfalls come from treating it like a regular chatbot.
## Prerequisites
- Perplexity API key configured
- Understanding of OpenAI-compatible chat API format
## Pitfalls
### 1. Using It as a Generic Chatbot
Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.
```python
# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline
# GOOD: leverage web search capability
response = call_perplexity(
    "What are the latest Next.js 15 features released this month?",
    search_recency_filter="month"
)
```
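`call_perplexity` is used in these examples but is not defined in this document. A minimal sketch of what such a helper could look like, using only the standard library, is shown here. The names `build_payload` and `call_perplexity`, the default model, and the default `max_tokens` are assumptions for illustration; the helper returns the decoded JSON as a plain dict.

```python
import json
import os
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"


def build_payload(prompt, model="sonar", max_tokens=512, **search_options):
    """Build the JSON body for one chat-completions request (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    # Search options such as search_recency_filter sit at the top level of the body.
    payload.update(search_options)
    return payload


def call_perplexity(prompt, **options):
    """Send one request and return the decoded JSON response as a dict."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, **options)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping payload construction separate from the network call makes the request body easy to inspect in tests without spending a search query.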
### 2. Ignoring Citations
Perplexity returns `[1]`, `[2]` markers in the answer text, with the source URLs in a separate top-level `citations` array. Ignoring them throws away the main value of a search-backed answer.
```python
data = response.model_dump() # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", []) # NOT in choices — top-level field
# BAD: displaying raw markers
print(answer) # "According to [1], Node.js 22 adds..."
# GOOD: replace markers with links
for i, url in enumerate(citations, 1):
    # Marker [i] refers to citations[i - 1]; emit a Markdown link to that URL
    answer = answer.replace(f"[{i}]", f"[[{i}]]({url})")
```
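A slightly more defensive variant of the marker replacement can be sketched with a regular expression. It assumes markers are 1-based indexes into the `citations` array and leaves any marker without a matching URL untouched. The function name `link_citations` is hypothetical.

```python
import re


def link_citations(answer, citations):
    """Turn [n] markers into Markdown links; leave unknown markers untouched."""

    def replace(match):
        index = int(match.group(1))
        if 1 <= index <= len(citations):
            return f"[[{index}]]({citations[index - 1]})"
        return match.group(0)  # no matching URL, keep the marker as-is

    return re.sub(r"\[(\d+)\]", replace, answer)
```

A single `re.sub` pass also avoids rewriting text that an earlier replacement already produced.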
### 3. Using Wrong SDK Import
There is no `@perplexity/sdk` or `perplexity` Python package. Use the standard OpenAI client.
```typescript
// BAD — this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";
// GOOD — use OpenAI client with Perplexity base URL
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});
```
### 4. Not Setting max_tokens
Without `max_tokens`, a response can run up to the model's output limit, which makes per-request cost unpredictable.
```typescript
// BAD: no token limit — output cost can spike
await client.chat.completions.create({
  model: "sonar-pro", // $15/M output tokens!
  messages: [{ role: "user", content: "Tell me about AI" }],
});
// GOOD: always set max_tokens
await client.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "Tell me about AI" }],
  max_tokens: 1024,
});
```
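To see what a cap buys, the worst-case output cost of one request can be estimated from `max_tokens` and the per-token rate. This sketch uses the $15 per million output tokens figure quoted above for `sonar-pro`. It covers output tokens only and ignores input tokens and any other per-request charges; the function name is hypothetical.

```python
def max_output_cost_usd(max_tokens, usd_per_million_output_tokens):
    """Upper bound on the output-token cost of a single request."""
    return max_tokens * usd_per_million_output_tokens / 1_000_000


# sonar-pro at the $15/M output rate quoted above, capped at 1024 tokens
worst_case = max_output_cost_usd(1024, 15.0)
print(f"${worst_case:.5f}")  # prints $0.01536
```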
### 5. No Recency Filter for Time-Sensitive Queries
Without `search_recency_filter`, Perplexity may cite outdated articles.
```python
# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")
# GOOD: constrain to recent results
response = call_perplexity(
    "current Bitcoin price",
    search_recency_filter="day"  # hour | day | week | month
)
```
### 6. Sending Full Conversation History
Each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.
```python
# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]
# GOOD: summarize context, send focused query
messages = [
    {"role": "system", "content": "Answer based on web search."},
    {"role": "user", "content": f"Context: {summary}\nQuestion: {question}"}
]
```
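When a summary is not available, a simpler option is to trim the history before sending it. The sketch below keeps system messages plus the most recent dialogue messages. It assumes messages are dicts with a `role` key and that the trimmed dialogue should start with a user turn; `trim_history` and its default limit are hypothetical.

```python
def trim_history(messages, max_messages=5):
    """Keep system messages plus at most `max_messages` recent dialogue messages."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    recent = dialogue[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant messages so the trimmed dialogue starts with a user turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent
```

For example, `trim_history(long_history + [{"role": "user", "content": "summarize"}])` sends only the tail of the conversation rather than all 20 turns.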
### 7. Using sonar-pro for Simple Queries
`sonar-pro` costs roughly 3x more than `sonar` for input tokens and 15x more for output tokens. Using it for simple factual lookups wastes budget.
```typescript
// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
  model: "sonar-pro", // $3 input + $15 output per M tokens
  messages: [{ role: "user", content: "What is the capital of France?" }],
});
// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
```
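The `isComplexQuery` call above is not defined in this document. One possible heuristic, sketched in Python, routes long, multi-part, or analysis-style questions to `sonar-pro`. The keyword list and word threshold are assumptions and would need tuning against real traffic.

```python
def is_complex_query(query, word_threshold=25):
    """Rough heuristic: long or multi-part questions go to the larger model."""
    text = query.lower()
    analysis_words = ("compare", "analyze", "trade-off", "step by step", "pros and cons")
    return (
        len(text.split()) > word_threshold
        or text.count("?") > 1
        or any(word in text for word in analysis_words)
    )


def pick_model(query):
    """Choose the cheaper model unless the query looks complex."""
    return "sonar-pro" if is_complex_query(query) else "sonar"
```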
### 8. Mixing Allowlist and Denylist in Domain Filter
`search_domain_filter` supports either allowlist (include) or denylist (exclude with `-` prefix), but not both in the same request.
```typescript
// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"] // ERROR
// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"] // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"] // Denylist
```
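A small guard can catch a mixed filter before the request is sent. This is a sketch that treats a leading `-` as the denylist marker, as described above; `validate_domain_filter` is a hypothetical helper, not part of any SDK.

```python
def validate_domain_filter(domains):
    """Raise ValueError if allowlist and denylist entries are mixed."""
    denied = [d for d in domains if d.startswith("-")]
    allowed = [d for d in domains if not d.startswith("-")]
    if denied and allowed:
        raise ValueError(
            f"search_domain_filter mixes allowlist {allowed} and denylist {denied}"
        )
    return domains
```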
### 9. Not Caching Search Results
Every uncached call performs a web search. At scale, duplicate queries burn budget.
```typescript
// Assumes Express and the `lru-cache` package; `hash` is a placeholder for any stable string hash.
// BAD: same query hits API every time
app.get("/search", async (req, res) => {
  const result = await client.chat.completions.create({ ... });
  res.json(result);
});
// GOOD: cache by query hash
const cache = new LRUCache({ max: 1000, ttl: 3600_000 }); // up to 1000 entries, 1-hour TTL
app.get("/search", async (req, res) => {
  const key = hash(req.query.q);
  if (cache.has(key)) return res.json(cache.get(key));
  const result = await client.chat.completions.create({ ... });
  cache.set(key, result);
  res.json(result);
});
```
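A cache key built from the query text alone can return the wrong answer when the model or search options differ between calls. A sketch of a key that covers all of them is shown here, assuming that case and whitespace differences in the query should map to the same entry; `cache_key` is a hypothetical helper.

```python
import hashlib
import json


def cache_key(query, model="sonar", **search_options):
    """Stable key covering everything that can change the answer."""
    normalized = {
        "q": " ".join(query.lower().split()),  # collapse case and whitespace
        "model": model,
        "options": search_options,
    }
    blob = json.dumps(normalized, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Time-sensitive queries likely need a much shorter TTL than the one-hour value used above, whatever key is chosen.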
### 10. Wrong Base URL
The API is at `api.perplexity.ai`, not `api.perplexity.com`.
```typescript
// BAD
baseURL: "https://api.perplexity.com" // Wrong domain
// GOOD
baseURL: "https://api.perplexity.ai" // Correct
```
## Code Review Checklist
- [ ] Uses `openai` package, not fake `@perplexity/sdk`
- [ ] Base URL is `https://api.perplexity.ai`
- [ ] `max_tokens` set on every request
- [ ] Citations parsed from `response.citations` array
- [ ] `search_recency_filter` used for time-sensitive queries
- [ ] Caching implemented for repeated queries
- [ ] Model routing: sonar for simple, sonar-pro for complex
- [ ] Conversation history trimmed before sending
- [ ] PII sanitized from queries
- [ ] Domain filter uses only allowlist OR denylist, not both
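For the PII item in the checklist, a rough first pass can redact obvious identifiers before a query leaves the application. The patterns below are deliberately simple assumptions that cover email addresses and phone-like numbers only; they are not a complete PII detector, and `redact_pii` is a hypothetical helper.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(query):
    """Replace email addresses and phone-like numbers with placeholders."""
    query = EMAIL.sub("[EMAIL]", query)
    return PHONE.sub("[PHONE]", query)
```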
## Error Handling
| Pitfall | Impact | Detection |
|---------|--------|-----------|
| No caching | 3-5x cost overrun | Check cache hit rate metric |
| Wrong model | Budget waste | Grep for `sonar-pro` in simple query paths |
| No max_tokens | Unpredictable costs | Grep for `create()` calls without `max_tokens` |
| PII in queries | Privacy violation | Run sanitization check in CI |
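The `max_tokens` detection in the table can be approximated with a small script instead of a manual grep. This sketch scans source text for `chat.completions.create(` and reports calls with no `max_tokens` before the next call or within a fixed character window. The window size is an assumption, and the check is textual, so it can miss options passed through variables.

```python
import re

CREATE_CALL = re.compile(r"chat\.completions\.create\(")


def calls_missing_max_tokens(source, window=400):
    """Return line numbers of create( calls with no max_tokens nearby."""
    missing = []
    for match in CREATE_CALL.finditer(source):
        snippet = source[match.end(): match.end() + window]
        # Only look as far as the next create( call, if there is one.
        next_call = CREATE_CALL.search(snippet)
        if next_call:
            snippet = snippet[: next_call.start()]
        if "max_tokens" not in snippet:
            missing.append(source.count("\n", 0, match.start()) + 1)
    return missing
```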
## Output
- Identified anti-patterns in existing code
- Applied fixes for each pitfall
- Code review checklist for ongoing quality
## Resources
- [Perplexity API Documentation](https://docs.perplexity.ai)
- [Perplexity Model Guide](https://docs.perplexity.ai/getting-started/models)
- [OpenAI Compatibility](https://docs.perplexity.ai/guides/chat-completions-guide)