Updated Apr 3, 2026, 03:47 AM
---
name: perplexity-known-pitfalls
description: |
Identify and avoid Perplexity anti-patterns and common integration mistakes.
Use when reviewing Perplexity code, onboarding new developers,
or auditing existing integrations for best practices violations.
Trigger with phrases like "perplexity mistakes", "perplexity anti-patterns",
"perplexity pitfalls", "perplexity code review", "perplexity gotchas".
allowed-tools: Read, Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
compatible-with: claude-code, codex, openclaw
tags: [saas, perplexity, audit]
---
# Perplexity Known Pitfalls
## Overview
Common gotchas when integrating the Perplexity Sonar API. Perplexity exposes an OpenAI-compatible chat endpoint but runs a live web search behind each request, which makes it a different paradigm from standard LLM completions. Most of these pitfalls come from treating it like a regular chatbot.
## Prerequisites
- Perplexity API key configured
- Understanding of OpenAI-compatible chat API format
## Pitfalls
### 1. Using It as a Generic Chatbot
Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.
```python
# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline

# GOOD: leverage web search capability
response = call_perplexity(
    "What are the latest Next.js 15 features released this month?",
    search_recency_filter="month",
)
```
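`call_perplexity` in these examples is a placeholder, not a library function. A minimal sketch of what such a wrapper could look like, assuming the chat completions endpoint lives at `https://api.perplexity.ai/chat/completions`, the key is in a `PERPLEXITY_API_KEY` environment variable, and only the standard library is available. `build_payload` is a hypothetical helper name introduced for illustration.

```python
import json
import os
import urllib.request

# Assumed endpoint path on the documented base URL.
API_URL = "https://api.perplexity.ai/chat/completions"


def build_payload(prompt, model="sonar", max_tokens=1024, search_recency_filter=None):
    """Build the JSON body for one chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    # Only send the filter when the caller asked for one.
    if search_recency_filter is not None:
        payload["search_recency_filter"] = search_recency_filter
    return payload


def call_perplexity(prompt, **options):
    """POST the request and return the parsed JSON response as a dict."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, **options)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Keeping payload construction separate from the network call makes the defaults (cheap model, capped `max_tokens`) easy to unit test without spending a search query.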
### 2. Ignoring Citations
Perplexity returns `[1]`, `[2]` markers in text with a separate `citations` array. Ignoring them loses the key value prop.
```python
data = response.model_dump()  # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", [])  # top-level field, NOT inside choices

# BAD: displaying raw markers
print(answer)  # "According to [1], Node.js 22 adds..."

# GOOD: replace markers with links
for i, url in enumerate(citations, 1):
    answer = answer.replace(f"[{i}]", f"[{i}]({url})")
```
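A slightly more defensive variant of the same idea, sketched as a reusable function. It assumes markers have the form `[n]` and that `citations` is 1-indexed relative to them, as described above; `link_citations` is a hypothetical name. Markers with no matching citation are left untouched rather than turned into broken links.

```python
import re


def link_citations(answer: str, citations: list[str]) -> str:
    """Turn [n] markers into Markdown links using the 1-based citations list."""

    def replace(match: re.Match) -> str:
        index = int(match.group(1))
        # Leave the marker alone when there is no matching citation.
        if 1 <= index <= len(citations):
            return f"[{index}]({citations[index - 1]})"
        return match.group(0)

    # (?!\() skips markers that are already followed by a link target.
    return re.sub(r"\[(\d+)\](?!\()", replace, answer)


linked = link_citations(
    "According to [1], Node.js 22 adds a feature [2]. See also [5].",
    ["https://example.com/a", "https://example.com/b"],
)
print(linked)
```

Because the substitution is a single regex pass, running it twice on the same text does not double-wrap links.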
### 3. Using Wrong SDK Import
Do not guess at a package name: `@perplexity/sdk` is not a real package. The API is OpenAI-compatible, so the standard OpenAI client works once it points at the Perplexity base URL.
```typescript
// BAD: this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";

// GOOD: use the OpenAI client with the Perplexity base URL
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});
```
### 4. Not Setting max_tokens
Without `max_tokens`, a response can run up to the model's output limit, so the output cost of any single request is unpredictable.
```typescript
// BAD: no token limit, so output cost can spike
await client.chat.completions.create({
  model: "sonar-pro", // $15/M output tokens!
  messages: [{ role: "user", content: "Tell me about AI" }],
});

// GOOD: always set max_tokens
await client.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "Tell me about AI" }],
  max_tokens: 1024,
});
```
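To see what the cap buys, the worst-case output cost of one request is `max_tokens` times the per-token output price. A small worked calculation, using the $15 per million output tokens quoted in the example above for `sonar-pro` (that price comes from this document and may change; input tokens and per-request fees are not included):

```python
def max_output_cost_usd(max_tokens: int, price_per_million_usd: float) -> float:
    """Upper bound on the output-token cost of a single request."""
    return max_tokens * price_per_million_usd / 1_000_000


# 1,024 output tokens at $15 per million tokens
print(f"${max_output_cost_usd(1024, 15.0):.5f}")  # prints $0.01536
```

With the cap in place, output cost per request is bounded at about 1.5 cents instead of depending on how long the model decides to write.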
### 5. No Recency Filter for Time-Sensitive Queries
Without `search_recency_filter`, Perplexity may cite outdated articles.
```python
# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")

# GOOD: constrain to recent results
response = call_perplexity(
    "current Bitcoin price",
    search_recency_filter="day",  # hour | day | week | month
)
```
### 6. Sending Full Conversation History
Every message you send is billed as input tokens, and each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.
```python
# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]

# GOOD: summarize context, send focused query
messages = [
    {"role": "system", "content": "Answer based on web search."},
    {"role": "user", "content": f"Context: {summary}\nQuestion: {question}"},
]
```
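When a summary is not available, a cheaper fallback is to keep only the most recent turns. A minimal sketch, assuming OpenAI-style message dicts with a `role` key and assuming the kept window should start with a user message after any system messages; `trim_history` and its `max_messages` default are hypothetical.

```python
def trim_history(messages, max_messages=4):
    """Keep system messages plus the most recent conversation messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant messages so the kept window starts with a user turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent
```

Trimming loses older context entirely, so it suits follow-up questions that depend only on the last exchange; for anything longer, the summary approach above preserves more meaning per token.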
### 7. Using sonar-pro for Simple Queries
`sonar-pro` costs 3-15x more than `sonar` per token (the low end applies to input, the high end to output). Using it for simple factual lookups wastes budget.
```typescript
// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
  model: "sonar-pro", // $3 input + $15 output per M tokens
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
```
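`isComplexQuery` is left undefined above. One possible heuristic, sketched in Python under the assumption that prompt length and a few analysis keywords are a good enough signal; the keyword list and the 40-word threshold are illustrative guesses, not tuned values.

```python
# Hypothetical keywords that suggest multi-step analysis.
COMPLEX_HINTS = ("compare", "analyze", "trade-off", "step by step", "pros and cons")


def is_complex_query(query: str, word_threshold: int = 40) -> bool:
    """Rough heuristic: long prompts or analysis keywords count as complex."""
    text = query.lower()
    return len(text.split()) > word_threshold or any(h in text for h in COMPLEX_HINTS)


def pick_model(query: str) -> str:
    """Route to the cheaper model unless the query looks complex."""
    return "sonar-pro" if is_complex_query(query) else "sonar"


print(pick_model("What is the capital of France?"))  # prints sonar
print(pick_model("Compare Postgres and MySQL replication and analyze the trade-offs"))  # prints sonar-pro
```

A keyword heuristic will misroute some queries; logging the chosen model next to user feedback is one way to tune the list over time.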
### 8. Mixing Allowlist and Denylist in Domain Filter
`search_domain_filter` supports either allowlist (include) or denylist (exclude with `-` prefix), but not both in the same request.
```typescript
// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"] // ERROR
// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"] // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"] // Denylist
```
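The rule is easy to enforce before a request is sent. A minimal validation sketch, assuming the `-` prefix convention described above; `validate_domain_filter` is a hypothetical helper, and rejecting an empty list is a choice made for this sketch rather than documented API behavior.

```python
def validate_domain_filter(domains):
    """Return 'allowlist' or 'denylist'; raise ValueError if the modes are mixed."""
    if not domains:
        raise ValueError("search_domain_filter must not be empty")
    denied = [d for d in domains if d.startswith("-")]
    # Some, but not all, entries carry the '-' prefix: the two modes are mixed.
    if denied and len(denied) != len(domains):
        raise ValueError(f"mixed allowlist and denylist entries: {domains}")
    return "denylist" if denied else "allowlist"


print(validate_domain_filter(["python.org", "docs.python.org"]))  # prints allowlist
print(validate_domain_filter(["-reddit.com", "-quora.com"]))  # prints denylist
```

Failing locally with a clear message is cheaper than discovering the problem from an API error in production.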
### 9. Not Caching Search Results
Every uncached call performs a web search. At scale, duplicate queries burn budget.
```typescript
import { createHash } from "node:crypto";
import { LRUCache } from "lru-cache";

// BAD: same query hits API every time
app.get("/search", async (req, res) => {
  const result = await client.chat.completions.create({ ... });
  res.json(result);
});

// GOOD: cache by query hash (handlers must be async to use await)
const cache = new LRUCache({ max: 1000, ttl: 3600_000 }); // 1 hour TTL
app.get("/search", async (req, res) => {
  const key = createHash("sha256").update(String(req.query.q)).digest("hex");
  const cached = cache.get(key);
  if (cached) return res.json(cached);
  const result = await client.chat.completions.create({ ... });
  cache.set(key, result);
  res.json(result);
});
```
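The same pattern in Python, as a dependency-free sketch. It assumes an in-memory, single-process cache is acceptable and normalizes case and whitespace before hashing so trivially different queries share an entry. `TTLCache` and `get_or_fetch` are hypothetical names, and eviction here drops the oldest inserted entry, which is simpler than true LRU.

```python
import hashlib
import time


class TTLCache:
    """Tiny in-memory cache keyed by a hash of the normalized query."""

    def __init__(self, ttl_seconds=3600, max_entries=1000):
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        self._store = {}  # key -> (expires_at, value)

    @staticmethod
    def key_for(query):
        # Collapse case and whitespace so near-duplicate queries share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_fetch(self, query, fetch):
        key = self.key_for(query)
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # cache hit, no API call
        value = fetch(query)
        if key not in self._store and len(self._store) >= self.max_entries:
            # Evict the oldest inserted entry (dicts keep insertion order).
            self._store.pop(next(iter(self._store)))
        self._store[key] = (time.monotonic() + self.ttl_seconds, value)
        return value
```

The `fetch` argument is whatever function actually calls the API, so the cache can be tested with a stub. Pick the TTL per query type: an hour may be fine for documentation questions and far too long for prices.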
### 10. Wrong Base URL
The API is at `api.perplexity.ai`, not `api.perplexity.com`.
```typescript
// BAD
baseURL: "https://api.perplexity.com" // Wrong domain
// GOOD
baseURL: "https://api.perplexity.ai" // Correct
```
## Code Review Checklist
- [ ] Uses `openai` package, not fake `@perplexity/sdk`
- [ ] Base URL is `https://api.perplexity.ai`
- [ ] `max_tokens` set on every request
- [ ] Citations parsed from `response.citations` array
- [ ] `search_recency_filter` used for time-sensitive queries
- [ ] Caching implemented for repeated queries
- [ ] Model routing: sonar for simple, sonar-pro for complex
- [ ] Conversation history trimmed before sending
- [ ] PII sanitized from queries
- [ ] Domain filter uses only allowlist OR denylist, not both
## Error Handling
| Pitfall | Impact | Detection |
|---------|--------|-----------|
| No caching | 3-5x cost overrun | Check cache hit rate metric |
| Wrong model | Budget waste | Grep for `sonar-pro` in simple query paths |
| No max_tokens | Unpredictable costs | Grep for `create()` calls without `max_tokens` |
| PII in queries | Privacy violation | Run sanitization check in CI |
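
Some of the detection steps above can be automated as a rough text scan. The sketch below is a heuristic, not a parser: it assumes source files are read as plain strings, matches parentheses without understanding string literals, and flags any `.create(` call, including ones unrelated to Perplexity. `audit_source` and `call_arguments` are hypothetical names.

```python
import re

# Pattern -> message; these mirror items from the code review checklist.
CHECKS = [
    (re.compile(r"api\.perplexity\.com"), "wrong base URL, expected api.perplexity.ai"),
    (re.compile(r"@perplexity/sdk"), "import of a non-existent SDK package"),
]


def call_arguments(source: str, open_paren: int) -> str:
    """Return the text between the parenthesis at open_paren and its match."""
    depth = 0
    for i in range(open_paren, len(source)):
        if source[i] == "(":
            depth += 1
        elif source[i] == ")":
            depth -= 1
            if depth == 0:
                return source[open_paren + 1 : i]
    return source[open_paren + 1 :]  # unbalanced: take the rest of the file


def audit_source(source: str) -> list[str]:
    """Return human-readable findings for one source file's contents."""
    findings = [message for pattern, message in CHECKS if pattern.search(source)]
    for match in re.finditer(r"\.create\(", source):
        if "max_tokens" not in call_arguments(source, match.end() - 1):
            line = source.count("\n", 0, match.start()) + 1
            findings.append(f"line {line}: create() call without max_tokens")
    return findings
```

Running this over a repository in CI gives a first-pass signal; anything it flags still needs a human look, since the scan cannot tell which client a `create()` call belongs to.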
## Output
- Identified anti-patterns in existing code
- Applied fixes for each pitfall
- Code review checklist for ongoing quality
## Resources
- [Perplexity API Documentation](https://docs.perplexity.ai)
- [Perplexity Model Guide](https://docs.perplexity.ai/getting-started/models)
- [OpenAI Compatibility](https://docs.perplexity.ai/guides/chat-completions-guide)