idalib-analysis by williballenthin
Updated Jan 15, 2026, 04:02 PM
---
name: idalib-analysis
description: Analyze binaries using IDA Pro's Python API (idalib) in headless mode. Use when examining program structure, functions, disassembly, cross-references, or strings without the GUI.
---
# IDA Pro Headless Analysis with idalib
Use this skill to analyze binary files with IDA Pro's Python API in headless mode.
## Setup
First, ensure IDA Pro is installed by running:
```bash
$CLAUDE_PROJECT_DIR/.claude/skills/idalib-analysis/scripts/install-ida.sh
```
Wait for the script to complete before proceeding. This may take a few minutes on first run.
## Use the IDA Domain API
**Always prefer the IDA Domain API** over the legacy low-level IDA Python SDK. The Domain API provides a clean, Pythonic interface that is easier to use and understand.
### Documentation Resources
| Resource | URL |
|----------|-----|
| **LLM-optimized overview** | https://ida-domain.docs.hex-rays.com/llms.txt |
| **Getting Started** | https://ida-domain.docs.hex-rays.com/getting_started/index.md |
| **Examples** | https://ida-domain.docs.hex-rays.com/examples/index.md |
| **API Reference** | https://ida-domain.docs.hex-rays.com/ref/{module}/index.md |
Available API modules: `bytes`, `comments`, `database`, `entries`, `flowchart`, `functions`, `heads`, `hooks`, `instructions`, `names`, `operands`, `segments`, `signature_files`, `strings`, `types`, `xrefs`
**To fetch specific API documentation**, use URLs like:
- `https://ida-domain.docs.hex-rays.com/ref/functions/index.md` - Function analysis API
- `https://ida-domain.docs.hex-rays.com/ref/xrefs/index.md` - Cross-reference API
- `https://ida-domain.docs.hex-rays.com/ref/strings/index.md` - String analysis API
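The reference URLs above follow a single pattern, so the lookup can be scripted. This is a minimal sketch, assuming the `{module}` placeholder is simply replaced by one of the module names listed above; `api_reference_url` is a hypothetical helper name and is not part of the Domain API.

```python
# Hypothetical helper (not part of ida-domain): builds a reference URL
# from the documented pattern https://.../ref/{module}/index.md
BASE_URL = "https://ida-domain.docs.hex-rays.com"

# Module names copied from the "Available API modules" list above.
API_MODULES = {
    "bytes", "comments", "database", "entries", "flowchart", "functions",
    "heads", "hooks", "instructions", "names", "operands", "segments",
    "signature_files", "strings", "types", "xrefs",
}


def api_reference_url(module: str) -> str:
    """Return the API reference URL for a documented Domain API module."""
    if module not in API_MODULES:
        raise ValueError(f"unknown Domain API module: {module!r}")
    return f"{BASE_URL}/ref/{module}/index.md"


if __name__ == "__main__":
    print(api_reference_url("xrefs"))
```

Rejecting unknown names up front avoids fetching a URL that was never documented.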
### Opening a Database
```python
from ida_domain import Database
from ida_domain.database import IdaCommandOptions
# Open with auto-analysis enabled and save database for faster subsequent runs
ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("path/to/binary", ida_options, save_on_close=True) as db:
    # Your analysis here
    pass
# Database is automatically closed and saved
```
### Key Database Properties
```python
with Database.open(path, ida_options) as db:
    db.minimum_ea      # Start address
    db.maximum_ea      # End address
    db.metadata        # Database metadata
    db.architecture    # Target architecture
    db.functions       # All functions (iterable)
    db.strings         # All strings (iterable)
    db.segments        # Memory segments
    db.names           # Symbols and labels
    db.entries         # Entry points
    db.types           # Type definitions
    db.comments        # All comments
    db.xrefs           # Cross-reference utilities
    db.bytes           # Byte manipulation
    db.instructions    # Instruction access
```
### Common Analysis Tasks
**List functions:**
```python
for func in db.functions:
    name = db.functions.get_name(func)
    # Compute the size from the address range (end_ea is exclusive)
    size = func.end_ea - func.start_ea
    print(f"{hex(func.start_ea)}: {name} ({size} bytes)")
```
**Get function disassembly and pseudocode:**
```python
func = next(f for f in db.functions if db.functions.get_name(f) == "main")
for line in db.functions.get_disassembly(func):
    print(line)
# Pseudocode requires Hex-Rays decompiler license - handle gracefully
try:
    for line in db.functions.get_pseudocode(func):
        print(line)
except RuntimeError as e:
    print(f"Decompilation unavailable: {e}")
```
**Find strings:**
```python
for s in db.strings:
    print(f"{hex(s.address)}: {s}")
```
**Cross-references:**
```python
# References TO an address
for xref in db.xrefs.to_ea(target_addr):
    print(f"Referenced from {hex(xref.from_ea)} (type: {xref.type.name})")
# References FROM an address
for xref in db.xrefs.from_ea(source_addr):
    print(f"References {hex(xref.to_ea)}")
# Specific xref types
for xref in db.xrefs.calls_to_ea(func_addr):
    print(f"Called from {hex(xref.from_ea)}")
```
**Read bytes:**
```python
byte_val = db.bytes.get_byte_at(addr)
dword_val = db.bytes.get_dword_at(addr)
disasm = db.bytes.get_disassembly_at(addr)
```
## Analysis Methodology
**Write and execute small, focused scripts** rather than reading large amounts of data from the binary. This approach is more efficient and produces better results:
1. **Form a hypothesis** about what you're looking for
2. **Design a script** to gather the minimum data needed to test the hypothesis
3. **Execute the script** and analyze the results
4. **Iterate** based on findings
### Example: Investigating a suspicious function
Instead of dumping all disassembly, write targeted scripts:
```python
# Script 1: Find functions that reference interesting strings
from ida_domain import Database
from ida_domain.database import IdaCommandOptions

ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("sample.exe", ida_options, save_on_close=True) as db:
    for s in db.strings:
        if "password" in str(s).lower():
            print(f"\nString at {hex(s.address)}: {s}")
            for xref in db.xrefs.to_ea(s.address):
                print(f"  Referenced from {hex(xref.from_ea)}")
```
```python
# Script 2: Analyze a specific function found in Script 1
from ida_domain import Database
from ida_domain.database import IdaCommandOptions

ida_options = IdaCommandOptions(auto_analysis=True, new_database=False)
with Database.open("sample.exe", ida_options, save_on_close=True) as db:
    target_addr = 0x401234  # Address from previous script
    for func in db.functions:
        # Select the function whose [start_ea, end_ea) range contains the address
        if func.start_ea <= target_addr < func.end_ea:
            print(f"Function: {db.functions.get_name(func)}")
            print(f"Signature: {db.functions.get_signature(func)}")
            # Try pseudocode first (requires Hex-Rays license)
            try:
                print("\nPseudocode:")
                for line in db.functions.get_pseudocode(func):
                    print(f"  {line}")
            except RuntimeError:
                # Fall back to disassembly if decompiler unavailable
                print("\nDisassembly (decompiler unavailable):")
                for line in db.functions.get_disassembly(func):
                    print(f"  {line}")
            break
```
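The containment test used in Script 2 can be factored into a small reusable helper. The sketch below assumes only that each item exposes `start_ea` and `end_ea` attributes, as the function objects in the scripts above do; `find_containing_function` is a hypothetical name, and the demonstration uses stand-in objects so it runs without IDA.

```python
from types import SimpleNamespace


def find_containing_function(functions, addr):
    """Return the first item whose [start_ea, end_ea) range contains addr, else None."""
    for func in functions:
        if func.start_ea <= addr < func.end_ea:
            return func
    return None


if __name__ == "__main__":
    # Stand-in objects for demonstration only; with a real database,
    # pass db.functions instead.
    fake_functions = [
        SimpleNamespace(start_ea=0x401000, end_ea=0x401200),
        SimpleNamespace(start_ea=0x401200, end_ea=0x401300),
    ]
    match = find_containing_function(fake_functions, 0x401234)
    print(hex(match.start_ea) if match else "no containing function")
```

Returning `None` when nothing matches lets the caller distinguish "address is not inside any function" from a successful lookup.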
## Performance Tips
1. **Enable auto_analysis=True** on first open to let IDA analyze the binary
2. **Use save_on_close=True** to persist the analysis database (.idb/.i64)
3. **Subsequent opens are faster** because analysis results are cached in the .idb
4. **Write focused scripts** that gather specific data rather than iterating over everything
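To see whether a later open can reuse saved analysis, a script can look for a database file next to the binary. This is a rough sketch: the exact file naming is an assumption (it checks both `sample.exe.i64` and `sample.i64` style names), and `find_cached_database` is a hypothetical helper, not part of the Domain API.

```python
from pathlib import Path
from typing import Optional


def find_cached_database(binary_path: str) -> Optional[Path]:
    """Return an existing .i64/.idb next to the binary, or None if there is none."""
    binary = Path(binary_path)
    candidates = []
    for ext in (".i64", ".idb"):
        # Assumed naming schemes: extension appended, or extension replaced.
        candidates.append(binary.with_name(binary.name + ext))  # sample.exe.i64
        candidates.append(binary.with_suffix(ext))              # sample.i64
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    cached = find_cached_database("sample.exe")
    print(f"cached database: {cached}" if cached else "no cached database yet")
```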
## Troubleshooting
- Check `/tmp/claude-idalib.log` for installation and setup issues
- Database files (.idb/.i64) are created alongside the binary
- If imports fail, verify IDA Pro is installed and IDADIR is set
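The import check in the last bullet can be scripted without importing anything from IDA. The sketch below assumes the package is importable under the name `ida_domain` (as in the examples above) and that `IDADIR` should point at an existing directory; `check_idalib_environment` is a hypothetical helper name.

```python
import importlib.util
import os


def check_idalib_environment(env=None):
    """Report whether IDADIR is set and whether ida_domain can be located."""
    env = os.environ if env is None else env
    idadir = env.get("IDADIR")
    return {
        "idadir_set": bool(idadir),
        "idadir_exists": bool(idadir) and os.path.isdir(idadir),
        # find_spec locates the package without executing its import
        "ida_domain_found": importlib.util.find_spec("ida_domain") is not None,
    }


if __name__ == "__main__":
    for check, ok in check_idalib_environment().items():
        print(f"{check}: {'ok' if ok else 'MISSING'}")
```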
### Decompilation Not Working
**Pseudocode/decompilation requires a Hex-Rays decompiler license**, which is separate from the IDA Pro base license. If `get_pseudocode()` or `get_microcode()` fails with `RuntimeError`, check the license status:
```python
import ida_hexrays
import idautils

# Check if decompiler is available (call this while a database is open)
def is_decompiler_available():
    """Check if Hex-Rays decompiler is licensed and available."""
    if not ida_hexrays.init_hexrays_plugin():
        return False
    # Probe the first known function instead of scanning every address
    for func_ea in idautils.Functions():
        hf = ida_hexrays.hexrays_failure_t()
        cfunc = ida_hexrays.decompile(func_ea, hf)
        if cfunc:
            return True
        # MERR_LICENSE (-23) means no decompiler license
        if hf.code == ida_hexrays.MERR_LICENSE:
            return False
        break  # any other failure on the probe is treated as unavailable
    return False
```
**Error codes reference:**
- `MERR_LICENSE (-23)`: No valid Hex-Rays decompiler license
- `MERR_ONLY32 (-24)`: Only 32-bit functions can be decompiled for the current database
- `MERR_ONLY64 (-25)`: Only 64-bit functions can be decompiled for the current database
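When logging a failed decompilation, it can help to translate the numeric code into its symbolic name. A minimal sketch, using only the three codes listed above; `describe_merr` is a hypothetical helper and the table is deliberately incomplete.

```python
# Codes and names copied from the error code list above; other codes exist.
MERR_NAMES = {
    -23: "MERR_LICENSE",
    -24: "MERR_ONLY32",
    -25: "MERR_ONLY64",
}


def describe_merr(code: int) -> str:
    """Return the symbolic name for a listed error code, or a generic label."""
    name = MERR_NAMES.get(code)
    return f"{name} ({code})" if name else f"unrecognized error code ({code})"


if __name__ == "__main__":
    print(describe_merr(-23))
```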
**Workaround when decompilation is unavailable:** Use disassembly analysis instead. The `get_disassembly()` method does not depend on the decompiler license and still provides assembly-level insight.
## Exploring the API at Runtime
When the documentation doesn't answer your question, explore the API directly:
```python
import inspect
from ida_domain import Database
from ida_domain.functions import Functions
# List all public methods on a class
for name, method in inspect.getmembers(Functions, predicate=inspect.isfunction):
    if not name.startswith('_'):
        print(f"{name}: {inspect.signature(method)}")
# Get docstring for a specific method
print(Functions.get_pseudocode.__doc__)
# Within a database context, explore available attributes
with Database.open(path, ida_options) as db:
    # List all database properties
    print([attr for attr in dir(db) if not attr.startswith('_')])
```
## Legacy API (Avoid)
The legacy `idc`, `idautils`, and `ida_funcs` APIs still work but are harder to use. **Prefer the Domain API** for new analysis scripts, and use the legacy APIs only when the Domain API doesn't expose the functionality you need.