url-dump by huytieu
Quick capture URLs with automatic content extraction, insights, and categorization into knowledge booklets
Content & Writing
99 Stars
9 Forks
Updated Jan 19, 2026, 04:38 AM
Why Use This
This skill provides specialized capabilities for huytieu's codebase.
Use Cases
- Developing new features in the huytieu repository
- Refactoring existing code to follow huytieu standards
- Understanding and working with huytieu's codebase structure
Install Guide
2 steps- 1
Skip this step if Ananke is already installed.
- 2
Skill Snapshot
Auto scan of skill assets. Informational only.
Valid SKILL.md
Checks against SKILL.md specification
Source & Community
Skill Stats
SKILL.md 438 Lines
Total Files 1
Total Size 0 B
License NOASSERTION
---
name: url-dump
description: Quick capture URLs with automatic content extraction, insights, and categorization into knowledge booklets
roles: [all]
integrations: [web-fetch]
---
# COG URL Dump Skill
## Purpose
Transform raw URLs into structured, insightful knowledge entries through intelligent content extraction, categorization, and integration with the user's knowledge base. Quick capture with automatic insight generation.
## When to Invoke
- User shares a URL they want to save
- User says "save this link", "bookmark this", "url dump", or "save for later"
- User pastes a URL and wants to capture it
- User wants to organize web resources into their knowledge base
## Agent Mode Awareness
**Check `agent_mode` in `00-inbox/MY-PROFILE.md` frontmatter:**
- If `agent_mode: team` — delegate content extraction, analysis, and categorization to a sub-agent while handling user interaction directly. The sub-agent fetches URL content, generates insights, and returns structured results for filing.
- If `agent_mode: solo` (default) — handle everything directly in the conversation. No delegation.
## Pre-Flight Check
**Before executing, check for user profile:**
1. Look for `00-inbox/MY-PROFILE.md` in the vault
2. If NOT found:
```
Welcome to COG! It looks like this is your first time.
Before we start, let's quickly set up your profile (takes 2 minutes).
Would you like to run onboarding first, or should I proceed with default settings?
```
3. If found:
- Read the profile to get user's interests and projects
- Use interests to help with auto-categorization
- Check for existing booklet categories in `05-knowledge/booklets/`
## Process Flow
### 1. User Interaction & Input Collection
- Accept URL(s) from the user (single URL or batch)
- Optionally accept user's quick note about why they're saving this
- Accept any format: bare URL, markdown link, or with notes
**Prompt:**
```
What URL(s) would you like to save?
(You can paste one or more URLs, optionally with a note about why you're saving it)
```
### 2. URL Validation & Fetch
- Validate URL format
- Check if URL is accessible
- Detect duplicate URLs in existing knowledge base
- Fetch the web page content
#### Content Extraction
Extract from the page:
- **Page Title:** [extracted-title]
- **Meta Description:** [if available]
- **Author:** [if detected]
- **Published Date:** [if detected]
- **Word Count:** [estimated]
- **Read Time:** [X minutes]
- **Main Content:** [extracted body text]
- **Key Headings:** [list of H1/H2s]
### 3. Category Selection
**Default Categories:**
- **Articles & Blogs:** Long-form content, tutorials, opinion pieces
- **Tools & Resources:** Software, utilities, services, APIs
- **Reference:** Documentation, specs, standards
- **Research:** Papers, studies, academic content
- **Inspiration:** Design, ideas, creative references
- **Videos & Media:** YouTube, podcasts, multimedia
- **News & Updates:** Industry news, announcements
- **Project-Specific:** Related to a specific project (offer project list from MY-PROFILE.md)
- **To Review:** Unsure, save for later categorization
**Custom Categories:**
- Check `05-knowledge/booklets/` for existing custom categories
- Offer to create new category if needed
**Auto-suggestion:** Based on content analysis, suggest the most likely category but let user confirm or change.
### 4. Content Analysis and Processing
#### Phase 1: Content Classification
Determine:
- **Content Category:** [article|tool|reference|research|video|news|etc]
- **Primary Topics:** [topic1, topic2, topic3]
- **Tone:** [informative|opinion|tutorial|news|etc]
- **Quality Assessment:** [high|medium|low]
- **Credibility Indicators:** [author credentials, citations, etc]
#### Phase 2: Insight Extraction
Generate:
- **Executive Summary:** [2-3 sentences]
- **Key Insights:**
1. [Insight 1 with context]
2. [Insight 2 with context]
3. [Insight 3 with context]
- **Notable Quotes:** [if any stand out]
- **Action Items:** [practical takeaways]
#### Phase 3: Relevance Assessment
Analyze:
- **User Interest Match:** [high|medium|low] - [which interests from profile]
- **Project Relevance:** [project-name] - [why relevant]
- **Knowledge Gap:** [yes|no] - [what gap it fills]
- **Timeliness:** [evergreen|current|dated]
- **Uniqueness:** [novel|common|duplicate-adjacent]
#### Phase 4: Cross-Reference
Identify connections to:
- **Related Bookmarks:** [existing similar saves]
- **Related Braindumps:** [if content connects]
- **Related Projects:** [if applicable]
- **Suggested Tags:** [tag1, tag2, tag3]
### 5. Generate Structured Output
Create bookmark file with this structure:
```markdown
---
type: "url-bookmark"
category: "[category-name]"
domain: "[source-domain.com]"
date_saved: "YYYY-MM-DD"
date_accessed: "YYYY-MM-DD HH:MM"
url: "[original-url]"
title: "[page-title]"
author: "[author-if-available]"
published: "[publish-date-if-available]"
tags: ["#bookmark", "#category-tag", "#topic-tags"]
relevance: "[high|medium|low]"
status: "unread"
related_projects: ["project1", "project2"]
confidence: "[high|medium|low]"
---
# [Title]
## Quick Summary
[2-3 sentence summary of the content]
## Key Insights
- **Insight 1:** [description with context]
- **Insight 2:** [description with context]
- **Insight 3:** [description with context]
## Why This Matters
[Connection to user's interests/projects. What makes this worth saving?]
## User Note
[Original user note if provided, otherwise omit section]
## Content Highlights
[Key excerpts or quotes from the content - 200-400 words max]
## Practical Takeaways
- [ ] [Action item 1 if applicable] 📅 [YYYY-MM-DD = date +1 week from today]
- [ ] [Action item 2 if applicable] 📅 [YYYY-MM-DD = date +1 week from today]
## Related Knowledge
- **Similar Bookmarks:** [[bookmark1]], [[bookmark2]]
- **Connected Projects:** [[project1]]
- **Related Notes:** [[note1]], [[note2]]
## Source Details
| Field | Value |
|-------|-------|
| Domain | [domain] |
| Author | [author or "Unknown"] |
| Published | [date or "Unknown"] |
| Word Count | [~X words] |
| Read Time | [~X minutes] |
## Processing Notes
- **Extracted:** [timestamp]
- **Category Confidence:** [percentage]
- **Review Needed:** [yes|no] - [reason if yes]
---
*Processed by COG URL Curator*
```
Save to appropriate location:
- **Standard:** `05-knowledge/booklets/[category-slug]/[title-slug]-YYYY-MM-DD.md`
- **Project-specific:** `04-projects/[project-slug]/resources/[title-slug]-YYYY-MM-DD.md`
- **Mixed/Unclear:** `00-inbox/url-[title-slug]-YYYY-MM-DD.md`
### 6. Tool/Resource Special Handling
For tools and software, use enhanced template:
```markdown
---
type: "url-tool"
category: "tools"
domain: "[domain]"
url: "[url]"
title: "[tool-name]"
date_saved: "YYYY-MM-DD"
pricing: "[free|freemium|paid|enterprise]"
tags: ["#tool", "#category-tags"]
status: "to-evaluate"
---
# [Tool Name]
## What It Does
[1-2 sentence description]
## Key Features
- Feature 1
- Feature 2
- Feature 3
## Use Cases
- Use case 1
- Use case 2
## Pricing
[Pricing details if available]
## Why It's Relevant
[Connection to user's work/interests]
## Evaluation Status
- [ ] Sign up / try demo 📅 [YYYY-MM-DD = date +3 days from today]
- [ ] Test key features 📅 [YYYY-MM-DD = date +1 week from today]
- [ ] Compare with alternatives 📅 [YYYY-MM-DD = date +1 week from today]
- [ ] Decision: [use|pass|revisit] 📅 [YYYY-MM-DD = date +2 weeks from today]
## Notes
[Space for user's evaluation notes]
---
*Processed by COG URL Curator*
```
### 7. Batch Processing
For multiple URLs:
```
Processing [X] URLs...
1. [URL 1] → [category] → Saved to [path]
2. [URL 2] → [category] → Saved to [path]
3. [URL 3] → [category] → Saved to [path]
Summary:
- Articles: 2 saved
- Tools: 1 saved
- Total: 3 URLs processed
```
### 8. Confirm Completion
- Confirm file(s) created
- Show user: "URL saved to [file path]"
- Show quick summary: title, category, key insight preview
- Ask if they want to:
- Add another URL
- Deep-dive into the content
- Connect to specific project or braindump
## Booklet Structure
URLs are organized into "booklets" (category folders):
```
05-knowledge/
└── booklets/
├── articles/
│ ├── _index.md (category overview - auto-created)
│ └── [article-entries].md
├── tools/
│ ├── _index.md
│ └── [tool-entries].md
├── reference/
│ ├── _index.md
│ └── [reference-entries].md
├── research/
│ ├── _index.md
│ └── [research-entries].md
├── inspiration/
│ ├── _index.md
│ └── [inspiration-entries].md
├── videos/
│ ├── _index.md
│ └── [video-entries].md
└── [custom-category]/
├── _index.md
└── [entries].md
```
### Category Index Template
When creating a new category, also create an index file:
```markdown
---
type: "booklet-index"
category: "[category-name]"
created: "YYYY-MM-DD"
last_updated: "YYYY-MM-DD"
entry_count: 0
---
# [Category Name] Booklet
## Description
[What this category contains]
## Recent Additions
[Auto-updated list - most recent 10 entries]
## Top Entries
[Manually curated or most-accessed entries]
## Tags in This Category
[List of common tags used]
## Related Categories
- [[other-category-1]]
- [[other-category-2]]
```
## YAML Formatting Requirements
**CRITICAL:** All YAML frontmatter must use proper Obsidian-compatible formatting:
- All string values MUST be quoted with double quotes
- Arrays MUST use quoted strings: `["item1", "item2", "item3"]`
- URLs MUST be quoted to handle special characters
- Boolean values should NOT be quoted: `true` or `false`
- Ensure proper YAML syntax to prevent parsing errors in Obsidian
**Examples:**
```yaml
# CORRECT
type: "url-bookmark"
url: "https://example.com/path?query=value"
tags: ["#bookmark", "#article", "#ai"]
relevance: "high"
reviewed: false
# INCORRECT
type: url-bookmark
url: https://example.com/path?query=value
tags: [#bookmark, #article, #ai]
relevance: high
reviewed: "false"
```
## Verification Protocols
### Content Accuracy
- **Title Verification:** Ensure extracted title matches page
- **Author Attribution:** Verify author if stated
- **Date Accuracy:** Confirm publication date if shown
- **Summary Fidelity:** Ensure summary accurately represents content
### Categorization Verification
- **Category Fit:** Confirm content matches selected category
- **Tag Relevance:** Verify tags accurately describe content
- **Interest Alignment:** Confirm relevance assessment is accurate
- **Project Connection:** Verify project relevance if claimed
### Quality Checks
- **Completeness:** All required fields populated
- **Formatting:** Proper markdown and YAML syntax
- **Links:** All internal links valid
- **Metadata:** Frontmatter properly formatted
## Uncertainty Handling
### When Content is Unclear
- **Paywalled Content:** Note limitation, extract available preview
- **Dynamic Content:** Note if content may change
- **Complex Content:** Flag for manual review if needed
- **Non-English:** Note language, provide translation if possible
### Confidence Indicators
- **High Confidence (90%+):** Clear content with obvious categorization
- **Medium Confidence (70-89%):** Generally clear with some ambiguity
- **Low Confidence (50-69%):** Significant ambiguity requiring user input
- **Very Low Confidence (<50%):** Major uncertainty, save to inbox
Always explicitly state confidence levels and reasoning in processing notes.
## Integration with Other Skills
### Immediate Follow-up
After URL capture, suggest:
- `/braindump` - Capture thoughts about the URL
- `/knowledge-consolidation` - Integrate into knowledge frameworks
- Daily brief will surface relevant saved URLs
### Cross-Referencing
Automatically check for connections to:
- Active projects (from MY-PROFILE.md)
- Recent braindumps
- Competitive watchlist companies (if exists)
- User interests
## Success Metrics
- Speed of capture (< 30 seconds for single URL)
- Accurate categorization with user confirmation
- Useful insight extraction
- Proper integration with existing knowledge
- Easy retrieval and discovery later
- High confidence in extractions
## Learning and Adaptation
### Pattern Learning
- Track which bookmarks get revisited
- Learn user's categorization preferences
- Improve relevance scoring based on engagement
- Refine insight extraction based on what user finds useful
### Continuous Improvement
- Monitor categorization accuracy over time
- Adapt to user's preferred tag taxonomy
- Learn domain-specific terminology
- Improve cross-referencing accuracy
Name Size