README
Obsidian Vault Cleanup Scripts
Python scripts to maintain and clean up your Obsidian vault.
Scripts
1. obsidian_vault_cleanup.py - Full Cleanup (All-in-One)
Performs complete vault cleanup:
- Analyzes vault for broken links
- Creates notes for broken links (people → People/, orgs → Organizations/)
- Adds missing wiki links throughout the vault

    # Dry run (see what would change)
    python scripts/obsidian_vault_cleanup.py --dry-run

    # Full cleanup
    python scripts/obsidian_vault_cleanup.py

    # Analyze only (no changes)
    python scripts/obsidian_vault_cleanup.py --analyze-only

    # Save analysis to JSON
    python scripts/obsidian_vault_cleanup.py --analyze-only --output-json analysis.json

2. analyze_vault.py - Vault Analysis
Quick analysis of vault health:
- Broken links (links to non-existent notes)
- Orphan notes (notes with no incoming links)
- Most linked notes
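
The three checks above can be sketched in a few lines. This is only an illustration under stated assumptions, not the script's actual code: the `analyze` function, the `WIKI_LINK` pattern, and the in-memory `vault` dictionary are all hypothetical, and the real script reads notes from disk.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only the target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")

def analyze(notes):
    """notes maps note name -> note text (a hypothetical in-memory vault)."""
    incoming = Counter()   # note name -> number of links pointing at it
    broken = set()         # link targets that have no matching note
    for text in notes.values():
        for target in WIKI_LINK.findall(text):
            target = target.strip()
            if target in notes:
                incoming[target] += 1
            else:
                broken.add(target)
    orphans = sorted(name for name in notes if incoming[name] == 0)
    return sorted(broken), orphans, incoming.most_common(3)

vault = {
    "Home": "See [[Projects]] and [[Jane Doe|Jane]].",
    "Projects": "Back to [[Home]]. Roadmap lives in [[Missing Note]].",
    "Scratch": "No links here.",
}
broken, orphans, top = analyze(vault)
print("Broken:", broken)
print("Orphans:", orphans)
print("Most linked:", top)
```

In this toy vault, `Jane Doe` and `Missing Note` are reported as broken links and `Scratch` as an orphan.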

    python scripts/analyze_vault.py

3. add_missing_links.py - Add Missing Links
Scans notes and adds [[wiki links]] where note names are mentioned but not linked.

    # Dry run
    python scripts/add_missing_links.py --dry-run

    # Add links
    python scripts/add_missing_links.py

    # Verbose output
    python scripts/add_missing_links.py -v

Configuration
Edit the configuration section at the top of each script to customize:
SKIP_DIRS
Directories to exclude from scanning (e.g., .obsidian, attachments).
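
As a rough sketch of how such an exclusion list is typically applied (the scripts may do this differently), the hypothetical `iter_notes` helper below walks a vault with `os.walk` and prunes any directory named in `SKIP_DIRS`. The two entries are the examples from the text above, and the assumption that notes end in `.md` is mine.

```python
import os
from pathlib import Path

SKIP_DIRS = {".obsidian", "attachments"}  # example values only

def iter_notes(vault_root):
    """Yield every .md file under vault_root, never entering SKIP_DIRS."""
    for dirpath, dirnames, filenames in os.walk(vault_root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            if filename.endswith(".md"):
                yield Path(dirpath) / filename
```

Pruning during the walk means excluded folders are never read at all, which matters for large attachment directories.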
SKIP_LINKING
Note names/terms that should NOT be auto-linked. Add common words, single first names, and generic terms here.

    SKIP_LINKING = {
        'model', 'planning', 'insurance',  # Common words
        'john', 'mike', 'sarah',           # First names
        'random notes', 'todo',            # Generic terms
    }

SKIP_LINK_PATTERNS
Regex patterns for links to ignore (dates, images, URLs, etc.).
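
The exact patterns shipped with the scripts are not listed here, so the ones below are illustrative guesses for the three categories named above, and `should_skip_link` is a hypothetical helper showing how such a list might be consulted.

```python
import re

# Hypothetical patterns; check the top of each script for the real list.
SKIP_LINK_PATTERNS = [
    re.compile(r"^\d{4}-\d{2}-\d{2}"),                          # date-prefixed targets
    re.compile(r"\.(png|jpe?g|gif|svg|pdf)$", re.IGNORECASE),   # images and attachments
    re.compile(r"^https?://"),                                  # URLs
]

def should_skip_link(target):
    """Return True if the link target matches any ignore pattern."""
    return any(pattern.search(target) for pattern in SKIP_LINK_PATTERNS)

for target in ["2024-01-15 Standup", "diagram.PNG", "https://example.com", "Jane Doe"]:
    print(target, "->", "skip" if should_skip_link(target) else "keep")
```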
KNOWN_PEOPLE and KNOWN_ORGS
Explicitly categorize names for note creation:

    KNOWN_PEOPLE = {'John Smith', 'Jane Doe'}
    KNOWN_ORGS = {'Acme Corp', 'Example Inc'}

How Auto-Linking Works
- Builds a map of all note names and aliases
- Sorts names by length (longest first) so that a longer name is matched before any shorter name it contains
- For each note, finds mentions of other notes
- Links only the first occurrence of each term
- Skips terms that are already linked
- Skips the note’s own name (no self-links)
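
The steps above can be sketched as a single function. This is a minimal illustration, not the script's implementation: `add_links`, `LINK`, and the `names` map are hypothetical, matching here is case-insensitive by assumption, and "already linked" is read as "the target note is linked anywhere in the text".

```python
import re

# Captures the target of [[Target]], [[Target|alias]] or [[Target#heading]].
LINK = re.compile(r"\[\[([^\]|#]+)[^\]]*\]\]")

def add_links(note_name, text, name_map):
    """name_map maps a lowercase name or alias to the canonical note name."""
    already = {m.group(1).strip().lower() for m in LINK.finditer(text)}
    for term in sorted(name_map, key=len, reverse=True):  # longest first
        target = name_map[term]
        if target == note_name or target.lower() in already:
            continue  # no self-links, and nothing that is already linked
        protected = [m.span() for m in LINK.finditer(text)]
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            if any(start <= m.start() < end for start, end in protected):
                continue  # this mention sits inside an existing link
            shown = m.group(0)
            link = f"[[{target}]]" if shown == target else f"[[{target}|{shown}]]"
            text = text[:m.start()] + link + text[m.end():]
            already.add(target.lower())
            break  # link only the first occurrence
    return text

names = {
    "microsoft research": "Microsoft Research",
    "microsoft": "Microsoft",
    "msr": "Microsoft Research",  # alias
}
print(add_links("Home", "Microsoft Research and Microsoft.", names))
print(add_links("Home", "The MSR paper cites MSR.", names))
```

Processing the longest name first is what keeps "Microsoft" from being linked inside "Microsoft Research"; the `protected` spans then stop later, shorter terms from matching inside links that were just inserted.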
What Gets Linked
- Multi-word proper nouns (e.g., “Microsoft Research”)
- Single words 6+ characters starting with a capital (e.g., “Anthropic”)
- Aliases defined in note frontmatter
What Gets Skipped
- Common English words
- Single first names (ambiguous)
- Short abbreviations (under 6 chars)
- Date-prefixed notes
- Terms already linked elsewhere in the note
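
Taken together, the linking and skipping rules amount to a filter over candidate note names. The sketch below is one possible reading of those rules, not the scripts' actual logic: `is_linkable` and `DATE_PREFIX` are hypothetical, the ISO-style date format is an assumption, and the check for multi-word names is simplified to "contains a space" rather than a real proper-noun test.

```python
import re

SKIP_LINKING = {
    "model", "planning", "insurance",   # common words
    "john", "mike", "sarah",            # first names
    "random notes", "todo",             # generic terms
}
DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}")  # assumed date format

def is_linkable(name, min_single_word_len=6):
    """Decide whether a note name is a candidate for auto-linking."""
    name = name.strip()
    if name.lower() in SKIP_LINKING:
        return False
    if DATE_PREFIX.match(name):
        return False
    if " " in name:
        return True  # multi-word name (simplified proper-noun check)
    # Single words must be long enough and start with a capital letter.
    return len(name) >= min_single_word_len and name[:1].isupper()

for candidate in ["Microsoft Research", "Anthropic", "NASA", "Planning", "2024-01-15 Standup"]:
    print(candidate, "->", is_linkable(candidate))
```

Terms that are already linked in a note are handled per note at link time, so they are not part of this name-level filter.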

Tips
- Always run with --dry-run first to preview changes
- Back up your vault before running live (or use git)
- Customize SKIP_LINKING for your vault’s conventions
- Run analyze_vault.py periodically to check vault health
- Add new people/orgs to KNOWN_PEOPLE / KNOWN_ORGS for better categorization
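
To illustrate why keeping KNOWN_PEOPLE and KNOWN_ORGS up to date helps, here is a hedged sketch of how a stub note for a broken link might be routed to People/ or Organizations/. The `target_path` function is hypothetical, and returning `None` for unknown names is an assumption of this sketch; the real script may fall back to other heuristics.

```python
from pathlib import Path

KNOWN_PEOPLE = {"John Smith", "Jane Doe"}
KNOWN_ORGS = {"Acme Corp", "Example Inc"}

def target_path(vault_root, name):
    """Return where a stub note for a broken link would be created, or None."""
    if name in KNOWN_PEOPLE:
        folder = "People"
    elif name in KNOWN_ORGS:
        folder = "Organizations"
    else:
        return None  # unknown name: left for manual review in this sketch
    return Path(vault_root) / folder / f"{name}.md"

print(target_path("vault", "Jane Doe"))
print(target_path("vault", "Acme Corp"))
print(target_path("vault", "Somebody Else"))
```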
Requirements
- Python 3.7+
- No external dependencies (uses only standard library)