README

Python scripts to maintain and clean up your Obsidian vault.

1. obsidian_vault_cleanup.py - Full Cleanup (All-in-One)

Performs complete vault cleanup:

  1. Analyzes vault for broken links
  2. Creates notes for broken links (people → People/, orgs → Organizations/)
  3. Adds missing wiki links throughout the vault
# Dry run (see what would change)
python scripts/obsidian_vault_cleanup.py --dry-run
# Full cleanup
python scripts/obsidian_vault_cleanup.py
# Analyze only (no changes)
python scripts/obsidian_vault_cleanup.py --analyze-only
# Save analysis to JSON
python scripts/obsidian_vault_cleanup.py --analyze-only --output-json analysis.json

2. analyze_vault.py

Quick analysis of vault health:

  • Broken links (links to non-existent notes)
  • Orphan notes (notes with no incoming links)
  • Most linked notes
python scripts/analyze_vault.py
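
The three checks above could be computed roughly as follows. This is a minimal sketch, not the script's actual code: the `analyze` function, the `WIKI_LINK` pattern, and the assumption that a note's name is its file stem are all illustrative.

```python
import re
from collections import Counter
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only Target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def analyze(vault: Path) -> dict:
    """Return broken links, orphan notes, and incoming-link counts for a vault."""
    # Assumption: a note is identified by its file name without the .md suffix.
    notes = {p.stem: p for p in vault.rglob("*.md")}
    incoming = Counter()
    broken = set()
    for name, path in notes.items():
        text = path.read_text(encoding="utf-8")
        for target in WIKI_LINK.findall(text):
            target = target.strip()
            if target in notes:
                if target != name:  # a self-link is not an incoming link
                    incoming[target] += 1
            else:
                broken.add((name, target))
    return {
        "broken": sorted(broken),
        "orphans": sorted(n for n in notes if incoming[n] == 0),
        "most_linked": incoming.most_common(10),
    }
```

Calling `analyze(Path("path/to/vault"))` returns plain lists, so the result can be printed or serialized with `json.dumps` as needed.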

3. add_missing_links.py - Add Missing Links

Scans notes and adds [[wiki links]] where note names are mentioned but not linked.

# Dry run
python scripts/add_missing_links.py --dry-run
# Add links
python scripts/add_missing_links.py
# Verbose output
python scripts/add_missing_links.py -v

Edit the configuration section at the top of each script to customize:

Directories to exclude from scanning (e.g., .obsidian, attachments).
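
One way such an exclusion list might be applied while walking the vault is sketched below. The setting name `EXCLUDE_DIRS` and the helper `iter_notes` are hypothetical; check the configuration section of each script for the real names.

```python
from pathlib import Path

# Hypothetical setting name and values, used here for illustration only.
EXCLUDE_DIRS = {".obsidian", "attachments"}


def iter_notes(vault: Path, exclude=EXCLUDE_DIRS):
    """Yield Markdown files whose path contains no excluded directory."""
    for path in sorted(vault.rglob("*.md")):
        # Directory components between the vault root and the file itself.
        relative_dirs = path.relative_to(vault).parts[:-1]
        if not any(part in exclude for part in relative_dirs):
            yield path
```

Matching on path components rather than on substrings avoids accidentally excluding a note whose file name merely contains the word "attachments".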

Note names/terms that should NOT be auto-linked. Add common words, single first names, and generic terms here.

SKIP_LINKING = {
    'model', 'planning', 'insurance',  # Common words
    'john', 'mike', 'sarah',           # First names
    'random notes', 'todo',            # Generic terms
}

Regex patterns for links to ignore (dates, images, URLs, etc.).
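
As an illustration of what such patterns might look like, here is a small sketch. The name `SKIP_PATTERNS`, the helper `should_ignore`, and the expressions themselves are assumptions, not the scripts' actual values.

```python
import re

# Hypothetical patterns; adjust them to the link styles used in your vault.
SKIP_PATTERNS = [
    re.compile(r"^\d{4}-\d{2}-\d{2}"),                         # date-prefixed targets
    re.compile(r"\.(png|jpe?g|gif|svg|pdf)$", re.IGNORECASE),  # images and attachments
    re.compile(r"^https?://"),                                 # URLs
]


def should_ignore(target: str) -> bool:
    """Return True if a link target matches any ignore pattern."""
    return any(pattern.search(target) for pattern in SKIP_PATTERNS)
```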

Explicitly categorize names for note creation:

KNOWN_PEOPLE = {'John Smith', 'Jane Doe'}
KNOWN_ORGS = {'Acme Corp', 'Example Inc'}
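
A sketch of how these sets could drive the choice of folder for a newly created note, using the People/ and Organizations/ folders mentioned for the full cleanup script. The helper `target_folder` is hypothetical, and returning `None` for uncategorized names is an assumption; the real script may apply further heuristics.

```python
KNOWN_PEOPLE = {"John Smith", "Jane Doe"}
KNOWN_ORGS = {"Acme Corp", "Example Inc"}


def target_folder(name: str):
    """Return the folder for a new note, or None when the name is not categorized."""
    if name in KNOWN_PEOPLE:
        return "People"
    if name in KNOWN_ORGS:
        return "Organizations"
    return None  # assumption: the caller decides what to do with unknown names
```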

How linking works:

  1. Builds a map of all note names and aliases
  2. Sorts by length (longest first) to prevent partial matches
  3. For each note, finds mentions of other notes
  4. Links only the first occurrence of each term
  5. Skips terms that are already linked
  6. Skips the note’s own name (no self-links)
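
The steps above can be sketched as a single function. This is an approximation under stated assumptions, not the script's implementation: `add_links` and its `names` mapping (term or alias to note name) are hypothetical, and "already linked" is taken to mean that the target note is linked anywhere in the text.

```python
import re


def add_links(text: str, own_name: str, names: dict) -> str:
    """Link the first unlinked mention of each known term in `text`.

    `names` maps every note name or alias to the note it should link to.
    """
    linked_span = re.compile(r"\[\[.*?\]\]")
    # Longest terms first, so "Microsoft Research" wins over "Microsoft".
    for term in sorted(names, key=len, reverse=True):
        target = names[term]
        if target == own_name:
            continue  # no self-links
        if f"[[{target}]]" in text or f"[[{target}|" in text:
            continue  # target is already linked in this note
        pattern = re.compile(rf"\b{re.escape(term)}\b")
        # Text already inside [[...]] must not be modified.
        protected = [m.span() for m in linked_span.finditer(text)]
        for m in pattern.finditer(text):
            if any(s <= m.start() and m.end() <= e for s, e in protected):
                continue
            link = f"[[{target}]]" if term == target else f"[[{target}|{term}]]"
            text = text[:m.start()] + link + text[m.end():]
            break  # only the first occurrence is linked
    return text
```

Note that this sketch only protects text that is already inside a link, so a short term can still match inside a later, unlinked mention of a longer one.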

Terms that are auto-linked:

  • Multi-word proper nouns (e.g., “Microsoft Research”)
  • Single words 6+ characters starting with a capital (e.g., “Anthropic”)
  • Aliases defined in note frontmatter

Terms that are skipped:

  • Common English words
  • Single first names (ambiguous)
  • Short abbreviations (under 6 chars)
  • Date-prefixed notes
  • Terms already linked elsewhere in the note
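
The bullets above can be approximated by a single predicate. This is a rough sketch: `is_linkable` is a hypothetical helper, the order in which the rules are checked is an assumption, and "proper noun" is simplified to "every word starts with a capital letter".

```python
import re

SKIP_LINKING = {"model", "planning", "insurance", "john", "mike", "sarah"}
MIN_SINGLE_WORD_LEN = 6  # single words need 6+ characters
DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}")


def is_linkable(term: str, aliases=frozenset()) -> bool:
    """Approximate the auto-linking rules for one candidate term."""
    if term in aliases:
        return True  # assumption: frontmatter aliases take precedence
    if term.lower() in SKIP_LINKING:
        return False  # common words, first names, generic terms
    if DATE_PREFIX.match(term):
        return False  # date-prefixed notes
    words = term.split()
    if len(words) > 1:
        # Simplified proper-noun test: every word is capitalized.
        return all(word[:1].isupper() for word in words)
    return len(term) >= MIN_SINGLE_WORD_LEN and term[:1].isupper()
```

The "already linked elsewhere in the note" rule depends on the note's text rather than on the term alone, so it is left out of this predicate.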

Recommended workflow:

  1. Always run with --dry-run first to preview changes
  2. Back up your vault before running live (or use git)
  3. Customize SKIP_LINKING for your vault’s conventions
  4. Run analyze_vault.py periodically to check vault health
  5. Add new people/orgs to KNOWN_PEOPLE/KNOWN_ORGS for better categorization

Requirements:

  • Python 3.7+
  • No external dependencies (uses only standard library)