WoT
Builders

Build on the world's most comprehensive classification graph

1,000+ classification systems, 1.2M+ nodes, and 321K+ crosswalk edges - available via REST API, MCP server, packaged AI skills (Claude Code, Anthropic, ChatGPT Custom GPT, portable), or directly from the open-source repo.

GitHub

Open source - fork, contribute, or self-host

Repository

colaberry/WorldOfTaxonomy

License

Open Source

Stack

Python + Next.js + PostgreSQL

GitHub Stars

2

Quick start

git clone https://github.com/colaberry/WorldOfTaxonomy.git
cd WorldOfTaxonomy

# Install backend dependencies
pip install -r requirements.txt

# Configure database (copy .env.example and fill in DATABASE_URL)
cp .env.example .env

# Run the API
python3 -m uvicorn world_of_taxonomy.api.app:create_app --factory --port 8000

# Run the frontend (separate terminal)
cd frontend && npm install && npm run dev

Open repository

Crosswalk Explorer

Interactive graph visualization of crosswalk relationships

Explore how classification systems connect through 321K+ crosswalk edges. The system-level graph shows all connected systems grouped by category. Click any edge to drill into the code-level view with individual mappings.

System view

Systems grouped by category, edges = crosswalks

Code view

Individual codes with exact/partial/broad edges

Powered by

Cytoscape.js graph library

Open Crosswalk Explorer
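The relationship between the two views can be sketched as a simple aggregation: each system-level edge stands for all code-level mappings between one pair of systems. The sketch below is only an illustration of that idea, not the Explorer's actual implementation. The edge tuple layout and the system ids (sys_a, sys_b, sys_c) are hypothetical placeholders; only the exact/partial/broad labels come from this page.

```python
from collections import Counter

# Hypothetical code-level edges:
# (source_system, source_code, target_system, target_code, match_type)
CODE_EDGES = [
    ("sys_a", "A1", "sys_b", "B1", "exact"),
    ("sys_a", "A2", "sys_b", "B1", "partial"),
    ("sys_a", "A3", "sys_b", "B7", "broad"),
    ("sys_b", "B1", "sys_c", "C1", "exact"),
]


def system_level_edges(code_edges):
    """Collapse code-level mappings into one weighted edge per system pair."""
    counts = Counter()
    for src_system, _src_code, tgt_system, _tgt_code, _match in code_edges:
        # Sort the pair so A->B and B->A land on the same undirected edge.
        pair = tuple(sorted((src_system, tgt_system)))
        counts[pair] += 1
    return dict(counts)


def drill_down(code_edges, system_x, system_y):
    """Return the code-level mappings behind one system-level edge."""
    wanted = {system_x, system_y}
    return [edge for edge in code_edges if {edge[0], edge[2]} == wanted]


print(system_level_edges(CODE_EDGES))
print(drill_down(CODE_EDGES, "sys_b", "sys_c"))
```

Counting mappings per pair gives a natural edge weight for the system view, while drill_down corresponds to clicking an edge to see its individual code mappings.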

REST API

HTTP JSON API - 50 endpoints, no SDK needed

Base URL

/api/v1

Auth

Bearer token or API key

Rate limits

30 requests/min anonymous, 1,000 requests/min authenticated

Popular endpoints

GET /api/v1/search?q={term} - Full-text search across 1.2M+ nodes
GET /api/v1/systems/{id}/nodes/{code}/equivalences - Crosswalk mappings to other systems
POST /api/v1/classify - Classify free-text against all systems (Pro+)
GET /api/v1/countries/{code} - Country taxonomy profile
GET /api/v1/export/systems.jsonl - Bulk export as JSONL (Pro+)
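Since no SDK is needed, a search call can be sketched with the Python standard library alone. This is a minimal sketch under assumptions: the host is the local API from the quick start (port 8000), the token is sent as a standard "Authorization: Bearer" header, and the response is JSON whose shape is not described on this page, so the code returns it unparsed beyond json.load.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a locally running API as in the quick start; swap in your own host.
BASE_URL = "http://localhost:8000/api/v1"


def build_search_request(term, token=None, base_url=BASE_URL):
    """Build a GET request for /search?q={term}, optionally authenticated."""
    query = urllib.parse.urlencode({"q": term})
    request = urllib.request.Request(f"{base_url}/search?{query}")
    request.add_header("Accept", "application/json")
    if token:
        # Assumption: bearer tokens use the conventional Authorization header.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def search(term, token=None):
    """Send the request and decode the JSON body (response shape not assumed)."""
    with urllib.request.urlopen(build_search_request(term, token), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call search("crop production") against a running server.
    print(build_search_request("crop production").full_url)
```

Keeping request construction separate from the network call makes the URL and headers easy to inspect or unit-test without a server.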

Guides

Curated knowledge to use the data effectively

View all guides

MCP Server

Works with Claude, Cursor, VS Code, Windsurf, and any MCP client

The MCP (Model Context Protocol) server lets AI assistants like Claude query the taxonomy graph directly - searching codes, translating between systems, navigating hierarchies, and exploring country profiles - all from within a conversation.

Protocol

JSON-RPC over stdio

Transport

stdin / stdout

Tools

22 available

Quick start

# From the repo root (requires DATABASE_URL in environment)
python3 -m world_of_taxonomy mcp

Popular tools

search_classifications

Full-text search across all nodes

translate_code

Convert a code from one system to another

classify_business

Classify free-text against taxonomy systems

explore_industry_tree

Interactive hierarchy exploration

get_country_taxonomy_profile

Full taxonomy profile for a country
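For readers writing their own client, the wire format can be sketched as follows. MCP messages are JSON-RPC 2.0 objects, and on the stdio transport each message is written as a single line; tool invocations use the "tools/call" method. The argument name "query" below is a hypothetical placeholder, since the tools' input schemas are not listed on this page (a client would normally discover them via "tools/list" after the initialize handshake).

```python
import json


def jsonrpc_request(request_id, method, params):
    """Serialize one JSON-RPC 2.0 request as a single line (no embedded newlines)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message)


def tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request for the named tool."""
    return jsonrpc_request(request_id, "tools/call", {"name": tool_name, "arguments": arguments})


# "query" is an assumed argument name, used here for illustration only.
line = tool_call(1, "search_classifications", {"query": "crop production"})
print(line)
```

In practice most users will not hand-write these messages; an MCP client such as Claude or Cursor launches the server command and speaks this protocol on their behalf.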

AI Skills

Drop-in skill bundles for Claude, ChatGPT, and any LLM agent

Four packaged integrations, all backed by the same REST API and MCP server. Pick the one that matches your agent runtime. Source lives in the /skills directory of the repo.

Claude Code Skill

Markdown skill file with frontmatter. Drop into ~/.claude/skills/ or reference from the repo. Auto-activates on classification, translation, and hierarchy queries.

skills/claude-code/worldoftaxonomy.md

Anthropic Claude Skill

Self-contained SKILL.md bundle for claude.ai agent skills. Includes auth, endpoints, response guidance, and invocation triggers.

skills/anthropic/SKILL.md

ChatGPT Custom GPT

OpenAPI Action schema + system prompt for ChatGPT. Includes an export script that trims the spec to the 10 endpoints a Custom GPT needs.

skills/openapi/

Portable LLM Skill

Plain markdown system prompt + JSON tool schemas. Works with Gemini, Llama, LangChain, LlamaIndex, or any function-calling agent.

skills/portable/

Shared capabilities

  • Classify free text (business, product, occupation, document) under standard codes across all systems
  • Translate codes between systems (NAICS -> ISIC, ICD-10-CM -> ICD-10-GM, SOC -> ISCO, HS -> CPC)
  • Walk hierarchies (children, ancestors, siblings) and audit crosswalk coverage between any two systems
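The hierarchy walk in the last bullet can be illustrated in memory. This sketch assumes nodes are (code, title, description, parent_code) tuples, the same layout as the ingester template further down this page; the rows beyond "A" and "A01" are hypothetical, and the hosted API and skills perform these lookups server-side rather than with this code.

```python
# (code, title, description, parent_code); rows after the first two are hypothetical
NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "...", "A"),
    ("A02", "Animal production", "...", "A"),
    ("A011", "Cereal crops", "...", "A01"),
]


def build_indexes(nodes):
    """Return (parent_of, children_of) lookup tables for a node list."""
    parent_of, children_of = {}, {}
    for code, _title, _desc, parent in nodes:
        parent_of[code] = parent
        children_of.setdefault(code, [])
        if parent is not None:
            children_of.setdefault(parent, []).append(code)
    return parent_of, children_of


def ancestors(code, parent_of):
    """Walk parent links upward; nearest ancestor first."""
    chain = []
    current = parent_of[code]
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    return chain


def siblings(code, parent_of, children_of):
    """Other nodes sharing the same parent (other roots, for a root node)."""
    parent = parent_of[code]
    if parent is None:
        return [c for c, p in parent_of.items() if p is None and c != code]
    return [c for c in children_of[parent] if c != code]


parent_of, children_of = build_indexes(NODES)
print(children_of["A"])                          # ['A01', 'A02']
print(ancestors("A011", parent_of))              # ['A01', 'A']
print(siblings("A01", parent_of, children_of))   # ['A02']
```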

Adding a New System

Contribute a classification system in ~10 steps using TDD

Every system follows the same TDD cycle: write a failing test first, implement the ingester to make it green, wire it into the CLI, then run the full suite to confirm no regressions. The detailed SOP lives in docs/adding-a-new-system.md.

New file

world_of_taxonomy/ingest/<system>.py

Test file

tests/test_ingest_<system>.py

Wire up

world_of_taxonomy/__main__.py

10-step checklist

  1. Write a failing test (test_ingest_<system>.py) - confirm it is red before continuing
  2. Create the ingester (ingest/<system>.py) - parse source data, build SYSTEM + NODES dicts
  3. Set is_leaf correctly - use codes_with_children = {parent for ... if parent} pattern
  4. Implement ingest(conn) - upsert system row, upsert nodes in dependency order
  5. Run the test green - minimum code to pass, nothing more
  6. Add crosswalk edges if a concordance table exists (ingest/crosswalk_<system>.py)
  7. Write a test for equivalences - confirm bidirectional edges are created
  8. Wire into __main__.py ingest command (add system id to the dispatch table)
  9. Run the full test suite - python3 -m pytest tests/ -v - all green before committing
  10. Update CLAUDE.md system table with name, region, and node count
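As an illustration of steps 1 to 3, a first test might look like the sketch below. It is not taken from the repo's SOP: in a real tests/test_ingest_<system>.py the node list would be imported from the ingester module (so the test is red until that module exists), whereas here the sample data and the compute_leaf_flags helper are hypothetical and defined inline to keep the example self-contained.

```python
# Hypothetical sample data; a real test would import NODES from the ingester.
SAMPLE_NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "...", "A"),
]


def compute_leaf_flags(nodes):
    """Map each code to True if no other node names it as parent."""
    codes_with_children = {parent for (_, _, _, parent) in nodes if parent is not None}
    return {code: code not in codes_with_children for (code, _, _, _) in nodes}


def test_codes_are_unique():
    codes = [code for (code, _, _, _) in SAMPLE_NODES]
    assert len(codes) == len(set(codes))


def test_every_parent_exists():
    codes = {code for (code, _, _, _) in SAMPLE_NODES}
    for code, _title, _desc, parent in SAMPLE_NODES:
        assert parent is None or parent in codes, f"{code} has unknown parent {parent}"


def test_leaf_flags_follow_hierarchy():
    flags = compute_leaf_flags(SAMPLE_NODES)
    assert flags == {"A": False, "A01": True}
```

Checks like these depend only on the parsed data, so they can run before any database work in step 4.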

Three implementation paths

Path A - NACE-derived

System shares all NACE Rev 2 codes (WZ, ONACE, NOGA, ATECO, NAF, PKD, SBI, etc.). Copy nodes from NACE and create 1:1 equivalence edges. ~15 lines of code.

see: nace_derived.py
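The copy-and-link idea behind Path A can be sketched in a few lines. This is an illustration under assumptions, not the contents of nace_derived.py: the system ids, the edge tuple layout, and the "exact" label are hypothetical, and the two sample rows stand in for the full NACE Rev 2 node list.

```python
# Sample rows in (code, title, description, parent_code) form
BASE_NODES = [
    ("A", "Agriculture, forestry and fishing", "...", None),
    ("01", "Crop and animal production, hunting and related service activities", "...", "A"),
]


def derive_system(base_system_id, derived_system_id, base_nodes):
    """Copy the base nodes and link each code 1:1 in both directions."""
    derived_nodes = list(base_nodes)  # same codes, titles, and parents
    edges = []
    for code, _title, _desc, _parent in base_nodes:
        # Hypothetical edge layout: (from_system, from_code, to_system, to_code, match_type)
        edges.append((derived_system_id, code, base_system_id, code, "exact"))
        edges.append((base_system_id, code, derived_system_id, code, "exact"))
    return derived_nodes, edges


nodes, edges = derive_system("nace_rev2", "wz_2008", BASE_NODES)
print(len(nodes), "nodes,", len(edges), "edges")  # 2 nodes, 4 edges
```

Emitting both directions up front matches the checklist's expectation that equivalence edges are bidirectional.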

Path B - ISIC-derived

System is a national adaptation of ISIC Rev 4 (CIIU, VSIC, BSIC, etc.). Copy ISIC nodes and create equivalences. Add country-specific codes if the source deviates.

see: isic_derived.py

Path C - Standalone

System has its own source file (CSV, XLSX, JSON, XML, PDF). Parse source, build hierarchy from parent codes, detect leaves via codes_with_children, upsert independently.

see: naics.py, loinc.py

Minimal standalone ingester template

# world_of_taxonomy/ingest/my_system.py
SYSTEM = {
    "id": "my_system_2024",
    "name": "My Classification System 2024",
    "authority": "Issuing Body",
    "region": "Global",
    "version": "2024",
    "description": "...",
}

# (code, title, description, parent_code)
NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "...", "A"),
    # ... remaining nodes
]

async def ingest(conn) -> None:
    await conn.execute("""
        INSERT INTO classification_system (...) VALUES (...)
        ON CONFLICT (id) DO UPDATE SET ...
    """, *SYSTEM.values())

    # Compute leaf flags dynamically - never hard-code level == N
    codes_with_children = {parent for (_, _, _, parent) in NODES if parent is not None}

    for code, title, desc, parent in NODES:
        is_leaf = code not in codes_with_children
        await conn.execute("""
            INSERT INTO classification_node (...) VALUES (...)
            ON CONFLICT (system_id, code) DO UPDATE SET ...
        """, SYSTEM["id"], code, title, desc, parent, is_leaf)

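One way to exercise the leaf-flag logic of the template above without a database is a dry run against a stub connection. The sketch below is an assumption-laden illustration: RecordingConnection and ingest_nodes are hypothetical names, ingest_nodes is a stand-in that mirrors only the node loop of the template, and the statement text is a placeholder because the template elides the real column list.

```python
import asyncio


class RecordingConnection:
    """Stub connection: records execute() calls instead of running SQL."""

    def __init__(self):
        self.calls = []

    async def execute(self, query, *args):
        self.calls.append((query, args))


NODES = [
    ("A", "Section A", "Agriculture", None),
    ("A01", "Crop production", "...", "A"),
]


async def ingest_nodes(conn, system_id, nodes):
    """Stand-in for the template's node loop, with placeholder statement text."""
    codes_with_children = {parent for (_, _, _, parent) in nodes if parent is not None}
    for code, title, desc, parent in nodes:
        is_leaf = code not in codes_with_children
        # Placeholder, not real SQL: the template's column list is elided.
        await conn.execute("UPSERT classification_node", system_id, code, title, desc, parent, is_leaf)


conn = RecordingConnection()
asyncio.run(ingest_nodes(conn, "my_system_2024", NODES))
for _query, args in conn.calls:
    print(args[1], "is_leaf =", args[-1])
```

Because the stub only needs an async execute(query, *args) method, the same approach works for inspecting what a real ingest(conn) would send before pointing it at PostgreSQL.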
Pricing

Free, Pro, and Enterprise plans available

The full knowledge graph is available on every plan. Paid tiers add higher rate limits, bulk export, the classification API, and dedicated support.

View pricing

Contact Sales

Interested in Enterprise? Tell us about your use case and we'll get back to you.

Questions or contributions?

Open an issue or pull request on GitHub - all feedback welcome.

Open an issue