Code Plugin · source linked

SkillCompass OC Canary (Internal) — v1.1.0-oc.8

SkillCompass plugin for OpenClaw — skill quality evaluation, usage tracking, and inbox suggestions

skillcompass-oc-canary · runtime skillcompass.oc · by @krishna-505
Community code plugin. Review compatibility and verification before install.
openclaw plugins install clawhub:skillcompass-oc-canary
Latest release: v1.1.0-oc.8

Capabilities

configSchema: Yes
Executes code: Yes
HTTP routes: 0
Runtime ID: skillcompass.oc

Compatibility

Built with OpenClaw version: 2026.3.24-beta.2
Min gateway version: 2026.3.24-beta.2
Plugin API range: >=2026.3.24-beta.2
Plugin SDK version: 2026.3.24-beta.2
Security Scan
VirusTotal: Pending
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The codebase (validators, audit chain, integrity monitor, inbox engine, patterns, OC integration) matches the declared purpose (skill quality evaluation, usage tracking, inbox suggestions). No unexpected cloud credentials or unrelated binaries are requested; reading/writing a local .skill-compass state directory is coherent with the feature set.
Instruction Scope
SKILL.md instructs users to clone the repo, run npm install, and rsync into their agent skill directory. It explicitly tells users to approve 'node' commands persistently ('Allow always') so the tool can run locally and auto-trigger onboarding scans. The codebase also mentions a 'pre-eval-scan' with a shell wrapper and an 'update-checker' (network-capable components were not fully shown). These items expand the agent's file/command access surface: the skill will read many skill files, write .skill-compass state, and may run local Node scripts (and potentially shell wrappers) during scans. That behavior is expected for a local evaluator, but it is more powerful than a read-only linter and should be reviewed before granting persistent execution rights.
Install Mechanism
No registry install spec is present; SKILL.md recommends git clone and npm install. Running npm install will pull dependencies from npm (not shown in registry metadata). This is a moderate risk because third-party packages will be installed into the runtime environment; the registry metadata did not declare an install package or pinned dependency list in the skill manifest for review.
Credentials
The skill declares no required environment variables or credentials. Code references process.env.CLAUDE_PLUGIN_ROOT as an optional baseDir, which is appropriate. The skill persists local state under .skill-compass in the chosen baseDir; this local filesystem access is consistent with its stated role and does not request extra secrets or unrelated credentials.
Persistence & Privilege
The skill is not always:true and does not demand elevated platform privileges, but it creates persistent data under .skill-compass (audit logs, manifests, checksums, inbox.json) and can install hooks (post-tool hooks are mentioned). Combined with the SKILL.md guidance to grant persistent Node execution permission, this gives the skill ongoing capability to run scans and update local state. That persistence is plausible for this tool, but users should be aware of the long-lived nature of those files and hooks.
What to consider before installing
Before installing or granting persistent permission:

  • Inspect the two files I couldn't fully review: lib/pre-eval-scan.js (and any shell wrapper it references) and lib/update-checker.js (or any OC dist update/check code). These are the most likely places that could execute shell commands or make network calls.
  • If you will run npm install, review package.json and the full dependency tree (npm ls --all) in a safe environment. Consider running npm install in an isolated container or VM first.
  • Do not click "Allow always" for Node execution until you have confirmed the code only performs static analysis and local file operations. Granting persistent execution means the skill can run Node scripts and hooks without a prompt.
  • Verify network behavior: confirm that the update checker and any telemetry are opt-in and that the default is no external network calls (SKILL.md claims data stays local unless you explicitly request updates). If networked update/telemetry exists, ensure endpoints are trustworthy and optional.
  • Back up any important data before the first run; the skill writes to .skill-compass and may create audit/manifest files and snapshots.
  • For higher assurance, run the skill inside a sandbox (lightweight VM, container, or dedicated user account) and observe its behavior (what files are read/written, what processes are spawned, what outbound connections are made) before installing it into your normal agent environment.

Why this is flagged as suspicious rather than benign: the project is largely coherent with its stated purpose, but a few components (the pre-eval shell wrapper, the update checker, and the npm-install path) increase the runtime power beyond simple static analysis. Those are plausible for the tool's goals, but they need explicit review and confirmation before you grant persistent execution rights or install into a production agent environment.
Additional information that would raise confidence to 'high': the content of pre-eval-scan.js and update-checker.js demonstrating static-only behavior and an opt-in network/update flow, and a reviewed/pinned package.json for npm dependencies.
Patterns worth reviewing

These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.

lib/update-checker.js:67
Shell command execution detected (child_process).
oc/dist/locale.js:59
Dynamic code execution detected.
lib/update-checker.js:164
Environment variable access combined with network send.
lib/update-checker.js:202
File read combined with network send (possible exfiltration).

Verification

Tier: source linked
Scope: artifact only
Summary: Validated package structure and linked the release to source metadata.
Commit: 230fcdfe2857
Tag: feature/openclaw
Provenance: No
Scan status: pending

Tags

canary: 1.1.0-oc.12
latest: 1.1.0-oc.8
<h1 align="center">SkillCompass</h1> <p align="center"> <strong>Evaluate quality. Find the weakest link. Fix it. Prove it worked. Repeat.</strong> </p> <p align="center"> <a href="https://github.com/Evol-ai/SkillCompass">GitHub</a> &middot; <a href="SKILL.md">SKILL.md</a> &middot; <a href="schemas/">Schemas</a> &middot; <a href="CHANGELOG.md">Changelog</a> </p> <p align="center"> <a href="https://clawhub.ai/skill/skill-compass"><img src="https://img.shields.io/badge/ClawHub-skill--compass-orange.svg" alt="ClawHub" /></a> <img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License" /> <img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node >= 18" /> <img src="https://img.shields.io/badge/model-Claude%20Opus%204.6-purple.svg" alt="Claude Opus 4.6" /> </p>
**What it is:** A local-first skill quality evaluator and management tool for Claude Code / OpenClaw. Six-dimension scoring, usage-driven suggestions, guided improvement, version tracking.

**Pain it solves:** Turns "tweak and hope" into diagnose → targeted fix → verified improvement. Turns "install and forget" into ongoing visibility over what's working, what's stale, and what's risky.

**Use in 30 seconds:** `/skillcompass` — see your skill health at a glance. `/eval-skill {path}` — instant quality report showing exactly what's weakest and what to improve next.

Evaluate → find weakest link → fix it → prove it worked → next weakness → repeat. Meanwhile, Skill Inbox watches your usage and tells you what needs attention.


Who This Is For

<table> <tr><td width="50%">

For

  • Anyone maintaining agent skills and wanting measurable quality
  • Developers who want directed improvement — not guesswork, but knowing exactly which dimension to fix next
  • Teams needing a quality gate — any tool that edits a skill gets auto-evaluated
  • Users who install many skills and need visibility over what's actually used, what's stale, and what's risky
</td><td>

Not For

  • General code review or runtime debugging
  • Creating new skills from scratch (use skill-creator)
  • Evaluating non-skill files
</td></tr> </table>

Quick Start

Prerequisites: Claude Opus 4.6 (complex reasoning + consistent scoring) · Node.js v18+ (local validators)

Claude Code

```bash
git clone https://github.com/Evol-ai/SkillCompass.git
cd SkillCompass && npm install

# User-level (all projects)
rsync -a --exclude='.git' . ~/.claude/skills/skill-compass/

# Or project-level (current project only)
rsync -a --exclude='.git' . .claude/skills/skill-compass/
```

First run: SkillCompass auto-triggers a brief onboarding — scans your installed skills (~5 seconds), offers statusLine setup, then hands control back. Claude Code will request permission for node commands; select "Allow always" to avoid repeated prompts.

OpenClaw

```bash
git clone https://github.com/Evol-ai/SkillCompass.git
cd SkillCompass && npm install
# Follow OpenClaw skill installation docs for your setup
rsync -a --exclude='.git' . <your-openclaw-skills-path>/skill-compass/
```

If your OpenClaw skills live outside the default scan roots, add them to `skills.load.extraDirs` in `~/.openclaw/openclaw.json`:

```json
{
  "skills": {
    "load": {
      "extraDirs": ["<your-openclaw-skills-path>"]
    }
  }
}
```

Usage

/skillcompass is the single entry point. Use it with a slash command or just talk naturally — both work:

```
/skillcompass                              → see what needs attention
/skillcompass evaluate my-skill            → six-dimension quality report
"improve the nano-banana skill"            → fix weakest dimension, verify, next
"what skills haven't I used recently?"     → usage-based insights
"security scan this skill"                 → D3 security deep-dive
```

What It Does

<p align="center"> <img src="assets/skill-quality-report.png" alt="SkillCompass — Skill Quality Report" width="380" /> </p>

The score isn't the point — the direction is. You instantly see which dimension is the bottleneck and what to do about it.

Each /eval-improve round follows a closed loop: fix the weakest → re-evaluate → verify improvement → next weakest. No fix is saved unless the re-evaluation confirms it actually helped.


Six-Dimension Evaluation Model

| ID | Dimension | Weight | What it evaluates |
|----|-----------|--------|-------------------|
| D1 | Structure | 10% | Frontmatter validity, markdown format, declarations |
| D2 | Trigger | 15% | Activation quality, rejection accuracy, discoverability |
| D3 | Security | 20% | Secrets, injection, permissions, exfiltration, embedded shell |
| D4 | Functional | 30% | Core quality, edge cases, output stability, error handling |
| D5 | Comparative | 15% | Value over direct prompting (with vs without skill) |
| D6 | Uniqueness | 10% | Overlap with similar skills, model supersession risk |

`overall_score = round((D1×0.10 + D2×0.15 + D3×0.20 + D4×0.30 + D5×0.15 + D6×0.10) × 10)`
| Verdict | Condition |
|---------|-----------|
| PASS | score >= 70 AND D3 pass |
| CAUTION | 50–69, or D3 High findings |
| FAIL | score < 50, or D3 Critical (gate override) |
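The weighting formula and verdict gate can be sketched in a few lines of Node. This is an illustrative sketch, not the plugin's actual implementation; the function names, the 0–10 dimension scale, and the flag parameters are assumptions:

```javascript
// Sketch of the six-dimension scoring and verdict gate described above.
// Dimension scores are assumed to be 0-10; weights match the table.
const WEIGHTS = { D1: 0.10, D2: 0.15, D3: 0.20, D4: 0.30, D5: 0.15, D6: 0.10 };

function overallScore(dims) {
  const weighted = Object.entries(WEIGHTS)
    .reduce((sum, [id, w]) => sum + dims[id] * w, 0);
  return Math.round(weighted * 10); // scale to 0-100
}

function verdict(dims, { d3Critical = false, d3High = false } = {}) {
  const score = overallScore(dims);
  if (score < 50 || d3Critical) return 'FAIL';    // D3 Critical is a gate override
  if (score < 70 || d3High) return 'CAUTION';
  return 'PASS';                                  // score >= 70 and D3 passed
}
```

Note how the D3 gate can override the numeric score: a skill scoring 80 with a Critical security finding still fails.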

Skill Inbox — Usage-Driven Suggestions

SkillCompass passively tracks which skills you actually use and surfaces suggestions when something needs attention — unused skills, stale evaluations, declining usage, available updates, and more. 9 built-in rules, all based on real invocation data.

  • Suggestions have a lifecycle: pending → acted / snoozed / dismissed, with auto-reactivation when conditions change
  • All data stays local — no network calls unless you explicitly request updates
  • Tracking is automatic via hooks (~one line per skill invocation), zero configuration
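The lifecycle above can be modelled as a small state machine. This is a hypothetical sketch of the transitions described (pending → acted / snoozed / dismissed, with reactivation); the real inbox engine's states and rules may differ:

```javascript
// Hypothetical model of the suggestion lifecycle described above.
const TRANSITIONS = {
  pending:   ['acted', 'snoozed', 'dismissed'],
  snoozed:   ['pending'],  // auto-reactivates when conditions change
  dismissed: ['pending'],  // same: conditions changing can resurface it
  acted:     [],           // terminal
};

function transition(suggestion, next) {
  const allowed = TRANSITIONS[suggestion.state] || [];
  if (!allowed.includes(next)) {
    throw new Error(`invalid transition: ${suggestion.state} -> ${next}`);
  }
  return { ...suggestion, state: next };
}
```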

Features

Evaluate → Improve → Verify

/eval-skill scores six dimensions and pinpoints the weakest. /eval-improve targets that dimension, applies a fix, and re-evaluates — only saves when the target dimension improved and security/functionality didn't regress. Then move to the next weakness.

Skill Lifecycle

SkillCompass covers the full lifecycle of your skills — not just one-time evaluation.

Install — auto-scans your inventory, quick-checks security patterns across packages and sub-skills.

Ongoing — usage hooks passively track every invocation. Skill Inbox turns this into actionable insights: which skills are never used, which are declining, which are heavily used but never evaluated, which have updates available.

On edit — hooks auto-check structure + security on every SKILL.md write through Claude. Catches injection, exfiltration, embedded shell. Warns, never blocks.

On change — SHA-256 snapshots ensure any version is recoverable. D3 or D4 regresses after improvement? Snapshot restored automatically.

On update — update checker reads local git state passively; network only when you ask. Three-way merge preserves your local improvements region-by-region.

Scale

One skill or fifty — same workflow. /eval-audit scans a whole directory and ranks results worst-first so you fix what matters most. /eval-evolve chains multiple improve rounds automatically (default 6, stops at PASS or plateau). --ci flag outputs machine-readable JSON with exit codes for pipeline integration.


Works With Everything

No point-to-point integration needed. The Pre-Accept Gate intercepts all SKILL.md edits regardless of source.

| Tool | How it works together | Guide |
|------|-----------------------|-------|
| Claudeception | Extracts skill → auto-evaluation catches security holes + redundancy → directed fix | guide |
| Self-Improving Agent | Logs errors → feed as signals → SkillCompass maps to dimensions and fixes | guide |

Design Principles

  • Local-first: All data stays on your machine. No network calls except when you explicitly request updates.
  • Read-only by default: Evaluation and reporting are read-only. Write operations (improve, merge, rollback) require explicit opt-in.
  • Passive tracking, active decisions: Hooks collect usage data silently. Suggestions are surfaced, never auto-acted on.
  • Dual-channel UX: Keyboard-selectable choices for actions, natural language for queries. Both always available.

Feedback Signal Standard

SkillCompass defines an open feedback-signal.json schema for any tool to report skill usage data:

```
/eval-skill ./my-skill/SKILL.md --feedback ./feedback-signals.json
```

Signals: trigger_accuracy, correction_count, correction_patterns, adoption_rate, ignore_rate, usage_frequency. The schema is extensible (additionalProperties: true) — any pipeline can produce or consume this format.
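A feedback-signals file might look like the following. Only the signal field names come from the list above; the values and the flat object shape are illustrative assumptions (the schema's `additionalProperties: true` means producers may add more):

```json
{
  "trigger_accuracy": 0.92,
  "correction_count": 3,
  "correction_patterns": ["output too verbose"],
  "adoption_rate": 0.8,
  "ignore_rate": 0.05,
  "usage_frequency": 14
}
```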


License

MIT — Use, modify, distribute freely. See LICENSE for details.