Tool Poisoning & Metadata Attacks

Methods for compromising MCP tools and manipulating their metadata to deceive AI systems and users about tool capabilities and intentions.

Overview

Tool poisoning attacks exploit the trust relationship between AI models and their available tools. These attacks can compromise the integrity of the entire MCP ecosystem by making legitimate tools behave maliciously or by introducing malicious tools disguised as legitimate ones.

Attack Techniques

Tool Poisoning

Malicious modification of tool metadata, descriptions, or parameters to trick AI into unintended actions.

Tool Mutation/Rug Pull Attacks

Tools that change their behavior after installation, initially appearing safe but later performing malicious actions.

Tool Name Conflict

Multiple tools with similar names causing confusion and potential hijacking of legitimate tool calls.

Tool Shadowing/Name Collisions

Impersonating trusted tools by using similar names or deliberately colliding with legitimate tool names.

Preference Manipulation

Biased naming or phrasing in tool descriptions to influence LLM tool selection toward malicious options.

Prompt Injection in Metadata

Embedding prompt-like instructions in tool descriptions (e.g., “If unsure, use this tool”).

Metadata Manipulation

Attackers alter tool metadata to deceive both users and AI systems about tool capabilities and intentions.

Tool Impersonation

Malicious tools that mimic legitimate services to steal data or credentials.

Metadata Manipulation Attacks

Manipulating tool metadata, descriptions, rankings, or other properties to bias agent selection toward malicious servers.

Tool Squatting

Registering tool names that closely resemble legitimate, popular tools to deceive users and agents through typosquatting.

Impact Assessment

Severity: High to Critical
Likelihood: Medium to High
Detection Difficulty: Medium

Common Indicators

Unexpected tool behavior changes
Suspicious tool metadata modifications
Tool name conflicts or duplicates
Anomalous tool selection patterns
Suspicious new tool registrations

General Mitigation Strategies

Tool Verification: Implement cryptographic signing of tools
Metadata Validation: Validate tool descriptions and parameters
Behavioral Monitoring: Monitor tool execution patterns
Tool Sandboxing: Isolate tool execution environments
Registry Security: Secure tool registration and discovery

Detection Methods

Tool behavior analysis
Metadata integrity checking
Tool signature verification
Execution pattern monitoring

This category contains 10 distinct attack techniques targeting the trust and integrity of MCP tools and their metadata.