Understand Your Code. Instantly.
Ever felt lost in a complex codebase? VerbalCodeAI is your personal code companion, leveraging advanced embedding techniques and Large Language Model (LLM) integration. It offers intelligent code analysis, helps you search and understand your project, and provides assistance directly within your command-line interface, making your development workflow smoother and more efficient.
Find relevant code snippets using natural language queries
Get insights about your codebase structure and dependencies
Let the AI explore and understand your codebase using various tools
Ask questions about your code and get detailed explanations
Search the web for code-related information without leaving the terminal
The AI remembers important information about your project
Analyze git history and changes
Generate concise descriptions of code files
Execute system commands with AI assistance
Integrate with other tools via the built-in HTTP API
Connect with Claude Desktop and other MCP-compatible AI assistants
Configure LLM providers, models, and behavior to suit your needs
When you first use VerbalCodeAI, it indexes your codebase to create searchable resources:
VerbalCodeAI analyzes your code files and generates vector embeddings, enabling semantic search capabilities.
Information about file types, sizes, and structure is collected to help with navigation and search.
Functions, classes, and imports are identified and indexed for quick reference.
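For intuition, the semantic search step works roughly like the sketch below. This is a generic, hypothetical illustration (the sentence-transformers model name and the hard-coded chunks are placeholders), not VerbalCodeAI's actual implementation:
from sentence_transformers import SentenceTransformer
import numpy as np

# Placeholder embedding model and code chunks; the real tool builds these from your indexed files
model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["def load_config(path): ...", "class Indexer: ..."]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

# A natural language query is embedded the same way and compared by cosine similarity
query_vec = model.encode(["where is the configuration file loaded?"], normalize_embeddings=True)[0]
scores = chunk_vecs @ query_vec
print(chunks[int(np.argmax(scores))])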
VerbalCodeAI offers two powerful ways to interact with your code:
Chat Mode: have natural conversations about your code.
Agent Mode: let the AI use tools to explore your codebase.
💡 Pro Tip: Agent Mode is more cost-effective when using cloud-based LLM providers as it makes fewer API calls.
# Windows
git clone https://github.com/vibheksoni/VerbalCodeAi.git
cd VerbalCodeAi
setup_windows.bat

# Linux
git clone https://github.com/vibheksoni/VerbalCodeAi.git
cd VerbalCodeAi
chmod +x setup_linux.sh
./setup_linux.sh
If you prefer to set up manually:
# Create and activate a virtual environment
python -m venv venv
venv\Scripts\activate      # Windows
source venv/bin/activate   # Linux
pip install -r requirements.txt
Create a .env file with your configuration (see .env.example for reference). After installation, activate your virtual environment and run:
python app.py
When you first start VerbalCodeAI, you'll be prompted to select a directory to index. This process analyzes your codebase and creates embeddings for efficient searching.
Agent Mode provides access to powerful tools:
embed_search
semantic_search
grep
regex_advanced_search
file_type_search
read_file
file_stats
directory_tree
get_file_description
get_file_metadata
find_functions
find_classes
find_usage
cross_reference
code_analysis
get_functions
get_classes
get_variables
get_imports
explain_code
git_history
version_control_search
search_imports
get_project_description
get_instructions
create_instructions_template
add_memory
get_memories
search_memories
run_command
read_terminal
kill_terminal
list_terminals
ask_buddy (with context-aware second opinions)
google_search
ddg_search
bing_news_search
fetch_webpage
get_base_knowledge
Agent Mode is the most cost-effective option when using cloud-based LLM providers. It makes fewer API calls compared to Chat Mode, which helps avoid rate limits and reduces costs. For the best experience with minimal expenses, consider using Agent Mode when working with paid API services.
VerbalCodeAI includes a built-in HTTP API server that allows you to access its functionality programmatically. This is useful for integrating VerbalCodeAI with other tools or creating custom interfaces.
To start the HTTP API server:
python app.py --serve [PORT]
Where [PORT] is the port number you want the server to listen on (the default is 8000).
The HTTP API server exposes several endpoints:
GET /api/health - Health check endpoint
POST /api/initialize - Initialize a directory
POST /api/ask - Ask a question about the code
POST /api/index/start - Start indexing a directory
GET /api/index/status - Get indexing status
Here's an example of how to use the API with cURL:
curl -X POST http://localhost:8000/api/ask \
-H "Content-Type: application/json" \
-d '{"question": "What does the main function do in this project?"}'
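The same endpoints can be called from any HTTP client; here is a minimal Python sketch using the requests library. The question field matches the cURL example above, while the directory_path field for /api/initialize is an assumption based on the MCP tool signature of the same name:
import requests

BASE_URL = "http://localhost:8000"

# Point the server at a project before asking questions (field name assumed)
requests.post(f"{BASE_URL}/api/initialize", json={"directory_path": "/path/to/your/project"})

# Ask a question about the code, as in the cURL example above
response = requests.post(f"{BASE_URL}/api/ask", json={"question": "What does the main function do in this project?"})
print(response.json())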
By default, the server only accepts connections from localhost. To allow connections from other machines, set the HTTP_ALLOW_ALL_ORIGINS environment variable to TRUE in your .env file.
VerbalCodeAI supports the Model Context Protocol (MCP), allowing you to connect it to Claude Desktop and other MCP-compatible AI assistants. This integration enables Claude to directly interact with your codebase, providing a powerful AI-assisted development experience.
Screenshot: Claude analyzing code with VerbalCodeAI via the Claude Desktop MCP integration.
The MCP server wraps the HTTP API server and provides tools for Claude to interact with VerbalCodeAI. Here's how to set it up:
First, start the HTTP API server if it's not already running:
python app.py --serve [PORT]
Where [PORT] is the port number you want the server to listen on (the default is 8000).
In a new terminal window, start the MCP server:
python mcp_server.py
The MCP server will automatically check if the HTTP API server is running and start it if needed.
You can configure the MCP server by setting the following environment variables in your .env file:
MCP_API_URL=http://localhost:8000 # URL of the HTTP API server
MCP_HTTP_PORT=8000 # Port to run the HTTP API server on
To use VerbalCodeAI with Claude Desktop:
Ensure all necessary dependencies are installed by running the following command in the repository's root directory:
pip install -r requirements.txt
Next, register the MCP server with Claude Desktop. (This typically involves placing the server files in a specific directory or using a Claude Desktop interface if available; refer to the Claude Desktop documentation for specifics.)
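As one possible approach, Claude Desktop usually registers MCP servers through its claude_desktop_config.json file; an entry for VerbalCodeAI might look like the sketch below. The server name and path are placeholders, and the config file's location depends on your platform, so treat this as illustrative rather than exact:
{
  "mcpServers": {
    "verbalcodeai": {
      "command": "python",
      "args": ["/path/to/VerbalCodeAi/mcp_server.py"]
    }
  }
}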
Restart Claude Desktop for the changes to take effect.
In Claude Desktop, you can now use the following tools:
set_api_url(url: str) -> str
health_check() -> Dict[str, str]
start_http_server_tool(port: int = None) -> Dict[str, str]
initialize_directory(directory_path: str) -> Dict[str, str]
ask_agent(question: str) -> Dict[str, str]
start_indexing(directory_path: str) -> Dict[str, str]
get_indexing_status() -> Dict[str, str]
Cursor is an AI-powered code editor that supports MCP. To use VerbalCodeAI with Cursor:
Start the MCP server:
python mcp_server.py
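Cursor typically reads MCP server definitions from a .cursor/mcp.json file in your project (or a global equivalent); a hypothetical entry, mirroring the Claude Desktop sketch above, might look like this. Names and paths are placeholders; consult Cursor's MCP documentation for the current format:
{
  "mcpServers": {
    "verbalcodeai": {
      "command": "python",
      "args": ["/path/to/VerbalCodeAi/mcp_server.py"]
    }
  }
}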
VerbalCodeAI can be configured through the .env file:
# Provider can be: ollama, google, openai, anthropic, groq, or openrouter
AI_CHAT_PROVIDER=ollama
AI_EMBEDDING_PROVIDER=ollama
AI_DESCRIPTION_PROVIDER=ollama
AI_AGENT_BUDDY_PROVIDER=ollama
# API Keys for each functionality (only needed if using that provider)
# The same key will be used for the selected provider in each category
AI_CHAT_API_KEY=None
AI_EMBEDDING_API_KEY=None
AI_DESCRIPTION_API_KEY=None
AI_AGENT_BUDDY_API_KEY=None
# Model names for each provider
# For ollama: llama2, codellama, mistral, etc. (embedding)
# For OpenAI: gpt-4, gpt-3.5-turbo, text-embedding-ada-002 (embedding)
# For OpenRouter: anthropic/claude-3-opus, openai/gpt-4-turbo, google/gemini-pro, etc.
# For Google: gemini-pro, gemini-pro-vision
# For Anthropic: claude-3-5-sonnet-latest, claude-3-opus-20240229, claude-3-haiku-20240307
# For Groq: llama3-8b-8192, llama3-70b-8192, mixtral-8x7b-32768
CHAT_MODEL=llama2
EMBEDDING_MODEL=all-minilm:33m
DESCRIPTION_MODEL=llama2
AI_AGENT_BUDDY_MODEL=llama3.2
# Optional: Site information for OpenRouter rankings
SITE_URL=http://localhost:3000
SITE_NAME=Local Development
# Performance settings (LOW, MEDIUM, MAX)
# LOW: Minimal resource usage, suitable for low-end systems
# MEDIUM: Balanced resource usage, suitable for most systems
# MAX: Maximum resource usage, suitable for high-end systems
PERFORMANCE_MODE=MEDIUM
# Maximum number of threads to use (will be calculated automatically if not set)
MAX_THREADS=16
# Cache size for embedding queries (higher values use more memory but improve performance)
EMBEDDING_CACHE_SIZE=1000
# Similarity threshold for embedding search (lower values return more results but may be less relevant)
EMBEDDING_SIMILARITY_THRESHOLD=0.05
# API Rate Limiting Settings
# Delay in milliseconds between embedding API calls to prevent rate limiting
# Recommended: 100ms for Google, 0ms for OpenAI/Ollama (set to 0 to disable)
EMBEDDING_API_DELAY_MS=100
# Delay in milliseconds between description generation API calls to prevent rate limiting
# Recommended: 100ms for Google, 0ms for OpenAI/Ollama (set to 0 to disable)
DESCRIPTION_API_DELAY_MS=100
# UI Settings
# Enable/disable markdown rendering (TRUE/FALSE)
ENABLE_MARKDOWN_RENDERING=TRUE
# Show thinking blocks in AI responses (TRUE/FALSE)
SHOW_THINKING_BLOCKS=FALSE
# Enable streaming mode for AI responses (TRUE/FALSE) # Tends to be slower for some reason # Broken for openrouter TODO: Fix this at some point !
ENABLE_STREAMING_MODE=FALSE
# Enable chat logging to save conversations (TRUE/FALSE)
CHAT_LOGS=FALSE
# Enable memory for AI conversations (TRUE/FALSE)
MEMORY_ENABLED=TRUE
# Maximum number of memory items to store
MAX_MEMORY_ITEMS=10
# Execute commands without confirmation (TRUE/FALSE)
# When FALSE, the user will be prompted to confirm before executing any command
# When TRUE, commands will execute automatically without confirmation
COMMANDS_YOLO=FALSE
# HTTP API Server Settings
# Allow connections from any IP address (TRUE/FALSE)
# When FALSE, the server only accepts connections from localhost (127.0.0.1)
# When TRUE, the server accepts connections from any IP address (0.0.0.0)
# WARNING: Setting this to TRUE may expose your API to the internet
HTTP_ALLOW_ALL_ORIGINS=FALSE
# MCP Server Settings
# URL of the HTTP API server
MCP_API_URL=http://localhost:8000
# Port to run the HTTP API server on
MCP_HTTP_PORT=8000
For the best local experience without any API costs, the developer recommends using these Ollama models:
gemma3 - Google's Gemma 3 model provides excellent code understanding and generation
all-minilm - Efficient and accurate embeddings for code search and retrieval

# Install the recommended models
ollama pull gemma3
ollama pull all-minilm
# Configure in .env
AI_CHAT_PROVIDER=ollama
AI_EMBEDDING_PROVIDER=ollama
AI_DESCRIPTION_PROVIDER=ollama
CHAT_MODEL=gemma3
EMBEDDING_MODEL=all-minilm:33m
DESCRIPTION_MODEL=gemma3
Anthropic's Claude models are particularly strong at understanding and generating code. Available models include claude-3-5-sonnet-latest, claude-3-opus-20240229, and claude-3-haiku-20240307.
Note: Anthropic does not provide embedding capabilities, so you'll need to use a different provider for embeddings.
Groq provides ultra-fast inference for popular open-source models. Available models include llama3-8b-8192, llama3-70b-8192, and mixtral-8x7b-32768.
Note: Groq does not provide embedding capabilities, so you'll need to use a different provider for embeddings.
VerbalCodeAi/
├── app.py # Main application entry point
├── mcp_server.py # MCP server wrapper
├── mcp_server_http.py # HTTP-based MCP server implementation
├── mods/ # Core modules
│ ├── banners.py # ASCII art banners
│ ├── http_api.py # HTTP API server implementation
│ ├── llms.py # LLM integration
│ ├── terminal_ui.py # Terminal UI components
│ ├── terminal_utils.py # Terminal utilities
│ └── code/ # Code processing modules
│ ├── agent_mode.py # Agent mode implementation
│ ├── decisions.py # AI decision making
│ ├── directory.py # Directory structure handling
│ ├── embed.py # Embedding generation and search
│ ├── indexer.py # File indexing
│ ├── memory.py # Memory management
│ ├── terminal.py # Terminal command execution
│ └── tools.py # Agent tools
├── integrations/ # IDE and tool integrations
│ ├── claude/ # Claude Desktop integration
│ ├── cursor/ # Cursor editor integration
│ └── README.md # Integration documentation
├── requirements.txt # Python dependencies
├── setup_windows.bat # Windows setup script
└── setup_linux.sh # Linux setup script
Join our growing community of developers using VerbalCodeAI on our Discord server!
Contributions are welcome! Please feel free to submit a Pull Request.
Create your feature branch (git checkout -b feature/amazing-feature), commit your changes (git commit -m 'Add some amazing feature'), push to the branch (git push origin feature/amazing-feature), and open a Pull Request.

This project is licensed under the MIT License - see the LICENSE file for details.