Directory coming Month 2 · Pipeline in build
AI Tools and Developer Infrastructure
sourc.dev tracks AI developer tools across six categories: code assistants, agent frameworks, RAG infrastructure, voice and audio APIs, image generation, and observability platforms. Every tool entry will track pricing, the language models it depends on, API availability, open source status, and uptime. This page maps the category, explains how the tools relate to models, and answers the questions developers ask most often.
AI Tool Category Map
Every tool tracked by sourc.dev depends on one or more language models. This diagram shows how six tool categories sit on top of the foundational model layer.
The AI tooling layer
Between the foundation models — GPT-4o, Claude, Gemini, Llama, Mistral — and the applications end users interact with, there is an infrastructure layer. This is the AI tooling layer: the frameworks, platforms, databases, and services that developers use to turn raw model capabilities into working software. It includes everything from code completion engines to agent orchestrators to vector databases to observability dashboards.
The layer barely existed before late 2022. LangChain launched in October 2022 as a Python library for chaining LLM calls together. Within 18 months it had accumulated over 90,000 GitHub stars and become the default orchestration framework for LLM applications. GitHub Copilot, which had been in technical preview since 2021, launched its paid tier in June 2022 and reached 1.8 million paid subscribers by the end of 2023. Pinecone, a vector database founded in 2019, saw its usage explode only after retrieval-augmented generation became the standard enterprise pattern in 2023.
The JetBrains 2023 Developer Ecosystem Survey found that 55% of developers were using AI coding assistants — up from near zero two years earlier. This adoption curve is faster than containers (Docker took roughly four years to reach majority developer adoption) and faster than cloud functions (AWS Lambda took three years). The speed reflects a genuine productivity gain: GitHub's own research found Copilot users completed tasks 55% faster in controlled trials.
The distinction that matters for sourc.dev: language models are the foundation layer. Tools are what developers build with. A model is an API endpoint that accepts tokens and returns tokens. A tool is the software that decides which model to call, what context to provide, how to handle errors, and how to present results. Every tool in this directory depends on at least one model. When a model's price changes, every tool built on it is affected. When a model's behaviour drifts, every tool's output shifts. Tracking both layers — and the dependency chain between them — is why sourc.dev exists.
Six categories
sourc.dev organises AI developer tools into six primary categories. Each serves a distinct function in the development stack, and each depends on language models in a different way.
1. Code assistants
Code assistants integrate directly into the IDE and provide inline completions, chat-based code generation, and automated refactoring. GitHub Copilot is the market leader with 1.8 million paid subscribers as of late 2023, generating over $100 million in annual recurring revenue for GitHub. Cursor, launched in 2023, built an entire IDE around AI-first coding and gained rapid traction among professional developers by defaulting to Claude and GPT-4o. Windsurf (formerly Codeium) offers a free tier and has accumulated over 500,000 users. These tools send your code as context to a language model and return completions — the quality of the output is directly tied to the quality of the underlying model and the context window available.
2. Agent frameworks
Agent frameworks provide the scaffolding for building autonomous AI systems that can plan, use tools, and execute multi-step workflows. An agent differs from a chatbot in one critical way: it takes actions rather than merely generating text. LangChain and its companion LangGraph handle orchestration and stateful agent workflows. CrewAI enables multi-agent collaboration where specialised agents work together on complex tasks. Microsoft AutoGen provides a conversation-based agent framework backed by Microsoft Research. The agent category is the fastest-growing segment of AI tooling — the term "AI agent" saw a 14x increase in Google search volume between January 2023 and December 2024. Production agent deployments remain challenging: error compounding across steps, unpredictable latency, and cost management are unsolved problems.
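The error-compounding point can be made concrete: if each step of an agent workflow succeeds independently with probability p, the whole n-step run succeeds with probability p^n. A minimal illustration:

```python
def workflow_success_rate(step_success: float, steps: int) -> float:
    """Probability that an n-step agent workflow completes with no failed
    step, assuming each step succeeds independently."""
    return step_success ** steps

# A per-step success rate that sounds high erodes quickly over long runs:
# 95% per step over 10 steps leaves roughly a 60% end-to-end success rate,
# and over 20 steps roughly 36%.
ten_step = workflow_success_rate(0.95, 10)
twenty_step = workflow_success_rate(0.95, 20)
```

This is why per-step reliability that looks acceptable in isolation still leaves long agent workflows failing more often than they succeed.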
3. RAG infrastructure
RAG infrastructure gives language models access to external knowledge. Think of it as the difference between a closed-book exam (the model uses only its training data) and an open-book exam (the model can look things up). The stack includes vector databases for storing and searching embeddings, document loaders for ingesting data, and chunking strategies for splitting documents into retrievable pieces. Pinecone raised $138 million in its Series B (2024) and is the most-used managed vector database. Weaviate and Chroma offer open-source alternatives. PostgreSQL users can add vector search via pgvector without adopting a new database. RAG is the dominant pattern for enterprise AI in 2024-2025 because it lets companies use proprietary data without fine-tuning — a process that is slower, more expensive, and harder to maintain.
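As one concrete piece of that stack, a chunking strategy can be as simple as fixed-size windows with overlap, so a sentence cut at a boundary still appears whole in the next chunk. This is a hypothetical minimal chunker, not taken from any specific library:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlapping edges."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

doc = "word " * 400  # a 2,000-character toy document
pieces = chunk_text(doc, chunk_size=500, overlap=50)
```

Production splitters usually break on sentence or token boundaries rather than raw characters, but the size/overlap tradeoff is the same.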
4. Voice and audio
Voice and audio APIs handle speech synthesis, speech recognition, and audio understanding. ElevenLabs reached a $1.1 billion valuation in its 2024 Series B ($80 million raised), driven by its realistic voice cloning and text-to-speech capabilities. Deepgram provides enterprise-grade speech-to-text with sub-300ms latency. OpenAI's Whisper, released as an open-source model in September 2022, became the standard for self-hosted transcription. The voice category intersects with the agent category — conversational AI agents need both speech recognition (input) and speech synthesis (output) to function in voice-based interfaces. Real-time voice is now a standard feature in customer service, healthcare documentation, and accessibility tooling.
5. Image generation
Image generation APIs produce images from text prompts (text-to-image) or modify existing images (image-to-image, inpainting, outpainting). Stability AI launched Stable Diffusion as an open-source model in August 2022 and raised $101 million — though the company faced financial difficulties by 2024. OpenAI's DALL-E 3, integrated into ChatGPT, handles tens of millions of image generations per day. Midjourney, operating without venture capital as a self-funded company, became the quality benchmark for artistic image generation. Ideogram specialises in accurate text rendering within generated images — a persistent weakness in other models. These tools depend on diffusion models rather than autoregressive language models, but increasingly integrate with LLMs for prompt enhancement and multi-modal workflows.
6. Observability
Observability platforms monitor the behaviour, cost, and quality of AI applications in production. This category exists because AI tools fail differently from traditional software. A database query either returns results or throws an error. An LLM call always returns something — the question is whether that something is correct, hallucinated, or subtly wrong. LangSmith (by LangChain) provides tracing, evaluation, and dataset management for LLM applications. AgentOps focuses on agent-specific monitoring — tracking multi-step workflows, tool calls, and decision trees. Helicone offers request-level logging and cost tracking across model providers. AI observability matters because production drift — changes in model behaviour without changes in your code — is a constant risk when your application depends on third-party models.
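The request-level logging these platforms provide can be sketched as a thin wrapper around a model call. Everything here is illustrative: the 4-characters-per-token estimate and the per-1k-token price are stand-in assumptions, not real provider rates:

```python
import time

def observe(model_call, price_per_1k_tokens: float = 0.01):
    """Wrap a model call to record latency and estimated cost per request.
    `model_call` is any callable taking a prompt and returning text."""
    log: list[dict] = []

    def wrapped(prompt: str) -> str:
        start = time.perf_counter()
        output = model_call(prompt)
        latency = time.perf_counter() - start
        tokens = (len(prompt) + len(output)) // 4  # rough 4-chars-per-token estimate
        log.append({
            "latency_s": round(latency, 4),
            "est_tokens": tokens,
            "est_cost_usd": tokens / 1000 * price_per_1k_tokens,
        })
        return output

    wrapped.log = log  # expose the request log for inspection
    return wrapped

# Usage with a stand-in model function:
fake_model = observe(lambda p: p.upper())
fake_model("hello world")
```

Real platforms add tracing across multi-step chains, evaluation scoring, and per-provider cost attribution on top of this basic shape.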
The LLM dependency map
Every tool in this directory depends on at least one language model. Cursor defaults to Claude and GPT-4o — users can switch, but the default model shapes the default experience. LangChain is model-agnostic by design, supporting OpenAI, Anthropic, Google, Mistral, and open-source models through a unified interface, but the majority of production LangChain deployments use OpenAI or Anthropic endpoints. GitHub Copilot runs on OpenAI models exclusively. ElevenLabs uses proprietary voice models but integrates with LLMs for conversational AI features.
This dependency chain means that a single pricing change at the model layer ripples through the entire tool ecosystem. When OpenAI launched GPT-4 Turbo in November 2023 at roughly a third of GPT-4's price, every tool built on GPT-4 saw its unit economics improve overnight. When Anthropic launched Claude 3.5 Sonnet with better performance at lower cost, Cursor switched its default model within weeks. When a model provider experiences an outage, every tool that depends on it goes down — unless the tool has implemented model fallback logic.
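The fallback logic mentioned above can be sketched in a few lines; the provider callables here are stand-ins for real API clients:

```python
def call_with_fallback(prompt: str, providers: list) -> str:
    """Try each (name, callable) provider in order; return the first
    successful response. Raise only if every provider fails."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stand-in providers: the primary is down, the backup answers.
def flaky_primary(prompt):
    raise TimeoutError("provider outage")

def backup(prompt):
    return f"answer to: {prompt}"

result = call_with_fallback("ping", [("primary", flaky_primary), ("backup", backup)])
```

Production fallback also has to handle differences in model behaviour, pricing, and rate limits between providers, which is why many tools ship without it.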
sourc.dev tracks these dependencies explicitly. For every tool in the directory, we document which models it uses, which are default, which are optional, and what happens when models change. This is the data layer that does not exist elsewhere — most tool directories list features, but not the model supply chain underneath.
EU and open source angle
The EU AI Act, which entered into force in August 2024, creates compliance requirements that directly affect AI tool selection. High-risk AI systems must meet transparency, documentation, and human oversight requirements. For many European organisations, this means evaluating whether AI tools can be self-hosted on EU infrastructure, whether data leaves EU jurisdiction during processing, and whether the tool provides the audit trails required for compliance.
Several production-grade AI tools are designed for self-hosting. n8n, a Berlin-based workflow automation platform with over 50,000 GitHub stars, provides AI nodes that connect to any model provider — including locally-hosted open-source models. Flowise and Langflow offer visual drag-and-drop builders for RAG pipelines and agent workflows that deploy on your own servers. For the vector database layer, Qdrant (Berlin-based) and Weaviate (Amsterdam-based) both offer self-hosted deployment alongside their managed cloud offerings.
The full open-source AI stack can run entirely on EU soil: Llama 3 or Mistral for the language model (served via Ollama or vLLM on OVHcloud, Hetzner, or Scaleway), Qdrant or pgvector for vector search, n8n or Langflow for orchestration, and open-source observability tools for monitoring. This stack eliminates dependency on US-based API providers — an increasingly relevant consideration for European enterprises navigating both the AI Act and broader data sovereignty requirements. sourc.dev tracks open-source status and self-hosting capability for every tool in the directory.
Entity listings launching Month 2. Pipeline infrastructure is in build.
Browse language models — available now.
Frequently asked questions
16 questions developers ask about AI tools, agents, RAG, embeddings, and production infrastructure. Each answer cites specific data where available.
What is an AI agent?
An AI agent is software that uses a language model to decide what action to take next, executes that action, observes the result, and repeats until a goal is reached. Unlike a single LLM prompt-response cycle, an agent maintains state across multiple steps. It can call external tools — APIs, databases, browsers, code interpreters — and route its own workflow. The term entered mainstream developer usage in 2023 with projects like AutoGPT (30k GitHub stars in one week, April 2023) and BabyAGI. Production agent frameworks today include LangGraph, CrewAI, and Microsoft AutoGen. The distinction that matters: a chatbot answers questions, an agent completes tasks.
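The decide, act, observe loop can be sketched without any real model at all; `decide` below is a scripted stand-in for an LLM call:

```python
def run_agent(goal: str, decide, tools: dict, max_steps: int = 5):
    """Minimal agent loop: the model decides an action, the runtime executes
    the matching tool, and the observation feeds the next decision.
    `decide` returns a (tool_name, argument) pair, or ("finish", answer)."""
    history = []
    for _ in range(max_steps):
        action, arg = decide(goal, history)
        if action == "finish":
            return arg, history
        observation = tools[action](arg)
        history.append((action, arg, observation))
    return None, history  # step budget exhausted without finishing

# A scripted "model" and one tool, purely for illustration:
def scripted_decide(goal, history):
    if not history:
        return "search", goal
    return "finish", history[-1][2]  # answer with the last observation

tools = {"search": lambda q: f"results for {q!r}"}
answer, trace = run_agent("capital of France", scripted_decide, tools)
```

The `max_steps` cap and the recorded `history` are the parts production frameworks elaborate heavily: step budgets bound cost, and the trace is what observability tools inspect.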
What is RAG (retrieval-augmented generation)?
Retrieval-augmented generation is a pattern where a language model receives relevant documents alongside a user query, so it can ground its answers in specific data rather than relying solely on training knowledge. The architecture has three stages: indexing (chunking documents and storing embeddings in a vector database), retrieval (finding the most relevant chunks for a query), and generation (passing those chunks to the LLM as context). The term was coined by Meta researchers Lewis et al. in a 2020 paper. RAG became the dominant enterprise AI pattern in 2023-2024 because it lets organisations use proprietary data without fine-tuning a model. Common stack: a document loader, an embedding model, a vector database (Pinecone, Weaviate, Chroma), and an LLM.
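The three stages can be sketched end to end with a toy retriever. Word overlap stands in for real embeddings here, purely to show the data flow:

```python
def index(docs: list[str]) -> list[set]:
    """Indexing stage: represent each chunk as its set of lowercase words
    (a toy stand-in for an embedding model plus vector database)."""
    return [set(d.lower().split()) for d in docs]

def retrieve(query: str, docs: list[str], idx: list[set], k: int = 1) -> list[str]:
    """Retrieval stage: rank chunks by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(range(len(docs)), key=lambda i: len(q & idx[i]), reverse=True)
    return [docs[i] for i in ranked[:k]]

def generate(query: str, context: list[str]) -> str:
    """Generation stage: a real system sends this prompt to an LLM;
    here we just assemble it to show the grounding structure."""
    return f"Context: {' '.join(context)}\nQuestion: {query}"

docs = ["The Eiffel Tower is in Paris.", "Pinecone is a vector database."]
idx = index(docs)
query = "Where is the Eiffel Tower?"
prompt = generate(query, retrieve(query, docs, idx))
```

Swap the word sets for embeddings, the sorted list for a vector database query, and the f-string for an LLM call, and this is the production architecture.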
What is a vector database?
A vector database stores high-dimensional numerical representations (embeddings) of text, images, or other data and enables fast similarity search across them. When you search a vector database, you are finding the stored items closest to your query in embedding space — not matching keywords. Leading purpose-built vector databases include Pinecone (raised $138M Series B, 2024), Weaviate, Chroma, and Qdrant. PostgreSQL users can add vector search via the pgvector extension without a separate database. Vector databases are the retrieval layer in RAG pipelines. They matter because LLMs have finite context windows — you cannot pass an entire document corpus to a model, so you retrieve the relevant slices first.
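Similarity search itself is simple to sketch; what purpose-built databases add is doing it over millions of vectors with approximate indexes (HNSW, IVF) instead of a linear scan. A brute-force version, with toy 3-dimensional vectors standing in for real embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(query: list[float], store: dict[str, list[float]]) -> str:
    """Brute-force nearest neighbour: scan every stored vector."""
    return max(store, key=lambda key: cosine(query, store[key]))

# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
```

A query vector near the "cat"/"dog" region matches those entries regardless of keywords, which is exactly the behaviour keyword search cannot provide.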
What is the difference between LangChain and LlamaIndex?
LangChain is a general-purpose framework for building LLM-powered applications — chains, agents, tool use, memory, and orchestration across multiple model providers. It reached 90,000 GitHub stars within 18 months of its October 2022 launch. LlamaIndex (formerly GPT Index) is narrower: it focuses specifically on connecting LLMs with external data sources for RAG workflows. It provides data connectors, indexing strategies, and query engines optimised for retrieval. In practice, many teams use both: LlamaIndex for data ingestion and retrieval, LangChain for orchestration and agent logic. LangChain is broader, LlamaIndex is deeper on the data-connection problem.
Should I fine-tune or use RAG?
Use RAG when your data changes frequently, when you need source attribution, or when you want to avoid the cost and complexity of training. Use fine-tuning when you need to change the model's behaviour, tone, or output format consistently, or when you are working with a specialised domain where the base model underperforms. RAG is cheaper and faster to implement — you can have a working prototype in hours. Fine-tuning requires curated datasets, GPU compute (or API fine-tuning credits), and evaluation infrastructure. Most production systems in 2024-2025 use RAG. Fine-tuning is reserved for cases where prompting and retrieval are demonstrably insufficient. Some teams combine both: fine-tune a smaller model for domain-specific language, then augment it with RAG for up-to-date facts.
How big is the AI tooling market?
Grand View Research estimated the global AI developer tools market at $7.2 billion in 2024, projecting 32% CAGR through 2030. Venture capital investment in AI infrastructure and tooling exceeded $25 billion in 2023-2024 combined, according to PitchBook data. GitHub Copilot alone generated over $100 million ARR by late 2023 with 1.8 million paid subscribers. The market spans code assistants, agent frameworks, RAG infrastructure, observability, vector databases, voice APIs, and image generation tools. These figures do not include the LLM providers themselves (OpenAI, Anthropic, Google) — they count only the tool and infrastructure layer built on top of models.
What are embeddings?
Embeddings are numerical vectors — lists of floating-point numbers, typically 256 to 3,072 dimensions — that represent the semantic meaning of text, images, or other data. Two pieces of text with similar meaning will have embeddings that are close together in vector space, measured by cosine similarity or dot product. Embeddings are generated by specialised models: OpenAI's text-embedding-3-small, Cohere's embed-v3, or open-source models like BGE and E5. They are the bridge between human-readable content and machine-searchable space. Every RAG pipeline, semantic search engine, and recommendation system built on LLMs uses embeddings as its foundational data structure.
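The core property, that similar inputs map to nearby vectors, can be demonstrated with a deliberately crude stand-in embedding based on character bigrams:

```python
import math
from collections import Counter

def toy_embed(text: str) -> Counter:
    """A toy embedding: counts of character bigrams. Real embedding models
    (text-embedding-3-small, BGE, E5) learn dense vectors, but the property
    shown here is the same: similar text maps to nearby vectors."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))

def cosine_sim(a: Counter, b: Counter) -> float:
    """Cosine similarity over sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

close = cosine_sim(toy_embed("vector database"), toy_embed("vector databases"))
far = cosine_sim(toy_embed("vector database"), toy_embed("croissant recipe"))
```

Learned embeddings improve on this by capturing meaning rather than surface characters, so "car" and "automobile" also land close together.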
Should I use a hosted or self-hosted agent platform?
Hosted platforms (LangSmith, AgentOps) reduce operational burden — you get logging, tracing, evaluation, and deployment without managing infrastructure. Self-hosted platforms (n8n, Flowise, Langflow) give you full control over data residency, cost, and customisation. The decision often comes down to data sensitivity and regulatory requirements. If you are processing PII, health data, or financial records, self-hosting may be required for compliance. If you are an early-stage team iterating quickly, hosted platforms save weeks of infrastructure work. EU-based teams often favour self-hosting to meet GDPR and AI Act requirements. Cost crossover typically happens around 10,000 agent runs per month — below that, hosted is cheaper; above it, self-hosted infrastructure amortises.
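The crossover claim is a simple fixed-versus-variable cost calculation. The prices below are illustrative assumptions chosen to reproduce the 10,000-run crossover, not vendor quotes:

```python
def monthly_cost(runs: int, hosted_per_run: float = 0.02,
                 selfhost_fixed: float = 150.0,
                 selfhost_per_run: float = 0.005) -> dict:
    """Compare hosted (pure per-run pricing) against self-hosted
    (fixed infrastructure plus a smaller per-run cost)."""
    return {
        "hosted": runs * hosted_per_run,
        "self_hosted": selfhost_fixed + runs * selfhost_per_run,
    }

def break_even(hosted_per_run: float = 0.02, selfhost_fixed: float = 150.0,
               selfhost_per_run: float = 0.005) -> float:
    """Runs per month at which the two options cost the same."""
    return selfhost_fixed / (hosted_per_run - selfhost_per_run)
```

With these assumed numbers, hosted is cheaper below 10,000 runs per month and self-hosted above it; plug in your own quotes to find your actual crossover.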
What is production drift in AI tools?
Production drift occurs when an AI tool's behaviour changes without any modification to your code. This happens because the underlying LLM is updated (OpenAI has updated GPT-4 multiple times), because API pricing changes, because rate limits shift, or because the tool vendor changes default model routing. In March 2024, several Cursor users reported different code completion quality after an unannounced model switch. Drift is the core reason AI observability tools exist — LangSmith, Helicone, and AgentOps monitor output quality, latency, and cost over time so teams can detect when something changes upstream. sourc.dev tracks drift across tools and models as a first-class metric.
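A minimal drift check compares a recent metric window against a baseline; real observability platforms do far more, but the shape is this:

```python
def detect_drift(baseline: list[float], recent: list[float],
                 threshold: float = 0.1) -> bool:
    """Flag drift when the mean of a recent metric window (eval score,
    latency, cost per request) moves more than `threshold` relative to
    the baseline mean. A deliberately simple illustration."""
    base = sum(baseline) / len(baseline)
    now = sum(recent) / len(recent)
    return abs(now - base) / base > threshold

# Hypothetical eval scores before and after an unannounced upstream change:
before = [0.91, 0.89, 0.92, 0.90]
after = [0.78, 0.74, 0.80, 0.77]
drifted = detect_drift(before, after)
```

The hard part in practice is not the comparison but having a trusted baseline at all, which is why evaluation datasets are a core feature of tools like LangSmith.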
What is function calling?
Function calling (also called tool use) is a capability where a language model outputs structured JSON describing which function to call and with what arguments, rather than generating free-form text. OpenAI introduced function calling in June 2023. Anthropic followed with tool use in Claude. Function calling is the mechanism that makes agents work — the model decides it needs to search a database, call an API, or run code, and it outputs a structured instruction that your application executes. Without function calling, agents would need brittle text parsing to extract actions from model output. It is now supported by OpenAI, Anthropic, Google, Mistral, and most open-source models via frameworks like Ollama and vLLM.
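Dispatching such a structured call is straightforward; the JSON shape below is simplified relative to any specific provider's schema:

```python
import json

def dispatch(model_output: str, tools: dict) -> str:
    """Execute a structured tool call emitted by a model. The
    {"name": ..., "arguments": {...}} shape mirrors the common provider
    format but is reduced for illustration."""
    call = json.loads(model_output)
    fn = tools[call["name"]]
    return fn(**call["arguments"])

# A registry of callable tools the application exposes to the model:
tools = {"get_weather": lambda city, unit="celsius": f"18 degrees {unit} in {city}"}

# What a model might return instead of free-form text:
model_output = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
result = dispatch(model_output, tools)
```

This is the "structured instruction that your application executes" from the paragraph above; production code would validate the arguments against a schema before calling anything.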
Can I self-host AI tools in the EU?
Yes. Several production-grade AI tools are designed for self-hosting on EU infrastructure. n8n (Berlin-based, open source, 50k+ GitHub stars) provides workflow automation with AI nodes. Flowise and Langflow offer visual agent builders that run on your own servers. For vector databases, Qdrant (Berlin-based) and Weaviate (Amsterdam-based) both offer self-hosted deployment. Open-source LLMs like Llama 3, Mistral, and Mixtral can be served via Ollama or vLLM on EU cloud providers (OVHcloud, Hetzner, Scaleway). The EU AI Act, effective August 2024, creates compliance requirements that make self-hosting attractive for high-risk AI applications. The full stack — model, vector database, orchestration, and observability — can run entirely on EU soil.
Should I use no-code or code-first AI tools?
No-code tools (Flowise, Langflow, n8n, Dify) let non-engineers build AI workflows visually. They are excellent for prototyping, internal tools, and teams without dedicated AI engineers. Code-first tools (LangChain, LlamaIndex, Haystack) offer full control over prompts, retrieval strategies, error handling, and deployment. The tradeoff is development speed versus production flexibility. No-code tools hit limits when you need custom retrieval logic, complex error recovery, or integration with internal systems. A common pattern: prototype in a visual builder, then rewrite in code for production. Teams with strong engineering culture tend to start code-first. Teams solving business problems with constrained scope do well with no-code.
What is Model Context Protocol (MCP)?
Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 for connecting AI models to external data sources and tools. It defines a universal interface — similar to how USB standardised hardware connections — so that any MCP-compatible model can access any MCP-compatible data source without custom integration code. Before MCP, every tool-model connection required bespoke implementation. MCP provides a client-server architecture where MCP servers expose resources (files, databases, APIs) and MCP clients (AI applications) consume them. Adoption accelerated in early 2025 with support from Cursor, Windsurf, and other code assistants. MCP matters because it reduces the integration cost of connecting AI tools to enterprise data from weeks to hours.
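At the wire level, MCP messages are JSON-RPC 2.0. The sketch below builds a tools/call request in that shape; treat it as an illustration of the message format, not a conformant MCP client:

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP-style tool invocation as a JSON-RPC 2.0 request.
    Method and params follow the published tools/call shape, simplified."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# A client asking a hypothetical MCP server to run one of its tools:
msg = mcp_tool_call(1, "query_database", {"sql": "SELECT 1"})
```

The standardisation win is that every server exposes tools and resources through the same request shapes, so a client written once can talk to any of them.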
How much venture capital has gone into AI tooling?
PitchBook data shows over $25 billion in venture capital invested in AI infrastructure and tooling companies during 2023-2024. Key rounds include: Weaviate ($50M Series B, 2023), Pinecone ($138M Series B, 2024), ElevenLabs ($80M Series B at $1.1B valuation, 2024), LangChain ($25M Series A, 2023), and Stability AI ($101M Seed, 2022). The AI tooling layer receives roughly 15-20% of total AI venture investment — the majority goes to foundation model companies. Europe-based tooling companies have raised significant rounds: Mistral AI ($415M Series A, 2023), n8n ($16M Series A, 2023), and Qdrant ($28M Series A, 2024). The pace has not slowed — Q1 2025 saw continued investment in agent infrastructure and observability.
How do I evaluate AI tools for production use?
Evaluate on five axes: reliability (uptime, error rates, SLA guarantees), cost predictability (per-token pricing, rate limits, overage charges), model dependency (which LLMs does the tool use, and what happens when those models change), data handling (where is data processed, stored, and logged), and lock-in risk (can you export data, switch providers, or self-host). Run a proof-of-concept with production-like data, not demo datasets. Measure latency at your expected throughput, not in isolation. Check the tool's LLM dependency chain — if it defaults to a single model provider, a price increase or outage affects you directly. Read the terms of service for data retention and training policies. sourc.dev tracks these dimensions for every tool in the directory.
When did the AI tooling category emerge?
The AI tooling category emerged in late 2022 and early 2023, triggered by two events: the release of ChatGPT (November 2022) and the availability of the GPT-3.5 and GPT-4 APIs. LangChain launched in October 2022 as a Python library for chaining LLM calls and reached 90,000 GitHub stars by mid-2024. GitHub Copilot launched its paid tier in June 2022. Pinecone, founded in 2019 as a vector database, saw explosive growth only after RAG became a standard pattern in 2023. The category matured through 2024 with the emergence of agent frameworks, observability platforms, and standardised protocols like MCP. By 2025, AI tooling was a recognised infrastructure category with its own venture capital thesis, conference tracks, and job titles.