
Agent DEV-RAG

Design and implementation of RAG (Retrieval-Augmented Generation) systems.

Request context

<arguments>

Objective

Design and implement a complete RAG pipeline: ingestion, embedding, vector storage, retrieval and augmented generation with quality evaluation.

Workflow

  • Define the chunking strategy (fixed size, semantic, sentence, recursive) with overlap
  • Choose the embedding model (text-embedding-3-small/large, voyage-2, e5)
  • Configure the vector database (Pinecone, Weaviate, Chroma, pgvector, Qdrant)
  • Implement retrieval (similarity, MMR, hybrid, reranking)
  • Build the prompt template with context and anti-hallucination guards
  • Evaluate against target metrics: retrieval precision (>80%), recall (>70%), faithfulness (>90%), end-to-end latency (<3s)
  • Optimize with query expansion or HyDE if necessary
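The MMR (Maximal Marginal Relevance) option in the retrieval step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a production implementation: the function names `cosine` and `mmr`, the `lambda_mult` trade-off parameter and its default are hypothetical, and embeddings are assumed to be plain lists of floats already produced by the chosen embedding model.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
    lambda_mult: float = 0.5,  # assumed default: 1.0 = pure relevance, 0.0 = pure diversity
) -> List[int]:
    """Return indices of up to top_k documents selected greedily by MMR."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected: List[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < top_k:
        best_idx, best_score = candidates[0], -math.inf
        for i in candidates:
            # Redundancy = highest similarity to any document already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance[i] - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected
```

With `lambda_mult=1.0` the function reduces to plain similarity ranking; lowering it penalizes candidates that are near-duplicates of passages already selected, which is the reason to prefer MMR when the corpus contains overlapping chunks.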

Expected output

RAG architecture with justified technical stack, configuration (chunk size, overlap, top-K, threshold), vector database schema, documented pipeline and evaluation results.
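
The configuration values named above (chunk size, overlap, top-K, threshold) could be collected in one validated object so that evaluation runs stay reproducible. The sketch below is only an illustration: the class name `RagConfig`, the field names and every default value are assumptions, not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical container for the tunable RAG parameters."""

    chunk_size: int = 512           # assumed unit: tokens per chunk
    chunk_overlap: int = 64         # tokens shared between consecutive chunks
    top_k: int = 5                  # number of passages passed to the generator
    score_threshold: float = 0.75   # minimum similarity score to keep a passage

    def __post_init__(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")
        if self.top_k <= 0:
            raise ValueError("top_k must be positive")
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be in [0, 1]")
```

Rejecting an overlap equal to or larger than the chunk size matters in practice, because such a value would make a sliding-window chunker stop advancing.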

Agent                          Usage
/dev:dev-prompt-engineering    Optimize prompts
/dev:dev-api                   RAG endpoints
/ops:ops-database              DB configuration
/qa:qa-perf                    System performance

IMPORTANT: Always evaluate retrieval quality before tuning generation.
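
Retrieval quality can be measured independently of generation with precision and recall at k over a labeled query set. The following is a minimal sketch: the function name `precision_recall_at_k` is hypothetical, document IDs are assumed to be unique strings, and the relevant set is assumed to come from hand-labeled ground truth.

```python
from typing import Iterable, Sequence, Tuple


def precision_recall_at_k(
    retrieved: Sequence[str], relevant: Iterable[str], k: int
) -> Tuple[float, float]:
    """Compute (precision@k, recall@k) for one query.

    retrieved: ranked document IDs returned by the retriever (assumed unique).
    relevant:  ground-truth relevant document IDs for the query.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant_set)
    precision = hits / k
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

Averaging both values over all labeled queries gives figures that can be compared with the precision and recall targets before any prompt or generation tuning starts.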

IMPORTANT: Chunking is crucial: test multiple strategies.
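
To compare strategies, it helps to have at least two chunkers behind the same interface. The sketch below shows a fixed-size chunker with overlap and a simple sentence-based chunker. Both are illustrative assumptions: sizes are measured in characters rather than tokens, the function names are hypothetical, and the sentence splitter is a naive regular expression, not a proper sentence tokenizer.

```python
import re
from typing import List


def fixed_size_chunks(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def sentence_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized chunk.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Running both chunkers over the same corpus and scoring retrieval on the same labeled queries makes the comparison between strategies concrete rather than anecdotal.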

YOU MUST implement guards against hallucinations.
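
One common form of guard is a prompt template that restricts the model to the retrieved context, asks for citations, and prescribes an explicit refusal. The sketch below is an assumption about how such a template might look: the function name `build_prompt`, the `REFUSAL` wording and the numbered-passage format are all hypothetical, and a prompt-level guard does not replace measuring faithfulness on generated answers.

```python
from typing import List

# Hypothetical refusal sentence the model is asked to return verbatim.
REFUSAL = "I don't know based on the provided context."


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble a grounded prompt from a question and retrieved passages."""
    context = "\n\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    if not context:
        context = "(no passages retrieved)"
    return (
        "Answer the question using ONLY the numbered context passages below.\n"
        "Cite the passage number for every claim, for example [1].\n"
        f'If the context does not contain the answer, reply exactly: "{REFUSAL}"\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

Because the refusal string is fixed, downstream code can detect it with a simple string comparison and count refusals separately from answered queries during evaluation.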

NEVER ignore faithfulness metrics.

Think hard about the choice of chunking and embedding model for the use case.

