Retrieval Augmented Generation (RAG) is a technique for overcoming the limitations of large language models with long-form content, factual accuracy, and context awareness.

Spring AI supports RAG through a modular architecture that lets you build custom RAG flows yourself or use out-of-the-box RAG flows via the Advisor API.

1. Advisor

Spring AI provides out-of-the-box support for common RAG flows through the Advisor API.

To use QuestionAnswerAdvisor or VectorStoreChatMemoryAdvisor, add the spring-ai-advisors-vector-store dependency to your project:

<dependency>
   <groupId>org.springframework.ai</groupId>
   <artifactId>spring-ai-advisors-vector-store</artifactId>
</dependency>

1.1. QuestionAnswerAdvisor

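A vector database stores data that the AI model is unaware of. When a user question is sent to the AI model, QuestionAnswerAdvisor queries the vector database for documents related to that question.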

The response from the vector database is appended to the user text, giving the AI model context for generating its response.
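The augmentation step described above can be pictured with a small, framework-free sketch. This is not Spring AI's implementation: the class name ContextAugmentationSketch, the method augment, and the exact separator text are hypothetical, chosen only to show retrieved text being appended to the user text.

```java
import java.util.List;

// Hypothetical illustration only; Spring AI's QuestionAnswerAdvisor does this internally
// with its own template. All names here are assumptions made for the sketch.
class ContextAugmentationSketch {

    // Appends the retrieved document texts to the user text so the model
    // receives the question followed by supporting context.
    static String augment(String userText, List<String> retrievedDocs) {
        if (retrievedDocs.isEmpty()) {
            // Design choice of this sketch: with nothing retrieved, pass the question through.
            return userText;
        }
        String context = String.join("\n", retrievedDocs);
        return userText
                + "\n\nContext information is below.\n"
                + "---------------------\n"
                + context + "\n"
                + "---------------------";
    }

    public static void main(String[] args) {
        String prompt = augment(
                "Which module handles retrieval?",
                List.of("Document one text.", "Document two text."));
        System.out.println(prompt);
    }
}
```

The model then answers from the combined text, which is why data it was never trained on can still appear in its response.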

Assuming you have already loaded data into a VectorStore, you can perform Retrieval Augmented Generation (RAG) by providing a QuestionAnswerAdvisor instance to the ChatClient.

ChatResponse response = ChatClient.builder(chatModel)
        .build().prompt()
        .advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
        .user(userText)
        .call()
        .chatResponse();

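In this example, QuestionAnswerAdvisor performs a similarity search over all documents in the vector database. To restrict the types of documents searched, SearchRequest accepts an SQL-like filter expression that is portable across all VectorStore implementations.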

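This filter expression can be configured when the QuestionAnswerAdvisor is created, in which case it applies to every ChatClient request, or it can be supplied at runtime on a per-request basis.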

The following creates a QuestionAnswerAdvisor instance with a similarity threshold of 0.8 that returns the top 6 results:

var qaAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
        .searchRequest(SearchRequest.builder().similarityThreshold(0.8d).topK(6).build())
        .build();
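To make the two settings concrete, here is a minimal plain-Java sketch of what a similarity threshold and a top-K limit do to a list of scored candidates. It assumes the threshold is inclusive and that results are ordered by descending score; the class ThresholdTopKSketch, the Scored type, and the select method are hypothetical and are not part of Spring AI.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of threshold + top-K selection; a real VectorStore
// performs this inside the database. All names are assumptions of the sketch.
class ThresholdTopKSketch {

    // A candidate document id paired with its similarity score.
    static final class Scored {
        final String id;
        final double score;

        Scored(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Keeps candidates scoring at or above the threshold, orders them by
    // descending score, and returns at most topK ids.
    static List<String> select(List<Scored> candidates, double similarityThreshold, int topK) {
        List<Scored> kept = new ArrayList<>();
        for (Scored candidate : candidates) {
            if (candidate.score >= similarityThreshold) {
                kept.add(candidate);
            }
        }
        kept.sort(Comparator.comparingDouble((Scored s) -> s.score).reversed());

        List<String> ids = new ArrayList<>();
        for (Scored s : kept.subList(0, Math.min(topK, kept.size()))) {
            ids.add(s.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Scored> candidates = List.of(
                new Scored("a", 0.95), new Scored("b", 0.79),
                new Scored("c", 0.80), new Scored("d", 0.90));
        System.out.println(select(candidates, 0.8, 6));
    }
}
```

A higher threshold trades recall for precision, while topK caps how much context is appended to the prompt.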

1.1.1. Dynamic Filter Expressions

Use the FILTER_EXPRESSION advisor context parameter to update the SearchRequest filter expression at runtime:

ChatClient chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore)
        .searchRequest(SearchRequest.builder().build())
        .build())
    .build();

// Update filter expression at runtime
String content = this.chatClient.prompt()
    .user("Please answer my question XYZ")
    .advisors(a -> a.param(QuestionAnswerAdvisor.FILTER_EXPRESSION, "type == 'Spring'"))
    .call()
    .content();

The FILTER_EXPRESSION parameter lets you filter the search results dynamically according to the supplied expression.
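Conceptually, an expression such as type == 'Spring' restricts the candidate set to documents whose metadata matches before similarity ranking is applied. The plain-Java sketch below shows only that effect for a single equality test. It is not how any VectorStore evaluates filters, and the names MetadataFilterSketch, Doc, and filterEq are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of metadata filtering; real vector stores apply the
// filter natively. All names are assumptions of the sketch.
class MetadataFilterSketch {

    // A document reduced to its text and metadata map.
    static final class Doc {
        final String text;
        final Map<String, Object> metadata;

        Doc(String text, Map<String, Object> metadata) {
            this.text = text;
            this.metadata = metadata;
        }
    }

    // Mirrors the effect of the expression  key == 'value' : keeps only documents
    // whose metadata contains the key with an equal value.
    static List<Doc> filterEq(List<Doc> docs, String key, Object value) {
        List<Doc> kept = new ArrayList<>();
        for (Doc doc : docs) {
            if (value.equals(doc.metadata.get(key))) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Spring AI overview", Map.of("type", "Spring")),
                new Doc("Unrelated note", Map.of("type", "Other")),
                new Doc("Untyped note", Map.of()));
        for (Doc doc : filterEq(docs, "type", "Spring")) {
            System.out.println(doc.text);
        }
    }
}
```

Documents that lack the metadata key are excluded in this sketch, which is a design choice made here rather than a documented guarantee.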

1.1.2. Custom Template

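QuestionAnswerAdvisor uses a default template to augment the user question with the retrieved documents. You can customize this behavior by supplying your own PromptTemplate object through the .promptTemplate() builder method.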

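The PromptTemplate provided here customizes how the advisor merges the retrieved context with the user query. This is distinct from configuring a TemplateRenderer on the ChatClient itself (using .templateRenderer()), which affects how the initial user/system prompt content is rendered before the advisor runs. For more details on client-level template rendering, see ChatClient Prompt Templates.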

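The custom PromptTemplate can use any TemplateRenderer implementation (by default, it uses StTemplateRenderer, which is based on the StringTemplate engine). The important requirement is that the template must contain the following two placeholders: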

a query placeholder to receive the user question.
a question_answer_context placeholder to receive the retrieved context.

PromptTemplate customPromptTemplate = PromptTemplate.builder()
    .renderer(StTemplateRenderer.builder().startDelimiterToken('<').endDelimiterToken('>').build())
    .template("""
            <query>

            Context information is below.

            ---------------------
            <question_answer_context>
            ---------------------

            Given the context information and no prior knowledge, answer the query.

            Follow these rules:

            1. If the answer is not in the context, just say that you don't know.
            2. Avoid statements like "Based on the context..." or "The provided information...".
            """)
    .build();

String question = "Where does the adventure of Anacletus and Birba take place?";

QuestionAnswerAdvisor qaAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
    .promptTemplate(customPromptTemplate)
    .build();

String response = ChatClient.builder(chatModel).build()
    .prompt(question)
    .advisors(qaAdvisor)
    .call()
    .content();
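The two-placeholder requirement can be illustrated without any framework. The sketch below checks that a template using '<' and '>' delimiters contains both query and question_answer_context, then substitutes values in a single pass. It stands in for a template engine and is not StringTemplate or Spring AI code; PlaceholderTemplateSketch and render are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a template renderer using '<' and '>' delimiters.
// All names are assumptions of the sketch.
class PlaceholderTemplateSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("<([A-Za-z_]+)>");
    private static final String[] REQUIRED = {"query", "question_answer_context"};

    // Validates that both required placeholders are present, then replaces every
    // <name> occurrence with its value from the variables map.
    static String render(String template, Map<String, String> variables) {
        for (String name : REQUIRED) {
            if (!template.contains("<" + name + ">")) {
                throw new IllegalArgumentException("Template is missing placeholder: " + name);
            }
        }
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder rendered = new StringBuilder();
        while (matcher.find()) {
            String value = variables.get(matcher.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + matcher.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(rendered, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(rendered);
        return rendered.toString();
    }

    public static void main(String[] args) {
        String template = "<query>\n---------------------\n<question_answer_context>\n---------------------";
        System.out.println(render(template, Map.of(
                "query", "Where does the adventure take place?",
                "question_answer_context", "Retrieved passage text goes here.")));
    }
}
```

Substituting in one pass means a retrieved passage that happens to contain placeholder-like text is inserted verbatim rather than expanded again.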

The QuestionAnswerAdvisor.Builder.userTextAdvise() method is deprecated in favor of .promptTemplate(), which allows more flexible customization.

1.2. RetrievalAugmentationAdvisor

https://docs.springframework.org.cn/spring-ai/reference/api/retrieval-augmented-generation.html#_retrievalaugmentationadvisor

2. Modules

https://docs.springframework.org.cn/spring-ai/reference/api/retrieval-augmented-generation.html#modules

ETL Pipeline

https://docs.springframework.org.cn/spring-ai/reference/api/etl-pipeline.html
