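How to Add Web Access to ChatGPT
Background
As we all know, ChatGPT cannot access recent data. For example, when I asked it about the Silicon Valley Bank incident, this is the answer I got:
New Bing can of course give us the result we want. But apart from New Bing, can we add web access to ChatGPT ourselves?
Solution
The obvious tool for this problem is a search engine. So we now have two tools, ChatGPT and a search engine. Let's look at what kind of workflow gets us to our goal.
Providing Context
If we include the information GPT does not know when we ask ChatGPT a question, it can give a good answer. For example:
We can get this information from our search engine tool. The problem is how to obtain the search keywords, because we do not know in advance what information GPT needs.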
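As a minimal sketch of "asking with context", the snippet below prepends retrieved text to the question. The function name `build_context_prompt`, the prompt wording, and the placeholder snippets are assumptions made for illustration; they are not taken from the original implementation.

```python
from typing import List


def build_context_prompt(question: str, snippets: List[str]) -> str:
    """Prepend retrieved text to the question so the model can use it."""
    # Skip empty snippets and put each remaining one on its own bullet line.
    context = "\n".join(f"- {s.strip()}" for s in snippets if s.strip())
    return (
        "Use the following context to answer the question.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder snippets; in practice they would come from the search engine.
prompt = build_context_prompt(
    "What happened to the bank?",
    ["  First placeholder search snippet. ", "", "Second placeholder search snippet."],
)
print(prompt)
```

The resulting string would then be sent to the model as the user message.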
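Letting GPT Generate the Search Keywords
A good approach is to have GPT itself produce the search keywords, which presumably relies on GPT's reasoning ability. The prompt below is the most important part: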
prompt:
'''Answer the following questions as best you can.
You have access to the following tools:\n\nBing Search:
A wrapper around Bing Search. Useful for when you need to answer questions
about current events. Input should be a search query.\n\nUse the following
format:\n\nQuestion: the input question you must answer\nThought:
you should always think about what to do\nAction: the action to take,
should be one of [Bing Search]\nAction Input: the input to the action\n
Observation: the result of the action\n... (this Thought/Action/Action
Input/Observation can repeat N times)\nThought:
I now know the final answer\nFinal Answer:
the final answer to the original input question\n\nBegin!\n\nQuestion:
${question}\nThought:'''
With this prompt, GPT outputs the search keywords it needs, in the Action Input field defined in the prompt.
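Pulling the keywords out of the model output can be sketched as follows. The sample output is hypothetical, and stripping surrounding quotes is an extra step assumed here, not something the original code does.

```python
import re
from typing import Optional

# Capture everything after "Action Input:" up to the end of that line.
ACTION_INPUT_RE = re.compile(r"Action Input:\s*(.*)")


def extract_action_input(model_output: str) -> Optional[str]:
    """Return the search query the model asked for, or None if there is none."""
    match = ACTION_INPUT_RE.search(model_output)
    if match is None:
        return None
    # Models sometimes wrap the query in quotes; remove them (an assumption).
    query = match.group(1).strip().strip('"')
    return query or None


# Hypothetical model output in the Thought/Action/Action Input format.
sample_output = (
    " I need to look up recent news about this.\n"
    "Action: Bing Search\n"
    'Action Input: "Silicon Valley Bank news"'
)
print(extract_action_input(sample_output))
```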
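We then pass the keywords produced by GPT to a search engine API to fetch the latest information. Finally, we feed the search results back to GPT together with the original question to get the final answer.
Note that this describes only the simplest case. When the question involves a lot of information GPT does not know, GPT may search several times.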
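The loop described above can be sketched as follows. This is a simplified illustration, not the implementation shown later: the prompt is shortened to the question only, `max_steps` is an assumed safety limit, and `fake_llm` and `fake_search` are hypothetical stand-ins for the real model and search API.

```python
import re
from typing import Callable


def run_agent(question: str,
              llm: Callable[[str], str],
              search: Callable[[str], str],
              max_steps: int = 5) -> str:
    """Alternate between the model and the search tool until a final answer appears."""
    prompt = f"Question: {question}\nThought:"
    for _ in range(max_steps):
        output = llm(prompt)
        final = re.search(r"Final Answer:\s*(.*)", output, re.DOTALL)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action Input:\s*(.*)", output)
        if action is None:
            break  # The model neither answered nor asked for a search.
        observation = search(action.group(1).strip())
        # Append this round so the model sees what it has already learned.
        prompt += f"{output}\nObservation: {observation}\nThought:"
    return ""  # No final answer within max_steps.


# Hypothetical stand-ins, for illustration only.
def fake_llm(prompt: str) -> str:
    if "Observation:" in prompt:
        return " I now know the final answer\nFinal Answer: 42"
    return " I should search.\nAction: Bing Search\nAction Input: the answer"


def fake_search(query: str) -> str:
    return f"placeholder result for '{query}'"


print(run_agent("What is the answer?", fake_llm, fake_search))
```

With these stand-ins the first round triggers one search and the second round returns the final answer.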
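Complete Flow
Here is our flow chart:
Implementation
Implementing from Scratch
Following this approach, we used Cursor to write a Node.js Express application that implements this feature. Below is the server side (part of this code was written by Cursor).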
const express = require('express');
const app = express();
const cors = require('cors');
const axios = require('axios');

app.use(cors());
app.use(express.json());

// Simple endpoint for checking that the server is running.
app.get('/test', (req, res) => {
  console.log('hello ai!');
  res.send('hello ai!');
});
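app.post('/chat_ai_with_internal', async (req, res) => {
  // Handle the POST request.
  const { query_message, pre_message } = req.body;
  if (!query_message || query_message === '') {
    console.log('error:', 'query_message cannot be empty');
    res.status(400).send({ error: 'query_message cannot be empty' });
    return;
  }
  console.log('queryMessage:', query_message);
  try {
    let result = await getFinalAnswer(query_message, pre_message);
    // The prompt is in English, so ask GPT to translate the answer into Chinese.
    result = await getAnswerFromGPT('Please translate the following statements into Chinese: ' + result);
    console.log('result:', result);
    res.send({ role: 'assistant', content: result });
  } catch (err) {
    // A failed OpenAI or Bing request would otherwise be an unhandled rejection
    // and the client would never get a response.
    console.error('error:', err.message);
    res.status(500).send({ error: 'failed to get an answer' });
  }
});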
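async function getFinalAnswer(question, preMessage) {
  // Upper bound on GPT/search rounds (an arbitrary choice) so the loop always ends.
  const MAX_STEPS = 5;
  let answer = '';
  let searchResult = '';
  let prompt = '';
  let step = 0;
  while (!isFinalAnswer(answer) && step < MAX_STEPS) {
    step++;
    if (prompt === '') {
      // First round: build the full prompt from the question.
      prompt = questionToPrompt(question, searchResult, answer, preMessage);
    } else {
      // Later rounds: append the previous output and the search result.
      prompt = handlePrompt(prompt, searchResult, answer);
    }
    answer = await getAnswerFromGPT(prompt);
    if (!isFinalAnswer(answer)) {
      // GPT asked for a search: extract the keywords and query the search API.
      const processedAnswer = await gptResToQuestion(answer);
      searchResult = await searchAPI(processedAnswer);
      console.log('searchResult:', searchResult);
    }
  }
  return answerHandler(answer);
}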
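// Extract the text after "Final Answer: " from the GPT output.
function answerHandler(answer) {
  // [\s\S]* also matches line breaks, so a multi-line answer is not cut off.
  const regex = /Final Answer: ([\s\S]*)/;
  const match = answer.match(regex);
  let result = '';
  if (match) {
    result = match[1];
  }
  return result;
}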
// Extract the search keywords (the "Action Input" line) from the GPT output.
async function gptResToQuestion(answer) {
  console.log('gptResToQuestion input:' + answer);
  const regex = /Action Input: (.*)/;
  const match = answer.match(regex);
  let result = '';
  if (match) {
    result = match[1];
  }
  console.log('gptResToQuestion output:' + result);
  return result;
}
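// Append the last GPT output and the search result (as an Observation) to the prompt.
function handlePrompt(prompt, searchResult, lastAnswer) {
  if (lastAnswer && lastAnswer !== '') {
    // Report an empty search result too; otherwise the same prompt would be sent again.
    const observation = searchResult && searchResult !== '' ? searchResult : 'No results found.';
    prompt += lastAnswer + '\nObservation: ' + observation + '\nThought: ';
  }
  return prompt;
}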
// Build the initial prompt, including the conversation history if there is any.
function questionToPrompt(question, searchResult, lastAnswer, preMessage) {
  if (preMessage && preMessage.length > 0) {
    question = "history message: " + JSON.stringify(preMessage) + "\nnew question: " + question;
  }
  let prompt = "Answer the following questions as best you can. You have access to the following tools:\n\nBing Search: A wrapper around Bing Search. Useful for when you need to answer questions about current events. Input should be a search query.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [Bing Search]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: " + question+ "\nThought: ";
  if (searchResult && searchResult !== "" && lastAnswer && lastAnswer !== "") {
    prompt += lastAnswer + "\nObservation: " + searchResult + "\nThought: ";
  }
  return prompt;
}
// The answer is final once GPT outputs the "Final Answer: " marker.
function isFinalAnswer(answer) {
  return Boolean(answer) && answer.includes('Final Answer: ');
}
// Send the prompt to the OpenAI chat completions API and return the reply text.
async function getAnswerFromGPT(prompt) {
  console.log("\n\nprompt is: " + prompt)
  const message = {role:"user", "content": prompt};
  const messageList = [message];
  console.log('messageList:', JSON.stringify(messageList));
  const OPEN_AI_KEY = "your OpenAI API key";
  const response = await axios.post('https://api.openai.com/v1/chat/completions', {
    messages: messageList,
    // Stop before "Observation:" so that GPT does not invent the search result itself.
    stop: ["\nObservation:", "\n\tObservation:"],
    model: 'gpt-3.5-turbo',
  }, {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${OPEN_AI_KEY}`,
    },
  });
  console.log('getAnswerFromGPT result:', response.data.choices[0].message.content.trim());
  return response.data.choices[0].message.content.trim();
}
// Query the Bing Web Search API and return the result snippets as one string.
async function searchAPI(question) {
  if (!question || question === '') {
    throw new Error('Question cannot be empty');
  }
  const BING_SEARCH_API_KEY = "your Bing Search key"
  const response = await axios.get('https://api.bing.microsoft.com/v7.0/search', {
    headers: {
      'Ocp-Apim-Subscription-Key': BING_SEARCH_API_KEY,
    },
    params: {
      q: question,
      count: 10,
      textDecorations: true,
      textFormat: "HTML",
    },
  });
  // webPages is missing when there are no web results, so fall back to an empty list.
  const snippets = (response.data.webPages?.value ?? []).map(page => page.snippet);
  return snippets.join(' ');
}
app.listen(6666, () => {
  console.log('Server started, local address: http://localhost:6666');
});
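Code Explanation
Cursor's automatically generated explanation of the code above:
This code is a Node.js backend service based on the Express framework. It provides two endpoints: /test and /chat_ai_with_internal. /test is a GET endpoint that returns the string "hello ai!". /chat_ai_with_internal is a POST endpoint that accepts a JSON request body with two fields, query_message and pre_message. query_message is a string containing the user's question, and pre_message is an array containing the previous conversation. The main logic is in the getFinalAnswer function. It calls getAnswerFromGPT to get the AI's reply, then decides whether to call searchAPI depending on whether the reply is a final answer. If it is not, the reply is converted into a search query and searchAPI is called to fetch the search results. A reply counts as the final answer when it contains the string "Final Answer: ".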
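To make the request format concrete, here is a sketch of a client request for this endpoint. The helper name `build_chat_request` is an assumption, the history is left empty because the shape of its entries is not specified here, and the request is only built, not sent.

```python
import json
import urllib.request


def build_chat_request(query_message, pre_message, base_url="http://localhost:6666"):
    """Build (but do not send) a POST request for /chat_ai_with_internal."""
    body = json.dumps(
        {"query_message": query_message, "pre_message": pre_message}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat_ai_with_internal",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("What happened to Silicon Valley Bank?", [])
print(request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["content"])
```

The server replies with a JSON object containing `role` and `content`, so the answer text is in `content`.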
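Results
Here is the result. As you can see, it can now answer the question fairly well.
In the command line you can see how GPT turns the information it does not know into search keywords:
Implementing with LangChain
Our analysis of how to add web access to ChatGPT is based on LangChain. In fact, if we use LangChain directly, we can build this application with much less code. Here is the implementation: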
# Flask server exposing the same two endpoints as the Node.js version.
from flask import Flask, request
from flask_cors import CORS
import search
import json

app = Flask(__name__)
CORS(app)


@app.route('/test')
def index():
    return 'Hello, ai!'


@app.route('/chat_ai_with_internal', methods=["POST"])
def searchPOST():
    data = json.loads(request.data)
    # Use .get so that a missing field does not raise a KeyError.
    queryMessage = data.get('query_message', '')
    preMessage = data.get('pre_message', [])
    if len(queryMessage):
        res = {'role': 'assistant', 'content': search.query(queryMessage, preMessage)}
        return json.dumps(res)
    return "search error", 400


if __name__ == '__main__':
    app.run(port=8115)
# search.py: the module imported by the server above.
import os
os.environ["BING_SUBSCRIPTION_KEY"] = "your Bing Search key"
os.environ["BING_SEARCH_URL"] = "https://api.bing.microsoft.com/v7.0/search"
os.environ['OPENAI_API_KEY'] = "your OpenAI API key"

from langchain.utilities import BingSearchAPIWrapper
# Note: this wrapper is not referenced again in this module.
search = BingSearchAPIWrapper(k=5)

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAIChat

llm = OpenAIChat()
tool_names = ["bing-search"]
tools = load_tools(tool_names)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)


def query(queryKey: str, preMessage: list) -> str:
    print(f"queryKey is {queryKey}")
    print(f"preMessage is {preMessage}")
    if len(preMessage):
        # Pass the conversation history to the model.
        llm.prefix_messages = preMessage
    res = agent.run(queryKey)
    print(res)
    # The agent answers in English, so ask the model to translate the answer into Chinese.
    translationRes = llm.generate(["Please translate the following statements into Chinese: " + res]).generations[0][0].text
    return translationRes
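The result is essentially the same:
Note:
This article is based on LangChain's implementation, and the prompt also follows LangChain's. Most of the code in this article was generated with Cursor.
Conclusion
The era of large AI models has arrived, and it will certainly change our lives a great deal. There is much worth studying both in the underlying model layer and in the application layer that puts these models to use, and LangChain offers good inspiration on how to use large models.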