Frolicking with AWS Bedrock Knowledge Bases
I've been meaning to play with AWS Bedrock's Knowledge Bases feature for a while now. The ability to load relevant organisational data and have it available through an AWS-powered service is very appealing. Many enterprise customers are reluctant to run anything more homegrown, like Ollama; being able to point at an AWS service and say it does the thing is a far easier sell than spinning up something yourself that you then need to maintain.
For those using Bedrock for the first time, you need to request individual model access through the Console, which confuses me; I prefer it when everything in AWS land just works out of the box. I can understand needing to request access for a third-party model, but the Amazon ones should be available from the start. Anyone who has tried to use a new AWS region knows what I'm talking about; I'm looking at you, `ap-southeast-4`. Anyway, I've added the latest `Anthropic` models and `Titan Text Embeddings V2` (you'll see why shortly) in the screenshot below. Get yourself a cup of coffee, and hopefully they'll be enabled when you return.
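If you'd rather poke at it from the command line, the Bedrock CLI can at least list the foundation models on offer in a region (the provider filter and query below are just for illustration; the access grants themselves are still managed through the Console):

# List the Anthropic model IDs offered in the region
aws bedrock list-foundation-models \
  --region us-east-1 \
  --by-provider anthropic \
  --query "modelSummaries[].modelId"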
Once you've been granted the keys to the kingdom, it's time to look at the Knowledge Bases feature. `Knowledge Bases` effectively have two main purposes. Firstly, to ETL data from a data source (S3, Web Crawler, Custom API) into a Vector Store. Secondly, to surface that Vector Store data as part of a Retrieval-Augmented Generation (RAG) process when interacting with your targeted model.
The RAG process helps to improve the accuracy and relevance of text returned by the model by providing data that is (hopefully) factually accurate, so it can give a tailored response. Think of it like this: the underlying model knows a bunch of general facts that were wired into it as part of its training, but it knows nothing about the specifics of your organisation. For example, your organisation might use specific acronyms for project names. By providing those acronym references as part of the RAG process, the model will be able to respond using those org-specific acronyms.
The AWS Console gives a great overview of the parts you need to initialise a Knowledge Base, but I prefer to assemble it myself via AWS' Cloud Development Kit (CDK). It gives you a greater appreciation for what is actually required to get it running, and doesn't abstract away what it's doing in the background. It also makes it much easier to spin resources up and down, which is a great way of reducing costs during development; there's no point leaving things running overnight when nobody is using them.
At the time of writing (2024-12-03), the CDK constructs were only Level 1, meaning you need to hand-wire all the dependencies yourself. Not only that, you also need to provide a mechanism to load the index into the OpenSearch vector database. This was all getting a little too hard.
Luckily, Lee James Gilmore has an excellent post describing the use of the Level 3 constructs from AWS Labs. No more crying in CDK, and the ability to deploy and destroy a Knowledge Base stack relatively easily. Although it does hide away some of what it's doing with the token parsing, it's all there on your machine for you to look at, instead of being hidden throughout the Console.
import * as s3 from "aws-cdk-lib/aws-s3";
import { bedrock } from "@cdklabs/generative-ai-cdk-constructs";

const kb = new bedrock.KnowledgeBase(this, "KnowledgeBase", {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V2_1024,
  instruction:
    "Use this knowledge base to answer questions about books. " +
    "It contains the full text of novels.",
});

const docBucket = new s3.Bucket(this, "DocBucket");

new bedrock.S3DataSource(this, "DataSource", {
  bucket: docBucket,
  knowledgeBase: kb,
  dataSourceName: "books",
  chunkingStrategy: bedrock.ChunkingStrategy.fixedSize({
    maxTokens: 500,
    overlapPercentage: 20,
  }),
});

const crawlDataSource = new bedrock.WebCrawlerDataSource(this, "IpponWebsite", {
  knowledgeBase: kb,
  crawlingRate: 10,
  sourceUrls: ["https://www.ipponaustralia.com.au/"],
  chunkingStrategy: bedrock.ChunkingStrategy.fixedSize({
    maxTokens: 500,
    overlapPercentage: 20,
  }),
});
You'll notice a couple of things with this CDK snippet:
- I've added `TITAN_EMBED_TEXT_V2_1024` as the embedding model. This is used as part of the data parsing process, and must be enabled via the model selector.
- There are two data sources for our knowledge base: an S3 bucket and, amazingly, an inbuilt Web Crawler option.
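One small addition I'd suggest: if you plan to call the Knowledge Base from elsewhere (say, the Lambda function later in this post), export its ID from the stack so you can feed it in as an environment variable. A minimal sketch, assuming the L3 construct's `knowledgeBaseId` property:

import * as cdk from "aws-cdk-lib";

// Surface the generated Knowledge Base ID so it can be passed to other
// stacks or into a Lambda as an environment variable.
new cdk.CfnOutput(this, "KnowledgeBaseIdOutput", {
  value: kb.knowledgeBaseId,
});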
Once you've deployed your Knowledge Base, you can load some files into its target S3 bucket. I uploaded some PDFs of our fancy new practice decks to see how it would go with those. It supports a wide range of other document formats, but I'd suggest using something that isn't a PDF in more of a production setting, just to avoid the complexity that can be present in some PDFs.
Data is not automatically pulled into your Knowledge Base; you need to sync it. The Console or the CLI can do that for you.
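With the CLI, a sync is just an ingestion job kicked off against the data source; something along these lines, with the placeholder IDs swapped for the ones from your stack:

# Trigger a sync of the S3 data source into the Knowledge Base
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <your-knowledge-base-id> \
  --data-source-id <your-data-source-id>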
Once the data has been loaded, you can try it out via the Console's Test Knowledge Base feature after specifying a model.
Then, ask away via the Chatbot interface. You can see it pulling information from the documents we've loaded.
The test interface is good for getting a feel for the kind of information it can extract. If you turn off the Generate Responses toggle, you can see the raw results pulled from the Vector Search database. The Vector Search db effectively contains snippets of text, which are then passed on to the LLM. The formatting in my PDFs was a little off (I printed them from Google Slides), but I expect it would be better with a text-first medium such as Confluence. It was easy to get long-winded answers out of it, suggesting the source formatting needed some help, but simple questions, like How many people work at Ippon?, were answered easily.
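You can do the same thing programmatically: the Bedrock Agent Runtime has a `Retrieve` API that returns just the matched chunks, without the generation step. A quick sketch (the knowledge base ID is a placeholder):

const {
  BedrockAgentRuntimeClient,
  RetrieveCommand,
} = require("@aws-sdk/client-bedrock-agent-runtime");

const client = new BedrockAgentRuntimeClient({ region: "us-east-1" });

// Return the raw text chunks the vector search finds for a query —
// the same data you see with Generate Responses switched off.
async function retrieveChunks(query) {
  const response = await client.send(
    new RetrieveCommand({
      knowledgeBaseId: "<your-knowledge-base-id>",
      retrievalQuery: { text: query },
    })
  );
  return response.retrievalResults.map((result) => result.content.text);
}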
But how do I use it?
Although logging onto the Console is super fun, chat-based interfaces need to be accessible to be useful. Slack is a great avenue for that integration, and there is no shortage of tutorials from AWS themselves on integrating Slack chatbots. But what if we got Claude 3.5 Sonnet to "write" a quick and dirty Lambda function for us?
Here's the final prompt I ended up with.
Write me a NodeJS AWS Lambda function to handle Slack messages. Have it access all Secrets via the AWS Secret Manager and verify that the inbound Slack request is valid. Once it receives a message, it sends it to AWS Bedrock, using the `Claude 3.5` model, and using a Knowledge base called: XXXX KnowledgeBaseId.
Although the code it generated looked plausibly correct at first glance, it had some issues.
- The first version didn't validate the inbound Slack token at all (it just accepted it) and ignored the outbound token. Prompting it about the problem got it to fix the issue.
- It mixed up the body parsing of incoming requests. Instead of parsing the JSON body, it tried to parse it as a query string.
- The body Slack actually sends wasn't quite what the code expected, so that needed to be massaged.
- It defaulted to using an older NodeJS version and AWS SDK, which won’t do.
- Slack gets unhappy if you don't respond to its requests within 3 seconds. Due to startup speeds, the function is sometimes unable to respond in time. I ended up using response streaming and swapping out the implementation suggested by the model.
I hammered it into something that looks like the following (⅔ written by AI, with a lot of debugging by me):
const {
  SecretsManagerClient,
  GetSecretValueCommand,
} = require("@aws-sdk/client-secrets-manager");
const {
  RetrieveAndGenerateCommand,
  BedrockAgentRuntimeClient,
} = require("@aws-sdk/client-bedrock-agent-runtime");
const crypto = require("crypto");

const secretsManager = new SecretsManagerClient({ region: "us-east-1" }); // Replace with your region
const bedrockRuntime = new BedrockAgentRuntimeClient({ region: "us-east-1" }); // Replace with your region

let SLACK_SIGNING_SECRET;
let SLACK_BOT_TOKEN;

async function getSecrets() {
  if (SLACK_SIGNING_SECRET && SLACK_BOT_TOKEN) {
    return;
  }
  try {
    const secretName = "slack-bedrock-secrets";
    const response = await secretsManager.send(
      new GetSecretValueCommand({ SecretId: secretName })
    );
    const secrets = JSON.parse(response.SecretString);
    SLACK_SIGNING_SECRET = secrets.SLACK_SIGNING_SECRET;
    SLACK_BOT_TOKEN = secrets.SLACK_BOT_TOKEN;
  } catch (error) {
    console.error("Error retrieving secrets:", error);
    throw error;
  }
}

function verifySlackRequest(event) {
  const slackSignature = event.headers["x-slack-signature"];
  const timestamp = event.headers["x-slack-request-timestamp"];
  const body = event.body;
  const baseString = `v0:${timestamp}:${body}`;
  const hmac = crypto.createHmac("sha256", SLACK_SIGNING_SECRET);
  const computedSignature = `v0=${hmac.update(baseString).digest("hex")}`;
  return crypto.timingSafeEqual(
    Buffer.from(computedSignature),
    Buffer.from(slackSignature)
  );
}
async function invokeBedrockModel(prompt) {
  const retrieveAndGen = new RetrieveAndGenerateCommand({
    input: { text: prompt },
    retrieveAndGenerateConfiguration: {
      type: "KNOWLEDGE_BASE",
      knowledgeBaseConfiguration: {
        knowledgeBaseId: process.env.KNOWLEDGE_BASE_ID,
        modelArn: process.env.MODEL_ARN,
      },
    },
  });
  try {
    const response = await bedrockRuntime.send(retrieveAndGen);
    return response.output.text;
  } catch (error) {
    console.error("Error invoking Bedrock model:", error);
    throw error;
  }
}

async function respondToSlack(channel, text) {
  console.log("Responding to slack: ", channel, ", text:", text);
  const response = await fetch("https://slack.com/api/chat.postMessage", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${SLACK_BOT_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ channel, text }),
  });
  if (!response.ok) {
    throw new Error(`Failed to send message to Slack: ${response.statusText}`);
  }
}

function writeStreamResponse(responseStream, statusCode, body) {
  const httpResponseMetadata = {
    statusCode: statusCode,
    headers: {},
  };
  responseStream = awslambda.HttpResponseStream.from(
    responseStream,
    httpResponseMetadata
  );
  responseStream.write(body);
  responseStream.end();
}
exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, _context) => {
    try {
      await getSecrets();
      if (!verifySlackRequest(event)) {
        writeStreamResponse(responseStream, 401, "Unauthorized");
        return {};
      }
      const body = JSON.parse(event.body);
      console.log("Request: ", body);
      // Slack's URL verification handshake has no `event` payload, so handle it
      // before destructuring the message fields.
      if (body.type === "url_verification") {
        writeStreamResponse(responseStream, 200, body.challenge);
        return {};
      }
      const { channel, text, bot_profile } = body.event;
      // This check is a bit hamfisted. Ideally, it should check whether the message
      // is coming from this bot and ignore only those, but it works for a PoC.
      if (
        body.type === "event_callback" &&
        body.event.type === "message" &&
        !bot_profile
      ) {
        console.log("[Function] Returning response to client");
        writeStreamResponse(responseStream, 200, {});
        const aiResponse = await invokeBedrockModel(text);
        console.log("Generated response: ", aiResponse);
        await respondToSlack(channel, aiResponse);
      }
      return {};
    } catch (error) {
      console.error("Error processing request:", error);
      console.dir(error);
      writeStreamResponse(responseStream, 500, "Internal Server Error");
      return {};
    }
  }
);
This is an incredibly simple request/response demo and doesn't deal with any of the standard conversational features you'd need for an actual working chatbot. It doesn't maintain any context between the Slack conversation and the Claude calls, so it effectively "forgets" what you've asked. You'd definitely need to look into storing that data in something like DynamoDB to maintain context between event calls from Slack. For a simple question/answer approach, though, it works well. At one point I got it stuck in a loop where it was responding to itself. Watch it burn those tokens (don't tell Duncan).
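One relatively light-touch option, if you do go down that path, is the `sessionId` that `RetrieveAndGenerate` hands back: stash it per Slack channel (DynamoDB would do) and pass it on the next call so Bedrock can carry the conversation forward. A rough sketch, with the table name and key shape entirely my own invention:

const {
  BedrockAgentRuntimeClient,
  RetrieveAndGenerateCommand,
} = require("@aws-sdk/client-bedrock-agent-runtime");
const {
  DynamoDBClient,
  GetItemCommand,
  PutItemCommand,
} = require("@aws-sdk/client-dynamodb");

const bedrockRuntime = new BedrockAgentRuntimeClient({ region: "us-east-1" });
const dynamo = new DynamoDBClient({ region: "us-east-1" });
const SESSIONS_TABLE = "slack-bedrock-sessions"; // hypothetical table, keyed on channel

// Reuse any previous Bedrock session for this Slack channel, then store
// whatever session ID comes back, ready for the next message.
async function invokeBedrockModelWithContext(channel, prompt) {
  const existing = await dynamo.send(
    new GetItemCommand({
      TableName: SESSIONS_TABLE,
      Key: { channel: { S: channel } },
    })
  );
  const sessionId = existing.Item?.sessionId?.S;

  const response = await bedrockRuntime.send(
    new RetrieveAndGenerateCommand({
      input: { text: prompt },
      ...(sessionId && { sessionId }),
      retrieveAndGenerateConfiguration: {
        type: "KNOWLEDGE_BASE",
        knowledgeBaseConfiguration: {
          knowledgeBaseId: process.env.KNOWLEDGE_BASE_ID,
          modelArn: process.env.MODEL_ARN,
        },
      },
    })
  );

  await dynamo.send(
    new PutItemCommand({
      TableName: SESSIONS_TABLE,
      Item: { channel: { S: channel }, sessionId: { S: response.sessionId } },
    })
  );
  return response.output.text;
}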
Actual success with the code above! I generally try to iterate on the code within the LLM chat for as long as possible, because once you start editing it locally it becomes much harder to debug portions of it: you're constantly having to manually merge your changes with the LLM's output.
Conclusion
Once you get your head around the concepts and the various parts that need to be glued together, the AWS Bedrock offering with Knowledge Bases is an incredibly interesting proposition, and a great way to start integrating these very powerful tools into your products. As with a lot of these AI offerings, it's important not to slap it on everything just to see what sticks. Have fun, and stay safe out there.