The Path to AI - Build vs Buy

Written by Lucas Ward | Dec 11, 2024 9:00:00 AM

A Decision Framework of Sorts

So, you want to leverage AI to achieve some amazing use case that provides immense value to your business? Don’t we all! Getting started, however, can be an exercise in futility if you don’t know where to begin. This blog aims to guide you through the decision of whether to build or buy, and through the questions to ask yourself to narrow down the potential solutions. The first of those questions: do you know what your use case is?

No matter where you are in the process, Ippon Technologies is well suited to guide you, having implemented a number of AI solutions for a range of clients. We have worked on everything from “old school AI,” like Optical Character Recognition (OCR) and machine vision built on machine learning, all the way to state-of-the-art knowledge bases leveraging Large Language Models (LLMs), cloud data platforms, and different patterns in Retrieval Augmented Generation.

Machine Learning or Large Language Model

The first step on your journey to leverage AI is to pin down what you actually mean by “AI.” As engineers, we have at our disposal machine learning algorithms that achieve amazing results in a variety of areas, from image recognition to speech recognition to plain statistical analysis. On the other hand, we have large language models that have proven useful for a wide range of tasks, some of which overlap with or exceed what machine learning algorithms have been solving for years. Even within the space of Large Language Models there are many classifications: general purpose models, models fine-tuned for specific purposes, small models for local use, and more.

The key to choosing between them is to take a good, hard look at the problem statement you are trying to solve. Once you know what the problem is, you can start exploring the various solutions, narrowing them down by implementation cost, product cost, ongoing maintenance cost, and your organization's appetite for risk. Let’s dive into a few examples to help illustrate this process.

Use Case: Data Enrichment

Data enrichment is a very popular use case for Large Language Models. Ippon worked with a client who had a substantial amount of “call notes” from their customer service department (as most companies do). The application they used to track their customers and their calls had several fields built in to classify the calls; in most cases, each call could have a category and a subcategory selected. But what if someone calls in about more than one issue? Since each call can have only a single category assignment, some calls will be miscategorized. And what if you need to look back over years of data and report on what customers “really” meant? This is where the LLM came in to flex its muscles: specifically, Snowflake Cortex. Ippon engineers fed the LLM all of the call notes with carefully designed prompts and received highly accurate classifications of what customers had actually called in about. Something that would have taken a human months to complete was automated using Snowflake Cortex within a two-week sprint.
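
To make this concrete, here is a minimal sketch of the pattern in Python. The connection details, table and column names (CALL_NOTES, NOTE_TEXT), and the category list are all made up for illustration, and 'mistral-large' is just one of the models Cortex exposes:

```python
# A minimal sketch: classify call notes with SNOWFLAKE.CORTEX.COMPLETE.
# Table, columns, categories, and credentials are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="...", user="...", password="...",  # fill in your own
    warehouse="ANALYTICS_WH", database="CRM", schema="SUPPORT",
)

sql = """
SELECT
    CALL_ID,
    SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        'Classify this customer service call into one or more of these '
        || 'categories: billing, outage, cancellation, general question. '
        || 'Return only a comma-separated list of categories. Notes: '
        || NOTE_TEXT
    ) AS categories
FROM CALL_NOTES
"""

for call_id, categories in conn.cursor().execute(sql):
    print(call_id, categories)
```

Because the model runs inside Snowflake, the call notes never leave the platform, which is exactly the “keep the AI close to the data” reasoning described below.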

So, how did they decide to use this technology on this platform? The reasoning went something like this:

  1. Do we need ML or LLM?
    For this task, it was pretty clear that a large language model would provide the most value to the customer, because summarization, language understanding, and natural language processing are what LLMs excel at.
  2. Which LLM do we use, and how do we run it?
    This was simple enough for our client to decide. They were already leveraging Snowflake as a cloud data platform, and in most cases it makes the most sense to use whatever AI platform sits closest to the data. Alternatively, they could have leveraged an offering from one of the other big cloud providers, or from companies that provide Large Language Model inference as a service, but they would likely have incurred significant charges moving the data across network boundaries.

As you can see, the decision process is pretty straightforward. The first step is to identify the problem that you need to solve; adopting a product mindset and using traditional product methodologies is a great aid here. Next comes identifying the type of technology required. Finally, understand what fits into your existing architecture and suite of products. The decision isn’t always this straightforward, though. What if you aren’t leveraging a platform with easily accessible LLM / ML capabilities? Or what if your organization is multi-cloud and has several options? Let’s take a look at another use case where the decision was not so cut and dried.

Use Case: Knowledge Base Chat Bot (Ippon GPT)

If you have been following the “AI Explosion” in any capacity, you have likely heard the terms “Retrieval Augmented Generation” and “knowledge base” a time or two. Retrieval Augmented Generation, RAG for short, is a pattern in which relevant data that you own is retrieved and sent to a large language model along with your prompt, improving the answers the LLM can produce by augmenting the conversation with additional context. It is a nice alternative to fine-tuning a base model on your data and is relatively easy to set up, even from scratch. That hasn’t stopped several large cloud providers from making a managed service out of it (Amazon Kendra comes to mind).
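
To show how little machinery the core pattern requires, here is a from-scratch sketch in Python. The toy embed() function is a stand-in for a real embedding model, and the assembled prompt is what you would hand to the LLM of your choice:

```python
# A from-scratch sketch of RAG: embed documents once, embed the question,
# retrieve the most similar documents, and prepend them to the prompt.
import numpy as np

def embed(text: str, dim: int = 128) -> np.ndarray:
    # Toy hashed bag-of-words embedding so this sketch runs end to end;
    # a real system would call an embedding model here instead.
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

documents = [
    "Ippon GPT indexes sales materials and case studies.",
    "Expense reports are due on the first Friday of the month.",
    "Our RAG stack stores document embeddings for retrieval.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def build_prompt(question: str, top_k: int = 2) -> str:
    sims = doc_vectors @ embed(question)  # unit vectors: dot product = cosine
    context = "\n".join(documents[i] for i in np.argsort(sims)[-top_k:])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# The returned string is what gets sent to the LLM along with your question.
print(build_prompt("Where does Ippon GPT get its sales material answers?"))
```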

Ippon had the bright idea to create Ippon GPT, an internal knowledge base that leverages RAG to help our consultants better understand our capabilities, sales materials, and internal processes. When it came to implementing this solution, the decision process looked a little bit like this:

  1. Do we need ML or LLM?
    For this task, Large Language Models are well suited. They can digest and understand a huge amount of context and “reason” about the underlying meaning.
  2. Which LLM do we use, and how do we run it?
    a.) Ippon is primarily an AWS and Snowflake shop, though we have competency with all the major cloud and data platforms and the tooling around them.
    b.) Our data was on or accessible from AWS, but we didn't need (and weren't using) a data lake or warehouse solution for this use case. Instead, we had PDFs, Word docs, Google Docs, wikis, and years of case studies: stacks of office-type material that are a cinch to drop on S3.
    c.) We decided to build mostly from scratch and leverage Amazon Bedrock.
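
For a feel of what building on Bedrock looks like, here is a minimal sketch of invoking a model through boto3. The model ID is just an example, and the request body follows the Anthropic format; each model family on Bedrock has its own:

```python
# A minimal sketch of one Bedrock call; in a RAG flow, retrieved context
# is prepended to the user message before invoking the model.
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{
            "role": "user",
            "content": "Context: <retrieved documents here>\n\n"
                       "Question: What services does Ippon offer?",
        }],
    }),
)

print(json.loads(response["body"].read())["content"][0]["text"])
```

Swapping models is essentially a one-line change of modelId, which is one reason a service like Bedrock suits a “living experiment” that keeps evolving.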

With such a wide variety of skills and expertise on our team, we had a lot of options to choose from as far as the preferred technology stack goes. At the time of this project, Amazon Bedrock was chosen, but it’s worth noting the same result could have been achieved using Snowflake. Of course, we could have used Amazon Kendra for a “managed experience,” but we knew that we would have unique requirements, we wanted to keep the cost low, and we wanted a useful “living experiment” that our team could continue to evolve. Another thing that stood out during this project is the insanely rapid rate of change in these technologies. One of our engineers, Theo Lebrun (you may have heard of him), put it this way: “Back in the early days when Amazon Bedrock was first released, it was a very different product; the capabilities and therefore the use cases are evolving rapidly.” Amazon Bedrock became generally available on September 28, 2023. Early days indeed!

Use Case: Digital Twin

Creating digital employees has been around for a while, especially with the rise of “Robotic Process Automation,” also known as RPA. But what if, instead of creating a digital employee through automation, we could augment an existing employee’s workflow with creative data retrieval using an LLM? Call it a digital twin. This is exactly what Ippon engineers set out to do with a large finance company in the investment banking space.

To increase the productivity of analysts, enabling them to handle more cases faster without reducing their success rate, we needed a mix of prompt engineering and structured information retrieval that would also scale to hundreds of users. The digital twin's job was to process earnings call transcripts, ask targeted questions of the data, and record the responses in a consistent, insightful format. A Retrieval Augmented Generation (RAG) design, supported by embeddings and a vector store, let us store and retrieve analysis quickly and cost-effectively, and Cohere ultimately became the model of choice. Let’s talk a bit about how these technical decisions were made.

  1. This process is normally handled by an assistant analyst, so analysis capacity is capped by the number of humans doing the analysis.
  2. The choice of Cohere came down to its practical fit for the project’s needs. It provided a context window that could accommodate the analysis and, after testing various options, delivered more accurate results for this specific business use case. Though price might have been a factor, the decision ultimately rested on finding a model that best matched the project’s accuracy requirements. As for security, all candidate models were privately hosted on Azure AI, so security was consistent across options and wasn’t a significant differentiator.
  3. For the RAG process, as previously mentioned, we decided to use a vector store with embeddings, which allowed the LLM to navigate an extensive library of documents spanning multiple quarters or even years. This approach meant we could search and retrieve relevant information without feeding every document into each run, minimizing the token count and keeping retrieval efficient (and cost low). While this method is not as exhaustive as passing all documents every time, it offers a practical balance between speed, cost, and accuracy by retrieving information based on meaning rather than strict keyword matches.
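
As an illustration of that retrieval step, here is a hedged sketch. FAISS stands in for the vector store (the project's actual store isn't named here), and the toy embed() stands in for the embedding model (Cohere, on the real project):

```python
# A sketch of the digital twin's retrieval step: index transcript chunks in
# a vector store, then pull only the most relevant chunks per question.
# FAISS and the toy embedding are stand-ins for the project's actual stack.
import faiss
import numpy as np

def embed(text: str, dim: int = 128) -> np.ndarray:
    # Toy hashed bag-of-words so the sketch runs; the real project would
    # call its embedding model (e.g., Cohere hosted on Azure AI).
    v = np.zeros(dim, dtype="float32")
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

chunks = [
    "Q3 revenue grew 8% year over year, driven by services.",
    "Management expects margin pressure from input costs in Q4.",
    "Full-year guidance was raised by two percent.",
]

index = faiss.IndexFlatIP(128)  # inner product on unit vectors = cosine
index.add(np.stack([embed(c) for c in chunks]))

# The twin asks the same targeted questions of every transcript and records
# the answers in a consistent format; only the retrieved chunks are sent to
# the LLM, keeping token counts (and cost) low.
for question in ["How did revenue change this quarter?",
                 "What risks did management call out?"]:
    _, ids = index.search(embed(question).reshape(1, -1), 2)
    context = "\n".join(chunks[i] for i in ids[0])
    print(f"QUESTION: {question}\nCONTEXT:\n{context}\n")
```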

The key takeaway from this decision-making process is that technical solutions should address your specific project needs and existing systems, not just follow the latest trend in AI. It’s essential to experiment, conduct research, and, if needed, consult experts to determine the best technology fit for your organization. Just because you can build something doesn’t mean you should!

Conclusion

Artificial intelligence is hot right now, but don’t skip the decision-making process and give in to the hype. Although this blog looked at three cases where it made sense to use a large language model, more traditional machine learning and data science are still the right way to go for a lot of problems. If you or your organization are looking to leverage LLMs or ML and don’t know where to start, drop us a line at sales@ipponusa.com; we would love to partner with you!