A Four-Stack Approach to Unlocking the Potential of Large Language Models

Yattish Ramhorry
4 min read · Oct 27, 2023

I’ve been delving deep into the fascinating world of Large Language Models (LLMs), and I can’t wait to share what I’ve learned with you.

Let me walk you through a framework that harnesses the power of LLMs to create advanced applications and agents.

It’s all about understanding the four fundamental stacks that make these models tick.

The LLM Stack

At the core of any LLM application lies the LLM stack. Whether you’re working with OpenAI’s GPT-4, GPT-3.5, or Meta’s LLaMA-2, this is where the magic begins.

You need to consider how the model was created, the pre-training, fine-tuning, and whether any Reinforcement Learning from Human Feedback (RLHF) has been applied.

Think about customizing and fine-tuning the model to suit your specific needs and optimizing its serving strategy. The LLM stack is the foundation upon which you’ll build your intelligent agent.

When using open-source models in the LLM stack, it's important to consider how you will serve them. Will you opt for cloud hosting?

If you are running LLaMA-2 in the cloud, consider whether you will be billed per token or by the amount of compute you use.

It may be worthwhile to think about running the model locally, using one of the quantized four-bit variants.

However, if you need the LLM for more than a simple chatbot, with logic or mathematics involved, full-resolution models may perform better than quantized ones.
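To make the serving trade-off concrete, here is a back-of-the-envelope sketch of weight memory at full versus four-bit resolution. It assumes weights dominate memory and ignores the KV cache and activations, so treat the numbers as rough lower bounds.

```python
# Rough memory-footprint estimate for serving decisions: a sketch that
# assumes model weights dominate memory (ignores KV cache and activations).
def model_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB for a model of the given size."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

full = model_memory_gb(7, 16)   # e.g. LLaMA-2 7B at full (16-bit) resolution
quant = model_memory_gb(7, 4)   # the same model quantized to 4 bits

print(f"fp16: ~{full:.0f} GB, 4-bit: ~{quant:.1f} GB")
```

A 7B model drops from roughly 14 GB of weights at fp16 to about 3.5 GB at 4-bit, which is what makes local serving feasible on consumer hardware.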

The Search/Memory/Data Stack

The second stack is all about data. You’ll be dealing with semantic search and vector stores, making decisions about which tools and technologies to use.

Will you opt for a self-hosted library like Faiss, an open-source store like Chroma, or a managed service like Pinecone?
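Whichever store you pick, the core operation is the same: rank documents by vector similarity to a query embedding. Here is a minimal pure-Python sketch of that idea; the three-dimensional vectors are hypothetical stand-ins for real embeddings, and in practice Faiss, Chroma, or Pinecone would handle indexing at scale.

```python
# Minimal semantic-search sketch: cosine similarity over toy embeddings.
# The vectors are hypothetical stand-ins for real embedding-model output.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# A tiny in-memory "vector store": document -> embedding
store = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "press releases": [0.0, 0.2, 0.9],
}

def search(query_vec, k=1):
    """Return the k documents whose embeddings are nearest the query."""
    ranked = sorted(store, key=lambda doc: cosine(query_vec, store[doc]),
                    reverse=True)
    return ranked[:k]

print(search([0.85, 0.2, 0.05]))  # a "refunds"-flavoured query vector
```

The hosted services expose essentially this interface (upsert vectors, query by vector) behind an API, with approximate-nearest-neighbour indexes replacing the brute-force sort.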

You’ll also need to think about data access, extraction from conventional databases, and knowledge graphs.

Now, obviously, the language model that you’ve chosen will have a big impact on this. If you’ve gone for a model with a 16K-token context window, you can afford to inject a lot more retrieved content into it.

Another key component of this stack is search functionality.

If you’re using Google or DuckDuckGo to fetch information off the web, or scraping that information from somewhere in real time, that also belongs in this stack.

Finally, a part that’s really key for building agents is memory: somewhere you can save information to and retrieve it from, so it can be injected into your prompts via in-context learning. This is a core piece of this stack as well.
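The save-and-retrieve loop can be sketched in a few lines. This is a toy: retrieval here is naive keyword overlap, where a real agent would use the embedding search from the previous stack, and the `Memory` class and its facts are illustrative inventions.

```python
# Toy agent memory: save facts, retrieve the most relevant ones, and
# inject them into the prompt for in-context learning. Retrieval is
# naive keyword overlap; a real system would rank by embedding similarity.
class Memory:
    def __init__(self):
        self.facts = []

    def save(self, fact: str):
        self.facts.append(fact)

    def retrieve(self, query: str, k: int = 2):
        q = set(query.lower().split())
        scored = sorted(self.facts,
                        key=lambda f: len(q & set(f.lower().split())),
                        reverse=True)
        return scored[:k]

def build_prompt(memory: Memory, question: str) -> str:
    """Inject retrieved memories ahead of the question (in-context learning)."""
    context = "\n".join(memory.retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

mem = Memory()
mem.save("The user prefers concise answers.")
mem.save("The user's name is Alex.")
print(build_prompt(mem, "What is the user's name?"))
```

The point is the shape of the loop: the agent writes to memory during a session and reads the most relevant slice back into every prompt, rather than relying on the model's context window alone.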

This stack is all about connecting your LLM with the information it needs to provide valuable insights.

The Reasoning and Action Stack

As your application evolves, you’ll find yourself exploring the reasoning and action stack. This is where the decision-making happens, often with the help of your LLM.

Whether you’re using ReAct-style prompting or other tools, you need to consider the logic behind your system’s choices.

Remember that all the stacks interact with each other. Anything you’re doing with ReAct or OpenAI function calling fits in here.

Mathematical and logic solvers can provide stability and structure, enhancing your application’s decision-making capabilities.
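A bare-bones reason-and-act loop looks something like the sketch below. The "LLM" is a hypothetical stub that routes arithmetic to a calculator tool; a real system would parse Thought/Action/Observation text from an actual model, but the control flow is the same, and the math tool is exactly the kind of solver that lends stability to the agent's answers.

```python
# Bare-bones ReAct-style loop. `fake_llm` is a hypothetical stub standing
# in for a real model: it either picks a tool to call or gives an answer.
def calculator(expr: str) -> str:
    # A math solver as a tool: exact arithmetic instead of LLM guesswork.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(question, observation):
    # Stub policy: send anything with digits to the calculator,
    # then answer with whatever the tool observed.
    if observation is None and any(ch.isdigit() for ch in question):
        return ("act", "calculator", question)
    return ("answer", observation or "I don't know.")

def react_loop(question: str, max_steps: int = 3) -> str:
    observation = None
    for _ in range(max_steps):
        step = fake_llm(question, observation)
        if step[0] == "answer":
            return step[1]
        _, tool, arg = step
        observation = TOOLS[tool](arg)  # fed back as the next observation
    return observation

print(react_loop("12 * 7"))
```

Swapping the stub for a real model call, and the single tool for a registry of search, database, and solver tools, gives you the skeleton of this stack.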

This stack will continue to grow as we explore new ways of reasoning beyond just LLMs.

The Personalization Stack

The top layer of this framework is all about personalization. This is where your application develops its unique personality, style, and brand.

You might be tracking user interactions or customizing the conversation’s tone.

Think of it as fine-tuning the output via prompts and other manipulations.
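As a sketch of what "fine-tuning the output via prompts" means in practice, the snippet below wraps a user question with a persona and per-user details before it ever reaches the model. The persona strings and the `personalize` helper are illustrative inventions, not any particular product's API.

```python
# Personalization as prompt manipulation: wrap the question with a persona
# (tone/brand) and per-user details. Persona strings are illustrative.
PERSONAS = {
    "pirate": "You are a cheerful pirate. Answer in nautical slang.",
    "formal": "You are a courteous concierge. Answer in formal English.",
}

def personalize(question: str, persona: str, user_name: str = None) -> str:
    """Build the final prompt: persona + user-specific instructions + question."""
    system = PERSONAS[persona]
    greeting = f"Address the user as {user_name}. " if user_name else ""
    return f"{system} {greeting}\nUser: {question}"

prompt = personalize("Where is my order?", "formal", user_name="Alex")
print(prompt)
```

Tracked interactions from the memory stack can feed straight into this layer, so the persona stays fixed while the user-specific details evolve over time.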

This is the level of the stack that people talk about most, and where many interesting experiments have been done in getting a large language model to take on a certain personality, or to act and respond in a particular way.

Personalization is key if you want your LLM-powered agent to connect with users on a more personal level.

A Final Word

When designing your LLM application or agent, consider the importance of each of these stacks. If you’re aiming for rich, role-playing interactions, invest in personalization.

If data integration is central to your project, the search/memory/data stack is your focus. Each layer offers different tools and opportunities to make your application shine.

As you’re architecting your application, remember that it’s not just about where each stack fits but also how they interact. Your choices at each level will impact the final product.

Whether you’re creating a mobile app or a sophisticated agent, think carefully about your tools, reasoning, and the heuristics your agent uses for decision-making.

This framework will empower you to create LLM chat apps, summarization apps, and LLM agents that can tackle diverse tasks.

Each stack has its unique role in shaping your application, and understanding them will be your key to success.

If you’re as intrigued by this topic as I am, feel free to share your thoughts in the comments. And if you found this post valuable, let me know.

Stay tuned for more exciting updates in the world of Large Language Models. Bye for now!

Call to Action

Know someone on the hunt for a top-notch development partner to bring their MVP dream to life? Connect them with me on LinkedIn — your referrals are the fuel that drives innovation! 🚀

You can also find me here on Github!
