How to Add a New LLM Implementation
This guide demonstrates how to add support for a new Language Model (LLM) implementation, using OpenAI as an example.
Architecture
Large Language Models are mainly responsible for generating answers based on the user query and the retrieved nodes, which are injected as context. They are also used in the evaluation process and in various other components, e.g. the AutoRetriever.
Implementation
Step 1: Dependencies
Add the required packages to pyproject.toml:
[project.optional-dependencies]
augmentation = [
    "llama-index-llms-openai>=0.3.25",
    ...
]
Step 2: LLM Enum
LLM configuration is scoped by provider. Each provider, such as OpenAI, requires its own Pydantic configuration class. Begin by assigning a meaningful name to the new provider in the LLMProviderName enumeration in llm_configuration.py:
class LLMProviderName(str, Enum):
    ...
    OPENAI = "openai"
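Because LLMProviderName subclasses str, the "provider" string from the JSON configuration maps directly onto the new member, and the member still compares equal to the plain string. A quick illustrative check:

from augmentation.bootstrap.configuration.components.llm_configuration import (
    LLMProviderName,
)

# The configuration value resolves to the enum member, and the member
# compares equal to its string value because the enum also subclasses str.
assert LLMProviderName("openai") is LLMProviderName.OPENAI
assert LLMProviderName.OPENAI == "openai"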
Step 3: LLM Configuration And Secrets
Create a new directory src/augmentation/components/llms/openai and add a configuration.py file in it. This configuration file will contain the necessary fields and secrets for setup.
from typing import Literal
from pydantic import ConfigDict, Field, SecretStr
from augmentation.bootstrap.configuration.components.llm_configuration import (
LLMConfiguration,
LLMProviderName,
)
from core.base_configuration import BaseSecrets
class OpenAILLMConfiguration(LLMConfiguration):
    class Secrets(BaseSecrets):
        model_config = ConfigDict(
            env_file_encoding="utf-8",
            env_prefix="RAG__LLMS__OPENAI__",
            env_nested_delimiter="__",
            extra="ignore",
        )

        api_key: SecretStr = Field(
            ..., description="API key for the model provider."
        )

    provider: Literal[LLMProviderName.OPENAI] = Field(
        ..., description="The name of the language model provider."
    )
    secrets: Secrets = Field(
        None, description="The secrets for the language model."
    )
The first step is to create a configuration that extends LLMConfiguration. The provider field constrains the value to LLMProviderName.OPENAI, which serves as the discriminator for the Pydantic validator. The Secrets inner class defines secret fields that are read from the environment secret file under the RAG__LLMS__OPENAI__ prefix. Add the corresponding environment variables to configurations/secrets.{environment}.env:
RAG__LLMS__OPENAI__API_KEY=<openai_api_key>
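Once the environment file is loaded, instantiating the inner Secrets class is enough to pick the key up. A minimal sketch, assuming BaseSecrets builds on pydantic-settings and that the variable is already present in the process environment:

import os

from augmentation.components.llms.openai.configuration import (
    OpenAILLMConfiguration,
)

# Dummy value for illustration; normally sourced from secrets.{environment}.env.
os.environ["RAG__LLMS__OPENAI__API_KEY"] = "sk-dummy"

secrets = OpenAILLMConfiguration.Secrets()
print(secrets.api_key)                     # ********** (SecretStr stays masked)
print(secrets.api_key.get_secret_value())  # sk-dummy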
Step 4: LLM Implementation
In the llm.py file, create a singleton LLM factory. It provides a framework where the LLM can be retrieved through OpenaAILLMFactory and is initialized only once per runtime, saving memory (e.g. in the case of small in-memory LLMs). To do so, define the expected _configuration_class type and provide a _create_instance implementation using LlamaIndex.
from typing import Type
from llama_index.llms.openai import OpenAI
from augmentation.components.llms.openai.configuration import (
OpenAILLMConfiguration,
)
from core import SingletonFactory
class OpenaAILLMFactory(SingletonFactory):
    _configuration_class: Type = OpenAILLMConfiguration

    @classmethod
    def _create_instance(cls, configuration: OpenAILLMConfiguration) -> OpenAI:
        return OpenAI(
            api_key=configuration.secrets.api_key.get_secret_value(),
            model=configuration.name,
            max_tokens=configuration.max_tokens,
            max_retries=configuration.max_retries,
        )
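Because OpenaAILLMFactory is a singleton factory, repeated create() calls with the same configuration reuse the already constructed client. A usage sketch; the field values below are illustrative, and any additional required fields of the LLMConfiguration base class are omitted:

from augmentation.bootstrap.configuration.components.llm_configuration import (
    LLMProviderName,
)
from augmentation.components.llms.openai.configuration import (
    OpenAILLMConfiguration,
)
from augmentation.components.llms.openai.llm import OpenaAILLMFactory

# Illustrative configuration mirroring the fields read by _create_instance.
configuration = OpenAILLMConfiguration(
    provider=LLMProviderName.OPENAI,
    name="gpt-4o",
    max_tokens=512,
    max_retries=3,
    secrets=OpenAILLMConfiguration.Secrets(api_key="sk-dummy"),
)

llm = OpenaAILLMFactory.create(configuration)  # constructed once per runtime
response = llm.complete("Summarize the retrieved context in one sentence.")
print(response.text)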
Step 5: LLM Output Extractor
The human feedback feature between Chainlit and Langfuse requires extracting information from the LLM response. Each provider returns a differently structured output dictionary, so we need to implement an extractor for the required fields. Create output_extractor.py:
from typing import Type
from langfuse.api.resources.commons.types.trace_with_details import (
TraceWithDetails,
)
from augmentation.components.llms.core.base_output_extractor import (
BaseLlamaindexLLMOutputExtractor,
)
from augmentation.components.llms.openai.configuration import (
OpenAILLMConfiguration,
)
from core.base_factory import Factory
class OpenAILlamaindexLLMOutputExtractor(BaseLlamaindexLLMOutputExtractor):
    def get_text(self, trace: TraceWithDetails) -> str:
        return trace.output["blocks"][0]["text"]

    def get_generated_by_model(self, trace: TraceWithDetails) -> str:
        return self.configuration.name
Implementing the BaseLlamaindexLLMOutputExtractor interface provides an extractor sufficient for ChainlitFeedbackService purposes. Now add the corresponding factory:
class OpenAILlamaindexLLMOutputExtractorFactory(Factory):
    _configuration_class: Type = OpenAILLMConfiguration

    @classmethod
    def _create_instance(
        cls, configuration: OpenAILLMConfiguration
    ) -> OpenAILlamaindexLLMOutputExtractor:
        return OpenAILlamaindexLLMOutputExtractor(configuration)
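The factory follows the same pattern as the LLM factory. The sketch below, assuming Factory exposes the same create() entry point, shows how a fetched Langfuse trace would be unpacked given the block structure accessed in get_text:

from augmentation.components.llms.openai.output_extractor import (
    OpenAILlamaindexLLMOutputExtractorFactory,
)

# "configuration" is the OpenAILLMConfiguration instance from Step 4;
# "trace" is a TraceWithDetails previously fetched from the Langfuse API,
# whose output payload looks roughly like:
#   {"blocks": [{"text": "The generated answer..."}]}
extractor = OpenAILlamaindexLLMOutputExtractorFactory.create(configuration)
answer_text = extractor.get_text(trace)
model_name = extractor.get_generated_by_model(trace)  # e.g. "gpt-4o"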
Step 6: LLM Integration
Create an __init__.py file as follows:
from augmentation.bootstrap.configuration.components.llm_configuration import (
LLMConfigurationRegistry,
LLMProviderName,
)
from augmentation.components.llms.openai.configuration import (
OpenAILLMConfiguration,
)
from augmentation.components.llms.openai.llm import OpenaAILLMFactory
from augmentation.components.llms.openai.output_extractor import (
OpenAILlamaindexLLMOutputExtractorFactory,
)
from augmentation.components.llms.registry import (
LlamaindexLLMOutputExtractorRegistry,
LLMRegistry,
)
def register() -> None:
    LLMRegistry.register(LLMProviderName.OPENAI, OpenaAILLMFactory)
    LLMConfigurationRegistry.register(
        LLMProviderName.OPENAI, OpenAILLMConfiguration
    )
    LlamaindexLLMOutputExtractorRegistry.register(
        LLMProviderName.OPENAI, OpenAILlamaindexLLMOutputExtractorFactory
    )
The initialization file includes a register() function responsible for registering our configuration, output extractor, and LLM factories. Registries are used to dynamically inform the system about the available implementations. This way, with the following OpenAI configuration in the configurations/configuration.{environment}.json file:
"augmentation":
{
"chat_engine":
{
"llm": {
"provider": "openai",
"name": "gpt-4o", // any model name compatible with OpenAI API
}
}
...
}
Note: You can use any model name compatible with the OpenAI API.
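For the configuration above to resolve, each provider package's register() hook must run during application bootstrap, before the configuration is parsed. A hedged sketch of that wiring (the actual bootstrap code may discover providers differently):

# Hypothetical bootstrap wiring; shown only to illustrate when register() runs.
from augmentation.components.llms import openai as openai_llms

def register_llm_providers() -> None:
    # Every provider package exposes the same register() hook.
    openai_llms.register()

register_llm_providers()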
We can dynamically retrieve the corresponding LLM implementation by using the provider specified in the configuration:
llm_config = read_llm_from_config()
llm_model = LLMRegistry.get(llm_config.provider).create(llm_config)
This mechanism is later used by the chat engine to initialize the LLM defined in the configuration. These steps conclude the implementation, resulting in the following file structure:
src/
└── augmentation/
└── components/
└── llms/
└── openai/
├── __init__.py
├── configuration.py
├── llm.py
└── output_extractor.py