Embedding_model

This module provides singleton factories for the HuggingFace embedding model and its tokenizer. It is located at embedding.embedding_models.hugging_face.embedding_model.

HuggingFaceEmbeddingModelFactory

Bases: SingletonFactory

Factory for creating configured HuggingFace embedding models.

This singleton factory creates and configures HuggingFaceEmbedding instances based on the provided configuration.

Attributes:
  • _configuration_class (Type) – The configuration class used for creating instances.

Source code in src/embedding/embedding_models/hugging_face/embedding_model.py
class HuggingFaceEmbeddingModelFactory(SingletonFactory):
    """Factory for creating configured HuggingFace embedding models.

    This singleton factory creates and configures HuggingFaceEmbedding instances
    based on the provided configuration.

    Attributes:
        _configuration_class (Type): The configuration class used for creating instances.
    """

    _configuration_class: Type = HuggingFaceEmbeddingModelConfiguration

    @classmethod
    def _create_instance(
        cls, configuration: HuggingFaceEmbeddingModelConfiguration
    ) -> HuggingFaceEmbedding:
        """Creates a HuggingFaceEmbedding instance based on provided configuration.

        Args:
            configuration: HuggingFace embedding model configuration.

        Returns:
            HuggingFaceEmbedding: Configured embedding model instance.
        """
        return HuggingFaceEmbedding(
            model_name=configuration.name,
            embed_batch_size=configuration.batch_size,
        )
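The source above shows only the `_create_instance` hook; the behavior of `SingletonFactory` itself is not shown on this page. The following is a minimal sketch of how such a base class might cache one instance per configuration. The `create` method, the `_instances` cache, the `ModelConfiguration` dataclass, and the `FakeEmbedding` class are all hypothetical stand-ins introduced for illustration, so the example runs without downloading a model. Only the `name` and `batch_size` fields and the `model_name` / `embed_batch_size` arguments come from the source.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Type


@dataclass(frozen=True)
class ModelConfiguration:
    """Hypothetical stand-in for HuggingFaceEmbeddingModelConfiguration."""

    name: str
    batch_size: int


class FakeEmbedding:
    """Stand-in for HuggingFaceEmbedding so the sketch runs offline."""

    def __init__(self, model_name: str, embed_batch_size: int) -> None:
        self.model_name = model_name
        self.embed_batch_size = embed_batch_size


class SingletonFactory:
    """Assumed base-class behavior: one cached instance per configuration."""

    _configuration_class: Type = object
    # Cache shared by all subclasses, keyed by (factory class, configuration).
    _instances: Dict[Tuple[type, Any], Any] = {}

    @classmethod
    def create(cls, configuration: Any) -> Any:
        """Hypothetical public entry point; the real method name may differ."""
        if not isinstance(configuration, cls._configuration_class):
            raise TypeError(
                f"Expected {cls._configuration_class.__name__}, "
                f"got {type(configuration).__name__}"
            )
        key = (cls, configuration)
        if key not in cls._instances:
            # Build the instance only on the first request for this key.
            cls._instances[key] = cls._create_instance(configuration)
        return cls._instances[key]

    @classmethod
    def _create_instance(cls, configuration: Any) -> Any:
        raise NotImplementedError


class EmbeddingModelFactory(SingletonFactory):
    """Mirrors the shape of HuggingFaceEmbeddingModelFactory."""

    _configuration_class = ModelConfiguration

    @classmethod
    def _create_instance(cls, configuration: ModelConfiguration) -> FakeEmbedding:
        return FakeEmbedding(
            model_name=configuration.name,
            embed_batch_size=configuration.batch_size,
        )


# "example/model-name" is a placeholder, not a real model identifier.
config = ModelConfiguration(name="example/model-name", batch_size=32)
first = EmbeddingModelFactory.create(config)
second = EmbeddingModelFactory.create(config)
```

Under these assumptions, `first` and `second` refer to the same object, which is the point of the pattern: loading an embedding model is expensive, so repeated requests with an equal configuration reuse the existing instance.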

HuggingFaceEmbeddingModelTokenizerFactory

Bases: SingletonFactory

Factory for creating HuggingFace tokenizer functions.

This singleton factory creates and configures tokenizer functions for HuggingFace models based on the provided configuration.

Source code in src/embedding/embedding_models/hugging_face/embedding_model.py
class HuggingFaceEmbeddingModelTokenizerFactory(SingletonFactory):
    """Factory for creating HuggingFace tokenizer functions.

    This singleton factory creates and configures tokenizer functions for HuggingFace models
    based on the provided configuration.
    """

    _configuration_class: Type = HuggingFaceEmbeddingModelConfiguration

    @classmethod
    def _create_instance(
        cls, configuration: HuggingFaceEmbeddingModelConfiguration
    ) -> Callable:
        """Creates a tokenizer function based on provided configuration.

        Args:
            configuration: HuggingFace embedding model configuration.

        Returns:
            Callable: A tokenize function from the configured tokenizer.
        """
        return AutoTokenizer.from_pretrained(
            configuration.tokenizer_name
        ).tokenize
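The factory returns the tokenizer's bound `tokenize` method, a callable that maps a string to a list of token strings. One plausible use of such a callable is measuring text length in tokens, for example when checking a chunk against a size limit. The sketch below illustrates that idea; `whitespace_tokenize`, `count_tokens`, and `fits_in_budget` are hypothetical helpers that are not part of the module, and the whitespace splitter is only a stand-in so the example runs without downloading a tokenizer.

```python
from typing import Callable, List

# Type of the callable the factory is documented to return.
TokenizeFn = Callable[[str], List[str]]


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in for a HuggingFace tokenizer's `tokenize` method (assumption)."""
    return text.split()


def count_tokens(text: str, tokenize: TokenizeFn) -> int:
    """Return the number of tokens the given callable produces for `text`."""
    return len(tokenize(text))


def fits_in_budget(text: str, tokenize: TokenizeFn, max_tokens: int) -> bool:
    """Check whether `text` stays within a token budget."""
    return count_tokens(text, tokenize) <= max_tokens


sample = "factories cache one tokenizer per configuration"
token_count = count_tokens(sample, whitespace_tokenize)
within_limit = fits_in_budget(sample, whitespace_tokenize, max_tokens=6)
```

A real subword tokenizer will generally produce a different count than the whitespace stand-in, so any budget should be measured with the callable that matches the embedding model in use.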