Base

This module contains functionality related to the base module for embedding.orchestrators.

BaseDatasourceOrchestrator

Bases: ABC

Abstract base class for datasource orchestration.

Defines interface for managing content extraction, embedding generation, and vector storage operations across datasources.

Note

All implementing classes must provide concrete implementations of the extract, embed, save_to_vector_storage, and update_vector_storage methods.

Source code in src/embedding/orchestrators/base.py
from abc import ABC, abstractmethod


class BaseDatasourceOrchestrator(ABC):
    """Abstract base class for datasource orchestration.

    Defines interface for managing content extraction, embedding generation,
    and vector storage operations across datasources.

    Note:
        All implementing classes must provide concrete implementations
        of extract, embed, save_to_vector_storage and update_vector_storage.
    """

    @abstractmethod
    async def extract(self) -> None:
        """Extract content from configured datasources.

        Performs asynchronous content extraction from all configured
        datasource implementations.
        """
        pass

    @abstractmethod
    def embed(self) -> None:
        """Generate embeddings for extracted content.

        Processes extracted content through embedding model to
        generate vector representations.
        """
        pass

    @abstractmethod
    def save_to_vector_storage(self) -> None:
        """Persist embedded content to vector store.

        Saves generated embeddings and associated content to
        configured vector storage backend.
        """
        pass

    @abstractmethod
    def update_vector_storage(self) -> None:
        """Update existing vector store content.

        Updates or replaces existing embeddings in vector storage
        with newly generated ones.
        """
        pass
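
As a rough illustration of what the interface asks of a subclass, the sketch below redeclares a stand-in base class with the same four abstract methods and implements them entirely in memory. InMemoryOrchestrator, run_pipeline, the attribute names, and the length-based "embedding" are all hypothetical and are not part of the documented module.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseDatasourceOrchestrator(ABC):
    """Stand-in mirroring the four abstract methods documented above."""

    @abstractmethod
    async def extract(self) -> None: ...

    @abstractmethod
    def embed(self) -> None: ...

    @abstractmethod
    def save_to_vector_storage(self) -> None: ...

    @abstractmethod
    def update_vector_storage(self) -> None: ...


class InMemoryOrchestrator(BaseDatasourceOrchestrator):
    """Hypothetical implementation that keeps all state on the instance."""

    def __init__(self) -> None:
        self.documents = []  # filled by extract()
        self.vectors = []    # filled by embed()
        self.store = {}      # assumed backend: document text -> vector

    async def extract(self) -> None:
        await asyncio.sleep(0)  # placeholder for real asynchronous I/O
        self.documents = ["alpha", "beta"]

    def embed(self) -> None:
        # Toy "embedding": a one-dimensional vector holding the text length.
        self.vectors = [[float(len(doc))] for doc in self.documents]

    def save_to_vector_storage(self) -> None:
        self.store = dict(zip(self.documents, self.vectors))

    def update_vector_storage(self) -> None:
        self.store.update(zip(self.documents, self.vectors))


async def run_pipeline(orchestrator: BaseDatasourceOrchestrator) -> None:
    # extract() is a coroutine and must be awaited; the other steps are synchronous.
    await orchestrator.extract()
    orchestrator.embed()
    orchestrator.save_to_vector_storage()


orchestrator = InMemoryOrchestrator()
asyncio.run(run_pipeline(orchestrator))
```

Because every method is marked with abstractmethod, calling BaseDatasourceOrchestrator() directly raises TypeError, as does instantiating a subclass that omits any of the four methods.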

embed() abstractmethod

Generate embeddings for extracted content.

Processes extracted content through embedding model to generate vector representations.

Source code in src/embedding/orchestrators/base.py
@abstractmethod
def embed(self) -> None:
    """Generate embeddings for extracted content.

    Processes extracted content through embedding model to
    generate vector representations.
    """
    pass
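
Since embed() takes no arguments and returns None, an implementation presumably reads the extracted content from instance state and stores the resulting vectors there as well. The sketch below assumes exactly that. EmbedStep and toy_embed are hypothetical names, and the hash-based function only stands in for a real embedding model.

```python
import hashlib


def toy_embed(text: str, dim: int = 4) -> list:
    """Hypothetical stand-in for an embedding model.

    Maps the first `dim` bytes of a SHA-256 digest to floats in [0, 1].
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


class EmbedStep:
    """Holds only the state that embed() needs in this sketch."""

    def __init__(self, documents):
        self.documents = list(documents)  # assumed output of extract()
        self.embeddings = []              # filled by embed()

    def embed(self) -> None:
        # One vector per document, kept in the same order as the documents.
        self.embeddings = [toy_embed(doc) for doc in self.documents]


step = EmbedStep(["first document", "second document"])
step.embed()
```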

extract() abstractmethod async

Extract content from configured datasources.

Performs asynchronous content extraction from all configured datasource implementations.

Source code in src/embedding/orchestrators/base.py
@abstractmethod
async def extract(self) -> None:
    """Extract content from configured datasources.

    Performs asynchronous content extraction from all configured
    datasource implementations.
    """
    pass
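
One plausible way to extract from all configured datasources asynchronously is to run one coroutine per datasource and collect the results with asyncio.gather. The sketch below assumes that approach. ExtractStep, fetch_documents, and the datasource names are hypothetical.

```python
import asyncio


async def fetch_documents(name: str) -> list:
    """Hypothetical datasource reader; a real one would do network or disk I/O."""
    await asyncio.sleep(0)
    return [f"{name}-doc-1", f"{name}-doc-2"]


class ExtractStep:
    """Holds only the state that extract() needs in this sketch."""

    def __init__(self, datasource_names):
        self.datasource_names = list(datasource_names)
        self.documents = []  # filled by extract()

    async def extract(self) -> None:
        # Run all readers concurrently; gather returns results in input order.
        batches = await asyncio.gather(
            *(fetch_documents(name) for name in self.datasource_names)
        )
        self.documents = [doc for batch in batches for doc in batch]


step = ExtractStep(["wiki", "tickets"])
asyncio.run(step.extract())
```

Callers that are not already inside an event loop can drive the coroutine with asyncio.run, as in the last line; inside async code it would be awaited directly.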

save_to_vector_storage() abstractmethod

Persist embedded content to vector store.

Saves generated embeddings and associated content to configured vector storage backend.

Source code in src/embedding/orchestrators/base.py
@abstractmethod
def save_to_vector_storage(self) -> None:
    """Persist embedded content to vector store.

    Saves generated embeddings and associated content to
    configured vector storage backend.
    """
    pass

update_vector_storage() abstractmethod

Update existing vector store content.

Updates or replaces existing embeddings in vector storage with newly generated ones.

Source code in src/embedding/orchestrators/base.py
@abstractmethod
def update_vector_storage(self) -> None:
    """Update existing vector store content.

    Updates or replaces existing embeddings in vector storage
    with newly generated ones.
    """
    pass
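
"Updates or replaces" can be read as upsert behaviour: entries whose key already exists are overwritten, and new keys are added. The sketch below assumes that reading and models the vector store as a plain dict keyed by document id. UpdateStep and its attributes are hypothetical, and a real backend would expose its own client API.

```python
class UpdateStep:
    """Holds only the state that update_vector_storage() needs in this sketch."""

    def __init__(self, store):
        self.store = dict(store)  # assumed backend: document id -> vector
        self.embeddings = {}      # newly generated vectors, keyed the same way

    def update_vector_storage(self) -> None:
        # Upsert: overwrite vectors for known ids, add vectors for new ids.
        self.store.update(self.embeddings)


step = UpdateStep({"doc-a": [0.0], "doc-b": [1.0]})
step.embeddings = {"doc-b": [2.0], "doc-c": [3.0]}
step.update_vector_storage()
```

In this sketch, ids that are absent from the new embeddings (doc-a here) are left untouched; whether stale entries should instead be deleted is a design decision the interface leaves to implementations.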