Base_splitter
This module contains functionality related to the base_splitter module for embedding.splitters.
BaseSplitter
Bases: ABC, Generic[DocType]
Abstract base class for document splitters.
This class defines a common interface for document splitters that transform various document types into text nodes for further processing. It leverages generic typing to support different document formats while maintaining type safety.
Implementations should handle the specific logic required to parse and split different document types into meaningful text chunks.
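The interface described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/embedding/splitters/base_splitter.py: the `TextNode` dataclass is a hypothetical stand-in for the project's real node type, `PlainTextSplitter` is an invented example subclass, and the `split` signature and return type are assumptions based on the description.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

# Type variable for the document format a splitter accepts.
DocType = TypeVar("DocType")


@dataclass
class TextNode:
    """Hypothetical stand-in for the project's TextNode type."""

    text: str


class BaseSplitter(ABC, Generic[DocType]):
    """Sketch of the interface described above (assumed signature)."""

    @abstractmethod
    def split(self, document: DocType) -> TextNode:
        """Convert one document into a text node."""


class PlainTextSplitter(BaseSplitter[str]):
    """Illustrative subclass that binds DocType to str."""

    def split(self, document: str) -> TextNode:
        # Strip surrounding whitespace and wrap the text in a node.
        return TextNode(text=document.strip())


# The abstract base cannot be instantiated directly.
try:
    BaseSplitter()
    instantiated = True
except TypeError:
    instantiated = False

node = PlainTextSplitter().split("  hello world \n")
```

Binding the type parameter in the subclass (`BaseSplitter[str]`) lets a static type checker flag calls that pass the wrong document type, while `ABC` prevents the base class from being used without a `split` implementation.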
Source code in src/embedding/splitters/base_splitter.py
split(document)
abstractmethod
Split a document into a text node.
This method processes a single document and converts it into a TextNode representation suitable for embedding or other processing. Implementing classes should define the specific logic for parsing different document types.
Parameters:
document: The document to split.
Returns:
A TextNode representation of the document.
Source code in src/embedding/splitters/base_splitter.py
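As an illustration of how an implementing class might define `split` for a structured document type, here is a hedged sketch. `WikiPage`, `WikiPageSplitter`, the `metadata` field, and the whitespace-normalizing logic are all hypothetical; `TextNode` is again a simplified stand-in, and the base class is redefined here only so the example runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Generic, TypeVar

DocType = TypeVar("DocType")


@dataclass
class TextNode:
    """Simplified stand-in for the project's TextNode (assumption)."""

    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class WikiPage:
    """Hypothetical document type used only for this example."""

    title: str
    body: str


class BaseSplitter(ABC, Generic[DocType]):
    """Assumed shape of the abstract interface."""

    @abstractmethod
    def split(self, document: DocType) -> TextNode:
        """Convert one document into a text node."""


class WikiPageSplitter(BaseSplitter[WikiPage]):
    """Document-specific parsing lives in the subclass."""

    def split(self, document: WikiPage) -> TextNode:
        # Collapse runs of whitespace in the body into single spaces.
        body = " ".join(document.body.split())
        return TextNode(
            text=f"{document.title}\n{body}",
            metadata={"title": document.title},
        )


page = WikiPage(title="Setup", body="Install  the\npackage first.")
node = WikiPageSplitter().split(page)
```

Callers depend only on `split`, so downstream code that embeds or indexes nodes does not need to know which document format produced them.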