Langfuse
This module contains functionality related to the `langfuse` module of `evaluation.evaluators`.
LangfuseEvaluator
Evaluator that tracks RAG performance metrics in Langfuse.
Integrates chat engine execution with RAGAS evaluation and publishes quality metrics to Langfuse for monitoring and analysis.
Source code in src/evaluation/evaluators/langfuse.py
__init__(chat_engine, langfuse_dataset_service, ragas_evaluator, run_metadata)
Initialize the Langfuse evaluator with required components.
Parameters:

- `chat_engine` – chat engine used to generate responses for dataset items.
- `langfuse_dataset_service` – service that provides access to Langfuse datasets.
- `ragas_evaluator` – RAGAS evaluator used to compute quality metrics.
- `run_metadata` – metadata attached to the evaluation run in Langfuse.
Source code in src/evaluation/evaluators/langfuse.py
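For orientation, a minimal construction sketch based on the signature above; the four collaborator objects and the metadata values are placeholders, not the repository's actual wiring:

```python
from src.evaluation.evaluators.langfuse import LangfuseEvaluator

# The four collaborators below are stand-ins; their concrete types come
# from elsewhere in the application and are assumptions in this sketch.
evaluator = LangfuseEvaluator(
    chat_engine=chat_engine,                   # generates answers per item
    langfuse_dataset_service=dataset_service,  # fetches Langfuse datasets
    ragas_evaluator=ragas_evaluator,           # computes RAGAS metrics
    run_metadata={"run": "nightly", "model": "gpt-4o"},  # hypothetical values
)

evaluator.evaluate(dataset_name="rag-eval")  # hypothetical dataset name
```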
evaluate(dataset_name)
Run evaluation on a dataset and record results in Langfuse.
Processes each item in the dataset, generates responses using the chat engine, calculates evaluation metrics, and uploads all results to Langfuse for monitoring.
Parameters:

- `dataset_name` – name of the Langfuse dataset to evaluate.
Note
Records scores for answer relevancy, context recall, faithfulness, and harmfulness metrics when they are available (not NaN values).
Source code in src/evaluation/evaluators/langfuse.py
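The loop described above can be sketched against the Langfuse v2 Python SDK (`get_dataset`, `item.observe`, `score`); `chat_engine`, `ragas_evaluator`, the run name, and the dataset name are assumptions for illustration, while the score names and NaN filtering follow the Note:

```python
import math

from langfuse import Langfuse

langfuse = Langfuse()  # credentials read from LANGFUSE_* environment variables
dataset = langfuse.get_dataset("rag-eval")  # hypothetical dataset name

for item in dataset.items:
    # Link the generated trace to the dataset item under a named run.
    with item.observe(run_name="nightly-eval") as trace_id:
        answer = chat_engine.chat(item.input)  # chat_engine is an assumption

    # Compute metrics; ragas_evaluator and its return shape are assumptions.
    scores = ragas_evaluator.evaluate(item.input, answer, item.expected_output)

    # Record only metrics that are available (skip NaN), as the Note describes.
    for name in ("answer_relevancy", "context_recall", "faithfulness", "harmfulness"):
        value = scores.get(name)
        if value is not None and not math.isnan(value):
            langfuse.score(trace_id=trace_id, name=name, value=value)

langfuse.flush()  # make sure buffered events reach Langfuse
```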
LangfuseEvaluatorFactory
Bases: Factory
Factory for creating LangfuseEvaluator instances.
Creates properly configured evaluators based on the provided configuration.
Source code in src/evaluation/evaluators/langfuse.py
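A hedged sketch of factory usage; the `create` method name and the `configuration` object are assumptions about the shared `Factory` base class, not confirmed by this page:

```python
from src.evaluation.evaluators.langfuse import LangfuseEvaluatorFactory

# Both the configuration shape and the `create` method are assumptions
# about the Factory base class this page references.
factory = LangfuseEvaluatorFactory()
evaluator = factory.create(configuration)  # returns a configured LangfuseEvaluator
evaluator.evaluate(dataset_name="rag-eval")
```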