Description
Describe the Feature
The cost of test data generation is too high, so I want to use an open-source Hugging Face model for it instead. However, this does not seem to be compatible with the current version.
Why is the feature important for you?
from langchain.llms import HuggingFacePipeline
from transformers import pipeline
# Load a Hugging Face model using the transformers pipeline
generator = pipeline("text-generation", model="gpt2")
# Create a HuggingFacePipeline LLM instance
llm = HuggingFacePipeline(pipeline=generator)
from langchain.embeddings import HuggingFaceEmbeddings
from sentence_transformers import SentenceTransformer
# Load the Hugging Face embedding model (e.g., a sentence-transformers model)
model_name = "all-MiniLM-L6-v2" # A common model for sentence embeddings
# Create an embedding instance using HuggingFaceEmbeddings, providing the model_name
embedding = HuggingFaceEmbeddings(model_name=model_name)
from langchain_core.language_models import BaseLanguageModel
from langchain_core.embeddings import Embeddings
# Make sure to wrap them with the ragas wrappers
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
langchain_llm = LangchainLLMWrapper(llm)
langchain_embeddings = LangchainEmbeddingsWrapper(embedding)
! git clone https://huggingface.co/datasets/explodinggradients/prompt-engineering-guide-papers
from langchain_community.document_loaders import DirectoryLoader
loader = DirectoryLoader("./prompt-engineering-guide-papers/", glob="*.pdf")
documents = loader.load()
for document in documents:
    document.metadata["filename"] = document.metadata["source"]
docs = [doc for doc in documents if len(doc.page_content.split()) > 5000]
from ragas.testset import TestsetGenerator
# generator with the wrapped Hugging Face models
generator_llm = langchain_llm
critic_llm = langchain_llm
embeddings = langchain_embeddings
generator = TestsetGenerator.from_langchain(llm=generator_llm, embedding_model=embeddings)
# generate testset
testset = generator.generate_with_langchain_docs(documents[:2], testset_size=3)
Running this code produces the following error messages:
AttributeError: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'headlines' property not found in this node
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'headlines' property not found in this node
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: node.property('summary') must be a string, found '<class 'NoneType'>'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: node.property('summary') must be a string, found '<class 'NoneType'>'
ERROR:ragas.testset.transforms.engine:unable to apply transformation: Node 3da710f8-fc20-40ef-97bf-7f11fb7be538 has no summary_embedding
Additional context
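One possible explanation, which I have not confirmed against the ragas source: `agenerate_prompt` is a method of LangChain language models, so the error suggests that something is calling it on a `LangchainLLMWrapper` rather than on the underlying LangChain LLM. That could happen if `TestsetGenerator.from_langchain` wraps its arguments itself, so that an already wrapped object ends up wrapped twice. The sketch below only illustrates that failure mode. `FakeLangchainLLM`, `FakeWrapper`, and `try_generate` are hypothetical stand-ins, not ragas or LangChain classes.

```python
import asyncio


class FakeLangchainLLM:
    """Hypothetical stand-in for a LangChain LLM that exposes agenerate_prompt."""

    async def agenerate_prompt(self, prompts):
        return [f"echo: {p}" for p in prompts]


class FakeWrapper:
    """Hypothetical stand-in for a wrapper that delegates to a LangChain LLM."""

    def __init__(self, langchain_llm):
        self.langchain_llm = langchain_llm

    async def generate(self, prompt):
        # Delegates to the wrapped object, which must expose agenerate_prompt.
        results = await self.langchain_llm.agenerate_prompt([prompt])
        return results[0]


def try_generate(llm):
    """Wrap `llm` once, run one generation, and return the text or the error message."""
    try:
        return asyncio.run(FakeWrapper(llm).generate("hi"))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# Wrapped once: the wrapper finds agenerate_prompt on the raw LLM.
single = try_generate(FakeLangchainLLM())

# Wrapped twice: the outer wrapper looks for agenerate_prompt on the inner wrapper.
double = try_generate(FakeWrapper(FakeLangchainLLM()))

print(single)
print(double)
```

If this is what is happening, passing the unwrapped `llm` and `embedding` objects to `TestsetGenerator.from_langchain` might avoid the error, but I have not verified this.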