598 - Fix pydantic validation error #664

LorenzoPaleari · 2024-09-13T00:06:42Z

A pydantic validation error occurs when using the IterativeGraph module.
The problem arises when IterativeGraph reuses the same instance of a class across multiple executions, which prevents the class from passing the llm correctly into its functions.

Modified the search graphs to move instantiation inside IterativeGraph.
IterativeGraph now creates multiple instances of the class being iterated over. This could be improved further.
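
The difference between reusing one instance and creating one instance per execution can be sketched as follows. This is a minimal illustration only, not the ScrapeGraphAI code: `ToyGraph`, `run_shared`, and `run_fresh` are hypothetical names, and the leftover `history` state is an assumed stand-in for whatever state actually interfered with passing the llm.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical stand-in for a graph class that gets iterated over."""
    config: dict
    source: str = ""
    history: list = field(default_factory=list)

    def run(self) -> str:
        # State written during a run stays on the instance afterwards.
        self.history.append(self.source)
        return f"{self.config['llm']} scraped {self.source}"


def run_shared(graph: ToyGraph, sources: list) -> list:
    """Reuse a single instance for every source (the pattern being replaced)."""
    results = []
    for src in sources:
        graph.source = src
        results.append(graph.run())
    return results


def run_fresh(graph_cls, config: dict, sources: list):
    """Create one instance per source so no state leaks between runs."""
    graphs = [graph_cls(config=dict(config), source=src) for src in sources]
    results = [g.run() for g in graphs]
    return results, graphs
```

With the shared instance, `history` accumulates every source it has seen; with fresh instances, each object only ever sees its own source. The cost is one extra construction per iteration.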

Changed instantiation location of iterated graph classes

rjbks · 2024-09-13T00:26:34Z

@LorenzoPaleari

The OpenAI models (4o and 4o-mini) do not throw errors, but they also do not respect the pydantic models:
Models:

from typing import List, Literal, Optional

from pydantic import BaseModel, Field

class MatchedSchool(BaseModel):
    name: str = Field(description="The name of the input candidate medical school.")
    alternate_names: List[str] = Field(description="A list of alternate names referencing this school. Could be abbreviations, or fully spelled out names, as well as names of individual departments responsible for the Medical Curriculum within the school.")
    is_med_school: Literal["true", "false"] = Field(description="Whether or not the input school is actually a medical school or has a medical program.")
    city: Optional[str] = Field(description="The city where the matched medical school campus/program facility is located, if available.")
    state: Optional[str] = Field(description="The state (or if international, the geographic/political region within the country) where the matched medical school campus/program facility is located, if available.")
    country: Optional[str] = Field(description="The country where the matched medical school campus/program facility is located, if available.")
    source: str = Field(description="Source URL where this match was found.")

class Matches(BaseModel):
    matches: List[MatchedSchool] = Field(description="A list of matched medical schools.")

Output:

{
 "name": "Centro Universit\u00e1rio Franciscano (UNIFRA)",
 "alternate_names": [
   "UNIFRA",
   "Centro Universit\u00e1rio Franciscano"
 ],
 "is_med_school": "true",
 "city": "Santa Maria",
 "state": "Rio Grande do Sul",
 "country": "Brazil",
 "sources": [
   "https://caper.ca/sites/default/files/pdf/CAPER_MedicalSchools_September_2022.xlsx",
   "https://www.facebook.com/engmatUNIFRA/",
   "https://unifra.academia.edu/ClariceMachado"
 ]
}
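
One way to spot this kind of mismatch when nothing raises is to compare the returned keys against the field names declared in the schema. The sketch below uses only the standard library. `EXPECTED_FIELDS` is copied by hand from `MatchedSchool` above, `diff_keys` is a hypothetical helper (not part of ScrapeGraphAI or pydantic), and the payload is abridged from the output above.

```python
import json

# Field names copied by hand from the MatchedSchool model (an assumption of
# this sketch; a real check would read them from the model itself).
EXPECTED_FIELDS = {
    "name", "alternate_names", "is_med_school",
    "city", "state", "country", "source",
}


def diff_keys(payload: dict, expected: set) -> dict:
    """Report which expected keys are missing and which keys are unexpected."""
    actual = set(payload)
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


# Abridged version of the output shown in the comment.
raw_output = """
{
  "name": "UNIFRA",
  "alternate_names": ["UNIFRA"],
  "is_med_school": "true",
  "city": "Santa Maria",
  "state": "Rio Grande do Sul",
  "country": "Brazil",
  "sources": ["https://www.facebook.com/engmatUNIFRA/"]
}
"""

report = diff_keys(json.loads(raw_output), EXPECTED_FIELDS)
print(report)  # {'missing': ['source'], 'unexpected': ['sources']}
```

The report shows the model returned a `sources` list where the schema declares a single `source` string. The output is also a bare object rather than one wrapped in a `matches` list as `Matches` declares.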

Gemini 1.5 Flash works well and formats the output properly, but Gemini 1.5 Pro still has the formatting issue:

File "/opt/anaconda3/envs/med_device/lib/python3.12/site-packages/langchain_core/output_parsers/json.py", line 87, in parse_result
    raise OutputParserException(msg, llm_output=text) from e
langchain_core.exceptions.OutputParserException: Invalid json output: ```json
{'matches': []}
\```

As detailed here #598
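
A note on the payload in that traceback: it uses single quotes, which is Python literal syntax rather than JSON, so a strict JSON parser rejects it whatever the schema says. The standard-library sketch below demonstrates only that point. It does not reproduce langchain's parser, and `strip_fence` is a hypothetical helper for removing the Markdown fence.

```python
import ast
import json


def strip_fence(text: str) -> str:
    """Drop a leading and trailing Markdown code-fence line, if present."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith("```"):
        lines = lines[1:]
    if lines and lines[-1].startswith("```"):
        lines = lines[:-1]
    return "\n".join(lines)


# The text reported in the exception message above.
llm_output = "```json\n{'matches': []}\n```"
body = strip_fence(llm_output)

try:
    json.loads(body)
    strict_ok = True
except json.JSONDecodeError:
    # Single-quoted keys are not valid JSON.
    strict_ok = False

# The same text is a valid Python literal, so it can still be recovered.
as_python = ast.literal_eval(body)
print(strict_ok, as_python)
```

Whether falling back to `ast.literal_eval` is appropriate depends on how much the model output is trusted. It evaluates literals only, but it still accepts input that is not JSON.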

LorenzoPaleari · 2024-09-13T00:55:44Z

OpenAI

The schema is not correctly passed to the final node, MergeAnswerNode.
Fixing!

Gemini

The error is not caused by a coding problem. As your log shows, the llm returned an empty matches list: {'matches': []}.
This cannot be parsed with the provided schema, since the schema does not allow empty lists.

See these:

https://stackoverflow.com/questions/61468548/check-if-list-is-not-empty-with-pydantic-in-an-elegant-way

pydantic/pydantic#367

Consider switching to pydantic itself if you are using langchain_core.pydantic_v1. That module is based on an old pydantic version, so the links above will not help with it.

VinciGit00 · 2024-09-13T07:00:14Z

thank you

github-actions · 2024-09-13T07:01:36Z

🎉 This PR is included in version 1.19.0-beta.10 🎉

The release is available on:

v1.19.0-beta.10
GitHub release

Your semantic-release bot 📦🚀

github-actions · 2024-09-14T08:55:28Z

🎉 This PR is included in version 1.20.0-beta.1 🎉

The release is available on:

v1.20.0-beta.1
GitHub release

Your semantic-release bot 📦🚀

github-actions · 2024-09-19T08:13:26Z

🎉 This PR is included in version 1.21.0 🎉

The release is available on:

v1.21.0
GitHub release

Your semantic-release bot 📦🚀

fix: Fixed pydantic error on SearchGraphs

039ba2e

Changed instantiation location of iterated graph classes

LorenzoPaleari mentioned this pull request Sep 13, 2024

[1.14.0+] pydantic ValidationError with SmartScraperGraph #598

Closed

LorenzoPaleari added 2 commits September 13, 2024 04:18

fix: Added support for nested structure

66ea166

fix: update all nodes that were using MergeNode or IteratorNode

a92dddb

VinciGit00 merged commit 2ae26e9 into ScrapeGraphAI:pre/beta Sep 13, 2024

github-actions bot added the released on @dev label Sep 13, 2024

github-actions bot added the released on @stable label Sep 19, 2024
