ScrapeGraphAI
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 38 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 11 additions & 11 deletions
diff --git a/docs/chinese.md b/docs/chinese.md
Lines changed: 58 additions & 47 deletions
diff --git a/examples/anthropic/csv_scraper_graph_multi_haiku.py b/examples/anthropic/csv_scraper_graph_multi_haiku.py
Lines changed: 55 additions & 0 deletions
diff --git a/examples/anthropic/json_scraper_multi_haiku.py b/examples/anthropic/json_scraper_multi_haiku.py
Lines changed: 36 additions & 0 deletions
diff --git a/examples/anthropic/pdf_scraper_graph_haiku.py b/examples/anthropic/pdf_scraper_graph_haiku.py
Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,41 @@
+## [1.6.0-beta.6](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.5...v1.6.0-beta.6) (2024-06-04)
+
+
+### Features
+
+* refactoring of abstract graph ([fff89f4](https://github.com/VinciGit00/Scrapegraph-ai/commit/fff89f431f60b5caa4dd87643a1bb8895bf96d48))
+
+## [1.6.0-beta.5](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.4...v1.6.0-beta.5) (2024-06-04)
+
+
+### Features
+
+* refactoring of an in if ([244aada](https://github.com/VinciGit00/Scrapegraph-ai/commit/244aada2de1f3bc88782fa90e604e8b936b79aa4))
+
+## [1.6.0-beta.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.3...v1.6.0-beta.4) (2024-06-03)
+
+
+### Features
+
+* fix an if ([c8d556d](https://github.com/VinciGit00/Scrapegraph-ai/commit/c8d556da4e4b8730c6c35f1d448270b8e26923f2))
+
+## [1.6.0-beta.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.2...v1.6.0-beta.3) (2024-06-03)
+
+
+### Features
+
+* removed a bug ([8de720d](https://github.com/VinciGit00/Scrapegraph-ai/commit/8de720d37958e31b73c5c89bc21f474f3303b42b))
+
+## [1.6.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.1...v1.6.0-beta.2) (2024-06-03)
+
+
+### Features
+
+* add csv scraper and xml scraper multi ([b408655](https://github.com/VinciGit00/Scrapegraph-ai/commit/b4086550cc9dc42b2fd91ee7ef60c6a2c2ac3fd2))
+* add json multiscraper ([5bda918](https://github.com/VinciGit00/Scrapegraph-ai/commit/5bda918a39e4b50d86d784b4c592cc2ea1a68986))
+* add pdf scraper multi graph ([f5cbd80](https://github.com/VinciGit00/Scrapegraph-ai/commit/f5cbd80c977f51233ac1978d8450fcf0ec2ff461))
+* removed rag node ([930f673](https://github.com/VinciGit00/Scrapegraph-ai/commit/930f67374752561903462a25728c739946f9449b))
+
 ## [1.6.0-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.5-beta.1...v1.6.0-beta.1) (2024-06-02)
 
 
 
@@ -1,6 +1,6 @@
 
 # 🕷️ ScrapeGraphAI: You Only Scrape Once
-[English](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/README.md) | [中国人](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/docs/chinese.md)
+[English](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/README.md) | [中文](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/docs/chinese.md)
 
 [![Downloads](https://static.pepy.tech/badge/scrapegraphai)](https://pepy.tech/project/scrapegraphai)
 [![linting: pylint](https://img.shields.io/badge/linting-pylint-yellowgreen)](https://github.com/pylint-dev/pylint)
@@ -164,6 +164,16 @@ print(result)
 
 The output will be an audio file with the summary of the projects on the page.
 
+## Sponsors
+<div style="text-align: center;">
+  <a href="https://serpapi.com?utm_source=scrapegraphai">
+    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/serp_api_logo.png" alt="SerpAPI" style="width: 10%;">
+  </a>
+  <a href="https://dashboard.statproxies.com/?refferal=scrapegraph">
+    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/transparent_stat.png" alt="Stats" style="width: 15%;">
+  </a>
+</div>
+
 ## 🤝 Contributing
 
 Feel free to contribute and join our Discord server to discuss improvements with us and share your suggestions!
@@ -182,16 +192,6 @@ Wanna visualize the roadmap in a more interactive way? Check out the [markmap](h
 ## ❤️ Contributors
 [![Contributors](https://contrib.rocks/image?repo=VinciGit00/Scrapegraph-ai)](https://github.com/VinciGit00/Scrapegraph-ai/graphs/contributors)
 
-## Sponsors
-<div style="text-align: center;">
-  <a href="https://serpapi.com?utm_source=scrapegraphai">
-    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/serp_api_logo.png" alt="SerpAPI" style="width: 10%;">
-  </a>
-  <a href="https://dashboard.statproxies.com/?refferal=scrapegraph">
-    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/transparent_stat.png" alt="Stats" style="width: 15%;">
-  </a>
-</div>
-
 ## 🎓 Citations
 If you have used our library for research purposes, please cite us using the following reference:
 ```text
 
@@ -21,34 +21,36 @@ Scrapegraph-ai 的参考页面可以在 PyPI 的官方网站上找到: [pypi](ht
 ```bash
 pip install scrapegraphai
 ```
-注意: 建议在虚拟环境中安装该库，以避免与其他库发生冲突 🐱
+**注意**: 建议在虚拟环境中安装该库，以避免与其他库发生冲突 🐱
 
-🔍 演示
+## 🔍 演示
 
 官方 Streamlit 演示：
 
-
+[![My Skills](https://skillicons.dev/icons?i=react)](https://scrapegraph-ai-web-dashboard.streamlit.app)
 
 在 Google Colab 上直接尝试：
 
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1sEZBonBMGP44CtO6GQTwAlL0BGJXjtfd?usp=sharing)
+
 ## 📖 文档
 
-ScrapeGraphAI 的文档可以在这里找到。
+ScrapeGraphAI 的文档可以在[这里](https://scrapegraph-ai.readthedocs.io/en/latest/)找到。
 
-还可以查看 Docusaurus 这里。
+还可以查看 Docusaurus 的[版本](https://scrapegraph-doc.onrender.com/)。
 
 ## 💻 用法
 
 有三种主要的爬取管道可用于从网站（或本地文件）提取信息：
 
-SmartScraperGraph: 单页爬虫，只需用户提示和输入源；
-SearchGraph: 多页爬虫，从搜索引擎的前 n 个搜索结果中提取信息；
-SpeechGraph: 单页爬虫，从网站提取信息并生成音频文件。
-SmartScraperMultiGraph: 多页爬虫，给定一个提示
-可以通过 API 使用不同的 LLM，如 OpenAI，Groq，Azure 和 Gemini，或者使用 Ollama 的本地模型。
+- `SmartScraperGraph`: 单页爬虫，只需用户提示和输入源；
+- `SearchGraph`: 多页爬虫，从搜索引擎的前 n 个搜索结果中提取信息；
+- `SpeechGraph`: 单页爬虫，从网站提取信息并生成音频文件。
+- `SmartScraperMultiGraph`: 多页爬虫，给定一个提示
+可以通过 API 使用不同的 LLM，如 **OpenAI**，**Groq**，**Azure** 和 **Gemini**，或者使用 **Ollama** 的本地模型。
 
-案例 1: 使用本地模型的 SmartScraper
-请确保已安装 Ollama 并使用 ollama pull 命令下载模型。
+### 案例 1: 使用本地模型的 SmartScraper
+请确保已安装 [Ollama](https://ollama.com/) 并使用 `ollama pull` 命令下载模型。
 
 ``` python
 from scrapegraphai.graphs import SmartScraperGraph
@@ -68,23 +70,24 @@ graph_config = {
 }
 
 smart_scraper_graph = SmartScraperGraph(
-    prompt="列出所有项目及其描述",
+    prompt="List me all the projects with their descriptions",
     # 也接受已下载的 HTML 代码的字符串
     source="https://perinim.github.io/projects",
     config=graph_config
 )
 
 result = smart_scraper_graph.run()
 print(result)
-``` 
+```
 
 输出将是一个包含项目及其描述的列表，如下所示：
 
-python
-Copia codice
-{'projects': [{'title': 'Rotary Pendulum RL', 'description': '开源项目，旨在使用 RL 算法控制现实中的旋转摆'}, {'title': 'DQN Implementation from scratch', 'description': '开发了一个深度 Q 网络算法来训练简单和双摆'}, ...]}
-案例 2: 使用混合模型的 SearchGraph
-我们使用 Groq 作为 LLM，使用 Ollama 作为嵌入模型。
+```python
+{'projects': [{'title': 'Rotary Pendulum RL', 'description': 'Open Source project aimed at controlling a real life rotary pendulum using RL algorithms'}, {'title': 'DQN Implementation from scratch', 'description': 'Developed a Deep Q-Network algorithm to train a simple and double pendulum'}, ...]}
+```
+
+### 案例 2: 使用混合模型的 SearchGraph
+我们使用 **Groq** 作为 LLM，使用 **Ollama** 作为嵌入模型。
 
 ```python
 from scrapegraphai.graphs import SearchGraph
@@ -105,7 +108,7 @@ graph_config = {
 
 # 创建 SearchGraph 实例
 search_graph = SearchGraph(
-    prompt="列出所有来自基奥贾的传统食谱",
+    prompt="List me all the traditional recipes from Chioggia",
     config=graph_config
 )
 
@@ -118,9 +121,12 @@ print(result)
 
 ```python
 {'recipes': [{'name': 'Sarde in Saòre'}, {'name': 'Bigoli in salsa'}, {'name': 'Seppie in umido'}, {'name': 'Moleche frite'}, {'name': 'Risotto alla pescatora'}, {'name': 'Broeto'}, {'name': 'Bibarasse in Cassopipa'}, {'name': 'Risi e bisi'}, {'name': 'Smegiassa Ciosota'}]}
-案例 3: 使用 OpenAI 的 SpeechGraph
-您只需传递 OpenAI API 密钥和模型名称。
 ```
+
+### 案例 3: 使用 OpenAI 的 SpeechGraph
+
+您只需传递 OpenAI API 密钥和模型名称。
+
 ```python
 from scrapegraphai.graphs import SpeechGraph
 
@@ -142,7 +148,7 @@ graph_config = {
 # ************************************************
 
 speech_graph = SpeechGraph(
-    prompt="详细总结这些项目并生成音频。",
+    prompt="Make a detailed audio summary of the projects.",
     source="https://perinim.github.io/projects/",
     config=graph_config,
 )
@@ -152,36 +158,38 @@ print(result)
 ```
 输出将是一个包含页面上项目摘要的音频文件。
 
-## 🤝 贡献
+## 赞助商
 
-欢迎贡献并加入我们的 Discord 服务器与我们讨论改进和提出建议！
+<div style="text-align: center;">
+  <a href="https://serpapi.com?utm_source=scrapegraphai">
+    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/serp_api_logo.png" alt="SerpAPI" style="width: 10%;">
+  </a>
+  <a href="https://dashboard.statproxies.com/?refferal=scrapegraph">
+    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/transparent_stat.png" alt="Stats" style="width: 15%;">
+  </a>
+</div>
 
-请参阅贡献指南。
+## 🤝 贡献
 
+欢迎贡献并加入我们的 Discord 服务器与我们讨论改进和提出建议！
 
+请参阅[贡献指南](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/CONTRIBUTING.md)。
 
+[![My Skills](https://skillicons.dev/icons?i=discord)](https://discord.gg/uJN7TYcpNa)
+[![My Skills](https://skillicons.dev/icons?i=linkedin)](https://www.linkedin.com/company/scrapegraphai/)
+[![My Skills](https://skillicons.dev/icons?i=twitter)](https://twitter.com/scrapegraphai)
 
 
-📈 路线图
+## 📈 路线图
 
-查看项目路线图这里! 🚀
+在[这里](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/docs/README.md)查看项目路线图! 🚀
 
-想要以更互动的方式可视化路线图？请查看 markmap 通过将 markdown 内容复制粘贴到编辑器中进行可视化！
+想要以更互动的方式可视化路线图？请查看 [markmap](https://markmap.js.org/repl) 通过将 markdown 内容复制粘贴到编辑器中进行可视化！
 
 ## ❤️ 贡献者
+[![Contributors](https://contrib.rocks/image?repo=VinciGit00/Scrapegraph-ai)](https://github.com/VinciGit00/Scrapegraph-ai/graphs/contributors)
 
 
-赞助商
-
-<div style="text-align: center;">
-  <a href="https://serpapi.com?utm_source=scrapegraphai">
-    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/serp_api_logo.png" alt="SerpAPI" style="width: 10%;">
-  </a>
-  <a href="https://dashboard.statproxies.com/?refferal=scrapegraph">
-    <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/transparent_stat.png" alt="Stats" style="width: 15%;">
-  </a>
-</div>
-
 ## 🎓 引用
 
 如果您将我们的库用于研究目的，请引用以下参考文献：
@@ -199,16 +207,19 @@ print(result)
 <p align="center">
   <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/logo_authors.png" alt="Authors_logos">
 </p>
+
 ## 联系方式
+|                    | Contact Info         |
+|--------------------|----------------------|
+| Marco Vinciguerra  | [![Linkedin Badge](https://img.shields.io/badge/-Linkedin-blue?style=flat&logo=Linkedin&logoColor=white)](https://www.linkedin.com/in/marco-vinciguerra-7ba365242/)    |
+| Marco Perini       | [![Linkedin Badge](https://img.shields.io/badge/-Linkedin-blue?style=flat&logo=Linkedin&logoColor=white)](https://www.linkedin.com/in/perinim/)   |
+| Lorenzo Padoan     | [![Linkedin Badge](https://img.shields.io/badge/-Linkedin-blue?style=flat&logo=Linkedin&logoColor=white)](https://www.linkedin.com/in/lorenzo-padoan-4521a2154/)  |
 
-Marco Vinciguerra	
-Marco Perini	
-Lorenzo Padoan	
 ## 📜 许可证
 
-ScrapeGraphAI 采用 MIT 许可证。更多信息请查看 LICENSE 文件。
+ScrapeGraphAI 采用 MIT 许可证。更多信息请查看 [LICENSE](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/LICENSE) 文件。
 
-鸣谢
+## 鸣谢
 
-我们要感谢所有项目贡献者和开源社区的支持。
-ScrapeGraphAI 仅用于数据探索和研究目的。我们不对任何滥用该库的行为负责。
+- 我们要感谢所有项目贡献者和开源社区的支持。
+- ScrapeGraphAI 仅用于数据探索和研究目的。我们不对任何滥用该库的行为负责。
@@ -0,0 +1,55 @@
+"""
+Basic example of a scraping pipeline using CSVScraperMultiGraph on CSV documents
+"""
+
+import os
+from dotenv import load_dotenv
+import pandas as pd
+from scrapegraphai.graphs import CSVScraperMultiGraph
+from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+
+load_dotenv()
+# ************************************************
+# Read the CSV file
+# ************************************************
+
+FILE_NAME = "inputs/username.csv"
+curr_dir = os.path.dirname(os.path.realpath(__file__))
+file_path = os.path.join(curr_dir, FILE_NAME)
+
+text = pd.read_csv(file_path)
+
+# ************************************************
+# Define the configuration for the graph
+# ************************************************
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "claude-3-haiku-20240307",
+        "max_tokens": 4000},
+}
+
+# ************************************************
+# Create the CSVScraperMultiGraph instance and run it
+# ************************************************
+
+csv_scraper_graph = CSVScraperMultiGraph(
+    prompt="List me all the last names",
+    source=[str(text), str(text)],
+    config=graph_config
+)
+
+result = csv_scraper_graph.run()
+print(result)
+
+# ************************************************
+# Get graph execution info
+# ************************************************
+
+graph_exec_info = csv_scraper_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
+# Save the result to both CSV and JSON
+convert_to_csv(result, "result")
+convert_to_json(result, "result")
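Note on the example above: it builds `file_path` relative to the script directory and hands it straight to `pd.read_csv`, so a missing `inputs/username.csv` only surfaces as a pandas error. A minimal sketch of an early check follows. `resolve_input` is a hypothetical helper introduced here for illustration; it is not part of scrapegraphai or of this PR.

```python
from pathlib import Path


def resolve_input(base_dir, relative_name):
    """Return the absolute path of an input file, failing early if it is missing.

    Hypothetical helper for illustration only; not part of scrapegraphai.
    """
    # Join the base directory and the relative name, then normalize the result.
    path = (Path(base_dir) / relative_name).resolve()
    # Raise a clear error before any parsing is attempted.
    if not path.is_file():
        raise FileNotFoundError(f"Input file not found: {path}")
    return path
```

In the example script this could be called as `resolve_input(Path(__file__).parent, "inputs/username.csv")`, assuming the script is run from a file so that `__file__` is defined.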
@@ -0,0 +1,36 @@
+"""
+Module showing how JSONScraperMultiGraph works
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import JSONScraperMultiGraph
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "claude-3-haiku-20240307",
+        "max_tokens": 4000
+    },
+}
+
+FILE_NAME = "inputs/example.json"
+curr_dir = os.path.dirname(os.path.realpath(__file__))
+file_path = os.path.join(curr_dir, FILE_NAME)
+
+with open(file_path, 'r', encoding="utf-8") as file:
+    text = file.read()
+
+sources = [text, text]
+
+multiple_search_graph = JSONScraperMultiGraph(
+    prompt="List me all the authors, title and genres of the books",
+    source=sources,
+    schema=None,
+    config=graph_config
+)
+
+result = multiple_search_graph.run()
+print(json.dumps(result, indent=4))
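Note on the `graph_config` used in these examples: `os.getenv("ANTHROPIC_API_KEY")` returns `None` when the variable is not set, so a missing key would only show up later, when the graph calls the model. A minimal sketch of failing early instead is shown below. `build_llm_config` is a hypothetical helper, not a scrapegraphai API; the dictionary layout simply mirrors the config shown in the diff.

```python
import os


def build_llm_config(env_var, model, max_tokens=4000):
    """Build a graph config with an "llm" section, failing early on a missing key.

    Hypothetical helper for illustration only; not part of scrapegraphai.
    """
    api_key = os.getenv(env_var)
    # os.getenv returns None for an unset variable; also reject an empty string.
    if not api_key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "llm": {
            "api_key": api_key,
            "model": model,
            "max_tokens": max_tokens,
        },
    }
```

With this sketch, `graph_config = build_llm_config("ANTHROPIC_API_KEY", "claude-3-haiku-20240307")` would produce the same dictionary as the literal config in the example, provided the key is present in the environment.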
@@ -1,10 +1,12 @@
+"""
+Module showing how PDFScraperGraph works
+"""
 import os, json
 from dotenv import load_dotenv
 from scrapegraphai.graphs import PDFScraperGraph
 
 load_dotenv()
 
-
 # ************************************************
 # Define the configuration for the graph
 # ************************************************