feat(browser_base_fetch): add async_mode to support both synchronous … #644

Merged
merged 2 commits into from
Sep 8, 2024
20 changes: 18 additions & 2 deletions scrapegraphai/docloaders/browser_base.py
@@ -13,6 +13,8 @@ def browser_base_fetch(api_key: str, project_id: str, link: List[str], text_cont
     - `api_key`: The API key provided by BrowserBase.
     - `project_id`: The ID of the project on BrowserBase where you want to fetch data from.
     - `link`: The URL or link that you want to fetch data from.
+    - `text_content`: A boolean flag to specify whether to return only the text content (True) or the full HTML (False).
+    - `async_mode`: A boolean flag that determines whether the function runs asynchronously (True) or synchronously (False, default).

     It initializes a Browserbase object with the given API key and project ID,
     then uses this object to load the specified link.
@@ -35,6 +37,8 @@ def browser_base_fetch(api_key: str, project_id: str, link: List[str], text_cont
         api_key (str): The API key provided by BrowserBase.
         project_id (str): The ID of the project on BrowserBase where you want to fetch data from.
         link (str): The URL or link that you want to fetch data from.
+        text_content (bool): Whether to return only the text content (True) or the full HTML (False). Defaults to True.
+        async_mode (bool): Whether to run the function asynchronously (True) or synchronously (False). Defaults to False.

     Returns:
         object: The result of the loading operation.
@@ -49,7 +53,19 @@ def browser_base_fetch(api_key: str, project_id: str, link: List[str], text_cont
     browserbase = Browserbase(api_key=api_key, project_id=project_id)

     result = []
-    for l in link:
-        result.append(browserbase.load(l, text_content=text_content))
+    async def _async_fetch_link(l):
+        return await asyncio.to_thread(browserbase.load, l, text_content=text_content)
+
+    if async_mode:
+        async def _async_browser_base_fetch():
+            for l in link:
+                result.append(await _async_fetch_link(l))
+            return result
+
+        result = asyncio.run(_async_browser_base_fetch())
+    else:
+        for l in link:
+            result.append(browserbase.load(l, text_content=text_content))
+

     return result
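
For readers comparing the two modes, here is a minimal standalone sketch of the same sync/async switch. It is not the code merged in this PR. The names `fetch_all` and `fake_load` are hypothetical, and `fake_load` stands in for `browserbase.load` so the example runs without an API key. The async branch in the diff awaits each link in turn, so links are still loaded one after another. The sketch swaps in `asyncio.gather` to show one way the loads could overlap instead. It assumes Python 3.9 or later, where `asyncio.to_thread` is available.

```python
import asyncio
from typing import Callable, List


def fetch_all(load: Callable[[str], str], links: List[str],
              async_mode: bool = False) -> List[str]:
    """Load every link with `load`, sequentially or via worker threads."""
    if not async_mode:
        # Synchronous path: one blocking call per link, in order.
        return [load(url) for url in links]

    async def _fetch_concurrently() -> List[str]:
        # asyncio.to_thread runs each blocking call in a worker thread;
        # asyncio.gather awaits them together and keeps the input order.
        tasks = [asyncio.to_thread(load, url) for url in links]
        return list(await asyncio.gather(*tasks))

    # asyncio.run raises RuntimeError when called from a thread that
    # already has a running event loop (for example, inside a notebook).
    return asyncio.run(_fetch_concurrently())


def fake_load(url: str) -> str:
    # Hypothetical stand-in for browserbase.load; returns a fixed string.
    return f"content of {url}"


if __name__ == "__main__":
    urls = ["https://example.com/a", "https://example.com/b"]
    print(fetch_all(fake_load, urls))
    print(fetch_all(fake_load, urls, async_mode=True))
```

Both calls return results in the same order as the input list, so callers can switch `async_mode` without changing how they consume the output.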