Harvesting
Content for the knowledge base is being harvested from various sources and stored in AWS S3 buckets for inspection and, if relevant, ingestion into the knowledge base file system. Our current focus is on OpenAPI, with AsyncAPI, gRPC, WebSocket, GraphQL, and Postman Collections on the roadmap.
We are searching GitHub using a common vocabulary of words paired with the specification name (e.g. Products+OpenAPI), rotating through thousands of words, drip searching, and pulling any potential OpenAPI we find.
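As a rough illustration of the word rotation described above, the following Python sketch pairs each vocabulary word with the specification name and hands the queries out in small batches. The function names `build_queries` and `next_batch`, the batch size, and the space-separated query format are assumptions made for this example, not the harvester's actual implementation, and no search request is sent.

```python
from itertools import cycle, islice
from typing import Iterator, List


def build_queries(words: List[str], spec_term: str = "OpenAPI") -> Iterator[str]:
    """Yield one search query per vocabulary word, looping over the list forever.

    Hypothetical helper: the real query format used against GitHub may differ.
    """
    for word in cycle(words):
        yield f"{word} {spec_term}"


def next_batch(queries: Iterator[str], size: int) -> List[str]:
    """Take the next `size` queries; a later call resumes where this one stopped."""
    return list(islice(queries, size))


if __name__ == "__main__":
    vocabulary = ["Products", "Payments", "Orders"]  # stand-in for thousands of words
    queries = build_queries(vocabulary)
    # Each scheduled run would take one small batch, which keeps the search rate low.
    print(next_batch(queries, 2))
    print(next_batch(queries, 2))
```

Keeping the rotation in a single iterator means a scheduled job only has to ask for the next batch on each run, which is one simple way to get the drip behavior.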
Collections:
A handful of collections are in use to harvest OpenAPIs from the web, using ScrapingBee as the platform for searching and harvesting. We then validate the OpenAPIs and publish them to S3.
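The validate-then-publish step might look something like the sketch below. It is only a minimal illustration: `looks_like_openapi`, `s3_key_for`, and the `harvested/openapi` prefix are hypothetical names, the check is a cheap structural filter rather than full schema validation, and the upload itself is left out.

```python
import hashlib
import json
from typing import Any, Dict


def looks_like_openapi(doc: Dict[str, Any]) -> bool:
    """Cheap structural filter: a version field plus an `info` object.

    This is not full validation against the OpenAPI schema; it only screens
    out documents that are clearly something else.
    """
    has_version = isinstance(doc.get("openapi"), str) or isinstance(doc.get("swagger"), str)
    has_info = isinstance(doc.get("info"), dict)
    return has_version and has_info


def s3_key_for(doc: Dict[str, Any], prefix: str = "harvested/openapi") -> str:
    """Build a content-addressed object key so identical documents share one key.

    The prefix is an assumed layout, not the project's actual bucket structure.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()
    return f"{prefix}/{digest}.json"


if __name__ == "__main__":
    candidate = {"openapi": "3.0.3", "info": {"title": "Example", "version": "1.0"}, "paths": {}}
    if looks_like_openapi(candidate):
        # A real pipeline would upload the document to S3 under this key.
        print(s3_key_for(candidate))
```

Deriving the key from a hash of the canonical JSON is one way to avoid storing the same definition twice when several searches surface it.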
Collections:
We are manually harvesting the OpenAPIs available via the Postman network, pulling any API defined as an OpenAPI and publishing it as part of the knowledge base. This is not automated and will need re-running and overhauling at some point.