Add examples/o3 Web Crawler

This commit is contained in:
Aparup Ganguly
2025-04-22 21:48:38 +05:30
parent cf84324534
commit d05274ef0b
5 changed files with 375 additions and 0 deletions
+59
View File
@@ -0,0 +1,59 @@
# O3 Web Crawler
A Python tool that uses OpenAI's o3 model and Firecrawl to intelligently crawl websites based on specific objectives.
## Features
- Maps website URLs to identify the most relevant pages for your objective
- Uses OpenAI's o3 model to analyze and rank pages by relevance
- Extracts specific information from web pages based on your objective
- Provides detailed, color-coded terminal output to track progress
## Prerequisites
- Python 3.6+
- Firecrawl API key
- OpenAI API key
## Installation
1. Clone this repository
2. Install dependencies:
```
pip install -r requirements.txt
```
3. Create a `.env` file based on `.env.example` with your API keys
## Usage
Run the script:
```
python o3-web-crawler.py
```
You will be prompted to:
1. Enter a website URL to crawl
2. Specify your objective (what information you want to extract)
The script will:
- Analyze your objective to determine optimal search parameters
- Map the website to find relevant pages
- Rank pages by relevance to your objective
- Scrape and analyze top pages to extract the requested information
- Display results in JSON format
## Example
```
Enter the website to crawl: https://example.com
Enter your objective: Find the company's contact information and headquarters location
```
The script will intelligently crawl the website and extract the requested information.
## License
MIT