# Stock Intelligence System - NASDAQ & TSX Focus

## Summary (November 6, 2025)

### ✅ Completed Tasks

1. **Fixed Quote Data Extraction**
   - Corrected CSS selectors in the Yahoo Finance scraper
   - Fixed whitespace handling
   - Added regex-based date extraction
   - Fixed statistics merge logic to prevent overwriting

2. **Database Enhancement**
   - Added `stock_quotes` table to store quote data
   - Added `insert_stock_quote()` function
   - Quote data now persists in the database

3. **Report Generation**
   - All NASDAQ and TSX stocks now show complete quote data in reports:
     - ✅ Date
     - ✅ Open
     - ✅ High
     - ✅ Low
     - ✅ Close
     - ✅ Volume

4. **Automated Daily Updates**
   - Created `scrape_nasdaq_tsx_only.py` - focused scraper for quality data
   - Updated `daily_run.sh` - daily execution script
   - Set up cron job for **12:00 PM daily**
   - Logs saved to `logs/daily_run_YYYYMMDD_HHMMSS.log`

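The quote extraction fixes listed under task 1 can be illustrated roughly as follows. This is a minimal sketch, not the code in `scrape_yahoo_finance.py`: the function names, the `Nov 5, 2025` date format, and the `N/A` placeholder convention are all assumptions made for illustration.

```python
import re
from datetime import date

# Assumed date format: "Nov 5, 2025". The real scraper's pattern may differ.
MONTHS = {m: i for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}
DATE_RE = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b")


def clean_text(raw: str) -> str:
    """Collapse whitespace runs (including non-breaking spaces) to single spaces."""
    return " ".join(raw.replace("\u00a0", " ").split())


def extract_date(raw: str):
    """Return the first date in the text as ISO 'YYYY-MM-DD', or None if absent."""
    match = DATE_RE.search(clean_text(raw))
    if match is None:
        return None
    month, day, year = match.groups()
    return date(int(year), MONTHS[month], int(day)).isoformat()


def merge_statistics(existing: dict, quote: dict) -> dict:
    """Add quote fields without overwriting values that are already populated."""
    merged = dict(existing)
    for key, value in quote.items():
        # Only fill keys that are missing or hold an (assumed) placeholder.
        if merged.get(key) in (None, "", "N/A"):
            merged[key] = value
    return merged
```

Normalizing whitespace before running the regex keeps the pattern simple, since scraped text often contains line breaks and non-breaking spaces between the date parts.
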
### 📊 Current Stock Coverage

**NASDAQ Stocks (2):**
- AAPL - Apple Inc.
- MSFT - Microsoft Corporation

**TSX Stocks (1):**
- SHOP.TO - Shopify Inc.

**Total: 3 stocks** (CSE stocks excluded due to data quality issues on Yahoo Finance)

### 📁 Generated Reports

For each stock, the following files are generated:

1. **Markdown Report**: `data/reports/{TICKER}_full_report.md`
   - Complete consolidated report
   - All financials, metrics, news, filings
   - Quote data merged into statistics section

2. **PDF Report**: `data/reports/{TICKER}_full_report.pdf`
   - Professional formatted PDF
   - Ready for management presentation

3. **CSV Exports**: `data/exports/`
   - `stocks_export.csv` - Master stock list
   - `stocks_detailed.csv` - All metrics and financials
   - `news_summary.csv` - News articles
   - `filings_summary.csv` - Regulatory filings

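The report naming scheme above can be expressed as a small helper. This is an illustrative sketch only; `report_paths` is a hypothetical name and is not taken from `generate_company_report.py`.

```python
from pathlib import Path

REPORTS_DIR = Path("data/reports")  # directory named in the list above


def report_paths(ticker: str) -> dict[str, Path]:
    """Build the Markdown and PDF report paths for one ticker."""
    # Build the file name with an f-string rather than Path.with_suffix():
    # tickers such as "SHOP.TO" contain a dot, and with_suffix() would
    # treat ".TO_full_report" as the suffix and replace it.
    stem = f"{ticker}_full_report"
    return {
        "markdown": REPORTS_DIR / f"{stem}.md",
        "pdf": REPORTS_DIR / f"{stem}.pdf",
    }


for ticker in ("AAPL", "MSFT", "SHOP.TO"):
    print(report_paths(ticker)["pdf"].as_posix())
```
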
### 🤖 Daily Automation

**Schedule:** Every day at 12:00 PM

**What it does:**
1. Scrapes the latest data from Yahoo Finance for all NASDAQ/TSX stocks
2. Extracts the latest quote data (date, open, high, low, close, volume)
3. Saves quote data to the database
4. Generates consolidated Markdown and PDF reports for each stock
5. Exports all data to CSV files
6. Logs everything to the `logs/` directory

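Each run writes a log named `logs/daily_run_YYYYMMDD_HHMMSS.log`. The sketch below shows how that name can be built and how the most recent log can be located from Python. The helper names are hypothetical, and `daily_run.sh` itself presumably builds the name in shell.

```python
from datetime import datetime
from pathlib import Path

LOG_DIR = Path("logs")


def log_path_for(run_time: datetime, log_dir: Path = LOG_DIR) -> Path:
    """Return the log path for a run, e.g. logs/daily_run_20251106_120000.log."""
    return log_dir / f"daily_run_{run_time:%Y%m%d_%H%M%S}.log"


def latest_log(log_dir: Path = LOG_DIR):
    """Return the newest daily-run log, or None if no logs exist."""
    # Zero-padded timestamps sort chronologically when sorted as strings.
    logs = sorted(log_dir.glob("daily_run_*.log"))
    return logs[-1] if logs else None
```

Because the timestamp is zero-padded and ordered from year down to second, a plain name sort is enough to find the latest run without reading file modification times.
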
**Cron Entry:**
```bash
0 12 * * * /Users/macbook/Desktop/Victor/daily_run.sh
```

### 🔧 Manual Operations

**Run immediately:**
```bash
cd /Users/macbook/Desktop/Victor
./daily_run.sh
```

**Scrape specific exchanges only:**
```bash
python3 scrape_nasdaq_tsx_only.py
```

**Generate report for specific stock:**
```bash
python3 generate_company_report.py --ticker AAPL
```

**Check cron status:**
```bash
crontab -l
```

**Remove cron job:**
```bash
crontab -e
# Delete the line with 'daily_run.sh'
```

### 📝 Files Created/Modified

**New Scripts:**
- `scrape_nasdaq_tsx_only.py` - NASDAQ/TSX focused scraper
- `rescrape_all_and_generate_reports.py` - Original full scraper (not used)
- `quick_batch_rescrape.py` - Quick test scraper
- `daily_run.sh` - Daily automation script
- `setup_daily_automation.sh` - Cron job installer

**Modified Scripts:**
- `scrape_yahoo_finance.py` - Fixed quote data extraction
- `database.py` - Added `stock_quotes` table
- `main_robust.py` - Added quote data insertion
- `generate_company_report.py` - Fixed statistics merge

**Documentation:**
- `QUOTE_DATA_EXTRACTION_FIX.md` - Technical details of the fix
- `WHY_NO_SEDAR_FOR_AAPL.md` - Explanation of SEDAR+ vs SEC
- `QUOTE_DATA_FIX.md` - Earlier fix attempts
- `NASDAQ_TSX_AUTOMATION_SUMMARY.md` - This file

### 🎯 Next Steps (Optional)

1. **Add More Stocks:**
   - Add more NASDAQ/TSX stocks to the `stocks_master` table
   - They'll automatically be included in daily runs

2. **Email Notifications:**
   - Uncomment the mail command in `daily_run.sh`
   - Configure email settings

3. **Enhanced Metrics:**
   - Add custom calculations in `financial_calculator.py`
   - Metrics auto-update daily

4. **Dashboard:**
   - Build web dashboard using the CSV exports
   - Real-time visualization

### ⚠️ Important Notes

1. **Mac Sleep:** Ensure your Mac is awake at 12:00 PM; cron does not run jobs that were missed while the machine was asleep
2. **CSE Stocks:** Excluded due to unreliable Yahoo Finance data
3. **Logs:** Check the `logs/` directory if something fails
4. **Quote Data:** Shows the previous day's closing data (Yahoo updates after market close)

### 📊 Database Structure

**Tables:**
- `stocks_master` - Stock listings
- `stock_quotes` - Latest quote data: date, open, high, low, close, volume (NEW!)
- `financial_metrics` - Calculated ratios
- `news_articles` - News and press releases
- `filings` - SEC/SEDAR+ filings
- `coverage_report` - Data coverage tracking

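For orientation, here is one way the `stock_quotes` table and an insert helper could look. This is a sketch under stated assumptions, not the code in `database.py`: it assumes SQLite, and the column names, primary key, and the signature of `insert_stock_quote()` shown here are all hypothetical.

```python
import sqlite3


def create_stock_quotes_table(conn: sqlite3.Connection) -> None:
    """Create the quotes table (assumed schema: one row per ticker per day)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS stock_quotes (
            ticker      TEXT NOT NULL,
            quote_date  TEXT NOT NULL,   -- ISO date, e.g. '2025-11-05'
            open_price  REAL,
            high_price  REAL,
            low_price   REAL,
            close_price REAL,
            volume      INTEGER,
            PRIMARY KEY (ticker, quote_date)
        )
        """
    )
    conn.commit()


def insert_stock_quote(conn: sqlite3.Connection, ticker: str, quote: dict) -> None:
    """Insert one day's quote; re-running for the same day replaces the row."""
    conn.execute(
        "INSERT OR REPLACE INTO stock_quotes "
        "(ticker, quote_date, open_price, high_price, low_price, close_price, volume) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (ticker, quote["date"], quote["open"], quote["high"],
         quote["low"], quote["close"], quote["volume"]),
    )
    conn.commit()
```

Keying on `(ticker, quote_date)` means a manual run followed by the scheduled run on the same day does not create duplicate rows.
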
### ✅ Verification

All systems tested and verified:
- ✅ Quote data extraction working
- ✅ Database insertion working
- ✅ Report generation working
- ✅ PDF generation working
- ✅ CSV exports working
- ✅ Cron job installed
- ✅ Daily automation configured

**Last successful run:** November 6, 2025 at 11:01 AM
**Next scheduled run:** November 7, 2025 at 12:00 PM

---

## Ready for Management Submission! 🚀

All NASDAQ and TSX stocks now have:
- Complete quote data (date, open, high, low, close, volume)
- Comprehensive consolidated reports (Markdown + PDF)
- Automated daily updates at 12 PM
- Full database persistence
- CSV exports for analysis

**Report files for submission:**
- `data/reports/AAPL_full_report.pdf`
- `data/reports/MSFT_full_report.pdf`
- `data/reports/SHOP.TO_full_report.pdf`
- `data/exports/stocks_detailed.csv`
- `data/exports/news_summary.csv`
- `data/exports/filings_summary.csv`

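Before submitting, it may be worth confirming that the files listed above actually exist. The following is a small hypothetical check, not part of the existing scripts; it assumes it is run from the project root.

```python
from pathlib import Path

# Files listed under "Report files for submission".
SUBMISSION_FILES = [
    "data/reports/AAPL_full_report.pdf",
    "data/reports/MSFT_full_report.pdf",
    "data/reports/SHOP.TO_full_report.pdf",
    "data/exports/stocks_detailed.csv",
    "data/exports/news_summary.csv",
    "data/exports/filings_summary.csv",
]


def missing_files(root: Path = Path("."), expected=SUBMISSION_FILES) -> list[str]:
    """Return the expected files that are not present under root."""
    return [rel for rel in expected if not (root / rel).is_file()]


missing = missing_files()
print("All submission files present" if not missing else f"Missing: {missing}")
```
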