feat: enhance Magnit scraper with streaming mode and retry logic

- Add streaming mode for memory-efficient large catalog scraping
- Implement retry logic with exponential backoff
- Add auto session reinitialization on 403 errors
- Add configurable options (pageSize, maxProducts, rateLimitDelay)
- Add maxIterations protection against infinite loops
- Add retry.ts utility module with withRetry and withRetryAndReinit
- Update .env.example with new scraping options
- Add pgAdmin and CloudBeaver to docker-compose
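
A minimal sketch of how the retry helpers named above might look. The commit's retry.ts is not shown here, so everything in it is an assumption: the signatures of withRetry and withRetryAndReinit, the RetryOptions shape, the backoffDelay and isForbidden helpers, and the convention that a failed request throws an error carrying a numeric `status` field.

```typescript
// Hypothetical sketch: names, signatures and defaults are assumptions,
// not the actual contents of the commit's retry.ts.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound for any single delay
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Exponential backoff: base * 2^retryIndex, capped at maxDelayMs.
function backoffDelay(retryIndex: number, opts: RetryOptions): number {
  return Math.min(opts.baseDelayMs * Math.pow(2, retryIndex), opts.maxDelayMs);
}

// Calls fn until it succeeds or maxAttempts is exhausted, sleeping
// between attempts. Rethrows the last error on final failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts - 1) {
        await sleep(backoffDelay(attempt, opts));
      }
    }
  }
  throw lastError;
}

// Assumed convention: a rejected request carries a numeric `status` field.
function isForbidden(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 403
  );
}

// Same as withRetry, but runs reinit (e.g. a session refresh) after a 403
// before the error is handed back to the retry loop.
async function withRetryAndReinit<T>(
  fn: () => Promise<T>,
  reinit: () => Promise<void>,
  opts: RetryOptions,
): Promise<T> {
  return withRetry(async () => {
    try {
      return await fn();
    } catch (err) {
      if (isForbidden(err)) {
        await reinit();
      }
      throw err;
    }
  }, opts);
}
```

In this sketch the session is reinitialized inside the retried call, so the backoff delay still applies after a 403 and the next attempt runs against the fresh session.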

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-21 22:14:04 +05:00
parent 19c0426cdc
commit 9164527f58
5 changed files with 585 additions and 74 deletions

@@ -19,6 +19,36 @@ services:
      timeout: 5s
      retries: 5

  pgadmin:
    image: dpage/pgadmin4:latest
    container_name: supermarket-pgadmin
    restart: unless-stopped
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@admin.com
      PGADMIN_DEFAULT_PASSWORD: admin
      PGADMIN_CONFIG_SERVER_MODE: 'False'
    ports:
      - "5050:80"
    volumes:
      - pgadmin_data:/var/lib/pgadmin
    depends_on:
      postgres:
        condition: service_healthy

  cloudbeaver:
    image: dbeaver/cloudbeaver:latest
    container_name: supermarket-cloudbeaver
    restart: unless-stopped
    ports:
      - "8978:8978"
    volumes:
      - cloudbeaver_data:/opt/cloudbeaver/workspace
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  postgres_data:
  pgadmin_data:
  cloudbeaver_data: