supermarket/AGENTS.md
2026-04-05 01:18:08 +05:00

# AGENTS.md
Guidelines for AI coding agents working on this repository.
## Project Overview
TypeScript-based scraper for Russian supermarkets (currently Magnit). It uses Playwright to establish browser sessions, Axios for API requests, and PostgreSQL with Drizzle ORM for storage.
## Build & Run Commands
**Package Manager**: Use `pnpm` (not npm/yarn)
```bash
pnpm install # Install dependencies
pnpm exec playwright install chromium # Install browsers (once)
pnpm type-check # Type checking (validation)
pnpm build # Build TypeScript to dist/
pnpm dev # Run main scraper
pnpm enrich # Run product enrichment
pnpm test-db # Test database connection
```
### Drizzle Commands
```bash
pnpm db:generate # Generate migration files
pnpm db:migrate # Apply migrations
pnpm db:push # Push schema changes directly (dev only)
pnpm db:studio # Open database GUI
```
### Running Scripts Directly
```bash
tsx src/scripts/scrape-magnit-products.ts
MAGNIT_STORE_CODE=992301 tsx src/scripts/scrape-magnit-products.ts
```
## Testing
No test framework is configured. Test manually via `pnpm test-db`, `pnpm dev`, and Drizzle Studio (`pnpm db:studio`).
## Code Style
### Imports
1. External packages first, then internal modules
2. **Always include `.js` extension** for local imports (ESM)
3. Use named imports from Drizzle schema
```typescript
// External packages first
import { chromium, Browser } from 'playwright';
import axios from 'axios';
import { eq, and, asc } from 'drizzle-orm';
// Internal modules next, always with the .js extension
import { Logger } from '../../../utils/logger.js';
import { db } from '../../../config/database.js';
import { products, stores, categories } from '../../../db/schema.js';
```
### Naming Conventions
| Type | Convention | Example |
|------|------------|---------|
| Classes/Interfaces | PascalCase | `MagnitApiScraper`, `CreateProductData` |
| Functions/variables | camelCase | `scrapeAllProducts`, `deviceId` |
| Constants | UPPER_SNAKE_CASE | `ACTUAL_API_PAGE_SIZE` |
| Class files | PascalCase | `MagnitApiScraper.ts` |
| Util files | camelCase | `logger.ts`, `errors.ts` |
### TypeScript Patterns
- **Strict mode** - all types explicit
- Interfaces for data, optional props with `?`, `readonly` for constants
```typescript
export interface MagnitScraperConfig {
  storeCode: string;
  headless?: boolean;
}
```
### Error Handling
Use custom error classes from `src/utils/errors.ts`:
- `ScraperError` - scraping failures
- `DatabaseError` - database operations
- `APIError` - HTTP/API failures (includes statusCode)
```typescript
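try {
  // operation, e.g. an API request
} catch (error) {
  Logger.error('Ошибка операции:', error); // "Operation error:"
  // statusCode: HTTP status of the failed response, if one is available
  throw new APIError(
    // "Failed: <reason>"
    `Не удалось: ${error instanceof Error ? error.message : String(error)}`,
    statusCode
  );
}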
```
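The contents of `src/utils/errors.ts` are not shown here, so the following is only a minimal sketch of what the three classes could look like. The constructor signatures and the `toApiError` helper are assumptions made for illustration; check the real file before relying on them.

```typescript
// Hypothetical sketch of src/utils/errors.ts; the real classes may differ.
export class ScraperError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ScraperError';
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

export class DatabaseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'DatabaseError';
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

export class APIError extends Error {
  // HTTP status of the failed response, when one is available.
  readonly statusCode?: number;

  constructor(message: string, statusCode?: number) {
    super(message);
    this.name = 'APIError';
    this.statusCode = statusCode;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical helper: wraps an unknown thrown value in an APIError,
// mirroring the catch-block pattern shown above.
export function toApiError(error: unknown, statusCode?: number): APIError {
  const reason = error instanceof Error ? error.message : String(error);
  return new APIError(`Request failed: ${reason}`, statusCode);
}
```

Callers can then branch on the class, for example `if (err instanceof APIError && err.statusCode === 429)`, instead of parsing message strings.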
### Logging
Use static `Logger` class from `src/utils/logger.ts`:
```typescript
Logger.info('Message'); // Always shown
Logger.error('Error:', error); // Always shown
Logger.debug('Debug'); // Only when DEBUG=true
```
### Async/Class Patterns
- All async methods return `Promise<T>` with explicit return types
- Class order: private props -> constructor -> public methods -> private methods
- Lifecycle: `initialize()` -> operations -> `close()`
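As a rough illustration of the member ordering and lifecycle rules above, here is a skeleton class. `ExampleScraper`, `scrapePage`, and `runOnce` are hypothetical names, and the method bodies are placeholders rather than real Playwright or Axios calls.

```typescript
// Hypothetical skeleton; names and bodies are illustrative only.
export class ExampleScraper {
  // 1. Private properties
  private ready = false;
  private readonly storeCode: string;

  // 2. Constructor
  constructor(storeCode: string) {
    this.storeCode = storeCode;
  }

  // 3. Public methods
  async initialize(): Promise<void> {
    // A real implementation would open a browser session here.
    this.ready = true;
  }

  async scrapePage(page: number): Promise<string[]> {
    this.assertReady();
    // Placeholder result standing in for an API call.
    return [`${this.storeCode}:page-${page}`];
  }

  async close(): Promise<void> {
    this.ready = false;
  }

  // 4. Private methods
  private assertReady(): void {
    if (!this.ready) {
      throw new Error('initialize() must be called before scraping');
    }
  }
}

// Lifecycle: initialize() -> operations -> close().
// The finally block makes sure close() runs even if an operation throws.
export async function runOnce(storeCode: string): Promise<string[]> {
  const scraper = new ExampleScraper(storeCode);
  await scraper.initialize();
  try {
    return await scraper.scrapePage(1);
  } finally {
    await scraper.close();
  }
}
```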
### Services Pattern
- Services receive `db` (Drizzle instance) via constructor (DI)
- Use `getOrCreate` for idempotent operations
- Never call Drizzle directly from scrapers
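The sketch below shows the shape of constructor injection with an idempotent `getOrCreate`. To stay runnable without a database it uses a small stand-in interface instead of the real Drizzle instance; `StoreRow`, `StoreDb`, `StoreService`, and `createMemoryDb` are all hypothetical names.

```typescript
// Hypothetical row shape and database stand-in; not the real Drizzle types.
interface StoreRow {
  id: number;
  code: string;
}

interface StoreDb {
  findStoreByCode(code: string): Promise<StoreRow | undefined>;
  insertStore(code: string): Promise<StoreRow>;
}

export class StoreService {
  private readonly db: StoreDb;

  // The database handle is injected, never imported inside the service.
  constructor(db: StoreDb) {
    this.db = db;
  }

  // Idempotent: repeated calls with the same code return the same row.
  async getOrCreate(code: string): Promise<StoreRow> {
    const existing = await this.db.findStoreByCode(code);
    if (existing) {
      return existing;
    }
    return this.db.insertStore(code);
  }
}

// In-memory implementation used only to demonstrate the behaviour.
export function createMemoryDb(): StoreDb {
  const rows: StoreRow[] = [];
  return {
    async findStoreByCode(code: string): Promise<StoreRow | undefined> {
      return rows.find((row) => row.code === code);
    },
    async insertStore(code: string): Promise<StoreRow> {
      const row: StoreRow = { id: rows.length + 1, code };
      rows.push(row);
      return row;
    },
  };
}
```

Scrapers would call `storeService.getOrCreate(code)` rather than querying tables themselves. Against a real database, a find-then-insert sequence like this can race under concurrency, so a unique constraint on the lookup column is a sensible safeguard.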
### Database Patterns
- Upsert via composite unique constraint on `(externalId, storeId)`
- Batch processing: 50 items per batch
- Prices: rubles, stored in a `decimal` column
- Use `.select().from().where()` for queries
- Use `.insert().values()` for inserts
- Use `.update().set().where()` for updates
- Use `.delete().where()` for deletes
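The 50-item batching rule can be sketched with two small helpers. `BATCH_SIZE`, `toBatches`, and `processInBatches` are hypothetical names; the handler passed in is where the actual write for each batch would happen.

```typescript
// Batch size taken from the guideline above; the constant name is hypothetical.
const BATCH_SIZE = 50;

// Splits items into consecutive batches of at most `size` elements.
export function toBatches<T>(items: readonly T[], size: number = BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('Batch size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Runs `handler` on each batch sequentially and returns the number of items processed.
export async function processInBatches<T>(
  items: readonly T[],
  handler: (batch: T[]) => Promise<void>,
  size: number = BATCH_SIZE,
): Promise<number> {
  let processed = 0;
  for (const batch of toBatches(items, size)) {
    await handler(batch);
    processed += batch.length;
  }
  return processed;
}
```

In this project the handler would presumably perform the upsert, for instance with Drizzle's `.insert().values(batch).onConflictDoUpdate(...)` targeting the `(externalId, storeId)` constraint. Processing batches sequentially keeps each statement small and avoids holding many connections at once.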
### Comments
- JSDoc for public methods; inline comments in Russian (the JSDoc in the example below reads "Initialize the session via Playwright")
```typescript
/** Инициализация сессии через Playwright */
async initialize(): Promise<void> { }
```
## Cursor Rules
### Requestly API Tests (`.requestly-supermarket/**/*.json`)
- Use `rq.test()` for tests, `rq.expect()` for assertions
- Access response via `rq.response.body` (parse as JSON)
- Prices in kopecks (24999 = 249.99 rubles)

See `.cursor/rules/requestly-test-rules.mdc` for full docs.
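Since API responses carry prices in kopecks while the database stores rubles as a decimal, a conversion has to happen somewhere. Where the project does this is not stated here, so the helper below is only a sketch and `kopecksToRubles` is a hypothetical name. It returns a string and uses integer arithmetic to avoid floating-point rounding.

```typescript
// Converts an integer kopeck amount to a ruble string with two decimal places.
export function kopecksToRubles(kopecks: number): string {
  if (!Number.isInteger(kopecks)) {
    throw new TypeError('Kopecks must be an integer');
  }
  const sign = kopecks < 0 ? '-' : '';
  const abs = Math.abs(kopecks);
  const rubles = Math.floor(abs / 100);
  const rest = abs % 100;
  // Left-pad the kopeck part so 5 becomes "05".
  const fraction = rest < 10 ? `0${rest}` : String(rest);
  return `${sign}${rubles}.${fraction}`;
}

console.log(kopecksToRubles(24999)); // prints "249.99"
```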
## Environment Variables
```bash
DATABASE_URL=postgresql://user:password@localhost:5432/supermarket
MAGNIT_STORE_CODE=992301
DEBUG=true
```
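One way to read these variables in a typed, testable form is to parse them in a single function. This is a sketch under assumptions: `EnvConfig` and `loadConfig` are hypothetical names, `DATABASE_URL` is treated as required, and `MAGNIT_STORE_CODE` is treated as optional because the script can also be run without it.

```typescript
export interface EnvConfig {
  databaseUrl: string;
  storeCode?: string;
  debug: boolean;
}

type Env = Record<string, string | undefined>;

// Parses environment variables into a typed config object.
// Taking the environment as a parameter keeps the function easy to test.
export function loadConfig(env: Env): EnvConfig {
  const databaseUrl = env['DATABASE_URL'];
  if (!databaseUrl) {
    // Assumption: the database connection string is mandatory.
    throw new Error('DATABASE_URL is required');
  }
  return {
    databaseUrl,
    storeCode: env['MAGNIT_STORE_CODE'],
    // Debug logging is enabled only by the exact string "true".
    debug: env['DEBUG'] === 'true',
  };
}
```

At startup this would typically be called as `loadConfig(process.env)`.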