mirror of
https://gitea.gofwd.group/Forward_Group/ballistic-builder-spring.git
synced 2025-12-05 18:46:44 -05:00
readme docs
importLogic.md
# Ballistic Import Pipeline

A high-level overview of how merchant data flows through the Spring ETL system.

---

## Purpose

This document explains how the Ballistic backend:

1. Fetches merchant product feeds (CSV/TSV)
2. Normalizes raw data into structured entities
3. Updates products and offers in an idempotent way
4. Supports two sync modes:
   - Full Import
   - Offer-Only Sync

---

# 1. High-Level Flow

## ASCII Diagram

```
┌────────────────────────────────────┐
│ POST /admin/imports/{merchantId}   │
│ (Full Import Trigger)              │
└───────────────────┬────────────────┘
                    │
                    ▼
┌────────────────────────────────────┐
│ importMerchantFeed(merchantId)     │
└───────────────────┬────────────────┘
                    │
                    ▼
┌────────────────────────────────────────────────┐
│ readFeedRowsForMerchant()                      │
│ - auto-detect delimiter                        │
│ - parse CSV/TSV → MerchantFeedRow objects      │
└───────────────────┬────────────────────────────┘
                    │ List<MerchantFeedRow>
                    ▼
┌────────────────────────────────────────┐
│ For each MerchantFeedRow row:          │
│   resolveBrand()                       │
│   upsertProduct()                      │
│     - find existing via brand+mpn/upc  │
│     - update fields (mapped partRole)  │
│   upsertOfferFromRow()                 │
└────────────────────────────────────────┘
```
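The flow in the diagram can be sketched in Java as a plain per-row loop. This is a minimal illustration, not the project's code: `FeedRow`, `ImportFlowSketch`, and the string stand-ins for brands and products are hypothetical, and each step only records that it ran so the call order is visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for MerchantFeedRow. */
record FeedRow(String brand, String mpn, String price) {}

/** Sketch of the per-row loop; every step just logs that it ran. */
final class ImportFlowSketch {
    final List<String> log = new ArrayList<>();

    /** Processes rows in feed order and returns how many were handled. */
    int importMerchantFeed(List<FeedRow> rows) {
        int processed = 0;
        for (FeedRow row : rows) {
            String brand = resolveBrand(row);
            String productKey = upsertProduct(brand, row);
            upsertOfferFromRow(productKey, row);
            processed++;
        }
        return processed;
    }

    private String resolveBrand(FeedRow row) {
        log.add("resolveBrand");
        return row.brand();
    }

    private String upsertProduct(String brand, FeedRow row) {
        log.add("upsertProduct");
        // Assumption for this sketch: brand + MPN identifies a product.
        return brand + "|" + row.mpn();
    }

    private void upsertOfferFromRow(String productKey, FeedRow row) {
        log.add("upsertOfferFromRow:" + productKey + "@" + row.price());
    }
}
```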

---

# 2. Full Import Explained

Triggered by:

```
POST /admin/imports/{merchantId}
```

### Step 1: Load merchant
The merchant is loaded with `merchantRepository.findById()`.

### Step 2: Parse feed rows
`readFeedRowsForMerchant()`:
- Auto-detects the delimiter (`\t`, `,`, `;`)
- Validates required headers
- Parses each row into a `MerchantFeedRow`
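One way the delimiter detection and header validation above could work is sketched below. It assumes the delimiter is whichever candidate appears most often in the header line, and that header names are compared case-insensitively after trimming. The class name, method names, and example header names are hypothetical; the project's own `detectCsvFormat` may use a different heuristic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class DelimiterSketch {
    // Candidate delimiters listed in this document: tab, comma, semicolon.
    private static final char[] CANDIDATES = {'\t', ',', ';'};

    /** Picks the candidate that occurs most often in the header line (comma if none occur). */
    static char detectDelimiter(String headerLine) {
        char best = ',';
        long bestCount = 0;
        for (char candidate : CANDIDATES) {
            long count = headerLine.chars().filter(ch -> ch == candidate).count();
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    /** Returns true when every required header is present (case-insensitive, trimmed). */
    static boolean hasRequiredHeaders(String headerLine, char delimiter, List<String> required) {
        String separator = Pattern.quote(String.valueOf(delimiter));
        List<String> headers = Arrays.stream(headerLine.split(separator, -1))
                .map(h -> h.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
        return headers.containsAll(required);
    }
}
```

Counting occurrences in the header line alone is a simplification: it ignores quoted fields, which a production parser would need to handle.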

### Step 3: Process each row

For each parsed row:

#### a. resolveBrand()
- Finds or creates the brand
- Defaults to “Aero Precision” if the row has no brand
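The find-or-create behavior with a default can be sketched as follows. The in-memory map stands in for a brand repository, and matching brand names case-insensitively after trimming is an assumption made for this sketch; `BrandResolverSketch` and its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class BrandResolverSketch {
    static final String DEFAULT_BRAND = "Aero Precision";

    // In-memory stand-in for a brand repository, keyed by lower-cased name (assumption).
    private final Map<String, String> brandsByKey = new HashMap<>();

    /** Returns the stored brand name, creating it on first sight; blank input falls back to the default. */
    String resolveBrand(String rawBrand) {
        String name = (rawBrand == null || rawBrand.isBlank()) ? DEFAULT_BRAND : rawBrand.trim();
        String key = name.toLowerCase(Locale.ROOT);
        return brandsByKey.computeIfAbsent(key, k -> name);
    }

    int brandCount() {
        return brandsByKey.size();
    }
}
```

Because the first spelling seen is the one stored, later rows with different casing resolve to the same brand instead of creating a duplicate.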

#### b. upsertProduct()
Dedupes by, in order:

1. Brand + MPN
2. Brand + UPC (the UPC is currently a SKU placeholder)

If neither lookup matches, a new product is created.

The row's fields are then applied:
- Name + slug
- Descriptions
- Images
- MPN/identifiers
- Platform inference
- Category mapping
- Part role inference
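The lookup order described above (Brand + MPN first, then Brand + UPC, otherwise create) can be sketched like this. The list-backed store and the minimal `Product` class are hypothetical stand-ins for the real repository and entity, and only the name is applied to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class ProductDedupeSketch {
    /** Hypothetical minimal product; the real entity carries many more fields. */
    static final class Product {
        final String brand;
        final String mpn;
        final String upc;
        String name;

        Product(String brand, String mpn, String upc, String name) {
            this.brand = brand;
            this.mpn = mpn;
            this.upc = upc;
            this.name = name;
        }
    }

    // In-memory stand-in for the product repository.
    private final List<Product> products = new ArrayList<>();

    /** Tries Brand + MPN first, then Brand + UPC. */
    Optional<Product> findExisting(String brand, String mpn, String upc) {
        if (mpn != null && !mpn.isBlank()) {
            Optional<Product> byMpn = products.stream()
                    .filter(p -> p.brand.equals(brand) && mpn.equals(p.mpn))
                    .findFirst();
            if (byMpn.isPresent()) {
                return byMpn;
            }
        }
        if (upc != null && !upc.isBlank()) {
            return products.stream()
                    .filter(p -> p.brand.equals(brand) && upc.equals(p.upc))
                    .findFirst();
        }
        return Optional.empty();
    }

    /** Reuses a matching product or creates one, then applies the latest name. */
    Product upsertProduct(String brand, String mpn, String upc, String name) {
        Product product = findExisting(brand, mpn, upc).orElseGet(() -> {
            Product created = new Product(brand, mpn, upc, name);
            products.add(created);
            return created;
        });
        product.name = name;
        return product;
    }

    int size() {
        return products.size();
    }
}
```

Running the same row twice returns the same product and leaves the store size unchanged, which is what makes the write idempotent.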

#### c. upsertOfferFromRow()
Creates or updates a `ProductOffer`:
- Prices
- Stock
- Buy URL
- `lastSeenAt`
- `firstSeenAt` (set only when the offer is newly created)

The write is idempotent: re-importing the same feed does not duplicate offers.
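A minimal sketch of an idempotent offer upsert follows. It assumes an offer is identified by merchant ID plus SKU, which this document does not state explicitly, and it uses an in-memory map in place of the repository. `OfferUpsertSketch` and its `Offer` class are hypothetical. Note that `firstSeenAt` is written only on creation, while `lastSeenAt` is refreshed on every call.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class OfferUpsertSketch {
    /** Hypothetical minimal offer; field names follow this document. */
    static final class Offer {
        BigDecimal price;
        boolean inStock;
        String buyUrl;
        Instant firstSeenAt;
        Instant lastSeenAt;
    }

    // Assumption: merchantId + ":" + sku uniquely identifies an offer.
    private final Map<String, Offer> offers = new HashMap<>();

    /** Creates the offer on first sight, otherwise updates it in place. */
    Offer upsertOffer(long merchantId, String sku, BigDecimal price,
                      boolean inStock, String buyUrl, Instant now) {
        Offer offer = offers.computeIfAbsent(merchantId + ":" + sku, key -> {
            Offer created = new Offer();
            created.firstSeenAt = now; // set once, when newly created
            return created;
        });
        offer.price = price;
        offer.inStock = inStock;
        offer.buyUrl = buyUrl;
        offer.lastSeenAt = now; // refreshed on every import
        return offer;
    }

    int size() {
        return offers.size();
    }
}
```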

---

# 3. Offer-Only Sync

Triggered by:

```
POST /admin/imports/{merchantId}/offers-only
```

Does NOT:
- Create products
- Update product fields

It only updates these fields on existing offers:
- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt

If the offer does not exist, the row is skipped.
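The update-or-skip rule can be sketched as below. Looking offers up by SKU and the in-memory map are assumptions made for illustration, and `OfferOnlySyncSketch` is a hypothetical name. The key point is that a missing offer results in a skip and never a create.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class OfferOnlySyncSketch {
    /** Hypothetical minimal offer holding only the fields this sync touches. */
    static final class Offer {
        BigDecimal price;
        BigDecimal originalPrice;
        boolean inStock;
        String buyUrl;
        Instant lastSeenAt;
    }

    // In-memory stand-in for existing offers, keyed by SKU (assumption).
    private final Map<String, Offer> existingOffers = new HashMap<>();

    /** Test helper: registers an offer as if a full import had created it. */
    void seedExistingOffer(String sku) {
        existingOffers.put(sku, new Offer());
    }

    Offer find(String sku) {
        return existingOffers.get(sku);
    }

    int offerCount() {
        return existingOffers.size();
    }

    /** Returns true if an existing offer was updated, false if the row was skipped. */
    boolean upsertOfferOnly(String sku, BigDecimal price, BigDecimal originalPrice,
                            boolean inStock, String buyUrl, Instant now) {
        Offer offer = existingOffers.get(sku);
        if (offer == null) {
            return false; // offer-only sync never creates offers or products
        }
        offer.price = price;
        offer.originalPrice = originalPrice;
        offer.inStock = inStock;
        offer.buyUrl = buyUrl;
        offer.lastSeenAt = now;
        return true;
    }
}
```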

---

# 4. Auto-Detecting CSV/TSV Parser

The parser:

- Attempts multiple delimiters
- Validates headers
- Handles malformed or short rows
- Never throws on missing columns
- Returns clean `MerchantFeedRow` objects

Designed for messy merchant feeds.
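The "never throws on missing columns" behavior can be illustrated with a lenient row mapper. This sketch assumes missing trailing cells become empty strings and extra cells are ignored. It uses a naive split that does not handle quoted fields, and `LenientRowParser` is a hypothetical name rather than the project's parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class LenientRowParser {
    /**
     * Maps each header to the cell in the same position.
     * Short rows yield "" for the missing columns; surplus cells are dropped.
     */
    static Map<String, String> parseRow(String[] headers, String line, char delimiter) {
        String[] cells = line.split(Pattern.quote(String.valueOf(delimiter)), -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            row.put(headers[i], i < cells.length ? cells[i].trim() : "");
        }
        return row;
    }
}
```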

---

# 5. Entities Updated During Import

### Product
- name
- slug
- short/long description
- main image
- mpn
- upc (future)
- platform
- rawCategoryKey
- partRole

### ProductOffer
- merchant
- product
- avantlinkProductId (SKU placeholder)
- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt
- firstSeenAt

### Merchant
- lastFullImportAt
- lastOfferSyncAt

---

# 6. Extension Points

You can extend the import pipeline in these areas:

- Add per-merchant column mapping
- Add true UPC parsing
- Support multi-platform parts
- Improve partRole inference
- Implement global deduplication across merchants

---

# 7. Quick Reference: Main Methods

| Method | Purpose |
|--------|---------|
| `importMerchantFeed` | Full product + offer import |
| `readFeedRowsForMerchant` | Detect delimiter + parse feed |
| `resolveBrand` | Normalize brand names |
| `upsertProduct` | Idempotent product write |
| `updateProductFromRow` | Apply product fields |
| `upsertOfferFromRow` | Idempotent offer write |
| `syncOffersOnly` | Offer-only sync |
| `upsertOfferOnlyFromRow` | Update existing offers |
| `detectCsvFormat` | Auto-detect delimiter |
| `fetchFeedRows` | Simpler parser for offers |

---

# 8. Summary

The Ballistic importer is:

- Robust against bad data: malformed or short rows and missing columns do not abort an import
- Idempotent and safe: re-running an import does not duplicate products or offers
- Flexible for multiple merchants: each feed's delimiter is auto-detected
- Extensible for long-term scaling: see the extension points in section 6

This pipeline powers the product catalog and offer data for the Ballistic ecosystem.