mirror of
https://gitea.gofwd.group/Forward_Group/ballistic-builder-spring.git
synced 2025-12-06 02:56:44 -05:00
# Ballistic Import Pipeline
A high-level overview of how merchant data flows through the Spring ETL system.

---

## Purpose

This document explains how the Ballistic backend:

1. Fetches merchant product feeds (CSV/TSV)
2. Normalizes raw data into structured entities
3. Updates products and offers in an idempotent way
4. Supports two sync modes:
   - Full Import
   - Offer-Only Sync

---

# 1. High-Level Flow

## ASCII Diagram

```
┌──────────────────────────┐
│   /admin/imports/{id}    │
│  (Full Import Trigger)   │
└─────────────┬────────────┘
              │
              ▼
┌────────────────────────────────┐
│ importMerchantFeed(merchantId) │
└─────────────┬──────────────────┘
              │
              ▼
┌────────────────────────────────────────────────────────┐
│ readFeedRowsForMerchant()                              │
│   - auto-detect delimiter                              │
│   - parse CSV/TSV → MerchantFeedRow objects            │
└─────────────────┬──────────────────────────────────────┘
                  │ List<MerchantFeedRow>
                  ▼
┌──────────────────────────────────────┐
│ For each MerchantFeedRow row:        │
│   resolveBrand()                     │
│   upsertProduct()                    │
│     - find existing via brand+mpn/upc│
│     - update fields (mapped partRole)│
│   upsertOfferFromRow()               │
└──────────────────────────────────────┘
```
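
The per-row loop in the diagram can be sketched as follows. This is a minimal illustration, not the project's code: `RowSteps` and `ImportLoop` are hypothetical names, and the generic types stand in for `MerchantFeedRow`, the brand entity, and the product entity.

```java
import java.util.List;

// Hypothetical interface: one method per box in the diagram's final stage.
interface RowSteps<R, B, P> {
    B resolveBrand(R row);
    P upsertProduct(B brand, R row);
    void upsertOffer(P product, R row);
}

final class ImportLoop {
    /** Runs the three steps for every row, in order, and returns the row count. */
    static <R, B, P> int run(List<R> rows, RowSteps<R, B, P> steps) {
        int processed = 0;
        for (R row : rows) {
            B brand = steps.resolveBrand(row);            // find or create the brand
            P product = steps.upsertProduct(brand, row);  // match or create the product
            steps.upsertOffer(product, row);              // create or update the offer
            processed++;
        }
        return processed;
    }
}
```

Keeping the steps behind an interface is only a convenience for the sketch; it makes the ordering (brand, then product, then offer) easy to see and to test in isolation.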

---

# 2. Full Import Explained

Triggered by:

```
POST /admin/imports/{merchantId}
```

### Step 1: Load merchant
The merchant is loaded with `merchantRepository.findById()`.

### Step 2: Parse feed rows
`readFeedRowsForMerchant()`:
- Auto-detects delimiter (`\t`, `,`, `;`)
- Validates required headers
- Parses each row into `MerchantFeedRow`
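
One plausible way to auto-detect the delimiter and validate headers is to count candidate characters in the header line. The sketch below assumes that approach; `DelimiterDetector`, the fallback to a comma, and the case-insensitive header comparison are all assumptions, and the counting is naive (it ignores quoted fields).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

final class DelimiterDetector {
    // The three candidates named in Step 2.
    private static final List<Character> CANDIDATES = List.of('\t', ',', ';');

    /** Returns the candidate that occurs most often in the header line (comma if none occur). */
    static char detect(String headerLine) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int count = 0;
            for (int i = 0; i < headerLine.length(); i++) {
                if (headerLine.charAt(i) == candidate) count++;
            }
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    /** Returns the required headers that are absent, compared case-insensitively. */
    static List<String> missingHeaders(String headerLine, char delimiter, List<String> required) {
        List<String> present = new ArrayList<>();
        for (String header : headerLine.split(Pattern.quote(String.valueOf(delimiter)), -1)) {
            present.add(header.trim().toLowerCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!present.contains(name.toLowerCase(Locale.ROOT))) missing.add(name);
        }
        return missing;
    }
}
```

The header names passed as `required` would come from the feed contract, which this document does not list.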

### Step 3: Process each row

For each parsed row:

#### a. resolveBrand()
- Finds the brand, or creates it if it does not exist
- Defaults to “Aero Precision” if the row has no brand
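
The find-or-create behavior with a default can be sketched as below. `BrandResolver` is a hypothetical stand-in that uses an in-memory map instead of a repository, and matching names case-insensitively is an assumption, not something this document states.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class BrandResolver {
    static final String DEFAULT_BRAND = "Aero Precision";

    // In-memory stand-in for a brand repository, keyed by lower-cased name (assumption).
    private final Map<String, String> brandsByKey = new HashMap<>();

    /** Returns the stored brand name, creating an entry on first sight. */
    String resolveBrand(String rawName) {
        String name = (rawName == null || rawName.isBlank()) ? DEFAULT_BRAND : rawName.trim();
        String key = name.toLowerCase(Locale.ROOT);
        // computeIfAbsent stores the name only when the key is new, so repeated calls are idempotent.
        return brandsByKey.computeIfAbsent(key, k -> name);
    }

    int brandCount() {
        return brandsByKey.size();
    }
}
```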

#### b. upsertProduct()
Dedupes by:

1. Brand + MPN
2. Brand + UPC (currently a SKU placeholder)

If neither lookup matches, a new product is created.

Then applies:
- Name + slug
- Descriptions
- Images
- MPN/identifiers
- Platform inference
- Category mapping
- Part role inference
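
The two-stage lookup followed by a field update might look like the sketch below. `CatalogProduct` and `ProductStore` are hypothetical, heavily trimmed stand-ins; the case-insensitive comparison and the choice to leave a field untouched when the row value is null are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical, trimmed-down product; the real entity carries many more fields.
final class CatalogProduct {
    final String brand;
    String mpn;
    String upc;
    String name;

    CatalogProduct(String brand) {
        this.brand = brand;
    }
}

final class ProductStore {
    private final List<CatalogProduct> products = new ArrayList<>();

    /** Tries Brand + MPN first, then Brand + UPC. */
    Optional<CatalogProduct> findExisting(String brand, String mpn, String upc) {
        if (mpn != null && !mpn.isBlank()) {
            for (CatalogProduct p : products) {
                if (p.brand.equalsIgnoreCase(brand) && mpn.equalsIgnoreCase(p.mpn)) return Optional.of(p);
            }
        }
        if (upc != null && !upc.isBlank()) {
            for (CatalogProduct p : products) {
                if (p.brand.equalsIgnoreCase(brand) && upc.equalsIgnoreCase(p.upc)) return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    /** Reuses a matching product or creates one, then applies the non-null row fields. */
    CatalogProduct upsert(String brand, String mpn, String upc, String name) {
        CatalogProduct product = findExisting(brand, mpn, upc).orElseGet(() -> {
            CatalogProduct created = new CatalogProduct(brand);
            products.add(created);
            return created;
        });
        if (mpn != null) product.mpn = mpn;
        if (upc != null) product.upc = upc;
        if (name != null) product.name = name;
        return product;
    }

    int size() {
        return products.size();
    }
}
```

Because the lookup is scoped by brand, the same MPN under two different brands yields two products in this sketch.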

#### c. upsertOfferFromRow()
Creates or updates a ProductOffer:
- Prices
- Stock
- Buy URL
- lastSeenAt
- firstSeenAt when newly created

The write is idempotent: re-importing the same row updates the existing offer and does not create a duplicate.
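
A minimal sketch of that timestamp handling, assuming an offer is identified by the merchant plus the merchant's own product id. That key, and the names `Offer` and `OfferStore`, are hypothetical; the clock is passed in so the behavior is deterministic.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, trimmed-down offer.
final class Offer {
    BigDecimal price;
    boolean inStock;
    String buyUrl;
    Instant firstSeenAt;
    Instant lastSeenAt;
}

final class OfferStore {
    // Assumed natural key: merchantId + ":" + merchant's product id.
    private final Map<String, Offer> offers = new HashMap<>();

    Offer upsert(long merchantId, String merchantProductId, BigDecimal price,
                 boolean inStock, String buyUrl, Instant now) {
        String key = merchantId + ":" + merchantProductId;
        Offer offer = offers.get(key);
        if (offer == null) {
            offer = new Offer();
            offer.firstSeenAt = now; // set once, when the offer is created
            offers.put(key, offer);
        }
        offer.price = price;
        offer.inStock = inStock;
        offer.buyUrl = buyUrl;
        offer.lastSeenAt = now;      // refreshed on every import
        return offer;
    }

    int size() {
        return offers.size();
    }
}
```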

---

# 3. Offer-Only Sync

Triggered by:

```
POST /admin/imports/{merchantId}/offers-only
```

Does NOT:
- Create products
- Update product fields

It only updates:
- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt

If the offer does not exist, it is skipped.
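
The update-or-skip rule can be sketched as below. `OfferState` and `OfferOnlySync` are hypothetical names, and looking offers up by a single SKU string is a simplification for the example.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the five fields an offer-only sync may touch.
final class OfferState {
    BigDecimal price;
    BigDecimal originalPrice;
    boolean inStock;
    String buyUrl;
    Instant lastSeenAt;
}

final class OfferOnlySync {
    private final Map<String, OfferState> offersBySku = new HashMap<>();

    /** Registers an offer that an earlier full import is assumed to have created. */
    void seed(String sku, OfferState state) {
        offersBySku.put(sku, state);
    }

    /** Returns true if an existing offer was updated, false if the row was skipped. */
    boolean upsertOfferOnly(String sku, BigDecimal price, BigDecimal originalPrice,
                            boolean inStock, String buyUrl, Instant now) {
        OfferState existing = offersBySku.get(sku);
        if (existing == null) {
            return false; // never creates offers or products
        }
        existing.price = price;
        existing.originalPrice = originalPrice;
        existing.inStock = inStock;
        existing.buyUrl = buyUrl;
        existing.lastSeenAt = now;
        return true;
    }

    int size() {
        return offersBySku.size();
    }
}
```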

---

# 4. Auto-Detecting CSV/TSV Parser

The parser:

- Attempts multiple delimiters
- Validates headers
- Handles malformed or short rows
- Never throws on missing columns
- Returns clean MerchantFeedRow objects

Designed for messy merchant feeds.
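
The tolerance for short rows can be illustrated with a small sketch. `LenientRowParser` is a hypothetical name, rows are returned as maps rather than `MerchantFeedRow` objects, missing cells become empty strings by assumption, and quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class LenientRowParser {
    /** Parses lines into header-keyed maps; the first line is treated as the header. */
    static List<Map<String, String>> parse(List<String> lines, char delimiter) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String separator = Pattern.quote(String.valueOf(delimiter));
        String[] headers = lines.get(0).split(separator, -1);
        for (String line : lines.subList(1, lines.size())) {
            if (line.isBlank()) {
                continue; // skip empty lines instead of failing
            }
            String[] cells = line.split(separator, -1);
            Map<String, String> row = new HashMap<>();
            for (int i = 0; i < headers.length; i++) {
                // A short row yields "" for the columns it lacks; extra cells are ignored.
                row.put(headers[i].trim(), i < cells.length ? cells[i].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Indexing by header position and bounds-checking each cell is what keeps a short row from raising an exception in this sketch.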

---

# 5. Entities Updated During Import

### Product
- name
- slug
- short/long description
- main image
- mpn
- upc (future)
- platform
- rawCategoryKey
- partRole

### ProductOffer
- merchant
- product
- avantlinkProductId (SKU placeholder)
- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt
- firstSeenAt

### Merchant
- lastFullImportAt
- lastOfferSyncAt

---

# 6. Extension Points

You can extend the import pipeline in these areas:

- Add per-merchant column mapping
- Add true UPC parsing
- Support multi-platform parts
- Improve partRole inference
- Implement global deduplication across merchants

---

# 7. Quick Reference: Main Methods

| Method | Purpose |
|--------|---------|
| `importMerchantFeed` | Full product + offer import |
| `readFeedRowsForMerchant` | Detect delimiter + parse feed |
| `resolveBrand` | Normalize brand names |
| `upsertProduct` | Idempotent product write |
| `updateProductFromRow` | Apply product fields |
| `upsertOfferFromRow` | Idempotent offer write |
| `syncOffersOnly` | Offer-only sync |
| `upsertOfferOnlyFromRow` | Update existing offers |
| `detectCsvFormat` | Auto-detect delimiter |
| `fetchFeedRows` | Simpler parser for offers |

---

# 8. Summary

The Ballistic importer is:

- Robust against bad data: malformed or short rows do not abort parsing
- Idempotent: re-running an import updates existing products and offers instead of duplicating them
- Flexible for multiple merchants
- Extensible for long-term scaling

This pipeline powers the product catalog and offer data for the Ballistic ecosystem.