# Ballistic Import Pipeline

A high-level overview of how merchant data flows through the Spring ETL system.

---

## Purpose

This document explains how the Ballistic backend:

1. Fetches merchant product feeds (CSV/TSV)
2. Normalizes raw data into structured entities
3. Updates products and offers in an idempotent way
4. Supports two sync modes:
   - Full Import
   - Offer-Only Sync

---

# 1. High-Level Flow

## ASCII Diagram

```
┌──────────────────────────┐
│   /admin/imports/{id}    │
│  (Full Import Trigger)   │
└─────────────┬────────────┘
              │
              ▼
┌────────────────────────────────┐
│ importMerchantFeed(merchantId) │
└─────────────┬──────────────────┘
              │
              ▼
┌────────────────────────────────────────────────────────┐
│ readFeedRowsForMerchant()                              │
│   - auto-detect delimiter                              │
│   - parse CSV/TSV → MerchantFeedRow objects            │
└─────────────┬──────────────────────────────────────────┘
              │ List<MerchantFeedRow>
              ▼
┌────────────────────────────────────────┐
│ For each MerchantFeedRow row:          │
│   resolveBrand()                       │
│   upsertProduct()                      │
│     - find existing via brand+mpn/upc  │
│     - update fields (mapped partRole)  │
│   upsertOfferFromRow()                 │
└────────────────────────────────────────┘
```

---

# 2. Full Import Explained

Triggered by:

```
POST /admin/imports/{merchantId}
```

### Step 1: Load merchant

Using `merchantRepository.findById()`.

### Step 2: Parse feed rows

`readFeedRowsForMerchant()`:

- Auto-detects delimiter (`\t`, `,`, `;`)
- Validates required headers
- Parses each row into `MerchantFeedRow`

### Step 3: Process each row

For each parsed row:

#### a. resolveBrand()

- Finds or creates brand
- Defaults to “Aero Precision” if missing

#### b. upsertProduct()

Dedupes by:

1. Brand + MPN
2. Brand + UPC (currently SKU placeholder)

If no match → create new product.

Then applies:

- Name + slug
- Descriptions
- Images
- MPN/identifiers
- Platform inference
- Category mapping
- Part role inference

#### c. upsertOfferFromRow()

Creates or updates a ProductOffer:

- Prices
- Stock
- Buy URL
- lastSeenAt
- firstSeenAt when newly created

Idempotent: does not duplicate offers.

---

# 3. Offer-Only Sync

Triggered by:

```
POST /admin/imports/{merchantId}/offers-only
```

Does NOT:

- Create products
- Update product fields

It only updates:

- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt

If the offer does not exist, it is skipped.

---

# 4. Auto-Detecting CSV/TSV Parser

The parser:

- Attempts multiple delimiters
- Validates headers
- Handles malformed or short rows
- Never throws on missing columns
- Returns clean MerchantFeedRow objects

Designed for messy merchant feeds.

---

# 5. Entities Updated During Import

### Product

- name
- slug
- short/long description
- main image
- mpn
- upc (future)
- platform
- rawCategoryKey
- partRole

### ProductOffer

- merchant
- product
- avantlinkProductId (SKU placeholder)
- price
- originalPrice
- inStock
- buyUrl
- lastSeenAt
- firstSeenAt

### Merchant

- lastFullImportAt
- lastOfferSyncAt

---

# 6. Extension Points

You can extend the import pipeline in these areas:

- Add per-merchant column mapping
- Add true UPC parsing
- Support multi-platform parts
- Improve partRole inference
- Implement global deduplication across merchants

---

# 7. Quick Reference: Main Methods

| Method | Purpose |
|--------|---------|
| importMerchantFeed | Full product + offer import |
| readFeedRowsForMerchant | Detect delimiter + parse feed |
| resolveBrand | Normalize brand names |
| upsertProduct | Idempotent product write |
| updateProductFromRow | Apply product fields |
| upsertOfferFromRow | Idempotent offer write |
| syncOffersOnly | Offer-only sync |
| upsertOfferOnlyFromRow | Update existing offers |
| detectCsvFormat | Auto-detect delimiter |
| fetchFeedRows | Simpler parser for offers |

---

# 8. Summary

The Ballistic importer is:

- Robust against bad data
- Idempotent and safe
- Flexible for multiple merchants
- Extensible for long-term scaling

This pipeline powers the product catalog and offer data for the Ballistic ecosystem.
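
To make the delimiter auto-detection and the "never throws on missing columns" behaviour more concrete, here is a minimal Java sketch. It is not the actual `detectCsvFormat` implementation: the class name `DelimiterSniffer`, its methods, and the "most frequent candidate in the header line wins" heuristic are all assumptions chosen for illustration, and the naive split ignores quoted fields.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only; names and heuristic are assumptions, not the real parser. */
final class DelimiterSniffer {
    // Candidate delimiters listed in this document: tab, comma, semicolon.
    private static final List<Character> CANDIDATES = List.of('\t', ',', ';');

    private DelimiterSniffer() {}

    /**
     * Returns the candidate that occurs most often in the header line.
     * Ties go to the earlier candidate; comma is the fallback when none occur.
     */
    static char detect(String headerLine) {
        char best = ',';
        long bestCount = 0;
        for (char candidate : CANDIDATES) {
            long count = headerLine.chars().filter(c -> c == candidate).count();
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    /** Naive split that keeps trailing empty cells; does not handle quoted fields. */
    static String[] split(String line, char delimiter) {
        return line.split(Pattern.quote(String.valueOf(delimiter)), -1);
    }

    /** Tolerant cell lookup: a short row yields an empty string instead of an exception. */
    static String cell(String[] columns, int index) {
        if (index < 0 || index >= columns.length) {
            return "";
        }
        return columns[index].trim();
    }
}
```

With this shape, a row that is missing its last columns simply produces empty values for them, which matches the tolerant behaviour described in section 4.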
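
The product dedupe order used by `upsertProduct()` (Brand + MPN first, then Brand + UPC) can be sketched as a two-step lookup. This is a hypothetical in-memory version: `ProductMatcherSketch`, its maps, and the case-insensitive key normalization are assumptions for illustration, and the real pipeline presumably queries a repository instead.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; an in-memory stand-in for the product lookup. */
final class ProductMatcherSketch {
    private final Map<String, Long> byBrandMpn = new HashMap<>();
    private final Map<String, Long> byBrandUpc = new HashMap<>();

    /** Registers a product under whichever identifiers it has. */
    void index(long productId, String brand, String mpn, String upc) {
        if (!isBlank(mpn)) {
            byBrandMpn.put(key(brand, mpn), productId);
        }
        if (!isBlank(upc)) {
            byBrandUpc.put(key(brand, upc), productId);
        }
    }

    /** Tries Brand + MPN first, then Brand + UPC; empty means "create a new product". */
    Optional<Long> findExisting(String brand, String mpn, String upc) {
        if (!isBlank(mpn)) {
            Long id = byBrandMpn.get(key(brand, mpn));
            if (id != null) {
                return Optional.of(id);
            }
        }
        if (!isBlank(upc)) {
            Long id = byBrandUpc.get(key(brand, upc));
            if (id != null) {
                return Optional.of(id);
            }
        }
        return Optional.empty();
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Assumption: identifiers are compared case-insensitively after trimming.
    private static String key(String brand, String identifier) {
        return brand.trim().toLowerCase(Locale.ROOT) + "|" + identifier.trim().toLowerCase(Locale.ROOT);
    }
}
```

Checking the MPN key before the UPC key mirrors the priority order listed in section 2, so a row that carries both identifiers resolves through its MPN.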
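
The difference between the idempotent offer write of a full import and the skip-if-missing behaviour of the offer-only sync can also be shown in a few lines. The sketch below is an assumption-laden illustration, not the real `upsertOfferFromRow` or `upsertOfferOnlyFromRow`: the `OfferUpsertSketch` class, the `merchantId:sku` key, and the in-memory map are hypothetical; only the field names come from this document.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; an in-memory stand-in for the offer repository. */
final class OfferUpsertSketch {
    /** Minimal stand-in for ProductOffer, using field names from this document. */
    static final class Offer {
        BigDecimal price;
        boolean inStock;
        String buyUrl;
        Instant firstSeenAt;
        Instant lastSeenAt;
    }

    // Hypothetical natural key: merchant id plus the feed SKU.
    private final Map<String, Offer> offers = new HashMap<>();

    /** Full import: creates the offer on first sight, otherwise updates it in place. */
    Offer upsert(long merchantId, String sku, BigDecimal price, boolean inStock,
                 String buyUrl, Instant now) {
        Offer offer = offers.computeIfAbsent(key(merchantId, sku), k -> {
            Offer created = new Offer();
            created.firstSeenAt = now; // set only when the offer is newly created
            return created;
        });
        apply(offer, price, inStock, buyUrl, now);
        return offer;
    }

    /** Offer-only sync: updates an existing offer and returns false (skip) if none exists. */
    boolean updateExisting(long merchantId, String sku, BigDecimal price, boolean inStock,
                           String buyUrl, Instant now) {
        Offer offer = offers.get(key(merchantId, sku));
        if (offer == null) {
            return false;
        }
        apply(offer, price, inStock, buyUrl, now);
        return true;
    }

    int size() {
        return offers.size();
    }

    private static void apply(Offer offer, BigDecimal price, boolean inStock,
                              String buyUrl, Instant now) {
        offer.price = price;
        offer.inStock = inStock;
        offer.buyUrl = buyUrl;
        offer.lastSeenAt = now; // refreshed on every sync
    }

    private static String key(long merchantId, String sku) {
        return merchantId + ":" + sku;
    }
}
```

Running the same feed row through `upsert` twice leaves a single offer with its original `firstSeenAt` and a refreshed `lastSeenAt`, which is the property the document calls idempotent.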