ballistic-builder-spring/importLogic.md
2025-12-02 07:21:23 -05:00

Ballistic Import Pipeline

A high-level overview of how merchant data flows through the Spring ETL system.


Purpose

This document explains how the Ballistic backend:

  1. Fetches merchant product feeds (CSV/TSV)
  2. Normalizes raw data into structured entities
  3. Updates products and offers in an idempotent way
  4. Supports two sync modes:
    • Full Import
    • Offer-Only Sync

1. High-Level Flow

ASCII Diagram

                         ┌──────────────────────────┐
                         │   /admin/imports/{id}     │
                         │  (Full Import Trigger)    │
                         └─────────────┬────────────┘
                                       │
                                       ▼
                        ┌──────────────────────────────┐
                        │ importMerchantFeed(merchantId)│
                        └─────────────┬────────────────┘
                                      │
                                      ▼
           ┌────────────────────────────────────────────────────────┐
           │ readFeedRowsForMerchant()                               │
           │  - auto-detect delimiter                               │
           │  - parse CSV/TSV → MerchantFeedRow objects             │
           └─────────────────┬──────────────────────────────────────┘
                             │ List<MerchantFeedRow>
                             ▼
          ┌──────────────────────────────────────┐
          │ For each MerchantFeedRow row:        │
          │  resolveBrand()                      │
          │  upsertProduct()                     │
          │    - find existing via brand+mpn/upc │
          │    - update fields (mapped partRole) │
          │  upsertOfferFromRow()                │
          └──────────────────────────────────────┘

2. Full Import Explained

Triggered by:

POST /admin/imports/{merchantId}

Step 1 — Load merchant

The merchant is loaded with merchantRepository.findById().

Step 2 — Parse feed rows

readFeedRowsForMerchant():

  • Auto-detects the delimiter: tab (\t), comma (,), or semicolon (;)
  • Validates required headers
  • Parses each row into MerchantFeedRow
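
One way the detection and header check could work is sketched below. This is an illustration only, not the actual readFeedRowsForMerchant() or detectCsvFormat code: the class name DelimiterSniffer, both method names, the "count occurrences in the header line" heuristic, and the case-insensitive header comparison are all assumptions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper (not from the Ballistic codebase): picks a delimiter
// by counting candidates in the header line, then checks required headers.
final class DelimiterSniffer {

    // Candidate delimiters named in this document: tab, comma, semicolon.
    private static final char[] CANDIDATES = {'\t', ',', ';'};

    // Returns the candidate that occurs most often in the header line.
    // Ties resolve to the earlier candidate; falls back to comma when none occur.
    static char detect(String headerLine) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int count = 0;
            for (int i = 0; i < headerLine.length(); i++) {
                if (headerLine.charAt(i) == candidate) {
                    count++;
                }
            }
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    // True when every required header appears in the header line.
    // Assumption: names are compared trimmed and case-insensitively.
    static boolean hasRequiredHeaders(String headerLine, char delimiter, List<String> required) {
        Set<String> present = new HashSet<>();
        String splitOn = Pattern.quote(String.valueOf(delimiter));
        for (String cell : headerLine.split(splitOn, -1)) {
            present.add(cell.trim().toLowerCase(Locale.ROOT));
        }
        for (String name : required) {
            if (!present.contains(name.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

Counting only the header line keeps the sketch short; a real feed reader would likely sample several rows and respect quoted fields.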

Step 3 — Process each row

For each parsed row:

a. resolveBrand()

  • Finds or creates brand
  • Defaults to “Aero Precision” if missing
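
The find-or-create behavior with a default can be sketched as follows. The in-memory map stands in for whatever brand repository the real resolveBrand() uses, and the class name BrandResolver, the trimming, and the case-insensitive matching are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for brand resolution (not the actual resolveBrand()).
final class BrandResolver {

    // Default brand named in this document for rows with no brand value.
    static final String DEFAULT_BRAND = "Aero Precision";

    // In-memory substitute for a brand repository, keyed by lower-cased name.
    private final Map<String, String> brandsByKey = new LinkedHashMap<>();

    // Returns the stored brand name, creating an entry on first sight.
    // Assumption: matching ignores case and surrounding whitespace.
    String resolve(String rawBrand) {
        String name = (rawBrand == null || rawBrand.trim().isEmpty())
                ? DEFAULT_BRAND
                : rawBrand.trim();
        String key = name.toLowerCase(Locale.ROOT);
        return brandsByKey.computeIfAbsent(key, k -> name);
    }

    int brandCount() {
        return brandsByKey.size();
    }
}
```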

b. upsertProduct()

Dedupes by:

  1. Brand + MPN
  2. Brand + UPC (the SKU currently stands in as a placeholder for the UPC)

If no match → create new product.
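
The lookup order above can be illustrated with a small in-memory sketch. It is not the actual upsertProduct() code: ProductMatcher, its nested Product class, and the string-key indexes are hypothetical, and a real implementation would query the product repository instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "match by brand + MPN, then brand + UPC,
// otherwise create". Not taken from the Ballistic codebase.
final class ProductMatcher {

    // Minimal product holding only the identifiers used for matching.
    static final class Product {
        final String brand;
        final String mpn;
        final String upc;

        Product(String brand, String mpn, String upc) {
            this.brand = brand;
            this.mpn = mpn;
            this.upc = upc;
        }
    }

    private final Map<String, Product> byBrandMpn = new HashMap<>();
    private final Map<String, Product> byBrandUpc = new HashMap<>();

    Product findOrCreate(String brand, String mpn, String upc) {
        Product match = null;
        if (mpn != null) {
            match = byBrandMpn.get(key(brand, mpn)); // first choice: brand + MPN
        }
        if (match == null && upc != null) {
            match = byBrandUpc.get(key(brand, upc)); // second choice: brand + UPC
        }
        if (match == null) {
            match = new Product(brand, mpn, upc);    // no match: create
        }
        // Index under every identifier seen so later rows can find the product.
        if (mpn != null) {
            byBrandMpn.putIfAbsent(key(brand, mpn), match);
        }
        if (upc != null) {
            byBrandUpc.putIfAbsent(key(brand, upc), match);
        }
        return match;
    }

    private static String key(String brand, String id) {
        return brand + "|" + id;
    }
}
```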

Then applies:

  • Name + slug
  • Descriptions
  • Images
  • MPN/identifiers
  • Platform inference
  • Category mapping
  • Part role inference

c. upsertOfferFromRow()

Creates or updates a ProductOffer:

  • Prices
  • Stock
  • Buy URL
  • lastSeenAt
  • firstSeenAt when newly created

The operation is idempotent: re-importing the same feed updates the existing offer instead of creating a duplicate.
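
A minimal sketch of an idempotent offer upsert follows. The OfferStore class, its map-backed storage, and the choice of merchant id plus SKU as the lookup key are assumptions for illustration; the real upsertOfferFromRow() presumably works against a repository.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of an idempotent offer upsert.
final class OfferStore {

    static final class Offer {
        BigDecimal price;
        boolean inStock;
        String buyUrl;
        Instant firstSeenAt;
        Instant lastSeenAt;
    }

    // Assumption: one offer per (merchant, SKU) pair.
    private final Map<String, Offer> offers = new HashMap<>();

    Offer upsert(String merchantId, String sku, BigDecimal price,
                 boolean inStock, String buyUrl, Instant now) {
        String key = merchantId + "|" + sku;
        Offer offer = offers.get(key);
        if (offer == null) {
            offer = new Offer();
            offer.firstSeenAt = now; // set only when the offer is created
            offers.put(key, offer);
        }
        // These fields are refreshed on every import run.
        offer.price = price;
        offer.inStock = inStock;
        offer.buyUrl = buyUrl;
        offer.lastSeenAt = now;
        return offer;
    }

    int size() {
        return offers.size();
    }
}
```

Running the same row through upsert twice leaves a single offer whose firstSeenAt is unchanged and whose lastSeenAt reflects the latest run.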


3. Offer-Only Sync

Triggered by:

POST /admin/imports/{merchantId}/offers-only

Does NOT:

  • Create products
  • Update product fields

It only updates:

  • price
  • originalPrice
  • inStock
  • buyUrl
  • lastSeenAt

If the offer does not exist, it is skipped.
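
The skip-if-missing rule can be sketched like this. OfferOnlySync, its nested Offer and Row classes, and the SKU-keyed map are hypothetical; they only model the behavior described above and are not the real syncOffersOnly() or upsertOfferOnlyFromRow().

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.List;
import java.util.Map;

// Hypothetical model of offer-only sync: update existing offers, never create.
final class OfferOnlySync {

    static final class Offer {
        BigDecimal price;
        BigDecimal originalPrice;
        boolean inStock;
        String buyUrl;
        Instant lastSeenAt;
    }

    static final class Row {
        final String sku;
        final BigDecimal price;
        final BigDecimal originalPrice;
        final boolean inStock;
        final String buyUrl;

        Row(String sku, BigDecimal price, BigDecimal originalPrice,
            boolean inStock, String buyUrl) {
            this.sku = sku;
            this.price = price;
            this.originalPrice = originalPrice;
            this.inStock = inStock;
            this.buyUrl = buyUrl;
        }
    }

    // Returns how many existing offers were updated.
    // Assumption: offers for one merchant are looked up by SKU.
    static int sync(Map<String, Offer> existingBySku, List<Row> rows, Instant now) {
        int updated = 0;
        for (Row row : rows) {
            Offer offer = existingBySku.get(row.sku);
            if (offer == null) {
                continue; // unknown offer: skipped, nothing is created
            }
            offer.price = row.price;
            offer.originalPrice = row.originalPrice;
            offer.inStock = row.inStock;
            offer.buyUrl = row.buyUrl;
            offer.lastSeenAt = now;
            updated++;
        }
        return updated;
    }
}
```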


4. Auto-Detecting CSV/TSV Parser

The parser:

  • Attempts multiple delimiters
  • Validates headers
  • Handles malformed or short rows
  • Never throws on missing columns
  • Returns clean MerchantFeedRow objects

Designed for messy merchant feeds.
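
To show what "never throws on missing columns" can look like in practice, here is a small sketch that pads short rows with empty values. TolerantRowReader and toRow are hypothetical names, the map return type is a simplification of MerchantFeedRow, and the naive split ignores quoted fields, which a real parser would need to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: turns one feed line into a header -> value map
// without failing on short rows.
final class TolerantRowReader {

    // Missing trailing columns become empty strings; extra cells are ignored.
    // Assumption: fields contain no quoted delimiters.
    static Map<String, String> toRow(String[] headers, String line, char delimiter) {
        String splitOn = Pattern.quote(String.valueOf(delimiter));
        String[] cells = line.split(splitOn, -1); // -1 keeps trailing empty cells
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            String value = i < cells.length ? cells[i].trim() : "";
            row.put(headers[i], value);
        }
        return row;
    }
}
```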


5. Entities Updated During Import

Product

  • name
  • slug
  • short/long description
  • main image
  • mpn
  • upc (future)
  • platform
  • rawCategoryKey
  • partRole

ProductOffer

  • merchant
  • product
  • avantlinkProductId (SKU placeholder)
  • price
  • originalPrice
  • inStock
  • buyUrl
  • lastSeenAt
  • firstSeenAt

Merchant

  • lastFullImportAt
  • lastOfferSyncAt

6. Extension Points

You can extend the import pipeline in these areas:

  • Add per-merchant column mapping
  • Add true UPC parsing
  • Support multi-platform parts
  • Improve partRole inference
  • Implement global deduplication across merchants

7. Quick Reference: Main Methods

Method                     Purpose
importMerchantFeed         Full product + offer import
readFeedRowsForMerchant    Detect delimiter + parse feed
resolveBrand               Normalize brand names
upsertProduct              Idempotent product write
updateProductFromRow       Apply product fields
upsertOfferFromRow         Idempotent offer write
syncOffersOnly             Offer-only sync
upsertOfferOnlyFromRow     Update existing offers
detectCsvFormat            Auto-detect delimiter
fetchFeedRows              Simpler parser for offers

8. Summary

The Ballistic importer is:

  • Robust against bad data
  • Idempotent and safe
  • Flexible for multiple merchants
  • Extensible for long-term scaling

This pipeline powers the product catalog and offer data for the Ballistic ecosystem.