Most integration projects start the same way: two systems that should share data, don't. In this case, we had an inventory management and purchase order processing platform on one side and a project management system on the other. The PM tool held takeoffs with detailed materials lists. The inventory platform tracked stock levels, vendor catalogs, and PO workflows. Between them sat a human being with a notepad.
Here's how we approached connecting them.
Discovery: mapping both data models
Before writing any code, we spent time understanding the schema on each side. This is the step that gets skipped in most integrations that go wrong.
The PM system represented materials as line items on a takeoff, tied to a project. Each line item had a description, quantity, unit of measure, and sometimes a spec reference or part number. Sometimes. The data quality varied by who created the takeoff, and descriptions weren't standardized. "1/2 inch EMT conduit" in one takeoff might be "1/2 EMT" or "EMT .5in" in another.
The inventory platform had a structured product catalog. SKUs, vendor part numbers, unit costs, reorder points, warehouse locations. Clean, normalized data because someone had to enter it once and the system enforced consistency from there.
The core challenge was mapping between the two. Takeoff line items are free-text or semi-structured. Inventory records are catalog entries with unique identifiers. We needed a reconciliation layer that could match fuzzy descriptions to catalog SKUs with a high degree of confidence, and flag ambiguous matches for human review rather than guessing.
We built a matching service that used a combination of normalized text comparison, part number extraction (regex patterns for common industry formats), and a manual mapping table that grew over time as users confirmed matches. The manual mapping table was critical. It gave us a feedback loop so the system got better at matching the more it was used.
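In sketch form, the matching pipeline looked roughly like this. The names, regex pattern, and data structures here are illustrative, not our production code:

```typescript
type MatchResult =
  | { status: "matched"; sku: string }
  | { status: "review"; candidates: string[] };

// Normalize free-text descriptions so variants like "1/2 inch EMT conduit"
// and "1/2 EMT conduit" converge on comparable tokens.
function normalize(description: string): string {
  return description
    .toLowerCase()
    .replace(/[^a-z0-9./ ]/g, " ")
    .replace(/\binch(es)?\b/g, "in")
    .replace(/\s+/g, " ")
    .trim();
}

// One example of a part-number shape (illustrative; real industry
// formats needed a family of patterns).
const PART_NUMBER = /\b[A-Z]{2,4}-?\d{3,6}\b/;

function matchLineItem(
  description: string,
  confirmedMappings: Map<string, string>, // normalized text -> SKU, grown from user confirmations
  partNumberIndex: Map<string, string>    // part number -> SKU
): MatchResult {
  const normalized = normalize(description);

  // 1. Manual mapping table: exact hits users have already confirmed.
  const confirmed = confirmedMappings.get(normalized);
  if (confirmed) return { status: "matched", sku: confirmed };

  // 2. Part number extraction from the raw text.
  const pn = description.toUpperCase().match(PART_NUMBER);
  if (pn) {
    const sku = partNumberIndex.get(pn[0]);
    if (sku) return { status: "matched", sku };
  }

  // 3. No confident match: flag for human review instead of guessing.
  return { status: "review", candidates: [] };
}
```

Every human confirmation of a flagged item wrote a new row into `confirmedMappings`, which is what made the feedback loop work.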
Sync strategy: event-driven with reconciliation
We evaluated three approaches:
Polling/batch sync runs on a schedule, pulling changes at regular intervals. Simple to implement. The downside is latency. If a takeoff gets approved at 9:01am and your batch runs at 10:00am, that's an hour before downstream processes start.
Webhooks push data when something changes. Lower latency, more responsive. The downside is reliability. Webhooks can fail silently. The receiving end might be down.
The third option, and the one we chose, was a hybrid: webhooks as the primary mechanism with a scheduled reconciliation job that catches anything webhooks missed.
The PM system fired a webhook when a takeoff moved to "approved." We received the event, pulled the full takeoff data via their REST API (the webhook payload was too thin, just an ID and event type), ran it through our matching service, and pushed the results to the inventory platform.
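With the API calls abstracted behind injected functions, that flow is a short pipeline. All names here are hypothetical; the point is that the thin webhook payload only triggers the work, and the full data comes from a follow-up API call:

```typescript
// The webhook payload carries only an ID and event type, so the handler
// fetches the full takeoff, runs matching, and pushes the results on.
interface WebhookEvent { takeoffId: string; event: string }

interface Deps {
  fetchTakeoff: (id: string) => Promise<{ id: string; items: string[] }>;
  matchItems: (items: string[]) => Promise<string[]>;
  pushToInventory: (skus: string[]) => Promise<void>;
}

async function handleTakeoffApproved(evt: WebhookEvent, deps: Deps): Promise<void> {
  if (evt.event !== "takeoff.approved") return; // ignore other event types
  const takeoff = await deps.fetchTakeoff(evt.takeoffId); // payload too thin on its own
  const skus = await deps.matchItems(takeoff.items);
  await deps.pushToInventory(skus);
}
```

Injecting the dependencies this way also made the pipeline trivial to test with fakes.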
The reconciliation job ran every four hours. It compared approved takeoffs against processed records, flagged gaps, and reprocessed them. In the first two weeks, it caught about 3% of events that webhooks had dropped. After we hardened the receiver with retry logic, that dropped below 0.5%.
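The core of the reconciliation pass is just a set difference, sketched here with illustrative names:

```typescript
// Diff the approved takeoff IDs (from the PM system) against the IDs
// we have already processed; whatever is left is a dropped webhook.
function findMissedTakeoffs(
  approvedIds: string[],
  processedIds: Set<string>
): string[] {
  return approvedIds.filter((id) => !processedIds.has(id));
}
```

The job fed the result straight back into the same processing queue the webhooks used, so missed events took the identical code path.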
Middleware architecture
We built a lightweight middleware service rather than connecting the systems directly. Neither system's API was designed to talk to the other. Data transformations were necessary at every step.
The middleware was a Node.js service. PostgreSQL for the mapping table, sync state, and event log. BullMQ backed by Redis for async job processing so webhook receivers could respond immediately.
The service exposed a small internal API for the operations team to review flagged matches, confirm or correct them, and see sync status.
Handling schema mismatches
Units of measure. The PM system used abbreviations inconsistently ("ea", "each", "EA", "LF", "lin ft", "linear feet"). The inventory platform required specific UOM codes from a fixed list. We built a normalization layer with a lookup table. When a new variant appeared that didn't match, the item got flagged.
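A minimal sketch of that normalization layer, with an illustrative lookup table:

```typescript
// Map the UOM variants seen in takeoffs to the inventory platform's
// fixed codes; unknown variants get flagged rather than guessed.
const UOM_TABLE: Record<string, string> = {
  "ea": "EA", "each": "EA",
  "lf": "LF", "lin ft": "LF", "linear feet": "LF",
};

type UomResult = { code: string } | { flagged: string };

function normalizeUom(raw: string): UomResult {
  const code = UOM_TABLE[raw.trim().toLowerCase()];
  return code ? { code } : { flagged: raw }; // new variant -> human review
}
```

Confirmed flags became new rows in the table, the same feedback loop as the matching service.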
Quantity rollups. Takeoffs listed materials per area or per phase. The inventory system tracked quantities per warehouse and per project. We had to aggregate takeoff quantities at the project level and compare against current stock to determine the net requirement.
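The rollup itself is a small aggregation; field names here are illustrative:

```typescript
// Takeoff lines are keyed by area/phase, so aggregate quantities per
// SKU at the project level, then subtract current stock to get the
// net requirement.
interface TakeoffLine { sku: string; qty: number; phase: string }

function netRequirements(
  lines: TakeoffLine[],
  stockBySku: Map<string, number>
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const line of lines) {
    totals.set(line.sku, (totals.get(line.sku) ?? 0) + line.qty);
  }
  const net = new Map<string, number>();
  for (const [sku, required] of totals) {
    const onHand = stockBySku.get(sku) ?? 0;
    net.set(sku, Math.max(0, required - onHand)); // never order a negative quantity
  }
  return net;
}
```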
Vendor mapping. The PM system didn't track vendors. The inventory platform associated each product with one or more vendors. When generating PO recommendations, we pulled vendor associations from the inventory side and applied business rules (preferred vendor first, fallback if backordered, price thresholds for multiple quotes).
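Those business rules can be sketched roughly like this; the dollar threshold and field names are assumptions for illustration:

```typescript
interface VendorOption { vendor: string; price: number; backordered: boolean; preferred: boolean }

type PoRecommendation =
  | { action: "order"; vendor: string }
  | { action: "get-quotes"; vendors: string[] };

const QUOTE_THRESHOLD = 5000; // assumed threshold for requiring multiple quotes

function recommendVendor(options: VendorOption[], orderValue: number): PoRecommendation {
  const available = options.filter((o) => !o.backordered);
  if (available.length === 0) {
    // everything backordered: escalate with the full vendor list
    return { action: "get-quotes", vendors: options.map((o) => o.vendor) };
  }
  if (orderValue > QUOTE_THRESHOLD) {
    // large orders collect quotes from every available vendor
    return { action: "get-quotes", vendors: available.map((o) => o.vendor) };
  }
  const preferred = available.find((o) => o.preferred);
  if (preferred) return { action: "order", vendor: preferred.vendor };
  // preferred vendor backordered: cheapest available fallback
  const fallback = [...available].sort((a, b) => a.price - b.price)[0];
  return { action: "order", vendor: fallback.vendor };
}
```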
Building for failure
Integrations break. We implemented a dead letter queue for operations that failed after three retries. Items in the dead letter queue generated alerts. No data was silently dropped.
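A simplified version of the retry-then-dead-letter policy (names illustrative):

```typescript
// An operation gets a fixed number of attempts; if all fail, the
// payload and last error land in a dead letter queue that drives
// alerts. Nothing is dropped silently.
interface DeadLetter<T> { payload: T; error: string }

async function withRetries<T>(
  payload: T,
  op: (p: T) => Promise<void>,
  deadLetterQueue: DeadLetter<T>[],
  maxAttempts = 3
): Promise<boolean> {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await op(payload);
      return true; // succeeded
    } catch (e) {
      lastError = e instanceof Error ? e.message : String(e);
    }
  }
  // exhausted retries: park it for a human, never discard it
  deadLetterQueue.push({ payload, error: lastError });
  return false;
}
```

In production this sat on top of BullMQ's job retries; the sketch shows the policy, not the queue plumbing.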
For API outages, circuit breakers. Five consecutive errors opened the circuit and stopped calls for a cooldown period. This prevented cascading failures.
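A minimal circuit breaker along those lines; the threshold and cooldown values are illustrative:

```typescript
// Five consecutive failures open the circuit; calls are rejected until
// the cooldown elapses, then one probe is allowed through.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private threshold = 5,
    private cooldownMs = 30_000
  ) {}

  canCall(now: number): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // cooldown over: half-open, allow a probe
      this.failures = 0;
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0; // any success resets the consecutive-error count
  }

  recordFailure(now: number): void {
    this.failures++;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```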
Every sync operation wrote to an append-only event log. Timestamp, source system, target system, payload hash, result status, error details. When something went wrong, we could trace exactly what happened.
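A sketch of one log write, using Node's built-in crypto for the payload hash; field names are illustrative:

```typescript
import { createHash } from "node:crypto";

// One row per sync operation. Storing a hash instead of the full
// payload keeps the log small while still letting us detect whether
// two events carried the same data.
interface SyncEvent {
  timestamp: string;
  source: string;
  target: string;
  payloadHash: string;
  status: "ok" | "error";
  errorDetail?: string;
}

function logSyncEvent(
  log: SyncEvent[], // append-only: entries are never mutated or removed
  source: string,
  target: string,
  payload: unknown,
  status: "ok" | "error",
  errorDetail?: string
): SyncEvent {
  const entry: SyncEvent = {
    timestamp: new Date().toISOString(),
    source,
    target,
    payloadHash: createHash("sha256").update(JSON.stringify(payload)).digest("hex"),
    status,
    errorDetail,
  };
  log.push(entry);
  return entry;
}
```

The real log lived in a PostgreSQL table; the array here just stands in for the insert.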
Monitoring
We tracked four metrics: sync latency (target under 60 seconds for webhook-triggered syncs), match confidence distribution (automatic vs. human review), error rate by type, and queue depth.
Structured logging shipped to a centralized log store. Alerts fired on error rate spikes, queue depth thresholds, and if reconciliation found missed events above a configurable threshold.
What we'd do differently
Push harder on the matching service from day one. We started simple and iterated, which cost time and meant more manual review early on. Starting with a normalization dictionary built from historical takeoff data would have shortened the ramp-up.
Also, advocate for standardized takeoff descriptions earlier. No amount of matching logic eliminates the cost of inconsistent source data. Having users pick from a standardized materials list rather than free-typing descriptions would have improved match rates from the start.