Why On-Time Delivery Is Worse Than You Think
Real-time production visibility for manufacturers is a system that connects ERP, scheduling, and shop floor data into a unified view of every active job's status, progress, and delivery risk. The system flags jobs falling behind before they miss their ship date, giving production managers the time and information to intervene while recovery is still possible.
On-time delivery in job shop manufacturing averages 83% to 87% across the industry. That number is a fiction. Most shops measure OTD against the last confirmed promise date, which often reflects a date that has already been pushed back once or twice after the customer called asking where their parts were. Measured against the original promise date from the initial order acknowledgment, OTD at many shops drops to 70% to 78%. That is the number the customer remembers. That is the number that determines whether they send the next program to your shop or to the one across town.
Late delivery costs compound in ways that never appear on a single report. The direct costs are visible: expediting charges, premium freight, overtime to recover lost days. A single air freight shipment on a late aerospace order runs $2,000 to $8,000 depending on weight and destination. The indirect costs are larger by an order of magnitude and nearly invisible. Customers who experience repeated late deliveries reduce order volume, shift work to competitors, or remove the shop from their approved supplier list entirely. That revenue loss happens gradually, quarter by quarter, and almost never gets attributed back to the production visibility problem that caused it.
The root cause at most job shops is the same. The production manager cannot see what is actually happening across all active jobs in real time. The information exists. It is scattered across three or four systems that were never designed to communicate with each other, and by the time someone assembles the full picture manually, the recovery window has closed.
Three Systems, Three Realities, Zero Communication
A typical 50- to 150-person job shop runs three categories of systems. Each one holds a piece of the production picture. None of them shows the complete view, and none of them knows what the others contain.
The ERP Layer
JobBOSS, Epicor, ProShop, Global Shop Solutions, or a similar system manages orders, job records, material purchasing, and invoicing. The ERP knows what was ordered, what the routing is, what materials are required, and when the job is supposed to ship. What it does not know: where the job is right now on the shop floor, whether the machine it needs is running or sitting idle with an alarm, or whether the material that was supposed to arrive Monday is still on a truck somewhere in Indiana.
The Scheduling Layer
Some shops run dedicated scheduling software: Schedlyzer, Jobpack, Visual Planning, or a module within their ERP. Others use Excel spreadsheets, whiteboards, or magnetic scheduling boards on the shop wall. The scheduling layer shows the plan: which jobs run on which machines, in what sequence, and when. It does not automatically update when the plan breaks down. A machine that goes down at 10 AM might not be reflected in the schedule until someone manually updates it at 2 PM, or the next morning, or never.
The Floor Layer
What is actually happening right now. Machine status: running, idle, down for maintenance, waiting for setup. Job progress: operation 3 of 7, 60% complete. Operator assignment: first shift started the job, second shift is continuing. Material availability: the bar stock is staged at the machine, or it is still sitting on the receiving dock waiting for inspection. This information lives in the heads of supervisors and operators, on paper travelers making their way through the shop, and occasionally in a data collection terminal that feeds a system nobody has opened since last Tuesday.
Where Late Deliveries Are Born
The production manager sits in a meeting at 8 AM and reviews the schedule. Sixteen jobs are supposed to ship this week. The ERP shows materials in house for all of them. The schedule shows capacity allocated. Everything looks on track.
By 11 AM, three things have changed. The Mazak Integrex threw an alarm and is waiting for a maintenance technician who is finishing another repair. The heat treat vendor called to say a batch will return 2 days late. And the quality department placed a hold on a lot of 6061 bar stock because the material cert does not match the purchase order specification. Three of 16 jobs are now at risk. The production manager will not learn any of this until they walk the floor, check email, and make four phone calls. That might happen at 1 PM. It might happen tomorrow morning if the rest of the day fills up with other fires.
The gap between what changed and when the production manager finds out is where late deliveries originate. Every hour of that gap narrows the recovery options. Production visibility shrinks that gap from hours to minutes.
What Production Visibility Actually Means
Production visibility is the ability to see every active job's status, every machine's condition, and every potential delivery risk from a single screen, updated continuously and automatically. The system does not depend on anyone entering status updates manually (though it accepts them). It pulls data from connected systems and calculates job progress, remaining work, and delivery probability in real time without human intervention.
The distinction between a production visibility system and a shop floor data collection system is intelligence. Data collection tells you what happened yesterday. Visibility tells you what is happening now and what will happen by Friday if nothing changes. The AI component bridges the gap between historical reporting and predictive operational awareness.
A production manager using a visibility system opens a dashboard that answers five questions without a single click: Which jobs ship this week? Which of those are behind schedule? Why are they behind? What is the impact if they ship late? What action can be taken right now to recover? The system answers these by synthesizing data from ERP, scheduling, floor status, equipment monitoring, and supplier communications into a single operational picture. One screen. Five answers. Before the first cup of coffee.
What a Production Dashboard Should Show
Every element on the production dashboard should answer a question the production manager or shop owner asks every day. If a dashboard element does not answer a daily question, it is consuming screen space that should be used for something that does.
Job Status View
Every active job listed with current operation, percent complete, scheduled ship date, and a calculated risk indicator: on track, at risk, or behind. The risk indicator is computed automatically from historical data, not manually assigned by a supervisor. If a job sits at operation 4 of 8 with 3 days until ship date, and the remaining operations historically require 4.5 days based on actual cycle times for this part type on these machines, the system flags it as at risk without anyone having to notice the problem.
Jobs are sortable by ship date, customer priority, risk level, and dollar value. The production manager can view all 200 active jobs or filter immediately to the 12 that need attention today.
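The at-risk calculation described above can be sketched in a few lines. This is a minimal illustration, not the actual risk engine: the `JobSnapshot` fields, the job number, and the zero-day default margin are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    job_id: str                     # hypothetical job number
    days_until_ship: float          # working days left before the ship date
    remaining_op_days: list[float]  # historical actual days per remaining operation


def projected_shortfall(job: JobSnapshot) -> float:
    """Days by which remaining work exceeds the time left (negative means slack)."""
    return sum(job.remaining_op_days) - job.days_until_ship


def is_at_risk(job: JobSnapshot, margin_days: float = 0.0) -> bool:
    """Flag the job when the historical remaining time does not fit before ship date."""
    return projected_shortfall(job) > margin_days


# 3 days until ship; the remaining operations historically total 4.5 days.
job = JobSnapshot("J-1042", days_until_ship=3.0, remaining_op_days=[1.5, 1.0, 1.0, 1.0])
print(job.job_id, is_at_risk(job), projected_shortfall(job))
```

A positive `margin_days` would make the flag less sensitive, and a negative one would flag jobs before their slack runs out entirely.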
Machine Status View
Every machine displayed with current status: running (with job number and operator name), idle, in setup, down for maintenance, or down with no assignment. Utilization percentage for the current shift. For machines connected through MTConnect or OPC-UA, real-time cycle count and estimated time to operation completion down to the minute.
This view answers a single question: "Where is my capacity right now?" When the production manager needs to accelerate a late job, they see which machines are available, which operators are on shift and qualified, and which lower-priority jobs can be bumped or rescheduled with the least downstream impact.
Delivery Risk Queue
A prioritized list of jobs at risk of missing their ship date, ranked by customer impact and dollar value. Each entry includes the specific reason the job is at risk (machine down, material delay, quality hold, capacity conflict, outside processing delay), the estimated delay in hours or days, and suggested recovery actions based on current available capacity. This is the screen the production manager opens before anything else in the morning. On a good day, 3 to 5 items. On a bad day, 12 to 15. Either way, they are visible and actionable before the first phone call from a customer asking where their parts are.
Customer Delivery Scorecard
On-time delivery performance by customer over 30-, 60-, and 90-day windows. This view answers the question that determines the long-term health of the business: which customer relationships are we strengthening, and which are we damaging? A customer at 96% OTD over 90 days is a relationship being maintained. A customer at 78% is a relationship being destroyed, one late shipment at a time. The production manager uses this view to prioritize scheduling decisions when two competing jobs need the same machine at the same time.
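A rolling-window scorecard for one customer might be computed as follows. This is a sketch under stated assumptions: shipments are represented as hypothetical `(ship_date, original_promise_date)` pairs, and a window with no shipments reports `None` instead of a percentage.

```python
from datetime import date, timedelta


def otd_by_window(shipments, as_of, windows=(30, 60, 90)):
    """Return {window_days: on-time fraction} for one customer.

    shipments: list of (ship_date, original_promise_date) tuples.
    """
    result = {}
    for days in windows:
        start = as_of - timedelta(days=days)
        in_window = [(s, p) for s, p in shipments if start < s <= as_of]
        if not in_window:
            result[days] = None  # no shipments in this window
            continue
        on_time = sum(1 for s, p in in_window if s <= p)
        result[days] = on_time / len(in_window)
    return result


shipments = [
    (date(2024, 6, 20), date(2024, 6, 21)),  # on time
    (date(2024, 6, 10), date(2024, 6, 5)),   # late
    (date(2024, 5, 15), date(2024, 5, 15)),  # on time
    (date(2024, 4, 10), date(2024, 4, 12)),  # on time
]
scorecard = otd_by_window(shipments, as_of=date(2024, 6, 30))
print(scorecard)
```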
Material and Supplier Status
Outstanding purchase orders with expected delivery dates, cross-referenced against the specific jobs waiting for that material. If Supplier A's shipment of 17-4 PH stainless is 3 days late and four jobs depend on it, all four jobs appear in the delivery risk queue automatically. The system tracks supplier on-time delivery performance over time and factors that reliability history into risk calculations on every future job routed through that supplier.
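The cross-reference between open purchase orders and waiting jobs is essentially a lookup. The sketch below assumes hypothetical PO numbers, job numbers, and a simple `(job_id, po_number, material_needed_by)` record shape; a real ERP schema will differ.

```python
from datetime import date


def jobs_delayed_by_material(open_pos, job_requirements):
    """Return {job_id: days_late} for jobs whose material arrives after they need it.

    open_pos: {po_number: expected_delivery_date}
    job_requirements: list of (job_id, po_number, material_needed_by)
    """
    delayed = {}
    for job_id, po_number, needed_by in job_requirements:
        expected = open_pos.get(po_number)
        if expected is not None and expected > needed_by:
            days_late = (expected - needed_by).days
            # Keep the worst delay if a job waits on several purchase orders.
            delayed[job_id] = max(days_late, delayed.get(job_id, 0))
    return delayed


# One shipment, now expected June 13 instead of June 10, feeds four jobs.
open_pos = {"PO-311": date(2024, 6, 13)}
job_requirements = [
    ("J-201", "PO-311", date(2024, 6, 10)),
    ("J-202", "PO-311", date(2024, 6, 10)),
    ("J-203", "PO-311", date(2024, 6, 11)),
    ("J-204", "PO-311", date(2024, 6, 12)),
]
delayed = jobs_delayed_by_material(open_pos, job_requirements)
print(delayed)
```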
How AI Predicts Late Deliveries Before They Happen
The intelligence layer uses historical production data to predict which jobs are likely to ship late, even when the current schedule suggests they are on track. The predictions improve with every completed job.
Pattern-Based Risk Detection
The system learns from your shop's specific production history. If jobs with more than 5 operations involving outside heat treatment that run on the Mazak Integrex historically take 22% longer than their scheduled time, the system applies that pattern to every new job with similar characteristics. A job that looks on track based on the schedule might carry a moderate risk flag because the system has seen this movie before and knows how it ends. The schedule is optimistic. The data is not.
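One simple way to apply a historical pattern like this is to scale the scheduled time by the observed overrun on similar completed jobs. This is an illustrative stand-in for a learned model, not a description of the actual prediction engine; the history records below are invented so that the overrun works out to 22%.

```python
def overrun_factor(history):
    """Ratio of actual to scheduled hours across similar completed jobs.

    history: list of (scheduled_hours, actual_hours) tuples.
    """
    total_scheduled = sum(scheduled for scheduled, _ in history)
    total_actual = sum(actual for _, actual in history)
    return total_actual / total_scheduled


def adjusted_estimate(scheduled_hours, history):
    """Scheduled time for a new job, scaled by the historical overrun."""
    return scheduled_hours * overrun_factor(history)


# Hypothetical history for jobs with 5+ operations and outside heat treat.
history = [(100, 122), (50, 61)]
factor = overrun_factor(history)           # about 1.22, i.e. 22% longer
estimate = adjusted_estimate(40, history)  # a 40-hour job becomes about 48.8 hours
print(round(factor, 2), round(estimate, 1))
```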
Cascading Delay Analysis
When one job falls behind, the system calculates the full downstream impact. If Job A is delayed 8 hours on Machine 3, and Machine 3 is scheduled to start Job B immediately after, the system flags both jobs and traces the cascade through the entire schedule. In a tightly loaded shop, a single delay can propagate through 5 to 10 downstream jobs within 48 hours. No production manager can trace these cascades manually across 200 active jobs in real time. The system does it continuously, recalculating every time a status changes.
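The cascade on a single machine can be sketched as a walk down that machine's queue, where planned idle gaps absorb part of the delay. This is a deliberate simplification: it follows one machine only, while a full implementation would also follow each job's routing to other machines. The queue layout and job names are hypothetical.

```python
def propagate_delay(queue, delayed_job, delay_hours):
    """Trace a delay down one machine's queue.

    queue: list of (job_id, planned_start, planned_end) in run order, hours from now.
    Returns {job_id: hours_late} for the delayed job and every job it pushes back.
    """
    pushed = {}
    carry = 0.0
    previous_end = None
    for job_id, start, end in queue:
        if job_id == delayed_job:
            carry = delay_hours
        elif carry > 0:
            # Planned idle time between jobs absorbs part of the delay.
            carry = max(0.0, carry - (start - previous_end))
        if carry > 0:
            pushed[job_id] = carry
        previous_end = end
    return pushed


# Job A slips 8 hours; B starts right after A, and C has a 2-hour gap before it.
machine_3 = [("A", 0, 8), ("B", 8, 14), ("C", 16, 20), ("D", 20, 24)]
late = propagate_delay(machine_3, "A", 8)
print(late)
```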
Capacity Conflict Detection
The system compares the total remaining work across all active jobs against available machine hours and operator hours for each remaining shift before ship dates. When the math does not work, when there are more required hours than available hours, the system identifies which specific jobs cannot physically be completed on time and surfaces them 3 to 5 days before the capacity crunch actually hits. That window is the difference between planned overtime and a panicked phone call to the customer on ship day.
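The required-versus-available comparison can be illustrated for one machine group. The sketch assumes jobs are run in earliest-ship-date order and that capacity is a flat number of hours per day; both are simplifications, and the job data is made up.

```python
def capacity_conflicts(jobs, hours_per_day):
    """Return job IDs that cannot finish by their ship date on one machine group.

    jobs: list of (job_id, remaining_hours, days_until_ship).
    Jobs are loaded in earliest-ship-date order (an assumed sequencing rule).
    """
    conflicts = []
    hours_committed = 0.0
    for job_id, remaining_hours, days_until_ship in sorted(jobs, key=lambda j: j[2]):
        hours_committed += remaining_hours
        hours_available = days_until_ship * hours_per_day
        if hours_committed > hours_available:
            conflicts.append(job_id)
    return conflicts


jobs = [("A", 20, 1), ("B", 10, 2), ("C", 30, 3)]
conflicts = capacity_conflicts(jobs, hours_per_day=16)  # two 8-hour shifts
print(conflicts)
```

Here job A needs 20 hours but only 16 exist before its ship date, and job C is squeezed out by the work ahead of it, so both surface while there is still time to plan overtime.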
Supplier Risk Scoring
The system tracks every purchase order and outside processing order against its promised delivery date. Over months, it builds a reliability score for each supplier based on actual performance. A heat treat vendor who delivers on time 72% of the time carries a fundamentally different risk weighting than one who delivers at 95%. Jobs routed through lower-reliability suppliers carry higher risk scores automatically, giving the production manager earlier warning on the work most likely to be affected by supply chain delays. The system does not assume your vendors will hit their dates. It knows which ones actually do.
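A basic reliability score is just the supplier's observed on-time fraction. The sketch below is one plausible formulation, not the scoring model itself; the vendor histories are fabricated to reproduce the 72% and 95% figures above, and the `default` weight for suppliers with no history is an assumption.

```python
from datetime import date


def on_time_rate(deliveries):
    """Fraction of deliveries on or before the promised date.

    deliveries: list of (promised_date, actual_date) tuples.
    """
    if not deliveries:
        return None
    on_time = sum(1 for promised, actual in deliveries if actual <= promised)
    return on_time / len(deliveries)


def supplier_risk_weight(deliveries, default=0.5):
    """Risk weight in [0, 1]: the observed share of late deliveries.

    `default` is an assumed weight for suppliers with no delivery history yet.
    """
    rate = on_time_rate(deliveries)
    return default if rate is None else 1.0 - rate


promised = date(2024, 3, 1)
vendor_a = [(promised, date(2024, 3, 1))] * 18 + [(promised, date(2024, 3, 4))] * 7
vendor_b = [(promised, date(2024, 3, 1))] * 19 + [(promised, date(2024, 3, 2))] * 1
print(on_time_rate(vendor_a), on_time_rate(vendor_b))
```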
Equipment Monitoring and the Utilization Gap
Equipment monitoring feeds directly into the production visibility system because machine utilization is a primary variable in delivery performance. Understanding it requires more than a general sense of how busy the shop floor looks at any given moment.
Connecting to Machines
Modern CNC equipment from Mazak, DMG MORI, Haas, Okuma, and Doosan supports data output through MTConnect or OPC-UA protocols. These connections provide real-time cycle status, spindle load, feed rates, alarm codes, and program information. The connection is configured as read-only, so it does not change machine parameters or affect machine operation.
For older equipment without native connectivity, retrofit sensors provide the critical data at low cost. A current sensor on the machine's main power circuit detects cycle start, cycle end, and idle states. Vibration sensors on the spindle housing track cutting conditions and flag anomalies that indicate tool wear or bearing degradation. These sensors cost $200 to $800 per machine, install in under an hour with no machine downtime, and provide 80% of the utilization data available from a full MTConnect connection.
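To show how a current sensor translates into utilization data, here is a minimal sketch that classifies equally spaced amperage readings. The two thresholds are assumed values for illustration only; in practice they would need to be calibrated per machine.

```python
def classify(amps, idle_amps=2.0, cutting_amps=8.0):
    """Map one current reading to a machine state (thresholds are assumptions)."""
    if amps < idle_amps:
        return "off"
    if amps < cutting_amps:
        return "idle"
    return "running"


def summarize(readings):
    """Return (utilization, cycle_starts) for equally spaced current readings."""
    states = [classify(a) for a in readings]
    running = states.count("running")
    # A cycle start is any transition into "running" from another state.
    cycle_starts = sum(
        1
        for previous, current in zip(["off"] + states, states)
        if current == "running" and previous != "running"
    )
    return running / len(states), cycle_starts


readings = [0.5, 3.0, 3.1, 12.0, 12.5, 11.8, 3.0, 12.2, 12.1, 3.0]
utilization, cycle_starts = summarize(readings)
print(utilization, cycle_starts)
```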
What Utilization Data Reveals
Most job shops estimate their equipment utilization at 60% to 75%. Actual measured utilization, tracked by the minute across a full week, typically comes in at 35% to 55%. That gap between perception and reality is large enough to change capital spending decisions, scheduling strategies, and delivery commitments.
When the data shows that the Haas VF-6 runs at 41% utilization, you can investigate exactly why. Setup time accounts for 18%. Waiting for material accounts for 12%. Unplanned downtime accounts for 8%. Scheduled maintenance accounts for 4%. The remaining 17% is idle time with no assigned reason. That 17% is recoverable capacity: roughly 2.7 hours per day on a two-shift, 16-hour operation. Recovering half of it through better scheduling and material staging delivers the productive equivalent of a $400,000 machine purchase without the capital outlay, the delivery wait, or the floor space.
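The breakdown can be checked with simple arithmetic. The sketch below converts each percentage into hours per day, assuming two 8-hour shifts (16 scheduled hours); the shift length is an assumption, and a different schedule scales the result proportionally.

```python
# Share of scheduled time by category for one machine (percentages from the example).
breakdown_pct = {
    "cutting": 41,
    "setup": 18,
    "waiting for material": 12,
    "unplanned downtime": 8,
    "scheduled maintenance": 4,
    "unexplained idle": 17,
}


def hours_per_day(pct, scheduled_hours=16.0):
    """Convert a share of scheduled time into hours per day."""
    return scheduled_hours * pct / 100


total_pct = sum(breakdown_pct.values())  # should account for all scheduled time
idle_hours = hours_per_day(breakdown_pct["unexplained idle"])
recoverable_hours = idle_hours / 2       # recovering half of the unexplained idle time
print(total_pct, round(idle_hours, 2), round(recoverable_hours, 2))
```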
Utilization as a Delivery Recovery Lever
When a job is at risk of shipping late, the production manager needs to know precisely where the capacity exists to recover it. Utilization data answers that question in seconds instead of hours. Machine 4 has 6 hours of available capacity on second shift this week. Machine 11 has 4 hours on Thursday. Operator Martinez is qualified on both machines and available for overtime. The recovery plan assembles itself when the data is visible. Without the data, the production manager makes phone calls, walks the floor, guesses at machine availability, and burns two hours of management time that could have been spent running the recovery.
Data Architecture: Connecting Without Replacing
A production visibility system connects to three data layers. The architecture reads from existing systems without modifying, replacing, or disrupting any of them.
ERP Connection
The system reads order data, job routings, material requirements, ship dates, and customer information from your ERP. For systems with API access (ProShop, Epicor Kinetic, Plex), the connection is real-time and continuous. For systems using database queries (JobBOSS, Global Shop Solutions, IQMS), the system reads on a configurable interval, typically every 5 to 15 minutes. The ERP is not modified in any way. The connection is read-only. Your team continues using the ERP exactly as they do today.
Scheduling Connection
If the shop uses dedicated scheduling software, the system reads the current schedule and compares it against actual floor status in real time. If the shop uses spreadsheets or whiteboards (and many do, because no scheduling software has earned their trust), the production visibility system can serve as the scheduling layer, generating recommended job sequences based on ship dates, machine capacity, material availability, and the risk scores it has computed.
Floor Data Connection
Machine data flows through MTConnect adapters, OPC-UA servers, or retrofit sensors. Operator inputs (job started, operation completed, quality hold, material issue) come through simple terminal or tablet interfaces at each work center. Each input takes under 10 seconds. Some shops use barcode scanning against the job traveler to log operation completions, which takes 3 seconds and requires no typing. The system accepts whatever input method the floor team will actually use consistently, because the best data collection system in the world is useless if nobody uses it.
The three data streams converge in the visibility platform, which reconciles discrepancies (ERP says material is available, floor input says it never arrived), calculates job progress against plan, and generates the risk assessments and dashboard views the production team relies on every shift.
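The reconciliation step can be pictured as a comparison of what each source claims about the same job. This sketch covers only the material-availability case mentioned above and assumes a hypothetical `{job_id: bool}` shape for each source; a real platform would reconcile many more fields.

```python
def reconcile_material_status(erp_status, floor_status):
    """Return jobs where the ERP and the floor disagree about material availability.

    erp_status / floor_status: {job_id: bool}, True meaning material is available.
    Jobs with no floor input are skipped rather than treated as conflicts.
    """
    conflicts = []
    for job_id, erp_says in erp_status.items():
        floor_says = floor_status.get(job_id)
        if floor_says is not None and floor_says != erp_says:
            conflicts.append({"job": job_id, "erp": erp_says, "floor": floor_says})
    return conflicts


erp_status = {"J-301": True, "J-302": True, "J-303": False}
floor_status = {"J-301": True, "J-302": False}  # operator logged "material never arrived"
conflicts = reconcile_material_status(erp_status, floor_status)
print(conflicts)
```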
Implementation: Thirty Days to First Value
Implementing a production visibility system follows a phased approach that produces measurable results within the first 30 days of the dashboard going live.
Weeks 1-2: Assessment. Map the current production management workflow end to end. Identify every system holding production data: ERP, scheduling tools, spreadsheets, whiteboards, machine controllers. Document how the production manager currently tracks job status and how long it takes. Quantify the current OTD rate against original promise dates, the number of late deliveries per month, and the direct costs of late delivery over the last 12 months (expediting, overtime, premium freight, contract penalties). These numbers establish the baseline the system is measured against.
Weeks 2-4: Data connection. Connect to the ERP. Configure machine data collection for the bottleneck equipment first, typically 8 to 12 machines carrying 70% of the production load. Install floor data input terminals at key work centers. Validate that data flows correctly from all three sources into the visibility platform. Test the reconciliation logic on real production data.
Weeks 3-6: System build. Configure the dashboard views the production team will use daily. Build the risk calculation engine using the shop's historical production data (more history produces more accurate initial predictions, but even 12 months of data provides a useful starting point). Set customer priority rules. Define alert thresholds. Create the interface that the production manager, supervisors, and shop owner will actually open every morning.
Weeks 5-8: Pilot and refine. Run the system alongside existing processes for 2 to 3 weeks. Compare the system's risk predictions against actual outcomes. Track every prediction: which jobs the system flagged that actually shipped late, which jobs it missed, and which jobs it flagged that the team recovered through intervention. Refine the algorithms based on those results. Let the production team drive interface improvements based on how they actually use the tool.
Weeks 7-10: Full deployment. Transition to the visibility system as the primary production management tool. Train all supervisors and lead personnel. Expand machine connectivity to remaining equipment. Establish the daily operating rhythm: production manager reviews the dashboard at shift start, addresses the risk queue by priority, and verifies status at mid-shift.
The system improves continuously after deployment. Every completed job provides new data that refines the risk prediction models. Within 90 days, delivery risk predictions typically reach 80% to 85% accuracy. Within 6 months, they typically reach 85% to 90%. The system learns your shop's specific patterns, your specific machines, your specific suppliers, and your specific customers. This is how AI-powered production intelligence works in practice, as covered in our complete guide to AI for manufacturers.
Getting Reliable Data from the Floor
The biggest implementation challenge has nothing to do with technology. It is getting consistent data input from operators and supervisors on the shop floor. The system depends on knowing when jobs start, when operations complete, and when issues occur. Blind spots in floor data produce blind spots in risk prediction.
Designing for the Floor
The input interface must be simple enough that an operator can log an event in under 10 seconds without removing safety glasses or gloves. Barcode scanning against the job traveler is the most reliable method: one scan at operation start, one scan at completion. Two interactions per operation, 3 seconds each.
Touchscreen terminals at work centers handle non-standard events: machine down, quality hold, waiting for material, setup in progress. The interface uses 6 to 8 large buttons covering 95% of all events that occur on the floor. No typing. No dropdown menus. No login screens. The terminal stays active and ready. If an operator has to remember a password to report a machine down, the machine down goes unreported.
Tablet-based interfaces serve supervisors who move between work centers throughout the shift. They update multiple job statuses, log material arrivals, and add context notes without returning to a fixed terminal.
Building the Habit
The first 2 weeks of floor data collection determine whether the system achieves long-term adoption or dies quietly. The mechanism is simple: if operators see that their inputs produce visible results, they keep logging. If the data disappears into a system nobody checks, they stop within a month.
The production manager's daily use of the dashboard is the most powerful adoption driver on the floor. When an operator reports a machine down event and a maintenance technician arrives within 30 minutes because the system flagged it and the production manager acted on it, the operator learns that reporting matters. When a supervisor reports a material shortage and purchasing expedites within the hour because the visibility system showed the delivery impact across four downstream jobs, the supervisor reports the next shortage without being asked. The feedback loop between floor input and management response is what makes data collection sustainable.
The Six Metrics That Drive Improvement
Production visibility generates enormous amounts of data. The metrics that actually drive operational improvement are a focused subset. Track these six and let everything else serve as supporting detail.
On-time delivery rate. Measured against the original promise date, not the last revised date. This is the metric customers use to decide whether to send you their next program. Target: 93% or above for general job shop work, 97% or above for high-consequence aerospace and medical device customers who will remove you from their approved supplier list at 90%.
Risk queue accuracy. What percentage of jobs flagged as at-risk actually ship late if no intervention occurs. This measures the quality of the system's predictions and improves with every completed job. The number starts at 65% to 70% in the first month and reaches 85% to 90% within 6 months as the model learns your shop's specific patterns.
Recovery rate. What percentage of at-risk jobs are recovered to on-time status through intervention after the system flags them. A high recovery rate means the system provides enough warning time for the production team to act. If jobs are being flagged and still shipping late, either the warning comes too late or the recovery resources are insufficient. Both are diagnosable and fixable.
Equipment utilization (OEE). Overall Equipment Effectiveness measured as availability multiplied by performance multiplied by quality. Most job shops have never had the data to calculate true OEE by machine. The visibility system provides it automatically for every connected piece of equipment. Industry benchmark for job shop OEE is 60%. Most shops measure between 35% and 50% when the data is honest.
Schedule adherence. How closely actual production follows the planned sequence. Low schedule adherence indicates poor scheduling, frequent interruptions, or both. The visibility system identifies which specific causes drive the most deviation and quantifies their impact in lost hours and late deliveries.
Average days late. For jobs that ship late, how late are they on average. Reducing average late days from 5.2 to 2.1 represents a meaningful improvement in customer impact even before OTD percentage reaches target. Customers notice the difference between a shipment that arrives a day late and one that arrives a week late, and they respond accordingly.
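Several of these metrics reduce to short formulas. The sketch below computes OTD against either promise date, average days late, and OEE; the job records and field names are hypothetical, and the OEE inputs are example values chosen only to show the multiplication.

```python
from datetime import date


def otd_rate(jobs, promise_field):
    """Fraction of jobs shipped on or before the chosen promise date."""
    on_time = sum(1 for job in jobs if job["shipped"] <= job[promise_field])
    return on_time / len(jobs)


def average_days_late(jobs, promise_field="original_promise"):
    """Mean lateness in days, counting only jobs that shipped late."""
    late = [
        (job["shipped"] - job[promise_field]).days
        for job in jobs
        if job["shipped"] > job[promise_field]
    ]
    return sum(late) / len(late) if late else 0.0


def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as the product of its three factors."""
    return availability * performance * quality


jobs = [
    {"shipped": date(2024, 6, 10), "original_promise": date(2024, 6, 10), "last_promise": date(2024, 6, 10)},
    {"shipped": date(2024, 6, 14), "original_promise": date(2024, 6, 10), "last_promise": date(2024, 6, 14)},
    {"shipped": date(2024, 6, 20), "original_promise": date(2024, 6, 18), "last_promise": date(2024, 6, 19)},
    {"shipped": date(2024, 6, 5), "original_promise": date(2024, 6, 7), "last_promise": date(2024, 6, 7)},
]
otd_original = otd_rate(jobs, "original_promise")
otd_revised = otd_rate(jobs, "last_promise")
print(otd_original, otd_revised, average_days_late(jobs), round(oee(0.80, 0.75, 0.95), 2))
```

The second job shows why the choice of promise date matters: it counts as on time against the revised date but four days late against the original.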
ROI: The Math on Visibility
Production visibility ROI comes from four sources. Each one is measurable within the first 90 days of operation, using numbers your shop already tracks.
Direct Late Delivery Cost Reduction
Tally your expediting charges, premium freight costs, overtime for late order recovery, and contract penalties over the last 12 months. Most shops in the 50 to 200 employee range spend $120,000 to $400,000 annually on these costs, and many undercount because the costs are scattered across different line items and departments. A visibility system that improves OTD from 82% to 91% typically reduces these direct costs by 40% to 60% in the first year. On a shop spending $250,000 annually on late delivery costs, that is $100,000 to $150,000 recovered.
Revenue Retention
Customer attrition from chronic delivery problems is the largest cost of poor visibility and the hardest to measure precisely, because customers who leave rarely announce the reason. One approach: identify every customer whose order volume declined more than 20% year-over-year and investigate whether delivery performance was a contributing factor. In manufacturing, the departure signal is not a phone call. It is simply fewer purchase orders, quarter after quarter, until the customer is gone. For a shop running $8 million in annual revenue, losing 3% to delivery-driven attrition is $240,000 per year. Retaining even half of that through improved OTD more than justifies the entire visibility system investment.
Capacity Recovery
Better visibility into equipment utilization and production flow reveals capacity that was previously invisible because no one measured it. A shop that discovers 15% more usable capacity in its existing equipment can take on additional work without capital investment. On a floor running $3 million in CNC equipment, recovering 15% utilization is the functional equivalent of purchasing $450,000 in new machines, without the capital outlay, the 16-week delivery wait, or the additional floor space.
Labor Efficiency
Production managers, supervisors, and planners spend 2 to 4 hours per day gathering status information: walking the floor, checking three systems, making phone calls, sitting in status meetings that exist only because no one has a current view of production. A visibility system that consolidates this information onto a single screen recovers 50% to 70% of that time. For a shop with 3 people in production management roles, that is roughly 3 to 8 hours per day of management capacity redirected from status chasing to process improvement, customer communication, and capacity planning. The work that actually moves the business forward.
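The four ROI sources can be worked through with the example figures used in this section. This is back-of-envelope arithmetic, not a forecast; every input is one of the illustrative numbers above and should be replaced with your shop's own data.

```python
def pct_of(amount, pct):
    """Return pct percent of amount."""
    return amount * pct / 100


# 1. Direct late delivery costs: $250,000 per year, reduced by 40% to 60%.
direct_savings = (pct_of(250_000, 40), pct_of(250_000, 60))

# 2. Revenue retention: 3% attrition on $8 million, half of it retained.
attrition_loss = pct_of(8_000_000, 3)
revenue_retained = attrition_loss / 2

# 3. Capacity recovery: 15% more usable capacity on $3 million of equipment.
capacity_equivalent = pct_of(3_000_000, 15)

# 4. Labor: 3 people spending 2 to 4 hours per day, 50% to 70% recovered.
labor_hours_per_day = (pct_of(3 * 2, 50), pct_of(3 * 4, 70))

print(direct_savings, attrition_loss, revenue_retained)
print(capacity_equivalent, labor_hours_per_day)
```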
Frequently Asked Questions
Do we need to replace our ERP?
No. The visibility system reads from your existing ERP without modifying it. Your ERP remains the system of record for orders, invoicing, and job costing. The visibility system adds an intelligence layer on top that your ERP was never designed to provide, and both run in parallel. Your team continues using the ERP exactly as they do today. For details on how AI tools integrate with JobBOSS, Epicor, ProShop, and other common manufacturing ERPs, see our guide to AI quoting, which covers ERP integration architecture in depth.
What if our machines are too old for data collection?
Retrofit sensors work on any machine with a power circuit, regardless of age or manufacturer. A 1990s Haas or Fadal without MTConnect capability can be monitored for cycle status, idle time, and utilization through a $400 current sensor and a $200 wireless transmitter. The data is less granular than what a 2024 Mazak provides natively, but it covers the 80% that matters most for delivery management: is the machine running or not, and if not, how long has it been idle. Full OEE calculation requires additional data points, but basic utilization tracking and delivery risk assessment work with retrofit sensors on any equipment.
How do we get operators to actually log data?
Two requirements, both non-negotiable. First, design the input for the floor. Barcode scanning takes 3 seconds. Large touchscreen buttons take 5 seconds. No logins. No typing. No menus. No password resets. Second, produce visible results from their inputs. When operators see that reporting a machine down event triggers a maintenance response within 30 minutes, they report the next event. When reporting a material shortage triggers purchasing action within the hour because the system showed the delivery impact, the supervisor reports without being asked. The production manager's daily, visible use of the dashboard is the single strongest driver of floor-level adoption.
How much does this cost?
A production visibility system for a shop with 30 to 150 employees and 15 to 40 machines typically costs $90,000 to $200,000 for the initial build, including ERP integration, machine connectivity hardware and configuration, floor input terminals, dashboard development, and the AI risk prediction engine. Ongoing costs for system maintenance, model refinement, and expanding connectivity to additional equipment run $3,000 to $6,000 per month. Compare that to one lost customer, or one year of expediting charges, or one capital equipment purchase that could have been avoided with better utilization of the machines you already own.
How fast will we see results?
The dashboard provides immediate value from the day it goes live, because the simple act of consolidating scattered data into a single view reveals problems that were previously invisible. Production managers consistently report finding 3 to 5 at-risk jobs on the first morning that they would not have caught until days later under the old process. Risk prediction accuracy improves over the first 90 days as the system accumulates data specific to your operation. Most shops see measurable OTD improvement within the first 60 days, with full impact realized by month 4 to 6 as the team integrates the system into their daily workflow and the risk models reach mature accuracy.
Can this work for a shop that runs both production and job shop work?
Yes. The system handles mixed-mode manufacturing by applying different tracking and risk models to different job types. Repeat production jobs have predictable cycle times and established routings, so risk predictions are highly accurate from the start. First-run job shop work carries more variability, so the system applies wider risk margins and flags uncertainty explicitly. Both job types appear on the same dashboard, sorted and filtered however the production team prefers. The system adapts to how your shop actually runs, not to a theoretical model of how manufacturing is supposed to work.
See Where Your Deliveries Are Actually At Risk
We start with an assessment: we map your production data, identify the visibility gaps, and show you what a connected system looks like with your data.