Beyond Order-Taking: Roy Baharav on Why the Drive-Thru Pickup Window Is the Next AI Frontier
In quick-service restaurants, the drive-thru is no longer a supporting channel; it is the core revenue engine, accounting for as much as 70% of sales in many brands. Over the past decade, operators have invested heavily in optimizing throughput at the speaker box and efficiency in the kitchen, but the system has remained fundamentally fragmented. Each stage of the experience has been measured independently, leaving a critical gap in understanding what actually happens in the moments that determine customer satisfaction: the pickup window.
As labor constraints tighten and customer expectations rise, the industry is beginning to confront a structural limitation in how it measures performance. Speed-of-service metrics capture how fast a car moves through the lane, but not whether the guest received the correct order, was greeted properly, or had a positive interaction at the final point of contact. That gap has created an environment where operational decisions are made with partial visibility, and where the most human part of the experience remains largely unmeasured.
It is in this context that companies like Hi Auto are expanding the scope of voice AI beyond order-taking into what the company calls “window intelligence,” an attempt to bring structure, data, and accountability to the pickup window for the first time.
The overlooked gap in drive-thru performance
The pickup window has historically been treated as an extension of fulfillment rather than a distinct operational environment. According to Roy Baharav, CEO of Hi Auto, this blind spot persisted not because it was unimportant, but because it was previously impossible to measure, and, in many cases, simply not visible as a problem until operators began asking different questions.

The motivation to extend visibility into the window did not come purely from internal product strategy. It also came directly from operators in the field who were already feeling the gap. Baharav points to feedback from Chuck Doran, Co-Owner at Lee’s Famous Recipe Chicken, as a defining example of how the problem surfaced operationally rather than theoretically.
“Chuck Doran, Co-Owner at Lee’s, described the pickup window as historically a blind spot and asked us to extend AI past ordering.”
This operator-driven perspective reinforced what the data was already suggesting: that the window is not just a fulfillment endpoint, but a critical interaction layer where friendliness, escalation handling, and protocol adherence directly shape the customer experience. Baharav notes that once visibility is extended into this layer, entirely new categories of operational signals become measurable.
“It unlocks the ability to understand friendliness, escalations, protocol accuracy, and other behaviors.”
It also revealed an architectural advantage in connecting systems that were previously separated. Because Hi Auto already captures structured order data at the speaker, linking that information with window audio creates a continuous chain of truth across the entire drive-thru journey.
“It also fit the architecture. Our existing AI already knows what was ordered at every drive-thru we power. Combining order data with window audio enables things other systems can’t do, like flagging POS voids that have no corresponding conversation.”
From fragmented metrics to end-to-end visibility
The concept of end-to-end visibility reframes the drive-thru not as a series of disconnected steps, but as a continuous system. For operators, this shift is significant because it connects decisions made at the speaker with outcomes observed at the window and beyond.
“End-to-end means from the moment a car pulls in to the moment it pulls away with food.”
This matters because drive-thrus represent the majority of revenue for many quick-service restaurants, yet optimization efforts have traditionally been siloed across order accuracy, kitchen preparation, and window timing. Without a unified view, operators could not trace how issues propagate across the system.
“Drive-thrus account for up to 70% of QSR revenue. Operators have optimized pieces of the experience in isolation: order accuracy, kitchen prep, and window times. They couldn’t connect them.”
How window intelligence works in practice
Window intelligence builds on the existing AI infrastructure used for order-taking by extending visibility into the final interaction point. The system begins with audio capture at the pickup window, where speech recognition and speaker separation distinguish between employee and guest voices. From there, behavioral models analyze the interaction for patterns that correlate with operational outcomes.
“A microphone at the pickup window captures every guest-employee exchange. Speech recognition processes it. Speaker separation distinguishes employee from guest. Then analysis: greeting detection, tone, friendliness signals, dispute escalation, and accuracy flags when guests raise issues at the window.”
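Hi Auto has not published implementation details for this pipeline, but the description above maps naturally onto a staged analysis flow: transcribed and speaker-separated turns go in, behavioral flags come out. The sketch below is purely illustrative; the data shapes, keyword heuristics, and function names are hypothetical stand-ins for the company's actual models, and the input is assumed to arrive already transcribed and diarized.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

GREETING_CUES = ("hi", "hello", "welcome", "thank you")
ESCALATION_CUES = ("wrong", "missing", "manager", "refund", "forgot")

@dataclass
class WindowInteraction:
    """One guest-employee exchange at the pickup window, after speech
    recognition and speaker separation have produced labeled turns."""
    store_id: str
    turns: List[Tuple[str, str]]            # ("employee" | "guest", utterance)
    flags: Dict[str, bool] = field(default_factory=dict)

def flag_window_behaviors(interaction: WindowInteraction) -> WindowInteraction:
    """Toy keyword heuristics standing in for the behavioral models described
    above: greeting detection on employee turns, escalation cues on guest turns."""
    employee_text = " ".join(t for s, t in interaction.turns if s == "employee").lower()
    guest_text = " ".join(t for s, t in interaction.turns if s == "guest").lower()

    interaction.flags["greeted"] = any(c in employee_text for c in GREETING_CUES)
    interaction.flags["possible_escalation"] = any(c in guest_text for c in ESCALATION_CUES)
    return interaction

# Usage with a hand-written transcript; in practice the turns would come from
# noise-robust speech recognition plus speaker separation on the window mic feed.
example = WindowInteraction(
    store_id="store-042",
    turns=[
        ("employee", "Hi, welcome back! That's two number threes."),
        ("guest", "Thanks, but last time my fries were missing."),
    ],
)
print(flag_window_behaviors(example).flags)
# {'greeted': True, 'possible_escalation': True}
```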
What makes this approach operationally powerful is the ability to connect real-time conversation data with POS records. This enables operators to detect inconsistencies that were previously invisible, such as voided orders without corresponding dialogue or changes made after the original order was placed.
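To make the reconciliation idea concrete, the minimal sketch below joins a list of POS void events against window conversations and flags voids with no nearby conversation that discusses a change. The record shapes, field names, and five-minute matching window are assumptions for illustration, not Hi Auto's schema.

```python
from datetime import datetime, timedelta

# Hypothetical record shapes; the real POS and conversation schemas are not public.
pos_voids = [
    {"order_id": "A102", "voided_at": datetime(2025, 6, 1, 12, 14)},
]
window_conversations = [
    {"order_id": "A101", "started_at": datetime(2025, 6, 1, 12, 10),
     "mentions_change": False},
]

def flag_unexplained_voids(voids, conversations, window=timedelta(minutes=5)):
    """Flag POS voids with no matching window conversation that mentions a change.
    Toy join logic illustrating the order-data-plus-window-audio idea above."""
    suspicious = []
    for v in voids:
        explained = any(
            c["order_id"] == v["order_id"]
            and abs(c["started_at"] - v["voided_at"]) <= window
            and c["mentions_change"]
            for c in conversations
        )
        if not explained:
            suspicious.append(v["order_id"])
    return suspicious

print(flag_unexplained_voids(pos_voids, window_conversations))  # ['A102']
```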
Engineering complexity in a real-world environment
Unlike controlled digital environments, the drive-thru presents constant acoustic and behavioral variability. Engines, weather conditions, overlapping conversations, and inconsistent speaker positioning all introduce noise that standard speech systems are not designed to handle.
“The acoustic environment. Engines, wind, music, multiple speakers from the kitchen and the car, varied distances from the mic. Off-the-shelf speech-to-text doesn’t hold up.”
At scale, the challenge extends beyond recognition into real-time system integration. Orders must be transmitted instantly to kitchen systems to avoid delays, while also preserving accuracy across highly customized requests.
“A guest says ‘give me a number three, but swap the drink for a large sweet tea, and actually make two of them.’ That’s intent detection with menu structure and modifier logic.”
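The structured output such an utterance has to resolve into looks roughly like the sketch below: a combo item from the menu, a modifier replacing the default drink, and a quantity. The menu entries, class names, and resolution helper are invented for illustration; in a production system this structure would be produced by an intent-detection model rather than hand-built rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical menu structure; real menus carry far richer modifier logic.
MENU = {
    "number three": {"base": "Combo #3", "default_drink": "Medium Soda"},
}

@dataclass
class OrderLine:
    item: str
    quantity: int = 1
    modifiers: List[str] = field(default_factory=list)

def resolve_order(combo_key: str, drink_swap: Optional[str], quantity: int) -> OrderLine:
    """Toy resolution of a detected intent against menu and modifier rules."""
    combo = MENU[combo_key]
    mods = []
    if drink_swap:
        mods.append(f"Swap {combo['default_drink']} -> {drink_swap}")
    return OrderLine(item=combo["base"], quantity=quantity, modifiers=mods)

# "Give me a number three, but swap the drink for a large sweet tea,
#  and actually make two of them."
print(resolve_order("number three", "Large Sweet Tea", quantity=2))
# OrderLine(item='Combo #3', quantity=2, modifiers=['Swap Medium Soda -> Large Sweet Tea'])
```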
ROI across labor, accuracy, and ticket size
While window intelligence is still in closed beta, Hi Auto’s broader platform already demonstrates measurable operational impact across speed, accuracy, and labor utilization. “We see that Hi Auto is able to improve Speed of Service by 15-20 seconds.”
Accuracy levels also outperform industry benchmarks, particularly in hybrid human-AI configurations. “96% across our platform. 97% at Bojangles in the InTouch Insight & QSR Magazine 2025 Emerging Technology Study.”
Labor efficiency is another key driver of adoption, with operators reallocating saved hours toward guest-facing activities rather than simply reducing headcount. “3 to 8 hours saved per store, per day… that’s $1,500 to $4,000 per location in monthly savings.”
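Those figures are consistent with typical quick-service labor costs: assuming a fully loaded rate of roughly $16 per hour, 3 hours a day over a 30-day month works out to about $1,440, and 8 hours a day to about $3,840.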
A structural shift, not a temporary solution
For operators, the long-term implication is not simply automation, but a redefinition of roles within the store. AI systems are increasingly absorbing transactional work, allowing employees to focus on hospitality and execution quality at the window.
“We take one task off the team’s plate, order taking. The team stays for everything else.”
That shift is already visible in workforce stability and operational consistency. “Stores running Hi Auto see a 17% reduction in employee turnover.”
As adoption scales across nearly 1,000 locations processing over 100 million orders annually, the strategic question for operators is no longer whether AI can function in the drive-thru, but why parts of the experience remain unmeasured at all.
:::tip
This story was distributed as a release by Jon Stojan under HackerNoon’s Business Blogging Program.
:::