The missing data link in enterprise AI: Why agents need streaming context, not just better prompts

Metro Loud

Enterprise AI agents today face a fundamental timing problem: They can't easily act on critical business events because they aren't always aware of them in real time.

The challenge is infrastructure. Most enterprise data lives in databases fed by extract-transform-load (ETL) jobs that run hourly or daily, which is ultimately too slow for agents that must respond in real time.

One potential way to tackle that challenge is to have agents interface directly with streaming data systems. Among the primary approaches in use today are the open-source Apache Kafka and Apache Flink technologies. There are several commercial implementations based on those technologies, too. Confluent, which is led by the original creators of Kafka, is one of them.

Today, Confluent is introducing a real-time context engine designed to solve this latency problem. The technology builds on Apache Kafka, the distributed event streaming platform that captures data as events occur, and open-source Apache Flink, the stream processing engine that transforms those events in real time.

The company is also releasing an open-source framework, Flink Agents, developed in collaboration with Alibaba Cloud, LinkedIn and Ververica. The framework brings event-driven AI agent capabilities directly to Apache Flink, allowing organizations to build agents that monitor data streams and trigger automatically based on conditions without committing to Confluent's managed platform.

"Today, most enterprise AI systems can't respond automatically to important events in a business without someone prompting them first," Sean Falconer, Confluent's head of AI, told VentureBeat. "This leads to lost revenue, unhappy customers or added risk when a payment fails or a network malfunctions."

The significance extends beyond Confluent's specific products. The industry is recognizing that AI agents require different data infrastructure than traditional applications. Agents don't just retrieve information when asked. They need to observe continuous streams of business events and act automatically when conditions warrant. This requires streaming architecture, not batch pipelines.

Batch versus streaming: Why RAG alone isn't enough

To understand the problem, it's important to distinguish between the different approaches to moving data through enterprise systems and how they can connect to agentic AI.

In batch processing, data accumulates in source systems until a scheduled job runs. That job extracts the data, transforms it and loads it into a target database or data warehouse. This might occur hourly, daily or even weekly. The approach works well for analytical workloads, but it creates latency between when something happens in the business and when systems can act on it.

Data streaming inverts this model. Instead of waiting for scheduled jobs, streaming platforms like Apache Kafka capture events as they occur. Every database update, user action, transaction or sensor reading becomes an event published to a stream. Apache Flink then processes these streams to join, filter and aggregate data in real time. The result is processed data that reflects the current state of the business, updating continuously as new events arrive.
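The gap between the two models can be illustrated with a minimal sketch in plain Python. It does not use Kafka or Flink, and every name in it (Event, batch_view, StreamingView, the hourly run at 3,600 seconds) is a hypothetical stand-in chosen for illustration. The batch view only knows about events that existed at the last scheduled run, while the streaming view updates its totals as each event arrives.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int          # event time in seconds (hypothetical unit)
    account: str
    amount: float


def batch_view(events, last_run_ts):
    """Totals as a batch job would have loaded them at its last scheduled run."""
    totals = defaultdict(float)
    for e in events:
        if e.ts <= last_run_ts:      # anything newer waits for the next run
            totals[e.account] += e.amount
    return dict(totals)


class StreamingView:
    """Totals updated incrementally, one event at a time."""

    def __init__(self):
        self.totals = defaultdict(float)

    def on_event(self, e):
        self.totals[e.account] += e.amount


events = [
    Event(ts=10, account="acme", amount=100.0),
    Event(ts=50, account="acme", amount=25.0),
    Event(ts=3700, account="acme", amount=-125.0),  # arrives after the hourly run
]

stale = batch_view(events, last_run_ts=3600)

live = StreamingView()
for e in events:
    live.on_event(e)

print(stale["acme"])        # 125.0, the refund at ts=3700 is not visible yet
print(live.totals["acme"])  # 0.0, reflects every event seen so far
```

An agent reading the batch view would act on a balance that no longer exists until the next scheduled run catches up.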

This distinction becomes critical when you consider what kinds of context AI agents actually need. Much of the current enterprise AI discussion focuses on retrieval-augmented generation (RAG), which handles semantic search over knowledge bases to find relevant documentation, policies or historical information. RAG works well for questions like "What's our refund policy?" where the answer exists in static documents.

But many enterprise use cases require what Falconer calls "structural context": precise, up-to-date information from multiple operational systems stitched together in real time. Consider a job recommendation agent that requires user profile data from the HR database, browsing behavior from the last hour, search queries from minutes ago and current open positions across multiple systems.

"The part that we're unlocking for businesses is the ability to essentially serve that structural context needed to deliver the freshest version," Falconer said.

The MCP connection problem: Stale data and fragmented context

The challenge isn't simply connecting AI to enterprise data. Model Context Protocol (MCP), introduced by Anthropic in late 2024, already standardized how agents access data sources. The problem is what happens after the connection is made.

In most enterprise architectures today, AI agents connect via MCP to data lakes or warehouses fed by batch ETL pipelines. This creates two critical failures: The data is stale, reflecting yesterday's reality rather than current events, and it's fragmented across multiple systems, requiring significant preprocessing before an agent can reason about it effectively.

The alternative, putting MCP servers directly in front of operational databases and APIs, creates different problems. Those endpoints weren't designed for agent consumption, which can lead to high token costs as agents process excessive raw data and multiple inference loops as they try to make sense of unstructured responses.

"Enterprises have the data, but it's often stale, fragmented or locked in formats that AI can't use effectively," Falconer explained. "The real-time context engine solves this by unifying data processing, reprocessing and serving, turning continuous data streams into live context for smarter, faster and more reliable AI decisions."

The technical architecture: Three layers for real-time agent context

Confluent's platform encompasses three components that work together or can be adopted separately.

The real-time context engine is the managed data infrastructure layer on Confluent Cloud. Connectors pull data into Kafka topics as events occur. Flink jobs process these streams into "derived datasets": materialized views joining historical and real-time signals. For customer support, this might combine account history, current session behavior and inventory status into one unified context object. The engine exposes this through a managed MCP server.
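To make the idea of a derived dataset concrete, here is a rough sketch in plain Python of the customer support case. It is not Confluent's API. ContextView, its method names and the sample data are all assumptions made for illustration, standing in for state that a Flink job would maintain and an MCP server would expose. Historical data is loaded once, live events update the state, and a single lookup returns one unified context object.

```python
class ContextView:
    """Hypothetical stand-in for a derived dataset keyed by customer."""

    def __init__(self, account_history):
        self.account_history = account_history  # customer_id -> past orders (historical)
        self.current_sku = {}                   # customer_id -> SKU being viewed (live)
        self.stock = {}                         # sku -> units on hand (live)

    def on_session_event(self, customer_id, sku):
        # A session event records what the customer is looking at right now.
        self.current_sku[customer_id] = sku

    def on_inventory_event(self, sku, units):
        # An inventory event overwrites the last known stock level.
        self.stock[sku] = units

    def get_context(self, customer_id):
        """Join historical and live signals into one object for an agent."""
        sku = self.current_sku.get(customer_id)
        return {
            "customer_id": customer_id,
            "past_orders": self.account_history.get(customer_id, []),
            "viewing": sku,
            "units_in_stock": self.stock.get(sku),
        }


view = ContextView(account_history={"c1": ["order-17"]})
view.on_inventory_event("sku-42", 3)
view.on_session_event("c1", "sku-42")
view.on_inventory_event("sku-42", 0)   # the item sells out while the session is open

context = view.get_context("c1")
print(context)
```

The agent receives one small, structured object instead of querying three systems and reconciling the answers itself, which is the token and inference cost the article describes.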

Streaming Agents is Confluent's proprietary framework for building AI agents that run natively on Flink. These agents monitor data streams and trigger automatically based on conditions; they don't wait for prompts. The framework includes simplified agent definitions, built-in observability and native Claude integration from Anthropic. It's available in open preview on Confluent's platform.

Flink Agents is the open-source framework developed with Alibaba Cloud, LinkedIn and Ververica. It brings event-driven agent capabilities directly to Apache Flink, allowing organizations to build streaming agents without committing to Confluent's managed platform. Those organizations handle operational complexity themselves but avoid vendor lock-in.
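The event-driven pattern both agent frameworks describe, an agent that triggers on a condition instead of waiting for a prompt, can be sketched in a few lines of plain Python. This is not the Streaming Agents or Flink Agents API. PaymentFailureAgent, its threshold and window parameters, and the event fields are hypothetical, and the sketch assumes events arrive in timestamp order. In a real deployment the action callback would likely hand the context to a model; here it only records the alert.

```python
from collections import defaultdict, deque


class PaymentFailureAgent:
    """Fires an action when a customer has `threshold` failures within `window_s` seconds."""

    def __init__(self, action, threshold=3, window_s=60):
        self.action = action
        self.threshold = threshold
        self.window_s = window_s
        self._failures = defaultdict(deque)   # customer_id -> failure timestamps

    def on_event(self, event):
        if event["type"] != "payment_failed":
            return
        window = self._failures[event["customer_id"]]
        window.append(event["ts"])
        # Drop failures that fell out of the time window.
        while window and event["ts"] - window[0] > self.window_s:
            window.popleft()
        if len(window) >= self.threshold:
            self.action(event["customer_id"], list(window))
            window.clear()   # avoid re-firing on the same failures


alerts = []
agent = PaymentFailureAgent(action=lambda cid, ts: alerts.append((cid, ts)))

stream = [
    {"type": "payment_failed", "customer_id": "c1", "ts": 0},
    {"type": "payment_ok", "customer_id": "c2", "ts": 5},
    {"type": "payment_failed", "customer_id": "c1", "ts": 20},
    {"type": "payment_failed", "customer_id": "c1", "ts": 45},
]
for event in stream:
    agent.on_event(event)

print(alerts)  # [('c1', [0, 20, 45])]
```

No one asked the agent anything. The third failure inside the window is what triggered it, which is the behavior a prompt-driven system cannot provide.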

Competition heats up for agent-ready data infrastructure

Confluent isn't alone in recognizing that AI agents need different data infrastructure.

The day before Confluent's announcement, rival Redpanda launched its own Agentic Data Plane, combining streaming, SQL and governance specifically for AI agents. Redpanda acquired Oxla's distributed SQL engine to give agents standard SQL endpoints for querying data in motion or at rest. The platform emphasizes MCP-aware connectivity, full observability of agent interactions and what it calls "agentic access control" with fine-grained, short-lived tokens.

The architectural approaches differ. Confluent emphasizes stream processing with Flink to create derived datasets optimized for agents. Redpanda emphasizes federated SQL querying across disparate sources. Both recognize that agents need real-time context with governance and observability.

Beyond direct streaming competitors, Databricks and Snowflake are fundamentally analytical platforms adding streaming capabilities. Their strength is complex queries over large datasets, with streaming as an enhancement. Confluent and Redpanda invert this: Streaming is the foundation, with analytical and AI workloads built on top of data in motion.

How streaming context works in practice

Among the users of Confluent's system is transportation vendor Busie. The company is building a modern operating system for charter bus companies that helps them manage quotes, trips, payments and drivers in real time.

"Data streaming is what makes that possible," Louis Bookoff, Busie co-founder and CEO, told VentureBeat. "Using Confluent, we move data instantly between different parts of our system instead of waiting for overnight updates or batch reports. That keeps everything in sync and helps us ship new features faster."

Bookoff noted that the same foundation is what will make gen AI valuable for his customers.

"In our case, every action like a quote sent or a driver assigned becomes an event that streams through the system immediately," Bookoff said. "That live feed of information is what will let our AI tools respond in real time with low latency rather than just summarize what already happened."

The challenge, however, is understanding context. When thousands of live events flow through the system every minute, AI models need relevant, accurate data without getting overwhelmed.

"If the data isn't grounded in what is happening in the real world, AI can easily make wrong assumptions and in turn take wrong actions," Bookoff said. "Stream processing solves that by continuously validating and reconciling live data against activity in Busie."

What this means for enterprise AI strategy

Streaming context architecture signals a fundamental shift in how AI agents consume enterprise data.

AI agents require continuous context that blends historical understanding with real-time awareness. They need to know what happened, what's happening and what might happen next, all at once.

For enterprises evaluating this approach, start by identifying use cases where data staleness breaks the agent. Fraud detection, anomaly investigation and real-time customer intervention fail with batch pipelines that refresh hourly or daily. If your agents need to act on events within seconds or minutes of them occurring, streaming context becomes necessary rather than optional.

"When you're building applications on top of foundation models, because they're inherently probabilistic, you use data and context to steer the model in a direction where you want to get some kind of outcome," Falconer said. "The better you can do that, the more reliable and better the outcome."
