The Key to Achieving Real-time AI: Optimizing Your Data Approach
Mukundha Madhavan is a seasoned technology leader and strategist, currently spearheading innovation as the APAC Tech Lead for DataStax. His expertise lies in real-time AI and generative AI, making him a pivotal figure in driving technological advancements and solutions in these domains. Previously at Google, he led APAC Partner Engineering.
Enterprises of all shapes and sizes are looking to take advantage of artificial intelligence (AI) and its numerous benefits. With India's digital economy predicted to be worth $1 trillion by 2025, the Ministry of Electronics and Information Technology (MeitY) expects a sizable impact from the adoption of AI and other emerging technologies. A PwC survey, meanwhile, finds that 54% of local manufacturing companies are already incorporating some form of AI into their business operations.
However, taking successful pilot projects and turning them into long-term production deployments is still challenging for everyone but the biggest names such as Amazon, Netflix, Uber, and FedEx. With their considerable investments in data, analytics, and AI strategies, these brands are seeing returns in the form of customer experience optimisations and real-time decision-making. Uber's Michelangelo machine learning (ML) platform is an example of how reliable ML capabilities enable companies to forecast demand quickly.
So how can the rest of the pack get up to speed?
Batch AI vs. Real-time AI
Knowing the right approach to use is essential for maximising the success of automation, machine learning, and other AI projects. There are two main options – batch and real-time. Batch processing applies patterns learned from historical data to new sets of data that arrive at regular intervals – hence the term "batch processing". Data collected over a period of time in this way is well suited to tasks like trend analysis and predictive analytics.
In contrast, real-time AI focuses on responding quickly to certain transactions and events with the help of AI/ML models that are tailored directly to those measures, as opposed to deriving long-term trends from huge data sets. Examples of real-time AI use cases include personalisation, pricing decisions, and cyber security.
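The contrast can be sketched in a few lines of Python. The names and the toy model here are purely illustrative (not any specific framework's API): batch scoring runs over accumulated history on a schedule, while real-time scoring decides on each event as it arrives.

```python
# Batch AI: score an accumulated dataset on a schedule,
# deriving insights from historical data.
def batch_score(records, model):
    """Run periodically (e.g. nightly) over all accumulated records."""
    return [model(r) for r in records]

# Real-time AI: score each transaction or event as it arrives,
# so the decision (offer, price, fraud flag) happens in-flight.
def realtime_score(event, model):
    """Called once per event, with a latency budget of milliseconds."""
    return model(event)

# Illustrative stand-in for a trained model:
# flag transactions above a threshold amount.
model = lambda txn: txn["amount"] > 10_000

history = [{"amount": 500}, {"amount": 25_000}]
print(batch_score(history, model))                 # periodic report
print(realtime_score({"amount": 25_000}, model))   # instant decision
```

The code paths differ less than the operational constraints do: the batch path can afford minutes or hours, while the real-time path cannot.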
With radical differences between the batch and real-time approaches, carefully considering which AI architecture best suits a project's objectives is vital for success.
Realizing Real-Time AI Impact
To avoid data management issues, it is important for an organisation to select a suitable approach to its data. A feature store, which stores and processes the data fed into an ML model (“features”), is essential in the AI process. These data sets are prepared for analysis through transformations such as scaling values or comparing them with prior records. Such transformations are often computed in scheduled batch jobs, which introduces a time lag between when data arrives and when features are ready for a model to use.
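As a rough illustration of the two transformations mentioned above – scaling a raw value and comparing an event with prior records – here is a plain-Python sketch (not any particular feature-store library's API):

```python
def min_max_scale(value, lo, hi):
    """Rescale a raw value into [0, 1] so it is model-ready."""
    return (value - lo) / (hi - lo)

def deviation_from_history(value, prior_values):
    """How far does this event sit from the entity's own history?"""
    mean = sum(prior_values) / len(prior_values)
    return value - mean

# e.g. a transaction amount compared with a customer's recent spend
prior = [120.0, 80.0, 100.0]
print(min_max_scale(100.0, 0.0, 200.0))      # scaled feature: 0.5
print(deviation_from_history(340.0, prior))  # 240.0 above their mean
```

When these computations run in periodic jobs over large tables, the resulting features are necessarily stale by the length of the job interval – acceptable for batch AI, but not for real-time use cases.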
Although acceptable with a batch AI approach, this time lag can result in slower responses which are disastrous for those using real-time AI. For example, customers may become frustrated if interactions that normally take seconds stretch into minutes. Banks and other financial institutions have seconds to identify fraudulent transactions and any delay is a potential loss.
Additionally, too much aggregation and data transformation make it difficult to identify relevant actions quickly. Applying many different systems to the same data can lead to the wrong recommendations being sent out at inappropriate times.
The Feature Store Is the Key to Real-Time AI
One way these issues can be addressed is by having the feature store develop a better understanding of the relationships between actions and items. With this, the feature store can be used on behaviours that change over time and prompt users when certain actions can be taken.
Examples include determining the appropriate part in a user journey to make an offer or recommending the optimal response an employee can make under a certain set of circumstances. With access to in-depth event data, feature store models can better understand a situation's potential outcomes and make the best recommendation to achieve the desired one. Even better, these models can utilise operational data so users can carry out specific actions when needed the most.
For this to work, the feature store should be part of a wider data pipeline in which the data architecture does not hamper access or scaling. Scalable open-source database management systems such as Apache Cassandra – or managed services built on it, like DataStax Astra DB – can enable a feature store that handles millions of features while supporting data streaming to share data with external sources.
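At serving time, a feature store's job reduces to a low-latency lookup of precomputed features by entity key. The following is a minimal in-memory sketch of that read/write path; in a real deployment the dict would be replaced by a database such as Cassandra, partitioned by entity id, and writes would come from a streaming pipeline rather than direct calls:

```python
class FeatureStore:
    """Toy feature store: upsert features per entity, point-read at inference."""

    def __init__(self):
        # entity_id -> {feature_name: value}; a database table in production
        self._rows = {}

    def write(self, entity_id, features):
        """Upsert features, typically fed by a streaming pipeline."""
        self._rows.setdefault(entity_id, {}).update(features)

    def read(self, entity_id, names):
        """Point lookup used at inference time; must stay fast at scale."""
        row = self._rows.get(entity_id, {})
        return [row.get(n) for n in names]

store = FeatureStore()
store.write("cust-42", {"avg_spend_7d": 180.0, "txn_count_24h": 3})
print(store.read("cust-42", ["avg_spend_7d", "txn_count_24h"]))
```

The design choice that matters is the key structure: partitioning by entity id keeps each inference-time read a single cheap lookup, which is what lets the same store scale to millions of features.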
This approach makes it easier for data scientists to test their models, iterate on improvements faster, and then scale a model up to production deployment levels. It also minimises the potential for data leakage between training and production, which could adversely affect model accuracy at scale.
For AI applications to be successful, massive amounts of event data must be used to build models that can be deployed into applications. These apps make the difference in improving services, customer experience, and operations, as they enable constant evolution whenever new data comes in.