Leveraging Generative AI for Predictions: A Promising Frontier
Can Generative AI Predict from Structured Data?
TL;DR Answer: Yes, but it depends heavily on the use case.
As a data scientist, I have always been frustrated by the sheer time and effort required to build and deploy ML models for traditional structured and semi-structured prediction problems like churn or demand forecasting. While solutions like AutoML have emerged in recent years, they still pose challenges: a lot of domain, problem, feature, and data context must be provided, and extensive exploratory data analysis (EDA) and feature engineering are still necessary.
Think about decisions being made in an organization every day:
- Multiple Predictions: Modeling all of an organization's decisions means building and continuously monitoring numerous prediction models, and new prediction needs emerge as the organization matures.
- Speed Over Accuracy: Not all models need to be highly accurate; they need to enable faster decisions. For instance, a marketing manager may only need a directional read on campaign performance as a starting point.
- Unstructured and Evolving Data: Much of the data required for decision-making is unstructured and constantly evolving, which limits the effectiveness of analytical data marts. Many situational criteria simply cannot be modeled through optimization algorithms.
Now, how can large language models (LLMs) help?
The true potential of Generative AI and LLMs lies in enabling what we can refer to as "micro-decisions": predictions, scenario modeling, and efficient day-to-day decision-making that draw on large amounts of structured and unstructured data. From my research and experience, there are three application areas:
- Make or augment predictions using a combination of structured and unstructured data (which is the primary focus of the rest of this article).
- Create synthetic data or scenarios of events and their potential impact that can be used to augment predictions (e.g., generating records for under-represented consumer segments to address class imbalance in prospect marketing; see the sketch after this list).
- Use reasoning capabilities to augment decision-making, e.g., recommending a course of action for an underperforming marketing campaign.
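As a concrete illustration of the second area, here is a minimal sketch of LLM-driven synthetic data generation for a class-imbalance problem. It assumes the OpenAI Python client; the model name, prompt, and record schema are illustrative assumptions, so treat this as a pattern rather than a production recipe.

```python
# Minimal sketch: prompt an LLM to synthesize minority-class records.
# Assumes the `openai` Python client and an OPENAI_API_KEY in the environment;
# the model name, schema, and prompt are illustrative, not prescriptive.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def synthesize_minority_rows(examples: list[dict], n_rows: int = 20) -> list[dict]:
    """Generate synthetic rows that mimic the minority class."""
    prompt = (
        "You are generating synthetic tabular data for prospect marketing.\n"
        f"Here are real minority-class examples:\n{json.dumps(examples, indent=2)}\n"
        f"Return a JSON array of {n_rows} new, plausible records with the same "
        "fields and realistic value distributions. Return only JSON."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)

# Usage: seed with a handful of real minority-class rows, then oversample.
seed = [{"age": 34, "segment": "new_parent", "responded": 1}]
synthetic = synthesize_minority_rows(seed, n_rows=10)
```

In practice you would validate the generated rows against the real feature distributions before mixing them into training data.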
Let's take the example of time series modeling (classification, anomaly detection, forecasting, etc.). The applicability of domain- and problem-agnostic solutions is immense here, from inventory-position prediction in supply chain to price-performance monitoring for category managers. Though it may sound complex, significant work is already underway. Broadly, there are three methodologies:
- LLMs or LMMs as Enablers: This includes the automation and augmentation of synthetic data generation, data engineering, EDA, model selection, hyperparameter tuning, ensembling, and deployment. A lot of work has already been done on automating engineering and data-generation activities, but there is untapped potential in LLM-driven feature engineering and on-the-fly modeling. An agentic framework that brings these components together could automate and support micro-decisions, and there is considerable industry activity around multi-agents (LangGraph: Multi-Agent Workflows). A minimal sketch of such a pipeline follows this list.
- LLMs as Forecasters (LLM-Centric Models): Although LLMs are fundamentally text models, they are transformers at their core, and the attention mechanism is flexible enough to be re-engineered for numeric prediction. This can be done using concepts like custom time patching, custom tokenization, and text projection of time series, along with optional fine-tuning. Examples include LLMTime and Time-LLM; the serialization sketch after this list illustrates the LLMTime-style idea. A good summary of this approach: Position Paper: What Can Large Language Models Tell Us about Time Series Analysis.
- Foundational Models for Time Series: Just like LLMs, we can build Large Time-Series Models (LTMs), foundation models pre-trained on datasets from many domains. Examples include TimeGPT and Lag-Llama. These models support zero-shot inference with accuracy comparable to traditional ML and statistical methods for time series; a zero-shot forecasting sketch also appears below. For more details: TimeGPT Foundational Model and Lag-Llama.
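For the first methodology, here is a minimal sketch of an agentic pipeline in plain Python. Every agent role and function below is a hypothetical stand-in for what a framework like LangGraph would orchestrate with proper state management, routing, and retries; none of it is LangGraph's actual API.

```python
# Minimal sketch of an "LLMs as enablers" agentic pipeline.
# Every agent below is a hypothetical stand-in; a framework such as LangGraph
# would manage state, routing, and retries in a real implementation.
from dataclasses import dataclass, field

@dataclass
class PipelineState:
    raw_data: object
    features: object = None
    model: object = None
    report: dict = field(default_factory=dict)

def eda_agent(state: PipelineState) -> PipelineState:
    # An LLM would summarize distributions, nulls, and candidate targets here.
    state.report["eda"] = "profile of raw_data (LLM-generated summary)"
    return state

def feature_agent(state: PipelineState) -> PipelineState:
    # An LLM proposes and codes candidate features from the EDA summary.
    state.features = state.raw_data  # placeholder transformation
    return state

def modeling_agent(state: PipelineState) -> PipelineState:
    # An LLM selects a model family and hyperparameters, then trains it.
    state.model = "fitted-model-handle"
    return state

def run_pipeline(raw_data) -> PipelineState:
    state = PipelineState(raw_data=raw_data)
    for agent in (eda_agent, feature_agent, modeling_agent):
        state = agent(state)
    return state
```

The linear loop is deliberately simple; the value of a real agentic framework is in conditional routing, e.g., sending the pipeline back to feature engineering when validation metrics disappoint.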
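For the second methodology, the core LLMTime idea is to serialize a numeric series as text, let the LLM continue the string, and parse the continuation back into numbers. Here is a minimal, self-contained sketch of that encode/decode step; the scaling and formatting choices are illustrative, and the LLMTime paper discusses tokenizer-aware digit encoding in far more detail.

```python
# Minimal sketch of LLMTime-style serialization: numbers -> text -> numbers.
# Formatting choices here are illustrative; see the LLMTime paper for
# tokenizer-aware encoding details.

def encode_series(values: list[float], decimals: int = 1) -> str:
    """Serialize a series as comma-separated fixed-point strings."""
    return ", ".join(f"{v:.{decimals}f}" for v in values)

def decode_series(text: str) -> list[float]:
    """Parse the LLM's textual continuation back into floats."""
    out = []
    for token in text.split(","):
        token = token.strip()
        try:
            out.append(float(token))
        except ValueError:
            break  # stop at the first non-numeric token
    return out

history = [112.0, 118.0, 132.0, 129.0, 121.0]
prompt = (
    "Continue this sequence with 3 more comma-separated values:\n"
    + encode_series(history)
)
# An LLM call would go here; suppose it returns the string below.
llm_output = "135.0, 148.0, 148.0"
forecast = decode_series(llm_output)  # -> [135.0, 148.0, 148.0]
```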
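For the third methodology, here is a zero-shot forecasting sketch using Nixtla's TimeGPT client. It assumes the `nixtla` package and a valid API key, and uses Nixtla's `ds`/`y` column convention; check Nixtla's documentation for the current client interface before relying on it.

```python
# Zero-shot forecasting sketch with TimeGPT via Nixtla's client.
# Assumes `pip install nixtla pandas` and a valid NIXTLA_API_KEY;
# check Nixtla's docs for the current client interface.
import os
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key=os.environ["NIXTLA_API_KEY"])

# Toy daily series in Nixtla's expected ds/y layout.
df = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=60, freq="D"),
    "y": range(60),
})

# No training step: the pre-trained foundation model forecasts directly.
forecast = client.forecast(df=df, h=14, time_col="ds", target_col="y")
print(forecast.head())
```

The point of the sketch is the absence of a training loop: the foundation model produces a 14-step forecast directly from the raw history, which is what makes these models attractive for fast micro-decisions.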
Combining these approaches with Generative AI's ability to model scenarios through synthetic data generation can create robust, on-the-go decision-enablement solutions for businesses.
These approaches enable fast experimentation, meaning they can support quick decision-making in an organization without going through the traditional model-development route. Moreover, they can leverage a lot of external and live data to enhance overall model performance.
Having said that, these are underexplored areas, and there will still be complex problems that require the traditional way of solving an ML problem. Also, since these methods are only just being implemented, they may have limitations that will become clearer when tested in real-life scenarios. Despite this, the potential for using Generative AI to enhance decision-making is massive.
P.S. Have you been experimenting with any of these areas? Share your experiences below; I would also love to connect and understand your learnings.
#GenAI #PredictiveAnalytics #MachineLearning #DataScience #ArtificialIntelligence #timeseries