Democratizing Marketing Mix Models (MMM) with Open Source and Gen AI

A practical system design combining open-source Bayesian MMM and GenAI for transparent, vendor-independent marketing analytics insights.

Image sourced from Freepik

Marketing Mix Models (MMM) have been in the industry for several years and recently they have experienced a renaissance. With digitally tracked signals being deprecated for increasing data privacy restrictions, Marketers are turning back to MMMs for strategic, reliable, privacy-safe measurement and attribution framework.

Unlike user-level tracking tools, MMM uses aggregated time-series and cross-sectional data to estimate how marketing channels drive business KPIs. Advances in Bayesian modeling with enhanced computing power has pushed MMM back in the center of marketing analytics.

For years, advertisers and media agencies have used and relied on Bayesian MMM for understanding marketing channel contributions and marketing budget allocation.

Role of GenAI in Modern MMM:

Increasing number of companies are now utilizing GenAI features as an enhancement to MMM in several ways.

1. Data Preparation and Feature Engineering

2. Pipeline Automation: Generating code for MMM pipeline

3. Insight Explanation — translate model insights into plain business language

4. Scenario planning and budget optimization

While these capabilities are powerful, they rely on proprietary MMM engines.

The purpose of this article is not to showcase how Bayesian MMM works but to demonstrate a potential open-source and free system design that marketers can explore without the need of subscribing to black box MMM stack that vendors in the industry provide.

The approach combines:

1. Google Meridian as the open-source Bayesian MMM engine

2. Open-source Large Language Model (LLMs) — ‘Mistral 7B’ as an insight and interaction layer on top of Meridian’s Bayesian inference output.

Here is a architecture diagram that represents proposed open-source system design for marketers.

Open-Source Stack workflow: Bayesian MMM + Gen AI

This architecture diagram was created using Gen-AI assisted design tools for rapid prototyping

This open-source workflow has several benefits:

1. Democratization of Bayesian MMM — Eliminates black box problem of proprietary MMM tools.

2. Cost Efficiency: Reduces financial barrier for small/medium businesses to access advanced analytics.

3. This seperation preserves statistcal rigor required from MMM engines and makes it just more accessible.

4. With GenAI insights layer, audience do not need to understand the Bayesian Math, instead they can just interact using GenAI prompts to learn about model insights on channel contribution, ROI, and possible budget allocation strategies.

5. Adaptability to newer open-source tools: GenAI layer can be replaced with newer LLMs as and when they are openly available to get enhanced insights.

Hands-on example of implementing Google Meridian MMM model with a LLM layer.

For the purpose of this showcase, I have used open-source model ‘Mistral 7B LLM’ which is an open-source LLM sourced locally from ‘Hugging Face’ platform hosted by ‘Llama’ engine.

This framework is supposed to be domain agnostic, i.e., any alternative open-source MMM models such Meta’s Robyn, PyMC etc. and LLM versions for GPT, Llama models can be used depending on the scale and scope of insights desired.

Important Note:

1. A synthetic marketing dataset was created having KPI as ‘Conversions’ and marketing channels as TV, Search, Paid Social, Email and OOH (Out-of-Home media).

2. Google Meridian produces rich output such as ROI, channel coefficients and contributions in driving KPI, response curves etc. While these output are statistically sound, they often require specialized expertise to interpret. This is where an LLM become valuable and can be used as an ‘insight translator’.

3. Google Meridian python code examples were used to run Meridian MMM model on synthetic marketing data created. For more information on how to run Meridian code, please refer: https://developers.google.com/meridian/notebook/meridian-getting-started

4. An open-source LLM model ‘Mistral 7B’ was utilized due to its compatability with the free tier of Google Colab GPU resources and also adequate model for generating instruction-based insights without relying on any API access requirements.

Example: Below snippet of Python code was executed in Google Collab platform:

A synthetic marketing dataset (not shown in this code) was created and as part of Meridian workflow requirement an input data builder instance is created as show below:

Configure and execute the Meridian MMM model:

# Initializing the Meridian class by passing loaded data and customized model specification. One advantage of using Meridian MMM is the ability to set modeling priors for each channel which gives modelers ability to set channel distribution as per historical knowledge of media behavior.

This code snippet runs the meridian model with defined priors for each channel on the input dataset generated. Next step is to assess model performance. While there are model output parameters such as R-squared, MAPE, P-Values etc. that can be assesed, for the purpose of this article I am just including visual assesment example:

Now that the Meridian MMM model has been executed, we have model output parameters for each media channels such as ROI, response curves, model coefficients, spend levels etc. We can bring all this information into a single input JSON object that can be used directly as an input to LLM to generate insights:

Downloading Mistral 7B LLM from Hugging Face platform locally and installing the required Llama engine to execute the LLM

Executing the Mistral LLM using the input JSON having Meridian MMM output and including the appropriate instructional prompt.

Example Output:

1. Based on the Marketing Mix Modeling (MMM) output, TV, OOH, and Search Ads have a positive ROI (3.0, 2.5, and 1.5 respectively), while Social Media and Email have a lower ROI (0.9 and 0.7 respectively).

2. The key drivers of ROI for each channel in simple business terms are:

– TV: Every dollar spent on TV ads generates $3 in sales.

– Social Media: Every dollar spent on Social Media ads generates $0.9 in sales.

– Search Ads: Every dollar spent on Search Ads generates $2.5 in sales.

– Email: Every dollar spent on Email marketing generates $0.7 in sales.

– OOH: Every dollar spent on Out of Home (OOH) ads generates $1 .5 in sales.

3. Channels with diminishing returns are Social Media and Email as their response curves show that the sales growth is not proportional to the increasing marketing spend.

4. To maximize total ROI with the current budget, we should consider reallocating spend from Social Media and Email towards TV and OOH as they have a higher ROI.

Practical Considerations:

1. Model quality and insights is still dependent on input data quality.

2. Prompt design is critical to avoid misleading insights

3. Automation for input data processing and model output reporting and visualization will help this stack to operate in scale.

Final thoughts:

This walkthrough illustrates how a potential open-source based Bayesian MMM augmented with Gen AI workflow can translate complex Bayesian results into an actionable insights for marketers and leaders.

This approach does not attempt to simplify the math behing Marketing Mix Models, instead it preserves it and makes an attempt to make it more accessable for broader audience with limited model knowledge and budget resources for their organization.

As privacy-safe marketing analytics becomes a norm, open-source MMM systems with Gen AI augmentation offer a sustainable path: transparent, adaptable and designed to evolve with both business and underlying technology.

Resources & References:

1. Google Meridian: https://developers.google.com/meridian/notebook/meridian-getting-started

2. Mistral 7B LLM: Open-source instruction tuned large language model from Hugging Face platform. https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF

3. Hugging Face: Repository and hosting for open-source AI models. https://huggingface.co/

4. Bayesian Marketing Mix Modeling: General methodology and industry application.

5. Google Colab: Free GPU environment for prototyping.

6. All code snippets and outputs shown in this article are illustrative and intended for educational purpose only.

7. This article reflects an independent exploration of open-source tools. No commercial endorsements are implied.


Democratizing Marketing Mix Models (MMM) with Open Source and Gen AI was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Liked Liked