From Fraud to Foresight: The Double-Edged Sword of Synthetic Data in Market Research

 

When I wrote this article on quality control in neuromarketing research back in 2018, I never imagined we’d still be here.

That article responded to a scandal in Europe, where survey data was found to be manipulated and invented outright by agencies tasked with reporting real consumer behavior. It was a wake-up call—proof that not all market research was what it seemed.

Fast-forward to 2025. The U.S. Department of Justice just announced a massive fraud case in which millions of dollars were allegedly paid for fake market research data. Once again, the trust that underpins our industry has been shaken.

And yet, something has changed: the tools of deception have evolved.


How I Used AI to Simulate a Study on “Freshness”

At this year’s AChemS meeting, I gave a talk on AI in sensory research. One of my demonstrations was intentionally provocative: I asked ChatGPT to simulate a consumer research study using MaxDiff and Implicit Association Testing (IAT) to explore the perception of "freshness" in home fragrance products. This is a question I’ve heard for many years: how can we measure a concept like “freshness” when everyone may interpret it differently? Most traditional qual and quant tools do not differentiate this idea very well, so the question is ripe for new approaches. But what is the best way?

Normally, for this type of research question, I suggest a combination of behavioral qual (using behavioral science frameworks to more deeply explore the emotional, perceptual, and behavioral drivers behind concepts) and quant approaches like MaxDiff and implicit association testing. This just makes sense to me: you get a behavioral framework of the language consumers use to describe their choice architecture, you weight those factors through MaxDiff, and then you compare stimuli or brands for perceptual associations. You end up with a consumer decision framework: a hierarchy of consumer expectations (MaxDiff) and what they actually experience (implicit association).

But how can this be done better? What is missing? I thought, why not run it through an LLM, play it out for different scenarios, like in foresight planning?

Let me be clear: the goal was not to fake data but to stress-test a study design. By simulating how people might respond, I wanted to explore gaps, assumptions, and how well our methods differentiated between products. Consider it a pilot study by proxy.

Here’s what I did:


Step 1: Defining the Concept

Before jumping into data generation, I started with an important foundational step: What do we even mean by “freshness”? Consider this the “qual” portion of the study.

This is a deceptively tricky concept in consumer research—especially in categories like home fragrance, where “fresh” can mean:

  • Clean or soapy?

  • Citrus or cool?

  • Airy, natural, sharp, energizing, or all of the above?

To clarify this, I created a working list of descriptive attributes commonly associated with “fresh” in home fragrance:

  • General Descriptors: Clean, Crisp, Bright, Light, Pure, Clear, Soft, Airy, Cool, Gentle, Simple, Subtle, Smooth

  • Nature-Related / Environmental: Green, Herbal, Grassy, Leafy, Dewy, Watery, Marine, Ocean breeze, Rainy/Rain-fresh, Mountain air, Outdoor, Forest, Floral (lavender, lily of the valley, jasmine), Citrus (lemony, zesty, lime, orange peel, grapefruit)

  • Energy / Mood-Related: Refreshing, Invigorating, Energizing, Awakening, Uplifting, Calming, Soothing, Breezy, Balanced

  • Cleanliness / Hygiene-Related: Just-washed, Shower-fresh, Soapy, Laundry-fresh, Powdery, Sparkling clean, Sanitized

  • Temperature-Related: Cool, Chilled, Icy, Minty (e.g., peppermint, eucalyptus)

  • Phrases Consumers Might Use: “Smells like fresh air”, “Just out of the shower”, “Smells clean and light”, “Reminds me of being outside”, “Crisp and cool”, “Not too heavy”, “Not sweet or overpowering”, “Like clean laundry drying outside”, “Zesty and bright”, “So fresh and so clean”

This list became the foundation for the MaxDiff and IAT simulations that followed. Already, just going through this exercise prompted questions: Are we over-relying on visual metaphors (like “light”)? Are sensory experiences like “soapy” or “powdery” universally understood? Do these terms carry consistent emotional valence?

By interrogating the construct of “freshness” with AI before jumping into method, I was already able to shape a more focused and informative study design. I pushed ChatGPT to categorize these by descriptive and emotional factors, habit loops, etc., looking for behavioral patterns.

Step 2: MaxDiff Prompt – Simulating Attribute Importance

I then asked ChatGPT to turn these words into MaxDiff-style statements, and then prompted:

"Simulate MaxDiff results for consumer preferences on what attributes best define the concept of 'fresh' in home fragrance. And then visualize it as a bar chart and convert to a preference share ranking table."

The model returned this:

Simulated MaxDiff Home Fragrance Freshness data from ChatGPT

Top drivers of “fresh” in home fragrance were cleanliness, fresh air, and laundry-like cues — consistent with common cultural associations of the concept of “fresh” for home care. Zesty citrus, green/herbal, and watery notes were somewhat valued, but not core. Nature-oriented or more complex emotional cues (energizing, soothing, calming) ranked lower, suggesting that for home fragrance “freshness” is functionally tied to cleanliness and airflow more than to nature or mood.

This all makes sense. Not bad, ChatGPT!
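For readers who want to see what such a simulation looks like under the hood, here is a minimal Python sketch of a MaxDiff-style data generator. The attribute list, mean utilities, and scoring are all invented for illustration (loosely mirroring the ranking described above) and are deliberately simplified; this is not the prompt-based approach from the talk, just the same idea made explicit in code:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "fresh" attributes with assumed mean utilities
# (illustrative numbers only, not real consumer data)
attributes = {
    "Clean":          1.6,
    "Fresh air":      1.4,
    "Laundry-fresh":  1.2,
    "Zesty citrus":   0.6,
    "Green/herbal":   0.3,
    "Watery/marine":  0.2,
    "Energizing":    -0.2,
    "Calming":       -0.4,
}
names = list(attributes)
mu = np.array([attributes[a] for a in names])

n_respondents, n_tasks, set_size = 500, 8, 4
best_counts = np.zeros(len(names))
worst_counts = np.zeros(len(names))

for _ in range(n_respondents):
    # Each simulated respondent gets noisy individual utilities
    u = mu + rng.normal(0, 0.5, size=len(names))
    for _ in range(n_tasks):
        # Show a random subset of attributes; pick "best" and "worst"
        shown = rng.choice(len(names), size=set_size, replace=False)
        best_counts[shown[np.argmax(u[shown])]] += 1
        worst_counts[shown[np.argmin(u[shown])]] += 1

# Simple best-worst score, rescaled into preference shares
scores = best_counts - worst_counts
shares = np.exp(scores / scores.std())
shares = shares / shares.sum() * 100

for name, share in sorted(zip(names, shares), key=lambda x: -x[1]):
    print(f"{name:15s} {share:5.1f}%")
```

A real analysis would fit a multinomial logit or hierarchical Bayes model to the choice data rather than rescaling raw best-worst counts, but even this toy version makes the structure of a MaxDiff task concrete.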

Step 3: IAT Prompt – Simulating Implicit Perceptions by Brand

Next, I prompted ChatGPT to create a list of descriptive and emotional attributes (adjectives) based on (and connected to) each of the MaxDiff statements.

ChatGPT’s generation of MaxDiff statements and descriptive adjectives for “fresh”, as presented at AChemS 2025.


And then asked it to design an IAT study:

"Generate go/no-go IAT results (reaction times) comparing two brands—Febreze vs. Glade—across these attributes. Present as average reaction times per attribute per brand."

ChatGPT IAT study design based on my description, as presented at AChemS 2025.

And with some additional prompting to steer it, I got IAT results:

Synthetic ChatGPT generated results for implicit study of “freshness”, as presented at AChemS 2025.

These visuals helped me see which attributes were both strongly and consistently associated with each brand—and which might require more probing in a real study.
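To make the mechanics of the IAT simulation concrete, here is a small Python sketch of how synthetic go/no-go reaction times might be generated: attributes a brand is more strongly associated with get faster (lower) average reaction times, plus trial-level noise. The brands are real, but the association strengths and timing parameters below are invented for illustration:

```python
import random
import statistics

random.seed(7)

# Assumed association strengths (0-1) for two illustrative brands;
# these numbers are invented for the sketch, not real measurements.
associations = {
    "Febreze": {"Clean": 0.9, "Airy": 0.8, "Soapy": 0.7, "Calming": 0.4},
    "Glade":   {"Clean": 0.7, "Airy": 0.6, "Soapy": 0.5, "Calming": 0.6},
}

BASE_RT_MS = 750      # baseline reaction time for a weak association
MAX_SPEEDUP_MS = 250  # strongest associations respond ~250 ms faster
N_TRIALS = 200        # simulated go trials per brand-attribute pair

def simulate_rt(strength: float) -> float:
    """One simulated go-trial reaction time: stronger association -> faster."""
    mean_rt = BASE_RT_MS - MAX_SPEEDUP_MS * strength
    return max(250.0, random.gauss(mean_rt, 60.0))  # floor at a plausible minimum

# Average reaction time per attribute per brand, as in the prompt above
avg_rt = {
    brand: {
        attr: statistics.mean(simulate_rt(s) for _ in range(N_TRIALS))
        for attr, s in attrs.items()
    }
    for brand, attrs in associations.items()
}

for brand, attrs in avg_rt.items():
    for attr, rt in attrs.items():
        print(f"{brand:8s} {attr:8s} {rt:6.1f} ms")
```

The key property a simulator like this must reproduce is the IAT logic itself: stronger implicit associations show up as systematically faster responses, not merely as different ratings.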

Now, there are a few things you would need to do to make this more realistic synthetic data; I did this only as a quick example for a talk. If you asked ChatGPT to generate this data again, you would get different results each time. And if you asked it to do so 100 times (a batch simulation), you'd get quite a nice range of data you could run real analysis on. You could create an agent or script to repeat this 100, 500, or more times. Or, instead of all that, you could ask it to generate the output as a PPT slide with a data export for the graph, and use that as the data to analyze.
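As a sketch of what such a batch script could look like, the loop below runs a stand-in for the LLM call 200 times and summarizes the spread of the resulting preference shares. The `run_simulation` stub and all its numbers are invented; in a real script you would replace the stub with an actual ChatGPT API call and response parsing:

```python
import random
import statistics

random.seed(0)

ATTRIBUTES = ["Clean", "Fresh air", "Laundry-fresh", "Zesty citrus", "Calming"]

def run_simulation() -> dict:
    """Stand-in for one LLM call that returns simulated preference shares.
    A real batch script would call the ChatGPT API here and parse its
    output; this stub just draws noisy shares around assumed means."""
    means = [30, 25, 20, 15, 10]  # illustrative values, not real data
    raw = [max(0.1, random.gauss(m, 4)) for m in means]
    total = sum(raw)
    return {a: 100 * r / total for a, r in zip(ATTRIBUTES, raw)}

# Run the "study" many times and collect a distribution per attribute
runs = [run_simulation() for _ in range(200)]
summary = {
    a: (statistics.mean(r[a] for r in runs), statistics.stdev(r[a] for r in runs))
    for a in ATTRIBUTES
}

for attr, (mean, sd) in summary.items():
    print(f"{attr:14s} {mean:5.1f}% (sd {sd:4.1f})")
```

The point of the batch is the distribution, not any single run: run-to-run variability is exactly what a one-shot prompt hides, and what you would want to characterize before trusting any pattern in synthetic output.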

But you do have to keep in mind that IT IS NOT REAL.

If you are going to do this, make sure you are not using it to replace real data. This is purely an exercise to troubleshoot, pilot, and explore concepts. And follow some best practices:

  1. Be explicit about the persona and context—otherwise, synthetic data can be too generic.

  2. Add variability: e.g., “You’re 1 of 500 unique consumers with slightly different preferences.”

  3. Use for testing logic and design, not drawing conclusions.

  4. Always disclose that it’s synthetic data.

Step 4: Identifying Gaps and Optimizing Study Design

After running the simulated MaxDiff and IAT results, I took a step back and asked: What isn’t this telling me?

This is where synthetic data becomes more than just playful exploration—it becomes a design critique tool.

I prompted ChatGPT to:

“Help me identify gaps in this study design. Are there any assumptions I’m making? Are there alternative methods I should consider?”

This step revealed several valuable insights from ChatGPT:

🧩 Does This Approach Help?

Implicit association testing (IAT) plus MaxDiff already adds a layer of differentiation beyond just asking people directly.

  • MaxDiff gives you stated importance — what people say they want.

  • IAT gives you perceived fit — what they actually associate with the product experience at a non-deliberative level.

When the two don't align, that gap tells a powerful story about whether your product delivers on the need space it’s targeting.

However, the words themselves (like "clean," "fresh," "bright," "breezy") still risk being too general unless we anchor them better.

🚩 Where This Method May Still Struggle to Differentiate:

  • Many of the words like “clean,” “fresh,” “airy,” “light” are category generic.

  • If all stimuli score similarly on these words, the test loses discriminatory power.

  • Consumers might apply the same labels to both citrusy-fresh and powdery-soft fresh, even though these are different sensory profiles.

🛠️ How to Improve Differentiation:

1️⃣ Richer, More Specific Language Sets:

  • Add more precise sensory words:

    • “Soapy,” “ozonic,” “chlorine-clean,” “lemony,” “minty-cool,” “soft cotton,” “dryer sheet,” “dewy grass,” “seaside breeze,” “earthy green,” “airy floral.”

  • Use product-type anchors:

    • “Smells like clean laundry drying outside” vs. “Smells like opening a window after rain” vs. “Smells like a hotel bathroom.”

  • Try negative contrasts:

    • “Heavy,” “dense,” “sweet,” “musky,” “overpowering,” “chemical.”

➡️ Pair opposites in IATs or MaxDiff (e.g., airy vs. heavy or bright vs. musky) for stronger signal.

2️⃣ Scenario-Based Framing Instead of Abstract Words:

  • Rather than single adjectives, use contextual statements:

    • “This fragrance makes me feel like I’ve just opened the windows on a spring morning.”

    • “Reminds me of clean clothes just out of the dryer.”

    • “Feels like standing by the ocean breeze.”

These trigger mental imagery and memory-based responses that are more discriminating.

3️⃣ Incorporate Multisensory Cross-Modal Cues:

  • Ask about temperature perceptions (cool vs. warm fresh).

  • Explore texture metaphors (crisp like a snap, soft like cotton).

  • Use color associations (bright white, pale blue, soft yellow) alongside word associations.

This works because fragrance perception is cross-modal — color, texture, and temperature all shape expectations.

4️⃣ Behavioral Choice Tasks Instead of Just Words:

  • Add forced-choice scenarios:

    • “If you were choosing a spray to energize your morning cleaning routine, which fits better?”

    • “If you wanted a calm, relaxing scent for evening, which feels right?”

Behavioral measures (which one they pick for a scenario) can differentiate functional vs. emotional needs.

ChatGPT design recommendations based on our “Freshness” Pilot Test, as presented at AChemS 2025.


Why This Matters

This entire exercise took under an hour. It didn’t cost thousands of dollars or require a participant panel. It also didn’t produce real data.

And that’s where the danger lies.

If I wanted to pass this off as legitimate research, it would look convincing. With the right design and formatting, synthetic data like this can be hard to detect. In the wrong hands, it becomes fraud—not foresight.

As researchers, we now face a dual challenge:

  1. Harness AI’s power to explore, test, and refine study designs.

  2. Guard against its misuse in generating fraudulent results masked as real insight.


Building Ethical Boundaries for Synthetic Research

Here’s what I propose for our industry moving forward:

  • Declare when synthetic data is used—just like pre-registration or conflict-of-interest statements.

  • Use AI-generated data only for design validation, never as a substitute for empirical data.

  • Develop best-practice guidelines for how synthetic simulations can support—but not replace—consumer research.

Trust is the currency of science. Whether we’re using EEGs, surveys, or simulated data, that trust depends on transparency, ethics, and a shared commitment to truth.

Let’s keep that front and center.

 

Let’s Design Smarter, Together

This example was just a simulation—but it’s also a demonstration of how powerful synthetic tools can be when used with the right guardrails and scientific thinking. Whether you're exploring a fuzzy concept like “freshness,” vetting a new product idea, or trying to design a study that truly differentiates your brand, there are smarter, faster, and more ethical ways to do it.

At Nerdoscientist, I specialize in helping teams:

  • Stress-test and future-proof research plans before they go live

  • Apply behavioral and neuroscience methods to consumer understanding

  • Use tools like implicit testing, biometrics, and even AI simulations to uncover deeper insights

  • Develop strategic foresight for product development, emotional positioning, and sensory design

If you're facing tough research questions—or just want to make sure you're asking the right ones—let’s talk.


🧠 Smarter research starts before the first datapoint.
