Synthetic Consumers Are Not Fake Humans

11,000+

Synthetic Interviews

AI Models Tested

Countries

2,500

Real Households

Validation Stages

Methodology

Three stages, progressively harder

Stage 1 — Personality Expression (Brazil, Peru, US): A 25-item personality inventory administered to 2,700 synthetic panelists. Mean correlation between intended and expressed personality: r = 0.83. Removing personality specification collapsed correlation to zero.
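The Stage 1 metric can be illustrated with a minimal sketch. It assumes each synthetic panelist has an intended score and an expressed score per trait and that the two are compared with a Pearson correlation; the paper's exact scoring procedure may differ. The function name and the trait scores below are hypothetical illustrations, not study data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical example: intended vs. expressed scores on five traits (1-5 scale).
intended = [4.5, 2.0, 3.5, 1.5, 4.0]
expressed = [4.2, 2.4, 3.1, 2.0, 3.8]
r = pearson_r(intended, expressed)
print(f"r = {r:.2f}")
```

A value near 1 means the expressed profile tracks the intended one closely; a value near 0 corresponds to the collapse reported when the personality specification was removed.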

Stage 2 — Decision Anchoring (Brazil, Peru, US): 12 decision-making parameters — loss aversion, temporal discounting, exploration tendency — derived from a Nature-published dataset of 10.6 million real human decisions. Explicit decision parameters: r = 0.54. Personality alone: r = 0.04.

Stage 3 — Behavioral Curation (US): Real purchase data from the Dunnhumby "Complete Journey" dataset. 2,500 households, two years of grocery transactions, 39 behavioral attributes per household, 22 FMCG scenarios, five data configurations.

Key Findings

Three laws nobody expected

The right data depends on the right question

Personality profiles excel at identity-expressive tasks (r = 0.90). Decision parameters excel at deliberative choices (r = 0.64). Raw behavioral data excels at habitual purchases (r = 0.37). Each layer wins in its domain and fails outside it.
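One way to read this finding is as a routing rule: pick the data layer from the type of question being asked. The lookup below is only a sketch of that idea. The task-type keys and layer names are hypothetical labels, and the correlations are the ones quoted in the paragraph above.

```python
# Best-performing data layer per task type, with the correlations reported above.
BEST_LAYER = {
    "identity_expressive": ("personality_profile", 0.90),
    "deliberative_choice": ("decision_parameters", 0.64),
    "habitual_purchase": ("raw_behavioral_data", 0.37),
}

def choose_layer(task_type):
    """Return (layer, reported r) for a task type; the keys are hypothetical labels."""
    try:
        return BEST_LAYER[task_type]
    except KeyError:
        raise ValueError(f"unknown task type: {task_type!r}") from None

layer, reported_r = choose_layer("deliberative_choice")
print(layer, reported_r)
```

The function returns exactly one layer per task type, which keeps each question tied to the single layer that performed best for it.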

Mixing layers hurts

Adding personality profiles to raw behavioral data dropped accuracy by 20–105% across every model. The AI treats personality as a "character sheet" and role-plays accordingly — overriding the behavioral evidence.

Question design matters 25× more than model choice

Scenario framing accounts for ~76% of variance in prediction accuracy. Behavioral category: 15%. Model choice: only 3%. The industry conversation about which AI to use is the wrong conversation.
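The "25×" figure follows from the variance shares quoted above. The short calculation below reproduces it, treating the approximate 76% as exact for illustration; the dictionary keys are hypothetical labels.

```python
# Approximate variance shares reported above, in percent.
variance_share = {
    "scenario_framing": 76,
    "behavioral_category": 15,
    "model_choice": 3,
}

ratio = variance_share["scenario_framing"] / variance_share["model_choice"]
unattributed = 100 - sum(variance_share.values())

print(f"framing vs. model choice: {ratio:.1f}x")
print(f"not attributed to these three factors: {unattributed}%")
```

With these inputs the ratio is about 25.3, and roughly 6% of the variance is left outside the three named factors.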

Core Discovery

The Identity-Operation Gradient

Synthetic populations replicate who someone is with high fidelity, but systematically fail at how they transact. This three-tier taxonomy maps the boundary between what AI consumers can and cannot simulate.

Identity-Explicit (price sensitivity, brand loyalty): 0.318

Identity-Adjacent (lifestyle, health-consciousness): 0.259

Operational-Behavioral (visit frequency, basket size): 0.121

This gradient was consistent across all six models and matched the pattern observed in Stage 2 with abstract decision parameters, where identity-adjacent parameters such as moral sensitivity (r = 0.71) far outperformed operational parameters such as loss aversion (r = 0.16).

“The quality of the question you ask the AI matters twenty-five times more than which AI you ask.”

— Adriana Rocha

For Practitioners

What this means for concept testing

Concept testing, message evaluation, and brand positioning operate in identity territory — where synthetic populations showed their strongest performance (r = 0.25–0.50 for well-designed scenarios).

Purchase prediction operates in operational territory, where synthetic fidelity remains insufficient for quantitative forecasting.

The practical positioning: not replacement, but augmentation. 100 concepts tested synthetically, 10 survivors validated with real consumers, 3 winners launched.
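The 100 → 10 → 3 funnel can be sketched as two rounds of top-k selection. This is a minimal illustration under stated assumptions: the concept names, the scores, and the `top_k` helper are hypothetical, and the second-round scores are placeholders standing in for results from real consumers.

```python
def top_k(scores, k):
    """Return the k concept names with the highest scores, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical synthetic-panel scores for 100 concepts (illustrative values only).
synthetic_scores = {f"concept_{i:03d}": (i * 37) % 101 / 100 for i in range(100)}

# Step 1: synthetic screening keeps 10 of 100 concepts.
survivors = top_k(synthetic_scores, 10)

# Step 2: real-consumer validation. Placeholder scores stand in for fieldwork
# results; in practice these would come from real respondents, not the synthetic panel.
real_scores = {name: synthetic_scores[name] for name in survivors}
winners = top_k(real_scores, 3)

print(len(survivors), len(winners))
```

The synthetic round only narrows the field; the final ranking is taken from the real-consumer scores, which matches the augmentation framing above.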

Study Overview

Validation architecture

Stage	Domain	Instrument	Key r	Sample
1	Personality expression	25-item IPIP-FFM	0.83	2,700 panelists × 3 models × 3 countries
2	Decision anchoring	24 vignettes, 12 parameters	0.54	3,600 panelists × 4 models × 3 countries
3	Behavioral curation	22 FMCG scenarios	0.37	100 real households × 6 models × 5 conditions

Models tested: Claude Opus, Claude Sonnet, Claude Haiku (Anthropic); GPT-5.4, GPT-4o-mini, GPT-4.1 (OpenAI); Gemini Flash (Google).

Request the full paper

Complete methodology, statistical tables, per-vignette correlation matrices, verbatim responses from all models, and the full Stage 2 and Stage 3 instruments.


About the Author

Adriana Rocha is the founder of Wisdom Beyond Technology and Wortya, and author of Refactoring the Firm: Building Intelligence-Native Organizations for the AI Age. The qualitative field study that motivated this research was conducted in collaboration with Datum International (Urpi Torrado) and presented at ESOMAR LATAM 2026.

Correspondence: adrianar@wortya.com
