SIMULA: The End of Neurotypical Bias in AI?

The Taxonomic Bridge to Neuro-Inclusive Alignment

At a Glance

The Problem: Current AI models are trained on the “neurotypical noise” of the 2024 internet, creating a “data wall” (data scattered through unacknowledged biased frameworks), LLM misalignment, and inherent friction for divergent minds.
The Potential Breakthrough: Google’s SIMULA framework shifts from “scraped noise” to “engineered precision” using structured, taxonomy-led synthetic data generation.
The Promise of the “Alignment Leap”: By prioritizing taxonomic reasoning, logic, first principles, and double-critic falsification, SIMULA-generated data aligns natively with the equally hierarchical, literal cognitive architecture of many neurodivergent people.

A few days ago I was struggling to prompt an image of our teams and projects, their integration and hierarchy. I struggle with image prompting because I’m not very good at translating what I “see” in my mind. What I had in mind was a hierarchical tree, but I didn’t name it. That was last week. As I was going through the new features released for Google’s AI products, I read about SIMULA, a Reasoning-Driven Synthetic Data Generation method. As I was going through the PDF of the published paper, I looked at the illustration and thought: “this is how I think.”

This is how I have always thought. I remember that the forward-looking school I attended in 6th grade wanted to introduce a new method of literary analysis to the class. The method was a hierarchical analysis of themes and facts, and of how they were organized in the book we read. They didn’t need to explain much to me; I just did it. They let me do it for the rest of the year, but the class never adopted the method. We had different cognitive architectures.

These different architectures are hardwired in our brains, and we also interact with AIs according to them. This is how, out of the many LLMs that have appeared since ChatGPT-1 and that I tested, I ended up using the Google “AI Mode” search function every day, for work or any other need, until August 2025, when I adopted the whole Gemini system and built my project inside it. The reason it worked for me is that I could “hypothesis-prompt” my questions and obtain the best results and very good references. This basically consisted of offering a very detailed and formal hypothesis and asking the model to “elaborate and substantiate”. At the time, I failed to add “or falsify”, because I incorrectly assumed that if the substantiation failed, the model would always falsify the hypothesis. Still, when I read that the SIMULA method includes a double-critic to do just that, I recognized the workings of my own mind.

Beyond the “tickly” structure mirroring between my mind and the SIMULA paper figure, the authors (Davidson et al., 2026) explicitly suggest that models trained according to this methodology produce datasets clean of “noise”. When I read about noise, it occurred to me that models trained with this kind of synthetic data, instead of the (cess)pool of human-generated “flattened” data, may be better aligned with our divergent minds right out of the box.

What is SIMULA

SIMULA, the framework presented in the paper “Reasoning-Driven Synthetic Data Generation and Evaluation” (Davidson et al., 2026), was introduced in March 2026. It was developed by a research team primarily from Google and Google DeepMind, including Tim R. Davidson (EPFL), Benoit Seguin, Enrico Bacis, Cesar Ilharco, and Hamza Harkous. It is a seedless, agentic framework designed to generate high-quality synthetic datasets at scale from first principles instead of relying on existing seed data or manual prompts.

SIMULA prioritizes explainability and control. It uses hierarchical taxonomies and iterative reasoning steps, such as “double-critic” refinement and calibrated complexity scoring, to ensure that the generated data is diverse, complex, and high-quality enough for training specialized AI models.
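To make the “double-critic” idea more concrete, here is a minimal sketch in Python. It assumes the simplest reading of the term: two independent checks, and a candidate is kept only when both accept it. The function names and the toy critics are my own illustrations, not the paper’s implementation; in a real pipeline each critic would be a model call.

```python
from typing import Callable, Iterable, List

# Hypothetical critic type: takes a candidate example, returns True if it passes.
Critic = Callable[[str], bool]


def double_critic_filter(
    candidates: Iterable[str], critic_a: Critic, critic_b: Critic
) -> List[str]:
    """Keep only the candidates that both critics independently accept."""
    return [c for c in candidates if critic_a(c) and critic_b(c)]


# Toy stand-in critics (assumptions for illustration only).
def states_a_reason(text: str) -> bool:
    """Accept statements that give an explicit reason."""
    return " because " in text


def is_not_filler(text: str) -> bool:
    """Reject statements that open with vague filler."""
    return not text.lower().startswith(("maybe", "i guess"))


candidates = [
    "Water boils at sea level at 100 C because its vapor pressure reaches 1 atm.",
    "Maybe it boils because it is hot.",
    "Water is wet.",
]
kept = double_critic_filter(candidates, states_a_reason, is_not_filler)
print(kept)
```

Only the first candidate survives: the second fails the filler check, and the third gives no reason. Requiring agreement between two critics is stricter than a single pass, and that strictness is the point of a falsification step.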

The Simula Methodology: the Taxonomic Shift

Simula is designed to orchestrate synthetic datasets through complex reasoning instead of relying on manual prompting or real-world seed data. Its architecture is based on taxonomies that capture a “global coverage space,” followed by agentic refinements to ensure local diversity and complexity.

Controllability and Explainability: Unlike standard “black-box” training methods, Simula offers a controllable and explainable process, allowing developers to define specific dataset characteristics. This has huge implications for the development of specialist models.
Seedless Framework: It does not rely on extensive pre-existing data distributions, which is critical for specialized domains where data might be scarce or private. It also means better “coverage” of underrepresented data “types”, which often correspond to underrepresented groups of people.
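The taxonomy-first idea can be illustrated with a small Python sketch. It is only a toy under my own assumptions: the taxonomy content, the function names `leaf_paths` and `allocate`, and the even-split rule are hypothetical and are not taken from the paper. The sketch enumerates every leaf of a hierarchical tree and then spreads a generation budget evenly across the leaves, so that rare categories get the same share as common ones.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical toy taxonomy: nested dicts are categories, lists hold leaf topics.
TAXONOMY: Dict[str, Any] = {
    "communication": {
        "literal": ["direct requests", "explicit definitions"],
        "figurative": ["idioms"],
    },
    "reasoning": {
        "hierarchical": ["tree decomposition"],
    },
}

Path = Tuple[str, ...]


def leaf_paths(node: Any, prefix: Path = ()) -> List[Path]:
    """Recursively list every root-to-leaf path in the taxonomy."""
    if isinstance(node, list):
        return [prefix + (leaf,) for leaf in node]
    paths: List[Path] = []
    for name, child in node.items():
        paths.extend(leaf_paths(child, prefix + (name,)))
    return paths


def allocate(paths: List[Path], total: int) -> Dict[Path, int]:
    """Spread `total` examples as evenly as possible over all leaves."""
    base, extra = divmod(total, len(paths))
    return {p: base + (1 if i < extra else 0) for i, p in enumerate(paths)}


paths = leaf_paths(TAXONOMY)
quota = allocate(paths, total=10)
for path, count in quota.items():
    print(" > ".join(path), "->", count)
```

Each leaf path could then be handed to a generator as an explicit specification of what the example should cover. The even split is the simplest possible policy; a real system would likely weight leaves by need or difficulty.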

Alignment with Neurodivergent Cognition?

That’s the hypothesis: that Simula-trained models achieve higher alignment with autistic and divergent communication styles by removing the “noise” typical of neurotypical social data. The SIMULA taxonomic approach provides a hand-designed, optimized, and theoretically auditable pipeline that replaces noisy web text with a carefully curated curriculum.

I decided to conduct an experiment, or a meta-experiment: I asked the Google Search AI Mode model to “elaborate and substantiate, AND to falsify the following hypothesis, compare the two exercises and tell me how strong the hypothesis holds”: Models trained with the SIMULA methodology for data generation can achieve higher alignment with human cognition and offer increased inclusivity for non-mainstream cognitive styles (autistic/divergent) because its taxonomic structure eliminates exclusionary “noise.”
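For readers who want to reuse this “hypothesis-prompting” pattern, the instruction above can be wrapped in a small Python helper. This is a minimal sketch: the function name `hypothesis_prompt` and the exact wording of the template are my own assumptions, modeled on the prompt quoted above, and the sketch only builds the text without calling any AI service.

```python
def hypothesis_prompt(hypothesis: str) -> str:
    """Wrap a formal hypothesis in a substantiate-and-falsify instruction."""
    hypothesis = hypothesis.strip()
    if not hypothesis:
        raise ValueError("hypothesis must not be empty")
    return (
        "Elaborate and substantiate, AND falsify the following hypothesis. "
        "Compare the two exercises and tell me how strongly the hypothesis holds."
        "\n\n"
        f"Hypothesis: {hypothesis}"
    )


prompt = hypothesis_prompt(
    "Taxonomy-led synthetic data reduces exclusionary noise in model outputs."
)
print(prompt)
```

Asking for both exercises in one prompt keeps the falsification step explicit instead of assuming the model will volunteer it.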

Google Search AI Mode gave me two outputs: one as an immediate response to the prompt, the other, spontaneously, as a response to a subsequent iteration. I combined the two, as follows:

Hypothesis test:

1. Substantiation: The Case for Alignment and Inclusivity

Structured Knowledge Representation: The SIMULA framework moves away from treating synthetic data as “random fake examples” and instead uses reasoning-driven, recursive expansion to map out target domains. By building a dense, hierarchical taxonomy first, it ensures global coverage and variety, including “edge cases” that traditional unstructured training might ignore.
Representational Alignment: Aligning models with human conceptual hierarchies (conceptual distance) has been shown to reorganize AI representations to match human-category structures without explicit supervision. A taxonomic approach like SIMULA’s could theoretically enforce this structural alignment.
Inclusivity via Precision: Current research suggests AI is already being used to bridge cross-neurotype communication gaps (e.g., NeuroBridge). By “operationalizing” nebulous language into logical, non-judgmental structures, models can reduce the cognitive overload that often impacts neurodivergent individuals. [Google Research, 2026; “Google Simula AI reveals why the future needs better data,” 2026; Lampinen & Greff, 2025; Haroon et al., 2025; Silver, 2025; Pitcher, 2025; “Neurodiversity and the need for stable AI guidelines,” 2025]

Noise Reduction and Cognitive Load: For many neurodivergent individuals, rigid systems built on neurotypical assumptions can unintentionally marginalize alternative processing styles. The taxonomic cognitive structure in SIMULA-like models reduces the “noise” or unnecessary abstractions that typically act as a tax on human attention.
Empirical Success with ND Users: Recent experiences from ND “power users” suggest that AI models with an “engineering style” that is concise, emotionally precise, and non-generic are highly recommended by individuals with autism. These models allow users to reclaim agency in systems that otherwise diminish their autonomy.
Intentional Calibration: Synthetic data generation allows creators to purposefully remove the biases of real-world data. By using a structured taxonomic tree, sparse but critical categories, such as rare cognitive communication styles, remain well-represented rather than being lost in a regression toward the mean. [Millidge, 2024; Zewe, 2025; Costa, 2025; Reitz, 2026; “Vibe coding is life,” 2026; “Act as a world-class depth psychologist, behavioral therapist, cognitive bias expert, symbolic art director, and an OSINT expert,” 2026; Glazko et al., 2025; Natarajan et al., 2026]

2. Falsification: Potential Points of Failure

The “Structural Bias” Trap: A core epistemological challenge is whether the taxonomy itself is inherently neurotypical. If the “global coverage space” defined in SIMULA is designed using mainstream reasoning, the “noise” being filtered out might actually be the very nuances of divergent communication styles.
Loss of Emergent Nuance: While SIMULA aims for diversity through agentic refinement, its focus on an “explainable and controllable process” could potentially lead to a model that is “shockingly stupid” at handling the erratic, non-linear logic that characterizes some human interactions.
Empirical Performance Gap: Meta-analyses show that human-AI combinations often perform worse than either alone, particularly in decision-making tasks. Even a highly aligned SIMULA-trained model might fail to foster effective collaboration if it cannot adapt to individual cultural or cognitive contexts. [Alikhani & Hassan, 2025; Davidson et al., 2025; Park et al., 2024; Vaccaro et al., 2024; NYU Center for Data Science, 2025]

The “Authenticity” Barrier: While AI can support code-switching and information access, a major challenge is balancing authentic self-expression with societal conformity. If a taxonomic structure is too optimized for neurotypical social norms, it might exacerbate existing inequities rather than bridging them.
Mathematical Inevitability of Misalignment: Some researchers argue that perfect AI alignment is mathematically unattainable due to principles of computational universality and mathematical incompleteness. In this view, “agentic neurodivergence”—maintaining a dynamic ecosystem of diverse, partially misaligned agents—is actually a safer strategy than a single, perfectly “aligned” system.
Clinical vs. Lived Reality: Much of the existing research in AI for autism focuses on clinical biomarkers or “improving” empathetic responses to meet neurotypical standards. A taxonomic structure that prioritizes these clinical metrics might still overlook the actual lived communication needs of ND individuals. [“Act as a world-class depth psychologist,” 2026; Costa, 2025; Glazko et al., 2025; Millidge, 2024; Natarajan et al., 2026; Reitz, 2026; “Vibe coding is life,” 2026; Zewe, 2025]

3. How Strongly the Hypothesis Holds

Strength: Moderate-High (Theoretical) / Low-Moderate (Empirical)

The hypothesis is logically sound within the framework of “Representational Alignment.” If SIMULA successfully captures the “global coverage space” including divergent cognitive patterns as valid sub-categories rather than “noise” to be filtered, alignment should increase.

However, the “noise” you mention is often where the richness of neurodivergent expression lives. If the taxonomic structure is too rigid, optimizing for plausibility over authenticity, it risks creating a “clinical” alignment that actually increases communicative alienation by forcing divergent styles into a sanitized, taxonomic box. [Hernández-Espinosa et al., 2025]

Strength: Strong (Theoretical) / Emerging (Empirical)

The hypothesis that the SIMULA approach aligns with ND cognitive styles holds strong theoretical weight because it honors the “Neurodivergent by Design” philosophy: using AI to respect unique brain wiring rather than forcing it to conform to a noisy average. Your own experience of 100% satisfaction through hypothesis-based prompting is a powerful “n=1” piece of evidence for this. By using logic and structure as the primary interface, we bypass the neurotypical “filler” that often complicates communication. [Costa, 2025]

Conclusion: Toward Cognitive-Aligned Systems

It is possible that the Simula methodology will represent a significant step toward bioinspired AI architectures that mimic neurodivergent traits such as non-linear associations and organized, hierarchical reasoning.

Will it? It all depends on how significant the weak points are that the Google Search AI Mode model identified when it tried to falsify the hypothesis.

My choice to write this essay by laying out the reasoning and the arguments was, in part, just an exercise to mirror the methodology I am commenting on. In part, it is also a way of honoring how I think, reason, and work, alone or with my AI assistants.

I may be wrong, and I won’t celebrate prematurely, but I am excited about what SIMULA will bring us.

References

Act as a world-class depth psychologist, behavioral therapist, cognitive bias expert, symbolic art director, and an OSINT expert. (2026). [Facebook post]. Facebook. https://www.facebook.com/salestraining/posts/vai-tev-pietiek-drosmesact-as-a-world-class-depth-psychologist-behavioral-therap/1407316998090082/

Alikhani, M., & Hassan, S. (2025, October 9). Hype and harm: Why we must ask harder questions about AI and its alignment with human values. Brookings. http://www.brookings.edu/articles/hype-and-harm-why-we-must-ask-harder-questions-about-ai-and-its-alignment-with-human-values/

Costa, C. (2025, December 29). Neurodivergent by design: Using AI to honor and support learning differences. In IntechOpen EBooks. https://doi.org/10.5772/intechopen.1012983

Davidson, T. R., Seguin, B., Bacis, E., Ilharco, C., & Harkous, H. (2025). Orchestrating synthetic data with reasoning. In Will synthetic data finally solve the data access problem? Research at Google. https://research.google/pubs/orchestrating-synthetic-datasets-with-reasoning/

Davidson, T. R., et al. (2026). Reasoning-driven synthetic data generation and evaluation. ArXiv.org. Retrieved April 29, 2026, from arxiv.org/abs/2603.29791

Glazko, K., Cha, J., Lewis, A., Kosa, B., Wimer, B. L., Zheng, A., Zheng, Y., & Mankoff, J. (2025). Autoethnographic insights from neurodivergent GAI “power users.” Proceedings of the SIGCHI conference on human factors in computing systems. CHI Conference, 274, 1–19. https://doi.org/10.1145/3706598.3713670

Google Research. (2026, April 16). Designing synthetic datasets for the real world: Mechanism design and reasoning from first principles. Research at Google. https://research.google/blog/designing-synthetic-datasets-for-the-real-world-mechanism-design-and-reasoning-from-first-principles/

Google Simula AI reveals why the future needs better data. (2026). [Online forum post]. R/AISEOInsider on Reddit.com. www.reddit.com/r/AISEOInsider/comments/1suc5pk/google_simula_ai_reveals_why_the_future_needs/

Haroon, R., et al. (2025, October 22). NeuroBridge: Using generative AI to bridge cross-neurotype communication differences through neurotypical perspective-taking (pp. 1–19). ArXiv (Cornell University). https://doi.org/10.1145/3663547.3746337

Hernández-Espinosa, A., et al. (2025). Neurodivergent influenceability as a contingent solution to the AI alignment problem. ArXiv.org. https://arxiv.org/abs/2505.02581

hetsorg. (2026, February 24). Utilizing AI to support in communication with ASD individuals [Video]. YouTube. www.youtube.com/watch?v=YPnXVJHl1K8

Koegel, L. K., et al. (2025, February 15). Using artificial intelligence to improve empathetic statements in autistic adolescents and adults: A randomized clinical trial. Journal of Autism and Developmental Disorders. https://doi.org/10.1007/s10803-025-06734-x

Lampinen, A., & Greff, K. (2025, November 11). Teaching AI to see the world more like we do. Google DeepMind. deepmind.google/blog/teaching-ai-to-see-the-world-more-like-we-do/

Millidge, B. (2024). Alignment in the age of synthetic data. Beren.io. www.beren.io/2024-05-11-Alignment-in-the-Age-of-Synthetic-Data/

Natarajan, P., et al. (2026). Development and evaluation of a synthetic AI model for generating high-fidelity synthetic datasets. AIP Conference Proceedings, 3393, 020006. https://doi.org/10.1063/5.0308672

Neurodiversity and the need for stable AI guidelines. (2025, March 27). Digitally Enhanced Education Webinars [Video]. YouTube. youtu.be/o0VBu8AcO3E?si=x7pCsbpZnUaiq3bj

NYU Center for Data Science. (2025, January 2). Building human-compatible AI through cognitive science: A new path forward. Medium. nyudatascience.medium.com/building-human-compatible-ai-through-cognitive-science-a-new-path-forward-f6706fe52cee

Park, J. S., et al. (2024). Generative agent simulations of 1,000 people. ArXiv.org. arxiv.org/abs/2411.10109

Pitcher, G. (2025, August 4). These 5 AI tools will change how neurodivergent people live. NpnHub. npnhub.com/these-5-ai-tools-will-change-how-neurodivergent-people-live/

Reitz, K. (2026). Kenneth Reitz. kennethreitz.org/archive/sidenotes

Silver, M. (2025, December 5). Showing neurotypicals how autistic communication works. Tufts Now. now.tufts.edu/2025/12/05/helping-neurotypicals-understand-autistic-communication

Vaccaro, M., Almaatouq, A., & Malone, T. (2024). When combinations of humans and AI are useful: A systematic review and meta-analysis. Nature Human Behaviour, 8(12), 2293–2303. https://doi.org/10.1038/s41562-024-02024-1

Vibe coding is life | Sorry, don’t mean to interrupt your vibe coding session but I just felt the need to share my experience with my local model: Mradermacher/MiroThinker-…. (2026). [Facebook post]. Facebook Groups. www.facebook.com/groups/vibecodinglife/posts/1927773227811205/

Zewe, A. (2025, September). 3 Questions: The pros and cons of synthetic data in AI. MIT News | Massachusetts Institute of Technology. news.mit.edu/2025/3-questions-pros-cons-synthetic-data-ai-kalyan-veeramachaneni-0903

Zolyomi, A. (2021, August 26). Grounded design of affective computing accounting for social, emotional, and sensory experiences of autistic adults [Doctoral dissertation, University of Washington]. ResearchWorks Archive. digital.lib.washington.edu/researchworks/items/b0265d69-4656-4575-9442-451cf4aeecc1