LiarMP4 benchmarks Predictive AI baselines against a novel Generative AI approach to content moderation. By recursively analyzing the semantic dissonance between visual evidence, audio waveforms, and textual claims, our multi-agent architecture catches sophisticated disinformation that traditional models miss.
Kliment Ho, Shiwei Yang, Keqing Li
Mentored by Dr. Ali Arsanjani, Professor Lau
Target User: Trust and Safety teams, investigative journalists, and social media platforms.
The Problem: Traditional content moderation relies on Predictive AI that evaluates metadata such as account age and engagement velocity to output a scalar probability. While computationally efficient, this approach fails against Contextual Malformation: a completely authentic video paired with a fabricated caption. Standard deepfake detectors pass the video as real, yet the semantic intent is entirely deceptive.
We propose using Generative AI to extract distinct Veracity Vectors, including Visual Integrity, Audio Integrity, Source Credibility, Logic, and Emotion, providing auditable and interpretable moderation signals.
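The named vectors can be represented as a simple structured report. The sketch below is illustrative only: the field names follow the dimensions listed above, and the 0-100 scale is an assumption, not the project's confirmed schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class VeracityVector:
    """One score per factuality dimension; the 0-100 scale is an assumption."""
    visual_integrity: float
    audio_integrity: float
    source_credibility: float
    logic: float
    emotion: float

# Example: an authentic clip recontextualized with a deceptive caption may
# score high on visual/audio integrity but low on logic and source credibility.
report = VeracityVector(
    visual_integrity=92.0,
    audio_integrity=88.0,
    source_credibility=20.0,
    logic=15.0,
    emotion=35.0,
)
print(asdict(report))
```

Keeping each dimension separate, rather than collapsing to one scalar, is what makes the output auditable: a reviewer can see exactly which dimension flagged the content.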
Models were evaluated against our manually verified Ground Truth dataset. We report a Composite Mean Absolute Error (Comp. MAE), the mean absolute distance across eight veracity and alignment dimensions, alongside overall Tag Accuracy.
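The two metrics can be sketched in a few lines. The function names and dictionary layout are assumptions for illustration; the computation matches the metric descriptions above.

```python
def composite_mae(predicted: dict[str, float], truth: dict[str, float]) -> float:
    """Mean absolute distance over the dimensions both reports share."""
    keys = predicted.keys() & truth.keys()
    if not keys:
        raise ValueError("no overlapping dimensions to compare")
    return sum(abs(predicted[k] - truth[k]) for k in keys) / len(keys)

def tag_accuracy(predicted_tags: list[str], true_tags: list[str]) -> float:
    """Fraction of samples whose predicted tag exactly matches ground truth."""
    if len(predicted_tags) != len(true_tags):
        raise ValueError("tag lists must align one-to-one")
    return sum(p == t for p, t in zip(predicted_tags, true_tags)) / len(true_tags)

# Toy example with two of the eight dimensions:
pred = {"visual_integrity": 90.0, "logic": 40.0}
gt = {"visual_integrity": 100.0, "logic": 70.0}
print(composite_mae(pred, gt))  # → 20.0
```

Lower Composite MAE is better (closer to the verified scores); higher Tag Accuracy is better.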
| Method | Iteration Depth | Composite Error | Tag Accuracy | Average Latency |
|---|---|---|---|---|
| Baseline Predictive AI | 0 | 38.4 | 44% | 1.2s |
| Baseline Generative AI | 1 | 24.1 | 18.4% | 15.4s |
| Generative AI Fractal Logic | 2 | 12.8 | 64.8% | 34.1s |
| Multi-Agent System | 3 | 7.4 | 92.1% | 45.8s |
Detailed performance metrics evaluated across our verified Ground Truth dataset.
| Type | Model | Prompt | Reasoning | Tools | FCoT Depth | Accuracy | Comp. MAE | Tag Acc |
|---|---|---|---|---|---|---|---|---|
| GenAI | gemini-2.5-flash | standard | fcot | None | 2 | 83% | 15.11 | 64.8% |
| GenAI | gemini-2.5-flash | standard | fcot | Search, Code | 2 | 77.3% | 16.69 | 34.2% |
| GenAI | gemini-2.5-flash | standard | cot | Search, Code | 1 | 76.2% | 20.79 | 20.2% |
| GenAI | gemini-2.5-flash | standard | none | Search, Code | 0 | 71.4% | 23.33 | 22.2% |
| GenAI | gemini-2.5-flash | standard | cot | None | 1 | 63.6% | 26.16 | 32.7% |
| PredAI | Gradient Boost | standard | none | None | 0 | 63.3% | 11.04 | 82% |
| GenAI | gemini-2.5-flash | standard | none | None | 0 | 63% | 20.66 | 18.4% |
| GenAI | gemini-2.5-flash-lite | standard | cot | None | 1 | 56.5% | 20.37 | 24.5% |
| GenAI | qwen3 | standard | none | None | 0 | 46.2% | 44.36 | 27.8% |
| GenAI | gemini-2.5-flash-lite | standard | fcot | None | 2 | 28.6% | 27.21 | 30.7% |
| GenAI | qwen3 | standard | cot | None | 1 | 0% | 54.44 | 80% |
Breakdown of Mean Absolute Error across all eight veracity and alignment modalities.
| Model | Prompt | Reasoning | Tools | Vis | Aud | Src | Log | Emo | V-A | V-C | A-C |
|---|---|---|---|---|---|---|---|---|---|---|---|
| gemini-2.5-flash | standard | fcot | None | 6.38 | 12.13 | 17.02 | 11.28 | 14.47 | 29.15 | 10.85 | 23.4 |
| gemini-2.5-flash | standard | fcot | Search, Code | 5.91 | 13.18 | 16.36 | 15.00 | 16.36 | 30.91 | 12.27 | 22.73 |
| gemini-2.5-flash | standard | cot | Search, Code | 12.86 | 21.90 | 17.14 | 20.48 | 23.33 | 30.00 | 17.62 | 27.14 |
Our approach evolved significantly as we identified the limitations of standard vision models against sophisticated real-world misinformation.
Our initial exploration relied solely on open-source Vision Language Models such as Qwen3-VL, with the primary goal of detecting spatial anomalies and deepfakes. We quickly discovered that high visual accuracy was insufficient against modern misinformation, because genuine footage is routinely weaponized with false text.
We then integrated Dr. Ali Arsanjani's Factuality Factors (see the Alternus Vera project) and Modality Alignment criteria. This shift allowed us to evaluate how audio, video, and text relate to one another, applying techniques such as Veracity Vectors and Truthness Tensors, and successfully penalized malicious content in which authentic videos were recontextualized with deceptive claims.
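One way to see why alignment matters: combine per-modality integrity with cross-modal alignment so that a clip is only as trustworthy as its weakest link. The equal weights and min() aggregation below are illustrative assumptions, not the exact Alternus Vera formulation.

```python
def overall_veracity(integrity: dict[str, float], alignment: dict[str, float]) -> float:
    """
    Combine per-modality integrity with cross-modal alignment so that
    authentic footage under a contradictory caption is still penalized.
    Weights and the min() aggregation are illustrative assumptions.
    """
    weakest_modality = min(integrity.values())
    weakest_alignment = min(alignment.values())
    return 0.5 * weakest_modality + 0.5 * weakest_alignment

# Authentic video and audio, but the caption contradicts the visuals:
integrity = {"visual": 95.0, "audio": 90.0}
alignment = {"visual_audio": 85.0, "visual_caption": 10.0, "audio_caption": 15.0}
print(overall_veracity(integrity, alignment))  # → 50.0
```

A purely visual detector would score this clip near 95; factoring in the visual-caption mismatch cuts the overall veracity roughly in half, which is the recontextualization penalty described above.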
To handle the complexity of processing isolated factuality vectors, we built a scalable architecture natively on the Google Agent Development Kit (ADK), which enables recursive verification steps and dynamic integration of community context.
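The recursive verification loop can be sketched framework-agnostically. In the real pipeline each pass is an ADK agent call; the `analyze` callable and the toy convergence behavior below are stand-ins for illustration, and the iteration-depth parameter mirrors the 0-3 depths in the benchmark table.

```python
from typing import Callable, Optional

Scores = dict[str, float]

def fractal_verify(
    analyze: Callable[[str, Optional[Scores]], Scores],
    claim: str,
    depth: int,
) -> Scores:
    """
    Recursive verification sketch: each pass re-analyzes the claim with the
    previous pass's scores as context. depth=0 is a single pass; higher
    depths add refinement passes, as in the Iteration Depth column above.
    """
    scores: Optional[Scores] = None
    for _ in range(depth + 1):
        scores = analyze(claim, scores)
    return scores

# Toy agent: each refinement pass halves the distance to a converged value.
def toy_analyze(claim: str, prior: Optional[Scores]) -> Scores:
    base = {"logic": 40.0} if prior is None else prior
    return {k: v + 0.5 * (80.0 - v) for k, v in base.items()}

print(fractal_verify(toy_analyze, "example claim", depth=2))  # → {'logic': 75.0}
```

The benchmark trend (Composite Error falling from 38.4 to 7.4 as depth rises from 0 to 3) reflects exactly this kind of iterative refinement, traded off against latency.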
Project Attribution: This work was developed as part of the Alternus Vera Research Project, focusing on LiarMP4. This project was conducted under the supervision of Dr. Ali Arsanjani.
@misc{arsanjani_alternusvera,
  author    = {Arsanjani, Ali and others},
  title     = {Alternus Vera: A Research Project for LiarMP4, Detecting Contextual Malformation with Fractal Chain of Thought},
  year      = {2024},
  publisher = {Alternus Vera Research Group},
  url       = {https://alternusvera.com},
  note      = {Core codebase: https://github.com/DevKlim/LiarMP4}
}
The entire research pipeline, including the backend server, Frontend Studio, and model dependencies, is open source and fully containerized via Docker.