The Threat Landscape

You open a banking app for the first time. It asks for your name, your date of birth, your address. Then it asks for your face.

The front camera activates. A green oval appears on screen, guiding you to position yourself. You blink. You smile. You tilt your head left, then right. A progress bar fills. The app tells you verification is complete — you are who you claim to be.

Behind that thirty-second interaction, a pipeline of software consumed your camera feed, extracted biometric features, checked them against liveness heuristics, verified your GPS location fell within an acceptable region, and cross-referenced accelerometer data with the visual motion it observed. Five separate data sources. Dozens of validation checks. All happening on the device in your hand.

This book is about what happens when every one of those data sources is lying.

The KYC Pipeline

Know Your Customer (KYC) verification exists because financial regulators require it. Banks, fintech companies, cryptocurrency exchanges, and payment processors must verify that their users are real people, located where they claim to be, and not on any sanctions list. The penalty for inadequate verification ranges from fines in the millions to criminal liability for the compliance officers who signed off.

The typical mobile KYC flow has five stages, each designed to answer a specific question:

Stage 1: Face Capture

The app activates the front camera and runs face detection — usually through a third-party SDK like Google ML Kit or a commercial liveness and verification SDK. The SDK draws a bounding box around the detected face, checks that it meets quality thresholds (brightness, sharpness, angle), and captures one or more reference frames.

What it validates: A human face is present in the camera feed. The face meets minimum quality standards for downstream processing.

What it trusts: The camera hardware is delivering live, unmodified frames.

Stage 2: Liveness Detection

This is the defense against someone holding a printed photo in front of the camera. Liveness checks come in two flavors:

Passive liveness analyzes the captured frames for depth cues, texture patterns, and reflection characteristics that distinguish a live face from a photograph or screen. The user doesn’t need to do anything special — the analysis runs on the frames already captured.

Active liveness instructs the user to perform specific actions: smile, blink, turn your head left, nod down. The SDK tracks facial landmarks across frames and verifies the requested motion occurred. Advanced active liveness systems cross-check the visual motion against sensor data — if the app told you to tilt your head left, the accelerometer should register the corresponding device movement from your hand following your head.

What it validates: The face in the camera is a live, three-dimensional human, not a photograph, video, or mask.

What it trusts: Camera frames are live. Sensor readings reflect actual physical motion.

Stage 3: Document Verification

The app asks you to photograph your government-issued ID — passport, driver’s license, national ID card. OCR (optical character recognition) extracts the text fields: name, date of birth, document number, expiry date. Some SDKs also perform document liveness — checking for holographic features, microprinting, and other physical security measures that photographs of documents don’t reproduce well.

The extracted data is compared against the information you provided during registration. Advanced systems also compare the photo on the document against the selfie captured in Stage 1.

What it validates: You possess a valid identity document that matches your registration data.

What it trusts: The camera is pointing at a real document.

Stage 4: Geofencing

Many KYC flows include a location check. A banking app launching in the US needs to verify you’re actually in the US — or at least not in a sanctioned country. Some services are more specific: a local bank branch app might check that you’re within a certain radius of the branch.

The app queries the device’s location services — usually through Google’s FusedLocationProviderClient — and checks the returned coordinates against a geofence. Some apps also check for mock location detection, looking for signs that the GPS coordinates are being spoofed.

What it validates: The device is physically located in an acceptable region.

What it trusts: The GPS hardware is reporting actual coordinates. The operating system is not lying about mock location status.

Stage 5: Decision

All the collected evidence — face match score, liveness score, document OCR results, location check — is sent to a backend service that makes the final accept/reject decision. Some companies run additional checks server-side: sanctions list screening, duplicate detection, velocity checks (how many accounts were opened from this device recently).

The critical observation is this: stages 1 through 4 all happen on the client device. The data they consume — camera frames, GPS coordinates, sensor readings — originates from hardware that the device owner controls completely.

How These Stages Map to Attack Surfaces

Each pipeline stage creates a distinct attack surface with its own injection requirements:

Pipeline Stage	Attack Surface	Data Source	Injection Method
Face Capture	Camera feed	CameraX / Camera2 API	Frame replacement via bytecode hooks
Liveness	Camera + Sensors	ImageAnalysis + SensorEventListener	Coordinated frame + sensor injection
Document Scan	Camera feed	Same camera API, different payload	Mid-flow frame source switch
Geofencing	GPS coordinates	FusedLocationProvider / LocationManager	Location callback interception

The methodology this book teaches addresses each surface individually (Chapters 7-9) and then combines them in coordinated engagements (Chapters 10-12). Understanding the pipeline is the first step — knowing where each data source enters the application is what makes targeted injection possible.

The Verification SDK Landscape

Most applications don’t build their own face detection, liveness checks, or document OCR. They embed a third-party SDK — a library that handles the computer vision, machine learning, and biometric processing. Understanding which SDK a target uses shapes your expectations about what defenses you’ll encounter.

Google ML Kit

The most common SDK for basic face detection. ML Kit is free, runs entirely on-device, and integrates directly with CameraX’s ImageAnalysis pipeline. It detects faces, identifies landmarks (eyes, nose, mouth), and can classify expressions (smiling, eyes open). However, ML Kit does not perform liveness detection on its own — apps using ML Kit for liveness typically implement their own heuristics on top of the face detection results.

From an attacker’s perspective, ML Kit is the easiest target. Its face detection is straightforward — deliver frames with a visible face that meets the quality thresholds, and it passes. There’s no server-side component and no active challenge-response protocol.

Commercial Active Liveness SDKs

Some commercial liveness SDKs are known for aggressive anti-spoofing. These SDKs use a combination of 3D face mapping, texture analysis, and active liveness challenges. They require the user to perform specific head movements and analyze the resulting 3D deformation of the face model — something that a flat photograph cannot reproduce.

These SDKs may also check device integrity, monitor for common hooking frameworks (Frida, Xposed), and perform server-side validation of the captured session. Bypassing a commercial active liveness SDK is significantly harder than bypassing ML Kit alone — you need high-quality face frames with natural motion, coordinated sensor data that matches the observed visual movement, and sometimes additional measures to avoid the SDK’s anti-tampering checks.

Server-Side Challenge-Response and Document Verification SDKs

Some SDKs use a server-side challenge-response protocol: the server sends a unique visual stimulus (a colored light sequence), the client captures the user’s face as the stimulus plays, and the server verifies both the liveness and the response to the specific challenge. This is notably harder to bypass because each session has a unique server-generated challenge.

Other SDKs combine document verification with biometric matching — they capture the ID document, extract the photo, then compare it against a live selfie with liveness checking. Their strength is in the document analysis pipeline, which includes fraud pattern detection on the document itself.

Why SDK Choice Matters for Red Teams

The hooks in this toolkit operate below the SDK layer — they intercept at the Android camera API, before the SDK ever receives the frames. This means the fundamental injection mechanism works regardless of which SDK the target uses. However, the SDK determines what quality of injection is required:

ML Kit targets pass with basic face frames at moderate quality
Active liveness SDK targets require high-quality frames with natural motion sequences, matched sensor data, and sometimes anti-tamper evasion
Server-side challenge-response targets add the challenge of server-generated stimuli that can’t be predicted in advance

Recon (Chapter 5) teaches you to identify which SDK a target embeds, so you know what level of sophistication your payloads need before you start the engagement.

The Fundamental Weakness

Every mobile KYC system shares an architectural assumption that creates its primary vulnerability: client-side code trusts client-side data.

When an app calls ImageAnalysis.Analyzer.analyze(), it receives an ImageProxy object and processes whatever pixels are in it. The app has no way to verify that those pixels came from the physical camera sensor rather than from a file on disk. The API contract simply delivers image data — the app must trust it.

The same is true for every other data source:

Location callbacks deliver a Location object with latitude and longitude. The app cannot verify these coordinates against actual satellite signals — it receives them from the operating system, which received them from the location service, which may or may not have received them from actual GPS hardware.
Sensor callbacks deliver a SensorEvent with acceleration or rotation values. The app cannot distinguish between readings from a physical MEMS sensor and values injected programmatically.

This isn’t a bug. It’s the fundamental architecture of the Android operating system. Apps run in sandboxes and access hardware through framework APIs. Those APIs deliver data — they don’t authenticate it. On an unmodified device, this works because the operating system is trusted. But when you control the device — when you can modify the app’s bytecode, intercept its API calls, and substitute your own data — the entire trust model collapses.

The camera API delivers whatever you put into it. The location API returns whatever coordinates you configure. The sensor API reports whatever motion profile you define. The app processes all of it with the same confidence it would give to real data, because it has no mechanism to tell the difference.

This is the attack surface this book explores.

Real-World Fraud Patterns

The techniques in this book are used by security professionals to test whether verification systems can be bypassed. But the attack surface is real, and understanding the fraud patterns that exploit it is essential context for anyone working in this space.

Account Opening Fraud

The most common pattern: an attacker obtains a victim’s identity documents (purchased on darknet markets, stolen from data breaches, or photographed covertly) and uses them to open financial accounts. The attacker needs to pass the selfie stage with a face that matches the document photo — either by using a high-resolution photograph of the victim’s face, a video loop, or an AI-generated face that matches the document.

Without liveness detection, this is trivial: hold a phone displaying the victim’s photo in front of the camera. With passive liveness, it’s harder but still feasible with high-quality screens. Active liveness raises the bar further — but if you control the data at the API level, none of these defenses matter. The verification SDK processes whatever frames you deliver, and if those frames show a convincing face performing the requested actions, it passes.

Location-Gated Service Bypass

Financial services are often geographically restricted due to licensing and regulatory requirements. A cryptocurrency exchange licensed only in certain jurisdictions might use geofencing to block users from sanctioned countries. A gambling app might verify you’re within a state that permits online betting.

GPS spoofing at the application level bypasses these checks completely. The app queries the location API and receives coordinates that place the device wherever the operator configured. Unlike VPN-based location masking (which only affects IP geolocation), API-level GPS spoofing fools the app’s own location queries — the same APIs it trusts to verify the user’s physical position.

Synthetic Identity Fraud

Rather than stealing a real person’s identity, synthetic identity fraud creates an entirely new one. The attacker combines real and fabricated information — a real Social Security number (often from a minor, elderly person, or deceased individual) with a fake name, date of birth, and address. The face for the verification selfie is AI-generated.

The challenge for the attacker is passing liveness: a static AI-generated image fails active liveness checks that require motion. But if you can inject frames directly into the camera API, you can deliver a sequence of generated images showing whatever facial expressions and head movements the SDK requires. The SDK’s liveness algorithm evaluates the frames, not how they arrived.

Liveness Bypass via Pre-Recorded Media

The simplest technical attack: record a video of the target actions the liveness check will request (smiling, blinking, head tilting), then play that video back through the camera API. No AI generation required — just a real person performing the motions once, captured as a sequence of frames that can be replayed indefinitely.

This is the primary attack pattern this book teaches you to execute in a controlled environment: recording camera frames, then injecting them into a patched application that processes them as live camera input.

Multi-Factor Bypass

The most sophisticated attacks combine multiple injection surfaces simultaneously. A banking onboarding flow might require a face selfie with liveness (camera + sensors), a location check (GPS), and a document scan (camera again with different payload). Each surface must be active and coordinated — the sensor data must be physically consistent with the visual data, the GPS coordinates must fall within the geofence, and the frame source must switch mid-flow when the app transitions from selfie to document capture.

This coordination challenge is what makes multi-step targets significantly harder than single-surface ones. Chapter 10 teaches the operational methodology for managing these concurrent injection streams, and Lab 6 puts it into practice against a three-step verification flow.

The Economics of Identity Verification

Understanding why this attack surface exists requires understanding the economics on both sides.

The Cost of Fraud

Identity fraud cost financial institutions an estimated $20 billion in direct losses in the US alone in 2023. The indirect costs — investigation, remediation, customer compensation, regulatory fines — multiply that figure several times over. A single successfully opened fraudulent account can generate tens of thousands of dollars in losses before it’s detected.

For context: a fraudulent bank account is not just a one-time loss. Once opened, the account becomes infrastructure for further crimes — money laundering, wire fraud, check kiting, credit card fraud. The account might exist for weeks or months before the institution detects the fraud, during which time the losses compound. Regulators increasingly hold the institution responsible not just for the initial fraud but for everything the fraudulent account facilitated.

The Cost of Verification

Implementing robust KYC isn’t free. Third-party liveness SDKs charge per verification — typically $0.50 to $3.00 per check, depending on the level of assurance. At scale, this adds up: a fintech onboarding 100,000 users per month might spend $150,000 to $300,000 annually on verification alone. That’s before internal engineering costs, compliance staff, and the cost of false rejections (legitimate users who fail verification and abandon the signup).

The cost creates a perverse incentive. Companies want verification to be thorough enough to satisfy regulators but fast and frictionless enough not to lose legitimate customers. Every additional check increases security but also increases the drop-off rate — the percentage of users who abandon the signup process because it’s too slow or too demanding. This tension between security and conversion is where vulnerabilities hide. Companies cut corners. They skip server-side validation. They accept weaker liveness thresholds. They don’t cross-check sensor data against visual motion. Each shortcut is a potential attack surface.

The Arms Race

This creates a perpetual arms race. Verification vendors improve their liveness detection (depth analysis, frame entropy measurement, sensor correlation). Attackers improve their bypass techniques (higher-quality injected frames, physics-consistent sensor profiles, API-level spoofing that bypasses client-side checks). Vendors respond with server-side verification, device attestation, and behavioral analytics. Attackers adapt to each new defense.

The current state of the arms race heavily favors the attacker on one front: client-side processing is fundamentally untrustworthy. No amount of SDK sophistication can overcome the fact that the app is running on a device the attacker controls. The long-term direction of the industry is toward server-side verification — sending raw biometric data to a server that performs its own analysis — but this has performance, privacy, and cost implications that slow adoption.

Red team testing — the controlled, authorized application of attack techniques against your own systems — is how organizations stay ahead in this race. You can’t fix a vulnerability you don’t know about, and you can’t know about it unless someone tries to exploit it. That’s the purpose of the methodology this book teaches.

Who This Book Is For

This book serves three audiences at different points in their security careers:

Penetration testers and red teamers who already conduct mobile application assessments and want to add biometric and KYC bypass to their toolkit. You know how to use Frida, you understand the basics of Android reverse engineering, but you haven’t specifically targeted identity verification flows. Parts I through III give you a complete, tested methodology. Parts IV and V show you how to handle advanced targets and think about defense.

Security engineers and developers who build or maintain identity verification systems and want to understand the attacker’s perspective. You might be evaluating liveness SDKs, designing your KYC architecture, or responding to a penetration test report that found biometric bypass vulnerabilities. This book shows you exactly how those attacks work — not theoretically, but step by step with real tools against real applications.

Security-adjacent professionals — QA engineers, compliance officers, technical product managers — who need to understand the risk landscape without necessarily executing the attacks themselves. Part I gives you the conceptual foundation. The case studies and economics sections provide the business context. The defense chapters (Part V) give you the vocabulary to evaluate mitigations.

Progressive Difficulty

The book is designed to be read front-to-back, with each chapter building on the previous one. But it’s also structured so you can jump to what you need:

Parts I-III (Chapters 1-12) take you from zero to running full, multi-step engagements. If you follow along with the labs, you’ll patch an APK, inject camera frames, spoof GPS, manipulate sensor data, and document a complete engagement — all against provided practice targets.
Part IV (Chapters 13-16) goes deeper: writing your own smali bytecode, building custom hooks for targets the standard toolkit doesn’t cover, and using AI to accelerate your analysis.
Part V (Chapters 17-18) switches perspectives: what defenders should implement, how to detect the attacks you’ve learned, and what the industry is building to close these gaps.

How to Use This Book

This training package has three components:

The Book

You’re reading it. Eighteen chapters of theory, methodology, and worked examples organized in five parts. Read the chapters to understand why things work, not just how to run the commands.

The Labs

Fourteen hands-on exercises, each mapped to a book chapter. Labs are self-contained — they have specific objectives, step-by-step instructions, concrete deliverables (reports, screenshots, log files), and self-check scripts that tell you whether you succeeded. Do the labs to build muscle memory. Reading about camera injection is not the same as doing it.

The Materials Kit

Everything you need to run the labs: a practice target APK (a purpose-built KYC application with all three attack surfaces), payload templates (GPS coordinates, sensor profiles), automation scripts (recon, delivery statistics), and report templates. Download the kit and unpack it alongside the book content.

The recommended approach: read a chapter, then immediately do the corresponding lab. Chapter 4 (The Lab) maps to Lab 0 (Environment Verification). Chapter 5 (Reconnaissance) maps to Lab 1 (APK Recon). And so on through Chapter 10 and Lab 6, where you run a complete engagement against a multi-step target.

What You’ll Build

By the end of this book, you will have a working toolkit that intercepts every data source a KYC application consumes — and a methodology for using it against real targets. Here is what the end state looks like.

The Patched APK

You take a target APK, run a single command, and get back a weaponized version with 1,134 injected runtime classes. The patch-tool handles everything: decoding, injecting hook classes into a new DEX file, patching the app’s entry points, rebuilding, aligning, and signing. The output tells you exactly which hooks fired and which were skipped:

[+] Injected 1134 runtime smali files into smali_classes7/
[+] Detected: com.poc.PocApplication
[+] Patched Application.onCreate()
[+] Patched toBitmap() in 1 file(s)
[+] Patched analyze() in 1 method(s)
[+] Patched onCaptureSuccess() in 1 method(s)
[+] Patched onLocationResult() in 1 method(s)
[!] No onSensorChanged(SensorEvent) found -- target may not use SensorEventListener

Every [+] is a hook that is now active. Every [!] is a surface the target does not use — confirmed by your recon before you ever ran the tool.

The Overlay

When you launch the patched app, a lightning bolt icon appears in the top-right corner. Tap it to open the control panel. Three modules — camera, location, and sensor — each with a toggle, a status indicator, and configuration controls. You switch frame sources mid-flow, update GPS coordinates in real time, and adjust sensor profiles without restarting the app. The user experience of the original app is identical — the overlay is the only visible difference.

Camera Injection in Action

You push a folder of face images to the device. The app opens its camera. Instead of the live feed, it sees your injected frames. Google ML Kit runs face detection on your frames and draws bounding boxes around the face — a face that is not physically in front of the camera. The liveness check passes. The delivery log confirms every frame was accepted:

FrameStore: loaded 47 frames from /sdcard/poc_frames/face_neutral/
FrameInterceptor: armed, source: face_neutral
FrameInterceptor: FRAME_DELIVERED [frame 1/47]
FrameInterceptor: FRAME_CONSUMED
FrameInterceptor: intercept swapped ImageProxy

GPS Spoofing with Mock Detection Bypass

You push a JSON config with Times Square coordinates. The app queries FusedLocationProviderClient and receives your coordinates — with realistic jitter, fresh timestamps, and isFromMockProvider() returning false. The geofence check passes. The app thinks the device is in Manhattan.

The Delivery Report

At the end of an engagement, you extract delivery statistics that document every injection event across all three subsystems:

Frame:    delivered=47 consumed=45 rate=95.7%
Location: delivered=12 callback=12 mock_bypassed=12
Sensor:   delivered=89 listener=89 rate=100%

These numbers go into an engagement report with screenshots, logcat dumps, and actionable recommendations for the client. The report is the deliverable — the bypass is just the means.

The Full Picture

The complete workflow — from a stock APK to a documented bypass of a multi-step KYC flow — takes about an hour once you have practiced it. Thirty minutes of recon, five minutes of patching, fifteen minutes of execution, and the rest on reporting. By Chapter 10, you will run this cycle end-to-end against a practice target with face capture, liveness detection, and geofencing. By Chapter 14, you will build custom hooks for targets the standard toolkit does not cover. By Chapter 18, you will understand both sides — how to break verification and how to build it so it resists this class of attack.

What Comes Next

Before you touch any tool, before you decode a single APK, you need to understand the rules of the game. Chapter 2 defines the boundaries: what authorized testing looks like, what the legal landscape looks like, and the ethical framework that separates a security professional from an attacker. The techniques are the same — the authorization is what makes the difference.