Module 9 — Native Code & Advanced Reverse Engineering

Scope & ethics reminder: everything here is intended for authorized, lab-only testing (intentionally vulnerable APKs, test builds, or apps where you have written permission). Native reverse engineering and manipulation are powerful techniques — do not apply them to production systems or devices you do not own/are not permitted to test. Where a technique could enable real-world compromise (e.g., producing exploit payloads), I explain concepts and defensive implications rather than step-by-step offensive procedures for production targets.

This module covers the full lifecycle of working with native Android code: identifying native components, triaging .so files, using ELF tooling, disassemblers/decompilers (Ghidra, IDA, radare2/Cutter), dynamic native analysis (gdbserver, Frida-Gum), vulnerability classes in native code, hardening/mitigations, fuzzing, and producing audit-grade deliverables. It is geared toward pentesters validating banking apps and defenders wanting to harden native components.

9.0 Learning objectives

After completing Module 9 you will be able to:

Recognize when an app relies on native code and why that matters for security.
Extract and triage .so libraries from an APK and determine ABI, stripping status, and obvious functionality.
Use ELF inspection tools (file, readelf, objdump, nm) to collect metadata and exported symbols.
Use interactive reverse-engineering tools (Ghidra, IDA, radare2/Cutter) to analyze native functions and identify crypto/attestation/native checks.
Perform safe dynamic analysis of native libraries in a lab (gdbserver, Frida-Gum) and capture reproducible evidence.
Classify common native vulnerabilities and determine likely mitigations and remediation steps.
Provide concrete, prioritized remediation guidance and create artifacts for developer teams (patch suggestions, compiler flags, unit tests).

9.1 Why native matters for mobile security

Native libraries (.so) are used for performance, legacy code, or to make reversing harder.
Sensitive operations (crypto, attestation parsing, integrity checks) are often moved to native to raise attacker effort.
Native code brings memory-safety risks (buffer overflow, use-after-free), ABI issues, and platform-specific behavior (aliasing, endianness).
A vulnerable native library in a banking app can lead to credential exposure, logic bypass, or remote/local code execution.

9.2 Initial triage: locating `.so` files and quick metadata

Commands (lab-only) — extract list of `.so`

# list .so inside APK
unzip -l app.apk | awk '{print $4}' | grep '\.so$' > native_paths.txt

# extract one .so for inspection
unzip -p app.apk lib/arm64-v8a/libexample.so > libexample_arm64.so

Quick file and header checks

file libexample_arm64.so
readelf -h libexample_arm64.so      # ELF header: class (32/64-bit), endianness, machine
readelf -s libexample_arm64.so | head -n 40   # symbol table
nm -D libexample_arm64.so | head
strings libexample_arm64.so | grep -Ei "JNI_OnLoad|Java_|encrypt|decrypt|aes|rsa|cert|attest|key"

Record:

ABIs present (armeabi-v7a, arm64-v8a, x86, x86_64).
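Whether the binary is stripped (file reports "stripped" and readelf -S shows no .symtab section). Exported Java_ names live in .dynsym and survive stripping, so their absence points to runtime registration via RegisterNatives rather than to stripping.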
Presence of JNI exports (Java_com_company_Class_method) — helps map native ↔ Java.
Any strings that suggest crypto, attestation, or integrity checks.
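
The ABI inventory can be derived directly from the path list. The following is a minimal sketch, assuming the usual lib/<abi>/<name>.so layout inside the APK; the function name summarize_abis is hypothetical and not part of any tool.

```shell
# Count native libraries per ABI from APK entry paths on stdin
# (one path per line, e.g. lib/arm64-v8a/libexample.so).
# Paths outside lib/<abi>/ are ignored.
summarize_abis() {
  awk -F/ '$1 == "lib" && NF == 3 && $3 ~ /\.so$/ { count[$2]++ }
           END { for (abi in count) print abi, count[abi] }' | sort
}

# Usage (lab-only): summarize_abis < native_paths.txt
```

An ABI that ships fewer libraries than the others is worth noting in the triage record, since a library missing for one ABI usually means that code path behaves differently on those devices.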

9.3 ELF fundamentals & Android-specific headers

Know these fields and why they matter:

ELF class (32 vs 64 bit) — matches device ABI.
SONAME and DT_NEEDED: the library's own name, and the list of libraries it requires at load time.
Dynamic symbol table — exported functions available to the loader.
Relocations & PLT/GOT — points of potential hooking/patching at runtime.
Sections: .text (code), .rodata (constants), .data (writable), .bss, .dynsym, .dynstr.
Program headers: PT_LOAD segments are mapped into memory at load time (for a .so, by the dynamic linker); their flags determine memory protection (R/W/X).

Useful commands:

readelf -d libexample.so    # dynamic section (DT_NEEDED etc.)
readelf --sections libexample.so
objdump -h libexample.so

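Interpretation: If DT_NEEDED lists libcrypto.so or libssl.so, the binary links dynamically against OpenSSL or BoringSSL; statically linked crypto does not appear here and has to be found through strings or code patterns. Imported libc functions such as strcpy or memcpy (undefined dynamic symbols reached through PLT stubs) are starting points for locating potentially dangerous call sites.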
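
The DT_NEEDED check can be scripted. This is a minimal sketch that parses the text output of readelf -d; the helper names needed_libs and crypto_deps are hypothetical, and the match on libcrypto/libssl is only a name-based heuristic.

```shell
# Print the DT_NEEDED library names found in `readelf -d` output on stdin.
needed_libs() {
  sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Flag dynamically linked crypto/TLS libraries by name (heuristic only).
crypto_deps() {
  needed_libs | grep -E '^lib(crypto|ssl)\.so' || true
}

# Usage (lab-only): readelf -d libexample.so | crypto_deps
```

An empty result does not mean the library is free of crypto; it only means none is linked dynamically under those names.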

9.4 Static reverse engineering: tools & workflows

Tools

Ghidra — free, cross-platform decompiler + disassembler; excellent for ARM/ARM64.
IDA Pro — industry standard; interactive; powerful decompiler (if licensed).
radare2 / Cutter — open-source, scriptable, good for automation.
Hopper — macOS/Linux disassembler with decompiler.
objdump, strings, readelf — lightweight CLI tooling.

Workflow (recommended)

Create project (Ghidra/IDA) and import .so. Use correct processor architecture (ARM/ARM64/x86).
Run auto-analysis; let the tool identify functions, references, and strings.
Locate JNI glue: search for JNI_OnLoad or Java_ exports — these map Java callsites to native functions.
Identify crypto routines: look for AES/SHA/RSA patterns or calls to known library functions (EVP_* , RSA_, AES_).
Find integrity/attestation parsing: look for parsing functions that handle ASN.1, CBOR, or JSON; check for validation of signatures.
Annotate function signatures and rename functions where you infer meaning; build a small call-graph for security-relevant functions.
Document: function addresses, parameters, expected inputs/outputs, side effects (file I/O, network).

Tip: Ghidra’s decompiler output is easier to read than raw assembly; combine decompiler view with string and cross-reference searches. Use Search -> For Strings to find embedded constants.

9.5 Mapping native ↔ Java (JNI)

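Statically exported JNI functions follow the Java_{package_with_dots_replaced_by_underscores}_{Class}_{method} naming scheme; an underscore inside a package, class, or method name is escaped as _1.
If no Java_ exports are present, look for RegisterNatives calls: many apps register native functions at runtime under arbitrary names. Follow calls to (*env)->RegisterNatives, which are typically made from JNI_OnLoad.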
Use jadx or decompiled Java to find System.loadLibrary("example") and where native methods are declared (native keyword). This gives entry points for native analysis.
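
Once a native method declaration is found in the decompiled Java, the export name to search for can be computed. This is a minimal sketch of the JNI naming rule; jni_export_name is a hypothetical helper, and it deliberately ignores overloaded methods (which carry a signature suffix), inner classes, and non-ASCII names.

```shell
# Build the expected JNI export name for a Java native method.
# $1 = fully qualified class name (dots as separators), $2 = method name.
jni_export_name() {
  local class method
  # Escape "_" as "_1" first, then turn package separators into "_".
  class=$(printf '%s' "$1" | sed -e 's/_/_1/g' -e 's/\./_/g')
  method=$(printf '%s' "$2" | sed -e 's/_/_1/g')
  printf 'Java_%s_%s\n' "$class" "$method"
}

# Usage (lab-only):
#   nm -D libexample.so | grep "$(jni_export_name com.company.Class method)"
```

If the computed name is absent from the dynamic symbol table, the method is most likely bound through RegisterNatives.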

9.6 Common native vulnerability classes (with examples & mitigation)

Be able to spot and explain each class; do not produce exploit code for production.

9.6.1 Buffer overflow / stack overflow

Cause: unbounded strcpy, memcpy with attacker-controlled length.
Impact: code execution, control-flow hijack (local privilege escalation).
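Mitigations: -fstack-protector-strong, -D_FORTIFY_SOURCE=2, ASLR, NX, RELRO. Bounds-check every copy and prefer length-aware APIs (snprintf, strlcpy); Annex K functions such as memcpy_s are not provided by Android's Bionic libc.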
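
A quick way to find leads for this class is to look at which libc functions the library imports. The sketch below reads nm -D output; classify_imports is a hypothetical helper, the list of risky names is an illustrative assumption, and an import is only a pointer to call sites worth reviewing, not a finding in itself. The *_chk imports are the checked variants that a _FORTIFY_SOURCE build emits.

```shell
# Read `nm -D` output on stdin and report undefined (imported) symbols of interest.
# RISKY: classic unbounded libc calls. FORTIFIED: *_chk variants from _FORTIFY_SOURCE.
classify_imports() {
  awk '$1 == "U" {
         sym = $2
         sub(/@.*/, "", sym)   # drop any symbol-version suffix
         if (sym ~ /^(strcpy|strcat|sprintf|vsprintf|gets)$/) print "RISKY", sym
         else if (sym ~ /^__[A-Za-z0-9_]+_chk$/) print "FORTIFIED", sym
       }'
}

# Usage (lab-only): nm -D libexample.so | classify_imports
```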

9.6.2 Use-after-free / double-free

Cause: freeing objects then later reusing pointers.
Impact: memory corruption, information disclosure.
Mitigations: use smart pointers in C++ (unique_ptr/shared_ptr), avoid manual free patterns, enable sanitizers during testing.

9.6.3 Integer overflow / wrap-around

Cause: arithmetic that leads to incorrect allocation sizes.
Impact: buffer overflow via undersized allocation.
Mitigations: explicit integer checks, use size_t safely, sanitize inputs.

9.6.4 Format string vulnerabilities

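Cause: attacker-controlled data passed as the format string, e.g., printf(user_input).
Impact: memory disclosure or crashes.
Mitigations: always pass a constant format string, e.g., printf("%s", user_input), and build with -Wformat -Wformat-security.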

9.6.5 Improper crypto usage

Cause: hardcoded keys, IV/nonce reuse, ECB mode.
Mitigations: use vetted crypto library API correctly, use authenticated encryption (AES-GCM), never roll your own crypto.

9.6.6 Insecure native parsing (ASN.1, CBOR, JSON)

Poor parsing can lead to crashes or incorrect validation.
Mitigations: use hardened parsers, validate lengths, handle parse errors safely.

9.7 Dynamic native analysis (safe lab procedures)

Dynamic analysis reveals runtime behavior: stack frames, memory contents, and interactions.

9.7.1 Non-invasive observation: `LD_DEBUG`, `strace` (emulator/root)

strace can trace syscalls (file I/O, sockets).
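LD_DEBUG=bindings shows symbol binding on glibc-based systems; Android's Bionic linker has its own debug switches, so confirm the supported options for your platform version (lab-only).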

9.7.2 Attaching gdbserver / lldb-server (lab)

Build app with debug symbols (for labs) or use symbol store.
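Start gdbserver on the device and connect with a cross-gdb on the host (NDK toolchain). NDK r23 and later no longer ship gdb, so with current NDKs use lldb-server and the NDK's lldb instead.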
Example (lab-only):

adb forward tcp:5039 tcp:5039
adb shell gdbserver :5039 --attach <pid>
# on host with NDK gdb:
$NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android-gdb
(gdb) target remote :5039

Use breakpoints to inspect arguments and memory, but only in lab.

9.7.3 Frida-Gum (native interception)

Use Interceptor.attach or NativeFunction (see Module 6) to log native args and return values at runtime. This is often less intrusive than gdb and easier to script. Example patterns: read buffer pointers, capture plaintext before encryption.

9.7.4 Safe memory inspection

Use gdb to dump memory regions, or Frida's NativePointer readByteArray() (Memory.readByteArray() in older Frida releases). Avoid writes unless necessary: they can crash the process and corrupt forensic evidence.

Always record: process snapshot, PID, timestamp, frida/gdb versions, and APK SHA256 for reproducibility.
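
The reproducibility record can be captured in one step at the start of a session. This is a minimal sketch; write_manifest and the manifest field names are hypothetical, and the tool versions are recorded only when the tools are installed on the host.

```shell
# Write a small evidence manifest for a lab session.
# $1 = APK path, $2 = target PID, $3 = output file (all supplied by the tester).
write_manifest() {
  local apk="$1" pid="$2" out="$3"
  {
    echo "timestamp_utc: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "apk_sha256: $(sha256sum "$apk" | awk '{print $1}')"
    echo "pid: $pid"
    # Optional tool versions; skipped silently when a tool is missing.
    if command -v frida >/dev/null 2>&1; then echo "frida: $(frida --version)"; fi
    if command -v gdb >/dev/null 2>&1; then echo "gdb: $(gdb --version | head -n 1)"; fi
  } > "$out"
}

# Usage (lab-only): write_manifest app.apk 4242 evidence/manifest.txt
```

Keeping the manifest next to the captured logs lets a reviewer confirm that every artifact in the package came from the same build and session.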

9.8 Building minimal native test harnesses (defensive testing)

To validate a suspected weakness, create a small lab harness:

Extract function prototype (from native analysis).
Build a small native program (or Java wrapper) that calls the function with crafted benign inputs to observe behavior (crash/logging/return values).
Run harness in emulator or instrumented environment with ASAN/UBSAN where possible.

Note: Only use harnesses in labs; do not weaponize them.

9.9 Fuzzing native libraries

Fuzzing helps find memory-safety bugs.

Approaches:

Unit-level fuzzing: build harnesses around native entry points with libFuzzer or AFL++ and run the fuzzers against them.
Integration fuzzing: feed malformed inputs via the Java layer that propagate to native functions.
Instrumentation: build with AddressSanitizer (ASan) to catch memory errors and UndefinedBehaviorSanitizer (UBSan) to catch undefined behavior while fuzzing.

Practical notes:

Instrumentation requires rebuilding native code with proper sanitizer flags and is typically done with developer cooperation.
For closed-source native libs, consider input fuzzing through the Java interface and monitor for crashes (adb logcat, tombstones).
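
Crash monitoring during a fuzzing run can be reduced to a per-signal count. The sketch below assumes the "Fatal signal N (SIGNAME)" line that Android writes to logcat for native crashes; count_fatal_signals is a hypothetical helper.

```shell
# Summarize native crashes from logcat text on stdin, counting by signal name.
count_fatal_signals() {
  grep -oE 'Fatal signal [0-9]+ \(SIG[A-Z]+\)' \
    | sed -E 's/.*\((SIG[A-Z]+)\)/\1/' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (lab-only): adb logcat -d | count_fatal_signals
```

A rising SIGSEGV or SIGABRT count after a batch of inputs is the cue to pull the matching tombstones and map the faulting addresses back to module offsets.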

9.10 Hardening native code — practical, prioritized guidance

Compiler & linker flags (must-haves)
- -fstack-protector-strong or -fstack-protector-all
- -D_FORTIFY_SOURCE=2
- -Wl,-z,relro,-z,now (Full RELRO)
- Position-independent code: -fPIC for shared libraries (-fPIE -pie for executables), so ASLR can randomize load addresses
- -fvisibility=hidden for non-exported symbols
Memory-safety tools in CI
- Run ASAN/UBSAN on test builds.
- Integrate sanitizer-enabled test runs into CI for native modules.
Symbol management
- Strip release binaries (strip --strip-unneeded) but retain symbols in a secure symbol store (for crash diagnostics).
- Prefer dynamic JNI registration (RegisterNatives) over predictable Java_ exports; this raises reversing effort but is not a security boundary.
Use vetted crypto libraries
- BoringSSL, libsodium, or platform crypto with clear guidance rather than custom crypto.
Input validation & defensive coding
- Validate lengths, use safe copy functions, check for integer overflow when allocating.
ASLR & PIE enforcement
- Ensure the build produces PIC and the platform enforces ASLR.
Code review & threat modeling
- Native code must be subject to the same code reviews and testing as JVM/ART code.

9.11 Patch & remediation guidance (developer-ready)

For each type of native finding, produce a patch recommendation:

Buffer overflow → bounds-check inputs, replace unsafe APIs, add unit tests, compile with -D_FORTIFY_SOURCE=2, run ASAN.
Format string → fix formatting calls (printf("%s", input)), add unit tests.
Crypto misuse → switch to AEAD (AES-GCM), use nonces correctly, never hardcode keys.
Missing RELRO/PIE → update build flags and verify with readelf -l (GNU_RELRO segment), readelf -d (BIND_NOW flag), and readelf -h (Type: DYN).
Exposed JNI → use dynamic registration and hide symbols; keep symbol table minimal.

Provide example patch diffs (C/C++) in the report where possible, but do not include exploit payloads.
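
Verifying a RELRO fix can be automated so it runs on every build. This is a minimal sketch that classifies the text output of readelf; relro_status is a hypothetical helper, and it treats a GNU_RELRO segment together with immediate binding (BIND_NOW or the NOW flag) as Full RELRO.

```shell
# Classify RELRO from combined `readelf -lW` and `readelf -d` output on stdin.
# Prints one of: none, partial, full.
relro_status() {
  local text
  text=$(cat)
  if ! printf '%s\n' "$text" | grep -q 'GNU_RELRO'; then
    echo "none"
  elif printf '%s\n' "$text" | grep -Eq 'BIND_NOW|\(FLAGS_1\).*NOW'; then
    echo "full"
  else
    echo "partial"
  fi
}

# Usage (lab-only):
#   { readelf -lW libexample.so; readelf -d libexample.so; } | relro_status
```

Running the check before and after the flag change gives the developer team a simple pass/fail line to attach to the patch.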

9.12 Evidence & reporting requirements for native findings

When reporting a native security issue, include:

Binary metadata: artifact SHA256, ABI, size, date/time.
Function location: module name + offset (e.g., libexample.so + 0x1234) and annotated snippet from disassembler.
Trigger: input sequence or Java call that exercises the code path (lab-only reproduction steps).
Crash artifacts: tombstone, stack trace, core dump (hashed).
Proof of concept: safe harness or unit test that reproduces the issue in a controlled environment (do not include exploit code for production).
Remediation suggestion: code patch, compiler flags, and tests.

9.13 Automation: scripts & templates

The following lab-only script extracts every .so from an APK and writes a short triage report.

#!/usr/bin/env bash
# Lab-only triage: summarize every .so inside an APK into a markdown report.
APK="${1:?usage: $0 app.apk [outdir]}"
OUTDIR="${2:-native_report}"
REPORT="$OUTDIR/report.md"
mkdir -p "$OUTDIR"
: > "$REPORT"   # start from an empty report so re-runs do not append duplicates
unzip -l "$APK" | awk '{print $4}' | grep '\.so$' > "$OUTDIR/native_paths.txt"

# read line by line instead of word-splitting the file contents
while IFS= read -r p; do
  # prefix with the ABI directory so same-named libraries do not overwrite each other
  name="$(basename "$(dirname "$p")")_$(basename "$p")"
  unzip -p "$APK" "$p" > "$OUTDIR/$name"
  {
    echo "## $p"
    file "$OUTDIR/$name"
    readelf -h "$OUTDIR/$name" 2>&1
    readelf -d "$OUTDIR/$name" 2>&1
    printf '\nSymbols (top):\n'
    readelf -s "$OUTDIR/$name" | head -n 40
    printf '\nStrings (interesting):\n'
    strings "$OUTDIR/$name" | grep -Ei "JNI_OnLoad|Java_|encrypt|decrypt|aes|rsa|attest|key|secret" | sort -u
    printf '\n\n'
  } >> "$REPORT"
done < "$OUTDIR/native_paths.txt"

sha256sum "$APK" > "$OUTDIR/apk.sha256"
echo "Report written to $REPORT"

This produces a starting point for triage; expand with Ghidra export steps as needed.

9.14 Labs (authorized, step-by-step outlines)

Lab 9-A — Native triage & mapping

Objective: extract all .so, identify JNI bridges, produce call-map linking Java ↔ native functions.
Steps:
1. Extract .so files (commands above).
2. Use strings and readelf to list JNI exports.
3. Open in Ghidra and annotate JNI_OnLoad and RegisterNatives calls.
Deliverable: Call-map (markdown) and native_report.md.
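
A first draft of the call-map for statically exported methods can be generated from the symbol table. This is a minimal sketch; jni_call_map is a hypothetical helper, it reverses only the _1 underscore escape, and it leaves other escapes and overload suffixes untouched. Methods bound through RegisterNatives still have to be added by hand from the Ghidra annotations.

```shell
# Turn JNI exports from `nm -D` output (stdin) into "java.name <- symbol" rows.
jni_call_map() {
  awk '$2 == "T" && $3 ~ /^Java_/ { print $3 }' \
    | while IFS= read -r sym; do
        # "#" is a temporary placeholder for escaped underscores ("_1").
        java=$(printf '%s' "${sym#Java_}" | sed -e 's/_1/#/g' -e 's/_/./g' -e 's/#/_/g')
        printf '%s <- %s\n' "$java" "$sym"
      done
}

# Usage (lab-only): nm -D libexample.so | jni_call_map >> callmap.md
```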

Lab 9-B — Static crypto review

Objective: find native crypto usage and evaluate correct API usage.
Steps:
1. In Ghidra, find functions referencing AES, RSA, EVP_ etc.
2. Check modes, IV handling, and key material sources.
Deliverable: Crypto assessment with remediation.

Lab 9-C — Dynamic native observation with Frida-Gum

Objective: log plaintext before native encryption in a test APK.
Steps:
1. Identify JNI export that does encryption.
2. Use Frida Interceptor.attach in lab to log input pointer content (safe, read-only).
3. Correlate with Java callsites and network traffic.
Deliverable: Frida log + evidence package (scripts, timestamps, APK sha256).

9.15 Deliverables for Module 9

native_report.md — triage summary for all .so files.
Annotated Ghidra project (lab-only) with notes on security-relevant functions.
Crash artifacts & harnesses (lab-only) where applicable.
Prioritized remediation list with code-level suggestions and CI/build flags.
Fuzzing / sanitizer playbook for native modules.

9.16 Common pitfalls & guidance for reviewers

Mistaking stripped symbols for “no risk” — stripped binaries hide names but not logic; require deeper analysis.
Relying only on CLI tools — readelf/strings are starting points; use a decompiler for real assessment.
Assuming native equals secure — native increases complexity and attack surface; apply strict testing.
Not preserving symbols — always keep a secure symbol store for post-release diagnostics and forensic analysis.

9.17 Further reading & resources

Android NDK docs — build flags and recommended compiler toolchains.
Ghidra and radare2/Cutter tutorials for ARM/ARM64.
OWASP Mobile Application Security Testing Guide (MASTG), in particular the Android reverse engineering and tampering material.
Papers and blog posts about common memory-safety bugs and mitigations (ASAN, FORTIFY).