Module 2 — Reconnaissance & Information Gathering

This module teaches how to gather actionable intelligence about an Android app and its backend in a safe, repeatable, and auditable way. Reconnaissance is where you build the map: package identifiers, network endpoints, app behavior, server infrastructure, and any external resources the app depends on. The goal is a prioritized, evidence-backed view that feeds static and dynamic analysis and enables targeted testing in later modules.

Reminder: All commands and labs below are for authorized, isolated test environments and intentionally vulnerable test apps (or production apps strictly within your written scope). Do not run any tests against systems you don’t have explicit permission to test.

Learning objectives

After Module 2 you will be able to:

  • Obtain APKs reliably (Play Store, device extraction, vendor-supplied).
  • Produce a reproducible fingerprint of an APK (package name, signer fingerprint, SDK targets).
  • Enumerate backend hosts, endpoints, and certificate characteristics using static artifacts and passive scanning.
  • Detect evidence of certificate pinning and attestation usage in static code.
  • Produce a prioritized reconnaissance report (artifact list, attack surface matrix, risk hypotheses).

2.0 Recon workflow (high level)

  1. Collect artifacts (APK, Play Store metadata, release notes, sample accounts).
  2. Fingerprint the APK (package, cert, targetSdk, permissions).
  3. Static surface discovery (strings, manifest, embedded endpoints, third-party SDKs).
  4. Passive infrastructure mapping (DNS, registration, hosting, CDN, subdomains).
  5. Dynamic discovery in a lab (proxy captures, network traces) — only with lab CA installed.
  6. Analysis & prioritization: list attack surface components and hypotheses for verification.

2.1 Sources & artifact collection

A. Public/OSINT sources (do first)

  • Play Store listing: app description, developer, app id (package name), supported regions, updated date, changelog.
  • APK repositories: APKMirror, APKPure (only for lab; verify legitimacy & hashes).
  • GitHub/GitLab/StackOverflow: public repos or leaked code/config.
  • Company website & developer docs: API docs, developer keys, subdomains.
  • Third-party SDK docs: analytics, crash reporting (these expand the attack surface).

Why: public metadata often reveals package names, API hosts, SDK versions and sometimes test credentials.

B. Acquiring the APK (authorized)

  1. From Play Store (lab or authorized vendor builds)
    • Use vendor-provided signed APKs or Play Console artifacts if available.
    • If allowed and needed, use controlled Play Store downloaders in your lab; always validate downloaded APK SHA256 against vendor-provided checksum.
  2. From a device (adb pull)
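# list installed packages (filter by keyword)
adb shell pm list packages | grep -i bank

# find the APK path(s) for the package (split APKs print several lines)
adb shell pm path com.example.bank

# pull the APK to the local machine; use the exact path printed by `pm path`
# (the install directory suffix varies by device and Android version)
adb pull /data/app/com.example.bank-1/base.apk ./app.apk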

Note: pulling from device may require the device to be set up in the lab and the app to be installed in the test account. Respect device policies.

  3. From artifact repository / CI
    • Ask devs to provide the release APK and the signing certificate fingerprint. Store artifacts with checksums.
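
The checksum comparison mentioned above can be scripted so that it is repeatable. The following is a minimal sketch in Python, assuming the vendor supplies the expected SHA-256 as a hex string; the function names are illustrative and not part of any tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_vendor_checksum(path, expected_hex):
    """Compare the file's SHA-256 with the vendor-provided value.

    Surrounding whitespace and letter case in the vendor value are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch should stop the workflow: record both values and ask the vendor which build was intended before analysing anything.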

2.2 APK fingerprinting & metadata (commands + interpretation)

Quick fingerprint (minimal)

# show package name, version, sdk
aapt dump badging app.apk

# verify signer and show certs (apksigner)
apksigner verify --print-certs app.apk

# list files inside
unzip -l app.apk

What to record:

  • package: name (e.g., com.example.bank)
  • versionName / versionCode
  • targetSdkVersion and minSdkVersion
  • Signing certificate SHA-256 / SHA-1 (critical for attestation, Play Integrity)
  • Presence of lib/ (native code), assets/, res/, and AndroidManifest.xml
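
To make the fingerprint reproducible, the fields above can be pulled out of the `aapt dump badging` text rather than copied by hand. This is a rough sketch, assuming the usual badging layout (`package: name='…' versionCode='…'`, `sdkVersion:'…'`, `targetSdkVersion:'…'`); `parse_badging` is a hypothetical helper, and output details may differ between aapt versions.

```python
import re


def parse_badging(text):
    """Extract package, version, and SDK fields from `aapt dump badging` output."""
    info = {"permissions": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("package:"):
            # e.g. package: name='com.example.bank' versionCode='42' versionName='1.4.2'
            fields = dict(re.findall(r"(\w+)='([^']*)'", line))
            info["package"] = fields.get("name")
            info["versionCode"] = fields.get("versionCode")
            info["versionName"] = fields.get("versionName")
        elif line.startswith("sdkVersion:"):
            # aapt reports the minimum SDK level under the key "sdkVersion"
            info["minSdkVersion"] = line.split(":", 1)[1].strip("'")
        elif line.startswith("targetSdkVersion:"):
            info["targetSdkVersion"] = line.split(":", 1)[1].strip("'")
        elif line.startswith("uses-permission:"):
            match = re.search(r"name='([^']*)'", line)
            if match:
                info["permissions"].append(match.group(1))
    return info
```

Feeding it the saved badging output (for example the contents of a text file captured in the lab) yields a dictionary that can go straight into the artifact manifest.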

Extracting manifest and basic checks

# decode manifest to human readable XML
apktool d app.apk -o app_unpacked

# inspect manifest
less app_unpacked/AndroidManifest.xml

Look for: exported components, allowBackup, sharedUserId, custom permissions, and uses-permission list.
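
The manifest checks listed above can also be run programmatically against the decoded XML. The sketch below is one possible approach using Python's standard XML parser; `review_manifest` is an illustrative name, and the "implicitly exported" rule is a heuristic (a component with an intent filter and no explicit `android:exported` is exported by default only when the app targets below API 31).

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "activity-alias", "service", "receiver", "provider")


def review_manifest(xml_text):
    """Summarise backup, shared UID, permissions, and exported components."""
    root = ET.fromstring(xml_text)
    app = root.find("application")
    report = {
        "package": root.get("package"),
        "sharedUserId": root.get(ANDROID_NS + "sharedUserId"),
        "allowBackup": app.get(ANDROID_NS + "allowBackup") if app is not None else None,
        "permissions": [p.get(ANDROID_NS + "name") for p in root.findall("uses-permission")],
        "exported": [],
    }
    if app is None:
        return report
    for tag in COMPONENT_TAGS:
        for comp in app.findall(tag):
            exported = comp.get(ANDROID_NS + "exported")
            has_filter = comp.find("intent-filter") is not None
            # Flag explicit exports, plus components that may be implicitly
            # exported because they declare an intent filter (heuristic).
            if exported == "true" or (exported is None and has_filter):
                implicit = exported is None
                report["exported"].append((tag, comp.get(ANDROID_NS + "name"), implicit))
    return report
```

Each tuple in `exported` is (component type, name, implicit flag); implicit entries deserve a manual check against the app's targetSdkVersion.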

2.3 Strings, endpoints & secrets discovery (static)

Common commands

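# quick grep for URLs, API keys, or sensitive keywords
# (DEX files are usually compressed inside the APK, so stream them out first)
unzip -p app.apk 'classes*.dex' | strings | grep -Ei "https?://|api|endpoint|token|key|secret|client"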

# search unpacked resources
grep -R --line-number -i "http" app_unpacked | head
grep -R --line-number -i "key\|secret\|token" app_unpacked | head

Use JADX for logic context

  • Open jadx-gui app.apk and search for:
    • HttpClient, OkHttpClient, CertificatePinner, KeyStore, getDeviceId, getSerial, getAndroidId, ANDROID_ID, KEY_, ATTESTATION, PlayIntegrity, SafetyNet, attest.
  • Examine call sites where endpoints or keys are used — this helps map which flows are security-critical.

What to capture:

  • Hostnames and full endpoints (record file + line)
  • Hardcoded API keys or client secrets (file + line + hash of string)
  • Third-party SDK endpoints (Crashlytics, analytics) — useful for privacy surface
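
Recording file, line, and a hash for each hit is easy to automate. The following sketch, with the hypothetical helper `harvest_endpoints`, scans text that has already been read from an unpacked file and returns one record per URL; the regular expression is deliberately simple and will miss obfuscated or concatenated URLs.

```python
import hashlib
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[^\s\"'<>]*)?")


def harvest_endpoints(file_name, text):
    """Return one record per URL found: file, line number, URL, host, SHA-256."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in URL_RE.finditer(line):
            url = match.group(0)
            findings.append({
                "file": file_name,
                "line": line_no,
                "url": url,
                "host": urlsplit(url).hostname,  # lower-cased, port removed
                "sha256": hashlib.sha256(url.encode("utf-8")).hexdigest(),
            })
    return findings
```

The same record shape works for hardcoded keys: store the hash and location in the report and keep the raw secret only in the access-controlled evidence package.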

2.4 Backend & infrastructure enumeration (passive mapping)

Use static results (endpoints) and public tools to map servers and domains.

DNS, WHOIS, and hosting

Commands (lab host):

# resolve and list IPs
dig +short api.example.com

# get TLS cert info
echo | openssl s_client -connect api.example.com:443 -servername api.example.com 2>/dev/null | openssl x509 -noout -text

# whois
whois example.com

Subdomain enumeration (passive & active)

  • Passive: crt.sh, VirusTotal, Passive DNS (via OSINT tools) — gather subdomain candidates.
  • Active (authorized only): run a limited subdomain discovery (avoid aggressive scanning).

Why: identify staging, dev, or misconfigured endpoints (e.g., api-dev.example.com).

TLS characteristics

  • Check the certificate chain, issuer, validity period, and SANs. Note whether the certificate was issued by a public CA, a private (internal) CA, or a CDN provider.
  • Record whether the API sends an HSTS header and which TLS versions and cipher suites it supports (use sslyze or openssl).
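
For a repeatable record of the certificate fields above, the peer certificate can be fetched and summarised with Python's standard `ssl` module. This is a sketch under assumptions: `fetch_peer_cert` and `summarize_cert` are illustrative names, the host must be in scope, and `getpeercert()` only returns a populated dictionary when the chain validates against the local trust store.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_peer_cert(host, port=443, timeout=5.0):
    """Connect to host:port; return (certificate dict, negotiated TLS version)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert(), tls.version()


def summarize_cert(cert):
    """Reduce a getpeercert() dict to the fields worth recording."""
    def flatten(name):
        # Names arrive as a tuple of RDNs, each a tuple of (key, value) pairs.
        return {key: value for rdn in name for key, value in rdn}

    expiry = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    return {
        "subject_cn": flatten(cert.get("subject", ())).get("commonName"),
        "issuer_org": flatten(cert.get("issuer", ())).get("organizationName"),
        "sans": [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"],
        "not_after": expiry.isoformat(),
    }
```

In the lab this would be called as `summarize_cert(fetch_peer_cert("api.example.com")[0])`; the SAN list is often the quickest way to spot sibling hosts such as staging or dev endpoints.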

2.5 Detecting attestation & pinning usage (static + clues)

Static indicators to look for

  • Play Integrity / SafetyNet: strings such as com.google.android.play.core.integrity, IntegrityManager, SafetyNet, or attestation.
  • Key Attestation / Keystore: references to KeyAttestation or Keymaster, KeyStore or KeyPairGenerator used with the AndroidKeyStore provider, and KeyGenParameterSpec.Builder.setAttestationChallenge(...), which requests an attestation certificate chain for a key.
  • Certificate pinning: presence of CertificatePinner, OkHttpClient.Builder().certificatePinner(...), a <pin-set> element in a Network Security Config XML resource, or custom TrustManager / HostnameVerifier classes.

Search examples:

# search for OkHttp CertificatePinner or a Network Security Config <pin-set>
grep -R --line-number -i "CertificatePinner\|pin-set" app_unpacked || true

# search for "PlayIntegrity", "IntegrityManager", "SafetyNet", "attest"
grep -R --line-number -i "PlayIntegrity\|IntegrityManager\|SafetyNet\|attest" app_unpacked || true

# search for AndroidKeyStore usage
grep -R --line-number -i "AndroidKeyStore\|KeyStore\|KeyAttestation" app_unpacked || true

Interpretation: the presence of these strings shows only that the client references these APIs; it does not prove that the server validates anything. Request the attestation tokens (see the Module 13 forensic checklist) and confirm that they are validated server-side.
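
The grep searches can be folded into one pass that groups hits by category, which makes the recon report easier to fill in. A minimal sketch follows; the pattern lists are assumptions drawn from the indicators named in this section and are not exhaustive, and `find_indicators` is a hypothetical helper.

```python
import re

# Category -> regular expressions; extend for the SDKs used by the app under test.
INDICATORS = {
    "pinning": [r"CertificatePinner", r"<pin-set", r"X509TrustManager", r"HostnameVerifier"],
    "play_integrity": [r"play[./]core[./]integrity", r"IntegrityManager"],
    "safetynet": [r"SafetyNet"],
    "key_attestation": [r"setAttestationChallenge"],
    "keystore": [r"AndroidKeyStore"],
}


def find_indicators(text):
    """Map each indicator category to the line numbers where it appears."""
    hits = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        for category, patterns in INDICATORS.items():
            if any(re.search(pattern, line) for pattern in patterns):
                hits.setdefault(category, []).append(line_no)
    return hits
```

A category with no hits is not proof of absence: obfuscated or native implementations will not match these names.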

2.6 Passive network capture in lab (authorized, CA installed)

If permitted and in your lab (with lab CA installed), capture an initial session to observe flows and tokens.

Basic proxy capture with mitmproxy

# run mitmproxy (example)
mitmproxy -w capture.mitm

# set emulator/device proxy to lab-proxy IP:port and install the lab CA (covered in Module 0)
# run app actions and then save the capture

Packet capture with tcpdump on proxy VM

sudo tcpdump -i any -s 0 -w session.pcap host api.example.com
# stop capture after reproducing use case

Important: if certificate pinning is present, TLS handshakes through the proxy will fail, or the client may refuse to connect or bypass the proxy entirely. Do not attempt to bypass pinning on production systems. Instead, record the evidence (errors and exceptions in logcat) and request vendor attestation tokens.
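
Recording pinning evidence from logcat can be semi-automated. The sketch below looks for a few exception messages commonly seen when a proxy certificate is rejected (OkHttp's "Certificate pinning failure!" and the platform's "Trust anchor for certification path not found"); the pattern list and the name `classify_tls_errors` are assumptions, and apps with custom networking stacks may log different text.

```python
import re

# Ordered from most to least specific; the first match labels the line.
TLS_FAILURE_PATTERNS = [
    ("pinning_failure", re.compile(r"SSLPeerUnverifiedException.*[Pp]inning")),
    ("untrusted_ca", re.compile(r"Trust anchor for certification path not found")),
    ("handshake_failure", re.compile(r"SSLHandshakeException")),
]


def classify_tls_errors(log_lines):
    """Return (line number, label, text) for log lines that indicate TLS failures."""
    evidence = []
    for line_no, line in enumerate(log_lines, start=1):
        for label, pattern in TLS_FAILURE_PATTERNS:
            if pattern.search(line):
                evidence.append((line_no, label, line.rstrip()))
                break
    return evidence
```

Run it over a saved logcat file (for example `classify_tls_errors(open(path, errors="replace"))`) and attach the matching lines, with their line numbers, to the finding.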

2.7 Dynamic artifact collection (logs & tokens)

During authorized dynamic tests, collect:

  • adb logcat output (timestamped) capturing app startup and transactions.
adb logcat -v threadtime > logcat_$(date -u +"%Y%m%dT%H%M%SZ").log
  • Proxy traces (mitmproxy/Burp) and PCAPs (tcpdump).
  • Any attestation or token strings seen in the app’s network flows (copy the base64 token for server-side validation).
  • Device properties the app may report at runtime (ro.build.*, ro.boot.*). If you need them, collect with care and only in the lab:
adb shell getprop | grep -Ei "ro\.build|ro\.boot|ro\.product|ro\.secure"

Store each artifact with SHA-256 and metadata (who collected, when, host/tool/version).
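
One way to keep that metadata consistent is to generate a small record per artifact at collection time. The following is a sketch only; the field names and the helper `artifact_record` are assumptions, and the timestamp format mirrors the one used for the logcat file name above.

```python
import getpass
import hashlib
import json
import platform
from datetime import datetime, timezone


def artifact_record(path, tool, tool_version, collector=None):
    """Build a metadata record (hash, who, when, host, tool) for one artifact."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "path": path,
        "sha256": digest.hexdigest(),
        "collected_by": collector or getpass.getuser(),
        "collected_at": datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ"),
        "host": platform.node(),
        "tool": tool,
        "tool_version": tool_version,
    }


def to_json(records):
    """Serialise records for the evidence package."""
    return json.dumps(records, indent=2, sort_keys=True)
```

Writing the JSON next to the artifacts lets a reviewer re-hash every file later and confirm nothing changed after collection.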

2.8 Prioritization: building an attack surface matrix

For each finding, capture:

  • Component (e.g., exported content provider, endpoint /auth/token, embedded key)
  • Evidence (file/line/pcap/log reference)
  • Impact hypothesis (what an attacker could do if exploited)
  • Likelihood (high/medium/low based on exposure & complexity)
  • Recommended verification (static proof, lab reproduction, require vendor artifact)

Example row:

Component | Evidence | Impact | Likelihood | Verify
content://com.example.bank.provider/accounts | app_unpacked/AndroidManifest.xml:L123 | data leak of account list | Medium | attempt to read using test app or request vendor to describe access control
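
If the matrix is kept as data rather than free text, prioritization becomes a sort. This is a minimal sketch, assuming the five columns above and the high/medium/low likelihood scale; the class and function names are illustrative.

```python
from dataclasses import dataclass

LIKELIHOOD_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Finding:
    component: str
    evidence: str
    impact: str
    likelihood: str
    verify: str


def prioritize(findings):
    """Sort findings high -> medium -> low; unknown ratings go last.

    The sort is stable, so findings with equal likelihood keep their input order.
    """
    return sorted(
        findings,
        key=lambda f: LIKELIHOOD_ORDER.get(f.likelihood.lower(), len(LIKELIHOOD_ORDER)),
    )
```

Likelihood alone is a coarse key; teams that also rate impact could sort on a (likelihood, impact) tuple instead.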

2.9 OSINT & supply-chain: checking third-party SDKs

  • List SDKs found in build.gradle, lib/, or strings.
  • Check SDK versions against known CVEs (Snyk, NVD) — automated tools like MobSF can help.
  • Identify whether crash/analytics SDKs transmit PII (privacy risk).

Commands (example to list SDKs by package names in decompiled code):

grep -R --line-number -i "com.crashlytics\|com.google.firebase\|io.sentry" app_unpacked | head
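
For the SDK inventory, the same idea can be expressed as a lookup over file paths or class names. The prefix table below is a small assumed sample taken from the grep above, not a complete catalogue, and `sdks_from_paths` is a hypothetical helper.

```python
# Package prefix (slash form) -> display name; extend as new SDKs are identified.
SDK_PREFIXES = {
    "com/crashlytics": "Crashlytics",
    "com/google/firebase": "Firebase",
    "io/sentry": "Sentry",
}


def sdks_from_paths(paths):
    """Return the sorted names of known SDKs seen in smali paths or class names."""
    found = set()
    for path in paths:
        # Accept both "com/google/firebase/X.smali" and "com.google.firebase.X".
        normalized = "/" + path.replace("\\", "/").replace(".", "/") + "/"
        for prefix, name in SDK_PREFIXES.items():
            if "/" + prefix + "/" in normalized:
                found.add(name)
    return sorted(found)
```

Matching on whole path segments avoids false hits such as a directory that merely ends in "io" followed by "sentry".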

2.10 Automation & reproducibility (scripts & MobSF)

  • Use MobSF for quick automated static results (APK unpacking, manifest, API strings, known insecure patterns). It’s a great triage tool for labs.
  • Create small reproducible scripts to:
    • Extract package and cert info.
    • Search for endpoints & suspicious strings.
    • Produce a JSON summary used by your tracking system.

Example minimal script (bash) to extract basic metadata:

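#!/usr/bin/env bash
# Collect basic APK metadata into metadata.txt and a checksum into apk.sha256.
set -euo pipefail
APK="${1:?usage: $0 <apk>}"
# package name, version, SDK levels, permissions
aapt dump badging "$APK" > metadata.txt
# signer certificate digests
apksigner verify --print-certs "$APK" >> metadata.txt
# first 60 lines of the archive listing
unzip -l "$APK" | sed -n '1,60p' >> metadata.txt
# checksum for the artifact manifest
sha256sum "$APK" > apk.sha256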

2.11 What to ask vendors / what evidence to request

If a third-party penetration test reports findings, request the following (Module 13 expands on this):

  • Original APK they tested (SHA256).
  • Exact device/emulator model, OS, kernel version, bootloader state.
  • Raw adb logcat and proxy captures (PCAP, Burp logs).
  • Any attestation tokens with nonces used during test (base64).
  • Steps to reproduce within their lab (high-level + artifacts) and an environment manifest.
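
When a vendor hands over a token, a quick structural check helps confirm it is what they say it is before any server-side validation. The sketch below assumes the token is in JOSE compact form: a signed token (JWS, as used by SafetyNet attestation responses) has three dot-separated segments, while an encrypted token (JWE, as Play Integrity tokens typically are) has five. It decodes only the unprotected header and verifies nothing; `describe_compact_token` is an illustrative name.

```python
import base64
import json


def describe_compact_token(token):
    """Report segment count, JWS/JWE guess, and decoded header (no verification)."""
    segments = token.strip().split(".")
    kind = {3: "JWS", 5: "JWE"}.get(len(segments), "unknown")
    header = None
    if kind != "unknown":
        # base64url segments omit padding; restore it before decoding.
        padded = segments[0] + "=" * (-len(segments[0]) % 4)
        try:
            header = json.loads(base64.urlsafe_b64decode(padded))
        except ValueError:
            header = None  # not valid base64url/JSON; keep the raw token as evidence
    return {"kind": kind, "segments": len(segments), "header": header}
```

The header's `alg` (and `enc` for encrypted tokens) is worth recording alongside the nonce the vendor claims to have used.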

2.12 Deliverables for Module 2

Produce a Recon Report containing:

  1. Artifact manifest: APK(s), cert fingerprints, checksums, manifest snapshot.
  2. Endpoint inventory: list of API hosts, endpoints, observed tokens, and TLS characteristics.
  3. SDK & third-party list: versions and potential impacts.
  4. Exported components inventory: with lines and risk comments.
  5. Attack surface matrix (prioritized).
  6. Evidence package: raw PCAPs, logcats, and JSON summary used for follow-on testing.

2.13 Labs (recommended exercises)

  • Lab A — APK fingerprint & string harvest
    • Use a test APK: run the fingerprinting commands, extract endpoints and keys, and produce the artifact manifest.
  • Lab B — Passive backend mapping
    • Given endpoints from Lab A, run dig, fetch cert info, and map hosting (CDN vs origin). Produce a report of TLS chain and recommended server-side checks.
  • Lab C — Proxy capture (with lab CA)
    • Configure an emulator to use your lab proxy, perform sample app flows, capture mitmproxy/Burp logs, and record observed headers and tokens. Note failures / TLS handshake anomalies.

2.14 Common pitfalls & detection of deceptive indicators

  • False positives from strings: a string that looks like a key might be a placeholder — always cross-check in code call sites.
  • Emulator vs real device differences: Play Integrity and SafetyNet often behave differently on emulators — don’t conclude absence solely from emulator behavior.
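  • Obfuscation: ProGuard/R8 rename classes, methods, and fields, and some obfuscators also encrypt string literals, so name-based searches can miss code. Follow call sites and search for framework and cryptographic API calls (which cannot be renamed), not only for names.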
  • Misleading vendor claims: if a vendor says “we validated server attestation” ask for the server validation logs and the attestation token — don’t rely on client-only screenshots.

2.15 Quick commands cheat sheet (recap)

# Acquire from device (pull the exact path printed by `pm path`)
adb shell pm path com.example.bank
adb pull /data/app/com.example.bank-1/base.apk ./app.apk

# Basic APK metadata
aapt dump badging app.apk
apksigner verify --print-certs app.apk
sha256sum app.apk

# Unpack & search
apktool d app.apk -o app_unpacked
unzip -p app.apk 'classes*.dex' | strings | grep -Ei "https?://|api|token|key|secret"
grep -R --line-number -i "CertificatePinner\|pin-set\|PlayIntegrity\|IntegrityManager\|SafetyNet\|attest" app_unpacked

# TLS & host info
dig +short api.example.com
echo | openssl s_client -connect api.example.com:443 -servername api.example.com 2>/dev/null | openssl x509 -noout -text

# Capture logs in lab
adb logcat -v threadtime > logcat_$(date -u +"%Y%m%dT%H%M%SZ").log
sudo tcpdump -i any -s 0 -w session.pcap host api.example.com