Module 15 — Data Storage Security & Local Forensics in Android

Scope & ethics reminder: All analysis, extraction, and tests described here are for authorized assessments and lab environments only. Never perform these actions against production user devices or systems without explicit written permission.

This expanded Module 15 goes beyond the high-level overview. It provides commands, scripts, forensic workflows, developer-safe remediation code, CI checks, test cases, report templates, evidence-handling procedures, and regulatory guidance. Use it for authorized pentest engagements, red-team exercises in a lab, or hardening production apps through secure design and automated checks.

Table of contents (quick)

  1. Storage types & security implications
  2. Full enumeration commands & automated discovery scripts
  3. Deep inspection: SharedPreferences, SQLite, Files, Cache, Keystore, WebView, External storage, Media, Backups
  4. Forensic evidence collection: manifest, hashing, chain-of-custody, timestamps
  5. Reproducible lab extraction methods (emulator vs device, run-as, root, adb backup caveats)
  6. Automatic detection (grep/regex, static analysis, JADX patterns)
  7. Practical remediation: secure code samples (EncryptedSharedPreferences, SQLCipher, Keystore, secure deletion)
  8. Special cases: SDKs, logs/crash reporters, clipboard, screenshots, content providers, FileProvider, WebView cache
  9. Secure deletion and secure wipe limitations on Android filesystems
  10. CI/instrumentation tests & lint rules to prevent insecure storage commits
  11. Privacy, retention, and regulatory notes (GDPR, PCI-DSS implications)
  12. Evidence & report templates, severity mapping, example findings
  13. Lab exercises & reproducible test vectors
  14. Appendix: helper scripts (hashing, artifact bundling, redaction)
  15. Final recommendations (executive & developer)

1. Storage types — security implications (detailed)

  • Internal app storage (/data/data/<pkg>/)
    • Default app sandbox; protected by Linux UID.
    • Threat: rooted device, adb run-as misconfigured, debuggable builds.
    • Key locations:
      • shared_prefs/ (XML)
      • databases/ (SQLite .db files)
      • files/ (app files)
      • cache/ (cache)
      • app_webview/ (WebView data on some devices)
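  • External storage (/sdcard/, including /sdcard/Android/data/<pkg>/ and /sdcard/Android/media/<pkg>/)
    • Historically world-readable; scoped storage now limits access, but misuse remains common.
    • Threat: on older Android versions any installed app holding the storage permission can read these files. Newer versions block access to other apps' Android/data directories, but files written to shared media collections remain readable by apps with media permissions.
  • Scoped storage
    • Apps targeting Android 10+ get scoped storage. Developers can opt out with requestLegacyExternalStorage, but the flag is ignored once the app targets Android 11 (API 30) or higher.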
  • ContentProvider / FileProvider exposures
    • Exposing files via content:// URIs may leak files to other apps if permissions/grants are wrong.
  • Keystore / TEE / StrongBox
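    • Key material is held in hardware-backed storage where available and cannot be exported by the app. Protection still depends on correct usage: keys generated inside the Keystore (not imported or cached elsewhere) and user-authentication gating for high-value keys.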
  • Backups (Android Auto Backup / ADB backup)
    • Auto Backup can include app data unless disabled; set android:allowBackup="false" (or narrow it with backup rules) to opt out. adb backup is deprecated on newer Android versions but still works on older devices.
  • Crash reporters / analytics SDKs
    • Some SDKs capture logs, breadcrumbs, and may send PII to 3rd parties.

2. Full enumeration commands & automated discovery

Use these commands to fully enumerate storage contents and metadata in an evidence-friendly way (UTC timestamps, hashes).

2.1 Prereqs / safety

  • Sync clocks to NTP on host and device.
  • Use a fresh emulator snapshot or a dedicated lab device.
  • Run all commands in a case-specific working directory.

2.2 Basic environment manifest (one-liner)

date -u +"%Y-%m-%dT%H:%M:%SZ" > env_manifest.txt
adb shell getprop ro.build.product >> env_manifest.txt
adb shell getprop ro.build.version.release >> env_manifest.txt
adb shell uname -a >> env_manifest.txt

2.3 Enumerate internal storage

APP=com.example.app
OUT=case_artifacts
mkdir -p "$OUT"
adb shell "run-as $APP sh -c 'ls -la /data/data/$APP | sed -n \"1,200p\"'" > "$OUT/ls_root.txt"
# toybox find on most Android builds has no -printf, so stat formats the same fields (path|size|mtime|mode)
adb shell "run-as $APP find /data/data/$APP -type f -exec stat -c '%n|%s|%y|%A' {} +" > "$OUT/file_index.txt"

If run-as fails, capture the device state and note root status — do not brute force.
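Once file_index.txt exists, a short host-side parser makes it easier to sort by size or spot loose permissions. The sketch below assumes each line holds four pipe-separated fields (path, size, modification time, symbolic mode such as -rw-rw----); the names `IndexEntry` and `parse_index` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexEntry:
    path: str
    size: int
    mtime: str
    mode: str

    @property
    def world_readable(self) -> bool:
        # In a symbolic mode like "-rw-r--r--", index 7 is the "other: read" bit.
        return len(self.mode) >= 10 and self.mode[7] == "r"


def parse_index(text: str) -> List[IndexEntry]:
    """Parse file_index.txt content; lines that do not fit the format are skipped."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so a '|' inside a file name does not break parsing.
        parts = line.rsplit("|", 3)
        if len(parts) != 4 or not parts[1].isdigit():
            continue  # e.g. error output such as "Permission denied"
        path, size, mtime, mode = parts
        entries.append(IndexEntry(path, int(size), mtime, mode))
    return entries
```

A typical use is `[e.path for e in parse_index(text) if e.world_readable]` to list files that other UIDs could read.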

2.4 Pull and hash artifacts (preserve and hash)

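# Copy app internal dirs to host (if run-as allowed).
# exec-out streams raw bytes; on older adb versions, adb shell can translate line endings and corrupt binary output.
adb exec-out "run-as $APP tar -C /data/data/$APP -cf - ." > "$OUT/${APP}_data.tar"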
sha256sum "$OUT/${APP}_data.tar" > "$OUT/${APP}_data.sha256"

2.5 List shared prefs & DBs quickly

adb shell "run-as $APP ls -la /data/data/$APP/shared_prefs" > "$OUT/shared_prefs_list.txt"
adb shell "run-as $APP ls -la /data/data/$APP/databases" > "$OUT/databases_list.txt"

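2.6 Rooted extraction (lab-only fallback)

If run-as is unavailable but the device is rooted in the lab (adb root works only on userdebug/eng builds, including emulator images without Google Play):

adb root
adb exec-out "tar -C /data/data/$APP -cf - ." > "$OUT/${APP}_data_rooted.tar"

Obtain written permission before using root, then record the root use and the reason. Hash artifacts immediately.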

3. Deep inspection: each storage artifact type

3.1 SharedPreferences (XML)

  • Location: /data/data/<pkg>/shared_prefs/*.xml
  • Typical misuses:
    • Storing plaintext tokens, passwords, PII.
    • MODE_WORLD_READABLE (deprecated since API 17 and rejected with a SecurityException on Android 7.0+, but still found on old devices).
  • How to inspect:
adb shell run-as $APP cat /data/data/$APP/shared_prefs/config.xml > "$OUT/config.xml"
  • Patterns to search (grep):
grep -iE "(token|access|refresh|password|secret|api_key|auth)" "$OUT/config.xml" || true
  • Remediation: use EncryptedSharedPreferences (see code later).
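The grep above matches anywhere in the file. Parsing the XML instead lets you report the exact preference key and its type. The following is a minimal sketch under the assumption that the pulled file uses the standard SharedPreferences layout (a `<map>` root whose children carry a `name` attribute); `find_sensitive_prefs` is a hypothetical helper name and the keyword list mirrors the grep pattern.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Tuple

SENSITIVE = re.compile(r"token|access|refresh|password|secret|api_key|auth", re.IGNORECASE)


def find_sensitive_prefs(xml_bytes: bytes) -> List[Tuple[str, str, str]]:
    """Return (type, key, value) for entries whose key name looks sensitive."""
    root = ET.fromstring(xml_bytes)
    hits = []
    for child in root:
        name = child.get("name", "")
        # <string> stores its value as element text; other types use a value attribute.
        value = child.text if child.tag == "string" else child.get("value", "")
        if SENSITIVE.search(name):
            hits.append((child.tag, name, value or ""))
    return hits
```

Read the pulled file in binary mode (`open(path, "rb").read()`) so the parser can honor the encoding declared in the XML header. Redact the returned values before they go into a report.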

3.2 SQLite databases

  • Location: /data/data/<pkg>/databases/*.db
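  • Inspect (pull a copy and query it on the host; most production images ship no sqlite3 binary):
adb exec-out run-as $APP cat /data/data/$APP/databases/app.db > "$OUT/app.db"
sqlite3 "$OUT/app.db" ".schema" > "$OUT/app_schema.sql"
sqlite3 "$OUT/app.db" "select * from users limit 10;" > "$OUT/users_sample.txt"
  • Also pull app.db-wal and app.db-shm if present; recent writes may exist only in the write-ahead log.
  • Checks:
    • Are tokens, credentials, or card PAN fragments stored in plaintext?
    • A PRAGMA key call in the app code (or a SQLCipher dependency) indicates an encrypted DB.
  • If the DB is encrypted with SQLCipher, you need the key or developer cooperation. Evidence: the file does not start with the "SQLite format 3" header, and sqlite3 reports "file is not a database".
  • Remediation: use SQLCipher with a random key that is wrapped by a Keystore key, never a hardcoded passphrase.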
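Both checks can be scripted against a pulled copy of the database. This sketch uses only the Python standard library: `is_plain_sqlite` tests for the 16-byte plaintext SQLite header, and `sensitive_columns` lists columns whose names look risky. Both function names and the keyword list are hypothetical, and the name matching is deliberately naive.

```python
import pathlib
import re
import sqlite3
from typing import List, Tuple

SQLITE_MAGIC = b"SQLite format 3\x00"
SENSITIVE_COL = re.compile(r"token|password|secret|card", re.IGNORECASE)


def is_plain_sqlite(path: str) -> bool:
    """True when the file starts with the standard plaintext SQLite header."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def sensitive_columns(path: str) -> List[Tuple[str, str]]:
    """Return (table, column) pairs whose column name looks sensitive."""
    # mode=ro opens the evidence copy read-only.
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    hits = []
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        for table in tables:
            quoted = table.replace('"', '""')
            for row in conn.execute(f'PRAGMA table_info("{quoted}")'):
                column = row[1]  # table_info columns: cid, name, type, ...
                if SENSITIVE_COL.search(column):
                    hits.append((table, column))
    finally:
        conn.close()
    return hits
```

If `is_plain_sqlite` returns False, treat the file as possibly encrypted and skip the column scan rather than forcing it open.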

3.3 Files and caches (files/, cache/)

  • Look for:
    • Exported reports, PDFs, images with embedded PII.
    • Temporary files containing JSON responses, logs, or tokens.
  • Pull examples:
adb exec-out run-as $APP tar -C /data/data/$APP/files -cf - . > "$OUT/files.tar"
  • Check timestamps, unlinked temp files, and cleanup behavior on logout.
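After extracting files.tar on the host, a pattern scan can point to temporary files that still hold tokens. The sketch below walks an extracted directory and looks for two heuristic byte patterns: JWT-shaped strings (three base64url segments, the first starting with eyJ) and Bearer header values. The pattern set and the name `scan_tree` are assumptions for illustration, and both patterns will produce false positives.

```python
import os
import re
from typing import Dict, List

# Heuristic byte-level patterns; extend or tighten them for your engagement.
PATTERNS = {
    "jwt": re.compile(rb"eyJ[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}"),
    "bearer": re.compile(rb"(?i)bearer\s+[A-Za-z0-9._~+/-]{16,}"),
}


def scan_tree(root: str) -> Dict[str, List[str]]:
    """Map each relative file path to the names of the patterns found in it."""
    findings = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                data = fh.read()
            hits = [label for label, rx in PATTERNS.items() if rx.search(data)]
            if hits:
                findings[os.path.relpath(full, root)] = hits
    return findings
```

Run it once right after login and again after logout; files that still match after logout indicate missing cleanup.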

3.4 WebView data & cookies

  • Location varies by Android/WebView implementation (e.g., /data/data/<pkg>/app_webview/).
  • Inspect cookie databases or cache.
  • Attack surface: webviews may store session cookies or localStorage with sensitive tokens.

3.5 External storage & media

  • adb shell ls -R /sdcard/Android/data/$APP/
  • Look for exported attachments or exported logs.
  • If app saves PDFs to external storage, verify access controls and expiration.

3.6 Keystore usage

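  • Keystore key material cannot be pulled from the device, so verify usage through code review:
    • Does the app generate keys in the Keystore (check for KeyGenerator.getInstance(..., "AndroidKeyStore"))?
    • Keys generated there are non-exportable by design; check the KeyGenParameterSpec for purposes, block modes, user-authentication requirements, and StrongBox backing (setIsStrongBoxBacked).
    • Watch for keys created outside the Keystore (for example, a SecretKeySpec built from hardcoded or stored bytes).
  • Remediation: keep keys in the Keystore and use them to encrypt data directly or to wrap a random data-encryption key.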

3.7 FileProvider & content providers

  • Manifest check: AndroidManifest.xml provider entries.
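  • Confirm the provider is not exported, that grantUriPermissions is used for per-URI grants, and that the <paths> definitions are narrow (no root-path and no whole-directory "." entries).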

3.8 Backups

  • Check AndroidManifest.xml for android:allowBackup.
  • If true (the default when the attribute is absent), app data may be backed up to the cloud. For banking apps, set it to false.
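The manifest checks from 3.7 and 3.8 are easy to automate once the manifest has been decoded to plain-text XML (for example by a decompiler; the binary form inside the APK will not parse). This is a minimal sketch with a hypothetical function name, `audit_manifest`, and it only flags providers that are explicitly exported.

```python
import xml.etree.ElementTree as ET
from typing import List

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def audit_manifest(xml_text: str) -> List[str]:
    """Return human-readable findings for a decoded AndroidManifest.xml."""
    root = ET.fromstring(xml_text)
    findings = []
    app = root.find("application")
    if app is None:
        return findings
    # allowBackup defaults to true when the attribute is absent.
    if app.get(ANDROID_NS + "allowBackup", "true") == "true":
        findings.append("allowBackup is enabled (explicitly or by default)")
    if app.get(ANDROID_NS + "debuggable") == "true":
        findings.append("application is debuggable (run-as will work)")
    for provider in app.findall("provider"):
        name = provider.get(ANDROID_NS + "name", "?")
        if provider.get(ANDROID_NS + "exported") == "true":
            findings.append(f"provider {name} is exported")
    return findings
```

An empty list does not prove the manifest is safe; it only means none of these three conditions matched.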

4. Forensic evidence collection: manifest, hashing, chain-of-custody

4.1 Evidence policy basics

  • Use a Case ID for grouping artifacts (CASE-YYYYMMDD-XXX).
  • Compute SHA-256 for every artifact as soon as it is pulled.
  • Log who performed each action (operator), timestamp (UTC), command used, and device serial.

4.2 Example evidence register CSV columns

case_id,file_name,relative_path,sha256,size_bytes,timestamp_utc,creator,tool_version,notes
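Filling the register by hand invites mistakes, so it helps to append rows from a script at the moment an artifact is pulled. The sketch below writes the columns listed above with Python's csv module; `register_artifact` and `sha256_file` are hypothetical helper names, and the operator and tool version values are whatever your lab records.

```python
import csv
import hashlib
import os
from datetime import datetime, timezone

COLUMNS = ["case_id", "file_name", "relative_path", "sha256", "size_bytes",
           "timestamp_utc", "creator", "tool_version", "notes"]


def sha256_file(path: str) -> str:
    """Hash a file in 1 MiB chunks so large artifacts do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def register_artifact(register_csv, case_dir, rel_path, case_id, creator,
                      tool_version, notes=""):
    """Append one artifact row to the register, writing the header on first use."""
    full_path = os.path.join(case_dir, rel_path)
    row = {
        "case_id": case_id,
        "file_name": os.path.basename(rel_path),
        "relative_path": rel_path,
        "sha256": sha256_file(full_path),
        "size_bytes": os.path.getsize(full_path),
        "timestamp_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "creator": creator,
        "tool_version": tool_version,
        "notes": notes,
    }
    write_header = (not os.path.exists(register_csv)
                    or os.path.getsize(register_csv) == 0)
    with open(register_csv, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

The register is opened in append mode, which matches the append-only handling expected of evidence logs.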

4.3 Hashing example

sha256sum "$OUT/${APP}_data.tar" | awk '{print $1}' > "$OUT/${APP}_data.tar.sha256"

4.4 Chain-of-custody note

  • Keep original copies read-only (or in append-only storage).
  • Work on copies.
  • Use secure storage (S3 with object lock or a vault).
  • Document any root use; root introduces additional risk to evidence integrity — capture a full device image if possible.
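Working on copies only helps if you can show the copies still match the originals. A small verifier can re-hash every file named in the evidence register and report drift. This sketch assumes the register is a CSV with at least the relative_path and sha256 columns from 4.2; `verify_register` is a hypothetical name.

```python
import csv
import hashlib
import os
from typing import Dict


def sha256_file(path: str) -> str:
    """Hash a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_register(register_csv: str, case_dir: str) -> Dict[str, str]:
    """Map each relative_path to 'ok', 'mismatch', or 'missing'."""
    status = {}
    with open(register_csv, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            rel = row["relative_path"]
            path = os.path.join(case_dir, rel)
            if not os.path.isfile(path):
                status[rel] = "missing"
            elif sha256_file(path) != row["sha256"].strip().lower():
                status[rel] = "mismatch"
            else:
                status[rel] = "ok"
    return status
```

Running this before and after each analysis session gives a simple, repeatable integrity check to cite in the chain-of-custody log.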

5. Reproducible lab extraction methods

5.1 Emulator vs real device

  • Emulator: easy to snapshot and reset; some protections differ (Play services may be limited).
  • Real device: closer to production behavior; more realistic for attestation, Keystore, StrongBox behavior.

5.2 run-as vs rooted extraction

  • run-as requires the target app to be debuggable (android:debuggable="true"); it does not work against release builds. It is preferred because it runs under the app's UID without rooting.
  • If run-as fails, obtain operator permission and use a rooted device or emulator snapshot for extraction; always document root steps.

5.3 Full filesystem images (lab only, for deep forensics)

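  • On rooted lab devices only, stream the partition to the host instead of writing it to /sdcard, which lives on the same userdata partition and would alter the evidence: adb exec-out "su -c 'dd if=/dev/block/by-name/userdata 2>/dev/null'" > userdata.img
  • On devices with file-based encryption (the default on current Android releases) the raw image holds ciphertext, so plan a logical, file-level extraction as well.
  • Hash the image, keep it read-only, and analyze copies with mount tools or Sleuth Kit as needed.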

5.4 Snapshot workflow

  1. Snapshot before test (emulator snapshot or device image).
  2. Run test interactions.
  3. Pull artifacts and hash.
  4. Restart/restore snapshot for next test case.
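To see what a single test interaction changed, compare the hash manifests taken before and after it. The sketch below assumes each manifest is text with one "path,sha256" pair per line (the layout produced by the file walker in the appendix); `load_manifest` and `diff_snapshots` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class SnapshotDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    changed: List[str]


def load_manifest(text: str) -> Dict[str, str]:
    """Parse 'path,sha256' lines into a dict; malformed lines are ignored."""
    entries = {}
    for line in text.splitlines():
        if "," not in line:
            continue
        # Split on the last comma so commas inside a path are preserved.
        path, sha = line.rsplit(",", 1)
        entries[path] = sha.strip()
    return entries


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> SnapshotDiff:
    """Report files that appeared, disappeared, or changed content."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return SnapshotDiff(added, removed, changed)
```

Files listed under `added` after a login step are the first place to look for freshly written tokens.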

6. Automatic detection — code scanning & heuristics

6.1 Grep/regex patterns (fast)

Search decompiled source or repository for risky patterns:

# Hardcoded keys
grep -RIn --exclude-dir={.git,build} -E "api[_-]?key|secret|password|privateKey|hardcod" .

# SharedPreferences usage
grep -RIn "getSharedPreferences" .

# Keystore and insecure ciphers
grep -RInE "Cipher.getInstance|SecretKeySpec|AES/ECB|AES/CBC|MD5|SHA1" .

# File save to external
grep -RIn "getExternalStorage|Environment.getExternalStorage" .

6.2 Static analysis tools & rules

  • MobSF (Mobile Security Framework): automated APK scanning, reveals strings, insecure storage patterns.
  • QARK, Semgrep: write custom rules to detect risky API usage (e.g., getSharedPreferences(..., MODE_WORLD_READABLE)).
  • Android Lint & custom lint rules: detect allowBackup=true, missing android:exported, insecure file writes, use of FileOutputStream to external paths.

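6.3 Example Semgrep rule (sketch; tune before use)

Flags putString calls on a SharedPreferences editor when the key name looks like a credential. It cannot tell plain from encrypted preferences, so triage hits manually:

rules:
- id: sharedprefs-plaintext-token
  languages: [java]
  severity: WARNING
  message: "Storing tokens or credentials in SharedPreferences without encryption"
  patterns:
    - pattern: $PREFS.edit().putString($KEY, $VALUE).apply();
    - metavariable-regex:
        metavariable: $KEY
        regex: (?i).*(token|password|secret).*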

7. Practical remediation — secure code samples

7.1 EncryptedSharedPreferences (AndroidX Jetpack) — Kotlin

// build.gradle: implementation "androidx.security:security-crypto:1.1.0-alpha03" (check latest)
val masterKey = MasterKey.Builder(context)
    .setKeyScheme(MasterKey.KeyScheme.AES256_GCM)
    .build()

val securePrefs = EncryptedSharedPreferences.create(
    context,
    "secure_prefs",
    masterKey,
    EncryptedSharedPreferences.PrefKeyEncryptionScheme.AES256_SIV,
    EncryptedSharedPreferences.PrefValueEncryptionScheme.AES256_GCM
)

securePrefs.edit().putString("refresh_token", token).apply()

Notes

  • MasterKey is stored in Keystore; keys non-exportable where possible.
  • Prefer per-user keys and require biometric gating for high-value secrets.

7.2 SQLCipher for Android (example)

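  • Add the SQLCipher dependency and open the DB with a key protected by the Keystore:
// Classic android-database-sqlcipher artifact; newer releases ship as net.zetetic:sqlcipher-android with different class names.
import net.sqlcipher.database.SupportFactory

// getDbKeyFromKeystore() is an app-specific helper: unwrap a random DB key with a Keystore key
val passphrase: ByteArray = getDbKeyFromKeystore()
// SupportFactory takes the raw key bytes; SQLiteDatabase.getBytes() only converts CharArray input
val factory = SupportFactory(passphrase)
val db = Room.databaseBuilder(context, AppDatabase::class.java, "secure.db")
    .openHelperFactory(factory)
    .build()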

Key management

  • Store the DB key only in wrapped form (encrypted by a Keystore key), or derive it at runtime from a device-bound private key.

7.3 Keystore-backed encryption for files (Kotlin)

Example pattern: generate AES-GCM key in Keystore (via AndroidKeyStore), then use it to encrypt/decrypt file content.

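import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.spec.GCMParameterSpec

// Key generation (only once; later runs load the key by alias from the "AndroidKeyStore" KeyStore)
val keyGen = KeyGenerator.getInstance(KeyProperties.KEY_ALGORITHM_AES, "AndroidKeyStore")
val spec = KeyGenParameterSpec.Builder("file_key",
    KeyProperties.PURPOSE_ENCRYPT or KeyProperties.PURPOSE_DECRYPT)
    .setBlockModes(KeyProperties.BLOCK_MODE_GCM)
    .setEncryptionPaddings(KeyProperties.ENCRYPTION_PADDING_NONE)
    .setKeySize(256)
    .build()
keyGen.init(spec)
val key = keyGen.generateKey()

// Encrypt: the Keystore provider generates a fresh random IV on each init
val cipher = Cipher.getInstance("AES/GCM/NoPadding")
cipher.init(Cipher.ENCRYPT_MODE, key)
val iv = cipher.iv
val ciphertext = cipher.doFinal(plainBytes)
// Store iv + ciphertext in app-private file

// Decrypt: reuse the stored IV with a 128-bit authentication tag
val decryptCipher = Cipher.getInstance("AES/GCM/NoPadding")
decryptCipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(128, iv))
val recovered = decryptCipher.doFinal(ciphertext)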

7.4 Biometric gating for key use

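Require user authentication for key operations by adding these calls to the KeyGenParameterSpec.Builder:

.setUserAuthenticationRequired(true)
// API 30+: a timeout of 0 means every use of the key requires authentication
.setUserAuthenticationParameters(0, KeyProperties.AUTH_BIOMETRIC_STRONG)
// API 23-29 equivalent: .setUserAuthenticationValidityDurationSeconds(-1)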

7.5 Proper logging hygiene

  • Never Log.d("TAG", password) or print entire responses. Use structured logs with redaction.
  • Example helper:
fun redact(value: String?, keepLast: Int = 4): String {
  if (value == null) return "null"
  val len = value.length
  if (len <= keepLast) return "*".repeat(len)
  return "*".repeat(len - keepLast) + value.takeLast(keepLast)
}

7.6 API tokens: recommended storage & refresh

  • Access tokens: keep in memory or short-lived storage; refresh token encrypted with Keystore.
  • On logout, delete the local tokens and, where possible, their Keystore keys, and revoke the tokens server-side.

8. Special cases — SDKs, logs, clipboard, webviews

8.1 Third-party SDKs

  • Audit SDKs for data collection. Some auto-capture logs/stack traces that include PII.
  • Combine Gradle dependency scanning with runtime checks: capture network traffic in the lab with SDK debug modes enabled and inspect what is sent.

8.2 Crash reporters (Crashlytics, Sentry)

  • Configure to scrub PII and disable capturing entire responses. Use beforeSend hooks to redact.

8.3 Clipboard leaks

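  • Avoid placing sensitive tokens/passwords on the clipboard. If it is unavoidable, clear the clipboard after a short period (ClipboardManager.clearPrimaryClip(), API 28+) and mark the clip as sensitive on Android 13+ (ClipDescription.EXTRA_IS_SENSITIVE).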

8.4 Screenshots & FLAG_SECURE

  • Use window.setFlags(WindowManager.LayoutParams.FLAG_SECURE, WindowManager.LayoutParams.FLAG_SECURE) on sensitive screens to prevent screenshots and screen recording.

8.5 WebView localStorage & cookie jar

  • Do not store long-lived tokens in localStorage; prefer server-managed cookies with HttpOnly if using web endpoints.
  • Clear WebView state on logout: CookieManager.getInstance().removeAllCookies(null), WebStorage.getInstance().deleteAllData() for localStorage, and webView.clearCache(true).

9. Secure deletion & filesystem caveats

9.1 Filesystem realities

  • Android filesystems (ext4, f2fs) do not guarantee secure erase at file delete — data may persist until overwritten.
  • Wear-leveling and flash characteristics mean ‘overwrite’ may not ensure removal.

9.2 Practical secure-wipe approach (best-effort)

  • Overwrite the file content in place with random bytes of the same length, then delete:
import java.io.File
import java.io.RandomAccessFile
import java.security.SecureRandom

fun secureDelete(file: File) {
  val random = SecureRandom()
  val buffer = ByteArray(4096)
  // RandomAccessFile does not truncate on open (FileOutputStream would, leaving nothing to overwrite)
  RandomAccessFile(file, "rw").use { raf ->
    var remaining = raf.length()
    raf.seek(0)
    while (remaining > 0) {
      random.nextBytes(buffer)
      val toWrite = minOf(buffer.size.toLong(), remaining).toInt()
      raf.write(buffer, 0, toWrite)
      remaining -= toWrite
    }
    raf.fd.sync() // push the overwrite to storage before unlinking
  }
  file.delete()
}
  • Limitations: not guaranteed on flash storage. For strong privacy requirements, prefer encrypting at rest and deleting the key (cryptographic erase).

9.3 Cryptographic erase (recommended)

  • If data is encrypted, deleting the encryption key (from Keystore) renders ciphertext practically unreadable.
  • Generate one master key for encrypting files; on logout or wipe, call keystore.deleteEntry(alias).

10. CI, unit & instrumentation tests to prevent insecure commits

10.1 Lint rules & Gradle checks

  • Enforce android:allowBackup="false" in manifest.
  • Fail build if getExternalStorage usage detected.
  • Detect Log.v/d/e statements containing keywords like password, token via grep pre-commit hook.
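The log-statement check from the last bullet can run as a pre-commit hook. The sketch below is line-based and intentionally simple: it flags any line that contains both an android.util.Log call and a sensitive keyword. The names `find_risky_logs` and `check_files` and the keyword list are hypothetical; multi-line log calls will be missed.

```python
import re
from typing import Iterable, List, Tuple

LOG_CALL = re.compile(r"\bLog\.[vdiwe]\s*\(")
KEYWORD = re.compile(r"password|passwd|token|secret", re.IGNORECASE)


def find_risky_logs(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for log calls that mention a keyword."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if LOG_CALL.search(line) and KEYWORD.search(line):
            hits.append((lineno, line.strip()))
    return hits


def check_files(paths: Iterable[str]) -> int:
    """Print offending lines and return an exit code (1 should block the commit)."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as fh:
            for lineno, line in find_risky_logs(fh.read()):
                print(f"{path}:{lineno}: {line}")
                failed = True
    return 1 if failed else 0
```

In a hook, pass the staged .kt and .java files to `check_files` and hand its return value to `sys.exit`.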

10.2 Unit / instrumentation tests

  • Tests to assert that SharedPreferences do not contain sensitive keys in plaintext.
  • Example Espresso test to simulate login, then check shared_prefs for expected encrypted keys.

10.3 Pre-merge Semgrep / MobSF checks

  • Add MobSF and Semgrep scanning to the PR pipeline; block PRs with findings marked HIGH (hardcoded keys, insecure AES/ECB, allowBackup true).

10.4 Example GitHub Action (pseudo)

  • Run gradle assembleDebug
  • Run MobSF scan via container
  • Run semgrep rules
  • Fail on high-severity results.
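The final "fail on high-severity" step needs a small gate that reads the scanner output and sets the exit code. This sketch assumes a Semgrep JSON report (as written by `semgrep --json`) with a top-level results list whose items carry check_id, path, start.line, and extra.severity; verify the field names against the Semgrep version you run. `blocking_findings` and the `BLOCKING` set are hypothetical.

```python
import json
from typing import List

# Severities that should fail the pipeline; adjust to your policy.
BLOCKING = {"ERROR", "HIGH", "CRITICAL"}


def blocking_findings(report_json: str) -> List[str]:
    """Return 'path:line rule-id' strings for findings at a blocking severity."""
    report = json.loads(report_json)
    blocked = []
    for result in report.get("results", []):
        severity = str(result.get("extra", {}).get("severity", "")).upper()
        if severity in BLOCKING:
            line = result.get("start", {}).get("line")
            blocked.append(f"{result.get('path')}:{line} {result.get('check_id')}")
    return blocked
```

A CI step can print the returned list and exit with status 1 when it is non-empty.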

11. Privacy, retention & regulatory considerations

11.1 GDPR & PII

  • Minimize storage of personal data; document lawful basis.
  • Provide mechanisms for data export and deletion.
  • Enforce retention limits on the device: encrypt stored personal data, restrict access to it, and delete it when the retention period ends.

11.2 PCI-DSS (for card data)

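  • Avoid storing full PANs on the device; if one must be stored, it has to be rendered unreadable (strong encryption, truncation, or tokenization), and PAN fragments must follow PCI guidance. Never store sensitive authentication data (CVV/CVC, full track data, PINs) after authorization.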
  • Strong cryptography, key management, and periodic reviews are required.

11.3 Logging & retention

  • Redact PII from logs; only keep aggregate telemetry.
  • Define retention schedules and deletion workflows.

12. Evidence & report templates, severity mapping

12.1 Severity mapping (storage-specific)

  • Critical: plaintext long-lived credentials or card PAN stored on external storage / unencrypted DB; secrets exported to 3rd-party analytics.
  • High: refresh tokens stored in plaintext in SharedPreferences or in files inside /data/data that are reachable with low effort (run-as or a debuggable build).
  • Medium: temporary cache includes PII but cleared within short timeframe; logs with sensitive fields but not full credentials.
  • Low: debug flags, non-sensitive metadata.
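Encoding the mapping as a function keeps ratings consistent between analysts. The sketch below covers only the combinations listed above and returns "Unrated" for everything else, so unlisted cases still get human judgment. The label vocabulary (`data_class`, `location`) and the function name `severity` are hypothetical.

```python
def severity(data_class: str, location: str) -> str:
    """Map a storage finding to the severity levels defined in 12.1.

    data_class: credential, pan, secret, refresh_token, pii,
                sensitive_field, debug_flag, metadata
    location:   external, unencrypted_db, third_party, shared_prefs,
                internal_file, cache, log
    """
    # Critical: long-lived credentials or PAN in exposed storage,
    # or any secret exported to a third party.
    if data_class in {"credential", "pan"} and location in {"external", "unencrypted_db"}:
        return "Critical"
    if data_class in {"credential", "pan", "secret", "refresh_token"} and location == "third_party":
        return "Critical"
    # High: plaintext refresh tokens inside the app sandbox.
    if data_class == "refresh_token" and location in {"shared_prefs", "internal_file"}:
        return "High"
    # Medium: short-lived PII in cache, or sensitive fields in logs.
    if data_class == "pii" and location == "cache":
        return "Medium"
    if data_class == "sensitive_field" and location == "log":
        return "Medium"
    # Low: debug flags and non-sensitive metadata, wherever they are stored.
    if data_class in {"debug_flag", "metadata"}:
        return "Low"
    return "Unrated"  # not covered by the mapping; needs analyst review
```

For example, `severity("refresh_token", "shared_prefs")` yields "High", matching the example finding in the next section.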

12.2 Example report snippet (finding)

Finding: Plaintext refresh token stored in SharedPreferences
Location: /data/data/com.example.app/shared_prefs/auth_prefs.xml
Evidence: auth_prefs.xml (sha256: <hash>), logcat capture showing token usage (sha256: <hash>)
Reproduction steps:
  1. Install app v1.2.3 on lab emulator snapshot CASE-...
  2. Perform login using test account
  3. Pull /data/data/com.example.app/shared_prefs/auth_prefs.xml
Impact: Refresh token theft allows long-lived session hijack.
Remediation: Move refresh token to EncryptedSharedPreferences + Keystore; rotate refresh tokens server-side.

13. Lab exercises & reproducible test vectors

13.1 Exercise A — Find tokens in SharedPreferences

  • Deploy intentionally vulnerable test app (dev build).
  • Login with test account.
  • Use run-as to pull shared_prefs and grep for token|refresh.
  • Evidence: file extract, hash, screenshot.

13.2 Exercise B — SQLCipher integration test

  • Replace plain SQLite with SQLCipher in sample app.
  • Verify schema unreadable without key; store key in Keystore and demonstrate decryption works.

13.3 Exercise C — Keystore-backed file encryption & cryptographic erase

  • Implement file encryption using Keystore and test that deleting key renders content inaccessible.

13.4 Exercise D — Crash report redaction

  • Configure Sentry to redact password and verify no PII is sent.

14. Appendix — helper scripts

14.1 File walker + hash (bash)

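#!/usr/bin/env bash
# Usage: walker.sh <package> <out_dir>
set -u
APP=$1
OUT=$2
mkdir -p "$OUT"
# exec-out streams raw bytes; </dev/null stops adb from consuming the loop's stdin
adb exec-out "run-as $APP find /data/data/$APP -type f" < /dev/null |
  while IFS= read -r f; do
    dest="$OUT/$(printf '%s' "$f" | tr '/' '_')"
    # Single quotes protect spaces in file names on the device side
    adb exec-out "run-as $APP cat '$f'" < /dev/null > "$dest"
    sha=$(sha256sum "$dest" | awk '{print $1}')
    printf '%s,%s\n' "$f" "$sha" >> "$OUT/manifest.csv"
  done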

14.2 Artifact bundler (python)

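# bundler.py - checksum collected artifacts, write a manifest, and compress
# Usage: python bundler.py <artifact_dir>
import hashlib, json, os, subprocess, sys

def sha256_file(path, chunk=1 << 20):
    # Stream in chunks so large images do not have to fit in memory
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()

out = os.path.normpath(sys.argv[1])
os.makedirs(out, exist_ok=True)
# assume we already have files in out; compute checksums
manifest = []
for root, _, files in os.walk(out):
    for name in sorted(files):
        if name == "manifest.json":
            continue  # do not hash a manifest left over from an earlier run
        path = os.path.join(root, name)
        manifest.append({"path": os.path.relpath(path, out), "sha256": sha256_file(path)})
with open(os.path.join(out, "manifest.json"), "w") as fh:
    json.dump(manifest, fh, indent=2)
# The archive is written next to the directory, not inside it
subprocess.run(["tar", "-C", out, "-czf", f"{out}.tgz", "."], check=True)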

14.3 Redaction helper (python)

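# redact.py - redact token-like strings in files by regex
# Usage: python redact.py <file> [<file> ...]
import re, sys
pattern = re.compile(r'[A-Za-z0-9_\-]{20,}')  # naive: any run of 20+ token characters
for fname in sys.argv[1:]:
    with open(fname, encoding="utf-8", errors="replace") as fh:
        txt = fh.read()
    # Write next to the input; the original file is left untouched
    with open(fname + ".redacted", "w", encoding="utf-8") as fh:
        fh.write(pattern.sub('***REDACTED***', txt))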

(Use better regexes for production redaction and avoid over-redaction.)

15. Final recommendations (executive & developer)

  • Encrypt everything: use Keystore-backed encryption for all secrets and SQLCipher for DBs storing sensitive data.
  • Minimize retention: keep tokens short-lived and clear caches on logout.
  • Sandbox & QA: ensure production builds are non-debuggable and allowBackup=false.
  • CI gates: block PRs introducing insecure storage patterns via static analysis and Lint.
  • Telemetry & monitoring: alert on unusual local data persistence patterns and provide a re-test window.
  • Forensics readiness: keep manifest & artifact retention policy and train teams on chain-of-custody.