2.2 Footprinting Sources & Techniques

Key Takeaways

  • Google hacking (dorking) uses operators like site:, filetype:, inurl:, intitle:, and cache: to surface exposed directories, login portals, and documents the organisation never meant to publish
  • WHOIS reveals registrant, registrar, and contact data plus name servers; DNS footprinting maps A, AAAA, MX, NS, CNAME, TXT, and SOA records to discover hosts and mail/name servers
  • An unrestricted DNS zone transfer (AXFR) can dump the entire zone — every internal hostname and IP — and is one of the highest-impact footprinting misconfigurations
  • Network-range footprinting uses Regional Internet Registry (RIR) and routing data (traceroute, BGP) to identify the public IP blocks an organisation owns
  • Countermeasures include registrar privacy, restricting AXFR to authorised secondaries, split-horizon DNS, stripping document metadata, and sanitising email headers and server banners
Last updated: June 2026

Search Engines and Google Hacking

Search engines are the first footprinting source. Google hacking (or Google dorking) uses advanced search operators to find content that search engines have indexed without the organisation realising. The operators CEH tests:

Operator | What it finds | Example
site: | Pages limited to one domain | site:target.com
filetype: (or ext:) | Documents of a type | site:target.com filetype:pdf
inurl: | A string in the URL | inurl:admin login
intitle: | A string in the page title | intitle:"index of"
cache: | Google's cached copy | cache:target.com
link: / related: | Linking / similar sites | related:target.com

The Google Hacking Database (GHDB), maintained by Exploit-DB, catalogues thousands of pre-built dorks for exposed cameras, login portals, configuration files, and error messages that leak versions. Operators combine with boolean logic and exclusion (-) to refine results, and the same techniques work on Bing (ip:), DuckDuckGo, and specialised engines. To counter dorking, do not rely on robots.txt (it is a hint, not a control, and it actually advertises the paths you want hidden); instead require authentication, remove directory listing, and request removal of already-indexed sensitive URLs through search-console tools.
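How the operators combine is easiest to see as string composition. The following is a minimal sketch, not a tool from the source: the `build_dork` helper and its parameters are hypothetical names chosen for illustration, and it only assembles a query string for use during an authorised assessment.

```python
def build_dork(site=None, filetype=None, inurl=None, intitle=None, exclude=()):
    """Compose a search query string from common operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if intitle:
        # Quote multi-word titles so they are matched as a phrase.
        value = f'"{intitle}"' if " " in intitle else intitle
        parts.append(f"intitle:{value}")
    # A leading minus excludes a term from the results.
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_dork(site="target.com", filetype="pdf"))
    print(build_dork(site="target.com", intitle="index of", exclude=["www"]))
```

Scoping every query with site: keeps results limited to the domain under assessment, which is usually the first refinement to apply.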

WHOIS, DNS, and Network-Range Footprinting

WHOIS is a query/response protocol (TCP port 43) for registration databases held by registrars and Regional Internet Registries. A WHOIS lookup returns the registrant and contact details, registrar, creation/expiry dates, and the domain's name servers — a direct pivot into DNS.
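The query/response exchange is simple enough to sketch with a raw socket. This is an illustrative sketch under stated assumptions: `whois.iana.org` is used only as an example server, the helper names are hypothetical, and the "key: value" layout handled by `parse_fields` varies between registries, so real responses may need different handling.

```python
import socket


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The request is a single line terminated by CRLF.
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(response):
    """Collect 'key: value' lines into a dict of lists (keys lower-cased)."""
    fields = {}
    for line in response.splitlines():
        line = line.strip()
        # Skip blanks, '%' comment lines, and lines without a key.
        if not line or line.startswith("%") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if value.strip():
            fields.setdefault(key.strip().lower(), []).append(value.strip())
    return fields
```

Calling `parse_fields(whois_query("target.com"))` and reading the name-server entries gives the pivot into DNS described above, although the exact field name depends on the registry.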

DNS footprinting enumerates a domain's resource records with tools like nslookup, dig, host, and dnsenum. The records that matter:

Record | Reveals
A / AAAA | IPv4 / IPv6 address of a host
MX | Mail servers (and their priority)
NS | Authoritative name servers
CNAME | Aliases (often expose cloud providers)
TXT | SPF/DKIM, ownership-verification tokens
SOA | Zone serial, primary NS, admin email
PTR | Reverse DNS (IP → hostname)
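Each record type in the table corresponds to a numeric type code in the DNS wire format, which is what tools like dig and nslookup send on your behalf. The sketch below builds a single-question query packet with the standard library only; the function names and the fixed query ID are assumptions made for illustration, and it does not send anything on the network.

```python
import ipaddress
import struct

# Numeric record types as defined in the DNS specifications.
QTYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "PTR": 12,
          "MX": 15, "TXT": 16, "AAAA": 28}


def encode_name(name):
    """Encode 'www.target.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, rtype, query_id=0x1234):
    """Build a one-question DNS query with the 'recursion desired' flag set."""
    # Header: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPES[rtype], 1)  # class IN
    return header + question


def ptr_name(ip):
    """Return the reverse-lookup name to query for a PTR record."""
    return ipaddress.ip_address(ip).reverse_pointer
```

A PTR lookup is just an ordinary query against the reversed name, so `build_query(ptr_name("192.0.2.10"), "PTR")` produces the reverse-DNS request for that address.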

DNS zone transfer (AXFR)

A zone transfer replicates the full zone from a primary name server to a secondary. If a server allows AXFR to any requester, an attacker dumps every record in the zone — internal hostnames, IPs, and the whole network map — in one query (dig axfr @ns1.target.com target.com). This is one of the highest-value footprinting wins.
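To see why a full zone dump is so valuable, it helps to group the returned records by type. The sketch below parses text in the usual zone presentation layout (name, TTL, class, type, data); `SAMPLE` is hand-written illustrative data using documentation addresses, not output captured from any real server, and `parse_zone_dump` is a hypothetical helper name.

```python
from collections import defaultdict

# Hand-written sample in zone presentation layout (hypothetical data).
SAMPLE = """\
; illustrative records only
target.com.      3600 IN SOA ns1.target.com. admin.target.com. 2024010101 7200 900 1209600 3600
target.com.      3600 IN NS  ns1.target.com.
dev.target.com.  3600 IN A   192.0.2.15
vpn.target.com.  3600 IN A   192.0.2.20
"""


def parse_zone_dump(text):
    """Group records from zone-style text by record type."""
    by_type = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):  # skip blanks and comments
            continue
        parts = line.split(None, 4)  # name, TTL, class, type, rdata
        if len(parts) < 5 or parts[2] != "IN":
            continue
        name, _ttl, _cls, rtype, rdata = parts
        by_type[rtype].append((name.rstrip("."), rdata))
    return dict(by_type)


if __name__ == "__main__":
    for name, address in parse_zone_dump(SAMPLE).get("A", []):
        print(name, address)
```

Defenders can run the same kind of summary against their own zone to see exactly what an unrestricted transfer would hand to an outsider.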

Network-range footprinting identifies the public IP blocks an organisation owns by querying the Regional Internet Registries (RIRs): ARIN (North America), RIPE NCC (Europe, the Middle East, and parts of Central Asia), APNIC (Asia-Pacific), LACNIC (Latin America and the Caribbean), and AFRINIC (Africa). Route tracing (traceroute/tracert) and BGP/ASN data complement the registry records.
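Once registry lookups have produced a list of CIDR blocks, the standard `ipaddress` module is enough to size them and to check whether a discovered host falls inside them. This is a minimal sketch: the helper names are hypothetical and the blocks used in the example are documentation ranges, not any real organisation's allocation.

```python
import ipaddress


def summarise_blocks(cidrs):
    """Return (network, address count) for each CIDR string."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(n), n.num_addresses) for n in nets]


def owning_block(ip, cidrs):
    """Return the first CIDR block that contains ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for c in cidrs:
        if addr in ipaddress.ip_network(c):
            return c
    return None


if __name__ == "__main__":
    blocks = ["192.0.2.0/24", "198.51.100.0/25"]  # documentation ranges
    print(summarise_blocks(blocks))
    print(owning_block("192.0.2.77", blocks))
```

The same membership check is useful for scoping: a host that resolves outside the organisation's registered blocks may belong to a hosting or cloud provider and fall outside the agreed engagement.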

Two refinements the exam likes. First, traceroute maps the hops between you and the target and can reveal the perimeter device and the upstream provider; it works over ICMP (Windows tracert), UDP, or TCP, and a filtered hop shows as * * *. Second, certificate transparency (CT) logs (crt.sh) and reverse-IP/passive-DNS services expand a single seed domain into dozens of subdomains and shared-hosting neighbours without ever touching the target — a powerful passive technique for discovering forgotten dev/staging hosts.
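The subdomain expansion from certificate transparency data can be sketched as a small parsing step. The code below assumes the log service returns a JSON list whose entries carry a `name_value` field with one or more newline-separated hostnames; that shape is an assumption to verify against the service you use, and `subdomains_from_ct` is a hypothetical helper name.

```python
import json


def subdomains_from_ct(json_text, domain):
    """Extract unique hostnames under domain from CT-log JSON entries."""
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for entry in json.loads(json_text):
        # One certificate can list several names, separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent name
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)
```

Filtering on the domain suffix discards unrelated names that share a certificate, and de-duplicating matters because the same hostname typically appears in many renewed certificates.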

Website, Email, and Social-Media Footprinting

Website footprinting examines a target's web presence without exploiting it: HTTP response headers and banners (Server: Apache/2.4.x, X-Powered-By: PHP/8.x), HTML comments, robots.txt, sitemap.xml, hidden form fields and parameters, and the Wayback Machine (archive.org) for content the site has since removed. Tools like theHarvester (emails, subdomains, hosts from public sources), Maltego (graphs relationships among domains, IPs, people, and emails), Recon-ng, and Shodan/Censys (search engines for internet-connected devices, returning open ports and banners) automate much of this.
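Banner leakage is easy to check once response headers have been collected. The sketch below takes a plain dictionary of headers and reports the ones that disclose a version number; the helper name and the choice of which headers to inspect are assumptions for illustration, based on the two headers named above.

```python
import re

# Headers named in the text as common sources of version banners.
BANNER_HEADERS = ("server", "x-powered-by")


def find_version_banners(headers):
    """Return banner headers whose value discloses a dotted version number."""
    leaks = {}
    for name, value in headers.items():
        if name.lower() in BANNER_HEADERS and re.search(r"\d+(\.\d+)+", value):
            leaks[name] = value
    return leaks


if __name__ == "__main__":
    observed = {"Server": "Apache/2.4.57", "Content-Type": "text/html"}
    print(find_version_banners(observed))
```

An empty result for your own site is the goal: a bare product name, or no banner at all, gives an attacker less to match against known vulnerabilities.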

Email footprinting reveals the organisation's address naming convention (for example, first.last@company.com), the single most useful input for password-spraying and phishing. Email headers (Received, Return-Path, X-Originating-IP) and tracking also reveal the mail-server path and sometimes internal IPs. Web bugs / tracking pixels embedded in an email can further confirm when and where a message was opened, and the SPF, DKIM, and DMARC TXT records expose which servers are authorised to send for the domain (and whether spoofing protection is weak). Social-media footprinting harvests employee names, roles, locations, and technology mentions for pretexting; out-of-office auto-replies and conference talks are classic over-shares.
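Header analysis of this kind can be sketched with the standard `email` package. The example below pulls IPv4 addresses out of the Received and X-Originating-IP headers of a raw message; the function name is hypothetical, the pattern is a deliberately loose match rather than a strict IPv4 validator, and any message you feed it should come from an authorised engagement or your own mailbox.

```python
import re
from email import message_from_string

# Loose IPv4 pattern: four dot-separated groups of one to three digits.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def relay_ips(raw_message):
    """List IPv4 addresses seen in Received and X-Originating-IP headers."""
    msg = message_from_string(raw_message)
    values = (msg.get_all("Received") or []) + (msg.get_all("X-Originating-IP") or [])
    ips = []
    for value in values:
        for ip in IPV4.findall(value):
            if ip not in ips:  # keep first-seen order without duplicates
                ips.append(ip)
    return ips
```

Private-range addresses in the result are the internal IPs mentioned above, which is why sanitising headers at the gateway is listed as a countermeasure.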

Competitive intelligence gathering rounds out website/email footprinting — public filings, press releases, acquisition news, and job boards reveal business direction, new technology rollouts, and staffing gaps that an attacker times campaigns around. None of this requires touching the target, which is why it is so hard to detect and why minimisation, not monitoring, is the control.

Countermeasures Summary

Source | Countermeasure
WHOIS | Registrar privacy / proxy registration
DNS / AXFR | Restrict zone transfers to authorised secondaries; split-horizon (internal vs. external) DNS
Google dorking | Authentication on sensitive paths; remove directory listing; do not rely on robots.txt to hide
Website banners | Suppress/anonymise Server and X-Powered-By headers
Documents | Strip metadata before publishing
Email | Sanitise headers at the gateway; avoid predictable address conventions where feasible
Social media | Employee OSINT-awareness training; public-content review policy

The defensive through-line: assume everything public will be aggregated, and minimise what is exposed.

Test Your Knowledge

A misconfigured authoritative name server permits an unrestricted DNS zone transfer (AXFR) to any requester. What is the single most significant footprinting risk this introduces?


Which footprinting source most directly reveals an organisation's email-address naming convention (for example, first.last@company.com)?


An analyst runs theHarvester and Maltego against a domain during an authorised engagement. What category of activity is this, and what is its primary output?
