Discovering Digital Footprints
Discovery is the process by which WeCheck's engine actively traverses the open web to locate every publicly accessible signal linked to a subject. Unlike a standard search query that returns indexed results, WeCheck's Discovery Engine orchestrates multiple autonomous AI agents working in parallel to find, connect, and validate data across platforms.
How Discovery Differs from a Search
A traditional search engine is reactive and linear — it returns pages that match a keyword. WeCheck's Discovery Engine is proactive and recursive — it follows leads, validates matches, and branches into new sources as it finds them.
The difference in practice: a Google search for "John Smith" returns thousands of unvalidated results. WeCheck's engine finds the specific John Smith you are investigating by anchoring on unique behavioral and visual signals, then maps every connected digital asset to that confirmed identity.
The Multi-Agent Architecture
When a scan is launched, WeCheck deploys four specialized agents that run in parallel:
| Agent | Mission |
|---|---|
| Mapper Agent | Scans known social and professional networks (LinkedIn, X, Instagram, Facebook) for profile matches |
| Deep-Web Agent | Probes unindexed forum archives, public databases, news repositories, and blog platforms |
| Relationship Agent | Extracts entities — people, organizations, companies — mentioned in the subject's footprint and maps connections |
| Sentiment Agent | Runs NLP analysis on discovered content to assess tone, context, and behavioral patterns |
Each agent reports its findings to a central orchestration layer that merges, deduplicates, and validates results before they appear in the report.
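The fan-out and merge pattern described above can be sketched as follows. This is a minimal illustration, not WeCheck's implementation: the `Finding` type, the four agent functions, and `merge_findings` are hypothetical names, the agents return canned data instead of doing network I/O, and the validation step is omitted for brevity.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str      # which agent produced the finding
    platform: str   # where the item was found
    url: str        # public URL of the item


# Hypothetical stand-ins for the four agents; real agents would do network I/O.
async def mapper_agent(subject: str) -> list[Finding]:
    await asyncio.sleep(0)
    return [Finding("mapper", "linkedin", "https://example.com/in/jsmith")]


async def deep_web_agent(subject: str) -> list[Finding]:
    await asyncio.sleep(0)
    return [Finding("deep_web", "forum", "https://example.com/forum/post/1")]


async def relationship_agent(subject: str) -> list[Finding]:
    await asyncio.sleep(0)
    # Deliberately repeats a URL the mapper found, to exercise deduplication.
    return [Finding("relationship", "linkedin", "https://example.com/in/jsmith")]


async def sentiment_agent(subject: str) -> list[Finding]:
    await asyncio.sleep(0)
    return []


def merge_findings(batches: list[list[Finding]]) -> list[Finding]:
    """Flatten agent output and keep the first finding seen for each URL."""
    seen: set[str] = set()
    merged: list[Finding] = []
    for batch in batches:
        for finding in batch:
            if finding.url not in seen:
                seen.add(finding.url)
                merged.append(finding)
    return merged


async def run_scan(subject: str) -> list[Finding]:
    # gather() runs the agents concurrently and returns results in call order.
    batches = await asyncio.gather(
        mapper_agent(subject),
        deep_web_agent(subject),
        relationship_agent(subject),
        sentiment_agent(subject),
    )
    return merge_findings(list(batches))


results = asyncio.run(run_scan("John Smith"))
```

Deduplicating by URL is only one possible merge key, chosen here because it keeps the sketch short.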
The Discovery Loop
Discovery doesn't stop at the first layer of results. WeCheck's agents perform a recursive investigation loop:
- Locate — Find a confirmed handle or profile on Platform A
- Extract — Pull references to other handles, platforms, or real-world identifiers
- Branch — Follow each new lead to its source platform
- Validate — Cross-reference the new find against the confirmed identity (visual + behavioral signals)
- Repeat — Continue until no new leads are found or the scan depth limit is reached
This loop is what allows WeCheck to surface hidden connections that a single-pass search would miss entirely — for example, linking a professional LinkedIn profile to an anonymous Reddit account through shared behavioral patterns and network overlap.
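As a rough sketch, the loop can be modeled as a breadth-first traversal with a depth limit. Everything named here is an assumption made for illustration: the `REFERENCES` graph, the `REJECTED` set, and the `extract_leads`, `validate`, and `discover` functions are hypothetical, and a real validation step would compare visual and behavioral signals, not look up a fixed set.

```python
from collections import deque

# Hypothetical link graph: each profile maps to the handles it references.
REFERENCES = {
    "linkedin:jsmith": ["github:jsmith_dev", "x:jsmith.dev"],
    "github:jsmith_dev": ["reddit:throwaway_js", "linkedin:jsmith"],
    "x:jsmith.dev": ["x:unrelated_person"],
    "reddit:throwaway_js": ["blog:jsmith"],
}

# Hypothetical validation outcome: candidates that fail the identity check.
REJECTED = {"x:unrelated_person"}


def extract_leads(profile: str) -> list[str]:
    """Extract: pull references to other handles from a profile."""
    return REFERENCES.get(profile, [])


def validate(candidate: str) -> bool:
    """Validate: placeholder for the cross-reference against the confirmed identity."""
    return candidate not in REJECTED


def discover(seed: str, max_depth: int) -> list[str]:
    """Repeat locate/extract/branch/validate until no leads remain or depth runs out."""
    confirmed = [seed]
    visited = {seed}
    queue = deque([(seed, 0)])
    while queue:
        profile, depth = queue.popleft()
        if depth >= max_depth:
            continue  # scan depth limit reached for this branch
        for lead in extract_leads(profile):
            if lead in visited:
                continue  # already examined, avoids cycles
            visited.add(lead)
            if validate(lead):
                confirmed.append(lead)
                queue.append((lead, depth + 1))
    return confirmed
```

With `max_depth=2`, `discover("linkedin:jsmith", 2)` confirms the GitHub, X, and Reddit profiles but stops before the blog. Raising the limit to 3 reaches it as well.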
Discovery Scope
What WeCheck scans:
- Social and professional networks (LinkedIn, X/Twitter, Instagram, Facebook, TikTok)
- Forums and communities (Reddit, Stack Overflow, Quora, niche forums)
- News and media publications (local and international press, press releases)
- Public image repositories (publicly shared photos and videos)
- Blogs and long-form writing platforms (Medium, Substack, personal sites)
- Public code repositories (GitHub, GitLab — public repos and commit history)
What WeCheck explicitly does not scan:
- Private accounts or content behind authentication walls
- Deleted or removed content (unless a third party has archived it publicly)
- Dark web or encrypted networks
- Paywalled content
- Private messages or direct communications of any kind
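The exclusions above can be read as a simple filter applied to each candidate source. The sketch below assumes a hypothetical `Source` record whose boolean flags have already been determined. How WeCheck actually classifies a source is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    url: str
    requires_login: bool = False     # private account or authentication wall
    paywalled: bool = False
    dark_web: bool = False           # dark web or encrypted network
    private_message: bool = False
    deleted: bool = False
    publicly_archived: bool = False  # a third party archived it publicly


def in_scope(source: Source) -> bool:
    """Return True only if none of the documented exclusions apply."""
    if source.requires_login or source.paywalled:
        return False
    if source.dark_web or source.private_message:
        return False
    # Deleted content is excluded unless a public archive copy exists.
    if source.deleted and not source.publicly_archived:
        return False
    return True
```

For example, `in_scope(Source("https://example.com/post", deleted=True, publicly_archived=True))` returns `True`, while the same source without the archive flag is rejected.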
Cross-Platform Identity Linkage
One of the most powerful aspects of WeCheck's Discovery Engine is its ability to connect profiles across platforms even when the subject has not explicitly linked them. Linkage is established through:
- Handle Similarity — Derived or variant usernames across platforms (e.g., `jsmith_dev` on GitHub and `jsmith.dev` on X)
- Biographic Overlap — Consistent location mentions, employer references, or personal details across disconnected profiles
- Network Proximity — The subject interacts with the same group of people or entities across multiple platforms
- Visual Anchoring — The same face appears in profile images across unlinked accounts
Each cross-platform link is accompanied by a Confidence Score reflecting how strongly the evidence supports the connection. See AI Matching for details on how scores are calculated.
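To make the handle-similarity signal concrete, here is one plausible way to compare usernames using Python's standard `difflib`. It is an illustrative sketch only and is not how WeCheck computes its Confidence Score: the `normalize_handle` and `handle_similarity` functions and the sample handles are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize_handle(handle: str) -> str:
    """Lowercase the handle and drop separators such as '_', '.', and '-'."""
    return re.sub(r"[^a-z0-9]", "", handle.lower())


def handle_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the normalized handles are identical."""
    return SequenceMatcher(None, normalize_handle(a), normalize_handle(b)).ratio()
```

Under this normalization, `jsmith_dev` and `jsmith.dev` compare as identical, while unrelated handles score near zero. A single signal like this would be weak evidence on its own, which is why the text combines it with biographic, network, and visual signals.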
Identity Persistence
WeCheck includes a specialized capability to detect Identity Persistence — cases where a subject has attempted to reduce their digital footprint but left residual signals:
- Archived or cached versions of deleted posts (via public web archives)
- Mentions of the subject by name or handle in third-party content they cannot remove
- Images published by other accounts that include the subject
This capability is particularly valuable in legal discovery and high-stakes vetting scenarios where thoroughness is essential.
Ethics & Boundaries
WeCheck's Discovery Engine is designed around a strict OSINT (Open Source Intelligence) philosophy:
- Public domain only — Every data point collected is publicly accessible without credentials
- Minimal footprint — The engine collects only what is relevant to the investigation, not everything it finds
- No entrapment — WeCheck does not interact with, message, or provoke subjects during discovery
- Human review — Discovery surfaces signals; final analytical judgment always rests with human professionals