Credential Harvesting: How Attackers Collect Passwords at Scale

Passwords remain the keys to the kingdom. Despite decades of security innovation, the humble username-and-password combination still guards the vast majority of business systems, cloud applications, and financial accounts. Attackers know this, and they have built an entire industrial ecosystem around credential harvesting — the systematic collection of login credentials at massive scale.

TL;DR — Key Takeaways

✓ Understand the techniques attackers use to harvest credentials at scale, from phishing kits and keyloggers to infostealers, and how to defend against them
✓ Learn what credential harvesting is and why it matters for your security posture
✓ See how phishing kits with fake login pages work (Technique 1)

Visual Overview

flowchart TD
    A["Credential Harvesting"] --> B["Fake Login Pages"]
    A --> C["Keylogger Malware"]
    A --> D["Man-in-the-Middle"]
    A --> E["Phishing Emails"]
    B --> F["Stolen Credentials"]
    C --> F
    D --> F
    E --> F

Understanding how credentials are harvested is the first step toward defending against it. In this article, we will walk through the most common techniques attackers use, explain how stolen credentials are monetised on the dark web, and outline practical defence strategies that every small and mid-sized organisation can implement today.

What Is Credential Harvesting?

Credential harvesting is the process of collecting usernames, passwords, session tokens, or other authentication secrets from victims. Unlike brute-force attacks that guess passwords through trial and error, harvesting techniques trick or compromise the victim into handing over their real credentials. This makes harvested credentials far more valuable to attackers — they are genuine, they work immediately, and they bypass many traditional security controls.

The scale of the problem is staggering. Billions of credentials are available on underground markets, and fresh batches arrive daily. A single successful business email compromise can yield access to an entire organisation’s email, cloud storage, and financial systems — all from one harvested password.

Technique 1: Phishing Kits with Fake Login Pages

The most prolific credential harvesting method remains the phishing kit. These are pre-packaged collections of HTML, CSS, JavaScript, and server-side scripts that replicate the login pages of popular services — Microsoft 365, Google Workspace, banking portals, and more. The kits are sold on underground forums for as little as $50 and require almost no technical skill to deploy.

How It Works

The attacker registers a look-alike domain (for example, micros0ft-login.com), uploads the phishing kit, and sends a convincing phishing email that directs the victim to the fake page. When the victim enters their username and password, the credentials are captured by the kit and forwarded to the attacker via email, Telegram bot, or a web panel. Many modern kits also capture MFA codes in real time, defeating basic two-factor authentication.
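On the defensive side, the look-alike domain step can be screened for. The following is a minimal sketch, not a production detector: it normalises common digit-for-letter swaps and compares each part of a domain against a watch list of brand names. The names `PROTECTED_BRANDS`, `LEGITIMATE_DOMAINS`, and `looks_like_brand`, the substitution table, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical watch list of brand names; adjust to the services your staff use.
PROTECTED_BRANDS = ("microsoft", "google", "paypal")

# Hypothetical allow list of domains the organisation genuinely signs in to.
LEGITIMATE_DOMAINS = {"microsoft.com", "login.microsoftonline.com", "google.com"}

# Common digit-for-letter swaps (illustrative, far from exhaustive).
HOMOGLYPHS = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def looks_like_brand(domain: str, threshold: float = 0.85) -> Optional[str]:
    """Return the brand a domain appears to imitate, or None."""
    domain = domain.lower().strip(".")
    # Genuine domains and their subdomains are never flagged.
    if any(domain == d or domain.endswith("." + d) for d in LEGITIMATE_DOMAINS):
        return None
    # Drop the top-level domain, then split on dots and hyphens.
    name = domain.rsplit(".", 1)[0]
    for token in name.replace(".", "-").split("-"):
        normalised = token.translate(HOMOGLYPHS)
        for brand in PROTECTED_BRANDS:
            if SequenceMatcher(None, normalised, brand).ratio() >= threshold:
                return brand
    return None
```

Called on the example above, `looks_like_brand("micros0ft-login.com")` returns `"microsoft"`, while `looks_like_brand("example.org")` returns `None`. A check like this could sit in a mail gateway or proxy log review, although real tooling also needs to handle Unicode homoglyphs and newly registered domains.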

The Scale Factor

Phishing-as-a-Service (PhaaS) platforms have industrialised this process. Operators offer subscription-based access to phishing kits, hosting infrastructure, and even victim lists. A single PhaaS operator can enable thousands of individual campaigns running simultaneously, harvesting millions of credentials per month.

A phishing kit does not need to be sophisticated to be effective. It needs to look real for just long enough that the victim enters their password. That window is often less than five seconds.

Technique 2: Keyloggers

Keyloggers are malicious programs that record every keystroke a victim types. Unlike phishing, which captures a single set of credentials per attack, a keylogger provides continuous access to every username, password, credit card number, and private message the victim enters for as long as it remains installed.

Delivery Methods

Email attachments — malicious Word documents, Excel files, or PDFs that exploit macros or vulnerabilities to install the keylogger silently.
Drive-by downloads — compromised or malicious websites that exploit browser vulnerabilities to install the keylogger without user interaction.
Software bundles — keyloggers hidden inside pirated software, browser extensions, or seemingly legitimate utilities.
Physical devices — hardware keyloggers that plug between a keyboard and a computer, capturing keystrokes at the hardware level. These are particularly relevant in shared office environments and require a strong physical security policy to mitigate.

Modern keyloggers are highly evasive. They operate in the background, consume minimal system resources, and encrypt their log files before exfiltrating them. Many include screen-capture capabilities that record the victim’s screen at regular intervals, capturing credentials entered via on-screen keyboards or password managers that autofill without keystrokes.

Technique 3: Infostealers

Infostealers are a specialised category of malware designed to extract stored credentials, browser cookies, session tokens, and cryptocurrency wallet files from a victim’s device. Unlike keyloggers that wait for the victim to type, infostealers grab everything at once from credential stores that already exist on the machine.

What They Target

Browser password stores — Chrome, Edge, Firefox, and other browsers store saved passwords in local databases. Infostealers extract these in seconds.
Session cookies — even if a user has MFA enabled, a stolen session cookie can allow the attacker to hijack an active session without needing the password or MFA code at all.
Autofill data — addresses, credit card numbers, and other form data stored in the browser.
VPN and RDP credentials — configuration files for corporate VPN clients and Remote Desktop connections are high-value targets.
Email client credentials — stored passwords for Outlook, Thunderbird, and other desktop email applications.

Prominent infostealer families include RedLine, Raccoon, Vidar, and Lumma. These are sold as malware-as-a-service, with monthly subscriptions starting at around $150. The stolen data is automatically aggregated into “logs” that are sold in bulk on dark web marketplaces.

Technique 4: Man-in-the-Middle Proxy Tools

Traditional phishing captures credentials on a static fake page. Man-in-the-middle (MitM) proxy tools take this a step further by sitting between the victim and the real login page, relaying traffic in both directions in real time. This allows the attacker to capture not only the username and password but also the MFA token and the resulting session cookie.

How Reverse Proxies Work

Tools like Evilginx2 and Modlishka act as transparent reverse proxies. When the victim visits the attacker’s domain, the proxy fetches the real login page from Microsoft, Google, or whatever service is being targeted, and serves it to the victim. The victim sees the genuine page content, served over a valid TLS certificate issued for the proxy domain, and enters their credentials normally. The proxy captures everything in transit, including the session cookie issued after successful authentication.

This technique is particularly dangerous because it defeats most forms of MFA, including SMS codes, authenticator app codes, and push notifications. Only phishing-resistant MFA methods such as FIDO2 hardware keys or passkeys are immune, because they bind the authentication to the legitimate domain and refuse to operate on the attacker’s proxy domain.
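The domain binding can be illustrated from the server side. In WebAuthn (the browser API behind FIDO2 and passkeys), the browser records the page origin in a `clientDataJSON` structure that the authenticator's signature covers, and the service checks it. The sketch below shows only that origin and challenge check; signature verification is deliberately omitted, and `EXPECTED_ORIGIN` and `origin_is_valid` are hypothetical names.

```python
import json

# Hypothetical origin of the genuine login page.
EXPECTED_ORIGIN = "https://login.example.com"


def origin_is_valid(client_data_json: bytes, expected_challenge: str) -> bool:
    """Check the browser-reported fields of a WebAuthn sign-in response.

    A real relying party must also verify the authenticator's signature
    over this data; that step is left out of this sketch.
    """
    data = json.loads(client_data_json)
    return (
        data.get("type") == "webauthn.get"
        and data.get("challenge") == expected_challenge
        and data.get("origin") == EXPECTED_ORIGIN
    )
```

A response produced while the victim is on a proxy domain carries that proxy's origin, so the check fails even if the attacker relays it unchanged. In practice the credential is also scoped to the genuine domain, so the authenticator would normally decline to sign for the proxy in the first place.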

Technique 5: Database Breaches and Credential Dumps

Not all credential harvesting requires direct interaction with victims. When a website or service suffers a data breach, the user database, which often contains usernames, email addresses, and password hashes, may be exfiltrated and eventually published online. If the passwords were hashed with a fast or unsalted algorithm (or stored in plain text), attackers can recover the originals using rainbow tables or GPU-accelerated cracking.
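For services that store passwords, the difference between weak and strong storage can be sketched with Python's standard library. This is an illustration under stated assumptions rather than a recommended configuration: the function names are hypothetical, and the iteration count is an assumed work factor that should be tuned to current guidance and your hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor; tune to your hardware and current guidance.
ITERATIONS = 600_000


def fast_unsalted_hash(password: str) -> str:
    """Weak storage, shown only for contrast: same password, same hash, every time."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash; store both the salt and the digest."""
    salt = os.urandom(16)  # unique per password, so precomputed tables do not apply
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

The random salt means two users with the same password get different stored values, which defeats rainbow tables, and the iteration count makes each cracking guess far more expensive than a single fast hash.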

Because users frequently reuse passwords across multiple services, a breach at one site often yields valid credentials for dozens of others. This is the foundation of credential-stuffing attacks, where automated tools test breached credentials against banking, email, and corporate login portals at enormous speed.
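Credential stuffing tends to leave a recognisable pattern in authentication logs: one source trying many different usernames with a very low success rate. The sketch below flags such sources. The `LoginAttempt` record, the function name, and both thresholds are assumptions for illustration; real log formats and sensible thresholds will differ.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class LoginAttempt(NamedTuple):
    """Hypothetical shape of one parsed authentication log entry."""
    source_ip: str
    username: str
    success: bool


def suspected_stuffing_sources(attempts: Iterable[LoginAttempt],
                               min_usernames: int = 20,
                               max_success_rate: float = 0.1) -> Set[str]:
    """Return source IPs that tried many distinct usernames and mostly failed."""
    usernames = defaultdict(set)
    totals = defaultdict(int)
    successes = defaultdict(int)
    for attempt in attempts:
        usernames[attempt.source_ip].add(attempt.username)
        totals[attempt.source_ip] += 1
        successes[attempt.source_ip] += int(attempt.success)
    return {
        ip for ip in totals
        if len(usernames[ip]) >= min_usernames
        and successes[ip] / totals[ip] <= max_success_rate
    }
```

Attackers often spread attempts across many IP addresses, so a per-source check like this is only one signal and works best alongside rate limiting and MFA.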

The Dark Web Credential Market

Stolen credentials are commodities. Dark web marketplaces and Telegram channels offer them in various formats:

Combo lists — massive text files containing millions of email:password pairs from aggregated breaches. These are often free or very cheap and are used for credential stuffing.
Infostealer logs — curated collections of data extracted by infostealers, including browser passwords, cookies, and system fingerprints. A single log for a corporate user can sell for $10 to $50.
Initial access listings — verified VPN, RDP, or admin panel credentials for specific organisations. These are premium products, with prices ranging from $500 to $10,000 depending on the target’s size and industry.
Session cookies — fresh, unexpired session tokens that allow immediate account takeover without authentication. These are time-sensitive and sold at a premium.

The existence of these markets means that even if your organisation has never been directly targeted, your employees’ credentials may already be circulating. Dark web monitoring services can alert you when your corporate domains appear in fresh credential dumps.

Defence Strategies for Small Organisations

Defending against credential harvesting requires a layered approach. No single control is sufficient, but the combination of the following measures dramatically reduces your risk.

Deploy Phishing-Resistant MFA

Standard MFA (SMS codes, authenticator apps) is better than passwords alone, but it can be defeated by MitM proxy attacks. Phishing-resistant MFA based on FIDO2 security keys or device-bound passkeys is the single most effective defence against phishing-based credential harvesting. These methods verify both the user and the legitimacy of the login page, so a proxy running on a look-alike domain cannot complete the sign-in.

Use a Password Manager

Password managers generate unique, complex passwords for every account and autofill them only on legitimate domains. This eliminates password reuse (neutralising credential stuffing) and blunts phishing pages, because the manager will not autofill on a fake domain and the missing autofill is itself a warning sign. Enterprise password managers also provide visibility into which employees have weak or reused passwords.
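The autofill behaviour comes down to comparing the page's address with the address saved alongside the password. A simplified sketch follows, assuming an exact-hostname match over HTTPS; real password managers use their own matching rules, and `should_autofill` is a hypothetical name.

```python
from urllib.parse import urlsplit


def should_autofill(saved_url: str, page_url: str) -> bool:
    """Offer the saved login only on an HTTPS page with the exact saved hostname."""
    saved, page = urlsplit(saved_url), urlsplit(page_url)
    return (
        page.scheme == "https"
        and page.hostname is not None
        and page.hostname == saved.hostname
    )
```

With a login saved for `https://login.microsoftonline.com/`, a page at `https://micros0ft-login.com/` fails the comparison no matter how convincing it looks to a human, because the check is made on the hostname rather than on the page's appearance.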

Implement Endpoint Protection

Modern endpoint detection and response (EDR) solutions can identify and block many keyloggers, infostealers, and other credential-harvesting malware, often before they execute. Ensure that all corporate devices, including personal devices used under BYOD policies, have current endpoint protection installed.

Educate Your Staff

Technical controls are essential, but human awareness remains critical. Regular phishing simulations train employees to recognise fake login pages, suspicious emails, and social engineering tactics. Focus training on the specific techniques described in this article so that staff understand not just what to avoid, but why.

Monitor for Leaked Credentials

Subscribe to a dark web monitoring service that scans breach databases, paste sites, and underground forums for your corporate email domains. When exposed credentials are found, force an immediate password reset for the affected accounts and investigate whether any unauthorised access occurred.
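At its core, this kind of monitoring is a matching exercise. As a rough sketch of the idea, assuming a dump in the `email:password` combo-list format and a hypothetical function name, the following extracts affected addresses for a set of corporate domains and deliberately discards the passwords.

```python
from typing import Iterable, List


def exposed_accounts(dump_lines: Iterable[str],
                     corporate_domains: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated corporate addresses found in a dump."""
    domains = {d.lower() for d in corporate_domains}
    found = set()
    for line in dump_lines:
        # Split on the first colon only; the password is never stored.
        email, separator, _password = line.strip().partition(":")
        if not separator or "@" not in email:
            continue  # skip lines that are not email:password pairs
        domain = email.rpartition("@")[2].lower()
        if domain in domains:
            found.add(email.lower())
    return sorted(found)
```

The resulting list is what would feed the forced password resets and the access investigation. Commercial services add the harder parts: sourcing the dumps, handling other formats, and alerting quickly.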

Reduce the Attack Surface

Disable legacy authentication protocols that do not support MFA.
Implement conditional access policies that block sign-ins from unexpected locations or devices.
Enforce short session lifetimes so that stolen cookies expire quickly.
Regularly audit and remove dormant user accounts that are prime targets for credential stuffing.
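Two of these measures reduce to simple time comparisons. The sketch below assumes you can export each account's last sign-in time from your identity provider; the function names, the 90-day idle limit, and the 8-hour session lifetime are illustrative assumptions, not recommendations.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def dormant_accounts(last_logins: Dict[str, Optional[datetime]],
                     now: datetime,
                     max_idle: timedelta = timedelta(days=90)) -> List[str]:
    """Return accounts idle for longer than max_idle, or never used at all."""
    dormant = []
    for user, last_login in last_logins.items():
        # None means the account has no recorded sign-in.
        if last_login is None or now - last_login > max_idle:
            dormant.append(user)
    return sorted(dormant)


def session_expired(issued_at: datetime, now: datetime,
                    lifetime: timedelta = timedelta(hours=8)) -> bool:
    """Return True once a session has reached its maximum lifetime."""
    return now - issued_at >= lifetime
```

Accounts returned by `dormant_accounts` are candidates for review and removal, and a server-side check like `session_expired` limits how long a stolen cookie stays useful. In practice, pass timezone-aware UTC timestamps to both functions so the comparisons are unambiguous.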

The Bottom Line

Credential harvesting is not a niche threat — it is the primary mechanism through which most breaches begin. Whether through phishing kits, keyloggers, infostealers, MitM proxies, or dark web markets, attackers have an extensive toolkit for collecting passwords at scale. The good news is that the defence playbook is well established: phishing-resistant MFA, password managers, endpoint protection, staff training, and credential monitoring form a robust shield that even small organisations can deploy. The critical step is to act before your credentials are the next batch for sale on the dark web.