Princeton University researchers uncover one of the biggest security flaws on the internet.
For years, keyloggers have been synonymous with dodgy downloads and spyware that finds its way onto your PC. Now the technology is far more commonplace, and it appears to be in use in one of the most intrusive ways possible.
Research from Princeton University's Centre for Information Technology Policy (CITP) has discovered that more than 400 of the most popular websites - including WordPress and Microsoft - are running code that's capable of tracking everything you type without your knowledge or express consent.
What's even more worrying is that your keystrokes and data are then being sent to a third-party server.
Now, you may be thinking that websites recording your activity so you can be remarketed products is nothing new, and it isn't. In this instance, however, it's far more wide-reaching than just what you decided to click and buy.
Here, some of the most trafficked websites in the world are recording every keystroke you make - be it a site search or text chat with an assistant. If you start to fill out a form with email addresses, phone numbers or personal details and then decide not to follow through, those sites already have the information they wanted.
As Motherboard notes, it's a similar situation to what Facebook was doing with users' incomplete status updates. In 2013, when it was revealed that Facebook was logging what people typed and then deleted before publishing a post, people got mad. But now, with all these notable websites looking at what you're writing and clicking, we're almost entirely unaware it's going on.
In fairness to the sites discovered to be using such keystroke-logging services, it's not actually their intention to obtain this information. Instead it's a direct consequence of using something called a "Session Replay Script". These web scripts are used to track engagement and help with UX design, showing site owners how they can improve a visitor's journey through their website. Unfortunately, these scripts record practically everything and send it away to be analysed.
It gets worse when you discover that, as the research team put it, the collected data can't "reasonably be expected to be kept anonymous". FullStory, one company that makes session replay scripts, actually gives site owners linked tracking information, Motherboard claims. Site owners would be able to see who a particular user was and watch back their entire site activity in real time - including everything they type.
Princeton's team conducted its research by looking at seven of the most popular session replay companies on the market and testing their products on a series of test pages. In doing this, they discovered that at least one of these scripts is being used by 482 of the world's top 50,000 sites as sorted by Alexa ranking.
As part of their research post, "No boundaries: Exfiltration of personal data by session-replay scripts", the team from Princeton released a list of sites that used the scripts - but only sites that have been confirmed to send such recordings off to third parties.
Looking at the list of sites, there are some rather worrying culprits. A quick glance shows that WordPress.com, Microsoft.com, Adobe.com and Spotify.com all use these tracking scripts. The Telegraph's website is also a tracking culprit, along with BBC Good Food. Thankfully, these sites haven't been listed for doing more than simply using the software. Sites that seem to actually record and send the data away to a third party seem to be an odd mix, with Russian search website Yandex leading the charge.
Curiously, HP's own website was listed as sending data along with Atlassian, Xfinity and Comcast.
Many companies that send this data also offer redaction services to remove sensitive information, but there are plenty out there that don't. That information leaking into the public sphere could have serious privacy implications.
"Collection of page content by third-party replay scripts may cause sensitive information such as medical conditions, credit card details, and other personal information displayed on a page to leak to the third-party as part of the recording," the researchers wrote in their post.
Researchers noted that personal information and passwords usually slipped through any redaction software, even if only partially. Interestingly, Session Replay Script providers UserReplay and SessionCam both completely block typed information by tracking where a user clicks before they type. However, if information is displayed on screen by default, such as when you log into a site's account page, your personal data is left unredacted.
In most cases, that's fine. But when one site, such as American pharmacy chain Walgreens, lists previous medical conditions and prescriptions on your user page by default, all that information could easily fall into the wrong hands.
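To illustrate why masking typed input does not protect information a page displays by default, here is a minimal Python sketch. It is not any vendor's actual implementation: the field names, the `redact_inputs` and `build_recording` helpers, and the sample data are all hypothetical, chosen only to model the gap the researchers describe.

```python
# Hypothetical field names a recorder might be configured to mask.
SENSITIVE_FIELDS = {"password", "card_number", "cvv"}


def redact_inputs(form_values: dict) -> dict:
    """Mask values typed into fields flagged as sensitive."""
    return {
        name: "*" * len(value) if name in SENSITIVE_FIELDS else value
        for name, value in form_values.items()
    }


def build_recording(form_values: dict, rendered_text: str) -> dict:
    """Assemble what a simplified recorder would send to a third party."""
    return {
        # Typed input passes through the redaction step.
        "inputs": redact_inputs(form_values),
        # Text already rendered on the page is captured as-is,
        # because input masking never looks at it.
        "page_text": rendered_text,
    }


recording = build_recording(
    {"password": "hunter2", "search": "shoes"},
    "Prescription: ExampleDrug 10mg",  # made-up account page content
)
print(recording)
```

In this simplified model the password is masked, but the prescription text shown on the account page ends up in the recording untouched, which is the kind of leak described above.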
If you're wondering just how likely that is to happen, it turns out things actually get worse. As noted in their post, the researchers discovered that these session-tracking companies are themselves vulnerable to targeted hacks. Not only are they high-value targets, but many of the analytics dashboards their clients use run on unencrypted HTTP pages instead of HTTPS pages.
By using HTTP over HTTPS, "this allows an active man-in-the-middle to inject a script into the playback page and extract all of the recording data," the researchers explained.
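As a rough illustration of the check involved, the Python sketch below flags dashboard addresses that are served over plain HTTP. The URLs and the helper names `is_encrypted` and `insecure_dashboards` are made up for this example and do not refer to any real vendor's dashboard.

```python
from urllib.parse import urlsplit


def is_encrypted(url: str) -> bool:
    """Return True only when the URL uses the HTTPS scheme."""
    return urlsplit(url).scheme == "https"


def insecure_dashboards(urls: list) -> list:
    """List the dashboard URLs that would travel over unencrypted HTTP."""
    return [url for url in urls if not is_encrypted(url)]


# Hypothetical dashboard addresses, for illustration only.
dashboards = [
    "http://dashboard.example.com/replay",
    "https://dashboard.example.com/replay",
]
print(insecure_dashboards(dashboards))
```

Any page on the flagged list could, in principle, be tampered with in transit, which is what makes the playback pages a target.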
Since the report went out, a handful of the sites using Session Replay Scripts and vendors responsible for creating them have commented on the matter. Many responses show that a lot of sites weren't quite aware of how these things worked, and so are looking into halting such services or improving them to keep user data safe. On the development side, of the few that spoke up on the matter, SessionCam provided a blog post outlining their concerns and reassuring users that security and privacy are their prime consideration.
As SessionCam is actually the safest provider of Session Replay Scripts, at least from a user's privacy perspective, the post is rather reassuring that the company is all for keeping user data safe.
"Everyone at SessionCam can get behind the CITP's conclusion: 'Improving user experience is a critical task for publishers. However, it shouldn't come at the expense of user privacy,'" wrote SessionCam. "The whole team at SessionCam lives these values every day. The privacy of your website visitors and the security of your data is of paramount importance to us.
"I am grateful to the CITP for raising the issue and examining the questions. We will keep working to provide even stronger security and to give our clients and their customers the peace of mind they need."
There is one way to defend against these session replay scripts: install AdBlock Plus. Previously it protected you against a handful of these trackers but now, following the Princeton study, it has been updated to protect you against every script listed in the researchers' post.