What Type of Data Do Websites Collect About You?


When the web first became mainstream in the mid-90s, one of its key
characteristics was anonymity. No one used their real names and you could live
a second life online, at a blazing 33 kbps.

The web of today is very different. Not only is there a strong push to
deanonymize people, but the websites you visit on a daily basis can also record
all sorts of information about you. What kinds of information? Read on
to find out.

Your IP

This is the most common type of information that a website will log. Your IP or Internet Protocol address is a number that denotes where on the internet you are located.

It’s basically the same thing as a real-world address. If someone wants to send you a letter, they’ll write your address on it. When you receive it, their return address will be on the back. So you know where it came from.

If you replace “letter” with “internet packet” you basically know how an IP address works. The problem is that a website can actually figure out quite a lot of private information about you from just your IP address.

They’ll know more or less where you are browsing from and which ISP you’re using. With a little more detective work (and perhaps a legal warrant) an IP address can lead someone directly to your door.

This is why so many people are using VPNs (virtual private networks)
these days. The VPN acts as a middleman, so only its IP address is visible to
the site you are visiting.
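
To make that concrete, here is a minimal sketch of how a site could map a visitor’s address to an ISP and a rough location. The prefix table is entirely hypothetical: the ranges are reserved documentation networks, and the ISP names and cities are invented. Real sites would rely on a commercial or public geolocation database instead.

```python
import ipaddress

# Hypothetical prefix table. The ranges are reserved documentation
# networks; the ISP names and cities are invented for this sketch.
PREFIX_TABLE = {
    "192.0.2.0/24": ("Example ISP A", "Springfield"),
    "198.51.100.0/24": ("Example ISP B", "Shelbyville"),
    "203.0.113.0/24": ("Example VPN Co", "Elsewhere"),
}


def lookup(ip_text):
    """Return the (ISP, city) pair for an address, or None if unknown."""
    address = ipaddress.ip_address(ip_text)
    for prefix, owner in PREFIX_TABLE.items():
        if address in ipaddress.ip_network(prefix):
            return owner
    return None


print(lookup("192.0.2.44"))   # the visitor's own address
print(lookup("203.0.113.9"))  # the address a site sees behind a VPN
```

With a VPN in the way, the second lookup is the only one the site can perform, so it learns about the VPN provider rather than about you.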

Hardware & Software

Web browsers report all sorts of information to a website that asks for
it. This includes a wealth of information about the computer you are using.

The site will know your operating system, processor, GPU and more. This
may seem innocent, but it could be used to track or ID a specific machine.
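
One of the simplest channels is the User-Agent header your browser sends with every request. The sketch below shows roughly what a site could pull out of it. The sample string and the `describe` helper are made up for illustration, and the parsing is deliberately crude. Details such as the GPU come from scripts running in the page rather than from this header.

```python
import re

# A sample User-Agent header, made up for illustration.
SAMPLE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) "
    "Gecko/20100101 Firefox/115.0"
)


def describe(user_agent):
    """Pull a rough platform and browser guess out of a User-Agent string."""
    # The first parenthesised group usually carries the OS and CPU details.
    platform = re.search(r"\(([^)]*)\)", user_agent)
    # Take the first known browser token; real detection is far messier.
    browser = re.search(r"(Firefox|Chrome|Safari)/([\d.]+)", user_agent)
    return {
        "platform": platform.group(1) if platform else None,
        "browser": browser.group(1) if browser else None,
        "version": browser.group(2) if browser else None,
    }


print(describe(SAMPLE_UA))
```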

One way to get around this is to browse from within a virtual machine,
which will provide generic system information to the website.

1st & 3rd Party Cookies

A cookie is a small file that a site leaves on your computer to keep a
record of things such as your site preferences. So the next time you visit, it
will already know things about you.

Cookie technology is not a bad thing in itself. Session cookies, for
example, delete themselves when you close the browser. You also get first-party
persistent cookies, which are the ones saved to your device by the site for its
own use.
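
The difference between the two comes down to one attribute. Here is a small sketch using Python’s standard `http.cookies` module to build both kinds. The cookie names and values are invented for the example.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no Max-Age or Expires attribute, so the browser
# normally discards it when it is closed.
cookies["session_id"] = "abc123"
cookies["session_id"]["path"] = "/"

# Persistent cookie: Max-Age (in seconds) keeps it around for 30 days.
cookies["theme"] = "dark"
cookies["theme"]["path"] = "/"
cookies["theme"]["max-age"] = 30 * 24 * 60 * 60

# Each line is what a server would send in a Set-Cookie header.
for morsel in cookies.values():
    print(morsel.OutputString())
```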

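A tracking cookie is a persistent, third-party cookie. It is set by a domain
other than the site you are visiting, usually through embedded content such as
ads, and read back by that same domain on every other site that embeds it. That
cookie accumulates information about your web activities and that information
can then go back to the cookie’s owner.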

Legislation about how and when cookies can be used has been tightening in
recent years, which is why almost every site has its cookie policy pop up the
minute you visit it for the first time. If you decline that policy, a
compliant site should not store non-essential cookies on your machine.

However, there is nothing stopping a rogue site from peppering your
machine with tracking cookies without your knowledge. Luckily you can use your
browser’s privacy settings to block and delete cookies as desired.


Invisible Trackers

Cookies are just one example of an invisible tracker. As a larger
category, invisible trackers also include web apps and external sites embedded
in a legitimate site.

Major news sites and other popular web pages often have advertising
content embedded at the bottom of an article which includes some form of
tracking. Google does this as well. This is why, when you search for a specific
product in Google, you’ll see ads for it pop up on every other site that
features Google AdSense.

Luckily there are privacy-focused search engines such as DuckDuckGo which explicitly don’t track you.

Modern browsers now also support a feature known as “do not track”, which
tells a site that it should turn off its tracking technology when you visit.
However, this is a voluntary agreement, so the site can ignore it if it wants to.
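
Under the hood, “do not track” is just a `DNT: 1` header attached to each request. The sketch below shows how little stands between that request and a site that doesn’t care. The `tracking_allowed` function and its `honors_dnt` flag are hypothetical names for this example, not part of any real framework.

```python
def tracking_allowed(headers, honors_dnt=True):
    """Decide whether to run trackers for a request.

    `headers` is a plain dict of request headers. `honors_dnt` models the
    site's own policy, since nothing forces a site to respect the signal.
    """
    # Header names are case-insensitive, so normalise before looking up.
    normalised = {name.lower(): value for name, value in headers.items()}
    asked_not_to_track = normalised.get("dnt") == "1"
    return not (asked_not_to_track and honors_dnt)


print(tracking_allowed({"DNT": "1"}))                    # a site that complies
print(tracking_allowed({"DNT": "1"}, honors_dnt=False))  # a site that ignores it
```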

One of the most effective tools in the fight against invisible trackers is the EFF’s Privacy Badger.


Autofill Data

You’ve probably noticed that when you have to fill in shipping details on
a site you’ve never visited before, your browser automatically fills in
details like your name and address. It’s a convenient feature, but it is also a
privacy nightmare.

Unscrupulous sites can be coded to capture that information the second
it’s autofilled. This means that the site has now captured your full details
without your knowledge. As you can imagine, information such as an
address, full name or social security number can be used to wreak havoc in the
wrong hands.

It’s best to just disable autofill in your browser settings.

Other Accounts You’re Logged In To

When you visit a site, it can detect what other accounts you are
currently logged into by the traces they leave on your machine. This is
actually very valuable information, because combined with a known email address
it tells hackers which other accounts you have.

So if one of those accounts has been part of a data breach and your
password is uncovered, you may be in trouble. Many people use the same or
similar passwords across accounts, so this makes it much easier for hackers to
breach your security.

The best thing to do here is use strong, unique passwords for every
account. A good password manager that generates those random passwords is
highly recommended.
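
If you’re curious what “random” means here, this is a minimal sketch of the kind of generator a password manager might use, built on Python’s `secrets` module. The 20-character default and the choice of character set are assumptions for the example, not a recommendation from any particular tool.

```python
import secrets
import string

# Letters, digits and symbols; some sites restrict which symbols are allowed.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a random password drawn from letters, digits and symbols."""
    # secrets.choice uses a cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```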

Input Logs

It’s possible for websites to be coded in such a way that every keystroke
and every mouse movement you make is recorded in detail. The tracking
abilities of websites in this regard are pretty extensive.

A research paper detailing “session replay scripts” demonstrated that a number of popular websites record your keystrokes and mouse movements while you are visiting and then pass those recordings on for further analysis. You can probably imagine the sorts of privacy issues this could cause.

Browser Fingerprints

A browser “fingerprint” is simply the unique combination of browser data,
such as which cookies are on your system and what plugins are installed. The
longer a browser is used and the more it is customized the easier it is to link
to a specific user.

For example, even if you use a VPN to access a site, the site knows your
fingerprint. So if you visit another site using that same browser without the
protection of anonymity, a clear link between those activities can be made.
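
Here is a minimal sketch of that linking step. It assumes a site simply hashes whatever attributes it can collect. The attribute names and values are made up, and real fingerprinting scripts gather far more signals than this.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sorting the keys makes the result independent of dict ordering.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Made-up attribute values, for illustration only.
browser = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "screen": "2560x1440",
    "timezone": "Europe/Berlin",
    "fonts": ["DejaVu Sans", "Fira Code", "Noto Serif"],
    "plugins": ["PDF Viewer"],
}

visit_over_vpn = {"ip": "203.0.113.9", "fingerprint": fingerprint(browser)}
visit_direct = {"ip": "192.0.2.44", "fingerprint": fingerprint(browser)}

# Different IP addresses, same fingerprint: the two visits can be linked.
print(visit_over_vpn["fingerprint"] == visit_direct["fingerprint"])  # True
```

The IP address changes between the two visits, but nothing about the browser does, so the hash stays the same.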

Using a privacy-oriented browser such as the Tor Browser is a good way to
prevent this sort of de-anonymization.

How To Check What You Are Leaking

Several websites exist that will help you figure out where and how you are leaking information. Panopticlick (since renamed Cover Your Tracks) is a great tool by the Electronic Frontier Foundation which does just that.

Just click the big “test me” button and all your paranoid fears may be
confirmed. Luckily there’s never a bad time to sharpen up your privacy