Website correlation

Website correlation, or website matching, is a process used to identify websites that are similar or related. Websites are inherently easy to duplicate, which has led to a proliferation of identical or very similar websites for purposes ranging from translation to Internet marketing to Internet crime. Locating similar websites is difficult because they may be in different languages, on different servers, and in different countries.

Uses

Website correlation is used in:

Internet investigations, to determine the overall scope of an investigation
Market research, to locate competitors, determine the market reach of competing companies, or perform cluster sampling
Web filtering systems, to ensure that all websites of a specific type are blocked from view
Data mining systems, to maximize input or output data
Risk management programs, to ensure websites are being monitored for problems that introduce fiscal risk
Compliance monitoring, as part of a compliance and ethics program or policy, to ensure websites follow established guidelines

Correlation types

There are several known types of correlation, each demonstrating different strengths and weaknesses. A practical website correlation process may require combining two or more of these methods.

Similar structure

To save time and effort, website owners duplicate major portions of website code across many domains. Similarity of code structure can provide enough information for correlation. Organizations known to have a publicly searchable database for this kind of correlation include:

http://www.delineal.com

Note: Websites can share the same structure yet have no relationship to each other.
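As a rough illustration of structural comparison, the sketch below reduces each page to its sequence of opening HTML tags and compares the sequences with Python's standard difflib module. It is a minimal sketch, not the method used by any service listed here; the function names and sample pages are hypothetical, and a practical system would need a more robust measure.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the sequence of opening tags, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of opening tag names in document order."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def structure_similarity(html_a, html_b):
    """Return a ratio in [0, 1]; 1.0 means identical tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()


# Hypothetical pages: a and b share a template but differ in language.
page_a = "<html><body><div><h1>Shoes</h1><p>Buy now</p></div></body></html>"
page_b = "<html><body><div><h1>Zapatos</h1><p>Compre ahora</p></div></body></html>"
page_c = "<html><body><table><tr><td>Unrelated</td></tr></table></body></html>"

print(structure_similarity(page_a, page_b))
print(structure_similarity(page_a, page_c))
```

Because only tag names are compared, translated copies of a page score as identical, while unrelated pages built on a common template would also score highly, which is the weakness described in the note above.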

Same server or subnet

Also known as correlated reverse DNS lookup. Websites may be served from the same server, from one or more IP addresses, or from one or more subnets. Several organizations retain archives of IP address data and correlate it. Examples include:

http://www.domaintools.com

Note: Correlation via this method may be misleading because unrelated websites frequently share the same server, as is common with shared hosting.
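Grouping by subnet can be sketched with Python's standard ipaddress module. The example assumes the domains have already been resolved to IPv4 addresses; the domain names, the addresses (taken from ranges reserved for documentation), and the /24 prefix length are all illustrative assumptions.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(domain_ips, prefix_len=24):
    """Map each IPv4 subnet of the given prefix length to the domains inside it.

    Only subnets that contain more than one domain are returned.
    """
    groups = defaultdict(list)
    for domain, ip in domain_ips.items():
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(network)].append(domain)
    return {net: sorted(names) for net, names in groups.items() if len(names) > 1}


# Hypothetical, already-resolved addresses.
resolved = {
    "shop-one.example": "192.0.2.10",
    "shop-two.example": "192.0.2.77",
    "other.example": "198.51.100.5",
}

print(group_by_subnet(resolved))
```

A shared subnet is only a hint: a result like this would need to be combined with another correlation type before treating the sites as related.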

Same owner

Websites may be authored by the same person or organization. Website owners are required to provide contact information to a registrar to obtain a domain name. Domain ownership can be looked up via the WHOIS protocol, but the protocol itself provides no mechanism for searching or correlating ownership across domains. Several organizations retain archives of WHOIS information and provide searching and correlation services. Examples include:

http://whoisology.com
http://www.domaintools.com

Note: Website ownership information can be falsified, outdated, or hidden from public view. Website correlation via this method can be accurate, misleading, or impossible, depending on the information contained in WHOIS records.
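Once WHOIS records have been archived, correlation by owner amounts to grouping domains on a normalized contact field. The sketch below assumes the records have already been parsed into dictionaries; the field names and sample records are hypothetical and do not reflect the format of any particular WHOIS archive.

```python
from collections import defaultdict


def normalize(value):
    """Lower-case and collapse whitespace so trivially different entries match."""
    return " ".join(value.lower().split())


def group_by_registrant(records, field="registrant_email"):
    """Group domains whose records share the same normalized contact field.

    Records with a missing or empty field are skipped, since hidden or
    redacted ownership data cannot be correlated.
    """
    groups = defaultdict(list)
    for record in records:
        key = normalize(record.get(field, ""))
        if key:
            groups[key].append(record["domain"])
    return {key: sorted(domains) for key, domains in groups.items() if len(domains) > 1}


# Hypothetical, already-parsed WHOIS records.
records = [
    {"domain": "alpha.example", "registrant_email": "Owner@Mail.example"},
    {"domain": "beta.example", "registrant_email": " owner@mail.example "},
    {"domain": "gamma.example", "registrant_email": ""},
    {"domain": "delta.example", "registrant_email": "someone@else.example"},
]

print(group_by_registrant(records))
```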

Same category

Websites are frequently categorized or tagged similarly via automated or manual means. Examples of publicly accessible website categorization databases include:

http://www.similarsitesearch.com/
http://similarsites.com
http://similarsites.de
http://www.similarsitecheck.com
http://www.similarto.us
DMOZ

Note: Manual categorization and tagging methods are inherently subjective. Automated methods are subject to the strengths and weaknesses of the underlying categorization algorithms.
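One simple way to compare two websites by their category tags is the Jaccard index, the size of the intersection of the two tag sets divided by the size of their union. This is a swapped-in illustrative measure, not the algorithm of any database listed above; the tags are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Return the Jaccard index of two tag collections, in [0, 1]."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        # Two untagged sites give no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical tag sets.
site_a = {"shoes", "retail", "fashion"}
site_b = {"shoes", "retail", "outlet"}
site_c = {"news", "politics"}

print(jaccard(site_a, site_b))
print(jaccard(site_a, site_c))
```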

Same tracking ID

Tracking IDs, used for analytics or affiliate identification, are frequently embedded in website code. These IDs can be used for correlation because they imply common management of websites. Publicly available websites for correlating by tracking ID include:

http://ewhois.com
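Extraction of tracking IDs can be sketched with a regular expression applied to page source. The example assumes identifiers in the Google Analytics "UA-account-property" style and groups sites by the account portion; the pattern, the page snippets, and the function name are illustrative assumptions, and other tracking systems would need their own patterns.

```python
import re
from collections import defaultdict

# Assumed ID shape: "UA-<account digits>-<property digits>".
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-\d{1,4}\b")


def correlate_by_tracking_id(pages):
    """Map each analytics account to the domains whose source embeds it.

    Only accounts seen on more than one domain are returned.
    """
    sites_by_account = defaultdict(set)
    for domain, html in pages.items():
        for account in UA_PATTERN.findall(html):
            sites_by_account[f"UA-{account}"].add(domain)
    return {
        account: sorted(domains)
        for account, domains in sites_by_account.items()
        if len(domains) > 1
    }


# Hypothetical page sources.
pages = {
    "a.example": "<script>ga('create', 'UA-1234567-1', 'auto');</script>",
    "b.example": "<script>ga('create', 'UA-1234567-2', 'auto');</script>",
    "c.example": "<script>ga('create', 'UA-7654321-1', 'auto');</script>",
}

print(correlate_by_tracking_id(pages))
```

Grouping on the account portion rather than the full ID lets two sites registered as separate properties under one account still be matched.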
