Social media mining


Social media mining is the process of obtaining big data from user-generated content on social media sites and mobile apps in order to extract patterns, form conclusions about users, and act upon the information, often for the purpose of advertising to users or conducting research. The term is an analogy to the resource extraction process of mining for rare minerals. Resource extraction mining requires mining companies to sift through vast quantities of raw ore to find the precious minerals; likewise, social media mining requires human data analysts and automated software programs to sift through massive amounts of raw social media data in order to discern patterns and trends relating to social media usage, online behaviours, sharing of content, connections between individuals, online buying behaviour, and more. These patterns and trends are of interest to companies, governments and not-for-profit organizations, as these organizations can use these patterns and trends to design their strategies or introduce new programs, new products, processes or services.
Social media mining uses a range of basic concepts from computer science, data mining, machine learning and statistics. Social media miners develop algorithms suitable for investigating massive files of social media data. Social media mining is based on theories and methodologies from social network analysis, network science, sociology, ethnography, optimization and mathematics. It encompasses the tools to formally represent, measure and model meaningful patterns from large-scale social media data. In the 2010s, major corporations, governments and not-for-profit organizations engaged in social media mining to obtain data about customers, clients and citizens.

Background

As defined by Kaplan and Haenlein, social media is the "group of internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content." There are many categories of social media including, but not limited to, social networking, microblogging, photo sharing, news aggregation, video sharing, livecasting, virtual worlds, social gaming, social search, and instant messaging.
The first social media website was introduced by GeoCities in 1994. It enabled users to create their own homepages without having a sophisticated knowledge of HTML coding. The first social networking site, SixDegrees.com, was introduced in 1997. Since then, many other social media sites have been introduced, each providing service to millions of people. These individuals form a virtual world in which individuals, entities and interactions coexist. Social norms and human behavior govern this virtual world. By understanding these social norms and models of human behavior and combining them with the observations and measurements of this virtual world, one can systematically analyze and mine social media. Social media mining is the process of representing, analyzing, and extracting meaningful patterns from data in social media, resulting from social interactions. It is an interdisciplinary field encompassing techniques from computer science, data mining, machine learning, social network analysis, network science, sociology, ethnography, statistics, optimization, and mathematics. Social media mining faces grand challenges such as the big data paradox, obtaining sufficient samples, the noise removal fallacy, and evaluation dilemma.
Social media mining represents the virtual world of social media in a computable way, measures it, and designs models that can help us understand its interactions. In addition, social media mining provides necessary tools to mine this world for interesting patterns, analyze information diffusion, study influence and homophily, provide effective recommendations, and analyze novel social behavior in social media.

Uses

Social media mining is used across several industries including business development, social science research, health services, and educational purposes. Once the data received goes through social media analytics, it can then be applied to these various fields. Often, companies use the patterns of connectivity that pervade social networks, such as assortativity—the social similarity between users that are induced by influence, homophily, and reciprocity and transitivity. These forces are then measured via statistical analysis of the nodes and connections between these nodes. Social analytics also uses sentiment analysis, because social media users often relay positive or negative sentiment in their posts. This provides important social information about users' emotions on specific topics.
These three patterns have several uses beyond pure analysis. For example, influence can be used to determine the most influential user in a particular network. Companies would be interested in this information in order to decide who they may hire for influencer marketing. These influencers are determined by recognition, activity generation, and novelty—three requirements that can be measured through the data mined from these sites. Analysts also value measures of homophily: the tendency of two similar individuals to become friends. Users have begun to rely on information of other users' opinions in order to understand diverse subject matter. These analyses can also help create recommendations for individuals in a tailored capacity. By measuring influence and homophily, online and offline companies are able to suggest specific products for individuals consumers, and groups of consumers. Social media networks can use this information themselves to suggest to their users possible friends to add, pages to follow, and accounts to interact with.

Research

Research areas

Social media mining research articles are published in computer science, social science, and data mining conferences and journals:

Conferences

Conference papers can be found in proceedings of Knowledge
Discovery and Data Mining, World Wide Web, Association
for Computational Linguistics, Conference on Information
and Knowledge Management, International Conference on Data
Mining, Internet Measuring Conference.
Social media mining is also present on many data management/database conferences such as the ICDE Conference, SIGMOD Conference and International Conference on Very Large Data Bases.