MovieLens was not the first recommender system created by GroupLens. In May 1996, GroupLens formed a commercial venture called Net Perceptions, which served clients that included E! Online and Amazon.com. E! Online used Net Perceptions' services to create the recommendation system for Moviefinder.com, while Amazon.com used the company's technology to form its early recommendation engine for consumer purchases. When another movie recommendation site, eachmovie.org, closed in 1997, the researchers who built it publicly released the anonymous rating data they had collected for other researchers to use. The GroupLens Research team, led by Brent Dahlen and Jon Herlocker, used this data set to jumpstart a new movie recommendation site, which they chose to call MovieLens. Since its inception, MovieLens has become a very visible research platform: findings from its data have been discussed in detail in a New Yorker article by Malcolm Gladwell and reported in a full episode of ABC Nightline. Additionally, MovieLens data has been critical for several research studies, including "Using Social Psychology to Motivate Contributions to Online Communities", a collaborative study between Carnegie Mellon University, the University of Michigan, the University of Minnesota, and the University of Pittsburgh. In spring 2015, a search for "movielens" produced 2,750 results in Google Books and 7,580 in Google Scholar.
Recommendations
MovieLens bases its recommendations on input provided by users of the website, such as movie ratings. The site uses a variety of recommendation algorithms, including collaborative filtering algorithms such as item-item, user-user, and regularized SVD. In addition, to address the cold-start problem for new users, MovieLens uses preference elicitation methods. The system asks new users to rate how much they enjoy watching various groups of movies. The preferences recorded by this survey allow the system to make initial recommendations, even before the user has rated a large number of movies on the website. For each user, MovieLens predicts how the user will rate any given movie on the website. Based on these predicted ratings, the system recommends movies that the user is likely to rate highly. The website suggests that users rate as many fully watched films as possible, because a larger sample of the user's film tastes allows more accurate recommendations. However, MovieLens' rating incentive approach is not always effective: researchers found that more than 20% of the movies listed in the system have so few ratings that the recommender algorithms cannot accurately predict whether subscribers will like them. The recommendations on movies cannot contain any marketing values that can tackle the large number of movie ratings as a "seed dataset". In addition to movie recommendations, MovieLens also provides information on individual films, such as the list of actors and directors of each film. Users may also submit and rate tags, which may be used to increase the accuracy of the recommendation system. Users can rate a movie at any time, even years after watching it. Users often enter numerous ratings at once, either hoping to get more personalized recommendations or simply for their own satisfaction.
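As an illustration of the item-item collaborative filtering mentioned above, the following is a minimal sketch in Python. It is not MovieLens's actual implementation: the ratings are made-up toy data, the function names are hypothetical, and plain cosine similarity over co-rating users is assumed as the similarity measure. The predicted rating for a movie is the similarity-weighted average of the ratings the same user gave to other movies.

```python
from math import sqrt

# Toy ratings (hypothetical data, not taken from MovieLens): user -> {movie: rating}
RATINGS = {
    "ann": {"Alien": 5.0, "Brazil": 4.0, "Casablanca": 1.0},
    "bob": {"Alien": 4.0, "Brazil": 5.0, "Casablanca": 2.0, "Dune": 4.0},
    "cat": {"Alien": 1.0, "Brazil": 2.0, "Casablanca": 5.0, "Dune": 2.0},
}


def item_vector(ratings, movie):
    """Return {user: rating} for every user who rated `movie`."""
    return {user: seen[movie] for user, seen in ratings.items() if movie in seen}


def cosine_similarity(a, b):
    """Cosine similarity of two {user: rating} vectors over their common users."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, movie):
    """Predict `user`'s rating of `movie` from the user's ratings of other movies."""
    target = item_vector(ratings, movie)
    weighted_sum = 0.0
    weight_total = 0.0
    for other, rating in ratings[user].items():
        if other == movie:
            continue
        sim = cosine_similarity(target, item_vector(ratings, other))
        weighted_sum += sim * rating
        weight_total += abs(sim)
    if weight_total == 0.0:
        return None  # cold start: no rated movies to base a prediction on
    return weighted_sum / weight_total


prediction = predict_rating(RATINGS, "ann", "Dune")
print(f"Predicted rating for ann on Dune: {prediction:.2f}")
```

A user with no ratings yields `None`, which is the cold-start situation that preference elicitation is meant to address. Practical systems usually refine this basic scheme, for example by adjusting for each user's average rating and by using only the most similar items.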
Reception
By September 1997, the website had reached over 50,000 users. When the Akron Beacon Journal's Paula Schleis tried out the website, she was surprised at how accurately it recommended new films for her to watch based on her film tastes. Outside the realm of movie recommendations, data from MovieLens has been used by Solution by Simulation to make Oscar predictions.
Research
In 2004, a collaborative study with researchers from Carnegie Mellon University, the University of Michigan, the University of Minnesota and the University of Pittsburgh designed and tested incentives derived from the social psychology principles of social loafing and goal-setting on MovieLens users. The researchers saw that under-contribution seemed to be a problem for the community and set up a study to discern the most effective way to motivate users to rate and review more films. The study ran two field experiments: one involved email messages that reminded users of the uniqueness of their contributions and the benefits that follow from them, and the other gave users a range of individual or group goals for contribution. The first experiment, based on an analysis of the MovieLens community’s cumulative response, found that users were more likely to contribute when they were reminded of their uniqueness, which led them to think that their contributions did not duplicate what other users could provide. Contrary to the researchers’ hypothesis, it also found that users were less likely to contribute when the benefit they receive from rating, or the benefit others receive when they rate, was made salient to them. Lastly, the researchers found no support for a relationship between uniqueness and benefit. The second experiment found that users were also more likely to contribute when they were given specific and challenging goals and were led to believe that their contributions were needed to accomplish the group’s goal. The study found that in this particular context, giving users group-level goals actually increased contributions compared to individual goals, whereas the researchers had predicted that the reverse would be true due to the effects of social loafing.
The relationship between goal difficulty and user contributions in both the group and individual cases gave weak evidence that, beyond a certain difficulty threshold, performance drops instead of plateauing as hypothesized in Locke and Latham’s goal-setting theory.
Datasets
GroupLens Research, a human-computer interaction research lab at the University of Minnesota, provides the rating data sets collected from the MovieLens website for research use. The full data set contains 26,000,000 ratings and 750,000 tag applications applied to 45,000 movies by 270,000 users. It also includes tag genome data with 12 million relevance scores across 1,100 tags. Many types of research have been conducted using the MovieLens data sets. For example, Liu et al. used MovieLens data sets to test the efficiency of an improved random walk algorithm that depresses the influence of large-degree objects. GroupLens has terms of use for the dataset.
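A first step in many analyses of such rating data is counting how many ratings each movie has received, since sparsely rated movies are hard to predict for. The sketch below shows one way to do this in Python. The column layout (`userId,movieId,rating,timestamp`) is an assumption about the file format, the rows are made-up sample data, and the function names and the threshold of 2 ratings are hypothetical choices for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in an assumed userId,movieId,rating,timestamp layout.
SAMPLE_CSV = """userId,movieId,rating,timestamp
1,10,4.0,964982703
1,20,3.5,964981247
2,10,5.0,964982224
2,30,2.0,964983815
3,10,4.5,964982931
3,20,4.0,964982400
"""


def ratings_per_movie(csv_file):
    """Count how many ratings each movieId received."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row["movieId"]] += 1
    return counts


def sparse_share(counts, min_ratings):
    """Fraction of rated movies that have fewer than `min_ratings` ratings."""
    if not counts:
        return 0.0
    sparse = sum(1 for n in counts.values() if n < min_ratings)
    return sparse / len(counts)


counts = ratings_per_movie(io.StringIO(SAMPLE_CSV))
share = sparse_share(counts, min_ratings=2)
print(f"{share:.0%} of rated movies have fewer than 2 ratings")
```

To run this on a real file, the `io.StringIO` object can be replaced with a file opened via `open(path, newline="")`. Note that the sketch only sees movies with at least one rating; movies that were never rated would have to be counted from a separate movie list.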