Caltech 101

Caltech 101 is a data set of digital images created in September 2003 and compiled by Fei-Fei Li, Marco Andreetto, Marc 'Aurelio Ranzato and Pietro Perona at the California Institute of Technology. It is intended to facilitate Computer Vision research and techniques and is most applicable to techniques involving image recognition classification and categorization. Caltech 101 contains a total of 9,146 images, split between 101 distinct object categories and a background category. Provided with the images are a set of annotations describing the outlines of each image, along with a Matlab script for viewing.

Purpose

Most Computer Vision and Machine Learning algorithms function by training on example inputs. They require a large and varied set of training data to work effectively. For example, the real-time face detection method used by Paul Viola and Michael J. Jones was trained on 4,916 hand-labeled faces.
Cropping, re-sizing and hand-marking points of interest is tedious and time-consuming.
Historically, most data sets used in computer vision research have been tailored to the specific needs of the project being worked on.A large problem in comparing computer vision techniques is the fact that most groups use their own data sets. Each set may have different properties that make reported results from different methods harder to compare directly. For example, differences in image size, image quality, relative location of objects within the images and level of occlusion and clutter present can lead to varying results.
The Caltech 101 data set aims at alleviating many of these common problems.

The images are cropped and re-sized.
Many categories are represented, which suits both single and multiple class recognition algorithms.
Detailed object outlines are marked.
Available for general use, Caltech 101 acts as a common standard by which to compare different algorithms without bias due to different data sets.

However, a recent study demonstrates that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction.

Data set

Images

The Caltech 101 data set consists of a total of 9,146 images, split between 101 different object categories, as well as an additional background/clutter category.
Each object category contains between 40 and 800 images. Common and popular categories such as faces tend to have a larger number of images than others.
Each image is about 300x200 pixels. Images of oriented objects such as airplanes and motorcycles were mirrored to be left to right aligned and vertically oriented structures such as buildings were rotated to be off axis.

Annotations

A set of annotations is provided for each image. Each set of annotations contains two pieces of information: the general bounding box in which the object is located and a detailed human-specified outline enclosing the object.
A Matlab script is provided with the annotations. It loads an image and its corresponding annotation file and displays them as a Matlab figure.

Uses

The Caltech 101 data set was used to train and test several computer vision recognition and classification algorithms. The first paper to use Caltech 101 was an incremental Bayesian approach to one shot learning, an attempt to classify an object using only a few examples, by building on prior knowledge of other classes.
The Caltech 101 images, along with the annotations, were used for another one shot learning paper at Caltech.
Other Computer Vision papers that report using the Caltech 101 data set include:

Shape Matching and Object Recognition using Low Distortion Correspondence. Alexander C. Berg, Tamara L. Berg, Jitendra Malik. CVPR 2005
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. K. Grauman and T. Darrell. International Conference on Computer Vision, 2005
Combining Generative Models and Fisher Kernels for Object Class Recognition. Holub, AD. Welling, M. Perona, P. International Conference on Computer Vision, 2005
Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Computer Society Press, San Diego, June 2005.
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition. Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik. CVPR, 2006
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. CVPR, 2006
Empirical Study of Multi-Scale Filter Banks for Object Categorization. M.J. Mar韓-Jim閚ez, and N. P閞ez de la Blanca. December 2005
Multiclass Object Recognition with Sparse, Localized Features. Jim Mutch and David G. Lowe., pg. 11-18, CVPR 2006, IEEE Computer Society Press, New York, June 2006
Using Dependent Regions or Object Categorization in a Generative Framework. G. Wang, Y. Zhang, and L. Fei-Fei. IEEE Comp. Vis. Patt. Recog. 2006
Analysis and comparison

Advantages

Caltech 101 has several advantages over other similar data sets:

Uniform size and presentation:
*Almost all the images within each category are uniform in image size and in the relative position of interest objects. Caltech 101 users generally do not need to crop or scale images before they can be used.
Low level of clutter/occlusion:
*Algorithms concerned with recognition usually function by storing features unique to the object. However, most images taken have varying degrees of background clutter, which means algorithms may build incorrectly.
Detailed annotations
Weaknesses

Weaknesses to the Caltech 101 data set may be conscious trade-offs, but others are limitations of the data set. Papers that rely solely on Caltech 101 are frequently rejected.
Weaknesses include:

The data set is too clean:
*Images are very uniform in presentation, aligned from left to right, and usually not occluded. As a result, the images are not always representative of practical inputs that the algorithm might later expect to see. Under practical conditions, images are more cluttered, occluded and display greater variance in relative position and orientation of interest objects. The uniformity allows concepts to be derived using the average of a category, which is unrealistic.
Limited number of categories:
*The Caltech 101 data set represents only a small fraction of possible object categories.
Some categories contain few images:
*Certain categories are not represented as well as others, containing as few as 31 images.
*This means that. The number of images used for training must be less than or equal to 30, which is not sufficient for all purposes.
Aliasing and artifacts due to manipulation:
*Some images have been rotated and scaled from their original orientation, and suffer from some amount of artifacts or aliasing.
Other data sets
Caltech 256 is another image data set, created in 2007. It is a successor to Caltech 101. It is intended to address some of the weaknesses of Caltech 101. Overall, it is a more difficult data set than Caltech 101, but it suffers from comparable problems. It includes
*30,607 images, covering a larger number of categories
*Minimum number of images per category raised to 80
*Images are not left-right aligned
*More variation in image presentation
LabelMe is an open, dynamic data set created at MIT Computer Science and Artificial Intelligence Laboratory. LabelMe takes a different approach to the problem of creating a large image data set, with different trade-offs.
*106,739 images, 41,724 annotated images, and 203,363 labeled objects.
*Users may add images to the data set by upload, and add labels or annotations to existing images.
*Due to its open nature, LabelMe has many more images covering a much wider scope than Caltech 101. However, since each person decides what images to upload, and how to label and annotate each image, the images are less consistent.
VOC 2008 is a European effort to collect images for benchmarking visual categorization methods. Compared to Caltech 101/256, a smaller number of categories are collected. The number of images in each category, however, is larger.
Overhead Imagery Research Data Set is an annotated library of imagery and tools. OIRDS v1.0 is composed of passenger vehicle objects annotated in overhead imagery. Passenger vehicles in the OIRDS include cars, trucks, vans, etc. In addition to the object outlines, the OIRDS includes subjective and objective statistics that quantify the vehicle within the image's context. For example, subjective measures of image clutter, clarity, noise, and vehicle color are included along with more objective statistics such as ground sample distance, time of day, and day of year.
* ~900 images, containing ~1800 annotated images
* ~30 annotations per object
* ~60 statistical measures per object
* Wide variation in object context
* Limited to passenger vehicles in overhead imagery
MICC-Flickr 101 is an image data set created at the Media Integration and Communication Center, University of Florence, in 2012. It is based on Caltech 101 and is collected from Flickr. MICC-Flickr 101 corrects the main drawback of Caltech 101, i.e. its low inter-class variability and provides social annotations through user tags. It builds on a standard and widely used data set composed of a manageable number of categories and therefore can be used to compare object categorization performance in a constrained scenario and object categorization "in the wild" on the same 101 categories.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...