Caltech Football Numbers (CaltechFN)

Creators: Patrick Rim¹; Snigdha Saha¹; Marcus Rim²

1. California Institute of Technology
2. Vanderbilt University

Description

Digit datasets are widely used as compact, generalizable benchmarks for novel computer vision models. However, modern deep learning architectures have surpassed the human performance benchmark on the current state-of-the-art digit datasets. These datasets largely contain images of digits that are smooth and fully visible, which limits the variability between the digits. On the other hand, the digits on American football jerseys are highly variable due to the propensity of jerseys to be wrinkled, stretched, twisted, and otherwise distorted in live action. Given that American football is a fast-paced sport, the digits on a jersey will likely be distorted in a different way from moment to moment, making it harder for artificial vision systems to differentiate between distinct digits. Furthermore, the digits on American football jerseys will often be partially occluded in a live-action capture due to the presence of other players and props. While the human brain is able to infer the identity of partially occluded digits by "filling in" visual gaps, artificial vision systems struggle to do this. To catalyze the improvement of computer vision models in these areas, we introduce CaltechFN, an image dataset of American football numbers that will serve as a new state-of-the-art benchmark for classification, detection, and localization tasks.

Files

Files (5.0 GB)

Name	MD5	Size
test.tar.gz	1d54f5b931f11323070c9c84392c2794	891.8 MB
test_cropped.tar.gz	9d3c2f23d83c7fd774543f175c4a0572	106.5 MB
train.tar.gz	3dfe7c839f770e9b4f77b6bea6d4600b	3.6 GB
train_cropped.tar.gz	1190cbb743a4dba556010dc70a149141	429.5 MB
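
Since the archives are large, it can be worth checking each download against the MD5 values listed above before extracting. The following is a minimal sketch, assuming the four archives were saved under their original names in a single directory. The helper names (md5_of_file, verify_downloads) are hypothetical and are not part of the dataset or its evaluation repository.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing above.
EXPECTED_MD5 = {
    "test.tar.gz": "1d54f5b931f11323070c9c84392c2794",
    "test_cropped.tar.gz": "9d3c2f23d83c7fd774543f175c4a0572",
    "train.tar.gz": "3dfe7c839f770e9b4f77b6bea6d4600b",
    "train_cropped.tar.gz": "1190cbb743a4dba556010dc70a149141",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-GB archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive found in `directory` to True (match) or False."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.exists():  # archives not yet downloaded are skipped
            results[name] = md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(name, "OK" if ok else "MISMATCH")
```

An archive reported as MISMATCH is most likely an incomplete or corrupted download and should be fetched again.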

Series information

The dataset is presented through four tar.gz files. We describe each:

train.tar.gz: This file contains a folder, 'images', and a .json file, 'train.json'. 'images' contains 49,383 full-sized images of various dimensions. 'train.json' contains their bounding box digit annotations in COCO format.

test.tar.gz: This file also contains a folder, 'images', and a .json file, 'test.json'. 'images' contains 12,345 full-sized images of various dimensions. 'test.json' contains their bounding box digit annotations in COCO format.

train_cropped.tar.gz: This file contains a folder, 'images', and a .mat file, 'train.mat'. 'images' contains 211,661 images of various dimensions, but much smaller than the images in the previous two formats. 'train.mat' contains two arrays: 'names', which is a list of the image names, and 'y', which contains the ground-truth labels of the images, matching 'names'.

test_cropped.tar.gz: This file contains a folder, 'images', and a .mat file, 'test.mat'. 'images' contains 52,911 images of various dimensions, but much smaller than the images in the previous two formats. 'test.mat' contains two arrays: 'names', which is a list of the image names, and 'y', which contains the ground-truth labels of the images, matching 'names'.
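
As an illustration of how the full-sized annotations might be consumed, the sketch below groups bounding boxes by image file. It assumes that 'train.json' and 'test.json' follow the standard COCO object-detection layout, with top-level 'images', 'annotations', and 'categories' lists and boxes stored as [x, y, width, height]. The exact fields present in these files have not been verified here, and the function names are hypothetical.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO-style annotation file into a dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def group_boxes_by_image(coco):
    """Return {file_name: [(label, x, y, width, height), ...]}.

    Assumes standard COCO keys; images without annotations are omitted.
    """
    id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}
    # Fall back to the raw category id when no category names are given.
    id_to_name = {cat["id"]: cat["name"] for cat in coco.get("categories", [])}

    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        file_name = id_to_file[ann["image_id"]]
        label = id_to_name.get(ann["category_id"], ann["category_id"])
        x, y, width, height = ann["bbox"]
        boxes[file_name].append((label, x, y, width, height))
    return dict(boxes)


if __name__ == "__main__":
    # Path is an assumption: 'train.json' as extracted from train.tar.gz.
    annotations = group_boxes_by_image(load_coco("train.json"))
    print(len(annotations), "annotated images")
```

For the cropped archives, one option is scipy.io.loadmat, which reads MATLAB files into a dictionary from which 'names' and 'y' can be looked up. This assumes the .mat files are saved in a format that SciPy supports, and the array shapes should be inspected before pairing names with labels.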

Other

We have benchmarked our dataset with the methods provided in this repository: https://github.com/snigdhasaha7/caltechfn_eval
