NCJRS Virtual Library

The Virtual Library houses over 235,000 criminal justice resources, including all known OJP works.

NCJ Number

309744

Author(s)

Manal Sultan; Lia Jacobs; Abby Stylianou; Robert Pless

Date Published

September 2023

Length

6 pages

Annotation

In this paper, researchers explore using CLIP for image retrieval.

Abstract

In this paper, researchers consider the ability of CLIP features to support text-driven image retrieval. They find that there is a sweet spot of detail in the text that gives the best results, and that words describing the "tone" of a scene (such as messy or dingy) are quite important in maximizing text-image similarity. Traditional image-based queries sometimes misalign with user intentions because they focus on irrelevant image components. To overcome this, the researchers explore the potential of text-based image retrieval, specifically using Contrastive Language-Image Pretraining (CLIP) models. CLIP models, trained on large datasets of image-caption pairs, offer a promising approach by allowing natural language descriptions for more targeted queries. The authors explore the effectiveness of text-driven image retrieval based on CLIP features by evaluating the image similarity for progressively more detailed queries. (Published Abstract Provided)
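As a rough illustration of the retrieval step the abstract describes, the sketch below ranks images by cosine similarity between a text-query embedding and image embeddings. It is not the authors' code: the vectors are made-up toy values standing in for CLIP features, and the names `cosine_similarity`, `rank_images`, and the image file names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return (image name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(text_embedding, vector))
        for name, vector in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors (assumption): real CLIP features are much
# higher-dimensional and come from a trained image and text encoder.
image_embeddings = {
    "room_a.jpg": [0.9, 0.1, 0.0],
    "room_b.jpg": [0.1, 0.9, 0.1],
    "room_c.jpg": [0.5, 0.5, 0.5],
}
query_embedding = [1.0, 0.2, 0.0]  # stand-in for an encoded text query

ranking = rank_images(query_embedding, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Under this setup, comparing progressively more detailed queries would amount to encoding each version of the query text and observing how the similarity scores and ranking change.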

Related Datasets

https://github.com/GWUvision/Hotels-50K

Exploring CLIP for Real World, Text-based Image Retrieval
