Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification

  • Jaime Arguello ,
  • Adam Ferguson ,
  • Emery Fine ,
  • ,
  • Hamed Zamani ,
  • Fernando Diaz

Proceedings of the 6th international ACM SIGIR Conference on Human Information Interaction and Retrieval

Published by ACM


While current information retrieval systems are effective for known-item retrieval where the searcher provides a precise name or identifier for the item being sought, systems tend to be much less effective for cases where the searcher is unable to express a precise name or identifier. We refer to this as tip of the tongue (TOT) known-item retrieval, named after the cognitive state of not being able to retrieve an item from memory. Using movie search as a case study, we explore the characteristics of questions posed by searchers in TOT states in a community question answering website. We analyze how searchers express their information needs during TOT states in the movie domain. Specifically, what information do searchers remember about the item being sought and how do they convey this information? Our results suggest that searchers use a combination of information about: (1) the content of the item sought, (2) the context in which they previously engaged with the item, and (3) previous attempts to find the item using other resources (e.g., search engines). Additionally, searchers convey information by sometimes expressing uncertainty (i.e., hedging), opinions, emotions, and by performing relative (vs. absolute) comparisons with attributes of the item. As a result of our analysis, we believe that searchers in TOT states may require specialized query understanding methods or document representations. Finally, our preliminary retrieval experiments show the impact of each information type presented in information requests on retrieval performance.

Publication Downloads

Tip of the Tongue Known Item Retrieval Dataset for Movie Identification

August 17, 2021

The Tip of the Tongue (ToT) dataset is from the paper Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification. It comprises 758 question/answer pairs scraped from the website iRememberThisMovie.com between 2013 and 2018. These question/answer pairs consist of REQUESTS, in which a user of the website describes a movie they have seen but whose title they have forgotten, and ANSWERS, which consist of different solutions to the request from other users of the website. We also attach Wikipedia/IMDB links for the films.

We annotate the text of the REQUESTS at the sentence level using a handcrafted set of codes. This set of codes is used to identify trends in the data such as mentions of release/viewing dates, characters or locations remembered from the film, circumstances surrounding the viewing of the film, and others. A complete list of these codes (also available in Tables 1, 2, and 3 in Section 4.3 of our paper) is presented below.

Movie: Codes touching on the content of the movie

  • Character: Describes a character
  • Scene: Describes a scene
  • Object: Describes a tangible object in a scene
  • Location type: Describes a scene’s location type
  • Plot summary: Describes the overall plot or premise
  • Release date: Describes timeframe of movie release
  • Visual style: Describes visual style (e.g., black and white, colour, CGI animation, etc.)
  • Language: Describes the language spoken
  • Regional Origin: Describes movie’s region of origin
  • Specific location: Describes a scene’s specific location
  • Quote/dialogue: Describes a quote from the movie
  • Real person: Describes real person associated with movie
  • Camera angle: Describes camera action
  • Singular timeframe: Describes timeframe
  • Multiple timeframe: Describes the passage of time in the movie
  • Fictional person: Describes fictional person associated with movie (directly or indirectly)
  • Actor nationality: Describes nationality or ethnicity associated with actor/actress
  • Target audience: Describes movie’s target audience
  • Compares music: Describes movie’s soundtrack
  • Specific music: Describes specific song in the movie

Context: Codes touching on the context in which the movie was seen

  • Temporal context: Describes when the movie was seen, either in absolute terms (e.g., around 2008) or relative terms (e.g., when I was a kid)
  • Physical medium: References the physical medium associated with watching the movie (e.g., TV, theatre, VHS, etc.)
  • Cross media: Describes exposure to movie through different media (e.g., trailer, DVD cover, poster, etc.)
  • Contextual witness: Describes other people involved in the movie watching experience
  • Physical location: Describes physical location where movie was watched
  • Concurrent events: Describes events relevant to time period when movie was watched

The following categories do not contain sub-codes:

  • Previous Search: Indicates that a previous attempt had been made to find the movie title
  • Social: Indicates that the sentence is primarily a social nicety without content relating to the film
  • Uncertainty: Indicates that the sentence contains language revealing uncertainty on the author’s part
  • Opinion: Indicates that the sentence contains language conveying an opinion or judgement of the movie
  • Emotion: Indicates that the sentence contains language conveying an emotion the movie made the author feel
  • Relative Comparison: Indicates that the sentence contains language describing the movie using relative terms (such as comparisons to other movies, actors, locations, etc.)
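The code list above can be read as a small two-level taxonomy. The Python sketch below shows one way to hold it in memory and to group a request's sentence-level annotations by top-level category. The code names are transcribed from the list; everything else is an assumption made for illustration. In particular, the dataset's actual file format is not described here, so the `(sentence, codes)` tuple layout, the function names, and the example request are all hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Sub-codes transcribed from the code list above.
MOVIE_CODES = [
    "Character", "Scene", "Object", "Location type", "Plot summary",
    "Release date", "Visual style", "Language", "Regional Origin",
    "Specific location", "Quote/dialogue", "Real person", "Camera angle",
    "Singular timeframe", "Multiple timeframe", "Fictional person",
    "Actor nationality", "Target audience", "Compares music", "Specific music",
]
CONTEXT_CODES = [
    "Temporal context", "Physical medium", "Cross media",
    "Contextual witness", "Physical location", "Concurrent events",
]
# Categories without sub-codes: each one acts as its own category.
STANDALONE_CODES = [
    "Previous Search", "Social", "Uncertainty", "Opinion", "Emotion",
    "Relative Comparison",
]

# Map every code to its top-level category.
CODE_TO_CATEGORY = {}
for code in MOVIE_CODES:
    CODE_TO_CATEGORY[code] = "Movie"
for code in CONTEXT_CODES:
    CODE_TO_CATEGORY[code] = "Context"
for code in STANDALONE_CODES:
    CODE_TO_CATEGORY[code] = code


def category_counts(sentences):
    """Count how many sentences carry at least one code from each category."""
    counts = Counter()
    for _text, codes in sentences:
        # A set ensures a category is counted at most once per sentence.
        for category in {CODE_TO_CATEGORY[c] for c in codes}:
            counts[category] += 1
    return counts


def keep_category(sentences, category):
    """Join the text of sentences that carry at least one code from `category`."""
    return " ".join(
        text
        for text, codes in sentences
        if any(CODE_TO_CATEGORY[c] == category for c in codes)
    )


# Hypothetical annotated request, invented for illustration only.
EXAMPLE_REQUEST = [
    ("I think I saw it on TV when I was a kid.",
     ["Temporal context", "Physical medium", "Uncertainty"]),
    ("A boy finds a glowing stone in a cave.",
     ["Character", "Scene", "Object"]),
    ("I already tried searching online with no luck.",
     ["Previous Search"]),
]

if __name__ == "__main__":
    print(category_counts(EXAMPLE_REQUEST))
    print(keep_category(EXAMPLE_REQUEST, "Movie"))
```

Filtering a request down to the sentences of one category, as `keep_category` does, is one simple way to build query variants for studying how a given information type affects retrieval. It is offered only as a sketch and may differ from the procedure used in the paper's experiments.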

TREC Tip-of-the-Tongue Track

April 24, 2024

Tip-of-the-tongue (ToT) known-item retrieval is defined as "an item identification task in which the searcher has previously experienced an item but cannot recall a reliable identifier" (i.e., "It’s on the tip of my tongue…"). The TREC ToT track aims to develop IR systems that can successfully resolve ToT information needs. Progress in this area will likely benefit other IR systems that must deal with memory assistance, such as personal information management (PIM) systems (e.g., email re-finding).