Patent US9280595B2 — “Application Query Conversion”
Filed: August 30, 2012 · Granted: March 8, 2016 · Status: Active (expires 2034)
Assignee: Apple Inc. · Inventors: Catherine A. Edwards, Natalia Hernandez-Gardiol
Continuation: US20160171091A1 (filed January 2016)
The Problem: Keyword Counting Breaks Search
The patent opens with a simple example that explains why pure keyword matching fails.
A user searches for “email.” Two apps exist in the store. A mail app mentions “email” once across its title and description. A pool game mentions “email” three times in its description — something like “email your scores to friends, email challenges to opponents, email invitations to play.”
If you rank by raw keyword count, the pool game wins. That’s obviously wrong. This patent describes how Apple fixes it.
The Solution in Plain English
Instead of just counting how many times a keyword appears, Apple’s system asks a smarter question: does this app’s listing contain the words that real email apps typically contain?
The system builds a vocabulary of “indicator terms” for every common search query. For “email,” those indicator terms might be: account, reply, compose, message, filter, inbox, attachment, forward.
When ranking search results, the system checks whether each app’s listing contains these indicator terms. The mail app’s listing naturally includes words like “compose,” “reply,” and “inbox.” The pool game’s listing includes none of them. So the mail app ranks higher — even though it mentions “email” fewer times.
The key insight: Apple judges your app by the company your words keep, not just by the keyword itself.
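The idea above can be sketched in a few lines of Python. This is an illustration only: the indicator-term set and the two listings are invented for the “email” example, and the patent does not publish actual term lists.

```python
import re

# Hypothetical indicator terms for the query "email" (invented for
# illustration; the real vocabulary is learned, as described below).
INDICATOR_TERMS = {"account", "reply", "compose", "message",
                   "filter", "inbox", "attachment", "forward"}

def indicator_coverage(listing_text: str) -> int:
    """Count how many distinct indicator terms appear in a listing."""
    words = set(re.findall(r"[a-z]+", listing_text.lower()))
    return len(INDICATOR_TERMS & words)

mail_app = "Compose email, reply from your inbox, send any attachment"
pool_game = "Email your scores, email challenges, email invitations to play"

# The mail app mentions "email" once but covers four indicator terms;
# the pool game mentions "email" three times but covers none.
```

Judged by coverage of the surrounding vocabulary rather than raw keyword count, the mail app wins.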
Where Exactly Does Apple Look? Title vs. Description
The patent is specific about which parts of an app listing the algorithm reads. It refers to each app’s “document” — which includes the title, description, and metadata. But critically, the patent reveals that the algorithm does not treat all fields equally:
The system performs separate counts for separate sections of the document — specifically, title versus description. This means Apple tracks indicator term appearances in the title independently from appearances in the description. A match in the title carries different weight than a match in the description.
The patent also states that the relatedness score between terms can be influenced by whether two terms appear within the same field. If your search keyword and an indicator term both appear in the title, that’s a stronger signal than if one appears in the title and the other in the description.
Additionally, the system considers word proximity — how many words separate the query term from an indicator term within the same field. “Email” appearing right next to “compose” in a description signals stronger relevance than “email” at the top and “compose” at the bottom.
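A minimal sketch of per-field counting with a proximity bonus, assuming a simple inverse-distance weighting and a title boost. Both the weighting form and the numeric weights are assumptions, not values from the patent.

```python
import re

def pair_signal(field_text: str, query: str, indicator: str,
                field_weight: float) -> float:
    """Score a query/indicator co-occurrence within ONE field.

    Assumed form: field weight divided by (1 + closest word gap),
    so adjacent terms in the same field signal most strongly.
    """
    tokens = re.findall(r"[a-z]+", field_text.lower())
    positions_q = [i for i, t in enumerate(tokens) if t == query]
    positions_i = [i for i, t in enumerate(tokens) if t == indicator]
    if not positions_q or not positions_i:
        return 0.0
    gap = min(abs(q - i) for q in positions_q for i in positions_i)
    return field_weight / (1 + gap)

# Title match, terms adjacent (hypothetical title weight of 2.0):
title_score = pair_signal("Email - compose and send", "email", "compose", 2.0)
# Description match, terms three words apart:
desc_score = pair_signal("a long description compose messages via email",
                         "email", "compose", 1.0)
```

Under these assumptions the adjacent title pair scores 1.0 while the separated description pair scores 0.25, mirroring the patent's field and proximity signals.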
To summarize what the algorithm reads:
| Listing Element | Checked by Algorithm | Notes |
|---|---|---|
| App Title | Yes — counted separately | Matches here likely carry more weight |
| Description | Yes — counted separately | Word proximity matters within this field |
| Metadata (including keyword field) | Yes — part of the “document” | Referenced alongside title and description |
| Download count / frequency | Yes | Non-textual ranking signal |
| User ratings | Yes | Non-textual ranking signal |
The patent explicitly lists the app database as containing “titles, descriptions, metadata, download frequency or count, user ranking, and/or the apps themselves.” All of these feed into the system.
How Apple Builds Its Vocabulary: Two Methods
Method 1: Learning From What Users Actually Download
Think of this like a restaurant recommendation based on what similar customers ordered.
Apple logs every search and what happens next. When thousands of users search “email,” the system records which apps they actually download. Suppose 15% download Mail App, 7% download YourMail, 4% download MyMail, and 0% download the pool game.
The system then takes only the listings of apps that users chose — Mail, YourMail, MyMail — and throws away the pool game listing. It reads through the remaining listings and asks: which words appear consistently across all of these?
Words like “inbox,” “compose,” “reply,” and “attachment” appear in most of these mail app listings. The word “cue” or “billiards” appears in none of them. So “inbox,” “compose,” “reply,” and “attachment” become the indicator terms for the query “email.”
This method is powerful because it’s grounded in real user behavior. It doesn’t rely on a dictionary or a linguist deciding what “email” means — it learns meaning from what millions of users actually want when they search for it.
Important detail: Documents from more popular apps get more weight. If Mail App captures 15% of downloads and MyMail only 4%, the words in Mail App’s listing count roughly four times more when building the indicator term list. The vocabulary of winning apps shapes the vocabulary the algorithm expects.
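Method 1 can be sketched as follows. The download shares and listing texts are the invented ones from the example above, and the stop-word list stands in for the patent's filtering of articles and prepositions.

```python
from collections import Counter
import re

# Hypothetical listings of the apps users actually downloaded for
# "email", paired with their download shares (invented numbers).
chosen = [
    (0.15, "compose and reply from your inbox with attachments"),
    (0.07, "inbox with filters, compose and reply fast"),
    (0.04, "reply to messages, manage your inbox"),
]

# Stand-in for the patent's filtering of articles and prepositions.
STOP = {"and", "your", "with", "from", "to", "the", "a"}

scores = Counter()
for share, listing in chosen:
    for word in set(re.findall(r"[a-z]+", listing.lower())):
        if word not in STOP:
            scores[word] += share  # popular apps shape the vocabulary more

# Words with the highest weighted presence become indicator terms.
top_terms = [w for w, _ in scores.most_common(5)]
```

With these numbers, “reply” and “inbox” (present in all three listings) lead, followed by “compose” and “attachments”, which appear mostly in the most-downloaded listings.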
Method 2: The Giant Comparison Table
The second approach works like a massive cross-reference chart.
Apple takes every meaningful word (nouns and verbs, ignoring articles and prepositions like “the,” “a,” “to”) used across all app listings in the store. Let’s say that’s 5,000 words. It builds a 5,000 × 5,000 table. Each cell in the table contains a score answering the question: “If an app listing contains Word A, how likely is it to also contain Word B?”
For example, the cell for (“email,” “compose”) would have a high score because most app listings that mention “email” also mention “compose.” But the cell for (“email,” “billiards”) would be near zero.
The scores are directional — meaning the relationship isn’t always equal in both directions. “Email” strongly predicts “compose,” but “compose” only weakly predicts “email” because “compose” also appears in music apps (compose a song) and writing apps (compose a letter).
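The directionality can be demonstrated with a tiny invented corpus, treating each cell (A, B) as the fraction of listings containing A that also contain B. This is one plausible reading of the table, not the patented formula.

```python
import re

# Tiny invented corpus; a real table spans thousands of terms.
corpus = [
    "email compose inbox",
    "email compose reply",
    "compose song music",
    "compose letter notes",
]
docs = [set(re.findall(r"[a-z]+", d)) for d in corpus]

def cooccurrence(a: str, b: str) -> float:
    """Fraction of listings containing `a` that also contain `b`."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(b in d for d in with_a) / len(with_a)

# Directional: every "email" listing here contains "compose",
# but only half the "compose" listings contain "email".
email_to_compose = cooccurrence("email", "compose")
compose_to_email = cooccurrence("compose", "email")
```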
This table is built ahead of time, not during a live search. So when a user types “email,” the system instantly looks up Row “email” in the table and retrieves the top indicator terms with their relevance scores. No delay.
The scoring formula works like this in simple terms: if “compose” appears in 80 out of 100 app listings that contain “email,” but only in 200 out of 50,000 total listings, the score is high — because “compose” is both common in email apps and rare overall, making it a strong signal. If a word like “free” appears in 60 out of 100 email-app listings but also in 30,000 out of 50,000 total listings, its score is low — it’s everywhere and tells you nothing specific.
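The scoring intuition above can be written as a ratio of rates: how common the term is among query-matching listings divided by how common it is overall. This ratio form is an illustration of the described behavior, not the patent's actual formula; the counts are the ones from the example.

```python
def relatedness(co_count: int, query_docs: int,
                term_docs: int, total_docs: int) -> float:
    """P(term | listings containing the query) / P(term | all listings)."""
    rate_given_query = co_count / query_docs
    base_rate = term_docs / total_docs
    return rate_given_query / base_rate

# "compose": 80 of 100 email listings, but only 200 of 50,000 overall.
strong = relatedness(80, 100, 200, 50_000)    # common in-context, rare overall
# "free": 60 of 100 email listings, but 30,000 of 50,000 overall.
weak = relatedness(60, 100, 30_000, 50_000)   # everywhere, so uninformative
```

“Compose” scores 200x its base rate; “free” scores 1x, carrying no signal.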
How Indicator Terms Change Your App’s Ranking
When a user searches “email,” the ranking process works in five steps:
Step 1: Look up “email” in the indicator term table. Retrieve related terms: compose (weight: 0.9), reply (0.85), inbox (0.8), filter (0.6), attachment (0.55), etc.
Step 2: For each app in the database, count how many times the original keyword “email” and each indicator term appear — counting title and description separately.
Step 3: Multiply each count by that term’s weight. “Email” itself gets full weight (1.0). “Compose” appearances get multiplied by 0.9. “Filter” appearances by 0.6. And so on.
Step 4: Add up all the weighted counts for each app. This produces a document score.
Step 5: Rank apps by their document score. Non-textual factors like download popularity, ratings, and price can also adjust the final ranking.
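The five steps above can be sketched as a single scoring function. The term weights come from the example in Step 1; the title boost and the two listings are invented for illustration.

```python
import re

# Indicator weights from the Step 1 example; "email" itself gets 1.0.
WEIGHTS = {"email": 1.0, "compose": 0.9, "reply": 0.85,
           "inbox": 0.8, "filter": 0.6, "attachment": 0.55}
TITLE_BOOST = 2.0  # assumption: title matches carry more weight

def document_score(title: str, description: str) -> float:
    score = 0.0
    # Step 2: count terms per field, title separately from description.
    for field_text, boost in ((title, TITLE_BOOST), (description, 1.0)):
        tokens = re.findall(r"[a-z]+", field_text.lower())
        for term, weight in WEIGHTS.items():
            # Step 3: multiply counts by term weight (and field boost).
            score += boost * weight * tokens.count(term)
    return score  # Step 4: summed weighted counts

apps = {
    "Mail App": ("Email Inbox", "compose, reply, attach and email"),
    "Pool Game": ("8-Ball Pool", "email scores, email friends, email invites"),
}
# Step 5: rank by document score (non-textual factors omitted here).
ranked = sorted(apps, key=lambda a: document_score(*apps[a]), reverse=True)
```

Despite mentioning “email” three times, the pool game scores 3.0 against the mail app's 6.35, because it contributes nothing through indicator terms.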
The patent also describes an alternative cascading approach: first find all apps matching “email,” then filter using the strongest indicator terms, then further refine with the next-strongest ones. Either way, the result is the same — apps that contain the right ecosystem of related terms rank higher than those that simply repeat the search keyword.
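The cascading alternative can be sketched as successive filtering, strongest indicator first, stopping a refinement if it would leave too few candidates. The stopping rule and the tiny listing set are assumptions for illustration.

```python
def cascade(listings, query, indicators_by_strength, keep=1):
    """Filter by the query, then successively require indicator terms.

    A refinement is only applied if at least `keep` candidates survive
    (an assumed safeguard; the patent does not specify a threshold).
    """
    candidates = [t for t in listings if query in t]
    for term in indicators_by_strength:  # strongest first
        narrowed = [t for t in candidates if term in t]
        if len(narrowed) >= keep:
            candidates = narrowed
    return candidates

listings = [
    "email inbox compose reply",
    "email compose messages",
    "email scores to friends",
]
result = cascade(listings, "email", ["compose", "reply"])
```

The keyword match alone keeps all three listings; requiring “compose” drops the pool-game-style listing, and requiring “reply” narrows further, converging on the same winners as the weighted-sum approach.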
What This Means for ASO
Write your listing like a real app in your category, not like a keyword farm. Apple’s system has a vocabulary fingerprint for every popular search query. If your listing doesn’t contain the words that successful apps in your category use, you’ll score lower — even if you mention the target keyword more often.
The description matters more than you think. The patent describes a system that reads the description and counts indicator terms in it. The common ASO advice that “the description isn’t indexed” may be oversimplified: at minimum, this patent describes a system that uses description content as a ranking signal.
Title is still king, but in a different way. The algorithm performs separate counts for title and description, and it checks whether terms co-occur within the same field. Having both your target keyword and related indicator terms in the title creates the strongest possible signal.
Study the top-ranking apps for your keywords. Their vocabulary is literally shaping what Apple’s algorithm expects to find. If the top five apps for “photo editor” all use words like “filter,” “crop,” “adjust,” and “retouch,” those are likely indicator terms. Use them naturally in your own listing.
Don’t just repeat keywords; surround them with context. Saying “email” five times with no supporting vocabulary is exactly the pool-game pattern this patent is designed to demote, while saying “email” once alongside “compose,” “inbox,” “reply,” and “attachment” matches the vocabulary fingerprint the algorithm expects. A high keyword count with low indicator-term presence is the signature of a listing that ranks poorly.
Technical Details
| Field | Value |
|---|---|
| Patent Number | US9280595B2 |
| Application | US13/599,722 |
| Filed | August 30, 2012 |
| Granted | March 8, 2016 |
| Status | Active (expires April 25, 2034) |
| Continuation | US20160171091A1 (filed Jan 29, 2016) |
| Classification | G06F 16/3325 (Query reformulation); G06F 16/3347 (Vector-based model) |
| Inventors | Catherine A. Edwards, Natalia Hernandez-Gardiol |
Connection to the Query Classifier Patent
This patent was filed by the same lead inventor (Catherine A. Edwards) just weeks after the Query Classifier patent (US9405832B2). They work as a pair: the classifier determines the type of search (navigational vs. functional vs. browse), and this patent handles query expansion for functional searches where exact title matching alone isn’t enough. Together, they describe a two-stage pipeline — first classify the intent, then expand the query with behavioral indicator terms, then rank.
Patent source: US9280595B2 via Google Patents