Patent US9280595B2 — “Application Query Conversion”
Filed: August 30, 2012 · Granted: March 8, 2016 · Status: Active (expires 2034)
Assignee: Apple Inc. · Inventors: Catherine A. Edwards, Natalia Hernandez-Gardiol
Continuation: US20160171091A1 (filed January 2016)
The Problem: Keyword Counting Breaks Search
The patent opens with a simple example that explains why pure keyword matching fails.
A user searches for “email.” Two apps exist in the store. A mail app mentions “email” once across its title and description. A pool game mentions “email” three times in its description — something like “email your scores to friends, email challenges to opponents, email invitations to play.”
If you rank by raw keyword count, the pool game wins. That’s obviously wrong. This patent describes how Apple fixes it.
The Solution in Plain English
Instead of just counting how many times a keyword appears, Apple’s system asks a smarter question: does this app’s listing contain the words that real email apps typically contain?
The system builds a vocabulary of “indicator terms” for every common search query. For “email,” those indicator terms might be: account, reply, compose, message, filter, inbox, attachment, forward.
When ranking search results, the system checks whether each app’s listing contains these indicator terms. The mail app’s listing naturally includes words like “compose,” “reply,” and “inbox.” The pool game’s listing includes none of them. So the mail app ranks higher — even though it mentions “email” fewer times.
The key insight: Apple judges your app by the company your words keep, not just by the keyword itself.
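The idea above can be sketched in a few lines of Python. This is an illustration only: the indicator-term set and the two listings are invented for the “email” example, and the patent does not publish actual term lists.

```python
import re

# Hypothetical indicator terms for the query "email" (invented for
# illustration; the real vocabulary is learned, as described below).
INDICATOR_TERMS = {"account", "reply", "compose", "message",
                   "filter", "inbox", "attachment", "forward"}

def indicator_coverage(listing_text: str) -> int:
    """Count how many distinct indicator terms appear in a listing."""
    words = set(re.findall(r"[a-z]+", listing_text.lower()))
    return len(INDICATOR_TERMS & words)

mail_app = "Compose email, reply from your inbox, send any attachment"
pool_game = "Email your scores, email challenges, email invitations to play"

# The mail app mentions "email" once but covers four indicator terms;
# the pool game mentions "email" three times but covers none.
```

Judged by coverage of the surrounding vocabulary rather than raw keyword count, the mail app wins.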
Where Exactly Does Apple Look? Title vs. Description
The patent is specific about which parts of an app listing the algorithm reads. It refers to each app’s “document” — which includes the title, description, and metadata. But critically, the patent reveals that the algorithm does not treat all fields equally:
The system performs separate counts for separate sections of the document — specifically, title versus description. This means Apple tracks indicator term appearances in the title independently from appearances in the description. A match in the title carries different weight than a match in the description.
The patent also states that the relatedness score between terms can be influenced by whether two terms appear within the same field. If your search keyword and an indicator term both appear in the title, that’s a stronger signal than if one appears in the title and the other in the description.
Additionally, the system considers word proximity — how many words separate the query term from an indicator term within the same field. “Email” appearing right next to “compose” in a description signals stronger relevance than “email” at the top and “compose” at the bottom.
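A minimal sketch of per-field counting with a proximity bonus, assuming a simple inverse-distance weighting and a title boost. Both the weighting form and the numeric weights are assumptions, not values from the patent.

```python
import re

def pair_signal(field_text: str, query: str, indicator: str,
                field_weight: float) -> float:
    """Score a query/indicator co-occurrence within ONE field.

    Assumed form: field weight divided by (1 + closest word gap),
    so adjacent terms in the same field signal most strongly.
    """
    tokens = re.findall(r"[a-z]+", field_text.lower())
    positions_q = [i for i, t in enumerate(tokens) if t == query]
    positions_i = [i for i, t in enumerate(tokens) if t == indicator]
    if not positions_q or not positions_i:
        return 0.0
    gap = min(abs(q - i) for q in positions_q for i in positions_i)
    return field_weight / (1 + gap)

# Title match, terms adjacent (hypothetical title weight of 2.0):
title_score = pair_signal("Email - compose and send", "email", "compose", 2.0)
# Description match, terms three words apart:
desc_score = pair_signal("a long description compose messages via email",
                         "email", "compose", 1.0)
```

Under these assumptions the adjacent title pair scores 1.0 while the separated description pair scores 0.25, mirroring the patent's field and proximity signals.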
To summarize what the algorithm reads:
| Listing Element | Checked by Algorithm | Notes |
|---|---|---|
| App Title | Yes — counted separately | Matches here likely carry more weight |
| Description | Yes — counted separately | Word proximity matters within this field |
| Metadata (including keyword field) | Yes — part of the “document” | Referenced alongside title and description |
| Download count / frequency | Yes | Non-textual ranking signal |
| User ratings | Yes | Non-textual ranking signal |
The patent explicitly lists the app database as containing “titles, descriptions, metadata, download frequency or count, user ranking, and/or the apps themselves.” All of these feed into the system.
How Apple Builds Its Vocabulary: Two Methods
Method 1: Learning From What Users Actually Download
Think of this like a restaurant recommendation based on what similar customers ordered.
Apple logs every search and what happens next. When thousands of users search “email,” the system records which apps they actually download. Suppose 15% download Mail App, 7% download YourMail, 4% download MyMail, and 0% download the pool game.
The system then takes only the listings of apps that users chose — Mail, YourMail, MyMail — and throws away the pool game listing. It reads through the remaining listings and asks: which words appear consistently across all of these?
Words like “inbox,” “compose,” “reply,” and “attachment” appear in most of these mail app listings. The word “cue” or “billiards” appears in none of them. So “inbox,” “compose,” “reply,” and “attachment” become the indicator terms for the query “email.”
This method is powerful because it’s grounded in real user behavior. It doesn’t rely on a dictionary or a linguist deciding what “email” means — it learns meaning from what millions of users actually want when they search for it.
Important detail: Documents from more popular apps get more weight. If Mail App captures 15% of downloads and MyMail only 4%, the words in Mail App’s listing count roughly four times more when building the indicator term list. The vocabulary of winning apps shapes the vocabulary the algorithm expects.
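Method 1 can be sketched as follows. The download shares and listing texts are the invented ones from the example above, and the stop-word list stands in for the patent's filtering of articles and prepositions.

```python
from collections import Counter
import re

# Hypothetical listings of the apps users actually downloaded for
# "email", paired with their download shares (invented numbers).
chosen = [
    (0.15, "compose and reply from your inbox with attachments"),
    (0.07, "inbox with filters, compose and reply fast"),
    (0.04, "reply to messages, manage your inbox"),
]

# Stand-in for the patent's filtering of articles and prepositions.
STOP = {"and", "your", "with", "from", "to", "the", "a"}

scores = Counter()
for share, listing in chosen:
    for word in set(re.findall(r"[a-z]+", listing.lower())):
        if word not in STOP:
            scores[word] += share  # popular apps shape the vocabulary more

# Words with the highest weighted presence become indicator terms.
top_terms = [w for w, _ in scores.most_common(5)]
```

With these numbers, “reply” and “inbox” (present in all three listings) lead, followed by “compose” and “attachments”, which appear mostly in the most-downloaded listings.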
Method 2: The Giant Comparison Table
The second approach works like a massive cross-reference chart.
Apple takes every meaningful word (nouns and verbs, ignoring articles and prepositions like “the,” “a,” “to”) used across all app listings in the store. Let’s say that’s 5,000 words. It builds a 5,000 × 5,000 table. Each cell in the table contains a score answering the question: “If an app listing contains Word A, how likely is it to also contain Word B?”
For example, the cell for (“email,” “compose”) would have a high score because most app listings that mention “email” also mention “compose.” But the cell for (“email,” “billiards”) would be near zero.
The scores are directional — meaning the relationship isn’t always equal in both directions. “Email” strongly predicts “compose,” but “compose” only weakly predicts “email” because “compose” also appears in music apps (compose a song) and writing apps (compose a letter).
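The directionality can be demonstrated with a tiny invented corpus, treating each cell (A, B) as the fraction of listings containing A that also contain B. This is one plausible reading of the table, not the patented formula.

```python
import re

# Tiny invented corpus; a real table spans thousands of terms.
corpus = [
    "email compose inbox",
    "email compose reply",
    "compose song music",
    "compose letter notes",
]
docs = [set(re.findall(r"[a-z]+", d)) for d in corpus]

def cooccurrence(a: str, b: str) -> float:
    """Fraction of listings containing `a` that also contain `b`."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(b in d for d in with_a) / len(with_a)

# Directional: every "email" listing here contains "compose",
# but only half the "compose" listings contain "email".
email_to_compose = cooccurrence("email", "compose")
compose_to_email = cooccurrence("compose", "email")
```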
This table is built ahead of time, not during a live search. So when a user types “email,” the system instantly looks up Row “email” in the table and retrieves the top indicator terms with their relevance scores. No delay.
The scoring formula works like this in simple terms: if “compose” appears in 80 out of 100 app listings that contain “email,” but only in 200 out of 50,000 total listings, the score is high — because “compose” is both common in email apps and rare overall, making it a strong signal. If a word like “free” appears in 60 out of 100 email-app listings but also in 30,000 out of 50,000 total listings, its score is low — it’s everywhere and tells you nothing specific.
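The scoring intuition above can be written as a ratio of rates: how common the term is among query-matching listings divided by how common it is overall. This ratio form is an illustration of the described behavior, not the patent's actual formula; the counts are the ones from the example.

```python
def relatedness(co_count: int, query_docs: int,
                term_docs: int, total_docs: int) -> float:
    """P(term | listings containing the query) / P(term | all listings)."""
    rate_given_query = co_count / query_docs
    base_rate = term_docs / total_docs
    return rate_given_query / base_rate

# "compose": 80 of 100 email listings, but only 200 of 50,000 overall.
strong = relatedness(80, 100, 200, 50_000)    # common in-context, rare overall
# "free": 60 of 100 email listings, but 30,000 of 50,000 overall.
weak = relatedness(60, 100, 30_000, 50_000)   # everywhere, so uninformative
```

“Compose” scores 200x its base rate; “free” scores 1x, carrying no signal.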
How Indicator Terms Change Your App’s Ranking
When a user searches “email,” the ranking process works in five steps:
Step 1: Look up “email” in the indicator term table. Retrieve related terms: compose (weight: 0.9), reply (0.85), inbox (0.8), filter (0.6), attachment (0.55), etc.
Step 2: For each app in the database, count how many times the original keyword “email” and each indicator term appear — counting title and description separately.
Step 3: Multiply each count by that term’s weight. “Email” itself gets full weight (1.0). “Compose” appearances get multiplied by 0.9. “Filter” appearances by 0.6. And so on.
Step 4: Add up all the weighted counts for each app. This produces a document score.
Step 5: Rank apps by their document score. Non-textual factors like download popularity, ratings, and price can also adjust the final ranking.
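The five steps above can be sketched as a single scoring function. The term weights come from the example in Step 1; the title boost and the two listings are invented for illustration.

```python
import re

# Indicator weights from the Step 1 example; "email" itself gets 1.0.
WEIGHTS = {"email": 1.0, "compose": 0.9, "reply": 0.85,
           "inbox": 0.8, "filter": 0.6, "attachment": 0.55}
TITLE_BOOST = 2.0  # assumption: title matches carry more weight

def document_score(title: str, description: str) -> float:
    score = 0.0
    # Step 2: count terms per field, title separately from description.
    for field_text, boost in ((title, TITLE_BOOST), (description, 1.0)):
        tokens = re.findall(r"[a-z]+", field_text.lower())
        for term, weight in WEIGHTS.items():
            # Step 3: multiply counts by term weight (and field boost).
            score += boost * weight * tokens.count(term)
    return score  # Step 4: summed weighted counts

apps = {
    "Mail App": ("Email Inbox", "compose, reply, attach and email"),
    "Pool Game": ("8-Ball Pool", "email scores, email friends, email invites"),
}
# Step 5: rank by document score (non-textual factors omitted here).
ranked = sorted(apps, key=lambda a: document_score(*apps[a]), reverse=True)
```

Despite mentioning “email” three times, the pool game scores 3.0 against the mail app's 6.35, because it contributes nothing through indicator terms.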
The patent also describes an alternative cascading approach: first find all apps matching “email,” then filter using the strongest indicator terms, then further refine with the next-strongest ones. Either way, the result is the same — apps that contain the right ecosystem of related terms rank higher than those that simply repeat the search keyword.
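The cascading alternative can be sketched as successive filtering, strongest indicator first, stopping a refinement if it would leave too few candidates. The stopping rule and the tiny listing set are assumptions for illustration.

```python
def cascade(listings, query, indicators_by_strength, keep=1):
    """Filter by the query, then successively require indicator terms.

    A refinement is only applied if at least `keep` candidates survive
    (an assumed safeguard; the patent does not specify a threshold).
    """
    candidates = [t for t in listings if query in t]
    for term in indicators_by_strength:  # strongest first
        narrowed = [t for t in candidates if term in t]
        if len(narrowed) >= keep:
            candidates = narrowed
    return candidates

listings = [
    "email inbox compose reply",
    "email compose messages",
    "email scores to friends",
]
result = cascade(listings, "email", ["compose", "reply"])
```

The keyword match alone keeps all three listings; requiring “compose” drops the pool-game-style listing, and requiring “reply” narrows further, converging on the same winners as the weighted-sum approach.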
What This Means for ASO
Write your listing like a real app in your category, not like a keyword farm. Apple’s system has a vocabulary fingerprint for every popular search query. If your listing doesn’t contain the words that successful apps in your category use, you’ll score lower — even if you mention the target keyword more often.
The description matters more than you think. The patent describes a system that reads the description and counts indicator terms in it. The common ASO advice that “the description isn’t indexed” may be oversimplified: at minimum, this patent describes a system that uses description content as a ranking signal.
Title is still king, but in a different way. The algorithm performs separate counts for title and description, and it checks whether terms co-occur within the same field. Having both your target keyword and related indicator terms in the title creates the strongest possible signal.
Study the top-ranking apps for your keywords. Their vocabulary is literally shaping what Apple’s algorithm expects to find. If the top five apps for “photo editor” all use words like “filter,” “crop,” “adjust,” and “retouch,” those are likely indicator terms. Use them naturally in your own listing.
Don’t just repeat keywords; surround them with context. Saying “email” five times with no supporting vocabulary is exactly the pool-game pattern this patent is designed to demote, while saying “email” once alongside “compose,” “inbox,” “reply,” and “attachment” matches the vocabulary fingerprint the algorithm expects. A high keyword count with low indicator-term presence is the signature of a listing that ranks poorly.
Technical Details
| Field | Value |
|---|---|
| Patent Number | US9280595B2 |
| Application | US13/599,722 |
| Filed | August 30, 2012 |
| Granted | March 8, 2016 |
| Status | Active (expires April 25, 2034) |
| Continuation | US20160171091A1 (filed Jan 29, 2016) |
| Classification | G06F 16/3325 (Query reformulation); G06F 16/3347 (Vector-based model) |
| Inventors | Catherine A. Edwards, Natalia Hernandez-Gardiol |
Connection to the Query Classifier Patent
This patent was filed by the same lead inventor (Catherine A. Edwards) just weeks after the Query Classifier patent (US9405832B2). They work as a pair: the classifier determines the type of search (navigational vs. functional vs. browse), and this patent handles query expansion for functional searches where exact title matching alone isn’t enough. Together, they describe a two-stage pipeline — first classify the intent, then expand the query with behavioral indicator terms, then rank.
Patent source: US9280595B2 via Google Patents