App Engine Search API

Add Google-like searches to your application

  • Powerful query language
  • Advanced features including scoring and snippeting
  • GeoPoints for location-based searches

What you'll learn: how to use the Search API to add full-text search to your own applications.

[Image: overview]

In this Webinar

  • Sample Application
  • Indexes
  • Documents
  • Queries
  • Geosearch
  • Best Practices
  • Resources

About this Webinar

Introducing an Example Product Search Application

Indexes and Documents

Indexes

Get an Index (Python):

from google.appengine.api import search

index = search.Index(name='productsearch1')
...
index = search.Index(name='yourindex', namespace='yournamespace')

Indexes

Get an Index (Java):

import com.google.appengine.api.search.Index;
import com.google.appengine.api.search.IndexSpec;
import com.google.appengine.api.search.SearchServiceFactory;

Index index = SearchServiceFactory.getSearchService()
    .getIndex(IndexSpec.newBuilder().setName("productsearch1"));
...

Index index = SearchServiceFactory.getSearchService("yournamespace")
    .getIndex(IndexSpec.newBuilder().setName("yourindex"));

What are Documents?

  • Documents hold an Index’s searchable content.
  • You add a Document to an index.
  • Once the document is indexed, you can query its contents.
  • Some of the Document field types:
    • TextField
    • AtomField
    • NumberField
    • DateField

"Product" Documents in the Example Application

All product documents share some core fields.

Each category has some additional fields.

[Image: products]

Building a 'Product' Document

import datetime

from google.appengine.api import search

fields = [
    search.TextField(name=PID, value=pid), 
    search.DateField(name=UPDATED, value=datetime.datetime.now().date()),
    search.TextField(name=PRODUCT_NAME, value=name),
    search.TextField(name=DESCRIPTION, value=description),
    search.AtomField(name=CATEGORY, value=category),
    search.NumberField(name=AVG_RATING, value=0.0),
    search.NumberField(name=PRICE, value=price) ]

doc = search.Document(doc_id=product_id, fields=fields)
    

Building a 'Product' Document (Java)

import com.google.appengine.api.search.Document;
import com.google.appengine.api.search.Field;

Document doc = Document.newBuilder()
    .setId(productId)
    .addField(Field.newBuilder().setName(PID).setText(pid))
    .addField(Field.newBuilder().setName(UPDATED).setDate(Field.date(date)))
    .addField(Field.newBuilder().setName(PRODUCT_NAME).setText(name))
    .addField(Field.newBuilder().setName(DESCRIPTION).setText(description))
    .addField(Field.newBuilder().setName(CATEGORY).setAtom(category))
    .addField(Field.newBuilder().setName(AVG_RATING).setNumber(0.0))
    .addField(Field.newBuilder().setName(PRICE).setNumber(price))
    .build();
    

Indexing Documents

import logging

from google.appengine.api import search
...
doc = search.Document(doc_id=product_id, fields=fields)
try:
  add_result = search.Index(name=INDEX_NAME).add(doc)
  doc_id = add_result[0].id
except search.Error:
  logging.exception('Failed to add document %s', product_id)
 

Indexing Documents (Java)

import com.google.appengine.api.search.AddException;
import com.google.appengine.api.search.AddResponse;
import com.google.appengine.api.search.Document;
import com.google.appengine.api.search.StatusCode;
...
Document doc = Document.newBuilder()
    .setId(productId)
    .addField(...)
    .build();

try {
    AddResponse response = index.add(doc);
    String docId = response.getIds().get(0);
} catch (AddException e) {
    if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
        // retry adding document
    }
}
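
The "retry adding document" comment above leaves the retry policy open. A minimal sketch of one option, retrying with exponential backoff, written in plain Python so it runs without the SDK. TransientError, call_with_retries and flaky_add are hypothetical names, not part of the Search API; in an application the operation would wrap the add call and the exception check would follow the pattern shown on the slide.

```python
import time


class TransientError(Exception):
    """Stand-in for a retryable failure (hypothetical, not an SDK class)."""


def call_with_retries(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on TransientError, wait and retry with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Delay doubles on each failed attempt: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Example: an operation that fails twice, then succeeds
attempts = []


def flaky_add():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("try again")
    return "doc-1"


# A no-op sleep keeps the example fast; omit it to really wait.
result = call_with_retries(flaky_add, sleep=lambda seconds: None)
print(result, len(attempts))
```

Capping the number of attempts keeps a persistent failure from blocking the request indefinitely.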
 

Basic Search Queries

Search using a Query String

import logging

from google.appengine.api import search

query = "writings collection"
index = search.Index(INDEX_NAME)
try:
  search_results = index.search(query)
  for doc in search_results:
    pass  # process doc ...
except search.Error:
  logging.exception('Search failed')
    

Search using a Query String (Java)

import com.google.appengine.api.search.Index;
import com.google.appengine.api.search.IndexSpec;
import com.google.appengine.api.search.Results;
import com.google.appengine.api.search.ScoredDocument;
import com.google.appengine.api.search.SearchException;
import com.google.appengine.api.search.SearchServiceFactory;
import com.google.appengine.api.search.StatusCode;

String query = "writings collection";
Index index = SearchServiceFactory.getSearchService()
    .getIndex(IndexSpec.newBuilder().setName("productsearch1"));

try {
    Results<ScoredDocument> results = index.search(query);
    for (ScoredDocument doc : results) {
        // process doc ...
    }
} catch (SearchException e) {
    if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
        // retry
    }
}
    

Search using a Query Object

from google.appengine.api import search

index = search.Index(INDEX_NAME)
user_query = "writings collection"
query = search.Query(
    query_string=user_query,
    options=search.QueryOptions(limit=doc_limit))
search_results = index.search(query)        
    

Search using a Query Object (Java)

import com.google.appengine.api.search.Index;
import com.google.appengine.api.search.Query;
import com.google.appengine.api.search.QueryOptions;
import com.google.appengine.api.search.Results;
import com.google.appengine.api.search.ScoredDocument;
import com.google.appengine.api.search.SearchException;
import com.google.appengine.api.search.SearchServiceFactory;

Index index = SearchServiceFactory.getSearchService()
    .getIndex(IndexSpec.newBuilder().setName(INDEX_NAME));
String userQuery = "writings collection";

Query query = Query.newBuilder()
    .setOptions(QueryOptions.newBuilder().setLimit(docLimit))
    .build(userQuery);
Results<ScoredDocument> results = index.search(query);        
    

Processing the Query Results

import logging

from google.appengine.api import search

index = search.Index(INDEX_NAME)
try:
  search_results = index.search(query)
  returned_count = len(search_results.results)
  number_found = search_results.number_found
  for doc in search_results:
    doc_id = doc.doc_id
    fields = doc.fields
    # etc.
except search.Error:
  logging.exception('Search failed')
  

Processing the Query Results (Java)

Index index = ...

try {
    Results<ScoredDocument> results = index.search(query);
    int numberReturned = results.getNumberReturned();
    long numberFound = results.getNumberFound();
    for (ScoredDocument doc : results) {
        String docId = doc.getId();
        Iterable<Field> fields = doc.getFields();
        // etc.
    }
} catch (SearchException e) {
    // ...
}
  

Query Options

Introducing Query Options

query = search.Query(
    query_string=user_query,
    options=search.QueryOptions(
        limit=doc_limit, 
        offset=offsetval,
        sort_options=sortopts,
        snippeted_fields=[DESCRIPTION],   
        returned_expressions=[
            search.FieldExpression(name='adjusted_price', expression='price * 1.08')],
        returned_fields=[PID, DESCRIPTION, CATEGORY, AVG_RATING, PRICE, PRODUCT_NAME]
        ))
    

Introducing Query Options (Java)

String userQuery = ...;
SortOptions sortOpts = ...;

Query query = Query.newBuilder()
    .setOptions(QueryOptions.newBuilder()
        .setLimit(docLimit)
        .setOffset(offsetVal)
        .setSortOptions(sortOpts)
        .setFieldsToSnippet(DESCRIPTION)
        .addExpressionToReturn(
            FieldExpression.newBuilder()
              .setName("adjusted_price")
              .setExpression("price * 1.08"))
        .setFieldsToReturn(PID, DESCRIPTION, CATEGORY, AVG_RATING, PRICE, PRODUCT_NAME))
    .build(userQuery);
    

Query Offsets and Limits

search.QueryOptions(
    limit=doc_limit,
    offset=offsetval,
    ...)      
    
returned_count = len(search_results.results)
number_found = search_results.number_found
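
The limit and offset options, together with number_found, are enough to page through results. A minimal sketch of the arithmetic in plain Python, with no App Engine calls; page_window and its return shape are hypothetical names chosen for illustration.

```python
def page_window(page, page_size, number_found):
    """Return (offset, has_next) for a 1-based page number.

    offset is the value to pass as QueryOptions(offset=...);
    page_size is the value to pass as QueryOptions(limit=...).
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size
    # There is another page if matches remain beyond this window.
    has_next = offset + page_size < number_found
    return offset, has_next


# Page 3 of a 45-match result set, 20 results per page
offset, has_next = page_window(page=3, page_size=20, number_found=45)
print(offset, has_next)
```

If number_found is only approximate for a large result set, treat has_next as a hint rather than a guarantee.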
   

Query Offsets and Limits (Java)

QueryOptions.newBuilder()
    .setLimit(docLimit)
    .setOffset(offsetVal)
    ...
    .build();

int returnedCount = results.getNumberReturned();
long numberFound = results.getNumberFound();
   

Snippeting

search.QueryOptions(
  snippeted_fields=[DESCRIPTION],
  ...)
...
for doc in search_results:
  ...
  for expr in doc.expressions:  
    if expr.name == DESCRIPTION:
      description_snippet = expr.value
      break
  # do something with the document ...
    

Snippeting (Java)

QueryOptions.newBuilder()
    .setFieldsToSnippet(DESCRIPTION)
    ...
    .build();
...
for (ScoredDocument doc : results) {
    ...
    String descriptionSnippet = null;
    for (Field expr : doc.getExpressions()) {  
        if (DESCRIPTION.equals(expr.getName())) {
            descriptionSnippet = expr.getText();
            break;
        }
    }
    // do something with the document ...
}
    

Returned Expressions

  search.QueryOptions(
      returned_expressions=[
          search.FieldExpression(name='adjusted_price', expression='price * 1.08')],
      ...)
  ...

  for doc in search_results:
    ...
    for expr in doc.expressions: 
      if expr.name == 'adjusted_price':
        price = expr.value
    # do something with the document ...  
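
Snippets and returned expressions are both read by scanning doc.expressions for a name, so the loop can be factored into one helper. A minimal sketch, assuming only that each expression has name and value attributes as in the loops above; Expr is a hypothetical stand-in type so the example runs without the SDK, and expression_value is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-in for the objects found in doc.expressions (hypothetical type).
Expr = namedtuple("Expr", ["name", "value"])


def expression_value(expressions, name, default=None):
    """Return the value of the first expression called name, else default."""
    return next((e.value for e in expressions if e.name == name), default)


exprs = [Expr("description", "a <b>collection</b> of writings"),
         Expr("adjusted_price", 21.6)]
print(expression_value(exprs, "adjusted_price"))
print(expression_value(exprs, "missing", default=0.0))
```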
    

Returned Expressions (Java)

QueryOptions.newBuilder()
    .addExpressionToReturn(
        FieldExpression.newBuilder().setName("adjusted_price").setExpression("price * 1.08"))
    ...
    .build();
  ...

for (ScoredDocument doc : results) {
    ...
    Double price = null;
    for (Field expr : doc.getExpressions()) { 
        if ("adjusted_price".equals(expr.getName())) {
            price = expr.getNumber();
            break;
        }
    }
    // do something with the document ... 
} 
    

Sorting the Query Results

The SortOptions Class

from google.appengine.api import search

search.QueryOptions(
    sort_options=search.SortOptions(...)
    ...)      
    
import com.google.appengine.api.search.QueryOptions;
import com.google.appengine.api.search.SortOptions;

QueryOptions.newBuilder()
    .setSortOptions(SortOptions.newBuilder()...)
    .build();    
  

Match Scoring

  • Sort on term frequency-based score

search.QueryOptions(
    sort_options=search.SortOptions(
        match_scorer=search.MatchScorer())
    ...)      
    
QueryOptions.newBuilder()
    .setSortOptions(SortOptions.newBuilder()
        .setMatchScorer(MatchScorer.newBuilder()))
    ...
    .build();
    

Sort Expressions

search.SortExpression(expression='price', 
                      direction=search.SortExpression.ASCENDING, 
                      default_value=9999)

search.SortExpression(expression='name', 
                      direction=search.SortExpression.ASCENDING, 
                      default_value='zzz')

search.SortExpression(expression='ar', 
                      default_value=0)
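
The default_value stands in for the sort key when the expression cannot be evaluated for a document, for example because the field is missing. The effect can be simulated in plain Python. This is an illustration of the ordering only, not how the service sorts internally; the product dictionaries and DEFAULT_PRICE are made up for the example.

```python
products = [
    {"name": "kettle", "price": 25.0},
    {"name": "mystery box"},  # no price field
    {"name": "toaster", "price": 40.0},
]

DEFAULT_PRICE = 9999  # plays the role of default_value=9999 above

# Ascending sort on price; documents without a price use the default,
# so a large default pushes them to the end of the results.
ordered = sorted(products, key=lambda p: p.get("price", DEFAULT_PRICE))
print([p["name"] for p in ordered])
```

Choosing a default larger than any real price sends unpriced items last in an ascending sort; a default of 0 would send them first.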
    

Sort Expressions (Java)

SortExpression.newBuilder()
    .setExpression("price") 
    .setDirection(SortExpression.SortDirection.ASCENDING) 
    .setDefaultValueNumeric(9999.0)
    .build();

SortExpression.newBuilder()
    .setExpression("name") 
    .setDirection(SortExpression.SortDirection.ASCENDING) 
    .setDefaultValue("zzz")
    .build();

SortExpression.newBuilder()
    .setExpression("ar")
    .setDefaultValueNumeric(0.0)
    .build();
    

Using Sort Expressions in SortOptions

sortopts = search.SortOptions(expressions=[
    search.SortExpression(expression=PRICE, 
                          direction=search.SortExpression.ASCENDING, 
                          default_value=9999)])

search.QueryOptions(
    sort_options=sortopts
    ...)    
    

Using Sort Expressions in SortOptions (Java)

SortOptions sortOpts = SortOptions.newBuilder()
    .addSortExpression(SortExpression.newBuilder()
        .setExpression(PRICE)
        .setDirection(SortExpression.SortDirection.ASCENDING)
        .setDefaultValueNumeric(9999.0))
    .build();

QueryOptions.newBuilder()
    .setSortOptions(sortOpts)
    ...
    .build();
    

More Complex Sort Expressions

search.SortExpression(expression='ar + _score', default_value=0)
    

If you access _score in a sort expression, your SortOptions object should include a scorer.

search.SortOptions(
    match_scorer=search.MatchScorer(),
    expressions=[search.SortExpression(...),...])  
    

More Complex Sort Expressions (Java)

SortExpression sortExpr = SortExpression.newBuilder()
    .setExpression("ar + _score")
    .setDefaultValueNumeric(0.0)
    .build();
  

If you access _score in a sort expression, your SortOptions object should include a scorer.

SortOptions.newBuilder()
    .setMatchScorer(MatchScorer.newBuilder())
    .addSortExpression(sortExpr)
    ...
    .build();
    

Geosearch using the Search API

App Engine GeoSearch

[Image: geosearch]

Using GeoPoint Fields in Documents

from google.appengine.api import search

geopoint = search.GeoPoint(-33.857, 151.215)
fields = [search.TextField(name='name', value=store_name),
          search.TextField(name='address', value=store_address),
          search.GeoField(name='store_location', value=geopoint)]
    

Using GeoPoint Fields in Documents (Java)

import com.google.appengine.api.search.Document;
import com.google.appengine.api.search.Field;
import com.google.appengine.api.search.GeoPoint;

GeoPoint geoPoint = new GeoPoint(-33.857, 151.215);
Document.newBuilder()
    .addField(Field.newBuilder().setName("name").setText(storeName))
    .addField(Field.newBuilder().setName("address").setText(storeAddress))
    .addField(Field.newBuilder().setName("store_location").setGeoPoint(geoPoint))
    .build();     
    

Using GeoField Search

  • A geosearch query has the general format:
    distance(<field-name>, geopoint(lat, lng)) < distance

import logging

from google.appengine.api import search

index = search.Index(STORE_INDEX_NAME)
query = "distance(store_location, geopoint(-33.857, 151.215)) < 4500"
try:
  search_results = index.search(query)
  for doc in search_results:
    pass  # process doc ...
except search.Error:
  logging.exception('Search failed')
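
When the location comes from user input, the query string in the general format above has to be assembled at run time. A minimal sketch of a builder in plain Python; geo_query is a hypothetical helper, not part of the Search API, and the radius is assumed to be in meters.

```python
def geo_query(field, lat, lng, radius_m):
    """Build 'distance(field, geopoint(lat, lng)) < radius' as a query string."""
    if not -90 <= lat <= 90 or not -180 <= lng <= 180:
        raise ValueError("latitude/longitude out of range")
    return "distance(%s, geopoint(%s, %s)) < %s" % (field, lat, lng, radius_m)


query = geo_query("store_location", -33.857, 151.215, 4500)
print(query)
```

Validating the coordinates before building the string turns bad input into a clear local error instead of a failed search request.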
    

Using GeoField Search (Java)

  • A geosearch query has the general format:
    distance(<field-name>, geopoint(lat, lng)) < distance

Index index = SearchServiceFactory.getSearchService()
    .getIndex(IndexSpec.newBuilder().setName(STORE_INDEX_NAME));
String query = "distance(store_location, geopoint(-33.857, 151.215)) < 4500";
try {
    Results<ScoredDocument> results = index.search(query);
    for (ScoredDocument doc : results) {
        // process doc ...
    }
} catch (SearchException e) {
    // ...
}
    

Sorting Geo Data by Distance

from google.appengine.api import search

index = search.Index(STORE_INDEX_NAME)
query = "distance(store_location, geopoint(-33.857, 151.215)) < 4500"
loc_expr = "distance(store_location, geopoint(-33.857, 151.215))"
sort_expr = search.SortExpression(
    expression=loc_expr,
    direction=search.SortExpression.ASCENDING, 
    default_value=4501)
search_query = search.Query(
    query_string=query,
    options=search.QueryOptions(sort_options=search.SortOptions(expressions=[sort_expr])))
results = index.search(search_query)      
    

Sorting Geo Data by Distance (Java)

Index index = SearchServiceFactory.getSearchService()
    .getIndex(IndexSpec.newBuilder().setName(STORE_INDEX_NAME));

String query = "distance(store_location, geopoint(-33.857, 151.215)) < 4500";
String locExpr = "distance(store_location, geopoint(-33.857, 151.215))";

SortExpression sortExpr = SortExpression.newBuilder()
    .setExpression(locExpr)
    .setDirection(SortExpression.SortDirection.ASCENDING)
    .setDefaultValueNumeric(4501.0)
    .build();
Query searchQuery = Query.newBuilder()
    .setOptions(QueryOptions.newBuilder().setSortOptions(
        SortOptions.newBuilder().addSortExpression(sortExpr)))
    .build(query);
Results<ScoredDocument> results = index.search(searchQuery);
    

Search API Best Practices

  • Updates
  • Index names
  • Rank
  • Use of task queue
  • Atom fields
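
One way to act on the "Updates" and "Use of task queue" items is to index documents in batches rather than one call per document. A minimal sketch of the chunking step in plain Python; batches is a hypothetical helper, and the default size of 200 is an assumption about the per-call document limit that should be checked against the current API documentation.

```python
def batches(items, size=200):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 450 pending documents become three calls: 200, 200 and 50 documents
pending_docs = list(range(450))  # placeholders for search.Document objects
print([len(batch) for batch in batches(pending_docs)])
```

Each batch could then be passed to a single add call, or handed to a task queue task so that indexing happens outside the user-facing request.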

Resources

<Thank You!>