π String fields
Most data types are queried straightforwardly, and match as one would expect with standard
MongoDB B-Tree indexes. However, string
fields are exceptional, in many ways.
There are several ways in which strings can be used:
- Matching exact (or case-insensitive) values. Good ol'
equals
andin
- To search for words, or variations thereof, within a string
- Partial, or sub-string, matching
- For sorting
- Faceting (not covered in this workshop)
- Highlighting (not covered in this workshop)
A BSON string
field in your collection can be mapped into one or more of the string-supporting Atlas Search types. A dynamic: true
mapping maps all string
field in one way: as full-text fields analyzed with lucene.standard
. This analyzed textual field type is called string
. An Atlas Search field indexed as string can only be queried by certain operators meant for full text queries. However, let's first cover the most basic way a string field can be indexed and used: as-is. In order to index a string field as its exact or normalized value and make use of equals
and in
operators to query it as the exact or normalized value, it must be statically mapped to type token
.
token
β
A token
type mapping indexes strings as-is (or normalized) in a way suitable for full string equals
and in
matching, as well as for $search.sort
'ing
autocomplete
β
This field type is paired with an operator by the same name to provide partial, sub-string, matching.
Careful though - while this field type sounds alluring and does match sub-strings, it does not by itself provide good relevancy or highlighting and can explode the index size.
The autocomplete
field type and operator use complex analysis and querying technique build upon the Analysis techniques in the next section and will be covered a bit more there.
string
β
The heart of Lucene - text! string
mapped fields are analyzed so that its individual terms are
searchable.