Search API¶
Skeleton¶
Examples¶
<search id="search-1" index="posts_v0">
<order type="geo-distance" field="location" latitude="74.12" longitude="12.34" reverse="true" />
<filters>
<if test="request.GET.phrase">
<phrase syntax="simple">{{ request.GET.phrase }}</phrase>
</if>
<if test="request.GET.lat and request.GET.lon">
<geo_distance field="location" latitude="{{ request.GET.lat }}" longitude="{{ request.GET.lon }}" radius="100km" />
</if>
<any_label>
<empty field="field-2" />
<badge field="field-3" />
</any_label>
<no_labels>
<value field="field-4">123456</value>
<value field="field-5">{{ 123456 }}</value>
<value field="extras:field-6" cast="float">123456</value>
</no_labels>
<any_range>
<between field="field-7" min="2020-01-01" max="2022-01-01" cast="datetime" />
</any_range>
<no_ranges>
<more_than field="field-8" min="2020-01-01 12:30:45" cast="datetime" />
<less_than field="extras:field-9" max="123456789" cast="str" />
</no_ranges>
</filters>
</search>
<search id="search-2" index="posts_v0" _scan_limit="1000">
<order type="relevance" decay_field="published-at" decay_speed="5" />
<filters>
<if test="request.GET.phrase">
<phrase syntax="simple" weights="10,5,0">{{ request.GET.phrase }}</phrase>
</if>
<any_label>
<types>article,image,video</types>
</any_label>
<any_label>
<regular_sections>path/to/section-1,{{ 12345678 }}</regular_sections>
</any_label>
<no_labels>
<primary_sections>path/to/section-3,path/to/section/that/does/not/exist</primary_sections>
</no_labels>
<any_label>
<custom_fields>path.to.field1=value1,path.to.field2=value2</custom_fields>
</any_label>
<no_labels>
<custom_fields>path.to.field3=value3</custom_fields>
</no_labels>
<any_range>
<published_at after="2020-01-01" before="2022-01-01 12:44:55" />
</any_range>
</filters>
</search>
<search id="search-3" index="posts_v1" _scan_limit="1000">
<order type="field" field="published-at" reverse="true" />
<filters>
<if test="request.GET.another_phrase">
<phrase syntax="simple" weights="10,5,0">{{ request.GET.another_phrase }}</phrase>
</if>
<any_label>
<primary_tags>tag-1</primary_tags>
</any_label>
<no_labels>
<regular_tags>tag-2,tag-3</regular_tags>
</no_labels>
<no_ranges>
<custom_field path="path.to.field4" more_than="10" less_than="90" />
</no_ranges>
</filters>
</search>
<posts
source="search"
search_id="search-1"
...
/>
<posts
source="search"
search_id="search-2"
limit="10"
...
/>
<posts_count
source="search"
search_id="search-3"
...
/>
Endpoints¶
Get Indexes¶
GET /core/v1/search/indexes
200 OK¶
{
"indexes": [
{
"slug": "my-first-index",
"title": "My First Index",
"status": "created",
"mapping": <index mapping>
}
]
}
Hints:
<index mapping> (see: Mappings)
Create/Modify Index¶
PUT /core/v1/search/indexes/<index slug>
{
"commands": [
<command description>,
<command description>,
...
]
}
200 OK¶
{
"indexes": [
{
"slug": "my-second-index",
"title": "My Second Index",
"status": "created",
"mapping": <index mapping>
}
]
}
Hints:
<command description> (see: Commands)
<index mapping> (see: Mappings)
Fill Index with Data¶
POST /core/v1/search/indexes/my-second-index/index-posts
POST /core/v1/search/indexes/my-second-index/index-users
GET /core/v1/search/indexes/my-second-index/index-posts?task_id=38e9eea7-fc5b-4373-90f0-cb2d59109113
GET /core/v1/search/indexes/my-second-index/index-users?task_id=38e9eea7-fc5b-4373-90f0-cb2d59109113
200 OK¶
{
"task": {
"id": "38e9eea7-fc5b-4373-90f0-cb2d59109113",
"is_ready": true,
"progress": null, // or same as "response"
"response": {
"entities": {
"total_count": 12343,
"indexed_count": 12343
}
},
"exception": null
}
}
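The status check above can be wrapped in a small polling loop. The following Python sketch is illustrative only: `fetch_status` is a hypothetical callable that performs the GET request with the `task_id` parameter and returns the parsed JSON body, and treating a non-null `exception` as a failure is an assumption, not something this page states.

```python
import time
from typing import Callable


def wait_for_task(fetch_status: Callable[[], dict],
                  interval: float = 1.0,
                  max_polls: int = 60) -> dict:
    """Poll the task status until "is_ready" is true, then return the task object.

    fetch_status is a hypothetical helper supplied by the caller; it should
    return the parsed JSON of the GET .../index-posts?task_id=... response.
    """
    for _ in range(max_polls):
        task = fetch_status()["task"]
        if task["is_ready"]:
            # ASSUMPTION: a non-null "exception" means the indexing task failed.
            if task["exception"] is not None:
                raise RuntimeError(f"task {task['id']} failed: {task['exception']}")
            return task
        time.sleep(interval)
    raise TimeoutError("indexing task did not finish within the polling budget")
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP client or authentication scheme.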
Find records count¶
POST /core/v1/search/indexes/my-second-index/find-count
{
"filters": [
<filter>,
<filter>,
...
]
}
200 OK¶
{
"records_count": 41234
}
Hints:
<filter> (see: Filters)
Find records¶
POST /core/v1/search/indexes/my-second-index/find
{
"filters": [
<filter>,
<filter>,
...
],
"chunk": {
"order": <order>,
"limit": 10,
"offset": 0,
"cursor": <null or cursor>
}
}
200 OK¶
{
"records": [
{"id": <taxonomy guid>, "cursor": "cursor-1"},
{"id": <taxonomy guid>, "cursor": "cursor-2"},
...
]
}
Hints:
<filter> (see: Filters)
<order> (see: Orders)
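One way to walk through a large result set is to feed the cursor of the last record back into the next request. The Python sketch below rests on that assumption: this page does not spell out the cursor semantics, so verify the behavior before relying on it. `post_find` is a hypothetical callable that sends the body to the find endpoint and returns the parsed JSON response.

```python
from typing import Callable, Iterator


def iter_records(post_find: Callable[[dict], dict],
                 filters: list,
                 order: dict,
                 page_size: int = 10) -> Iterator[dict]:
    """Yield records page by page using the "cursor" field of the chunk."""
    cursor = None  # null cursor requests the first page
    while True:
        body = {
            "filters": filters,
            "chunk": {"order": order, "limit": page_size, "offset": 0, "cursor": cursor},
        }
        records = post_find(body)["records"]
        yield from records
        if len(records) < page_size:
            return  # a short (or empty) page means there is nothing left
        # ASSUMPTION: the cursor of the last record marks where the next page starts.
        cursor = records[-1]["cursor"]
```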
Commands¶
Commands move an index through its lifecycle:
- Create index
- Update index title
{"set-title": "My Second Index"}
- Update index mapping:
{"set-mapping": <index mapping>}
- Start index. It’s not possible to change the mapping after an index is started
{"start": {}}
Fill the index with data
- Activate index. It makes it available for reads
{"activate": {}}
Use index for searching
- Stop index. It’s no longer possible to read from or write to it, but the data is still stored
{"stop": {}}
- Remove index. It removes the index with all its data
{"remove": {}}
Hints:
<index mapping> (see: Mappings)
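The lifecycle steps above map onto request bodies for the Create/Modify Index endpoint. A minimal Python sketch follows; the helper names are hypothetical, and sending several commands in one request in this order is an assumption based on `commands` being a list.

```python
import json

# Commands that take an empty object as their argument, as listed above.
EMPTY_COMMANDS = {"start", "activate", "stop", "remove"}


def setup_commands(title: str, mapping: dict) -> dict:
    """Body that titles the index, sets its mapping, then starts it.

    The mapping must be set before "start": it cannot change afterwards.
    """
    return {"commands": [{"set-title": title}, {"set-mapping": mapping}, {"start": {}}]}


def single_command(name: str) -> dict:
    """Body with one argument-less lifecycle command."""
    if name not in EMPTY_COMMANDS:
        raise ValueError(f"unknown command: {name}")
    return {"commands": [{name: {}}]}


# The mapping here is a placeholder; see the Mappings section for its shape.
print(json.dumps(setup_commands("My Second Index", {})))
print(json.dumps(single_command("activate")))
```

A typical sequence would be: send the setup body, fill the index with data, then send `activate` before searching.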
Mappings¶
{
"root": <taxonomy entity type>,
"texts": [
{
"weight": 8,
"analyzer": <text analyzer configuration>,
"extractor": <text extractor configuration>
},
...
],
"shapes": [
{
"alias": "alias-for-the-geo-point",
"extractor": {"geo_point": {
"latitude": {"basic": {"field": {"post-custom-field": {"path": "location.latitude"}}}},
"longitude": {"basic": {"field": {"post-custom-field": {"path": "location.longitude"}}}}
}}
},
...
],
"values":[
{"type": <value type>, "field": <taxonomy field>},
{"type": <value type>, "field": <taxonomy field>},
...
],
"labels": [
{"field": <taxonomy field>},
{"field": <taxonomy field>},
...
]
}
Hints:
<taxonomy entity type> (see: Types)
<text analyzer configuration> (see: Text Analyzers)
<text extractor configuration> (see: Text Extractors)
<value type> (see: Value Types)
<taxonomy field> (see: Fields)
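As an illustration, a mapping with one text field and one geo shape could be assembled like this in Python. The helper names are hypothetical, the `"post"` root is a placeholder for a real taxonomy entity type, and reusing the `post-custom-field` form for the text and value fields is an assumption taken from the shape example above.

```python
import json


def custom_field(path: str) -> dict:
    """Taxonomy field reference in the form used by the geo_point example."""
    return {"post-custom-field": {"path": path}}


def geo_shape(alias: str, lat_path: str, lon_path: str) -> dict:
    """Shape entry whose latitude and longitude come from two custom fields."""
    def coord(path: str) -> dict:
        return {"basic": {"field": custom_field(path)}}

    return {
        "alias": alias,
        "extractor": {"geo_point": {"latitude": coord(lat_path), "longitude": coord(lon_path)}},
    }


mapping = {
    "root": "post",  # placeholder: substitute a real taxonomy entity type
    "texts": [
        {
            "weight": 8,
            "analyzer": {"english": {}},
            "extractor": {"basic": {"field": custom_field("summary")}},  # hypothetical path
        }
    ],
    "shapes": [geo_shape("location", "location.latitude", "location.longitude")],
    "values": [{"type": "float", "field": custom_field("rating")}],  # hypothetical path
    "labels": [{"field": custom_field("category")}],  # hypothetical path
}

print(json.dumps(mapping, indent=2))
```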
Text Analyzers¶
Text analysis is the process of converting unstructured text, like the body of an email or a product description, into a structured format that’s optimized for search.
- For full-text search
{"unicode": {}}
- For full-text search by English words
{"english": {}}
- For exact case-insensitive search by words
{"keyword": {"normalizers": [{"lowercase": {}}]}}
- For case-insensitive substring search with a substring length of at least 3 symbols (tri-gram tokens)
{"ngram": {"size": 3, "normalizers": [{"uppercase": {}}]}}
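To see why the n-gram analyzer needs at least 3 symbols, it helps to look at the tokens it would conceptually produce. The Python sketch below is only an illustration of tri-gram tokenization with an uppercase normalizer; it is not the service's implementation, and details such as how whitespace is treated are assumptions.

```python
from typing import List


def ngram_tokens(text: str, size: int = 3) -> List[str]:
    """Uppercase the text, then slide a window of `size` symbols over it."""
    normalized = text.upper()
    return [normalized[i:i + size] for i in range(len(normalized) - size + 1)]


def may_contain(query: str, document: str, size: int = 3) -> bool:
    """True when every n-gram of the query also occurs in the document.

    This is a necessary condition for a substring match, which is how an
    n-gram index narrows down candidates. A query shorter than `size`
    yields no tokens and therefore cannot match anything.
    """
    query_tokens = set(ngram_tokens(query, size))
    return bool(query_tokens) and query_tokens <= set(ngram_tokens(document, size))


print(ngram_tokens("Hello"))              # ['HEL', 'ELL', 'LLO']
print(may_contain("ell", "Hello World"))  # True
print(may_contain("he", "Hello World"))   # False: shorter than 3 symbols
```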
Text Extractors¶
Text extraction is the process of retrieving and combining texts from indexed entities. The extracted text then goes through the text analysis process before being stored in the index.
- Take an entity text as it is
{"basic": {"field": <taxonomy field>}}
- Combine text from a sequence of sub-extractors
{ "composite": { "extractors": [ <text extractor>, <text extractor>, ... ] } }
- Extract text based on a condition
{ "conditioned": { "condition": {"taxonomy": {"filters": [ <taxonomy filter>, <taxonomy filter>, ... ]}}, "then_extractor": <text extractor or null>, "else_extractor": <text extractor or null> } }
Hints:
<taxonomy field> (see: Fields)
<taxonomy filter> (see: Filters)
Value Types¶
Value fields are used both for sorting results and for filtering by range. For that reason, they support only a few simple types:
- Strings
"str"
- Integers
"int"
- Numbers with floating point
"float"
Filters¶
- Include only records that match the phrase with default syntax and text weights
{ "phrase": { "phrase": "Hello World" } }
- Include only records that match the strict syntax query only in the 2nd text field
{ "phrase": { "phrase": "\"Hello World\"", "syntax": {"strict": {}}, "weights": [0, 10, 0] } }
- Include only records located within the given radius of a point
{ "geo-distance": { "shape": "alias-from-the-mapping", "point": {"latitude": "37.7749", "longitude": "-122.4194"}, "radius": {"miles": "1000.0"} } }
- Include only records that match labels and ranges
{ "taxonomy": <taxonomy filter> }
Hints:
<taxonomy filter> (see: Filters)
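The filter objects above can be generated with a few small helpers. This Python sketch uses hypothetical function names; it also assumes that `weights` are positional and follow the order of the `texts` entries in the index mapping, which is how the "2nd text field" example reads.

```python
from typing import List, Optional


def single_field_weights(index: int, text_count: int, weight: int = 10) -> List[int]:
    """Weights that search only the text field at `index` (0-based)."""
    return [weight if i == index else 0 for i in range(text_count)]


def phrase_filter(phrase: str, weights: Optional[List[int]] = None, strict: bool = False) -> dict:
    """Phrase filter; omitted options fall back to the default syntax and weights."""
    body: dict = {"phrase": phrase}
    if strict:
        body["syntax"] = {"strict": {}}
    if weights is not None:
        body["weights"] = weights
    return {"phrase": body}


def geo_distance_filter(shape: str, latitude: float, longitude: float, miles: float) -> dict:
    """Geo filter; numbers are sent as strings, as in the example above."""
    return {
        "geo-distance": {
            "shape": shape,
            "point": {"latitude": str(latitude), "longitude": str(longitude)},
            "radius": {"miles": str(miles)},
        }
    }


filters = [
    phrase_filter('"Hello World"', weights=single_field_weights(1, 3), strict=True),
    geo_distance_filter("alias-from-the-mapping", 37.7749, -122.4194, 1000.0),
]
```

The resulting `filters` list can be used as the `"filters"` value of a find or find-count request body.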
Orders¶
- Sort by one of the value fields
{"value": {"field": <taxonomy field>, "reverse": true}}
- Sort by relevance to the phrase
{"relevance": {}}
- Sort by relevance to the phrase penalizing “older” records relevance score
{"relevance": {"decay": {"field": <timestamp taxonomy field>}}}
- Sort by distance to the point
{"geo-distance": {"shape": "alias-from-the-mapping", "point": {"latitude": "37.7749", "longitude": "-122.4194"}, "reverse": false}}
Hints:
<taxonomy field> (see: Fields)