Search API

Skeleton

Examples

<search id="search-1" index="posts_v0">
    <order type="geo-distance" field="location" latitude="74.12" longitude="12.34" reverse="true" />
    <filters>
        <if test="request.GET.phrase">
            <phrase syntax="simple">{{ request.GET.phrase }}</phrase>
        </if>
        <if test="request.GET.lat and request.GET.lon">
            <geo_distance field="location" latitude="{{ request.GET.lat }}" longitude="{{ request.GET.lon }}" radius="100km" />
        </if>
        <any_label>
            <empty field="field-2" />
            <badge field="field-3" />
        </any_label>
        <no_labels>
            <value field="field-4">123456</value>
            <value field="field-5">{{ 123456 }}</value>
            <value field="extras:field-6" cast="float">123456</value>
        </no_labels>
        <any_range>
            <between field="field-7" min="2020-01-01" max="2022-01-01" cast="datetime" />
        </any_range>
        <no_ranges>
            <more_than field="field-8" min="2020-01-01 12:30:45" cast="datetime" />
            <less_than field="extras:field-9" max="123456789" cast="str" />
        </no_ranges>
    </filters>
</search>

<search id="search-2" index="posts_v0" _scan_limit="1000">
    <order type="relevance" decay_field="published-at" decay_speed="5" />
    <filters>
        <if test="request.GET.phrase">
            <phrase syntax="simple" weights="10,5,0">{{ request.GET.phrase }}</phrase>
        </if>
        <any_label>
            <types>article,image,video</types>
        </any_label>
        <any_label>
            <regular_sections>path/to/section-1,{{ 12345678 }}</regular_sections>
        </any_label>
        <no_labels>
            <primary_sections>path/to/section-3,path/to/section/that/does/not/exist</primary_sections>
        </no_labels>
        <any_label>
            <custom_fields>path.to.field1=value1,path.to.field2=value2</custom_fields>
        </any_label>
        <no_labels>
            <custom_fields>path.to.field3=value3</custom_fields>
        </no_labels>
        <any_range>
            <published_at after="2020-01-01" before="2022-01-01 12:44:55" />
        </any_range>
    </filters>
</search>

<search id="search-3" index="posts_v1" _scan_limit="1000">
    <order type="field" field="published-at" reverse="true" />
    <filters>
        <if test="request.GET.another_phrase">
            <phrase syntax="simple" weights="10,5,0">{{ request.GET.another_phrase }}</phrase>
        </if>
        <any_label>
            <primary_tags>tag-1</primary_tags>
        </any_label>
        <no_labels>
            <regular_tags>tag-2,tag-3</regular_tags>
        </no_labels>
        <no_ranges>
            <custom_field path="path.to.field4" more_than="10" less_than="90" />
        </no_ranges>
    </filters>
</search>

<posts
    source="search"
    search_id="search-1"
    ...
/>

<posts
    source="search"
    search_id="search-2"
    limit="10"
    ...
/>

<posts_count
    source="search"
    search_id="search-3"
    ...
/>

Endpoints

Get Indexes

GET /core/v1/search/indexes

200 OK

{
  "indexes": [
    {
      "slug": "my-first-index",
      "title": "My First Index",
      "status": "created",
      "mapping": <index mapping>
    }
  ]
}

Hints:

Create/Modify Index

PUT /core/v1/search/indexes/<index slug>
{
  "commands": [
    <command description>,
    <command description>,
    ...
  ]
}

200 OK

{
  "indexes": [
    {
      "slug": "my-second-index",
      "title": "My Second Index",
      "status": "created",
      "mapping": <index mapping>
    }
  ]
}

Hints:

Fill Index with Data

POST /core/v1/search/indexes/my-second-index/index-posts
POST /core/v1/search/indexes/my-second-index/index-users
GET /core/v1/search/indexes/my-second-index/index-posts?task_id=38e9eea7-fc5b-4373-90f0-cb2d59109113
GET /core/v1/search/indexes/my-second-index/index-users?task_id=38e9eea7-fc5b-4373-90f0-cb2d59109113

200 OK

{
  "task": {
    "id": "38e9eea7-fc5b-4373-90f0-cb2d59109113",
    "is_ready": true,
    "progress": null,  // or same as "response"
    "response": {
      "entities": {
        "total_count": 12343,
        "indexed_count": 12343
      }
    },
    "exception": null
  }
}
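
A minimal sketch of reading the task status response above, in Python. The helper name `indexing_progress` is hypothetical, and the sketch assumes that `progress`, when present, has the same `entities` shape as `response` (as the comment in the example suggests).

```python
from typing import Optional


def indexing_progress(task: dict) -> Optional[float]:
    """Return the indexed fraction (0.0 to 1.0), or None if no counts are reported."""
    # Prefer the final "response"; fall back to the interim "progress".
    payload = task.get("response") or task.get("progress")
    if not payload:
        return None
    entities = payload["entities"]
    total = entities["total_count"]
    if total == 0:
        # Nothing to index, so treat the task as complete.
        return 1.0
    return entities["indexed_count"] / total
```

A client could poll the GET endpoint with the `task_id` until `is_ready` is true and report this fraction in between.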

Find records count

POST /core/v1/search/indexes/my-second-index/find-count
{
  "filters": [
    <filter>,
    <filter>,
    ...
  ]
}

200 OK

{
  "records_count": 41234
}

Hints:

Find records

POST /core/v1/search/indexes/my-second-index/find
{
  "filters": [
    <filter>,
    <filter>,
    ...
  ],
  "chunk": {
    "order": <order>,
    "limit": 10,
    "offset": 0,
    "cursor": <null or cursor>
  }
}

200 OK

{
  "records": [
    {"id": <taxonomy guid>, "cursor": "cursor-1"},
    {"id": <taxonomy guid>, "cursor": "cursor-2"},
    ...
  ]
}
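
A rough sketch of paging through results with the per-record cursors, in Python. It assumes that passing the cursor of the last record back as `chunk.cursor` returns the following page and that `offset` stays at 0 while a cursor is used. Neither is stated here, and the helper names are hypothetical.

```python
from typing import Any, Optional


def build_find_body(filters: list, order: Any, limit: int = 10,
                    cursor: Optional[str] = None) -> dict:
    """Build the JSON body for POST .../find."""
    return {
        "filters": filters,
        "chunk": {"order": order, "limit": limit, "offset": 0, "cursor": cursor},
    }


def next_cursor(response: dict) -> Optional[str]:
    """Cursor of the last returned record, or None when the page is empty."""
    records = response.get("records", [])
    return records[-1]["cursor"] if records else None
```

The first request is sent with `cursor=None`; each later request reuses the same filters and order with the cursor taken from the previous response, until `next_cursor` returns `None`.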

Hints:

Commands

Commands move an index through its lifecycle:

  1. Create index
    • Update index title:

      {"set-title": "My Second Index"}

    • Update index mapping:

      {"set-mapping": <index mapping>}

  2. Start index. The mapping cannot be changed after an index is started.

    {"start": {}}

    • Fill the index with data

  3. Activate index. This makes it available for reads.

    {"activate": {}}

    • Use index for searching

  4. Stop index. It can no longer be read from or written to, but its data is still stored.

    {"stop": {}}

  5. Remove index. This deletes the index along with all its data.

    {"remove": {}}
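
The lifecycle above can be sketched as a series of request bodies for PUT /core/v1/search/indexes/<index slug>, here in Python. The helper `commands_body` is hypothetical. The sketch sends `start` and `activate` separately because the index is filled in between. Whether several lifecycle commands may be combined in one request is not stated here.

```python
import json


def commands_body(*commands: dict) -> str:
    """Serialize commands into the JSON body of the Create/Modify Index request."""
    return json.dumps({"commands": list(commands)})


# Step 1: create the index and set its title (a "set-mapping" command would go here too).
create = commands_body({"set-title": "My Second Index"})
# Step 2: start the index, then fill it with data.
start = commands_body({"start": {}})
# Step 3: activate the index so it can be used for searching.
activate = commands_body({"activate": {}})
```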

Hints:

Mappings

{
  "root": <taxonomy entity type>,
  "texts": [
    {
      "weight": 8,
      "analyzer": <text analyzer configuration>,
      "extractor": <text extractor configuration>
    },
    ...
  ],
  "shapes": [
    {
      "alias": "alias-for-the-geo-point",
      "extractor": {"geo_point": {
        "latitude": {"basic": {"field": {"post-custom-field": {"path": "location.latitude"}}}},
        "longitude": {"basic": {"field": {"post-custom-field": {"path": "location.longitude"}}}}
      }}
    },
    ...
  ],
  "values": [
    {"type": <value type>, "field": <taxonomy field>},
    {"type": <value type>, "field": <taxonomy field>},
    ...
  ],
  "labels": [
    {"field": <taxonomy field>},
    {"field": <taxonomy field>},
    ...
  ],
  "filters": [
    <taxonomy filter>,
    <taxonomy filter>,
    ...
  ]
}

Hints:

Text Analyzers

Text analysis is the process of converting unstructured text, like the body of an email or a product description, into a structured format that’s optimized for search.

  • For full-text search

    {"unicode": {}}

  • For full-text search by English words

    {"english": {}}

  • For exact case-insensitive search by words

    {"keyword": {"normalizers": [{"lowercase": {}}]}}

  • For case-insensitive search by a substring of at least 3 characters (tri-gram tokens)

    {"ngram": {"size": 3, "normalizers": [{"uppercase": {}}]}}
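
To illustrate why the tri-gram analyzer needs at least 3 characters, here is a small Python sketch of n-gram tokenization with uppercase normalization. It shows the general technique only and is not the search engine's actual tokenizer; both function names are hypothetical.

```python
def ngram_tokens(text: str, size: int = 3) -> set:
    """Split text into overlapping character n-grams after uppercasing."""
    normalized = text.upper()
    return {normalized[i:i + size] for i in range(len(normalized) - size + 1)}


def substring_may_match(query: str, indexed_text: str, size: int = 3) -> bool:
    """True when every n-gram of the query also occurs in the indexed text."""
    query_tokens = ngram_tokens(query, size)
    # Queries shorter than `size` produce no tokens and cannot be matched.
    return bool(query_tokens) and query_tokens <= ngram_tokens(indexed_text, size)
```

With this scheme `"ell"` can match `"Hello World"` regardless of case, while the two-character query `"he"` yields no tokens at all.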

Text Extractors

Text extraction is the process of retrieving and combining texts from indexed entities. The extracted text then goes through the text analysis process before being stored in the index.

  • Take an entity text as it is

    {"basic": {"field": <taxonomy field>}}

  • Combine text from a sequence of sub-extractors
    {
      "composite": {
        "extractors": [
          <text extractor>,
          <text extractor>,
          ...
        ]
      }
    }
    
  • Extract text based on a condition
    {
      "conditioned": {
        "condition": {"taxonomy": {"filters": [
          <taxonomy filter>,
          <taxonomy filter>,
          ...
        ]}},
        "then_extractor": <text extractor or null>,
        "else_extractor": <text extractor or null>
      }
    }
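
A minimal Python sketch of assembling a composite extractor from several basic ones. It borrows the `post-custom-field` field form shown in the geo-point example under Mappings; whether that form suits your text fields is an assumption, and the helper name and the paths in the usage note are hypothetical.

```python
def composite_of_custom_fields(paths: list) -> dict:
    """Build a composite extractor that concatenates text from custom-field paths."""
    return {"composite": {"extractors": [
        # One basic extractor per path, in the order given.
        {"basic": {"field": {"post-custom-field": {"path": path}}}}
        for path in paths
    ]}}
```

For example, `composite_of_custom_fields(["title", "summary"])` yields a configuration that could be placed under `"extractor"` in one of the `"texts"` entries of a mapping.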
    

Hints:

  • <taxonomy field> (see: Fields)

  • <taxonomy filter> (see: Filters)

Value Types

Value fields are used both for sorting results and for filtering by range, so they support only a few simple types:

  • Strings

    "str"

  • Integers

    "int"

  • Floating-point numbers

    "float"

Filters

  • Include only records that match the phrase with default syntax and text weights
    {
      "phrase": {
        "phrase": "Hello World"
      }
    }
    
  • Include only records that match a strict-syntax query in the 2nd text field only
    {
      "phrase": {
        "phrase": "\"Hello World\"",
        "syntax": {"strict": {}},
        "weights": [0, 10, 0]
      }
    }
    
  • Include only records within the given radius of a point
    {
      "geo-distance": {
        "shape": "alias-from-the-mapping",
        "point": {"latitude": "37.7749", "longitude": "-122.4194"},
        "radius": {"miles": "1000.0"}
      }
    }
    
  • Include only records that match labels and ranges
    {
      "taxonomy": <taxonomy filter>
    }
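
The conditional `<if test="...">` filters from the XML examples can be mirrored when building a `"filters"` array by hand. The Python sketch below is one way to do it, assuming the shape alias and radius from the examples above; the function name and parameters are hypothetical.

```python
from typing import Optional


def build_filters(phrase: Optional[str] = None,
                  lat: Optional[str] = None,
                  lon: Optional[str] = None) -> list:
    """Return a "filters" list, adding each filter only when its input is present."""
    filters = []
    if phrase:
        filters.append({"phrase": {"phrase": phrase}})
    # The geo filter needs both coordinates, as in the XML <if> test.
    if lat and lon:
        filters.append({"geo-distance": {
            "shape": "alias-from-the-mapping",
            "point": {"latitude": lat, "longitude": lon},
            "radius": {"miles": "1000.0"},
        }})
    return filters
```

The result can be sent as the `"filters"` value of a find or find-count request.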
    

Hints:

Orders

  • Sort by one of the value fields

    {"value": {"field": <taxonomy field>, "reverse": true}}

  • Sort by relevance to the phrase

    {"relevance": {}}

  • Sort by relevance to the phrase, penalizing the relevance score of “older” records

    {"relevance": {"decay": {"field": <timestamp taxonomy field>}}}

  • Sort by distance to the point

    {"geo-distance": {"shape": "alias-from-the-mapping", "point": {"latitude": "37.7749", "longitude": "-122.4194"}, "reverse": false}}

Hints:

  • <taxonomy field> (see: Fields)