This repository was archived by the owner on Sep 28, 2022. It is now read-only.
Merged
31 commits
64b2d17
wip [ci skip]
y-abs Mar 12, 2020
a2da48b
searchresult
y-abs Mar 13, 2020
be2c706
wip test
y-abs Mar 19, 2020
69ae9aa
conflict [ci skip]
y-abs Mar 19, 2020
334d940
Merge branch '3-dev' into KZL-1350-document-search
y-abs Mar 19, 2020
7fc17ea
update documentcontroller.java
y-abs Mar 19, 2020
7701c57
doc wip [ci skip]
y-abs Mar 25, 2020
efb6025
doc snippets tests
y-abs Apr 3, 2020
8377f92
snippets tests
y-abs Apr 8, 2020
16ab5fe
[ci skip] conflict
y-abs Apr 8, 2020
4da440e
Merge branch '3-dev' into KZL-1350-document-search
y-abs Apr 8, 2020
1984ed3
Merge branch '3-dev' into KZL-1350-document-search
y-abs Apr 8, 2020
27563c1
tests conflict
y-abs Apr 8, 2020
a292e30
fix tests
y-abs Apr 8, 2020
d5ab56d
search options doc
y-abs Apr 8, 2020
06624bd
deadlink
y-abs Apr 8, 2020
4822a8a
unit test
y-abs Apr 8, 2020
11ced1b
Merge branch '3-dev' into KZL-1350-document-search
y-abs Apr 10, 2020
e4aaf5c
Update doc/3/controllers/document/search/index.md
y-abs Apr 10, 2020
967fae7
Update doc/3/controllers/document/search/index.md
y-abs Apr 10, 2020
a307024
Update doc/3/core-classes/search-options/index.md
y-abs Apr 10, 2020
29b889f
Update doc/3/core-classes/search-result/introduction/index.md
y-abs Apr 10, 2020
425491f
Update doc/3/core-classes/search-result/next/snippets/scroll.java
y-abs Apr 10, 2020
31d27c7
Merge branch '3-dev' into KZL-1350-document-search
scottinet Apr 15, 2020
23f182b
update doc
y-abs Apr 15, 2020
c73f377
Merge branch 'KZL-1350-document-search' of github.com:kuzzleio/sdk-ja…
y-abs Apr 15, 2020
3dd7325
searchResult next async
y-abs Apr 15, 2020
de094fd
Merge branch '3-dev' into KZL-1350-document-search
y-abs Apr 15, 2020
4be6e86
update doc
y-abs Apr 15, 2020
4d72cce
remove debug
y-abs Apr 17, 2020
3142fcb
Merge branch '3-dev' into KZL-1350-document-search
y-abs Apr 20, 2020
2 changes: 2 additions & 0 deletions .ci/doc/templates/default.tpl.java
Original file line number Diff line number Diff line change
@@ -5,9 +5,11 @@
import io.kuzzle.sdk.Options.KuzzleOptions;
import java.util.concurrent.ConcurrentHashMap;
import io.kuzzle.sdk.CoreClasses.Responses.Response;
import io.kuzzle.sdk.CoreClasses.SearchResult;
import io.kuzzle.sdk.Options.SubscribeOptions;
import io.kuzzle.sdk.Options.UpdateOptions;
import io.kuzzle.sdk.Options.CreateOptions;
import io.kuzzle.sdk.Options.SearchOptions;

public class SnippetTest {
private static Kuzzle kuzzle;
6 changes: 6 additions & 0 deletions doc/3/controllers/collection/search-specifications/index.md
@@ -0,0 +1,6 @@
---
code: true
type: page
title: SearchSpecifications
description: Searches collection specifications.
---
78 changes: 78 additions & 0 deletions doc/3/controllers/document/search/index.md
@@ -0,0 +1,78 @@
---
code: true
type: page
title: search
description: Searches for documents
---

# search

Searches for documents.

::: warning
There is a limit to how many documents can be returned by a single search query.
Review comment (Contributor): IMHO this should be a warning

By default, that limit is set to 10000 documents, and it cannot be exceeded even with the `from` and `size` pagination options.
:::

::: info
When processing a large number of documents (i.e. more than 1000), it is advised to paginate the results using [SearchResult.next](/sdk/java/3/core-classes/search-result/next) rather than increasing the size parameter.
:::

::: warning
When using a cursor with the `scroll` option, Elasticsearch has to duplicate the transaction log to keep the same result during the entire scroll session.
This can lead to memory leaks if the scroll duration is too long, or if too many scroll sessions are open simultaneously.
:::

::: info
<SinceBadge version="Kuzzle 2.2.0"/>
You can restrict the scroll session maximum duration under the `services.storage.maxScrollDuration` configuration key.
:::

---

## Arguments

```java
public CompletableFuture<SearchResult> search(
final String index,
final String collection,
final ConcurrentHashMap<String, Object> searchQuery,
final SearchOptions options) throws NotConnectedException, InternalException

```

| Arguments | Type | Description |
| ------------------ | -------------------------------------------- | --------------------------------- |
| `index` | <pre>String</pre> | Index |
| `collection` | <pre>String</pre> | Collection |
| `searchQuery` | <pre>ConcurrentHashMap</pre> | Search query |
| `options` | <pre>SearchOptions</pre> | Query options |
---

### searchQuery body properties:

- `query`: the search query itself, using the [ElasticSearch Query DSL](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/query-dsl.html) syntax.
- `aggregations`: controls how the search results should be [aggregated](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/search-aggregations.html).
- `sort`: contains a list of fields, used to [sort search results](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/search-request-sort.html), in order of importance.

An empty body matches all documents in the queried collection.
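
As an illustration of the body properties above, here is a minimal sketch of a `searchQuery` combining `query` and `sort`. It only uses the Java standard library; the class name `SearchQueryExample` is hypothetical, and it assumes the SDK serializes nested maps and lists to JSON as-is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper class, not part of the SDK.
class SearchQueryExample {
  // Builds the equivalent of:
  // {"query": {"match": {"category": "suv"}},
  //  "sort": [{"_kuzzle_info.createdAt": "desc"}]}
  static ConcurrentHashMap<String, Object> buildSearchQuery() {
    ConcurrentHashMap<String, Object> match = new ConcurrentHashMap<>();
    match.put("category", "suv");

    ConcurrentHashMap<String, Object> query = new ConcurrentHashMap<>();
    query.put("match", match);

    // Each sort entry maps a field name to a sort direction
    ConcurrentHashMap<String, Object> byCreationDate = new ConcurrentHashMap<>();
    byCreationDate.put("_kuzzle_info.createdAt", "desc");
    List<Object> sort = new ArrayList<>();
    sort.add(byCreationDate);

    ConcurrentHashMap<String, Object> searchQuery = new ConcurrentHashMap<>();
    searchQuery.put("query", query);
    searchQuery.put("sort", sort);
    return searchQuery;
  }

  public static void main(String[] args) {
    System.out.println(buildSearchQuery());
  }
}
```

The resulting map would be passed as the `searchQuery` argument of `search`.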

### options

A [SearchOptions](/sdk/java/3/core-classes/search-options) object.

The following options can be set:

| Options | Type<br/>(default) | Description |
| ---------- | ------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `from` | <pre>Integer</pre><br/>(`0`) | Offset of the first document to fetch |
| `size` | <pre>Integer</pre><br/>(`10`) | Maximum number of documents to retrieve per page |
| `scroll` | <pre>String</pre><br/>(`""`) | When set, gets a forward-only cursor with its TTL set to the given value (e.g. `1s`; cf. [Elasticsearch time units](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/common-options.html#time-units)) |

## Return

Returns a [SearchResult](/sdk/java/3/core-classes/search-result) object.

## Usage

<<< ./snippets/search.java
53 changes: 53 additions & 0 deletions doc/3/controllers/document/search/snippets/search.java
@@ -0,0 +1,53 @@
ConcurrentHashMap<String, Object> suv = new ConcurrentHashMap<>();
suv.put("category", "suv");
ConcurrentHashMap<String, Object> limousine = new ConcurrentHashMap<>();
limousine.put("category", "limousine");

CreateOptions options = new CreateOptions();
options.setWaitForRefresh(true);

for (int i = 0; i < 5; i += 1) {
  kuzzle.getDocumentController().create("nyc-open-data", "yellow-taxi", suv, options).get();
}

for (int i = 0; i < 10; i += 1) {
  kuzzle.getDocumentController().create("nyc-open-data", "yellow-taxi", limousine, options).get();
}

ConcurrentHashMap<String, Object> searchQuery = new ConcurrentHashMap<>();
ConcurrentHashMap<String, Object> query = new ConcurrentHashMap<>();
ConcurrentHashMap<String, Object> match = new ConcurrentHashMap<>();
match.put("category", "suv");
query.put("match", match);
searchQuery.put("query", query);

SearchResult results = kuzzle
  .getDocumentController()
  .search("nyc-open-data", "yellow-taxi", searchQuery).get();

System.out.println("Successfully retrieved " + results.total + " documents");

/*
  {
    "aggregations"=undefined,
    "hits"=[
      {
        "_id"="AWgi6A1POQUM6ucJ3q06",
        "_score"=0.046520017,
        "_source"={
          "category"="suv",
          "_kuzzle_info"={
            "author"="-1",
            "createdAt"=1546773859655,
            "updatedAt"=null,
            "updater"=null
          }
        }
      },
      ...
    ],
    "total"=5,
    "fetched"=5,
    "scroll_id"=undefined
  }
*/
10 changes: 10 additions & 0 deletions doc/3/controllers/document/search/snippets/search.test.yml
@@ -0,0 +1,10 @@
name: document#search
description: Search for documents
hooks:
  before: |
    curl -XDELETE kuzzle:7512/nyc-open-data
    curl -XPOST kuzzle:7512/nyc-open-data/_create
    curl -XPUT kuzzle:7512/nyc-open-data/yellow-taxi
  after:
template: default
expected: Successfully retrieved 5 documents
31 changes: 31 additions & 0 deletions doc/3/core-classes/search-options/index.md
@@ -0,0 +1,31 @@
---
code: true
type: page
title: SearchOptions
description: SearchOptions class documentation
order: 110
---

# SearchOptions

This class represents the options usable with the search-related methods.

It can be used with the following methods:
- [document:search](/sdk/java/3/controllers/document/search)
- [collection:searchSpecifications](/sdk/java/3/controllers/collection/search-specifications)

## Namespace

You must import the following class:

```java
import io.kuzzle.sdk.Options.SearchOptions;
```

## Properties

| Property | Type | Description |
| -------- | --------------------- | ------------------------------------- |
| `from` | <pre>Integer</pre> | Offset of the first document to fetch |
| `scroll` | <pre>String</pre> | When set, gets a forward-only cursor with its TTL set to the given value (e.g. `1s`; cf. [Elasticsearch time units](https://www.elastic.co/guide/en/elasticsearch/reference/7.3/common-options.html#time-units)) |
| `size` | <pre>Integer</pre> | Maximum number of documents to retrieve per page |
6 changes: 6 additions & 0 deletions doc/3/core-classes/search-result/index.md
@@ -0,0 +1,6 @@
---
code: true
type: branch
title: SearchResult
description: SearchResult documentation
---
42 changes: 42 additions & 0 deletions doc/3/core-classes/search-result/introduction/index.md
@@ -0,0 +1,42 @@
---
code: true
type: page
title: constructor
description: SearchResult:constructor
order: 1
---

# SearchResult

This class represents a paginated search result.

It can be returned by the following methods:
- [document:search](/sdk/java/3/controllers/document/search)
- [collection:searchSpecifications](/sdk/java/3/controllers/collection/search-specifications)

## Namespace

You must import the following class:

```java
import io.kuzzle.sdk.CoreClasses.SearchResult;
```

## Properties

| Property | Type | Description |
| -------------- | ------------------------------------------------------- | ------------------ |
| `aggregations` | <pre>ConcurrentHashMap<String, Object></pre> | Search aggregations (can be undefined) |
| `hits` | <pre>ArrayList<ConcurrentHashMap<String, Object>></pre> | Page results |
| `total` | <pre>Integer</pre> | Total number of items that _can_ be retrieved |
| `fetched` | <pre>Integer</pre> | Number of retrieved items so far |

### hits

Each element of the `hits` ArrayList is a `ConcurrentHashMap<String, Object>` containing the following properties:

| Property | Type | Description |
| --------- | ------------------ | ---------------------- |
| `_id` | <pre>String</pre> | Document ID |
| `_score` | <pre>Integer</pre> | [Relevance score](https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html) |
| `_source` | <pre>ConcurrentHashMap<String, Object></pre> | Document content |
74 changes: 74 additions & 0 deletions doc/3/core-classes/search-result/next/index.md
@@ -0,0 +1,74 @@
---
code: true
type: page
title: next
description: SearchResult next method
order: 200
---

# next

Advances through the search results and returns the next page of items.

## Arguments

```java
public CompletableFuture<SearchResult> next()
```

## Returns

Returns a `CompletableFuture` that resolves to a `SearchResult` object, or to `null` if no more pages are available.
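
The loop below is a minimal sketch of how a caller might consume this contract, assuming `next` resolves to `null` once the last page has been read. `Page` is a hypothetical in-memory stand-in for `SearchResult`, so the example runs without a Kuzzle server; it uses `join()` rather than `get()` only to avoid checked exceptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class NextLoopSketch {
  // Hypothetical stand-in for SearchResult: holds one page of hits
  // and knows how to build the following page.
  static class Page {
    final List<String> hits;
    private final List<String> all;
    private final int end;
    private final int size;

    Page(List<String> all, int start, int size) {
      this.all = all;
      this.size = size;
      this.end = Math.min(start + size, all.size());
      this.hits = new ArrayList<>(all.subList(start, end));
    }

    // Mirrors the documented contract: resolves to null when no page is left.
    CompletableFuture<Page> next() {
      if (end >= all.size()) {
        return CompletableFuture.completedFuture(null);
      }
      return CompletableFuture.completedFuture(new Page(all, end, size));
    }
  }

  // Collects the hits of every page, starting from the given one.
  static List<String> drain(Page first) {
    List<String> matched = new ArrayList<>();
    Page page = first;
    while (page != null) {
      matched.addAll(page.hits);
      page = page.next().join();
    }
    return matched;
  }

  public static void main(String[] args) {
    List<String> ids = new ArrayList<>();
    for (int i = 0; i < 12; i++) {
      ids.add("suv_no" + i);
    }
    System.out.println(drain(new Page(ids, 0, 5)).size()); // prints 12
  }
}
```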

## Throw

This method throws an exception if:

- No pagination strategy can be applied (see below)
- Invoking it would lead to more than 10 000 items being retrieved with the `from/size` strategy

## Pagination strategies

Depending on the arguments given to the initial search, the `next` method will pick one of the following strategies, by decreasing order of priority.

### Strategy: scroll cursor

If the original search query is given a `scroll` parameter, the `next` method uses a cursor to paginate results.

The results from a scroll request are frozen, and reflect the state of the index at the time of the initial `search` request.
For that reason, this method is guaranteed to return consistent results, even if documents are updated or deleted in the database between two page retrievals.

This is the most consistent way to paginate results. However, it comes at a higher computing cost for the server.

::: warning
When using a cursor with the `scroll` option, Elasticsearch has to duplicate the transaction log to keep the same result during the entire scroll session.
This can lead to memory leaks if the scroll duration is too long, or if too many scroll sessions are open simultaneously.
:::

::: info
<SinceBadge version="Kuzzle 2.2.0"/>
You can restrict the scroll session maximum duration under the `services.storage.maxScrollDuration` configuration key.
:::

<<< ./snippets/scroll.java
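
To make the consistency guarantee concrete, the following sketch simulates a frozen cursor in memory. It is not how Elasticsearch implements scrolling: `ScrollCursor` is a hypothetical class that simply copies the matching ids at search time, which also hints at why long-lived scroll sessions cost memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrollSnapshotSketch {
  // Hypothetical in-memory cursor: copies the ids once, then pages over the copy.
  static class ScrollCursor {
    private final List<String> snapshot;
    private final int size;
    private int position = 0;

    ScrollCursor(List<String> liveIndex, int size) {
      this.snapshot = new ArrayList<>(liveIndex); // frozen at search time
      this.size = size;
    }

    // Returns the next page, or an empty list once the snapshot is exhausted.
    List<String> nextPage() {
      int end = Math.min(position + size, snapshot.size());
      List<String> page = new ArrayList<>(snapshot.subList(position, end));
      position = end;
      return page;
    }
  }

  public static void main(String[] args) {
    List<String> liveIndex = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
    ScrollCursor cursor = new ScrollCursor(liveIndex, 2);
    System.out.println(cursor.nextPage()); // prints [a, b]
    liveIndex.remove("c");                 // deletion between two pages
    System.out.println(cursor.nextPage()); // prints [c, d]: the snapshot is unaffected
  }
}
```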

### Strategy: sort / size

If the initial search contains `sort` and `size` parameters, the `next` method retrieves the next page of results following the sort order, the last item of the current page acting as a live cursor.

To avoid too many duplicates, it is advised to provide a sort combination that will always identify one item only. The recommended way is to use the field `_uid` which is certain to contain one unique value for each document.

Because this method does not freeze the search results between two calls, there can be missing or duplicated documents between two result pages.

This method efficiently mitigates the costs of scroll searches, but returns less consistent results: it's a middle ground, ideal for real-time search requests.
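
The sketch below simulates this live-cursor behavior in memory, assuming a unique sort key. `pageAfter` is a hypothetical helper, not an SDK method: it returns the items that sort strictly after the last item of the previous page, so changes made between two calls are visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SortSizeSketch {
  // Returns up to `size` ids sorted strictly after `lastSeen`
  // (pass null to get the first page).
  static List<String> pageAfter(TreeSet<String> liveIndex, String lastSeen, int size) {
    Iterable<String> candidates =
        lastSeen == null ? liveIndex : liveIndex.tailSet(lastSeen, false);
    List<String> page = new ArrayList<>();
    for (String id : candidates) {
      if (page.size() == size) {
        break;
      }
      page.add(id);
    }
    return page;
  }

  public static void main(String[] args) {
    TreeSet<String> liveIndex = new TreeSet<>(Arrays.asList("a", "b", "c", "d", "e"));
    List<String> first = pageAfter(liveIndex, null, 2);
    System.out.println(first); // prints [a, b]

    liveIndex.remove("c"); // deleted between two pages
    liveIndex.add("bb");   // created between two pages

    String cursor = first.get(first.size() - 1);
    System.out.println(pageAfter(liveIndex, cursor, 2)); // prints [bb, d]
  }
}
```

Because the sort key is unique here, no item is returned twice; with a non-unique key, items sharing the cursor's value could be skipped or repeated.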

### Strategy: from / size

If the initial search contains `from` and `size` parameters, the `next` method retrieves the next page of results by incrementing the `from` offset.

Because this method does not freeze the search results between two calls, there can be missing or duplicated documents between two result pages.

It's the fastest pagination method available, but also the least consistent, and it is not possible to retrieve more than 10000 items using it.
Above that limit, any call to `next` throws an Exception.

<<< ./snippets/fromsize.java
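
The offset arithmetic behind this strategy can be sketched as follows. `nextFrom` is a hypothetical helper, and both the exact boundary check (`from + size` must stay within 10000) and the exception type are assumptions made for illustration; the SDK may enforce the limit differently.

```java
class FromSizeSketch {
  // Default limit on the number of items reachable with from/size.
  static final int MAX_WINDOW = 10000;

  // Computes the offset of the next page, or -1 when every item has been fetched.
  // Throws if the next page would reach past MAX_WINDOW (assumed boundary rule).
  static int nextFrom(int from, int size, int total) {
    int next = from + size;
    if (next >= total) {
      return -1; // nothing left to fetch
    }
    if (next + size > MAX_WINDOW) {
      throw new IllegalStateException(
          "from/size pagination cannot go past " + MAX_WINDOW + " items");
    }
    return next;
  }

  public static void main(String[] args) {
    System.out.println(nextFrom(0, 5, 12));  // prints 5
    System.out.println(nextFrom(10, 5, 12)); // prints -1
    try {
      nextFrom(9995, 5, 20000);
    } catch (IllegalStateException e) {
      System.out.println("limit reached"); // prints limit reached
    }
  }
}
```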
51 changes: 51 additions & 0 deletions doc/3/core-classes/search-result/next/snippets/fromsize.java
@@ -0,0 +1,51 @@
ArrayList<ConcurrentHashMap<String, Object>> documents = new ArrayList<>();
ConcurrentHashMap<String, Object> body = new ConcurrentHashMap<>();

body.put("category", "suv");
for (int i = 0; i < 100; i++) {
  ConcurrentHashMap<String, Object> document = new ConcurrentHashMap<>();
  document.put("_id", "suv_no" + i);
  document.put("body", body);
  documents.add(document);
}

kuzzle
  .getDocumentController()
  .mCreate("nyc-open-data", "yellow-taxi", documents, true).get();

SearchOptions options = new SearchOptions();
options.setFrom(1);
options.setSize(5);

ConcurrentHashMap<String, Object> searchQuery = new ConcurrentHashMap<>();
ConcurrentHashMap<String, Object> query = new ConcurrentHashMap<>();
ConcurrentHashMap<String, Object> match = new ConcurrentHashMap<>();
match.put("category", "suv");
query.put("match", match);
searchQuery.put("query", query);

SearchResult results = kuzzle.getDocumentController().search(
  "nyc-open-data",
  "yellow-taxi",
  searchQuery, options).get();

// Fetch the matched items by advancing through the result pages
ArrayList<ConcurrentHashMap<String, Object>> matched = new ArrayList<>();

while (results != null) {
  matched.addAll(results.hits);
  results = results.next().get();
}

/*
{ _id="suv_no1",
_score=0.03390155,
_source=
{ _kuzzle_info=
{ author="-1",
updater=null,
updatedAt=null,
createdAt=1570093133057 },
category="suv" } }
*/
System.out.println("Successfully retrieved " + matched.size() + " documents");
11 changes: 11 additions & 0 deletions doc/3/core-classes/search-result/next/snippets/fromsize.test.yml
@@ -0,0 +1,11 @@
name: searchresult#fromsize
description: Next method with from/size
hooks:
  before: |
    curl -XDELETE kuzzle:7512/nyc-open-data
    curl -XPOST kuzzle:7512/nyc-open-data/_create
    curl -XPUT kuzzle:7512/nyc-open-data/yellow-taxi
  after: |
    curl -XDELETE kuzzle:7512/nyc-open-data
template: default
expected: Successfully retrieved 100 documents