Invalid field name: _allow_permissions,_deny_permissions

I'm using 7.12.
I installed it from enterprise-search-7.12.1.rpm.

You'll want to upgrade to the latest version (7.13.2). The endpoint you're trying to hit (/api/ws/v1/sources) was introduced in 7.13.0.

Will I lose the custom content I registered when I upgrade?

It seems to conflict with past versions. It seems that I cannot simply upgrade.

What I am trying to do is very simple.
I want to get the IDs of the registered content and of the documents within it, because I need them to edit the content of the documents.
Is this not possible with the 7.12 version?
Also, is it possible to do this with a basic license?
It was easy to register, so I hope it will be easy to browse and edit.

This is the result of running the example listed as List all external identities.

# curl -X GET http://localhost:3002/api/ws/v1/sources/[CONTENT_SOURCE_ID]/external_identities \
> -H "Authorization: Bearer [ACCESS_TOKEN]"
{"errors":["This feature is only available for Workplace Search deployments running against an Elasticsearch cluster with a valid Trial or Platinum license applied. For more information, visit the Enterprise Search licensing reference at https://www.elastic.co/pricing/."]}

Could it be that users with a basic license are not even allowed to see their own registered content?

Will I lose the custom content I registered when I upgrade?

No, upgrades should preserve your data. Here are some docs: Upgrading & migrating | Enterprise Search documentation [8.11] | Elastic.

It seems to conflict with past versions. It seems that I cannot simply upgrade.

Are you experiencing errors when trying to upgrade? I can try to help if you can share them here.

What I am trying to do is very simple. ...

Yes, this is possible, but only with API functionality introduced in 7.13.

Could it be that users with a basic license are not even allowed to see their own registered content?

The concept of external identities, including the API endpoint you're trying to use, belongs to Permissions and Access Control (Permissions & Access Control | Workplace Search documentation [8.11] | Elastic). This permissions functionality is limited to platinum licenses. You won't need it unless you're trying to control the content certain users can see.

The error when upgrading is something like this

File XXXX (from package enterprise-search-7.13.2-1.noarch) conflicts with file from package enterprise-search-7.12.1-1.noarch

When I heard that the data would be retained, I took the plunge: I removed the old version of Enterprise Search and installed the latest one.
I rebuilt the configuration file, and now Enterprise Search is running successfully. Thank you very much.

Now I can finally get started!

I was able to get the content source using the method that Ross taught me.

# curl -u 'enterprise_search:[REDACTED]' http://localhost:3002/api/ws/v1/sources
{"meta":{"page":{"current":1, "total_pages":1, "total_results":18, "size":25}}, "results":[{... snip ... , "name": "test", "context": "organization", "is_searchable":true, "schema":{"created_at": "text", "title": "text", "body": "text", "type": "text", "url": "text"}, "display":{"title_field": "title", "subtitle_field": "url", "description_field": "body", "url_field": "url", "detail_fields":[{"field_name": "url", "label": "URL"}], "color": "#000000"}, "document_count":1, "last_indexed_at": "2021-06-10T09:58:14+00:00"}, ... snip ...

I was also able to get the content source in the same way in python.

contentSources = workplace_search.list_content_sources()
print(contentSources)
{"meta":{"page":{"current":1, "total_pages":1, "total_results":18, "size":25}}, "results":[{... snip ... , "name": "test", "context": "organization", "is_searchable":true, "schema":{"created_at": "text", "title": "text", "body": "text", "type": "text", "url": "text"}, "display":{"title_field": "title", "subtitle_field": "url", "description_field": "body", "url_field": "url", "detail_fields":[{"field_name": "url", "label": "URL"}], "color": "#000000"}, "document_count":1, "last_indexed_at": "2021-06-10T09:58:14+00:00"}, ... snip ...

However, while this result gives us the content ID, it does not give us the document ID.
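
For reference, the content source ID can be picked out of that response by name. This is only a minimal sketch: it assumes each entry in `results` carries `id` and `name` fields, and `find_source_ids` is a hypothetical helper, not part of the client library. The sample data is a made-up stand-in for the real response.

```python
def find_source_ids(list_response, name):
    """Return the IDs of all content sources whose name matches exactly."""
    return [
        source["id"]
        for source in list_response.get("results", [])
        if source.get("name") == name
    ]


# Trimmed-down stand-in for the response of list_content_sources().
sample = {
    "meta": {"page": {"current": 1, "total_pages": 1, "total_results": 2, "size": 25}},
    "results": [
        {"id": "aaa111", "name": "drive", "document_count": 10},
        {"id": "bbb222", "name": "test", "document_count": 1},
    ],
}

print(find_source_ids(sample, "test"))
```

The helper returns a list because nothing here guarantees that names are unique, and it only looks at the page it is given, so a deployment with more than one page of sources would need to repeat the call per page.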

In the python example, the document ID is needed to retrieve the document.

workplace_search.get_document(
    content_source_id=[CONTENT_SOURCE_ID],
    document_id=[DOCUMENT_ID],
)

Is there any way to know the document ID?
Following the content source example, it would be nice to have something like list_documents, but it doesn't seem to be there.

Is there any way to know the document ID?

Yes, you can use the Search API: Search API Reference | Workplace Search documentation [8.11] | Elastic. Also, keep in mind that the Search API uses a different form of authentication.

Am I correct in assuming that POST http://localhost:3002/api/ws/v1/search is the command to retrieve the document?

I created one custom content source and registered the one document from the sample.

curl -X POST http://localhost:3002/api/ws/v1/sources/[ID]/documents/bulk_create \
-H "Authorization: Bearer [AUTH_TOKEN]" \
-H "Content-Type: application/json" \
-d '[
  {
    "id" : 1234,
    "title" : "The Meaning of Time",
    "body" : "Not much. It is a made up thing.",
    "url" : "https://example.com/meaning/of/time",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  }
]'

This does indeed work, and I can see from the Workplace Search administration screen that one document exists.
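
Hand-written JSON inside a shell string is easy to get subtly wrong (a stray comma, a mismatched bracket). As a minimal sketch, the same request body can be built with Python's standard `json` module instead. `build_bulk_body` is a hypothetical helper name, and the document fields are simply the ones from the sample above.

```python
import json


def build_bulk_body(documents):
    """Serialize a list of documents into a JSON array for bulk_create."""
    if not isinstance(documents, list):
        raise TypeError("bulk_create expects a JSON array of documents")
    return json.dumps(documents)


body = build_bulk_body([
    {
        "id": 1234,
        "title": "The Meaning of Time",
        "body": "Not much. It is a made up thing.",
        "url": "https://example.com/meaning/of/time",
        "created_at": "2019-06-01T12:00:00+00:00",
        "type": "list",
    }
])

# Round-trip check: the body parses back into a one-element list.
print(len(json.loads(body)))
```

Because the serializer produces the brackets and commas, the body is always well-formed JSON, whatever the documents contain.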

However, the following command yields no results.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": ""
}'

Why is that?
I am expecting to get the above content registered.
Especially since I need the document ID when I change the document, I would like to know how to at least get the document ID.

What response do you get? That should work.

It does not return any value.

Or would it be more accurate to say that it returns an empty string?

In any case, it will not return the expected results.

I suspect something is wrong with the request, or the response is not being correctly retrieved. Whether the request fails or succeeds with zero documents returned, you shouldn't be receiving a response of an empty string with that curl request.
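
One way to tell those cases apart is to look at the HTTP status code alongside the body. The following is a minimal sketch using only Python's standard library; the helper names `build_search_request` and `send` are hypothetical, and the base URL and token are placeholders. It captures the status and body even when the server answers with an error code.

```python
import json
import urllib.error
import urllib.request


def build_search_request(base_url, access_token, query):
    """Build (but do not send) a POST request for the search endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ws/v1/search",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request; return (status, body) even for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # HTTPError still carries the status code and any response body.
        return error.code, error.read().decode("utf-8")


request = build_search_request("http://localhost:3002", "ACCESS_TOKEN", "")
print(request.get_method(), request.full_url)

# To send it against a running instance:
# status, body = send(request)
# print(status, repr(body))
```

Printing `repr(body)` makes an empty string visible as `''`, so an empty reply is not mistaken for no output at all.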

As strange as it may seem, I have not actually received any results.

I'm starting to think that part of the problem may be that I'm using a basic license.

Could you please try to run the following two commands (which are very similar) with a basic license?
I'll include my results as well.


cf. API Authentication Reference | Workplace Search documentation [8.11] | Elastic

$ curl -X POST http://localhost:3002/api/ws/v1/search \
-u "[USERNAME]:[REDACTED]" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali"
}'
{"errors":["This feature is only available for Workplace Search deployments running against an Elasticsearch cluster with a valid Trial or Platinum license applied. For more information, visit the Enterprise Search licensing reference at https://www.elastic.co/pricing/."]}

cf. Search API Reference | Workplace Search documentation [8.11] | Elastic

$ curl -X POST http://localhost:3002/api/ws/v1/search \
-u "[USERNAME]:[REDACTED]" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali"
}'
$
(Nothing is printed; the shell simply returns to the prompt, ready for the next command.)

As you can see from the URLs, these are actual samples from the documentation.
The difference between the two commands is the presence of the Authorization header.
Is it possible that the message is not being displayed because some check is missing?


As a preface, there are two things I want to say in this comment.

One is the question raised in this topic: is there any way to find out a document ID with a basic license?

The other is that I found a case where the command does not output any message, and I would like to report it. (This will probably need to be fixed in the future, but I won't discuss that request here.)

I have specifically considered a scenario where a document is registered in a content source and the content of that document is searched for in all content sources.
Could you please confirm this?


  1. Create Custom API Source
    Register a content source named "test" as a custom content source.
Source Name: test
Source Identifier: 60c1e1ca84c212ce70633b83
Access Token: [AUTH_TOKEN]
  1. indexing_a_document
    Register a document to the custom content source "test" you created.
# Omitted from the body because they need a platinum license:
#   "_allow_permissions": ["permission1"], "_deny_permissions": []
$ curl -X POST http://localhost:3002/api/ws/v1/sources/60c1e1ca84c212ce70633b83/documents/bulk_create \
-H "Authorization: Bearer [AUTH_TOKEN]" \
-H "Content-Type: application/json" \
-d '[
  {
    "id" : 1234,
    "title" : "The Meaning of Time",
    "body" : "Not much. It is a made up thing.",
    "url" : "https://example.com/meaning/of/time",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  }
]'
  1. list-content-sources-api
    Get the list of content sources.
    (I have a lot of content sources registered, so meta.page.total_results shows 18.)
    (Also, you can use the name "test" as a key to find the desired content source ID: match on results.name and read results.id.)
$ curl -u [USERNAME]:[REDACTED] -X GET http://localhost:3002/api/ws/v1/sources
{"meta":{"page":{"current":1, "total_pages":1, "total_results":18, "size":25}}, "results":[... snip ... {"id": "60c1e1ca84c212ce70633b83", "service_type": "custom", "created_at": "2021-06-10T09:56:26+00:00", "last_updated_at": "2021-06-22T07:15:29+00:00", "is_remote":false, "details":[], "groups":[{"id": "60acc08084c21247b452c8d9", "name": "Default"}], "name": "test", "context": "organization", "is_searchable":true, "schema":{"created_at": "text", "title": "text", "body": "text", "type": "text", "url": "text"}, "display":{"title_field": "title", "subtitle_field": "url", "description_field": "body", "url_field": "url", "detail_fields":[{"field_name": "url", "label": "URL"}], "color": "#000000"}, "document_count":1, "last_indexed_at": "2021-06-22T07:15:29+00:00"}, ... snip ...]}
  1. get-content-source-api
    Get the desired content source from the content source ID.
    (You can get some information about the documents here, such as document_count, but not the document ID.)
$ curl -u [USERNAME]:[REDACTED] -X GET http://localhost:3002/api/ws/v1/sources/60c1e1ca84c212ce70633b83
{"id": "60c1e1ca84c212ce70633b83", "service_type": "custom", "created_at": "2021-06-10T09:56:26+00:00", "last_updated_at": "2021-06-22T07:15:29+00:00", "is_remote":false, "details":[], "groups":[{"id": "60acc08084c21247b452c8d9", "name": "Default"}], "name": "test", "context": "organization", "is_searchable":true, "schema":{"created_at": "text", "title": "text", "body": "text", "type": "text", "url": "text"}, "display":{"title_field": "title", "subtitle_field": "url", "description_field": "body", "url_field": "url", "detail_fields":[{"field_name": "url", "label": "URL"}], "color": "#000000"}, "document_count":1, "last_indexed_at": "2021-06-22T07:15:29+00:00"}
  1. get-document-by-id-api
    (This assumes you already know the document ID somehow. In this case I registered the document myself, so I know it.)
    Get the document from the content source ID and the document ID.
$ curl -u [USERNAME]:[REDACTED] -X GET http://localhost:3002/api/ws/v1/sources/60c1e1ca84c212ce70633b83/documents/1234
{"title": "The Meaning of Time", "body": "Not much. It is a made up thing.", "url": "https://example.com/meaning/of/time", "created_at": "2019-06-01T12:00:00+00:00", "type": "list", "source": "custom", "content_source_id": "60c1e1ca84c212ce70633b83", "last_updated": "2021-06-22T07:15:49+00:00", "id": "1234"}

Of the steps 1-4, there is a big gap for me between steps 3 and 4. (Because there is no way to get the document ID)

Is there a way to get the document ID that fills in the gaps between steps 3 and 4, or shouldn't the document ID be given in step 3?

There is no license requirement to use the search endpoint. I wonder if your content sources are in some unexpected state that causes them to require a platinum license to be searched over. I'm not aware of any such issue, but you could try to rule out that possibility by deleting your existing test content sources and try your search request again.

What I can confirm for you is that the way to get the document ids is to use the search endpoint. It is provided to solve for the use case you're attempting.

As far as I know, the only content sources that require a platinum license are Gmail and Slack.

If someone had started on a trial license and then dropped to a basic license, such content sources might still be present, but I have never used the trial license.

Also, the only content sources I have registered are Google Drive and Custom Sources.
Is there any chance that I will still need a Platinum license in that case?

Hey @its-ogawa,

I need to apologize for leading you down an incorrect path just now. I was wrong: the Search API requires a platinum license. You can always try a free trial if you'd like to experiment with that functionality.

In the meantime, you'll need to provide and keep track of your own document ids while indexing custom API sources if you then need to retrieve those individual documents again later.

I hope that helps. Sorry for the incorrect information earlier.

Ross

Could you please explain this method in more detail?

The document ID is an arbitrary alphanumeric string, so I take it to be a unique ID, so to speak.

Does this mean that it cannot be retrieved through the Workplace Search API?
Do I need to write down the document ID I registered on a piece of paper?

I was simply describing that if you need to access indexed documents over the API, you'll need to keep track of the documents' ids externally in order to retrieve those documents again using the single-document API. Note that if you don't supply a document id while indexing a document, an id will be automatically generated for you and returned in the response.
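
One way to keep track of ids externally is a small local file that is updated every time documents are indexed. The following is a minimal sketch using only Python's standard library. The registry file format and the helper names (`record_ids`, `known_ids`) are hypothetical, and the response shape `{"results": [{"id": ..., "errors": [...]}]}` is an assumption about what bulk_create returns, so check it against the response you actually get.

```python
import json
import os
import tempfile


def record_ids(registry_path, content_source_id, bulk_response):
    """Add the ids of successfully indexed documents to a JSON registry file."""
    registry = {}
    if os.path.exists(registry_path):
        with open(registry_path, encoding="utf-8") as handle:
            registry = json.load(handle)

    ids = set(registry.get(content_source_id, []))
    for result in bulk_response.get("results", []):
        # Assumed shape: each result has an "id" and a list of "errors".
        if not result.get("errors"):
            ids.add(str(result["id"]))

    registry[content_source_id] = sorted(ids)
    with open(registry_path, "w", encoding="utf-8") as handle:
        json.dump(registry, handle, indent=2)


def known_ids(registry_path, content_source_id):
    """Return the recorded document ids for one content source."""
    if not os.path.exists(registry_path):
        return []
    with open(registry_path, encoding="utf-8") as handle:
        return json.load(handle).get(content_source_id, [])


# Example with a stand-in response (not real server output).
path = os.path.join(tempfile.mkdtemp(), "document_ids.json")
fake_response = {"results": [{"id": "1234", "errors": []},
                             {"id": "5678", "errors": ["rejected"]}]}
record_ids(path, "60c1e1ca84c212ce70633b83", fake_response)
print(known_ids(path, "60c1e1ca84c212ce70633b83"))
```

Each recorded id can then be passed to the single-document endpoint to fetch, update, or delete that document later. A database table would serve the same purpose; the file is just the smallest thing that works.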

This is all assuming that your license level is not sufficient to use the search API endpoint. Related to that, we'll be modifying the search API endpoint to call out the license requirement. Sorry again for that confusion.

I see.
I'm a little confused, so let me clarify.
Is it correct to understand as follows?

  • About Document ID

    • If you specify an ID when you register a document, the document ID will be used; if you do not specify an ID, the ID issued by the system will be used.
  • About API for searching document IDs

    • For a basic license, there is no way to know the document ID via API.
    • If you want to know the document ID via API, you need a platinum license.
  • Updating and deleting documents

    • If you have a basic license, you need to keep a separate record of the document ID at the time of registration.
    • If you have a platinum license, you can refer to the document ID at any time using the search API.

If the above understanding is correct, I will close this topic for now.
I'm still a little confused, but thank you for your help. (You have helped me in other topics as well.)


Note that if the above understanding is correct, there is no way to know the document ID via API in case of basic license, which is quite critical for companies starting small.
That's because if you lose the document ID at registration, you cannot update or delete that document.
In the worst case, you can delete the registered content source and re-register it, but this is not very practical if you have a lot of registrations or if they change frequently.

I would like to submit a request to include the document ID in the return value of the get-content-source-api shown above, is that possible?
If it is possible, please let me know the proper way to submit the request.