Conceptual Question from a n00b - Nested documents or _type?

Hi Everyone.

First post so please forgive any miscommunications and correct me if I'm doing something wrong.

I have a JSON Firebase database for a social/dating app that contains data like this:

+users
          +0MMj1nO3n4SanKztLC9AYYHrQl03
                 age: 28
                 bodyType: "Average"
                 children: "No"

The big alphanumeric code there is a unique user_id generated by Firebase when the user signs up. I'm importing these users to Elasticsearch to be able to search on fields such as "age" and "children" and a bunch of other ones.

I'm using a Google Cloud Function to copy the data over. However, it's not going perfectly. I used dynamic mapping to create the users index on my Elasticsearch instance. It created an index as follows:

    {
    "users": {
        "aliases": {},
        "mappings": {
            "0MMj1nO3n4SanKztLC9AYYHrQl03": {
                "properties": {
                    "age": {
                        "type": "long"
                    },
                    "bodyType": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },

 "settings": {
        "index": {
            "creation_date": "1533332041249",
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "uuid": "lvUvZFiUQm6Fwf6Q_RjN2g",
            "version": {
                "created": "6020499"
            },
            "provided_name": "users"
        }

The issue here is that it seems to be using the unique user_id generated by Firebase as the _type. This isn't right, is it?

Doing a search of all user documents in Elasticsearch, I receive the following, which shows an _type field populated by the unique user_id. I thought type was deprecated so I'm confused as to why it's defining one with dynamic indexing.

 "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
        {
            "_index": "users",
            "_type": "0MMj1nO3n4SanKztLC9AYYHrQl03",
            "_id": "JJG2AWUBdN6okhW2LpLJ",
            "_score": 1,
            "_source": {
                "age": 28,
                "bodyType": "Average",
                "children": "No",

Is this right? Is this nested? Being new, I'm a little confused by the terminology.

I'm unable to continue copying more documents in with my Cloud Function because Elasticsearch throws the following error:

StatusCodeError: 400 - {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Rejecting mapping update to [users] as the final mapping would have more than 1 type: [1pWIylpcRahbf5I1VrvB4bmZCmF2, 0MMj1nO3n4SanKztLC9AYYHrQl03]"}]

Notice that "0MMj1nO3n4SanKztLC9AYYHrQl03", the user_id of my first user, is one of the types Elasticsearch is complaining about. Why does it think every new user is a new type? Why aren't types deprecated like I keep reading about? I'm on version 6.2.

Finally, I'm new to JavaScript as well. Here is my Cloud Function which is doing the copying of the data from Firebase into Elasticsearch. Is something here possibly the cause or do I need to change some settings in Elasticsearch to get this to work? Why is it treating each user as a different type when types are deprecated? I'm lost on that.

const functions = require('firebase-functions');
const request = require('request-promise');

exports.indexUsersToElastic = functions.database.ref('/users/{user_id}')
  .onWrite((change, context) => {
    let userData = change.after.val();
    let user_id = context.params.user_id;
    //Comment out next line to turn off Debugging in Firebase Console Log
    console.log('Indexing user', userData);
    let elasticSearchConfig = functions.config().elasticsearch;
    let elasticSearchUrl = 'http://MY.IP.ADDRESS.HERE//elasticsearch/'+'users/'+user_id;

    let elasticSearchMethod = userData ? 'POST' : 'DELETE';

    console.log('URL is ', elasticSearchUrl);

    let elasticsearchRequest = {
      method: elasticSearchMethod,
      url: elasticSearchUrl,
      auth: {
        username: 'MYUSERNAME',
        password: 'MYPASSWORD',
      },
      body: userData,
      json: true
    };
    return request(elasticsearchRequest).then(response => {
      //Comment out next line to turn off Debugging in Firebase Console Log
      console.log('Elastic Search Response...', response);
    });
  });

Summary:

Elasticsearch is dynamically mapping the above index and putting each user's user_id as the type. I thought types were deprecated? When it goes to create the next document in the users index using the same mapping, it throws the following error:

StatusCodeError: 400 - {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Rejecting mapping update to [users] as the final mapping would have more than 1 type: [1pWIylpcRahbf5I1VrvB4bmZCmF2, 0MMj1nO3n4SanKztLC9AYYHrQl03]"}]

Really, I don't think it should be updating the mapping anymore. But it is, and it's using the user_id as a type.

Can anyone help me figure this out or point me to a solution that's out there already? I haven't found it.

Thanks!

My JavaScript is a bit rusty, but here's what I think is going on. This line in your code is causing the issue:

let elasticSearchUrl = 'http://MY.IP.ADDRESS.HERE//elasticsearch/'+'users/'+user_id;

When indexing a document using that URL, your document will get an autogenerated _id, and the _type will be set equal to the user_id. I don't think that's what you want to do. It is the cause of the error that you're seeing: types are deprecated in 6.x but they still exist, and an index created in 6.x only accepts a single document type. Each new user_id looks like a second type, so the mapping update is rejected.
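To make that concrete, here is a rough sketch of how the path segments get read, assuming the usual 6.x layout of /{index}/{type}/{id}. The helper describeIndexPath is made up for illustration; it is not part of Elasticsearch or any client library.

```javascript
// Hypothetical helper, for illustration only.
// Splits a document path into the parts described above:
// /{index}/{type}/{id}, where a missing id means Elasticsearch autogenerates one.
function describeIndexPath(path) {
  const [index, type, id] = path.split('/').filter(Boolean);
  return { index, type, id: id === undefined ? '(autogenerated)' : id };
}

// The path the Cloud Function currently builds: the user_id lands in the type slot.
console.log(describeIndexPath('/users/0MMj1nO3n4SanKztLC9AYYHrQl03'));

// With an explicit type segment, the user_id lands in the id slot instead.
console.log(describeIndexPath('/users/_doc/0MMj1nO3n4SanKztLC9AYYHrQl03'));
```

The first call reports the user_id as the type, which matches the _type you see in your search hits.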

What you probably want to do is have one document type per index, and give your documents an _id equal to the user_id. I would try changing the line into:

let elasticSearchUrl = 'http://MY.IP.ADDRESS.HERE//elasticsearch/'+'users/_doc/'+user_id;

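Now all documents will get the type _doc (which is the convention that we use at Elastic), and an _id equal to the user_id. One caveat: the existing users index already has a mapping type named after your first user_id, so I'd expect it to reject _doc as a second type. You will probably need to delete and recreate the index (or index into a new one) before this works.

I would also use a PUT instead of a POST. PUT is the usual method when you provide the _id yourself, while a POST without an _id in the URL lets Elasticsearch autogenerate one. So, you would also need to change this line: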

let elasticSearchMethod = userData ? 'POST' : 'DELETE';

into:

let elasticSearchMethod = userData ? 'PUT' : 'DELETE';
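Putting both changes together, the request options could be built roughly like this. This is only a sketch: buildUserRequest and its baseUrl parameter are names I made up, the auth block from your function is left out for brevity, and I haven't run it against a real cluster.

```javascript
// Sketch only: buildUserRequest and baseUrl are hypothetical names,
// not part of the original Cloud Function or of any library.
function buildUserRequest(baseUrl, userId, userData) {
  return {
    // PUT with an explicit _id when there is data to index;
    // DELETE when the Firebase node was removed (userData is null).
    method: userData ? 'PUT' : 'DELETE',
    // /users/_doc/{user_id}: index "users", single type "_doc", _id = Firebase user_id.
    url: baseUrl + '/users/_doc/' + encodeURIComponent(userId),
    body: userData,
    json: true
  };
}

// Example with a placeholder base URL (an assumption, substitute your own).
console.log(buildUserRequest('http://localhost:9200', '0MMj1nO3n4SanKztLC9AYYHrQl03', { age: 28 }));
```

Inside the onWrite handler you would then pass the result to request(...) as before, adding your auth settings to the returned object.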
