How best to index a giant .jl file

Hello, everyone!

I have a fresh, out-of-the-box Elasticsearch and Kibana setup. I need to index the entire contents of a massive JSON Lines file, and I don't know the most efficient way to do it. The schema is pretty simple; my mapping is as follows:

{
	"mappings" : {
		"properties" : {
			"board_title" : { "type" : "text" },
			"title" : { "type" : "text" },
			"url" : { "type" : "text" },
			"original_author_profile" : { "type" : "text" },
			"original_author" : { "type" : "text" },
			"datetime" : { "type" : "date" },
			"content" : {
				"type" : "nested",
				"properties" : {
					"content" : { "type" : "text" },
					"index" : { "type" : "integer" },
					"datetime" : { "type" : "date" },
					"author_profile" : { "type" : "text" },
					"author" : { "type" : "text" }
				}
			}
		}
	}
}

The individual lines look like this (this is a single line, reformatted across multiple lines for readability):

{
	"board_title": [
		"Honda, Yamaha, Kawasaki, Isuszu Generators"
	],
	"title": [
		"Smokstak Yamaha generator sub forum"
	],
	"url": [
		"http://app.scrapingbee.com/api/v1/?api_key=T2OVHPI4YH90819Q9TK048U6I7N9EAHT7HWGO77YSS90P0RQAWG3N0UEG4N2H7CWCIZLQJHYCFDR64P3&render_js=True&url=https://www.smokstak.com/forum/threads/smokstak-yamaha-generator-sub-forum.128020/"
	],
	"original_author_profile": [
		"smokstack.com/forum/members/bassplayer1985.49042/"
	],
	"original_author": [
		"Bassplayer1985"
	],
	"datetime": [
		"Dec 11, 2013 at 10:18 PM"
	],
	"content": [
		{
			"content": [
				"Figured I'd be bold and go ask the question. I see we have a sub forum for Honda generators why not Yamaha? Wasn't sure if there was some crazy reason or not.  <br/>\n<br/>\nI finally got my 2013 Yamaha EF2000is today and the plan is to convert it to tri-fuel with US Carb. They claim their installation won't void the factory warranty, but I'll be sure to do some investigating before committing to the project."
			],
			"index": [
				"1"
			],
			"datetime": [
				"Dec 11, 2013 at 10:18 PM"
			],
			"author_profile": [
				"/forum/members/bassplayer1985.49042/"
			],
			"author": [
				"Bassplayer1985"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nMy guess would be that Honda outsells Yamaha by quite a margin, so maybe that is it. <br/>\n<br/>\nYears ago (mid 80's), I had a tiny Homelite HG600, that was actually a Yamaha painted red instead of blue. I would like to have another in either color."
			],
			"index": [
				"2"
			],
			"datetime": [
				"Dec 11, 2013 at 10:36 PM"
			],
			"author_profile": [
				"/forum/members/wayne-440.491/"
			],
			"author": [
				"Wayne 440"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nI have always wondered why there isn't a subforum for Delco Light generators?<br/>\n<br/>\nAnd here it is: <a class=\"link link--internal\" href=\"https://www.smokstak.com/forum/forumdisplay.php?f=175\">https://www.smokstak.com/forum/forumdisplay.php?f=175</a>"
			],
			"index": [
				"3"
			],
			"datetime": [
				"Dec 11, 2013 at 10:44 PM"
			],
			"author_profile": [
				"/forum/members/netpirate8.7071/"
			],
			"author": [
				"netpirate8"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nI don't know about the sales between the companies, but I chose the EF2000is over the honda EU2000i for some pretty small creature comforts.<br/>\n<br/>\n1. Gas gauge <br/>\n2. DC charge cords included<br/>\n3. Separate fuel shut off & ignition switch vs honda's combined switch<br/>\n4. Valve-train is steel vs honda's plastic composite<br/>\n5. Smaller displacement engine (79cc vs honda's 98.5cc) for longer engine run time while providing the same wattage as the honda. <br/>\n<br/>\nIts small stuff like that. Both have a great reliability track record. Given the facts above it was a no brainer for me.<img alt=\":p\" class=\"smilie smilie--sprite smilie--sprite7\" data-shortname=\":p\" src=\"data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" title=\"Stick out tongue    :p\"/><br/>\n<br/>\nP.S. Blue is my favorite color, but i agree, blue or red looks awesome!...racing colors <img alt=\":D\" class=\"smilie smilie--sprite smilie--sprite8\" data-shortname=\":D\" src=\"data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" title=\"Big grin    :D\"/>"
			],
			"index": [
				"4"
			],
			"datetime": [
				"Dec 11, 2013 at 10:53 PM"
			],
			"author_profile": [
				"/forum/members/bassplayer1985.49042/"
			],
			"author": [
				"Bassplayer1985"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nJust look at the statistics, there are only 691 posts in the Honda section, vs 5,000+ in the Generac section, 11,000+ in the Kohler section, and an incredible almost 94,000 in the Onan section, the better question is why do we bother having a Honda section."
			],
			"index": [
				"5"
			],
			"datetime": [
				"Dec 12, 2013 at 1:20 AM"
			],
			"author_profile": [
				"/forum/members/isaac-1.17741/"
			],
			"author": [
				"Isaac-1"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\ndon't forget Kawasaki.  those engines managed to find themselves hooked up to a Japanese made Onan labeled generators.. and Generac generators as well as making there own name branded generators.   they do make some nice engines for garden tractors and walk behind mowers and other industrial applications as well..  made in Maryville, MO since 1989<br/>\n<br/>\ndon't think we need them tho.. the general discussion area seems pretty good for those not as mass produced well known or mixed matched gen/engine generators"
			],
			"index": [
				"6"
			],
			"datetime": [
				"Dec 12, 2013 at 2:47 AM"
			],
			"author_profile": [
				"/forum/members/yellowlister.32048/"
			],
			"author": [
				"YellowLister"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nHow about a Japanese Generator forum that would include Honda, Yamaha, Kawasaki, and their clones? Hmmmm....."
			],
			"index": [
				"7"
			],
			"datetime": [
				"Dec 12, 2013 at 11:50 AM"
			],
			"author_profile": [
				"/forum/members/btpost.1307/"
			],
			"author": [
				"BTPost"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nAnd a Chinese generator forum.  Sure they are clones, but they come with their own set of headaches.  A very large and complete set...."
			],
			"index": [
				"8"
			],
			"datetime": [
				"Dec 12, 2013 at 12:25 PM"
			],
			"author_profile": [
				"/forum/members/pegasuspinto.7254/"
			],
			"author": [
				"pegasuspinto"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nWell the lack of honda post may be a reflection of reliability,verses a long list of questions and problems with onan and the forsaken china clones.<br/>\n<br/>\nMy experience with onan engines on tractors is they are great till they break,then you have to put on your ski mask and get out the cap pistol and hijack a wells Fargo truck to pay for the parts."
			],
			"index": [
				"9"
			],
			"datetime": [
				"Dec 12, 2013 at 1:15 PM"
			],
			"author_profile": [
				"/forum/members/uglyblue66.65747/"
			],
			"author": [
				"uglyblue66"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\n<blockquote class=\"bbCodeBlock bbCodeBlock--expandable bbCodeBlock--quote\">\n<div class=\"bbCodeBlock-title\">\n<a class=\"bbCodeBlock-sourceJump\" data-content-selector=\"#post-987701\" data-xf-click=\"attribution\" href=\"/forum/goto/post?id=987701\">BTPost said:</a>\n</div>\n<div class=\"bbCodeBlock-content\">\n<div class=\"bbCodeBlock-expandContent\">\n\t\t\tHow about a Japanese Generator forum that would include Honda, Yamaha, Kawasaki, and their clones? Hmmmm.....\n\t\t</div>\n<div class=\"bbCodeBlock-expandLink\"><a>Click to expand...</a></div>\n</div>\n</blockquote>Not a bad idea, but just title it to the three major brands you just mentioned. Clones arent worthy of their own \"named\" sub forum in my eyes. <img alt=\":p\" class=\"smilie smilie--sprite smilie--sprite7\" data-shortname=\":p\" src=\"data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" title=\"Stick out tongue    :p\"/>"
			],
			"index": [
				"10"
			],
			"datetime": [
				"Dec 12, 2013 at 2:17 PM"
			],
			"author_profile": [
				"/forum/members/bassplayer1985.49042/"
			],
			"author": [
				"Bassplayer1985"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\nHey guys!<br/>\n<br/>\nFor those wondering. If you convert a Yamaha generator to tri-fuel (gas propane, NG) with a kit from US Carburetion (<a class=\"link link--external\" href=\"http://www.motorsnorkel.com\" rel=\"nofollow noopener\" target=\"_blank\">http://www.motorsnorkel.com</a>) Yamaha <u><b>WILL HONOR</b></u> the factory 3 year warranty even if you install the kit yourself (which is easy btw)<br/>\n<br/>\nThe only catch the lady from Yamaha told me is if any damage occurs as a result of the kit installation, then the warranty is voided. So be sure you install it right.<br/>\n<br/>\nSome good info to know."
			],
			"index": [
				"11"
			],
			"datetime": [
				"Dec 13, 2013 at 12:11 PM"
			],
			"author_profile": [
				"/forum/members/bassplayer1985.49042/"
			],
			"author": [
				"Bassplayer1985"
			]
		},
		{
			"content": [
				"<b>Re: Why doesn't Smokstak have a Yamaha generator sub forum?</b><br/>\n<br/>\n<blockquote class=\"bbCodeBlock bbCodeBlock--expandable bbCodeBlock--quote\">\n<div class=\"bbCodeBlock-title\">\n<a class=\"bbCodeBlock-sourceJump\" data-content-selector=\"#post-987744\" data-xf-click=\"attribution\" href=\"/forum/goto/post?id=987744\">Bassplayer1985 said:</a>\n</div>\n<div class=\"bbCodeBlock-content\">\n<div class=\"bbCodeBlock-expandContent\">\n\t\t\tNot a bad idea, but just title it to the three major brands you just mentioned. Clones aren't worthy of their own \"named\" sub forum in my eyes. <img alt=\":p\" class=\"smilie smilie--sprite smilie--sprite7\" data-shortname=\":p\" src=\"data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7\" title=\"Stick out tongue    :p\"/>\n</div>\n<div class=\"bbCodeBlock-expandLink\"><a>Click to expand...</a></div>\n</div>\n</blockquote>I think that would be a good idea. But I can't see just one section on a generator that may get one question in a month. I haven't seen a Yamaha or Kawasaki generator in years"
			],
			"index": [
				"12"
			],
			"datetime": [
				"Dec 14, 2013 at 4:17 PM"
			],
			"author_profile": [
				"/forum/members/billy-j-shafer.579/"
			],
			"author": [
				"Billy J Shafer"
			]
		}
	]
}
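One thing worth noting about a record like the one above: every value is wrapped in a one-element list, "index" is a string, and the timestamps are in a format that the default `date` parsing will not accept, as far as I can tell. Here is a minimal Python sketch of flattening a record before indexing. The helper names (`unwrap`, `to_iso`, `normalize`) are hypothetical, and the timestamp format is assumed from the sample only:

```python
import json
from datetime import datetime

# Assumption: every scraped timestamp looks like "Dec 11, 2013 at 10:18 PM".
SCRAPED_FORMAT = "%b %d, %Y at %I:%M %p"


def unwrap(value):
    """Return the sole element of a one-item list, otherwise the value unchanged."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def to_iso(text):
    """Convert a scraped timestamp to ISO 8601 (the source carries no timezone)."""
    return datetime.strptime(text, SCRAPED_FORMAT).isoformat()


def normalize(record):
    """Flatten one thread record so it lines up with the mapping (hypothetical helper)."""
    doc = {key: unwrap(value) for key, value in record.items() if key != "content"}
    if "datetime" in doc:
        doc["datetime"] = to_iso(doc["datetime"])
    posts = []
    for post in record.get("content", []):
        flat = {key: unwrap(value) for key, value in post.items()}
        if "index" in flat:
            flat["index"] = int(flat["index"])
        if "datetime" in flat:
            flat["datetime"] = to_iso(flat["datetime"])
        posts.append(flat)
    doc["content"] = posts
    return doc


# A trimmed-down record in the same shape as the sample.
sample = {
    "title": ["Smokstak Yamaha generator sub forum"],
    "datetime": ["Dec 11, 2013 at 10:18 PM"],
    "content": [
        {"content": ["Figured I'd be bold..."], "index": ["1"],
         "datetime": ["Dec 11, 2013 at 10:18 PM"], "author": ["Bassplayer1985"]},
    ],
}
print(json.dumps(normalize(sample), indent=2))
```

The alternative would be to leave the data alone and declare a custom `format` on the `date` fields in the mapping; the sketch just shows the client-side route.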

There are roughly 130,000 more lines like the sample record above. For a different product, I tried to make the data easier to digest by splitting it into files of under 100 KB each, wrapping the lines in each file in brackets, and adding a comma at the end of each line so that each file was valid JSON. I can do that again if necessary, but I think it would put all of the lines in a file into a single document, which is undesirable.
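For reference, the Bulk API's newline-delimited format keeps one document per line, so nothing has to be wrapped in brackets. Each document is preceded by a small action line. A minimal sketch of building such a body in Python, assuming a hypothetical index name `forum_threads` (the function names are also made up for illustration):

```python
import json


def bulk_lines(jsonl_lines, index_name="forum_threads"):
    """Yield NDJSON lines for the Bulk API: an action line, then the document."""
    action = json.dumps({"index": {"_index": index_name}})
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        yield action
        # Re-serialize so each document is guaranteed to occupy exactly one line.
        yield json.dumps(json.loads(line))


def bulk_body(jsonl_lines, index_name="forum_threads"):
    """Join the pairs into one request body; the Bulk API needs a trailing newline."""
    return "\n".join(bulk_lines(jsonl_lines, index_name)) + "\n"


# Two stand-in lines; in practice these would come from iterating over the .jl file.
sample = ['{"title": ["first thread"]}', '{"title": ["second thread"]}']
print(bulk_body(sample), end="")
```

Each input line ends up as its own document, because every action line applies only to the single source line that follows it.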

Basically, I just need someone to tell me the best way to load the data into the index. Right now I'm sending an individual cURL POST request for each line, and I need something better than that.
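As a rough illustration of what "something better" could look like, here is a sketch that reads the file in batches and sends each batch as one `_bulk` request instead of one request per line. It uses only the Python standard library. The URL, index name, batch size, and function names are all assumptions for illustration, and it assumes a local node with no authentication:

```python
import itertools
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication
INDEX = "forum_threads"           # assumption: hypothetical index name
BATCH_SIZE = 500                  # assumption: a starting point, tune as needed


def batches(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_request(docs_json, es_url=ES_URL, index=INDEX):
    """Build (but do not send) one POST /<index>/_bulk request.

    `docs_json` is a list of single-line JSON strings, one per document.
    """
    action = json.dumps({"index": {}})
    body = "".join(f"{action}\n{doc}\n" for doc in docs_json)
    return urllib.request.Request(
        f"{es_url}/{index}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def load_file(path):
    """Stream a JSON Lines file into the index, one bulk request per batch."""
    with open(path, encoding="utf-8") as handle:
        lines = (line.strip() for line in handle if line.strip())
        for chunk in batches(lines, BATCH_SIZE):
            with urllib.request.urlopen(build_request(chunk)) as response:
                result = json.load(response)
            # The bulk response carries a top-level "errors" flag.
            if result.get("errors"):
                print("some documents in this batch were rejected")
```

Calling `load_file("threads.jl")` (a hypothetical filename) would then send about 260 requests for 130,000 lines at this batch size. The official Python client also ships a `helpers.bulk` function that handles the batching, which may be the simpler option if installing the client is acceptable.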

Thanks!
