Brand new ES user here, so I'm not quite up to speed with the terminology.
I would like to use ES to index a git repo, so that the source code can be searched (like on GitHub). Is this possible?
It seems like an obvious thing to want to do, but I can't find any relevant information. A Google search for "elasticsearch" and "git" is not nearly specific enough.
Add: If you want to do it yourself, I can only recommend reading through the Elastic docs on Elasticsearch and Logstash capabilities and use cases. Also check whether there is a useful Logstash input plugin or Beat.
You need to figure out how to retrieve the contents from GitHub. Perhaps there is an API you can use? Once you have access to the GitHub content, you need to figure out how to get it into Elasticsearch. That is the easy part. The hard part is figuring out how to handle changes to the repo. The easy way out is to ignore changes.
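To make that a bit more concrete, here is a rough sketch in Python (standard library only) of one way it could go, assuming you have a local clone of the repo rather than going through an API. It walks the working tree, hashes each file, and builds a request body in the newline-delimited JSON format that the Elasticsearch `_bulk` endpoint accepts. The file path is used as the document `_id`, so re-indexing a changed file overwrites the old copy, and files that have disappeared get a `delete` action. The function names, the index name and the `path`/`sha1`/`content` fields are my own placeholders, not anything Elasticsearch prescribes, and the sketch does not try to skip binary files.

```python
import hashlib
import json
import os


def snapshot(repo_dir):
    """Map each repo-relative file path to (sha1 of its bytes, decoded text)."""
    files = {}
    for root, dirs, names in os.walk(repo_dir):
        # Prune git's own metadata directory in place so os.walk skips it.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in names:
            full = os.path.join(root, name)
            with open(full, "rb") as fh:
                data = fh.read()
            rel = os.path.relpath(full, repo_dir).replace(os.sep, "/")
            digest = hashlib.sha1(data).hexdigest()
            # Assumption: treat every file as text; undecodable bytes are replaced.
            files[rel] = (digest, data.decode("utf-8", errors="replace"))
    return files


def bulk_body(index, old, new):
    """Build an NDJSON _bulk body that moves the index from `old` to `new`.

    `old` and `new` are dicts as returned by snapshot(). Unchanged files
    (same sha1) are skipped, new or changed files are indexed under their
    path, and files missing from `new` are deleted.
    """
    lines = []
    for path in sorted(new):
        digest, text = new[path]
        if path in old and old[path][0] == digest:
            continue  # unchanged since the previous run
        lines.append(json.dumps({"index": {"_index": index, "_id": path}}))
        lines.append(json.dumps({"path": path, "sha1": digest, "content": text}))
    for path in sorted(set(old) - set(new)):
        lines.append(json.dumps({"delete": {"_index": index, "_id": path}}))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""
```

On the first run you would pass an empty dict as `old`, e.g. `bulk_body("code", {}, snapshot("/path/to/clone"))`, and POST the result to `_bulk` with `Content-Type: application/x-ndjson`. On later runs, after a `git pull`, you pass the previous snapshot as `old` so that only the differences are sent. You would have to store that previous snapshot somewhere yourself (the hashes are enough), which this sketch leaves out.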
If you need inspiration, this blog post shows how to index Bitbucket repositories using Python and a three-year-old version of Elasticsearch. The principles should still be similar to what you would do today.