Using Elasticsearch to index git repos

Brand new ES user here, so I'm not quite up to speed with the terminology.

I would like to use ES to index a git repo, so that the source code can be searched (like GitHub). Is this possible?

It would seem an obvious thing to be able to do, but I can't find relevant information. A Google search on "elasticsearch" and "git" is not nearly specific enough :wink:

thanks

Ping... I would really appreciate any thoughts or ideas on this.

GitHub repositories are indexed in Elasticsearch (source). Try this page: https://github.com/search/advanced.

Add: If you want to do it yourself, I can only recommend reading through the Elastic docs on Elasticsearch and Logstash capabilities and use cases. Also check whether you can find a useful Logstash input plugin or Beat.

You need to figure out how to retrieve contents from GitHub. Perhaps there is an API you can use? Once you have access to the GitHub content, you need to figure out how to get it into Elasticsearch. That is the easy part. The hard part is figuring out how to handle changes to the repo. The easy way out is to ignore changes.
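To make the "get it into Elasticsearch" step a bit more concrete, here is a rough Python sketch. It is only one possible approach, not a tested recipe. It assumes the repo has already been cloned to local disk (rather than fetched through a hosting API), that `git` is on the PATH, and that an unsecured Elasticsearch node is listening on `http://localhost:9200`. The index name `source-code`, the field names `path` and `content`, and all function names are made up for illustration.

```python
import json
import subprocess
import urllib.request
from pathlib import Path

INDEX = "source-code"             # hypothetical index name
ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def tracked_files(repo_dir):
    """Return the paths git tracks in an already-cloned repository."""
    out = subprocess.run(
        ["git", "-C", str(repo_dir), "ls-files", "-z"],
        check=True, capture_output=True,
    ).stdout
    # -z separates paths with NUL bytes, so odd file names survive intact.
    return [p.decode("utf-8", "replace") for p in out.split(b"\0") if p]


def bulk_body(repo_dir, paths, index=INDEX):
    """Build an NDJSON body for the _bulk API, one document per text file."""
    lines = []
    for path in paths:
        try:
            text = (Path(repo_dir) / path).read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary, missing, or otherwise unreadable files
        # Using the file path as _id means re-indexing a changed file
        # overwrites the old document instead of creating a duplicate.
        lines.append(json.dumps({"index": {"_index": index, "_id": path}}))
        lines.append(json.dumps({"path": path, "content": text}))
    return "\n".join(lines) + "\n" if lines else ""


def send_bulk(body, es_url=ES_URL):
    """POST an NDJSON body to the _bulk endpoint and return the parsed reply."""
    req = urllib.request.Request(
        f"{es_url}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a clone in a (hypothetical) directory `./myrepo`, something like `send_bulk(bulk_body("./myrepo", tracked_files("./myrepo")))` would index every readable text file. Because the path doubles as the document id, simply re-running it after a `git pull` refreshes changed files, but documents for deleted or renamed files stay behind, which is the part of change handling this sketch ignores. A large repo would also need to be sent in several smaller batches.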

If you need inspiration, this blog post shows how to index Bitbucket repositories using Python and a 3-year-old version of Elasticsearch. However, the principles should be similar to what you would do today.
