Namaste 🙏 , Welcome to the sixth weekly blog of GSoC’20 for CDLI.
This week was lighter on the coding part but mostly setting up ElasticSearch. The Access Setup for the framework for Admin, Editor and Public roles is set up as per requirement, still granular roles access is yet to be set up as functionality for such roles are not fully developed. The granular roles will be handled in the final phase.
Now, it was time to work on the Simple Search. For optimizing search, we are going to integrate ElasticSearch.
What is ElasticSearch? Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
ElasticSearch was successfully set up and then we need to insert data into ElasticSearch Server. Since we are using MySQL DB and ElasticSearch supports the NoSQL format, we have to fetch data from DB and insert it into ElasticSearch. So there are many ways to do it but most commonly used was the Logstash.
What is Logstash? Logstash is a light-weight, open-source, server-side data processing pipeline that allows you to collect data from a variety of sources, transform it on the fly, and send it to your desired destination.
At the end of the week, I was able to set up ElasticSearch along with Logstash and set up an initial configuration to see if indexing of required data was working as expected.
Next week will be more focused on writing optimised SQL queries to fetch data in ElasticSearch using Logstash.
On initial setup of ElasticSearch using docker, it requires a lot of memory (of course the volatile one!!) and additionally I have 4 docker containers (necessary one) running for the project. So let’s see if my system can handle the required processing in the upcoming week.
# | Day | Date | A short description of the work done |
---|---|---|---|
1 | Monday | 2020/07/06 | a. WIP : Access setup for Public, Editor and Admin access. |
2 | Tuesday | 2020/07/07 | a. Done with access setup for Public, Editor and Admin. PR !137 b. Access Setup for Granular Roles will be addressed in buffer week. |
3 | Wednesday | 2020/07/08 | a. Exploring Elastic Search for Cakephp. b. Setting up Elastic Search. |
4 | Thursday | 2020/07/09 | a. Done with setup of Elastic Search. b. Exploring search query using Elastic Search. |
5 | Friday | 2020/07/10 | a. Created GeneralFunctionsComponent for commonly used functions. b. WIP : Simple Search. |
6 | Saturday | 2020/07/11 | a. Exploring ways to migrate MySQL Data to Elastic Search. |
7 | Sunday | 2020/07/12 | a. ElasicSearch + Logstash docker setup completed. b. Testing indexing from MariaDB to ElasticSearch using Logstash. |