My dear reader, how are you? Peace be upon you (السلام عليكم).
“Find joy in the journey.” Thomas S. Monson
This is the third part of the StackDuplica web application tutorial series. We will build a full-text search engine for questions using Elasticsearch.
A few useful links for following along with the project:
- StackDuplica GitHub repository — DirectMe
- All other tutorials on StackDuplica — DirectMe
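- Set your local repository to the end of StackDuplica Part 2 using the following commands (the fetch downloads the commit, the checkout switches your working tree to it):
git fetch origin 9b4f6b8a3330399f47aeeb8b86b76202aa94a68a
git checkout FETCH_HEAD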
Start by Setting Up Elasticsearch
Elasticsearch is a search engine built on the Lucene library. It provides distributed, multitenant-capable full-text search with an HTTP web interface and schema-free JSON documents. It is developed in Java and released under various open-source licenses (mostly the Apache License).
(stackenv)StackDuplica$ sudo apt-get update
(stackenv)StackDuplica$ sudo apt install docker.io curl
# Obtain Elasticsearch image and run service using docker as shown below
(stackenv)StackDuplica$ sudo docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.0.0
# Create an index for qanda application
(stackenv)StackDuplica$ curl -XPUT "localhost:9200/qanda?pretty"
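Before creating the index, it can help to confirm that the container is answering on port 9200. Here is a minimal sketch, assuming the root endpoint of Elasticsearch 6.x returns a JSON document with `cluster_name` and `version.number` fields. The helper names `fetch_cluster_info` and `describe_cluster` are illustrative and not part of the StackDuplica code.

```python
import json
from urllib.request import urlopen


def fetch_cluster_info(url='http://localhost:9200/'):
    """Return the JSON document served by the Elasticsearch root endpoint."""
    with urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode('utf-8'))


def describe_cluster(info):
    """Summarise the cluster name and version from the root endpoint document."""
    return '{} (Elasticsearch {})'.format(
        info['cluster_name'], info['version']['number'])
```

With the container running, `print(describe_cluster(fetch_cluster_info()))` should print the cluster name and version. A connection error usually means the container is still starting up.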
Let us now create an Elasticsearch service module that loads the questions from the database into Elasticsearch and searches them, as shown below:
# Create qanda/service/elasticsearch.py and add the following program into it
from elasticsearch import Elasticsearch
from elasticsearch.helpers import streaming_bulk
from django.conf import settings
import logging

ALREADY_EXISTS_EXCEPTION = 'resource_already_exists_exception'
FAILED_TO_LOAD_ERROR = 'Failed to load {}: {!r}'
ISO_DATE_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'

logger = logging.getLogger(__name__)


def get_client():
    # Connect to the Elasticsearch node configured in the Django settings
    return Elasticsearch(hosts=[
        {'host': settings.ES_HOST, 'port': settings.ES_PORT},
    ])


def bulk_load(questions):
    # Index all questions; return True only if every one was loaded
    all_ok = True
    es_questions = (q.as_elasticsearch_dict() for q in questions)
    for ok, result in streaming_bulk(
            get_client(),
            es_questions,
            index=settings.ES_INDEX,
            raise_on_error=False,
    ):
        if not ok:
            all_ok = False
            action, result = result.popitem()
            logger.error(FAILED_TO_LOAD_ERROR.format(result['_id'], result))
    return all_ok


def search_for_questions(query):
    # Full-text match against the combined title and body in 'text'
    client = get_client()
    result = client.search(index=settings.ES_INDEX, body={
        'query': {
            'match': {
                'text': query,
            },
        },
    })
    return (h['_source'] for h in result['hits']['hits'])


def upsert(question_model):
    # Update the indexed question, or insert it if it is not indexed yet
    client = get_client()
    question_dict = question_model.as_elasticsearch_dict()
    doc_type = question_dict['_type']
    del question_dict['_id']
    del question_dict['_type']
    response = client.update(
        settings.ES_INDEX,
        doc_type,
        id=question_model.id,
        body={
            'doc': question_dict,
            'doc_as_upsert': True,
        }
    )
    return response
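The service module reads `settings.ES_HOST`, `settings.ES_PORT` and `settings.ES_INDEX`, which this part of the tutorial does not show. Here is a minimal sketch of what could go into the project's settings module, assuming the single-node container and the `qanda` index created above. The environment variable names are hypothetical.

```python
import os

# Assumed defaults: they match the docker run and curl commands shown earlier.
# The environment variable names are illustrative, not part of the project.
ES_HOST = os.environ.get('ES_HOST', 'localhost')
ES_PORT = int(os.environ.get('ES_PORT', '9200'))
ES_INDEX = os.environ.get('ES_INDEX', 'qanda')
```

Reading the values from the environment keeps the same code usable if Elasticsearch later moves to another host.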
Update Question Model
Once the Elasticsearch service code is in place, we will update our Question model as shown below:
# Open qanda/models.py and add the following function in Question model
    def as_elasticsearch_dict(self):
        return {
            '_id': self.id,
            '_type': 'doc',
            'text': '{}\n{}'.format(self.title, self.question),
            'question_body': self.question,
            'title': self.title,
            'id': self.id,
            'created': self.created,
        }
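To see what this method produces without a database, here is a small sketch with a hypothetical `FakeQuestion` stand-in for the Django model. The `split_metadata` helper is also illustrative. It separates the underscore-prefixed keys, which to my understanding the bulk helper treats as action metadata and which `upsert` deletes by hand, from the fields that are actually stored.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakeQuestion:
    """Hypothetical stand-in for the Django Question model."""
    id: int
    title: str
    question: str
    created: datetime

    def as_elasticsearch_dict(self):
        return {
            '_id': self.id,
            '_type': 'doc',
            'text': '{}\n{}'.format(self.title, self.question),
            'question_body': self.question,
            'title': self.title,
            'id': self.id,
            'created': self.created,
        }


def split_metadata(es_dict):
    """Separate underscore-prefixed metadata from the document fields."""
    metadata = {k: v for k, v in es_dict.items() if k.startswith('_')}
    source = {k: v for k, v in es_dict.items() if not k.startswith('_')}
    return metadata, source


question = FakeQuestion(1, 'How do I index?', 'Using **Django**.',
                        datetime(2018, 1, 1, 12, 0))
metadata, source = split_metadata(question.as_elasticsearch_dict())
print(metadata)        # {'_id': 1, '_type': 'doc'}
print(source['text'])  # title and body joined by a newline
```

The `text` field is the one the search query matches against, which is why it combines the title and the body.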
Install elasticsearch using pip3 and create a manage.py command to load questions into Elasticsearch
Let us now install the elasticsearch client for our Django project using pip3. Add elasticsearch==6.0.0 to requirements.py and use the following command to install it:
(stackenv)StackDuplica$ pip3 install -r requirements.py
Create a manage.py command that loads the Question model instances into Elasticsearch, as shown below:
# create qanda/management/commands/load_questions_into_elasticsearch.py and add the following program into it.
from django.core.management import BaseCommand
from qanda.service import elasticsearch
from qanda.models import Question


class Command(BaseCommand):
    help = 'Load all questions into Elasticsearch'

    def handle(self, *args, **options):
        queryset = Question.objects.all()
        all_loaded = elasticsearch.bulk_load(queryset)
        if all_loaded:
            self.stdout.write(self.style.SUCCESS(
                'Successfully loaded all questions into Elasticsearch.'))
        else:
            self.stdout.write(
                self.style.WARNING('Some questions not loaded '
                                   'successfully. See logged errors')
            )
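The warning branch of the command depends on how `bulk_load` reads the `(ok, result)` pairs coming from `streaming_bulk`. The sketch below replays that loop over hand-written pairs, so it runs without Elasticsearch. The shape of the failure item (action name mapped to per-document details) is my assumption about what the helper yields with `raise_on_error=False`.

```python
import logging

FAILED_TO_LOAD_ERROR = 'Failed to load {}: {!r}'
logger = logging.getLogger(__name__)


def all_loaded(results):
    """Return True only if every (ok, result) pair reports success."""
    all_ok = True
    for ok, result in results:
        if not ok:
            all_ok = False
            # Each result maps the action name to its per-document details.
            action, details = dict(result).popitem()
            logger.error(FAILED_TO_LOAD_ERROR.format(details['_id'], details))
    return all_ok


# Hand-written results imitating what streaming_bulk is assumed to yield.
fake_results = [
    (True, {'index': {'_id': '1', 'status': 201}}),
    (False, {'index': {'_id': '2', 'status': 400,
                       'error': {'type': 'mapper_parsing_exception'}}}),
]
print(all_loaded(fake_results))  # False
```

A single failed document is enough to make the command print the warning, while the loop still goes through all remaining documents.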
Load the questions into Elasticsearch:
(stackenv)StackDuplica$ python3 manage.py load_questions_into_elasticsearch
Create a search view
# open qanda/views.py and add the following
from django.views.generic import TemplateView

from qanda.service.elasticsearch import search_for_questions


class SearchView(TemplateView):
    template_name = 'qanda/search.html'

    def get_context_data(self, **kwargs):
        # Read the search term from the ?q= query-string parameter
        query = self.request.GET.get('q', None)
        ctx = super().get_context_data(query=query, **kwargs)
        if query:
            results = search_for_questions(query)
            ctx['hits'] = results
        return ctx
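The `hits` that the view hands to the template are the `_source` documents of the search response. Here is a sketch of that step over a hand-written response fragment. The function names `build_match_query` and `extract_sources` are illustrative, and the response shape is what I expect from Elasticsearch 6.x rather than real output.

```python
def build_match_query(query):
    """Build the request body for a full-text match on the 'text' field."""
    return {'query': {'match': {'text': query}}}


def extract_sources(result):
    """Yield the stored document of every hit in a search response."""
    return (hit['_source'] for hit in result['hits']['hits'])


# Hand-written response fragment, not real Elasticsearch output.
fake_response = {
    'hits': {
        'total': 1,
        'hits': [
            {'_id': '1', '_score': 0.28,
             '_source': {'id': 1, 'title': 'How do I index?',
                         'question_body': 'Using **Django**.'}},
        ],
    },
}

hits = list(extract_sources(fake_response))
print(hits[0]['title'])  # How do I index?
```

Note that the service returns a generator, which can be consumed only once. That is enough for a single `{% for %}` loop. Wrapping it in `list()`, as in the sketch, would allow the template to iterate over it more than once.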
Add a route for the search view
# Open qanda/urls.py and add the following
urlpatterns = [
    path('q/search', views.SearchView.as_view(), name='question_search'),
]
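The search form submits with the GET method, so the query ends up in the URL as a `q` parameter. Here is a small sketch of the resulting address, assuming the qanda URLs are included at the site root. The `search_url` helper is illustrative only.

```python
from urllib.parse import urlencode


def search_url(query, base='/q/search'):
    """Return the search page URL for a query string."""
    return '{}?{}'.format(base, urlencode({'q': query}))


print(search_url('django orm'))  # /q/search?q=django+orm
```

Because the query lives in the URL, search results can be bookmarked and shared like any other page.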
Create a template for search view
# create qanda/templates/qanda/search.html and add the following program
{% extends "base.html" %}
{% load markdownify %}
{% block body %}
  <h2>Search</h2>
  <form method="get" class="form-inline">
    <input class="form-control mr-2" placeholder="Search" type="search" name="q" value="{{ query|default_if_none:'' }}">
    <button type="submit" class="btn btn-primary">Search</button>
  </form>
  {% if query %}
    <h3>Results from search query '{{ query }}'</h3>
    <ul class="list-unstyled search-results">
      {% for hit in hits %}
        <li>
          <a href="{% url "qanda:question_detail" pk=hit.id %}">
            {{ hit.title }}
          </a>
          <div>
            {{ hit.question_body|markdownify|truncatewords_html:20 }}
          </div>
        </li>
      {% empty %}
        <li>No results.</li>
      {% endfor %}
    </ul>
  {% endif %}
{% endblock %}

# open StackApp/templates/base.html and update the base template as shown below
{% load static %}
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>{% block title %}StackDuplica{% endblock %}</title>
  <link href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-beta.2/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-PsH8R72JQ3SOdhVi3uxftmaW6Vc51MKb0q5P2rRUpPvrszuE4W1povHYgTpBfshb" crossorigin="anonymous">
  <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous">
  <link rel="stylesheet" href="{% static "base.css" %}">
</head>
<body>
<nav class="navbar navbar-expand-lg bg-light">
  <div class="container">
    <a class="navbar-brand" href="/">StackDuplica</a>
    <ul class="navbar-nav">
      <li class="nav-item">
        <a class="nav-link" href="{% url "qanda:ask" %}">Ask</a>
      </li>
      <li class="nav-item">
        <a class="nav-link" href="{% url "qanda:index" %}">
          Today's Questions
        </a>
      </li>
      <li class="nav-item">
        <form class="form-inline" action="{% url "qanda:question_search" %}" method="get">
          <input class="form-control mr-sm-2" type="search" name="q" placeholder="Search">
          <button class="btn btn-outline-primary my-2 my-sm-0" type="submit">
            Search
          </button>
        </form>
      </li>
    </ul>
  </div>
</nav>
<div class="container">
  {% block body %}{% endblock %}
</div>
</body>
</html>
If you run the development server now, you should see the new search functionality at http://127.0.0.1:8000/
I hope you find this tutorial useful. If you find any errors or feel any need for improvement, let me know in your comments below.
Signing off for today. Stay tuned and I will see you next week! Happy learning.