My dear reader, how are you? As-salamu alaykum (peace be upon you).

“Find joy in the journey.” Thomas S. Monson

This is the third part of the StackDuplica web application tutorial series. In this part we will build a full-text search engine for questions using Elasticsearch.


A few useful links for following along with the project:

  1. StackDuplica GitHub repository — DirectMe
  2. All other tutorials on StackDuplica — DirectMe
  3. Fetch the StackDuplica Part 2 commit (the starting point for this part) with the following command, then check it out, for example with git checkout FETCH_HEAD:
git fetch origin 9b4f6b8a3330399f47aeeb8b86b76202aa94a68a

Start by Setting Up Elasticsearch

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. It is developed in Java, and its source is released under various licenses (mostly the Apache License).

(stackenv)StackDuplica$ sudo apt-get update
(stackenv)StackDuplica$ sudo apt install docker.io curl 

# Pull the Elasticsearch image and run the service with Docker as shown below

(stackenv)StackDuplica$ sudo docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.0.0

# Create an index for the qanda application

(stackenv)StackDuplica$ curl -XPUT "localhost:9200/qanda?pretty"

Let us now create the Elasticsearch service module, which loads the questions stored in the database into Elasticsearch and searches them, as shown below

# Create qanda/service/elasticsearch.py and add the following program to it

from elasticsearch import Elasticsearch
from elasticsearch.helpers import streaming_bulk

from django.conf import settings

import logging

ALREADY_EXISTS_EXCEPTION = 'resource_already_exists_exception'

FAILED_TO_LOAD_ERROR = 'Failed to load {}: {!r}'

ISO_DATE_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'

logger = logging.getLogger(__name__)


def get_client():
    return Elasticsearch(hosts=[
        {'host': settings.ES_HOST, 'port': settings.ES_PORT,}
    ])


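def bulk_load(questions):
    """Index the given questions in bulk; return True if every one succeeded."""
    all_ok = True
    # Lazily convert each model instance into an Elasticsearch action dict.
    es_questions = (q.as_elasticsearch_dict() for q in questions)
    for ok, result in streaming_bulk(
            get_client(),
            es_questions,
            index=settings.ES_INDEX,
            raise_on_error=False,  # report failures instead of raising
    ):
        if not ok:
            all_ok = False
            # result maps the action name (e.g. 'index') to the error details.
            action, result = result.popitem()
            logger.error(FAILED_TO_LOAD_ERROR.format(result['_id'], result))
    return all_ok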


def search_for_questions(query):
    client = get_client()
    result = client.search(index=settings.ES_INDEX, body={
      'query': {
          'match': {
              'text': query,
          },
      },
    })
    return (h['_source'] for h in result['hits']['hits'])


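def upsert(question_model):
    """Update the question's document, creating it if it does not exist yet."""
    client = get_client()
    question_dict = question_model.as_elasticsearch_dict()
    doc_type = question_dict['_type']
    # '_id' and '_type' are metadata, not document fields, so strip them
    # before sending the remaining fields as the document body.
    del question_dict['_id']
    del question_dict['_type']
    response = client.update(
        settings.ES_INDEX,
        doc_type,
        id=question_model.id,
        body={
            'doc': question_dict,
            'doc_as_upsert': True,  # insert the document if it is missing
        }
    )
    return response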
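
The service module reads settings.ES_HOST, settings.ES_PORT, and settings.ES_INDEX, so these names have to exist in the project's Django settings module. Here is a minimal sketch of what they might look like, assuming the single-node Docker container started above (port 9200 on localhost) and the qanda index created with curl. The environment variable names and the defaults are assumptions for illustration, not values taken from the repository:

```python
import os

# Assumed defaults: the Docker container above publishes port 9200 on
# localhost, and the index was created as "qanda". Adjust for your setup.
ES_HOST = os.environ.get('ES_HOST', 'localhost')
ES_PORT = int(os.environ.get('ES_PORT', '9200'))
ES_INDEX = os.environ.get('ES_INDEX', 'qanda')
```

Reading the values from the environment keeps the same settings file usable if Elasticsearch later moves to another host.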

Update the Question Model

Once the Elasticsearch service code is in place, we will update our Question model as shown below

# Open qanda/models.py and add the following method to the Question model

    def as_elasticsearch_dict(self):
        return {
            '_id': self.id,
            '_type': 'doc',
            'text': '{}\n{}'.format(self.title, self.question),
            'question_body': self.question,
            'title': self.title,
            'id': self.id,
            'created': self.created,
        }
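
The dictionary returned by this method mixes two kinds of keys. The underscore-prefixed ones (_id, _type) are metadata, and the upsert function in the service module strips them before sending the remaining fields as the document. The sketch below makes that split visible without needing Django or Elasticsearch. FakeQuestion and split_metadata are hypothetical stand-ins written for this illustration, and the sample question is made up:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakeQuestion:
    """Stand-in for the Django Question model (hypothetical, for illustration)."""
    id: int
    title: str
    question: str
    created: datetime

    def as_elasticsearch_dict(self):
        # Same shape as the method added to the real Question model.
        return {
            '_id': self.id,
            '_type': 'doc',
            'text': '{}\n{}'.format(self.title, self.question),
            'question_body': self.question,
            'title': self.title,
            'id': self.id,
            'created': self.created,
        }


def split_metadata(action):
    """Separate underscore-prefixed metadata keys from the document fields."""
    metadata = {k: v for k, v in action.items() if k.startswith('_')}
    source = {k: v for k, v in action.items() if not k.startswith('_')}
    return metadata, source


# Made-up sample data.
q = FakeQuestion(1, 'How do I reverse a list?', 'I tried reversed().',
                 datetime(2020, 1, 1))
metadata, source = split_metadata(q.as_elasticsearch_dict())
print(metadata)
print(sorted(source))
```

The text field combines the title and the question body, and it is the field that search_for_questions matches against.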

Install elasticsearch using pip3 and create a manage.py command to load questions into Elasticsearch

Let us now install the elasticsearch Python client using pip3 for our Django project. Add elasticsearch==6.0.0 to requirements.py and use the following command to install it

(stackenv)StackDuplica$ pip3 install -r requirements.py

Create a manage.py command that loads the Question model instances into Elasticsearch as shown below

# Create qanda/management/commands/load_questions_into_elasticsearch.py and add the following program to it

from django.core.management import BaseCommand

from qanda.service import elasticsearch
from qanda.models import Question

class Command(BaseCommand):
    help = 'Load all questions into Elasticsearch'

    def handle(self, *args, **options):
        queryset = Question.objects.all()
        all_loaded = elasticsearch.bulk_load(queryset)
        if all_loaded:
            self.stdout.write(self.style.SUCCESS(
                'Successfully loaded all questions into Elasticsearch.'))
        else:
            self.stdout.write(
                self.style.WARNING('Some questions not loaded '
                                   'successfully. See logged errors')
            )

Load the questions into the Elasticsearch index

(stackenv)StackDuplica$ python3 manage.py load_questions_into_elasticsearch
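
To confirm that the load worked, you can ask Elasticsearch how many documents the index holds through its _count endpoint. The following is a rough sketch using only the standard library. The helper names are hypothetical, and the host, port, and index defaults assume the setup used earlier in this tutorial:

```python
import json
from urllib.request import urlopen


def count_url(host, port, index):
    """Build the URL of the index's _count endpoint."""
    return 'http://{}:{}/{}/_count'.format(host, port, index)


def parse_count(body):
    """Extract the document count from a _count JSON response body."""
    return json.loads(body)['count']


def count_documents(host='localhost', port=9200, index='qanda'):
    """Ask a running Elasticsearch node how many documents the index holds."""
    with urlopen(count_url(host, port, index)) as response:
        return parse_count(response.read())


# count_documents() needs a running Elasticsearch node, so it is only
# defined here; call it from a Python shell once the container is up.
```

If the number returned matches the number of questions in the database, every question was indexed.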

Create a search view

# Open qanda/views.py and add the following

from django.views.generic import TemplateView

from qanda.service.elasticsearch import search_for_questions

class SearchView(TemplateView):
    template_name = 'qanda/search.html'

    def get_context_data(self, **kwargs):
        # Read the search terms from the ?q=... query string, if present.
        query = self.request.GET.get('q', None)
        ctx = super().get_context_data(query=query, **kwargs)
        if query:
            results = search_for_questions(query)
            ctx['hits'] = results
        return ctx
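
The hits placed in the context are the _source dictionaries of the Elasticsearch response, which is why the template can read the id, title, and question_body of each hit. The sketch below repeats the extraction step of search_for_questions on a made-up response, trimmed to the keys used here. The function name extract_sources is hypothetical:

```python
def extract_sources(result):
    """Yield the stored document from each hit, as search_for_questions does."""
    return (hit['_source'] for hit in result['hits']['hits'])


# Made-up response, trimmed to the keys this sketch needs.
sample_result = {
    'hits': {
        'hits': [
            {
                '_id': '1',
                '_score': 0.5,
                '_source': {
                    'id': 1,
                    'title': 'Sample title',
                    'question_body': 'Sample body',
                },
            },
        ],
    },
}

hits = list(extract_sources(sample_result))
print(hits[0]['title'])
```

Note that the function returns a generator, which can be iterated only once. If the view ever needs the results more than once, wrapping them in list() as above is one option.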

Add a route for the search view

# Open qanda/urls.py and add the following

urlpatterns = [
    # Keep the existing routes and add this one:
    path('q/search', views.SearchView.as_view(), name='question_search'),
]
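
The search form submits with the GET method, so the search terms travel in the query string as the q parameter, which is the value the view reads with request.GET.get('q'). Here is a small sketch of that round trip using the standard library. The helper names are hypothetical, and the '/q/search' prefix assumes the qanda URLs are mounted at the site root, which may differ in your project:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(query, path='/q/search'):
    """Return the search path with the query encoded as the q parameter."""
    return '{}?{}'.format(path, urlencode({'q': query}))


def read_query(url):
    """Return the q parameter from a URL, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get('q')
    return values[0] if values else None


url = build_search_url('django forms')
print(url)
print(read_query(url))
```

When the q parameter is missing, the view gets None and skips the search, so the page shows only the empty form.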

Create a template for search view

# Create qanda/templates/qanda/search.html and add the following template

{% extends "base.html" %}
{% load markdownify %}

{% block body %}
  <h2 >Search</h2 >
  <form method="get" class="form-inline" >
    <input class="form-control mr-2"
           placeholder="Search"
           type="search"
           name="q" value="{{ query }}" >
    <button type="submit" class="btn btn-primary" >Search</button >
  </form >
  {% if query %}
    <h3>Results from search query '{{ query }}'</h3 >
    <ul class="list-unstyled search-results" >
      {% for hit in hits %}
        <li >
          <a href="{% url "qanda:question_detail" pk=hit.id %}" >
            {{ hit.title }}
          </a >
          <div >
            {{ hit.question_body|markdownify|truncatewords_html:20 }}
          </div >
        </li >
      {% empty %}
        <li >No results.</li >
      {% endfor %}
    </ul >
  {% endif %}
{% endblock %}

# Open StackApp/templates/base.html and update the base template as shown below

{% load static %}
<!DOCTYPE html>
<html lang="en" >
<head >
  <meta charset="UTF-8" >
  <title >{% block title %}StackDuplica{% endblock %}</title >
  <link
      href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-beta.2/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-PsH8R72JQ3SOdhVi3uxftmaW6Vc51MKb0q5P2rRUpPvrszuE4W1povHYgTpBfshb"
      crossorigin="anonymous" >
  <link
      href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css"
      rel="stylesheet"
      integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN"
      crossorigin="anonymous" >
  <link rel="stylesheet" href="{% static "base.css" %}" >
</head >
<body >
<nav class="navbar navbar-expand-lg  bg-light" >
  <div class="container" >
    <a class="navbar-brand" href="/" >StackDuplica</a >
    <ul class="navbar-nav" >
      <li class="nav-item" >
        <a class="nav-link" href="{% url "qanda:ask" %}" >Ask</a >
      </li >
      <li class="nav-item" >
        <a
            class="nav-link"
            href="{% url "qanda:index" %}" >
          Today's Questions
        </a >
      </li >

      <li class="nav-item" >
        <form class="form-inline"
              action="{% url "qanda:question_search" %}"
              method="get">
          <input class="form-control mr-sm-2" type="search"
                 name="q"
                 placeholder="Search">
          <button class="btn btn-outline-primary my-2 my-sm-0" type="submit" >
            Search
          </button >
        </form >
      </li > 
    </ul >
  </div >
</nav >
<div class="container" >
  {% block body %}{% endblock %}
</div >
</body >
</html >

If you run the development server now, you should see the new search box and search page at http://127.0.0.1:8000/


I hope you find this tutorial useful. If you find any errors or see room for improvement, let me know in the comments below.

Signing off for today. Stay tuned and I will see you next week! Happy learning.
