What is Elasticsearch?
Elasticsearch is a powerful search engine built upon the Lucene library. This software offers a distributed, multi-tenant capable platform for conducting full-text searches through HTTP web interfaces, handling JSON documents without strict schemas. Developed primarily in Java, Elasticsearch has gained immense popularity since its launch in 2010 and is widely recognized as a leading search engine.
Its applications encompass a variety of functions, including log analytics, full-text search, security intelligence, business analytics, and operational intelligence. Esteemed companies like eBay, Facebook, Uber, and GitHub utilise Elasticsearch to construct their products, integrating features such as search, aggregation, analytics, and more.
What is Elasticsearch used for?
Elasticsearch lets you store, search, and analyse huge volumes of data in near real time, typically returning answers in milliseconds. It achieves fast search responses because, instead of scanning the text directly, it searches an index.
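To see why looking up an index is faster than scanning text, here is a minimal sketch of an inverted index in plain Python. It is a simplified illustration only, not how Lucene is implemented; the function names build_inverted_index and search and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, term):
    """Look the term up directly instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


# Made-up sample documents keyed by id.
docs = {
    1: "The Powerful Technology",
    2: "powerful empowering Innovations",
    3: "A quiet afternoon",
}
index = build_inverted_index(docs)
print(search(index, "powerful"))
```

The work of splitting text into words happens once, at indexing time, so each search is a single dictionary lookup rather than a pass over every document.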
Elasticsearch – some basic concepts:
- Index – a collection of different types of documents and document properties. For example, a document set may contain the data of a social networking application.
- Type/Mapping − in older versions of Elasticsearch, a type was a collection of documents sharing a set of common fields in the same index. For example, in an index holding data from a social networking application, there could be one type for user profile data, another for messaging data, and yet another for comments data. Mapping types were removed in Elasticsearch 8.x, so an index now holds a single kind of document and the mapping simply describes that document's fields.
- Document − a collection of fields expressed in JSON format. Every document resides inside an index (and, in older versions, belonged to a type). Every document is associated with a unique identifier, stored in its _id field.
- Field – Elasticsearch fields can include multiple values of the same type (essentially a list). In SQL, on the other hand, a column can contain exactly one value of the said type.
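As a rough illustration of these concepts, a document is just a JSON object, and a single field may hold a list of values. The index name, id, and field names below are made up for the example.

```python
import json

# Hypothetical document for a social networking application.
user_profile = {
    "name": "Asha",
    "joined": "2023-10-05",
    # One field holding several values of the same type.
    "interests": ["search", "django", "analytics"],
}

# Where the document would live: an index name plus a unique _id (both assumed here).
location = {"_index": "social_app", "_id": "1"}

print(json.dumps({**location, "_source": user_profile}, indent=2))
```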
Using Elasticsearch with Django:
To take advantage of Elasticsearch with Django, we’ll use a few beneficial packages:
- Elasticsearch DSL – a high-level library that helps write and run queries against Elasticsearch. It’s built on top of the official low-level client (elasticsearch-py).
- Django Elasticsearch DSL – a package that allows easy integration and configuration of Elasticsearch with Django. It’s built as a thin wrapper around elasticsearch-dsl-py so that you can use all the features developed by the elasticsearch-dsl-py team.
- Django Elasticsearch DSL DRF – integrates Elasticsearch DSL with the Django REST framework, making it straightforward to expose Elasticsearch documents through API endpoints.
Download and install the .zip package
- Download Elasticsearch 8.9.0 from: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.9.0-windows-x86_64.zip
- Unzip the downloaded .zip package.
- Run the elasticsearch.bat script in the bin\ folder to start Elasticsearch from the command line. Alternatively, the elasticsearch-service.bat script in the same folder installs Elasticsearch as a Windows service that can be started and stopped from the command line.

    C:\elasticsearch-8.9.0\bin>elasticsearch.bat
After running the file for the very first time, we get a password and a certificate. Our project is only for demonstration or local development, so these are not needed, and we will disable the security configuration by editing the elasticsearch.yml file in the config directory. Look for the entry xpack.security.enabled, which is true by default, and change it to false. After restarting Elasticsearch with this change, connections no longer require credentials. This can be verified by visiting the Elasticsearch URL in a browser.
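The same check can be scripted. The sketch below uses only the Python standard library and assumes Elasticsearch is listening on its default address, http://localhost:9200, over plain HTTP with security disabled; the helper name cluster_info is hypothetical.

```python
import json
import urllib.request


def cluster_info(url="http://localhost:9200", timeout=2.0):
    """Return the JSON document served at the cluster root, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts;
        # ValueError covers replies that are not valid JSON.
        return None


info = cluster_info()
if info is None:
    print("Elasticsearch is not reachable")
else:
    print("Connected to Elasticsearch", info.get("version", {}).get("number"))
```

If the server is running, the root endpoint reports cluster details including the version number; if it is not, the helper returns None instead of raising.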
Installing Elasticsearch and Django-Elasticsearch Integration:
Step 1:
- Install the Python client for Elasticsearch: pip install elasticsearch
- Install Django Elasticsearch DSL: pip install django-elasticsearch-dsl
Configuring Django Settings:
Step 2:
- Then add django_elasticsearch_dsl to INSTALLED_APPS in settings.py:

    INSTALLED_APPS = [
        # ... your existing apps ...
        'django_elasticsearch_dsl',
    ]
Step 3:
- You must define ELASTICSEARCH_DSL in your Django settings:

    ELASTICSEARCH_DSL = {
        'default': {
            # Include the scheme; the 8.x Python client expects a full URL.
            'hosts': 'http://localhost:9200'
        },
    }
Step 4:
- Define the Model
    from django.db import models


    class Post(models.Model):
        title = models.CharField(max_length=200)
        content = models.TextField()
        pub_date = models.DateTimeField('date published')

        def __str__(self):
            return self.title
- Don’t forget to run migrations and migrate:
- python manage.py makemigrations
- python manage.py migrate
- To make this model work with Elasticsearch, create a subclass of django_elasticsearch_dsl.Document, define a class Index inside it to set the Elasticsearch index name, settings, and so on, and finally register the class using the registry.register_document decorator. The Document class must be defined in documents.py in your app directory.
Step 5: Indexing Data in documents.py
- Create a file named documents.py in the blog app to define the Elasticsearch document:

    from django_elasticsearch_dsl import Document
    from django_elasticsearch_dsl.registries import registry

    from .models import Post


    @registry.register_document
    class PostDocument(Document):
        class Index:
            name = 'post_index'  # Name of the Elasticsearch index
            settings = {
                'number_of_shards': 1,
                'number_of_replicas': 0,
                'blocks': {'read_only_allow_delete': False},
            }

        class Django:
            model = Post
            # Model fields to index. Their Elasticsearch types are derived from
            # the Django field types, so title and content become analysed text
            # fields that support full-text search. Fields listed here must not
            # also be declared explicitly on the Document class.
            fields = [
                'title',
                'content',
                'pub_date',
            ]
            # Ignore auto updating of Elasticsearch when a model is saved or deleted:
            ignore_signals = False
            # Don't perform an index refresh after every update (overrides global setting):
            auto_refresh = False
Populate:
- To create and populate the Elasticsearch index and mapping, use the search_index command: python manage.py search_index --rebuild
- For more help, run: python manage.py search_index --help
Step 6: Search Data in Elasticsearch
- You can now search for blog posts using Elasticsearch. Here’s a simple search function in views.py:
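    from django.http import JsonResponse
    from elasticsearch import Elasticsearch


    def search(request):
        # Read the search term from the query string, e.g. /search/?q=powerful
        # (the parameter name 'q' is an assumption).
        query = request.GET.get('q')
        if not query:
            # Nothing to search for: return an empty result set.
            return JsonResponse({'hits': {'hits': []}, 'query': query})

        # Security is disabled locally, so plain HTTP is used.
        es = Elasticsearch('http://localhost:9200')
        search_results = es.search(
            index='post_index',  # must match Index.name in documents.py
            query={'match': {'title': query}},
        )
        return JsonResponse({'hits': search_results['hits'], 'query': query})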
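If you later want to match against both the title and the content, one option is a multi_match query. The helper below is a hypothetical sketch that only builds the query dictionary and does not contact Elasticsearch; the function name and the default field list are assumptions.

```python
def build_search_query(term, fields=("title", "content")):
    """Build a multi_match query clause for the given search term."""
    if not term:
        raise ValueError("term must be a non-empty string")
    return {
        "multi_match": {
            "query": term,
            "fields": list(fields),
        }
    }


query = build_search_query("powerful")
print(query)
```

Keeping the query construction in a small pure function makes it easy to unit-test without a running cluster.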
Step 7: URL Configuration (urls.py)
    from django.urls import path

    from . import views

    urlpatterns = [
        path('search/', views.search, name='search'),
    ]
After making a search API request with the query “powerful”, the resulting JSON output with matched titles is structured as follows:
Output:
    {
      "hits": {
        "total": {
          "value": 2,
          "relation": "eq"
        },
        "hits": [
          {
            "_index": "post_index",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "title": "The Powerful Technology",
              "content": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vel...",
              "pub_date": "2023-10-05T12:34:56"
            }
          },
          {
            "_index": "post_index",
            "_type": "_doc",
            "_id": "2",
            "_score": 0.8,
            "_source": {
              "title": "powerful empowering Innovations",
              "content": "Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua...",
              "pub_date": "2023-09-30T09:00:00"
            }
          }
        ]
      },
      "query": "powerful"
    }
The output includes a total of 2 hits with the respective titles, content, and publication dates matching the query “powerful.” Each hit provides information about the title, content, and metadata of the posts in the Elasticsearch index.
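On the client side, the titles can be pulled out of a response shaped like the one above. This is a small sketch; extract_titles is a hypothetical helper, and the sample payload is trimmed from the output shown earlier.

```python
def extract_titles(payload):
    """Return the title of every hit in a search response dictionary."""
    hits = payload.get("hits", {}).get("hits", [])
    return [hit["_source"]["title"] for hit in hits]


# Trimmed copy of the response structure shown in the output.
sample = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_id": "1", "_source": {"title": "The Powerful Technology"}},
            {"_id": "2", "_source": {"title": "powerful empowering Innovations"}},
        ],
    },
    "query": "powerful",
}
print(extract_titles(sample))
```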
Manisha Kumari
2024-05-16