
A Spotlight on Migrating ElasticSearch to Atlas Search

 

13 Sep 2022 | Daniel Krogulec & Mateusz Zaremba

The Power of Search

Search is everywhere in modern software and is a big part of how we use applications. Imagine that you want suggestions for your next vacation: most likely, you’re going to type a phrase into a search bar and look for an interesting blog post that answers your questions. To provide this kind of functionality, developers usually have to leverage what’s called an ELK (ElasticSearch, Logstash and Kibana) stack in their application design. This is a common design approach to search, but it comes with drawbacks around the complexity of the solution and total cost of ownership (TCO).

Atlas Search is a powerful alternative that combines those three systems in one: a search engine, a database and a sync mechanism, providing powerful text search without adding complexity to the tech stack. It enables full-text search on top of MongoDB in the cloud, making data more valuable and easily discoverable, all in a fully managed cloud environment. This out-of-the-box search functionality allows developers to focus on delivering functionality and business value instead of managing infrastructure.

Through our partnership with MongoDB, gravity9 is seeing several clients engaging in migrations from ElasticSearch over to Atlas Search. This article outlines the pros and cons of each technology stack and our viewpoint on potential migration paths.

 

Atlas Search vs ElasticSearch

Elasticsearch is a distributed search and analytics engine built on top of Apache Lucene and developed by Elastic. It extends Lucene’s indexing and search functionality through RESTful APIs, and it distributes data across multiple servers using indices and shards.

Atlas Search is an Apache Lucene search engine embedded directly alongside your MongoDB database, so data is automatically synchronized between the two systems. This means that developers work with a single driver and API, there is no separate search system to run and pay for, and everything is fully managed for you in the Atlas platform.
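
For example, the same driver that performs ordinary reads and writes can also issue full-text queries through the aggregation pipeline. The following is a minimal sketch using pymongo; the connection string, database, collection and index names are hypothetical, and it assumes an Atlas Search index already exists on the collection.

    from pymongo import MongoClient

    # Hypothetical Atlas connection string and namespace
    client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")
    posts = client["app"]["posts"]

    # An ordinary write through the driver; Atlas Search syncs it into the index automatically
    posts.insert_one({"title": "Ten ideas for your next vacation"})

    # A full-text query through the same driver, using the $search aggregation stage
    for doc in posts.aggregate([
        {"$search": {"index": "default", "text": {"query": "vacation", "path": "title"}}},
        {"$limit": 5},
    ]):
        print(doc["title"])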

The differences in the structures of these two tools carry important implications with them.

 

Bolt-On vs Integrated Approach

ElasticSearch is an external search engine which needs to be introduced into your application infrastructure. Introducing this external tool mandates synchronizing data between the two systems, meaning engineering teams need to create a synchronization mechanism that replicates data from the database to the search engine. Once the synchronization mechanism has been deployed, it needs to be monitored and managed, adding more engineering overhead.  While users get the rich search experience they expect, this comes at the cost of the application stack becoming more complex and unwieldy.

Atlas Search by contrast is a built-in feature on the Atlas platform and does not require any additional infrastructure work to use. Data synchronization in Atlas Search is handled automatically and does not involve additional resources.

 

Functionality

As of today, ElasticSearch has more features than Atlas Search. Additional functionality currently available in ElasticSearch includes features for observability tasks such as application log gathering, alerting and monitoring; multi-index searches; and PDF indexing with the use of an additional plugin.

Common Use Cases for ElasticSearch and Migration Strategies

 

Full-Text Search Engine

In this scenario ElasticSearch is added on top of an existing database to enrich it with advanced full-text search capabilities such as stemming, stop words and autocompletion. Logstash is most commonly used as the synchronization mechanism between the database and ElasticSearch. This scenario is a good fit for migration to Atlas Search.
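
To illustrate, below is a sketch of what an Atlas Search index definition supporting these capabilities might look like. It is written here as a Python dictionary mirroring the JSON that would be supplied through the Atlas UI or Admin API; the collection fields ("title", "body") and the analyzer choice are assumptions for illustration only.

    # Hypothetical index definition for a "posts" collection.
    # "lucene.english" provides stemming and stop-word handling; the extra
    # "autocomplete" mapping on "title" enables type-ahead style queries.
    index_definition = {
        "mappings": {
            "dynamic": False,  # written as lowercase false in the Atlas UI JSON editor
            "fields": {
                "title": [
                    {"type": "string", "analyzer": "lucene.english"},
                    {"type": "autocomplete"},
                ],
                "body": {"type": "string", "analyzer": "lucene.english"},
            },
        }
    }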

 

Migration Plan

If the client is already using MongoDB as their data source, the migration process is very simple:

  1. Remove the existing Logstash pipeline configuration responsible for ingesting data into ElasticSearch.
  2. Create MongoDB Atlas Search indexes on the ingested data.
  3. Rewrite the parts of the application logic responsible for querying ElasticSearch to query MongoDB instead (see the sketch after this list).
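
As a sketch of step 3, the rewrite typically means replacing an ElasticSearch query with an equivalent $search aggregation issued through the MongoDB driver. The names below (index, collection, fields) are illustrative, and the exact translation depends on which search features are used.

    from pymongo import MongoClient

    # Hypothetical Atlas connection and collection
    posts = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")["app"]["posts"]

    # Before (for comparison): a match query issued with the ElasticSearch client, e.g.
    #   es.search(index="posts", query={"match": {"title": "vacation ideas"}})

    # After: the equivalent Atlas Search query through the regular MongoDB driver
    results = posts.aggregate([
        {"$search": {
            "index": "default",                                   # the index created in step 2
            "text": {"query": "vacation ideas", "path": "title"},
        }},
        {"$limit": 10},
        {"$project": {"title": 1, "score": {"$meta": "searchScore"}}},
    ])
    for doc in results:
        print(doc["score"], doc["title"])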

If the data source is not MongoDB and the client wants to use Atlas Search while maintaining the data in their current relational database, then one extra step is required compared to the process above:

  1. Copy the existing Logstash pipeline configuration responsible for ingesting data into ElasticSearch.
  2. Change the “output” pipeline stage to ingest that data into MongoDB, using the Logstash MongoDB output plugin (a sketch of this change follows the list).
  3. Create MongoDB Atlas Search indexes on the ingested data.
  4. Rewrite the parts of the application logic responsible for querying ElasticSearch to query MongoDB instead.
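
As a sketch of steps 1 and 2, the change is usually confined to the “output” stage of the copied Logstash pipeline. The snippet below assumes the logstash-output-mongodb plugin is installed; hosts, credentials and collection names are placeholders.

    # Before: the pipeline writes to ElasticSearch
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "posts"
      }
    }

    # After: the same pipeline writes to MongoDB Atlas instead
    output {
      mongodb {
        uri        => "mongodb+srv://user:password@cluster0.example.mongodb.net/app"
        database   => "app"
        collection => "posts"
      }
    }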

 

Public Data Analysis with the Help of Connectors

In this scenario data analysis is performed based on public data sources like Twitter.  This scenario is another good fit for migration to Atlas Search.

 

Migration Plan

If Logstash is used to fetch the data from the public API, we can take advantage of the fact that Logstash can fetch data from public APIs just as it can from internal data sources. Leveraging this, the migration process is very similar to the full-text search engine scenario above:

  1. Copy the existing Logstash pipeline configuration responsible for ingesting data into ElasticSearch.
  2. Change the “output” pipeline stage to ingest that data into MongoDB, using the Logstash MongoDB output plugin.
  3. Create MongoDB Atlas Search indexes on the ingested data.
  4. Rewrite the parts of the application logic responsible for querying ElasticSearch to query MongoDB instead.

If Logstash is not used to fetch the data, it is likely that there is an in-house solution being used to fetch the data from the public API. The steps for migration in this case would be:

  1. Modify the in-house software responsible for saving the data in ElasticSearch so that it saves it in MongoDB instead (see the sketch after this list). If the software is a third-party product, check whether it has a MongoDB integration that can be reconfigured; if not, implement a custom solution or simply use Logstash.
  2. Create MongoDB Atlas Search indexes on the ingested data.
  3. Rewrite the parts of the application logic responsible for querying ElasticSearch to query MongoDB instead.
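
For the first step, swapping the write call in an in-house ingester is often all that is needed on the ingestion side. A hedged Python sketch, with hypothetical names and document shape:

    from pymongo import MongoClient

    # Hypothetical Atlas connection and target collection
    tweets = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")["app"]["tweets"]

    def save_document(doc: dict) -> None:
        # Before (for comparison): the ingester indexed each fetched document into ElasticSearch, e.g.
        #   es.index(index="tweets", id=doc["id"], document=doc)

        # After: upsert the same document into MongoDB; Atlas Search indexes it automatically
        tweets.replace_one({"_id": doc["id"]}, doc, upsert=True)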

 

Observability Tasks (Log Analysis and Metrics Monitoring)

One of the primary uses of the ELK stack (ElasticSearch, Logstash, Kibana) is real-time log analysis and real-time metrics monitoring. Atlas Search is not currently designed for observability tasks, so this scenario is generally not a good fit for migration from ElasticSearch to Atlas Search.

It is possible to migrate these scenarios to Atlas Search, but it would be difficult and it may not be possible to achieve exactly the same functionality afterwards. Logging scenarios would require rewriting the application parts responsible for logging and introducing additional infrastructure components. To migrate existing logs, one could use the Mongoes tool.

There is no substitute for Kibana as of yet, so viewing ingested logs would be less convenient. To achieve similar results for metrics visualization, one would need to introduce a metrics exporter that pushes metrics data to Kafka, or writes metrics to a file that Logstash can export to MongoDB; data onboarded to MongoDB could then be visualized using MongoDB Charts. Unfortunately, there is no alternative for alerting.

In highly complicated logging and metric gathering cases, the amount of work to implement those flows using Atlas Search is likely to exceed the benefits, so we would recommend keeping the solution that uses ElasticSearch.

 

Pros and Cons of Atlas Search

Atlas Search is an evolving product and comes with some pros and cons. Here’s how gravity9 evaluates its current capabilities, starting with the positives:

1. Improved developer productivity

Implementing database and search features requires no additional infrastructure, and search queries use the same query API and MongoDB driver already used to interact with the database.

2. Simplified data architecture

Data synchronization is automatic for data and schema changes. You do not need to introduce additional sync components.

3. Maintainability

Atlas handles security, performance, reliability and multi-cloud deployment for you, so there is no separate search cluster to patch, scale or monitor.

4. GUI

MongoDB has mature GUI clients, both native (Compass) and third-party (Studio 3T, DataGrip, DBeaver).

Now let’s look at the negatives:

1. Atlas Search is not currently designed for observability tasks

One of the common use cases for full-text search engines is aggregating application logs and making them queryable. Atlas Search is not currently designed for the log analytics typically used in DevOps observability or security and threat-hunting applications. What is more, there are no alternatives for some of the more advanced features (such as alerting), which would require custom implementations.

2. No multi-index searches

Atlas Search does not support multi-index searches; only one index can be specified in a query.

3. Array indexing

Atlas Search does not index array fields that contain numeric, date or Boolean values.

 

Summary

Atlas Search is a great fit for projects looking for full-text search capabilities. It can be introduced with very little effort when MongoDB Atlas is already part of the application infrastructure, and migration from the ElasticSearch suite of tools is achievable in most cases.

Are you planning an Atlas Search implementation or a migration from ElasticSearch? We’d love to hear about your project and offer advice from our experience; get in touch at hello@gravity9.com.
