Cross-Cluster Search: The Future of Federated Search in Elasticsearch

Sure, you know and trust Elasticsearch – the Lucene-based, distributed, RESTful search and analytics engine – but have you heard about Elasticsearch 5.3.0? How about Cross-Cluster Search?

Elastic recently released Elasticsearch 5.3.0 – the latest version of the world’s most favored enterprise search engine. Their announcement suggests that the best just got better and signals the evolution of a widely-used system.

Version 5.3.0 introduces a slew of new features, the most important being the functionality known as “Cross-Cluster Search.” In a post on the Elastic blog, Luca Cavanna and Simon Willnauer break down the dynamics of Cross-Cluster Search, dubbing it the “future of federated search in Elasticsearch.”

Below, we examine Cross-Cluster Search and its inner workings and compare its abilities with those of its predecessor, the Tribe Node, to show the value and impact of Elastic’s update.

So, what is Cross-Cluster Search?

Consider Cross-Cluster Search the evolution of the Tribe Node. Elastic developed the Tribe Node to provide the ability to search across multiple clusters. Despite its unique merging capabilities, the Tribe Node has faced issues that proved difficult to remedy. Cross-Cluster Search was created to address many of the challenges that existed with its predecessor.

With Cross-Cluster Search, users are not limited to local indices when composing searches – they can search across clusters as well. What does that mean? It means a single search request can now run against data that lives in remote clusters.

What is the Tribe Node?

If you’re reading this post, you’re probably familiar with the Tribe Node. But as a refresher, and a crash course for the uninitiated, here are the key points you need to know about the Tribe Node:

  • The Tribe Node is a separate node
  • Its primary function is sniffing the cluster states of the remote clusters to merge them, which it does by linking all the remote clusters
  • This feature makes the Tribe Node a very special node, indeed – it does not belong to a cluster of its own. Instead, it latches onto various clusters

How Cross-Cluster Search Works

We thought you’d never ask. This latest version of Elasticsearch provides users with the ability to register remote clusters via the cluster update settings API under the search.remote namespace.

Each remote cluster is identified by a cluster alias and configured with a list of seed nodes. The alias is the name used to refer to the cluster, and the seed nodes are the starting points for discovering the other nodes that belong to it.
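To make the registration step concrete, here is a minimal Python sketch that builds the request body for the cluster update settings API (PUT _cluster/settings). The alias "cluster_one", the seed addresses, and the helper name remote_cluster_settings are made up for illustration; the sketch only constructs the JSON and leaves sending it to whichever HTTP client you already use.

```python
import json


def remote_cluster_settings(alias, seeds, persistent=True):
    """Build a cluster update settings body that registers one remote cluster.

    alias: the cluster alias (hypothetical example: "cluster_one").
    seeds: seed node addresses, assumed here to be host:transport_port strings.
    """
    if not alias:
        raise ValueError("a cluster alias is required")
    if not seeds:
        raise ValueError("at least one seed node is required")
    scope = "persistent" if persistent else "transient"
    # Remote clusters live under the search.remote namespace.
    return {scope: {"search": {"remote": {alias: {"seeds": list(seeds)}}}}}


# Illustrative values only; replace with your own alias and seed nodes.
body = remote_cluster_settings("cluster_one", ["10.0.0.1:9300", "10.0.0.2:9300"])
print(json.dumps(body, indent=2))
```

Because the settings are applied through the cluster settings API, the same helper can be reused to add or change a remote cluster on a running node.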

Problems Solved by Cross-Cluster Search

As mentioned above, Cross-Cluster Search has solved many of the problems that existed with the Tribe Node. Improvements that address previous issues include:

  • Any node can now act as a cross-cluster search client
  • Remote clusters are configured dynamically and can be updated without a restart
  • Connections to other clusters are lightweight and limited to 3 per cluster, whereas the Tribe Node connected to every node in every cluster
  • Cluster states no longer merge from remote clusters, removing much of the overhead

Simplicity

The Tribe Node supports many Elasticsearch APIs. This allows for actions such as retrieving the cluster state or nodes stats by way of the Tribe Node. When this happens, information gathered from all the remote clusters is returned and merged into a single view. It should be noted that merging information derived from various sources is a quick, uncomplicated task on the client side. Federated search on the server side, however, is a much more difficult endeavor.

The _search API

The _search API is noteworthy and versatile as it allows Elasticsearch to perform the following and more:

  • Searches
  • Queries
  • Aggregations
  • Suggestions

These actions are executed against multiple indices composed of one or more shards.
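As a rough illustration of a search that spans several indices, the Python sketch below builds a _search request path mixing a local index with a remote one. It assumes the alias:index naming that Cross-Cluster Search uses to address an index on a registered remote cluster; the index name "logs", the alias "cluster_one", and the helper names are hypothetical.

```python
def qualify_index(index, cluster_alias=None):
    """Return the index name, prefixed with the remote cluster alias if one is given."""
    return f"{cluster_alias}:{index}" if cluster_alias else index


def search_path(targets):
    """Build a _search path from (cluster_alias_or_None, index) pairs."""
    names = [qualify_index(index, alias) for alias, index in targets]
    return "/" + ",".join(names) + "/_search"


# One local index plus the same index name on a remote cluster (example values).
path = search_path([(None, "logs"), ("cluster_one", "logs")])

# An ordinary search body; nothing in it is specific to remote clusters.
query = {"query": {"match": {"message": "timeout"}}, "size": 10}

print(path)
```

The request body stays the same as for a local search; only the index part of the path changes.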

Other Noteworthy Items

The retrieval capacity of Elasticsearch is impressive, especially for users with large clusters, who can search across a very large number of shards. However, the current version of Elasticsearch does carry a soft limit, action.search.shard_count.limit. Expansive search requests that span more than 1,000 shards will be rejected because of this setting.
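The soft limit can be pictured with a small Python sketch. It mirrors the rule described above (requests spanning more than 1,000 shards are rejected) and builds a settings body that would raise the limit. Treat the second helper as an assumption: it presumes the setting can be changed through the cluster update settings API, and both helper names are hypothetical.

```python
SHARD_COUNT_LIMIT_SETTING = "action.search.shard_count.limit"
DEFAULT_SHARD_COUNT_LIMIT = 1000


def would_be_rejected(shard_count, limit=DEFAULT_SHARD_COUNT_LIMIT):
    """Return True when a search spanning shard_count shards exceeds the soft limit."""
    return shard_count > limit


def raise_limit_body(new_limit):
    """Build a settings body that raises the soft limit.

    Assumption: the setting is updatable via PUT _cluster/settings.
    """
    if new_limit < 1:
        raise ValueError("the limit must be a positive number")
    return {"transient": {SHARD_COUNT_LIMIT_SETTING: new_limit}}


print(would_be_rejected(1000))
print(would_be_rejected(1001))
print(raise_limit_body(2000))
```

This matters for Cross-Cluster Search in particular, since searching several clusters at once can push the shard count past the limit faster than a purely local search would.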

The Bottom Line

Cross-Cluster Search is the future of federated search. Elastic’s introduction of this feature within Elasticsearch 5.3.0 signals the eventual retirement of the Tribe Node. Users who adopt Cross-Cluster Search will notice significant improvements over the common issues associated with the Tribe Node, as outlined above. While the Tribe Node has proven a worthy tool, its inevitable replacement should be welcome news and cause for celebration among the Elasticsearch faithful.

Learn More

Weblink Technologies is your source for information on Elasticsearch. Contact us today to learn more about this new development, as well as how we can put our IT expertise and resources to work for you.

Contact us at sales@weblinktechs.com to learn more about how we can help you leverage Elastic products for high-performing, easy-to-maintain, and scalable search and analytics solutions.