学技术学英语: 数据结构 - Elasticsearch BKD tree

I had worked on Elasticsearch back in 2015, when it was more known for its text searching capabilities using inverted indexes. As I looked to pick it up again last year for another project, I saw that Elasticsearch had added core support for other data types from text like numbers, IP addresses, geospatial data types, etc.

As I looked to understand the main differences which could allow optimized search over such data types, I stumbled upon BKD trees. Surprisingly, there is not much written about BKD trees apart from a white paper and some blogs. The blog post will look to cover elements leading up to the development of BKD trees and its advantages starting from KD trees.

We will start with the BST (Binary Search Tree) which will be the base for our post. A BST is a binary tree which has lesser elements to its left and greater elements to its right for all nodes. The article will not contain more information regarding the insertion, deletion and searching of elements further since there are numerous sources out there.

BST or other similar implementations of BST like AVL trees, leverage the capability of dividing the search space by 2 at each node during the traversal, thus resulting in a O(logN) search in the best case scenario. It is possible to balance BSTs by rotating the tree with the pivot.

A major flaw or lack of ability with the BST is the ability to deal with multiple dimensions or spaces. For example, if we have a store of latitudes and longitudes, and we are asked to search for a specific set of latitude and longitude. It is easy for us to use BST to search for either the latitude or longitude but not both the elements together since BST is capable of handling only one dimension in its store.

What do we do if we have multiple dimensions or multiple metrics across which we need to run our search queries?

KD or K-Dimensional trees

Th

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值