1. Introduction
Elasticsearch is best known for its full-text search capabilities, but it also features full geospatial support.
We can find more about setting up Elasticsearch and getting started in this previous article.
Let’s look at how we can save geo data in Elasticsearch and how we can search that data using geo queries.
2. Geo Data Type
To enable geo-queries, we need to create the mapping of the index manually and explicitly set the field mapping.
Dynamic mapping does not detect geo types, so these fields must be mapped explicitly before any documents are indexed.
Elasticsearch offers two ways to represent geodata:
- Latitude-longitude pairs using geo-point field type
- Complex shape defined in GeoJSON using geo-shape field type
Let’s take a more in-depth look at each of the above categories:
2.1. Geo Point Data Type
The geo-point field type accepts latitude-longitude pairs that can be used to:
- Find points within a certain distance of a central point
- Find points within a box or a polygon
- Aggregate documents geographically or by distance from a central point
- Sort documents by distance
Below is a sample mapping for a field that stores geo-point data:
PUT /index_name
{
  "mappings": {
    "TYPE_NAME": {
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}
As we can see from the above example, the type of the location field is geo_point. Thus, we can now provide a latitude-longitude pair in the location field.
2.2. Geo Shape Data Type
Unlike geo-point, geo-shape provides the functionality to save and search complex shapes such as polygons and rectangles. The geo-shape data type must be used when we want to search documents that contain shapes other than geo points.
Let’s take a look at the mapping for the geo-shape data type:
PUT /index_name
{
  "mappings": {
    "TYPE_NAME": {
      "properties": {
        "location": {
          "type": "geo_shape",
          "tree": "quadtree",
          "precision": "1m"
        }
      }
    }
  }
}
The above mapping indexes the location field using the quadtree implementation with a precision of one meter.
Elasticsearch breaks the provided geo shape down into a series of geohash cells: small grid-like squares that together form a raster approximation of the shape.
Depending on our requirements, we can control how geo-shape fields are indexed. For example, when we’re searching documents for navigation, a precision of one meter becomes critical, since a coarser one may lead to an incorrect path.
If, on the other hand, we’re looking for sightseeing places, a precision of 10-50 meters can be acceptable.
One thing to keep in mind while indexing geo-shape data is that we’re always trading performance for accuracy. With higher precision, Elasticsearch generates more terms, which leads to increased memory usage. Hence, we need to be very cautious when selecting the mapping for a geo shape.
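The trade-off between precision and the number of generated terms can be made concrete with a rough sketch. It assumes that each quadtree level halves the width of a cell and that the root cell is as wide as the Earth's equatorial circumference; the class name, the constant, and the method are hypothetical and only approximate what Elasticsearch does internally.

```java
// Rough sketch, not Elasticsearch's actual code: counts how many quadtree
// levels are needed before a cell's width shrinks to the requested precision.
class QuadtreePrecisionSketch {

    // Assumption: the root cell is as wide as the equatorial circumference (meters).
    static final double EARTH_CIRCUMFERENCE_METERS = 40_075_017.0;

    /** Returns the number of halvings needed to reach the given precision. */
    static int levelsFor(double precisionMeters) {
        if (precisionMeters <= 0) {
            throw new IllegalArgumentException("precision must be positive");
        }
        int levels = 0;
        double cellWidth = EARTH_CIRCUMFERENCE_METERS;
        while (cellWidth > precisionMeters) {
            cellWidth /= 2; // each level splits a cell in half along each axis
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println("1m precision:  " + levelsFor(1) + " levels");
        System.out.println("50m precision: " + levelsFor(50) + " levels");
    }
}
```

Every extra level can multiply the number of cells needed to cover a shape, which is why a finer precision tends to cost more memory.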
We can find more mapping options for the geo-shape data type on the official Elasticsearch site.
3. Different Ways to Save Geo Point Data
3.1. Latitude Longitude Object
PUT index_name/index_type/1 { "location": { "lat": 23.02, "lon": 72.57 } }
Here, geo-point location is saved as an object with latitude and longitude as keys.
3.2. Latitude Longitude Pair
{ "location": "23.02,72.57" }
Here, location is expressed as a latitude-longitude pair in a plain string format. Please note the order in the string format: latitude comes first, then longitude.
3.3. Geo Hash
{ "location": "tsj4bys" }
We can also provide geo-point data in the form of a geohash, as shown in the example above. We can use an online tool to convert a latitude-longitude pair to a geohash.
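For readers who prefer to compute a geohash in code rather than with an online tool, here is a minimal sketch of the standard geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). The class and method names are hypothetical, and this is not the implementation Elasticsearch uses.

```java
// Minimal geohash encoder sketch; class and method names are illustrative.
class GeoHashSketch {

    // Standard geohash base-32 alphabet (no a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude-longitude pair into a geohash of the given length. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90;
        double lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean useLon = true; // bits alternate, starting with longitude
        int bits = 0;
        int bitCount = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { bits = (bits << 1) | 1; lonMin = mid; }
                else            { bits <<= 1;             lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { bits = (bits << 1) | 1; latMin = mid; }
                else            { bits <<= 1;             latMax = mid; }
            }
            useLon = !useLon;
            if (++bitCount == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(bits));
                bits = 0;
                bitCount = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(23.02, 72.57, 7));
    }
}
```

Longer hashes describe smaller cells, so the requested length plays the same role as the precision setting discussed earlier.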
3.4. Longitude Latitude Array
{ "location": [72.57, 23.02] }
When latitude and longitude are supplied as an array, the order is reversed to [lon, lat]. Initially, the lat-lon order was used for both the string and the array format, but the array order was later reversed to match the format used by GeoJSON.
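Because the string and array formats use opposite orderings, it is easy to swap the two values by accident. The sketch below normalizes both formats into a {lat, lon} pair; the class and its helper methods are hypothetical and exist only to illustrate the ordering rules described above.

```java
// Illustrative helpers (not part of any Elasticsearch API) that normalize
// the string and array geo-point formats into a {lat, lon} pair.
class GeoPointFormatsSketch {

    /** Parses the "lat,lon" string format. */
    static double[] fromString(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat,lon\" but got: " + value);
        }
        double lat = Double.parseDouble(parts[0].trim());
        double lon = Double.parseDouble(parts[1].trim());
        return new double[] {lat, lon};
    }

    /** Converts the GeoJSON-style [lon, lat] array format. */
    static double[] fromArray(double[] lonLat) {
        if (lonLat.length != 2) {
            throw new IllegalArgumentException("expected [lon, lat]");
        }
        return new double[] {lonLat[1], lonLat[0]}; // swap into {lat, lon}
    }

    public static void main(String[] args) {
        double[] a = fromString("23.02,72.57");
        double[] b = fromArray(new double[] {72.57, 23.02});
        System.out.println("same point: " + (a[0] == b[0] && a[1] == b[1]));
    }
}
```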
4. Different Ways to Save Geo Shape Data
4.1. Point
POST /index/type { "location" : { "type" : "point", "coordinates" : [72.57, 23.02] } }
Here, the geo shape type that we’re trying to insert is a point. Note that the location field holds a nested object consisting of the fields type and coordinates. These meta-fields help Elasticsearch identify the geo shape and its actual data.
4.2. LineString
POST /index/type { "location" : { "type" : "linestring", "coordinates" : [[77.57, 23.02], [77.59, 23.05]] } }
Here, we’re inserting a linestring geo shape. The coordinates for this linestring consist of two points, i.e., a start point and an end point. The LineString geo shape is very helpful for navigation use cases.
4.3. Polygon
POST /index/type { "location" : { "type" : "polygon", "coordinates" : [ [ [10.0, 0.0], [11.0, 0.0], [11.0, 1.0], [10.0, 1.0], [10.0, 0.0] ] ] } }
Here, we’re inserting a polygon geo shape. Looking at the coordinates in the above example, note that the first and last coordinates of a polygon must always match, i.e., the polygon must be closed.
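Before sending a polygon to Elasticsearch, we could check the closed-ring rule on the client side. The following is a minimal sketch under the assumption that a ring is given as an array of [lon, lat] pairs; the class and method are hypothetical. The four-position minimum follows the GeoJSON definition of a linear ring.

```java
// Hypothetical client-side check that a polygon ring is closed.
class PolygonRingSketch {

    /** Returns true if the ring has at least four positions and ends where it starts. */
    static boolean isClosed(double[][] ring) {
        if (ring.length < 4) {
            return false; // a GeoJSON linear ring needs four or more positions
        }
        double[] first = ring[0];
        double[] last = ring[ring.length - 1];
        return first[0] == last[0] && first[1] == last[1];
    }

    public static void main(String[] args) {
        // The ring from the polygon example above.
        double[][] ring = {
            {10.0, 0.0}, {11.0, 0.0}, {11.0, 1.0}, {10.0, 1.0}, {10.0, 0.0}
        };
        System.out.println("closed: " + isClosed(ring));
    }
}
```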
Elasticsearch supports other GeoJSON structures as well. The complete list of other supported formats is:
- MultiPoint
- MultiLineString
- MultiPolygon
- GeometryCollection
- Envelope
- Circle
We can find examples of the formats listed above on the official Elasticsearch site.
For all structures, the inner type and coordinates fields are mandatory. Also, sorting on and retrieving geo-shape fields is currently not possible in Elasticsearch due to their complex structure. Thus, the only way to retrieve geo fields is from the _source field.
5. Elasticsearch Geo Query
Now that we know how to insert documents containing geo shapes, let’s dive into fetching those records using geo-shape queries. Before we start using geo queries, we’ll need the following Maven dependencies to support the Java API for geo queries:
<dependency>
    <groupId>org.locationtech.spatial4j</groupId>
    <artifactId>spatial4j</artifactId>
    <version>0.7</version>
</dependency>
<dependency>
    <groupId>com.vividsolutions</groupId>
    <artifactId>jts</artifactId>
    <version>1.13</version>
    <exclusions>
        <exclusion>
            <groupId>xerces</groupId>
            <artifactId>xercesImpl</artifactId>
        </exclusion>
    </exclusions>
</dependency>
We can search for the above dependencies in the Maven Central repository as well.
Elasticsearch supports several types of geo queries:
5.1. Geo Shape Query
This requires the geo_shape mapping.
Similar to the geo_shape type, the geo_shape query uses the GeoJSON structure to query documents.
Below is a sample query to fetch all documents that fall within the given top-left and bottom-right coordinates:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_shape": {
          "region": {
            "shape": {
              "type": "envelope",
              "coordinates": [[75.00, 25.0], [80.1, 30.2]]
            },
            "relation": "within"
          }
        }
      }
    }
  }
}
Here, relation determines the spatial relation operator used at search time.
Below is the list of supported operators:
- INTERSECTS – (default) returns all documents whose geo_shape field intersects the query geometry
- DISJOINT – retrieves all documents whose geo_shape field has nothing in common with the query geometry
- WITHIN – gets all documents whose geo_shape field is within the query geometry
- CONTAINS – returns all documents whose geo_shape field contains the query geometry
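To make the four operators more tangible, here is a small sketch that evaluates them for axis-aligned envelopes only. It is a simplification under stated assumptions: real geo_shape queries handle arbitrary geometries, and the class below is hypothetical rather than part of the Elasticsearch API. Each method is called on the document's shape and takes the query geometry as its argument.

```java
// Illustrative only: spatial relation semantics for axis-aligned envelopes.
class EnvelopeRelationSketch {
    final double minLon, minLat, maxLon, maxLat;

    EnvelopeRelationSketch(double minLon, double minLat, double maxLon, double maxLat) {
        this.minLon = minLon;
        this.minLat = minLat;
        this.maxLon = maxLon;
        this.maxLat = maxLat;
    }

    /** INTERSECTS: the two envelopes share at least one point. */
    boolean intersects(EnvelopeRelationSketch query) {
        return minLon <= query.maxLon && query.minLon <= maxLon
            && minLat <= query.maxLat && query.minLat <= maxLat;
    }

    /** DISJOINT: the two envelopes have nothing in common. */
    boolean disjoint(EnvelopeRelationSketch query) {
        return !intersects(query);
    }

    /** WITHIN: this envelope lies entirely inside the query envelope. */
    boolean within(EnvelopeRelationSketch query) {
        return query.minLon <= minLon && maxLon <= query.maxLon
            && query.minLat <= minLat && maxLat <= query.maxLat;
    }

    /** CONTAINS: the query envelope lies entirely inside this envelope. */
    boolean contains(EnvelopeRelationSketch query) {
        return query.within(this);
    }

    public static void main(String[] args) {
        // Query envelope covering longitudes 75.0 to 80.1 and latitudes 25.0 to 30.2.
        EnvelopeRelationSketch query = new EnvelopeRelationSketch(75.0, 25.0, 80.1, 30.2);
        // A made-up document shape that sits inside the query envelope.
        EnvelopeRelationSketch doc = new EnvelopeRelationSketch(76.0, 26.0, 77.0, 27.0);
        System.out.println("intersects: " + doc.intersects(query));
        System.out.println("disjoint:   " + doc.disjoint(query));
        System.out.println("within:     " + doc.within(query));
        System.out.println("contains:   " + doc.contains(query));
    }
}
```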
Similarly, we can query using different GeoJSON shapes.
The Java code for the above query is:
QueryBuilders
  .geoShapeQuery(
    "region",
    ShapeBuilder.newEnvelope().topLeft(75.00, 25.0).bottomRight(80.1, 30.2))
  .relation(ShapeRelation.WITHIN);
5.2. Geo Bounding Box Query
The geo bounding box query fetches all documents whose point location falls within a given box. Below is a sample bounding box query:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_bounding_box": {
          "location": {
            "bottom_left": [28.3, 30.5],
            "top_right": [31.8, 32.12]
          }
        }
      }
    }
  }
}
Java code for above bounding box query is as below:
QueryBuilders
  .geoBoundingBoxQuery("location")
  .bottomLeft(28.3, 30.5)
  .topRight(31.8, 32.12);
The geo bounding box query supports the same formats as the geo_point data type. Sample queries for the supported formats can be found on the official site.
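Conceptually, a bounding box filter keeps a point when its latitude lies between the bottom and top edges and its longitude lies between the left and right edges. The sketch below illustrates that check, reading the sample corners as [lon, lat] arrays and assuming the box does not cross the antimeridian; the class and method are hypothetical.

```java
// Conceptual sketch of a bounding box check; not Elasticsearch code.
// Assumption: the box does not cross the antimeridian (left <= right).
class BoundingBoxSketch {

    /** Returns true if the point lies inside the box, edges included. */
    static boolean contains(double bottomLat, double leftLon,
                            double topLat, double rightLon,
                            double lat, double lon) {
        return lat >= bottomLat && lat <= topLat
            && lon >= leftLon && lon <= rightLon;
    }

    public static void main(String[] args) {
        // Corners from the sample query, read as [lon, lat] arrays.
        double leftLon = 28.3, bottomLat = 30.5;
        double rightLon = 31.8, topLat = 32.12;
        // A made-up point for illustration.
        System.out.println(contains(bottomLat, leftLon, topLat, rightLon, 31.0, 30.0));
    }
}
```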
5.3. Geo Distance Query
The geo distance query filters all documents that fall within the specified distance of a given point.
Here’s a sample geo_distance query:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_distance": {
          "distance": "10miles",
          "location": [31.131, 29.976]
        }
      }
    }
  }
}
And here’s the Java code for above query:
QueryBuilders
  .geoDistanceQuery("location")
  .point(29.976, 31.131)
  .distance(10, DistanceUnit.MILES);
Similar to geo_point, geo distance query also supports multiple formats for passing location coordinates. More details on supported formats can be found at the official site.
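To show what a distance filter has to decide for each document, here is a minimal sketch based on the haversine formula. It is an approximation under stated assumptions (a spherical Earth with a mean radius of 6,371 km), and the class, constants, and candidate point are hypothetical; Elasticsearch's own distance computation may differ.

```java
// Conceptual sketch of a distance filter using the haversine formula.
class GeoDistanceSketch {

    // Assumption: spherical Earth with a mean radius of 6,371 km.
    static final double EARTH_RADIUS_METERS = 6_371_000.0;
    static final double METERS_PER_MILE = 1609.344;

    /** Great-circle distance in meters between two lat-lon points. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    /** Returns true if the point lies within the given number of miles of the center. */
    static boolean withinMiles(double centerLat, double centerLon,
                               double lat, double lon, double miles) {
        return haversineMeters(centerLat, centerLon, lat, lon) <= miles * METERS_PER_MILE;
    }

    public static void main(String[] args) {
        // Center taken from the sample query; the candidate point is made up.
        System.out.println(withinMiles(29.976, 31.131, 30.0444, 31.2357, 10));
    }
}
```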
5.4. Geo Polygon Query
The geo polygon query filters all records whose points fall within the polygon defined by the given points.
Let’s have a quick look at a sample query:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_polygon": {
          "location": {
            "points": [
              {"lat": 22.733, "lon": 68.859},
              {"lat": 24.733, "lon": 68.859},
              {"lat": 23, "lon": 70.859}
            ]
          }
        }
      }
    }
  }
}
And here’s the Java code for this query:
QueryBuilders
  .geoPolygonQuery("location")
  .addPoint(22.733, 68.859)
  .addPoint(24.733, 68.859)
  .addPoint(23, 70.859);
The geo polygon query also supports the formats mentioned below:
- lon-lat as an array: [lon, lat]
- lat-lon as a string: “lat, lon”
- geo hash
The geo_point data type is mandatory in order to use this query.
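The polygon filter boils down to a point-in-polygon test. A common way to implement such a test is ray casting, sketched below on a flat lat-lon plane. This is an illustration under that planar assumption, with hypothetical class and method names, and not necessarily how Elasticsearch evaluates the query.

```java
// Ray-casting point-in-polygon sketch on a flat lat-lon plane.
class PointInPolygonSketch {

    /**
     * Returns true if the point is inside the polygon.
     * Each vertex is a {lat, lon} pair; the polygon is closed implicitly.
     */
    static boolean contains(double[][] polygon, double lat, double lon) {
        boolean inside = false;
        for (int i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
            double latI = polygon[i][0], lonI = polygon[i][1];
            double latJ = polygon[j][0], lonJ = polygon[j][1];
            // Does the edge (j -> i) straddle the point's latitude, and does it
            // cross that latitude at a longitude greater than the point's?
            boolean crosses = (latI > lat) != (latJ > lat)
                && lon < (lonJ - lonI) * (lat - latI) / (latJ - latI) + lonI;
            if (crosses) {
                inside = !inside; // an odd number of crossings means inside
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // The triangle from the sample query above.
        double[][] triangle = {
            {22.733, 68.859}, {24.733, 68.859}, {23, 70.859}
        };
        // Made-up test points for illustration.
        System.out.println(contains(triangle, 23.3, 69.5));
        System.out.println(contains(triangle, 23.0, 71.0));
    }
}
```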
6. Conclusion
In this article, we discussed the different mapping options for indexing geo data, i.e., geo_point and geo_shape.
We also went through the different ways to store geo data, and finally, we looked at geo queries and the Java API for filtering results with them.
As always, the code is available in this GitHub project.