apdev: Weekend with GWT, Google maps and K-D trees

It started from desire to remind Java language programming. How to program in Java with nice web output - obvious answer is GWT. One may say that it is obsolescent programming environment, but on the other hand it is quite mature and easy to fast-start examples. Of course client side Java is rather limited but you are always free to do normal Java programming on server side.
So, Eclipse + GWT + Chrome makes convenient programming environment for Java development with basic user web interface included (plus friendly debugging). But the biggest advantage is lack of Javascript - oh, how I hate this language, its horrible syntax, dynamic typing leading to "who knows what it is and where it is from".

Next, as I've entered Google world it is tempting and easy to use Google technologies. What I needed was access to maps.
The task is following - efficient (and quite simple - as for one weekend) K-D tree implementation with presentation on Google maps. But how to easily access GMaps from GWT. One might expect easy integration (as for one company solutions). Indeed there is library for GWT that allows to access GMap, the project is hosted hosted at gwt-google-apis. But GMaps library presented there is for ver.2 and deprecated. If we dig dipper we can find library gwt-maps-3.8.0-pre1.zip for GMapa ver.3 there, but it is in prerelease stage. It is there since Mar 30, 2012 - it shows best how Google carries about GWT technology and why I called it obsolescent.

Lets go back to the task - K-D tree. This old and quite simple algorithm allows for efficient lookup of closest point in n-dimensional space (see wikipedia). In case of map, space is 2-dimensional, but the problem/inconvenience appears - it is not planar (Cartesian) space, but latitude-longitude coordinates on sphere (or rather GPS/WGS 84 ellipsoid). Of course, to calculate distance between coordinate points simple spherical Earth model is considered in presented example (the error is negligible even for Google as they use spherical Mercator projection - read wikipedia).
Digression: interesting page on map projections is at kartoweb.itc.nl, there is also copy of book on the topic.

First step is to create K-D tree for subsequent efficient queries. Algorithm is described in details in wikipedia, therefore only simple Java-pseudo-but-working-code implementation is presented here:

private PointNode createKDTree(LatLngCommon[] points, int from, int to, int depth) {
if (from >= to) {
return null;
}

final int axis = depth % 2;
Arrays.sort(points, from, to, new Comparator<LatLngCommon>(){
@Override
public int compare(LatLngCommon ll1, LatLngCommon ll2) {
return (0==axis)
?(Double.compare(ll1.lat, ll2.lat))
:(Double.compare(ll1.lng, ll2.lng));
}
});
int med = (from + to) / 2;
PointNode res = new PointNode();
res.point = points[med];
res.left = createKDTree(points, from, med, depth+1);
res.right = createKDTree(points, med+1, to, depth+1);
return res;
}

In code above partition is along parallels for even and meridians for odd tree levels. Thanks to that partition is achieved by just comparing coordinates.

More difficult is second method - finding nearest point using lat-lng coords. The problem is calculation of distance between two points on sphere. One cannot use Euclidean distance as coords are not planar. Distance has to be calculated as great circle distance (shortest distance on sphere between two points).
Standard way of such distance calculation is using spherical law of cosines. But there are two issues here:

first - derivation of equation for lat-lng coords. From law of cosines we have:
cos(c) = cos(a) cos(b) + sin(a) sin(b) cos(C),
where 'a','b','c' are great circle distances and 'C' is angle between edges of length 'a' and 'b' (opposite to edge with length 'c').
It might be tempting to think of 'C' as angle between meridian and parallel, which is right angle and cos(C) would be equal to 0. In such case 'a' would be difference between latitudes and 'b' difference between longitudes, and equation would be as simple as
cos(c) = cos(a) cos(b),
of course everybody knows that equation is different - why? Because parallels are not great circle distances (except for the Equator).
Correct equation for law of cosines in lat-lng coords is following:
cos(c) = sin(lat1) sin(lat2) + cos(lat1) cos(lat2) cos(lng1 - lng2)
where 'c' is interesting distance between points of coords (lat1, lng1) and (lat2, lng2).
One can find graphical representation of equation in fig.3-15 in book 'Mapping Hacks' by Erle S., Gibson R., Walsh J..
second - numerical problems with cos/arccos calculation for small angles - standard way out of the problem is analytic transformation of equation to more numerically suitable called Haversine formula.

But do we really need to calculate distance? I mean distance value between points? Lets take a look at algorithm:

private LatLngCommon findNearest(PointNode root, LatLngCommon point, int depth) {

// end of recursion? if (root.left == null && root.right == null) { return root.point; }
// which partition in tree - lat or lng? final int axis = depth % 2; int cmp = (0==axis) ?(Double.compare(point.lat, root.point.lat)) :(Double.compare(point.lng, root.point.lng)); // deep with recursion to go to the bottom - find approx. best

LatLngCommon best; boolean left = false; if ((root.left != null) && (cmp < 0)) { best = findNearest(root.left, point, depth+1); left = true; } else if (root.right != null) {
best = findNearest(root.right, point, depth+1); }

// check if bottom best is better than actual point?

double best_dist = distance(point, best);

double new_dist = distance(point, root.point);

if (best_dist > new_dist) {

best = root.point;

best_dist = new_dist;

}

// there might be better on the other side of the fence

int otherside = (0==axis) ?(Double.compare(best_dist, distance_lat(point, root.point))) :(Double.compare(best_dist, distance_lng(point, root.point))); LatLngCommon best_otherside = null; if (otherside > 0) {

// get best from the other side - go to the bottom if (left) { if (root.right != null) { best_otherside = findNearest(root.right, point, depth+1); } } else { if (root.left != null) { best_otherside = findNearest(root.left, point, depth+1); } }

// if other side found and better - we have new best if (best_otherside != null) { if (best_dist > distance(point, best_otherside)) { best = best_otherside; } } }

return best;

}

Please take a look how distance is used. What we need to know is comparison result and not a distance value. Valuable is information if distance between one pair of points is larger/smaller than distance between other pair of points. Now, please note that when:

dist1 > dist2, then

cos(dist1) < cos(dist2),

assuming that dist1 and dist2 are reasonable, i.e. from -180 deg to 180 deg. As cos is even, we do not have to carry about absolute value of distance.
Therefore it is not needed for the algorithm to calculate arccos (or arcsin - for Haversine formula).

Can we do better?
Additional benefits can come from pre-calculating trigonometric functions values, to avoid calculation during nearest point search, i.e.:
cos(dist) = sin(lat1) sin(lat2) + cos(lat1) cos(lat2) cos(lng1 - lng2), then
cos(dist) = sin(lat1) sin(lat2) + cos(lat1) cos(lat2) [sin(lng1) sin(lng2) + cos(lng1) cos(lng2)].

Last optimization is passing as result best point together with distance (cos of distance). This avoids calculating distance (cos of distance) multiple times as result travels back from recursion.

Assuming, final code can look like:

private CosDistance findNearest(PointNode root, LatLngCommon point, int depth) { if (root.left == null && root.right == null) { return new CosDistance(point, root.point); } final int axis = depth % 2; int cmp = (0==axis) ?(Double.compare(point.lat, root.point.lat)) :(Double.compare(point.lng, root.point.lng)); CosDistance best; boolean left = false; if ((root.left != null) && (cmp < 0)) { best = findNearest(root.left, point, depth+1); left = true; } else if (root.right != null) {
best = findNearest(root.right, point, depth+1); }

CosDistance cos_new_dist = new CosDistance(point, root.point); if (best.cos_dist < cos_new_dist.cos_dist) { best = cos_new_dist; } int otherside = (0==axis) ?(Double.compare(best.cos_dist, cos_new_dist.sin_from_lat__sin_to_lat + cos_new_dist.cos_from_lat__cos_to_lat)) :(Double.compare(best.cos_dist, 1.0 + point.cos_lat*point.cos_lat*(cos_new_dist.sin_from_lng__sin_to_lng__cos_from_lng__cos_to_lng - 1.0))); if (otherside <= 0) { CosDistance best_otherside = null; if (left) { if (root.right != null) { best_otherside = findNearest(root.right, point, depth+1); } } else { if (root.left != null) { best_otherside = findNearest(root.left, point, depth+1); } } if (best_otherside != null) {

if (best.cos_dist < best_otherside.cos_dist) { best = best_otherside; } } }

return best; }

With following helper class used for result:

private class CosDistance {

public LatLngCommon to_point; public double sin_from_lat__sin_to_lat; public double cos_from_lat__cos_to_lat; public double sin_from_lng__sin_to_lng__cos_from_lng__cos_to_lng; public double cos_dist;

public CosDistance(LatLngCommon from, LatLngCommon to) { to_point = to; sin_from_lat__sin_to_lat = from.sin_lat*to.sin_lat; cos_from_lat__cos_to_lat = from.cos_lat*to.cos_lat; sin_from_lng__sin_to_lng__cos_from_lng__cos_to_lng = from.sin_lng*to.sin_lng + from.cos_lng*to.cos_lng; cos_dist = sin_from_lat__sin_to_lat + cos_from_lat__cos_to_lat*(sin_from_lng__sin_to_lng__cos_from_lng__cos_to_lng); } }

Other pages on the topic:
http://en.wikipedia.org/wiki/R-tree
http://www.drdobbs.com/a-template-for-the-nearest-neighbor-prob/184401449
http://www.drdobbs.com/cpp/accelerated-search-for-the-nearest-line/198900559
https://github.com/ubilabs/kd-tree-javascript

apdev

poniedziałek, 1 kwietnia 2013

Weekend with GWT, Google maps and K-D trees

Brak komentarzy:

Prześlij komentarz

Archiwum bloga

O mnie