Revision as of 10:17, 18 March 2013 by Mboutin


The County Problem

by Alec McGail, proud Member of the Math Squad.


 keyword: tutorial, inside county, closed curve, curve 

INTRODUCTION In this tutorial / exploration, I'll talk a little about math, and a little about programming, with the goal of showing you that thinking mathematically can really get you out of a pickle sometimes.

 Contents
- The Problem
- A First Try
- The Solution
- The Code
- Questions and Comments

The problem

The problem we want to solve here is as follows: we want to create a function in some computer language which takes an Indiana longitude and latitude value as input, and returns the county in which this point lies. For example, in Java we would want:

String getCounty( float lat, float lon ) {
  //Code here
}

//The f suffix makes these float literals, matching the float parameters above
System.out.println( getCounty( 39.162147f, -86.529045f ) );

to output "Monroe", because (39.162147, -86.529045) is in Monroe county.


A first try

At first glance, intuition tells you that this problem should be easy. After all, when we look at a county and some point in its vicinity, we can immediately tell whether the point is inside or outside of the county. But this mental action of discerning the "insideness", so to speak, of the point is something we must inspect. What exactly do we do to determine the "insideness" of a point? Imagine we are looking at a county and a point, and set out to determine the point's insideness. What is the mental method we use to do this? In the simplest situations, it's extremely easy. Consider, for example, the case of a rectangular county, a somewhat common example:

Hendricks county.png

We can immediately see that the capital is inside the county. But how do we see that? Well, when nagged about a thorough explanation, I'd say that if the capital were outside the county, I could take it and drag it as far away as I like without it ever intersecting the sides of the county. But if I drag the point in any direction from where it's drawn, I'll almost immediately hit a side. Thus the point is inside the county. In other words, if I draw a line from the point in any direction, and it hits a side of the county, then the point is inside the county. So, is it that simple? It certainly suffices for the simplest of counties. Imagine I have the array of points for some county I'm interested in, and it's extremely simple:

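//This creates a county which is a 10-sided polygon :)
float two_pi = (float) (2 * Math.PI);
float[][] myCounty = new float[10][2];

for( int i = 0; i < 10; i++ ) {
  //Divide by 10.0f, not 10: integer division would make every angle 0
  float angle = (i / 10.0f) * two_pi;
  //Math.cos and Math.sin return double, so cast back to float
  myCounty[i][0] = (float) Math.cos( angle );
  myCounty[i][1] = (float) Math.sin( angle );
}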

Then in our function, we could just look for some straight line coming out from the point we're interested in (commonly called a ray) which intersects the outline of the county. So our function would just involve looping through each segment in a county's outline and seeing if the ray intersects that segment. If such an intersecting ray does exist, the point is inside. If not, it's outside. This would be wonderfully simple if it were completely true, but what if the counties are more complicated? If you think for a bit, you can surely find a counterexample. To prove me wrong, you need to find a hypothetical county in which my function would return the wrong answer. One example is Vanderburgh county in Indiana:

Vanderburgh county.png
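The same failure shows up even with a square county. Below is a minimal sketch of one reading of the first try, assuming a single ray drawn in the positive x direction and a county stored as an array of {x, y} points. The class name, method names and the sample square are made up for illustration and are not part of the original code.

```java
// Hypothetical illustration: names and the sample square are not from the original data.
class NaiveRayTest {

  // True if the horizontal ray starting at (a, b) and heading in the
  // positive x direction crosses the segment from (x1, y1) to (x2, y2).
  static boolean rayHitsSegment(double a, double b,
                                double x1, double y1, double x2, double y2) {
    // The ray's height must lie strictly between the segment's endpoints.
    // This also rejects horizontal segments, so the division below is safe.
    if (b <= Math.min(y1, y2) || b >= Math.max(y1, y2)) {
      return false;
    }
    // x-coordinate where the segment's line reaches height b.
    double x = x1 + (b - y1) * (x2 - x1) / (y2 - y1);
    return x > a;
  }

  // The "first try": call the point inside as soon as the ray hits anything.
  static boolean naiveInside(double[][] county, double a, double b) {
    for (int i = 0; i < county.length; i++) {
      double[] p1 = county[i];
      double[] p2 = county[(i + 1) % county.length]; // last point wraps to the first
      if (rayHitsSegment(a, b, p1[0], p1[1], p2[0], p2[1])) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    double[][] square = { {0, 0}, {2, 0}, {2, 2}, {0, 2} };
    System.out.println(naiveInside(square, 1, 1));  // point inside the square
    System.out.println(naiveInside(square, -1, 1)); // point to the left, outside
  }
}
```

Both calls print true, even though (-1, 1) lies outside the square: its ray simply passes through the square on the way out.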


The solution

So my function fails, and it's clear that in more complicated situations it would give the wrong answer even more often. One could imagine, in a rather odd society, that the county boundaries could become very complicated:

Complicated jordan curve.png

Source

If you look closely, this is in fact a valid boundary for a county! If we put a point somewhere in the middle, our line test would almost certainly give the wrong answer. But the approach still provides useful information. Imagine we decide to walk along the ray to freedom. Whether we start inside or outside the county, if we walk far enough along the ray, we'll end up outside the county, and we know this.

As we walk along the ray, we may never cross a border at all. In that case we clearly did not start inside the county. Thus if we draw a ray out from a point and it never intersects the curve, the point must be outside it. On the other hand, if we cross the border exactly once on our way out, we must have started inside the county! So if we count exactly 1 intersection, we are positive the point is inside the county.

What if there are two intersections? We ended up outside the county and we crossed the border twice, so we must have gone into the county and then back out again. In fact, whenever we cross the border twice, our insideness afterwards is the same as our insideness before. Thus if we find any even number of intersections with the ray, our insideness at the beginning of the journey is the same as at the end, and so the point was outside the county. Likewise, if we find any odd number of intersections, our insideness at the beginning of the journey is the same as just before we cross the last border, and so the point was inside the county.

This result is what I used to solve the problem:

If the ray intersects the curve an odd number of times, the point is inside. If the number of intersections is even, the point is outside. Thus if we count the number of intersections the ray has with the county's border, we can still find the "insideness" of the point.
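To see the parity rule in isolation, here is a small sketch that skips the geometry entirely. It assumes we already know the x positions at which a horizontal line crosses the border, and only counts the crossings that lie ahead of the starting point. The class name and the crossing positions are invented for illustration.

```java
// Hypothetical illustration: the crossing positions below are made up.
class ParityRule {

  // crossings holds the x positions where a horizontal line meets the border;
  // start is the x position of our point on that line.
  static boolean isInside(double[] crossings, double start) {
    int count = 0;
    for (double x : crossings) {
      if (x > start) { // only crossings ahead of us on the ray matter
        count++;
      }
    }
    return count % 2 == 1; // an odd count means we started inside
  }

  public static void main(String[] args) {
    // The county covers the stretches 1 to 4 and 6 to 9 along this line.
    double[] border = {1.0, 4.0, 6.0, 9.0};
    System.out.println(isInside(border, 0.0)); // 4 crossings ahead
    System.out.println(isInside(border, 2.0)); // 3 crossings ahead
    System.out.println(isInside(border, 5.0)); // 2 crossings ahead
    System.out.println(isInside(border, 7.0)); // 1 crossing ahead
  }
}
```

The four calls print false, true, false, true, matching the even/odd counts noted in the comments.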


The code

Now what's the best way to implement this fact? We are given a point and want to determine which county in Indiana it belongs to. I have a file containing the boundary points of every county in the USA (available for free online, with some Googling). If we give each Indiana county an index, 0 - 91, then we can store these points in a large multidimensional array. Let's say I've already figured out how to do that, and the points are in a variable called county_pts. Not only that, but I've stored the names of these counties in an array with the same order, county_names, so that we can access them later. Just so we're clear about what we're talking about
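A small sketch of one possible layout for these two arrays, assuming each boundary point is stored as a pair of floats. The wrapper class, the two toy counties and their names are placeholders for illustration, not real boundary data.

```java
// Hypothetical layout: the coordinates and names below are placeholders, not real data.
class CountyData {

  // county_pts[c][k] is the k-th boundary point of county c, stored as {x, y}.
  static float[][][] county_pts = {
    { {0f, 0f}, {1f, 0f}, {1f, 1f}, {0f, 1f} }, // county 0: a unit square
    { {1f, 0f}, {3f, 0f}, {3f, 1f}, {1f, 1f} }  // county 1: a rectangle next to it
  };

  // county_names[c] is the name of county c, in the same order as county_pts.
  static String[] county_names = { "Toy County A", "Toy County B" };

  public static void main(String[] args) {
    for (int c = 0; c < county_names.length; c++) {
      System.out.println(county_names[c] + " has "
          + county_pts[c].length + " boundary points");
    }
  }
}
```

Keeping the two arrays in the same order is what lets an index found in one be used to look up the other.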

The first thing to note is that this problem reduces to the slightly less complicated task of creating a function to determine whether a point lies in a specific county. If we are able to create such a function, call it int inCounty( int county, float lat, float lon ), which returns 1 if the point is inside the county and 0 otherwise, then the function we want would just be:

//The parameter is named lon because long is a reserved word in Java
String getCounty( float lat, float lon ) {

  //Loop through the counties
  for( int i = 0; i < county_names.length; i++ ) {

    //For each county, check if the point given is in that county
    //(inCounty returns 1 for inside and 0 for outside)
    if( inCounty( i, lat, lon ) == 1 ) {

      //If it is, return the name of that county
      return county_names[i];

    }

  }

  //If no county is found, return "County not found."
  return "County not found.";
}

So we should be ready to go; we just have to write the code version of our theorem. We draw some ray from the point, and count the number of intersections with the sides of the county. Then we can say whether the point is inside or outside the county just by whether this number is even or odd. Now, because we are free to draw whatever ray we like, I decided to use the ray parallel to the x-axis, to make the calculations easier. Given the endpoints of some segment (which belongs to the boundary of the county), and the point we're interested in, we want some mathematical conditions to verify whether the ray we draw intersects the segment. Just to get an intuition of the situation, we're looking at a specific segment and some point, and checking whether a ray extended in the positive x direction from the point will hit the segment:

Points and segment.png

The segment is between points $ p_{i} = (x_{i}, y_{i}) $ and $ p_{i+1} = (x_{i+1}, y_{i+1}) $. Call your "point of interest" $ (a, b) $. Then, remembering our high school algebra, the line which goes through $ p_{i} $ and $ p_{i+1} $ is

$ f(x) = \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} (x - x_i) + y_i $

and furthermore, remember that the ray is parallel to the x-axis, so it's just the horizontal line which passes through the point $ (a, b) $:

$ g(x) = b $

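Equating the two and solving for $ x $ (assuming the segment is not horizontal, so that the slope is nonzero and we may divide by it), we have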

$ b = \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} (x - x_i) + y_i $

$ \frac{b - y_i}{ \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} } = x - x_i $

$ x = \frac{b - y_i}{ \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} } + x_i $

So now we know that if we extend the ray and the segment to lines, they will intersect at the point

$ \left( \frac{b - y_i}{ \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} } + x_i, b \right). $

Because we drew a 'ray' to the right of the point, if this intersection point is to the left, we know the ray didn't intersect the segment. In other words, we must have that

$ \frac{b - y_i}{ \frac{y_{i+1} - y_{i}}{x_{i+1} - x_{i}} } + x_i > a $

Also, the segment only spans the part of the line between $ p_i $ and $ p_{i+1} $, so if the y-coordinate of the intersection point, which is $ b $, does not lie strictly between $ y_i $ and $ y_{i+1} $, then we know the ray didn't intersect the segment. In other words, we must have that

$ b > \min( y_i, y_{i+1} ) $ and $ b < \max( y_i, y_{i+1} ) $

Otherwise, the ray and the segment don't intersect. So now, equipped with these criteria of intersection, we're ready to write the code!
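
Before turning to the function itself, here is a quick check of these criteria with made-up numbers (they are not taken from any real county). Take $ p_i = (1, 0) $, $ p_{i+1} = (3, 4) $ and the point $ (a, b) = (0, 2) $. Then

$ x = \frac{2 - 0}{ \frac{4 - 0}{3 - 1} } + 1 = \frac{2}{2} + 1 = 2 $

Since $ 2 > a = 0 $ and $ 0 < b = 2 < 4 $, both conditions hold and the ray hits the segment at $ (2, 2) $. Moving the point to $ (a, b) = (2.5, 2) $ leaves the intersection at $ x = 2 $, which now fails the condition $ x > a $, so that ray misses the segment.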

//The parameter is named lon because long is a reserved word in Java
int inCounty( int county, float lat, float lon ) {
  int count = 0;

  //The point of interest (a, b). This assumes each boundary point is stored as {lat, lon},
  //so that x is the latitude and y is the longitude.
  float a = lat;
  float b = lon;

  for( int i = 0; i < county_pts[county].length; i++ ) {
    float[] p1 = county_pts[county][i];

    //Here, we want to make sure that we check the final segment, the segment which goes from the last 
    //point in the array to the first point in the array.
    int next_point;
    if( i == county_pts[county].length - 1 )
      next_point = 0;
    else
      next_point = i+1;

    float[] p2 = county_pts[county][next_point];

    float x1 = p1[0];
    float x2 = p2[0];
    float y1 = p1[1];
    float y2 = p2[1];

    //Then we calculate the intersection point. For a horizontal segment the slope is 0,
    //but then min_y equals max_y and the test below rejects the segment anyway.
    float slope = ( y2 - y1 ) / ( x2 - x1 );

    float intersection_x = ( b - y1 ) / slope + x1;
    float intersection_y = b;

    float min_y = Math.min( y1, y2 );
    float max_y = Math.max( y1, y2 );

    //These are the conditions we derived above
    if( intersection_x > a && intersection_y > min_y && intersection_y < max_y )
      count++;
  }

  //In mathematical terms, this means "count mod 2", and returns 0 if count is even and 1 if count is odd
  //This works because inCounty should report "inside" (1) when the count is odd, and "outside" (0) when it is even.
  return count % 2;
}
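
The strict inequalities above leave one case open: when the ray passes exactly through a boundary point, neither neighboring segment is counted. A common variant, which is not part of the original function, treats each segment as half-open in y, so that a shared boundary point is counted for exactly one of its two segments. The sketch below is self-contained, uses double instead of float, and tests a made-up diamond-shaped county; the class and method names are hypothetical.

```java
// Hypothetical variant of inCounty; the class name and sample data are made up.
class HalfOpenRayTest {

  // Returns true if the point (a, b) is inside the closed polygon pts,
  // where pts[k] = {x, y} and the last point connects back to the first.
  static boolean inside(double[][] pts, double a, double b) {
    int count = 0;
    for (int i = 0; i < pts.length; i++) {
      double[] p1 = pts[i];
      double[] p2 = pts[(i + 1) % pts.length]; // wraps around to point 0
      double minY = Math.min(p1[1], p2[1]);
      double maxY = Math.max(p1[1], p2[1]);
      // Half-open interval: the lower endpoint counts, the upper one does not.
      // Horizontal segments have minY == maxY and are skipped automatically,
      // so the division below never divides by zero.
      if (b >= minY && b < maxY) {
        double x = p1[0] + (b - p1[1]) * (p2[0] - p1[0]) / (p2[1] - p1[1]);
        if (x > a) { // the crossing lies on the ray, to the right of the point
          count++;
        }
      }
    }
    return count % 2 == 1; // odd number of crossings means inside
  }

  public static void main(String[] args) {
    double[][] diamond = { {0, -1}, {1, 0}, {0, 1}, {-1, 0} };
    System.out.println(inside(diamond, 0, 0));   // ray passes through the vertex (1, 0)
    System.out.println(inside(diamond, 2, 0));   // to the right of the diamond
    System.out.println(inside(diamond, 0, 0.5)); // ordinary interior point
  }
}
```

This prints true, false, true. With the strict inequalities, the first call would count no crossings at all and report the center of the diamond as outside.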


Questions and comments

If you have any questions, comments, etc. please, please please post them below:

  • Comment / question 1
  • Comment / question 2
