Curse of Dimensionality
Edgar Mejia
 
MA 27101

Professor Hans Walther

December 6th, 2020
Contents

Introduction<br />
Machine Learning<br />
How to Combat the Curse of Dimensionality<br />
Conclusion<br />
References<br />
Introduction<br />
The curse of dimensionality sounds like a complex concept with potentially dangerous outcomes; however, the reality of this curse is far less harmful and much more interesting. Stated plainly, the curse of dimensionality is the observation that “as the number of features or dimensions grows, the amount of data we need to generalize accurately grows exponentially.”<sup>1</sup> In other words, as the number of variables increases, you need more data to make a prediction as accurate as it was before that variable was added. Perhaps the easiest way to understand this concept is with a visual example. Imagine only one variable (or dimension), with five points spread across that dimension as shown below.
[[File:1 Dimmension.jpg|thumbnail]]<sup>2</sup>

If your purpose is to find a pattern in these points, the answer is easy to find. Because there is only one dimension, a relatively small number of data points is enough to come to an accurate answer. The computation becomes harder when a second dimension is added, however. As seen below, once a second dimension is added, more data points are required to reach an answer with the same level of certainty.

[[File:2 dimmensions.jpg|thumbnail]]<sup>2</sup>
While in one dimension we only needed five data points, once a second dimension (height, in this case) is added, twenty-five data points are necessary to reach an equally accurate answer. Finally, let’s imagine we add a third dimension.

[[File:3d.jpg|thumbnail]]<sup>2</sup>
While not fully shown in the above graphic, we would need 125 data points to make an accurate guess about a potential pattern. Obviously, we cannot continue this visual example past three dimensions; however, it should be clear by now that the number of required data points is 5<sup>d</sup>, where d is the number of dimensions. Therefore, it follows that:

[[File:Graph for proof.jpg|thumbnail]]

As you can see in the graph, the number of necessary data points grows exponentially. While on the surface the curse of dimensionality may seem like little more than a cool party trick, the concept has serious implications in the realm of machine learning.<br />
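
As a quick check on this growth, the short sketch below simply computes 5<sup>d</sup> for the first few dimensions. The choice of five sample points per dimension is taken from the figures above, and Python is used here purely for illustration.

<pre>
# Number of data points needed if we keep five sample points per dimension,
# as in the figures above: the total grows exponentially with the dimension d.
for d in range(1, 11):
    print(f"{d} dimension(s): {5 ** d} data points")
</pre>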
Machine Learning<br />

Machine learning can be understood as teaching a computer to think and reason like a human being through data inputs and classification.<sup>3</sup> Machines learn in stages: at first, an engineer must label the data and tell the machine what it means; as more data is fed in, the machine learns what each data point means until it can classify new data points independently of the human. One of the most fundamental ideas of machine learning is “more data”: the machine will become more accurate and better at its job if it is given more data to understand and analyze. This idea leads many young engineers directly into the curse of dimensionality. For example, if the machine’s purpose were to determine whether a song is going to be a Top 40 hit, a good start would be to use two dimensions (length and tempo, for example) and divide a large selection of songs into “Top 40” and “Not Top 40” based on those two dimensions. Depending on the number of data points, the machine might be able to accurately predict the success of a new song. But there is more to a song than length and tempo, so the engineer might add several more dimensions for a more accurate guess. Without adding more data points, however, adding these dimensions just scatters the data so thinly that the machine can no longer make any accurate prediction.
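
To make the song example concrete, here is a minimal sketch of that two-feature starting point. The lengths, tempos, and “Top 40” labels below are entirely synthetic, and the scikit-learn pipeline is just one reasonable way to fit such a classifier.

<pre>
# Toy two-feature "Top 40" classifier. All data here is synthetic and the
# labelling rule is invented purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_songs = 200
length = rng.normal(210, 30, n_songs)   # song length in seconds (synthetic)
tempo = rng.normal(120, 15, n_songs)    # tempo in beats per minute (synthetic)
X = np.column_stack([length, tempo])
y = ((length < 215) & (tempo > 118)).astype(int)   # made-up "hit" rule

model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)
print(model.predict([[200.0, 125.0]]))  # classify a hypothetical new song
</pre>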

The solution seems simple enough: more data. But sometimes more data is impossible to obtain. If the phenomenon being studied is rare enough, there simply will not be enough data points to justify a certain number of dimensions. This is where the Hughes phenomenon comes in: it states that as the number of features increases, the machine’s accuracy increases as well until we reach an optimal number of features; past that point, adding more features decreases the machine’s accuracy. This puts engineers and people interested in machine learning in a difficult spot: how can the machine be made more accurate without adding more dimensions for it to analyze? Fortunately, there do seem to be a few answers to this question.<br />
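
The Hughes phenomenon can be simulated with made-up data. The sketch below keeps the number of samples fixed while piling on noise features; with these invented numbers, cross-validated accuracy typically rises at first and then falls off as more dimensions are added.

<pre>
# Illustration of the Hughes phenomenon on synthetic data: a fixed number of
# samples, three informative features, and progressively more pure-noise
# features. Accuracy generally peaks and then declines.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
n_samples = 60
informative = rng.normal(size=(n_samples, 3))
labels = (informative.sum(axis=1) > 0).astype(int)

for n_features in [1, 2, 3, 10, 50, 200]:
    noise = rng.normal(size=(n_samples, max(0, n_features - 3)))
    X = np.hstack([informative[:, :min(3, n_features)], noise])
    score = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, labels, cv=5).mean()
    print(f"{n_features:4d} features: mean accuracy {score:.2f}")
</pre>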
How to Combat the Curse of Dimensionality<br />

Perhaps the easiest way to combat the curse of dimensionality is simply to use fewer dimensions. While this is the goal of Principal Component Analysis (a technique beyond the scope of this discussion), the problem can also be avoided by measuring fewer dimensions from the outset. This might be more difficult than it sounds, because it requires the engineer to know which dimensions are important enough to measure and which to disregard. Many people reject this idea because human intuition often contradicts reality, and ignoring variables outright is not very scientific. However, limiting dimensions is still commonly used in the real world because it is far more practical. Fortunately, some data collection allows us to minimize dimensions by virtue of consistency. For example,
[[File:PCA 1.jpg|thumbnail]]<sup>4</sup>
In the top picture, we can see that there is little variance along the vertical axis while the horizontal axis has a significant amount of variance. Because of this, we can flatten the vertical dimension without losing much important information. This practice is similar to limiting dimensions as discussed above; the only difference is that the decision is driven by the collected data rather than by assuming in advance that a dimension isn’t relevant.
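
As a rough illustration of this flattening, the sketch below applies scikit-learn’s PCA to synthetic two-dimensional data whose spread is almost entirely horizontal, mirroring the picture above; the numbers are invented for the example.

<pre>
# Flattening a low-variance dimension with PCA on synthetic 2-D data:
# wide spread horizontally, very little spread vertically.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
horizontal = rng.normal(0.0, 10.0, 500)   # large variance
vertical = rng.normal(0.0, 0.5, 500)      # small variance
data = np.column_stack([horizontal, vertical])

pca = PCA(n_components=1).fit(data)
flattened = pca.transform(data)        # 500 points, now one-dimensional
print(pca.explained_variance_ratio_)   # close to 1: almost nothing is lost
</pre>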
Conclusion<br />

Although it was first described only in 1957 by mathematician Richard Bellman in his book “Dynamic Programming,”<sup>5</sup> the curse of dimensionality is proving to be a problem that threatens to hold back technological progress. Many academics of our time agree that the quickest way to an advanced civilization is through technological progress via machine learning. However, as of now, machines are tied to this curse, and humans must always be at the ready to guide them to higher and higher levels. If the new generation of engineers and computer scientists can find a cure for this curse, machine learning could advance without bounds and the technological world could see a new renaissance. The discussion I have provided above is only a fraction of the current discourse surrounding this mathematical phenomenon. For further reading and understanding, please see Vincent Spruyt’s paper on the topic, “About the Curse of Dimensionality,” written in 2014<sup>3</sup>.<br />
References<br />

1 Shetty, B. (2019). Curse of Dimensionality. Retrieved December 05, 2020, from https://builtin.com/data-science/curse-dimensionality<br />

2 Jessica, Lain & G. (2020, October 06). What is Curse of Dimensionality in Machine Learning? Retrieved December 05, 2020, from https://www.mygreatlearning.com/blog/understanding-curse-of-dimensionality/<br />

3 Spruyt, V. (2014, June 6). About the Curse of Dimensionality. Retrieved December 05, 2020, from https://www.datasciencecentral.com/profiles/blogs/about-the-curse-of-dimensionality<br />

4 Lipeles, A. (2019, September 10). The Curse of Dimensionality. Retrieved December 05, 2020, from https://towardsdatascience.com/the-curse-of-dimensionality-f07c66128fe1<br />

5 Spruyt, V. (2015, March 22). The Curse of Dimensionality in Classification. Retrieved December 05, 2020, from https://www.visiondummy.com/2014/04/curse-dimensionality-affect-classification/
