<center>
<font size="4">Questions and Comments for: '''[https://kiwi.ecn.purdue.edu/rhea/index.php/Least_Squares_Support_Vector_Machine_and_its_Applications_in_Solving_Linear_Regression_Problems Support Vector Machine and its Applications in Classification Problems]''' </font>

A [https://www.projectrhea.org/learning/slectures.php slecture] by Xing Liu

</center>
 
----

Let me know if you have any questions or comments

----
  
= Questions and Comments =

----
== Review ==

This slecture is reviewed by Tao Jiang.

This slecture is well written and organized with a clear structure. In addition, visualizing the decision boundary of the SVM makes it easy to see how an SVM works, for example the maximized margin and the influence of the kernel functions.
The following are some suggestions for improvement.

Some typos:
#In the Background of Linear Classification part, some wiki syntax does not render as a formula correctly.
#In the Support Vector Machine part, the formula of the discriminant function contains an extra ")".
#In Effect of Kernel Parameters on SVM, "kernel" is misspelled as "kernal".
#In addition, the formulas do not fit well within the text.
Support Vector Machine:

#"Support vector machines are an example of a linear two-class classifier." Actually, an SVM can also handle multi-class classification problems, often with high accuracy. There are two common approaches that allow an SVM to perform multi-class classification: the one-versus-the-rest approach and the one-versus-one approach. Related material can be found in section 7.1.3 (Multiclass SVMs) of Bishop's book ''Pattern Recognition and Machine Learning''.
#Supplement: most of the coefficients alpha turn out to be zero. When a coefficient is nonzero, the corresponding vector lies near the separating hyperplane or is misclassified. Such a vector is called a support vector, and only the support vectors influence the final separating hyperplane.
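The two points above can be illustrated with a short sketch. This is an assumption-laden example, not the slecture's own code: it uses scikit-learn, and the iris data set is only a stand-in for any multi-class problem. It shows (a) that only the support vectors carry nonzero dual coefficients, and (b) the one-versus-the-rest and one-versus-one wrappers that turn a binary SVM into a multi-class classifier.

```python
# Sketch (assumes scikit-learn; iris is a stand-in data set):
# (a) only support vectors get nonzero dual coefficients alpha_i,
# (b) one-versus-rest / one-versus-one make a binary SVM multi-class.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)          # 3 classes, 150 samples

# (a) A binary sub-problem: classes 0 and 1 only.
mask = y < 2
clf = SVC(kernel="linear", C=1.0).fit(X[mask], y[mask])
# dual_coef_ stores alpha_i * y_i for the support vectors only;
# every other training point has alpha_i = 0 and is simply dropped.
print(clf.support_vectors_.shape[0], "support vectors out of", int(mask.sum()))

# (b) Multi-class wrappers around the same binary SVM.
ovr = OneVsRestClassifier(SVC(kernel="linear", C=1.0)).fit(X, y)
ovo = OneVsOneClassifier(SVC(kernel="linear", C=1.0)).fit(X, y)
print(len(ovr.estimators_), "one-vs-rest classifiers")   # one per class
print(len(ovo.estimators_), "one-vs-one classifiers")    # one per pair of classes
```

For K classes, one-versus-the-rest trains K binary SVMs while one-versus-one trains K(K-1)/2; with K = 3 both happen to be three.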
Effect of Kernel Functions on SVM:

Regarding the misclassification rate of each kernel: what parameters did you choose? Specifically, what is the penalty value C for each model, and what is the value of gamma for the Gaussian kernel? These parameters significantly influence the classification accuracy. Suggestion: use cross-validation and grid search to choose proper parameters. AS MENTIONED IN THE LECTURE, IN THIS SECTION THE PARAMETERS ARE TUNED BY CROSS-VALIDATION....
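The suggested cross-validation plus grid search could look like the sketch below. It assumes scikit-learn, and the synthetic data set and the grid values for C and gamma are illustrative choices, not the ones used in the slecture.

```python
# Sketch: choosing C and gamma for a Gaussian (RBF) kernel SVM by
# cross-validated grid search, as suggested in the review.
# Data set and parameter grid are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2,
                           n_informative=2, n_redundant=0,
                           random_state=0)

grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100],
                "gamma": [0.01, 0.1, 1, 10]},
    cv=5,                      # 5-fold cross-validation for each setting
)
grid.fit(X, y)

print(grid.best_params_)       # the (C, gamma) pair with the best CV score
print(grid.best_score_)        # its mean cross-validated accuracy
```

Each of the 16 grid points is scored by 5-fold cross-validation, so the selected (C, gamma) reflects generalization rather than training fit.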
Effect of Kernel Parameters on SVM:

#It would be better if the author introduced how the data was simulated. For example, if the data is mostly linearly separable (with some outliers), then a linear kernel would be preferable even if its misclassification rate is higher than that of the Gaussian kernel. IN THE PREVIOUS SECTION, IT IS MENTIONED THAT THE DATA IS THE Ripley data set...
#Performance should be measured by prediction on held-out data, or at least by cross-validation, not by the misclassification rate on the training data set. IT IS BASED ON CROSS-VALIDATION USING THE GIVEN DATA SET... The more complex the model, the higher the accuracy it achieves on the training set, which is known as overfitting. Note that overfitting is not measured by how complex the model is, but by how much the model fits the noise in the data.
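The second point above can be demonstrated with a small sketch. It assumes scikit-learn and a synthetic noisy data set (not the Ripley data used in the slecture): a very large gamma lets the RBF-kernel SVM nearly memorize the training set, so its training error is close to zero while its cross-validated accuracy is much worse.

```python
# Sketch: why training error is a poor performance measure.
# A huge gamma makes the RBF kernel nearly an identity matrix, so the
# SVM memorizes the training labels (including the flipped, noisy ones)
# yet generalizes poorly. Data set and gamma values are assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2,
                           n_informative=2, n_redundant=0,
                           flip_y=0.1, random_state=0)  # ~10% label noise

for gamma in (0.1, 1000.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)
    train_acc = clf.score(X, y)                       # accuracy on the training set
    cv_acc = cross_val_score(clf, X, y, cv=5).mean()  # 5-fold CV accuracy
    print(f"gamma={gamma}: train={train_acc:.2f}, cv={cv_acc:.2f}")
```

At gamma = 1000 the training accuracy approaches 1.0 while the cross-validated accuracy collapses, which is exactly the gap between training error and the cross-validated measure the review calls for.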
  
----
Latest revision as of 18:07, 5 May 2014
