… Stoneley wave velocity from petrophysical logs using an intelligent committee machine in the Sarvak Formation, Abadan Plain”. Journal of Science, University of Tehran, vol. 35, no. 2, pp. 1-10, 1388 (2009).
[9] Kaboudian J, Rahmati M, Homayounpour M. “Using support vector machines in three pattern recognition problems”. First International Conference on Information and Knowledge Technology, Amirkabir University of Technology, Tehran, 1382 (2003). http://www.civilica.com/Paper-ICIKT01-ICIKT01_034.html
[10] Keshavarz M, Yazdi H. “A fast algorithm based on support vector machines for hyperspectral image classification using spatial correlation”. Iranian Journal of Electrical and Computer Engineering, vol. 3, no. 1, 1384 (2005).
[11] Rouhi M. “Face detection using support vector machines”. M.Sc. thesis, Faculty of Electrical Engineering, Yazd University, Shahrivar 1386 (September 2007).
[12] Kaboudian J, Moradi M. “A new fuzzy support vector machine with two-stage fuzzification”. 12th Iranian Conference on Electrical Engineering, Ferdowsi University of Mashhad, Mashhad, 1383 (2004). http://www.civilica.com/Paper-ICEE12-ICEE12_226.html

[13] Arabi M, Pouyanezhad P. “Prediction of pile settlement based on the undrained shear strength of soil using support vector machines”. 9th International Congress on Civil Engineering, Isfahan University of Technology, Ordibehesht 1391 (May 2012). http://www.civilica.com/Paper-ICCE09-ICCE09_395.html

[14] Farquad M.A.H, Bose I. “Preprocessing unbalanced data using support vector machine”. Decision Support Systems, vol. 53, pp. 226-233, 2012.

[15] Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F. “A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches”. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2011.

[16] Veropoulos K, Campbell C, Cristianini N. “Controlling the sensitivity of support vector machines”. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 55-60, 1999.

[17] Wu G, Chang E. “Adaptive feature-space conformal transformation for imbalanced-data learning”. In Proceedings of the 20th International Conference on Machine Learning, pp. 816-823, 2003.

[18] Akbani R, Kwek S, Japkowicz N. “Applying support vector machines to imbalanced datasets”. In Proceedings of the 15th European Conference on Machine Learning, pp. 39-50, 2004.

[19] Batuwita R, Palade V. “Efficient resampling methods for training support vector machines with imbalanced datasets”. In Proceedings of the International Joint Conference on Neural Networks, pp. 1-8, 2010.

[20] Batuwita R, Palade V. “Class Imbalance Learning Methods for Support Vector Machines”. In: He H, Ma Y (eds.), Imbalanced Learning: Foundations, Algorithms, and Applications. John Wiley & Sons, Inc., 2012.

[21] Fan W, Stolfo S, Zhang J, Chan P. “AdaCost: Misclassification cost-sensitive boosting”. In Proceedings of the 16th International Conference on Machine Learning, pp. 97-105, Morgan Kaufmann Publishers Inc., 1999.

[22] Joshi M, Kumar V, Agarwal C. “Evaluating boosting algorithms to classify rare classes: Comparison and improvements”. In Proceedings of the IEEE International Conference on Data Mining, pp. 257-264, IEEE Computer Society, 2001.

[23] Chawla N, Lazarevic A, Hall L, Bowyer K. “SMOTEBoost: Improving prediction of the minority class in boosting”. In Proceedings of the Principles of Knowledge Discovery in Databases, pp. 107-119, 2003.

[24] Raskutti B, Kowalczyk A. “Extreme re-balancing for SVMs: a case study”. SIGKDD Explorations Newsletter, vol. 6, pp. 60-69, June 2004.

[25] Kowalczyk A, Raskutti B. “One class SVM for yeast regulation prediction”. SIGKDD Explorations Newsletter, vol. 4, no. 2, pp. 99-100, 2002.

[26] Imam T, Ting K, Kamruzzaman J. “z-SVM: an SVM for improved classification of imbalanced data”. In Proceedings of the 19th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence, pp. 264-273, Springer-Verlag, 2006.

[27] Wu G, Chang E. “Class-boundary alignment for imbalanced dataset learning”. In Proceedings of the International Conference on Machine Learning: Workshop on Learning from Imbalanced Data Sets, pp. 49-56, 2003.

[28] Cristianini N, Kandola J, Elisseeff A, Shawe-Taylor J. “On kernel-target alignment”. In Advances in Neural Information Processing Systems 14, pp. 367-373, MIT Press, 2002.

[29] Yang C.Y, Yang J.S, Wang J.J. “Margin calibration in SVM class-imbalanced learning”. Neurocomputing, vol. 73, no. 1-3, pp. 397-411, 2009.

[30] Qin A, Suganthan P. “Kernel neural gas algorithms with application to cluster analysis”. In Proceedings of the 17th International Conference on Pattern Recognition, pp. 617-620, IEEE Computer Society, 2004.

[31] Yu X.P, Yu X.G. “Novel text classification based on k-nearest neighbor”. In Proceedings of the International Conference on Machine Learning and Cybernetics, pp. 3425-3430, 2007.

[32] Tashk A, Faez K. “Boosted Bayesian kernel classifier method for face detection”. In Proceedings of the Third International Conference on Natural Computation, pp. 533-537, IEEE Computer Society, 2007.

[33] Ertekin S, Huang J, Giles L. “Active learning for class imbalance problem”. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 823-824, ACM, 2007.

[34] Batuwita R, Palade V. “FSVM-CIL: fuzzy support vector machines for class imbalance learning”. IEEE Transactions on Fuzzy Systems, vol. 18, no. 3, pp. 558-571, 2010.

[35] Li P, Chan K, Fang W. “Hybrid kernel machine ensemble for imbalanced data sets”. In Proceedings of the 18th International Conference on Pattern Recognition, pp. 1108-1111, IEEE Computer Society, 2006.

[36] He H, Garcia E. “Learning from imbalanced data”. IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263-1284, Sep. 2009.

[37] Jian L, Xia Z, Liang X, Gao C. “Design of a multiple kernel learning algorithm for LS-SVM by convex programming”. Neural Networks, vol. 24, pp. 476-483, 2011.

[38] Chawla N. “Data Mining for Imbalanced Datasets: An Overview”. In: Maimon O, Rokach L (eds.), Data Mining and Knowledge Discovery Handbook, 2010.

[39] Sutton O. “Introduction to k Nearest Neighbour Classification and Condensed Nearest Neighbour Data Reduction”, 2012. http://www.math.le.ac.uk/people/ag153/homepage/KNN/OliverKNN_Talk.pdf

Abstract

Data preprocessing is one of the most important components of knowledge discovery, and achieving the desired results usually depends on it. Among the many preprocessing and classification techniques, support vector machines provide good results, but their standard formulations are not well suited to imbalanced data.
Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, learning from imbalanced data remains a significant problem: the imbalanced learning problem concerns the performance of learning algorithms in the presence of noise and severe class distribution skew. Because of the inherently complex characteristics of imbalanced data sets, learning from such data requires new methods and algorithms.
In this thesis we review different data preprocessing methods for imbalanced data and present an efficient algorithm for achieving better results in data classification.

Keywords: data preprocessing, imbalanced data sets, support vector machine
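
As a concrete illustration of the class-imbalance issue the abstract raises, the following Python sketch trains a cost-sensitive SVM (different error costs for the two classes, in the spirit of reference [16]) alongside a standard SVM on a synthetic skewed data set. This is a minimal sketch, not the thesis's algorithm; the data set, kernel, and parameters are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Synthetic two-class problem with roughly a 95:5 majority/minority ratio.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Baseline SVM: the same misclassification cost for both classes.
baseline = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)

# Cost-sensitive SVM: class_weight="balanced" scales the penalty for each
# class inversely to its frequency, so minority-class errors cost more.
weighted = SVC(kernel="rbf", C=1.0,
               class_weight="balanced").fit(X_train, y_train)

for name, model in [("baseline", baseline), ("cost-sensitive", weighted)]:
    print(name)
    print(confusion_matrix(y_test, model.predict(X_test)))

On data this skewed, the baseline tends to favor the majority class, while the weighted variant usually recovers much of the minority-class recall at some cost in overall accuracy.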

1 Algorithm level approaches
2 Data level
3 Cost-sensitive learning framework
4 Support Vector Machine
5 Hard
6 Knowledge Discovery and Data Mining (KDD)
7 Knowledge Acquisition
8 Machine Learning
9 Pattern Recognition
10 Backpropagation
11 Case-Based Reasoning
12 Overfit
13 Non-stationary
14 Dirty
15 Incomplete
16 Noisy
17 Inconsistent
18 Principal Component Analysis (PCA)
19 Singular Value Decomposition (SVD)
20 Accuracy
21 Speed
22 Robustness
23 Interpretability
24 Compactness
25 Generalization Problem
26 Confusion Matrix (see the formulas following this list)
27 True Positive
28 True Negative
29 False Negative
30 False Positive
31 Sensitivity
32 Specificity
33 Precision
34 Accuracy
35 Legendre
36 Gauss
37 Least Squares
38 Sum of Squared Residuals
39 Vladimir Vapnik
40 Supervised Learning
41 Classification
42 Regression
43 Quadratic Programming
44 High-dimensional Space
45 VC Dimension
46 Structural Risk Minimization
47 Empirical Risk Minimization
48 Indirect Decision Function
49 Direct Decision Function
50 Linearly Separable
51 Hyperplane Decision Function
52 Feature Vector
53 Label
54 Hyperplane
55 Empirical Risk Minimization
56 Expected Risk
57 Hypothesis
58 Hypothesis Space
59 Multi-Layer Perceptron
60 In Probability
61 Empirical Risk Minimization Principle
62 Structural Risk Minimization
63 Structural Risk Minimization Principle
64 Euclidean Norm
65 Bias
66 Margin
67 Optimal Margin Hyperplane (Maximal Margin Hyperplane)
68 Quadratic Programming
69 Karush-Kuhn-Tucker Conditions
70 Slack Variables
71 Soft Margin Hyperplane
72 Polynomial Kernels
73 Neural Network Kernels
74 Gaussian Kernels
75 Sub-optimal
76 Regularization Parameter
77 Class Boundary Region
78 Skew
79 False Negative Prediction
80 Minority Positive Class
81 The Imbalanced Support-Vector Ratio
82 External Imbalance Learning Method
83 Resampling
84 Random and Focused Undersampling
85 Random and Focused Oversampling
86 Undersampling
87 Synthetic Minority Oversampling Technique (SMOTE)
88 Class Boundary Region
89 Support Cluster Machine
90 Focused Resampling Method
91 Shrinking Technique
92 Convergence
93 Synthetic Minority Oversampling Technique (SMOTE)
94 Condensed Nearest Neighbor
95 Nearest Neighbor
96 Edited Nearest Neighbor
97 Neighborhood Cleaning Rule
98 Ensemble Learning Method
99 Sub
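
Footnotes 27 through 34 name the confusion-matrix entries and the evaluation measures derived from them; for convenience, the standard definitions (quoted here as background, not taken from the thesis text) are:

\[
\text{Sensitivity} = \frac{TP}{TP+FN}, \qquad \text{Specificity} = \frac{TN}{TN+FP},
\]
\[
\text{Precision} = \frac{TP}{TP+FP}, \qquad \text{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN},
\]

where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.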
