A Fisher Linear Discriminant Classification Approach Dealing With Single Positive Sample

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2008, No. 03
Page:
61-65
Research Field:
Publishing date:

Info

Title:
A Fisher Linear Discriminant Classification Approach Dealing With Single Positive Sample
Author(s):
Yin Junmei, Yang Ming
School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097, China
Keywords:
imbalanced data set; Fisher linear discriminant (FLD); over-sampling
PACS:
TP181
DOI:
-
Abstract:
An approach is proposed for dealing with imbalanced data sets that contain only one positive sample. After the K nearest neighbours (K-NN) of the single positive sample are found, synthetic samples are generated in turn, according to certain rules, on the line segments connecting the single positive sample to each of its nearest neighbours. The synthetic samples are then added to the original positive class, and a classifier is trained on the new data set with the weighted Fisher linear discriminant classification approach. In the experiments, eight data sets chosen from UCI are used for training. The results show that this approach can effectively improve the classification performance on the minority class.
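
The two steps named in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the "certain rules" for placing synthetic samples or the weighting scheme, so the sketch assumes SMOTE-style interpolation limited to the part of each segment nearest the positive sample, and class weights applied to the per-class scatter matrices. All function and parameter names (`oversample_single_positive`, `fit_weighted_fld`, `max_frac`, `w_pos`, `w_neg`, `reg`) are hypothetical.

```python
import numpy as np


def oversample_single_positive(x_pos, X_neg, k=5, per_neighbour=2,
                               max_frac=0.5, seed=0):
    """Create synthetic positives on segments from x_pos to its k nearest neighbours."""
    rng = np.random.default_rng(seed)
    x_pos = np.asarray(x_pos, dtype=float)
    X_neg = np.asarray(X_neg, dtype=float)
    # With a single positive sample, all of its neighbours are negative samples.
    dists = np.linalg.norm(X_neg - x_pos, axis=1)
    nn_idx = np.argsort(dists)[:k]
    synthetic = []
    for j in nn_idx:
        # Assumed rule (not from the paper): interpolate only within the
        # fraction [0, max_frac) of the segment closest to the positive sample.
        for t in rng.uniform(0.0, max_frac, size=per_neighbour):
            synthetic.append(x_pos + t * (X_neg[j] - x_pos))
    return np.array(synthetic)


def fit_weighted_fld(X_pos, X_neg, w_pos=1.0, w_neg=1.0, reg=1e-6):
    """Fit a Fisher linear discriminant with class-weighted within-class scatter."""
    X_pos = np.asarray(X_pos, dtype=float)
    X_neg = np.asarray(X_neg, dtype=float)
    m_pos, m_neg = X_pos.mean(axis=0), X_neg.mean(axis=0)
    S_pos = (X_pos - m_pos).T @ (X_pos - m_pos)
    S_neg = (X_neg - m_neg).T @ (X_neg - m_neg)
    # Assumed weighting (not from the paper): scale each class scatter matrix;
    # the small ridge term keeps the system solvable.
    S_w = w_pos * S_pos + w_neg * S_neg + reg * np.eye(X_pos.shape[1])
    w = np.linalg.solve(S_w, m_pos - m_neg)
    # Threshold halfway between the projected class means.
    threshold = 0.5 * (w @ m_pos + w @ m_neg)
    return w, threshold


def predict(X, w, threshold):
    """Return 1 for samples projected above the threshold, else 0."""
    return (np.asarray(X, dtype=float) @ w > threshold).astype(int)
```

In use, the output of `oversample_single_positive` would be stacked with the original positive sample to form the enlarged positive class before calling `fit_weighted_fld`. Restricting the interpolation fraction is one plausible way to keep synthetic points out of the negative region, since every neighbour of a lone positive sample belongs to the negative class.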

Memo

Memo:
-
Last Update: 2013-04-24