Data Mining with Weka (3.4: Decision trees)


Published: September 22, 2013

Data Mining with Weka: online course from the University of Waikato
Class 3 – Lesson 4: Decision trees

http://weka.waikato.ac.nz/

Slides (PDF):
http://goo.gl/1LRgAI

https://twitter.com/WekaMOOC
http://wekamooc.blogspot.co.nz/

Department of Computer Science
University of Waikato
New Zealand
http://cs.waikato.ac.nz/


  • Building a decision tree is a quest for purity.
  • Which attribute produces the purest nodes?
  • The goal is the smallest possible tree. Top-down tree induction picks the attribute to split on at each node using some kind of heuristic.
  • The most popular heuristic is based on information theory: entropy, measured in bits.
  • Information gain is the entropy of the class distribution before the split minus the entropy of the distribution after the split. (Entropy is the negated sum of p × log2(p) over the classes. Each probability p is at most 1, so its logarithm is zero or negative, and the minus sign makes the entropy non-negative.)
  • Among the attributes, the one that yields the most bits of information gain is chosen as the root node of the decision tree.
  • The root node gets one branch per value of that attribute, and the remaining attributes are considered for further splits down each branch.
  • A node is pure when its instances are already cleanly separated into a single class group, so it needs no further branching.
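
The entropy and information-gain steps above can be sketched in a few lines of Python. This is only an illustration, not Weka's implementation: the function names are made up for this example, and the class counts are illustrative numbers for a 14-instance dataset with two candidate attributes (labelled "outlook" and "windy" here as an assumption), not figures taken from the lesson notes.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    # Skip empty classes: p * log2(p) tends to 0 as p goes to 0.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Entropy before the split minus the weighted entropy after it."""
    total = sum(parent_counts)
    after = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - after


# Illustrative (assumed) class counts, written as [yes, no].
parent = [9, 5]
splits = {
    "outlook": [[2, 3], [4, 0], [3, 2]],  # one child node per attribute value
    "windy": [[6, 2], [3, 3]],
}

gains = {name: information_gain(parent, children)
         for name, children in splits.items()}

# The attribute with the largest gain would become the root node.
best = max(gains, key=gains.get)
```

With these counts, the split containing a pure child node ([4, 0], entropy 0) gains more bits than the other split, so `best` is `"outlook"`. A full tree learner would repeat the same choice recursively on each branch and stop at pure nodes.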

 
