Data Mining with Weka (5.1: The data mining process)


Published: 2013-10-06

Data Mining with Weka: online course from the University of Waikato
Class 5 – Lesson 1: The data mining process

http://weka.waikato.ac.nz/

Slides (PDF):
http://goo.gl/5DW24X

https://twitter.com/WekaMOOC
http://wekamooc.blogspot.co.nz/

Department of Computer Science
University of Waikato
New Zealand
http://cs.waikato.ac.nz/


 


  • Data Mining Process
    •  Ask a question
      – what do you want to know?
      – “tell me something cool about the data” is not enough!
    •  Gather data
      – there’s soooo much around …
      – … but … we need (expert?) classifications
      – more data beats a clever algorithm
    •  Clean the data
      – real data is very mucky
    •  Define new features
      – feature engineering—the key to data mining
    •  Deploy the result
      – technical implementation
      – convince your boss!
  • (Selected) filters for feature engineering
    • AddExpression (MathExpression)
      – Apply a mathematical expression to existing attributes to create a new one (or modify an existing one)
    •  Center (Normalize) (Standardize)
      – Transform numeric attributes to have zero mean (or to lie in a given numeric range) (or to have zero mean and unit variance)
    •  Discretize (also supervised discretization)
      – Discretize numeric attributes to have nominal values
    •  PrincipalComponents
      – Perform a principal components analysis/transformation of the data
    •  RemoveUseless
      – Remove attributes that do not vary at all, or vary too much
    •  TimeSeriesDelta, TimeSeriesTranslate
      – Replace attribute values with successive differences between this instance and the next (or with the values of a neighbouring instance)
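
To make some of the filter descriptions above concrete, here is a rough sketch in plain Python of what centering, normalizing, standardizing, discretizing and successive differencing do to a single numeric attribute. This is not Weka code and does not call Weka. The function names, the equal-width binning rule, the "bin0, bin1, ..." labels, the use of the population standard deviation, and the "next minus current" sign convention are all assumptions made for illustration; Weka's own filters may differ in these details.

```python
from statistics import mean, pstdev


def center(values):
    """Shift values so that their mean is zero (the idea behind Center)."""
    m = mean(values)
    return [v - m for v in values]


def normalize(values, lo=0.0, hi=1.0):
    """Rescale values linearly into the range [lo, hi] (the idea behind Normalize)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant attribute: there is no spread to rescale, so map everything to lo.
        return [lo for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]


def standardize(values):
    """Give values zero mean and unit variance (the idea behind Standardize)."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]


def discretize(values, bins=3):
    """Map numeric values to nominal labels using equal-width bins (assumed rule)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return ["bin0" for _ in values]
    width = (vmax - vmin) / bins
    labels = []
    for v in values:
        # The maximum value would land in bin number `bins`, so clamp it to the last bin.
        idx = min(int((v - vmin) / width), bins - 1)
        labels.append(f"bin{idx}")
    return labels


def successive_differences(values):
    """Difference between each value and the next one (assumed: next minus current).

    The last instance has no successor, so the result is one element shorter.
    """
    return [nxt - cur for cur, nxt in zip(values, values[1:])]


if __name__ == "__main__":
    attribute = [2.0, 4.0, 6.0, 8.0]  # made-up example values
    print("centered:    ", center(attribute))
    print("normalized:  ", normalize(attribute))
    print("standardized:", standardize(attribute))
    print("discretized: ", discretize(attribute, bins=3))
    print("differences: ", successive_differences(attribute))
```

Each function takes one attribute's values as a list and returns a new list, so the original data is left untouched. In Weka itself these transformations are applied as filters to a whole dataset rather than to one column at a time.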
