Category Archives: Data Mining

Review of a Special Lecture on R-Based Big Data Analysis (R – Lecture Note)

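* This post is a review of attending a special lecture on the book “Dance with R” by Professor Kwon Hyuk-je of Hansung University, published by 박영사 (Pakyoungsa). * I was not planning to write a post, so there are no photos. One day, as the weather was turning cold, a poster like the one pictured above went up on the office wall. “Dance with R”: a freebie.. no, a free-of-charge lecture. There is said to be a giveaway of copies signed by the author, too! How could I pass up such a good opportunity.… Read More »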

Apache Giraph – From installation to Example execution

In this article, I look into Apache Giraph and run the “SimpleShortestPathsComputation” example. First, here is the introduction from the Giraph site (http://giraph.apache.org/): Welcome to Apache Giraph! Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their… Read More »

Data Mining Algorithm List Tree

related material: 『Data Mining』, written by Ian H. Witten

Basic Data Mining Algorithm
├───Basic rule extraction: 1R
├───Statistical modeling: Simple Bayesian, Gaussian/normal distribution (numeric)
├───Divide-and-conquer method: decision trees
├───Association rule mining
├───Linear models
│   ├Numeric prediction: linear regression
│   ├Linear classification: logistic regression
│… Read More »

Terms – C4.5 (Decision Tree Induction System)

related material: 『Data Mining』, written by Ian H. Witten. Category: Divide-and-conquer technique: Decision Tree. C4.5 is a divide-and-conquer algorithm for decision trees, the so-called “top-down induction of decision trees” method. It was developed and improved by J. Ross Quinlan (wiki). The scheme based on information gain is essentially identical to the ID3 scheme. The scheme which uses the gain ratio is… Read More »

Terms-Information gain ratio

related material: 『Data Mining』, written by Ian H. Witten. Category: Divide-and-conquer technique: Decision Tree. Information Gain Ratio: In a decision tree, the property (attribute) on which child nodes are created is the one with the largest information gain. But information gain tends to prefer the property from which more properties can be… Read More »
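To make the bias concrete, here is a minimal Python sketch of information gain and gain ratio. The function names `info` and `gain_and_ratio` are hypothetical, and the class counts (9 yes / 5 no, split by an "outlook" attribute and by a unique ID code per instance) are illustrative assumptions modeled on the familiar weather data; they are not taken from the excerpt itself.

```python
from math import log2

def info(counts):
    """Entropy in bits of a list of class counts, e.g. [9, 5]."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

def gain_and_ratio(parent, children):
    """Return (information gain, gain ratio) for one candidate split.

    parent   -- class counts before the split, e.g. [9, 5]
    children -- class counts in each child node, e.g. [[2, 3], [4, 0], [3, 2]]
    """
    total = sum(parent)
    # Weighted average information of the child nodes.
    remainder = sum(sum(ch) / total * info(ch) for ch in children)
    gain = info(parent) - remainder
    # Split information depends only on the child sizes, not on the classes.
    split_info = info([sum(ch) for ch in children])
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

# Illustrative counts (assumption): 9 "yes" and 5 "no" instances.
parent = [9, 5]
outlook = [[2, 3], [4, 0], [3, 2]]       # three-way split
id_code = [[1, 0]] * 9 + [[0, 1]] * 5    # one child per instance

for name, split in [("outlook", outlook), ("id_code", id_code)]:
    gain, ratio = gain_and_ratio(parent, split)
    print(f"{name}: gain={gain:.3f} ratio={ratio:.3f}")
```

The ID-code split gets the highest possible gain because every child is pure, even though it is useless for prediction. Dividing by the split information shrinks that advantage, which is the motivation for the gain ratio.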

Terms – information, bit

related material: 『Data Mining』, written by Ian H. Witten. Category: Divide-and-conquer technique. Information calculation (decision tree) rules: If only yes or only no occurs (one of the two counts is 0), the information value is 0. If yes and no occur in equal numbers, the information value is at its maximum. Information follows the multistage property. For instance, in the case of info([2,3,4]), a data item might belong to… Read More »
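The rules above can be checked numerically. The following is a small Python sketch, assuming information is measured as entropy in bits; the helper name `info` mirrors the notation in the excerpt but is otherwise hypothetical. The multistage check uses the standard decomposition info([2,3,4]) = info([2,7]) + (7/9) × info([3,4]).

```python
from math import log2

def info(counts):
    """Information value (entropy in bits) of a list of class counts."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

# Only one class present: the information value is 0.
pure = info([4, 0])

# Equal numbers of yes and no: the maximum for two classes, 1 bit.
balanced = info([3, 3])

# Multistage property: decide "first class vs. the rest", then split the
# rest, weighting the second stage by the fraction of instances reaching it.
one_stage = info([2, 3, 4])
two_stage = info([2, 7]) + (7 / 9) * info([3, 4])

print(pure, balanced)
print(abs(one_stage - two_stage) < 1e-12)
```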

[Data Mining] Basics Summary – Supervised & Unsupervised

* Supervised & Unsupervised Learning: the definitions below are cited from Wikipedia: –> Machine Learning. Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a… Read More »

[Data Mining] Basics Summary – Data Types, Analysis Methods

Data Types, Analysis Methods * Data Mining > Explaining the Past * Data Mining > Explaining the Past > Data Exploration > Univariate Analysis. Categorical Variables: A categorical or discrete variable is one that has two or more categories (values). Types: Nominal: no intrinsic ordering to its categories (e.g., gender: male/female). Ordinal: Variables those… Read More »

More Data Mining with Weka (1.6: Working with big data)

Published: April 27, 2014. More Data Mining with Weka: online course from the University of Waikato. Class 1 – Lesson 6: Working with big data. http://weka.waikato.ac.nz/ Slides (PDF): http://goo.gl/Le602g https://twitter.com/WekaMOOC http://wekamooc.blogspot.co.nz/ Department of Computer Science, University of Waikato, New Zealand http://cs.waikato.ac.nz/ Category: Education. License: Creative Commons Attribution license (reuse allowed). Remix this video. Source videos… Read More »

More Data Mining with Weka (1.5: The Command Line interface)

Published: April 27, 2014. More Data Mining with Weka: online course from the University of Waikato. Class 1 – Lesson 5: The Command Line interface. http://weka.waikato.ac.nz/ Slides (PDF): http://goo.gl/Le602g https://twitter.com/WekaMOOC http://wekamooc.blogspot.co.nz/ Department of Computer Science, University of Waikato, New Zealand http://cs.waikato.ac.nz/ Category: Education. License: Creative Commons Attribution license (reuse allowed). Remix this video. Source videos… Read More »
