Mapper - reads the input line by line and emits (key, value) pairs.
Reducer - receives all values grouped under one key and combines them into a result for that key.
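A minimal single-process sketch of the mapper/reducer split, using word count as the example. The names mapper, reducer and run_mapreduce are made up for illustration, and the in-memory grouping only stands in for the shuffle step a real framework would do across machines.

```python
from collections import defaultdict


def mapper(line):
    # Emit one (word, 1) pair per word in the line.
    for word in line.split():
        yield word.lower(), 1


def reducer(key, values):
    # Combine every value seen for one key into a single result.
    return key, sum(values)


def run_mapreduce(lines):
    # Stand-in for the shuffle step: group mapper output by key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    # Call the reducer once per key with all of that key's values.
    return dict(reducer(key, values) for key, values in groups.items())


counts = run_mapreduce(["a b a", "b c"])
```

Here counts maps each word to the number of times it appears across all lines.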
Frequent Itemsets, Association Rules
See Myndpage.
Near Neighbor Search in High Dimensions
Similarity Search
Locality Sensitive Hashing
Finding similar items. TODO: brush up on AND-OR constructions for hash function families
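For the AND-OR brush-up, a small sketch of how the two constructions change collision probability. Assumption: a single hash function from the family makes two items collide with probability p. The AND of r functions then collides with probability p^r, and the OR of b functions with probability 1 - (1 - p)^b. The function names are illustrative only.

```python
def and_construction(p, r):
    # All r independent hash functions must agree.
    return p ** r


def or_construction(p, b):
    # At least one of b independent hash functions must agree.
    return 1 - (1 - p) ** b


def and_then_or(p, r, b):
    # r-way AND inside each band, then b-way OR across bands.
    return or_construction(and_construction(p, r), b)


# With r = 4, b = 4: similar pairs are pushed up, dissimilar pairs pushed down.
high = and_then_or(0.8, 4, 4)
low = and_then_or(0.2, 4, 4)
```

The AND step lowers every probability, the OR step raises every probability, and chaining them steepens the S-curve so that high is pushed above 0.8 while low is pushed below 0.2.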
Dimensionality Reduction: SVD & CUR
Done.
Recommendation Systems
Done
Clustering
Done
Link Analysis: Personalized Page Rank, Hubs and Authorities
Done
Web Spam and TrustRank
TODO? ish
Proximity Search on Graphs: Random Walks with Restarts
TODO
k-nearest neighbor, perceptron
Done (perceptron is for linearly separable data)
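A minimal perceptron sketch to go with the note on linear separability. Assumptions: labels are +1/-1, the learning rate is 1, and the toy data set (logical AND) is chosen because it is linearly separable. On data that is not separable the loop would just run until the epoch limit.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(points, labels, epochs=100):
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            # A point is misclassified (or on the boundary) when y * activation <= 0.
            if y * (dot(w, x) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            # A full pass with no updates: the data is separated.
            break
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b > 0 else -1


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = train_perceptron(points, labels)
```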
Large Scale Supervised Machine Learning
Classification & Regression Trees
Description
Support Vector Machines
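Slack penalty C - the higher the value, the more heavily margin violations are penalized, so the algorithm tries harder to classify every training point correctly (narrower margin, more risk of overfitting). The smaller the value, the less it penalizes violations, so it tolerates more misclassified points in exchange for a wider margin.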
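A small sketch of how C enters the soft-margin objective, assuming the usual hinge-loss form 0.5 * ||w||^2 + C * sum(max(0, 1 - y * (w . x + b))). The function name and the toy points are made up for illustration. It only evaluates the objective for a fixed w and b; it does not train anything.

```python
def svm_objective(w, b, points, labels, C):
    # Margin term: small ||w|| means a wide margin.
    margin_term = 0.5 * sum(wi * wi for wi in w)
    # Slack term: hinge loss of every point that violates the margin.
    slack = sum(
        max(0.0, 1 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    )
    return margin_term + C * slack


w, b = [1.0, 0.0], 0.0
points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, -1, -1]  # the third point sits on the wrong side

loose = svm_objective(w, b, points, labels, C=1)
strict = svm_objective(w, b, points, labels, C=10)
```

The same single violation costs ten times more under C=10, which is why a large C pushes the optimizer toward fitting every point.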
Mining Data Streams
Done.
Web Advertising
TODO
TODO
LSH families
Advertising
Analysis of Massive Graphs (analysis of social networks)
Review of recsys and advantages/disadvantages of each