
Solving large-scale similarity problems in Big Data


Handling large-scale similarity problems
Examples of problems
Similarity between two texts
Similarity between two people
Similarity between two shopping carts
How to measure
http://en.wikipedia.org/wiki/Similarity_measure
Jaccard Coefficient
http://en.wikipedia.org/wiki/Jaccard_index
Cosine Similarity
http://en.wikipedia.org/wiki/Cosine_similarity
Inner product of two vectors
Radial basis function kernel
http://en.wikipedia.org/wiki/Radial_basis_function_kernel
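A minimal Python sketch of the four measures listed above, using only the standard library. The function names, the `sigma` parameter, and the handling of empty sets and zero vectors are assumptions made for illustration; they do not come from any particular library, and nothing here addresses the scaling problem itself.

```python
import math


def jaccard(a, b):
    """Jaccard coefficient of two collections treated as sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def inner_product(x, y):
    """Inner (dot) product of two equal-length numeric vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def cosine_similarity(x, y):
    """Cosine of the angle between x and y: <x, y> / (|x| * |y|)."""
    norm = math.sqrt(inner_product(x, x)) * math.sqrt(inner_product(y, y))
    if norm == 0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return inner_product(x, y) / norm


def rbf_kernel(x, y, sigma=1.0):
    """Radial basis function kernel: exp(-|x - y|^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


# Example: two shopping carts sharing two of four distinct items.
cart_a = {"milk", "bread", "eggs"}
cart_b = {"milk", "bread", "butter"}
print(jaccard(cart_a, cart_b))  # 0.5
```

Jaccard suits set-like data such as shopping carts, while the cosine, inner product and RBF measures expect numeric vectors, for example word counts of two texts.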
Other real-world examples
A bank using the data customers provide when opening an account
Links

http://asterix.ics.uci.edu/fuzzyjoin/
http://www.slideshare.net/Hadoop_Summit/similarity-at-scale-35988496