To learn Spark, Spark SQL, and GraphX with Scala by building:
• Movie recommender system for the MovieLens 100K dataset
• Item-based movie recommender system hosted on a 3-machine AWS EMR cluster, computing recommendations over the MovieLens 1M dataset using cosine distance
• Popular movies list for the MovieLens 100K dataset
• Marvel superhero friend connections for 20,000 characters, using the breadth-first search algorithm
• Spark GraphX to find the degrees of separation between superheroes and identify the most famous Marvel superhero
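The item-based recommender in the list above relies on a cosine measure between pairs of movies. A minimal sketch of that scoring step follows, in plain Scala without Spark so it runs on its own. The object name `CosineSketch`, the `RatingPair` type, and the choice to return the co-rating count alongside the score are assumptions made for illustration, not the project's actual code. It computes cosine similarity; cosine distance is one minus that value.

```scala
// Hypothetical sketch: cosine similarity between two movies, computed
// over the ratings given by users who rated both of them.
object CosineSketch {
  // One pair per user: (rating for movie A, rating for movie B).
  type RatingPair = (Double, Double)

  // Returns (similarity score, number of users who rated both movies).
  def cosineSimilarity(pairs: Seq[RatingPair]): (Double, Int) = {
    val sumXX = pairs.map { case (x, _) => x * x }.sum
    val sumYY = pairs.map { case (_, y) => y * y }.sum
    val sumXY = pairs.map { case (x, y) => x * y }.sum
    val denominator = math.sqrt(sumXX) * math.sqrt(sumYY)
    // Guard against division by zero when there are no usable ratings.
    val score = if (denominator == 0.0) 0.0 else sumXY / denominator
    (score, pairs.size)
  }
}
```

In a Spark job, a function like this would typically be applied to the grouped rating pairs of each movie pair, and the co-rating count can be used to filter out scores backed by too few users.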
Technologies used: Scala, Spark, Spark SQL, GraphX, Amazon EMR and EC2, sbt
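The degrees-of-separation idea from the superhero projects can be sketched with a breadth-first search over an in-memory adjacency list. This is plain Scala rather than a Spark or GraphX job, and the names `BfsSketch`, `Graph`, and `degreesOfSeparation` are hypothetical, chosen only for this illustration.

```scala
import scala.collection.mutable

// Hypothetical sketch: breadth-first search over a small character graph.
object BfsSketch {
  // Adjacency list: character id -> ids of characters it is connected to.
  type Graph = Map[Int, Seq[Int]]

  // Returns the number of hops from start to target, or None if unreachable.
  def degreesOfSeparation(graph: Graph, start: Int, target: Int): Option[Int] = {
    val distance = mutable.Map(start -> 0) // also serves as the visited set
    val queue = mutable.Queue(start)
    while (queue.nonEmpty) {
      val current = queue.dequeue()
      if (current == target) return Some(distance(current))
      for (next <- graph.getOrElse(current, Seq.empty) if !distance.contains(next)) {
        distance(next) = distance(current) + 1
        queue.enqueue(next)
      }
    }
    None
  }
}
```

For example, with `Map(1 -> Seq(2), 2 -> Seq(1, 3), 3 -> Seq(2))`, calling `BfsSketch.degreesOfSeparation(graph, 1, 3)` walks 1 to 2 to 3 and reports two hops. A distributed version would replace the queue with iterative passes over the graph, but the visiting order and distance bookkeeping follow the same idea.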