Slideshows by User: MaheshBhole

Diving into Data Science and Machine Learning with Big Data

5 days of effort, 40 GB RAM (Red Hat Linux, 5-node Hadoop cluster on OpenStack), 500 GB HDD, and 20 GB of retail buyer data. The pipeline: 2 Hive joins (2.5-hour run), data conditioning, normalization, then SVMWithSGD (Support Vector Machine using Stochastic Gradient Descent) with Spark MLlib as a distributed computation, with 20 iterations for model building (1.5-hour run). Result: auROC: Double = 0.7858184517425998; accuracy in predicting repeat buyers is 78.58147387964868 %. Yippee!

P.S. The model is in Scala, so any Scala or Java API can call it to get a prediction.
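
To make the normalization, SVM-with-SGD, and auROC steps named in the post concrete, here is a minimal single-machine sketch in plain Python. It is not the Scala/Spark MLlib code from the post and does not reproduce Spark's distributed implementation. The function names, the hyperparameters other than the 20 iterations, and the tiny buyer table are all hypothetical and chosen only for illustration.

```python
import random

def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)]
            for row in rows]

def score(w, b, x):
    """Raw (uncalibrated) decision value of the linear model."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_svm_sgd(features, labels, iterations=20, step=0.5, reg=0.01, seed=0):
    """Linear SVM fitted by per-sample SGD on the L2-regularized hinge loss.

    labels are 0/1 and are mapped to -1/+1 internally. One "iteration" here
    is one shuffled pass over the data (an assumption of this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    b = 0.0
    order = list(range(len(features)))
    for epoch in range(1, iterations + 1):
        rng.shuffle(order)
        lr = step / epoch ** 0.5          # decaying step size
        for i in order:
            x = features[i]
            y = 1.0 if labels[i] == 1 else -1.0
            margin = y * score(w, b, x)
            # Gradient step for the L2 penalty (shrinks the weights).
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge loss contributes a gradient only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def area_under_roc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: [visits, total_spend] per buyer, label 1 = repeat buyer.
raw = [[1, 10], [2, 15], [1, 20], [3, 12], [8, 90], [9, 80], [7, 95], [10, 85]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

features = min_max_normalize(raw)
w, b = train_svm_sgd(features, labels, iterations=20)
auc = area_under_roc([score(w, b, x) for x in features], labels)
print("auROC on the toy training data:", auc)
```

auROC is computed from the ranking of raw scores, while accuracy depends on a chosen decision threshold, so the two are separate metrics and are best reported separately.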