Distributed algorithms for big data analytics in Spark

Publications:
  • Manda Winlaw, Michael Hynes, Anthony Caterini, and Hans De Sterck, 'Algorithmic Acceleration of Parallel ALS for Collaborative Filtering: Speeding up Distributed Big Data Recommendation in Spark', IEEE ICPADS, 2015. [arXiv link] [link to Spark code on GitHub] Best Paper Award at ICPADS 2015!
  • Michael Hynes and Hans De Sterck, 'A polynomial expansion line search for large-scale unconstrained minimization of smooth L2-regularized loss functions, with implementation in Apache Spark', accepted for the SIAM Data Mining conference, 2016. [arXiv link]