The data revolution is reshaping science, technology, and business. Large-scale distributed optimization is emerging as a key tool for extracting useful information from the deluge of data arising in many areas of application. In this project you will explore optimization methods for big data, including the alternating direction method of multipliers (ADMM) and stochastic gradient descent (SGD). Applications of interest include, for example, matrix and tensor decompositions, which can be used to generate user recommendations for movie and music streaming services. The optimization algorithms will first be explored in Matlab (a small illustrative sketch appears at the end of this posting). Areas of study may include convergence acceleration of the ADMM and SGD methods, or efficient distributed implementations in the Spark framework for big data analytics.

Some relevant links:
- http://arxiv.org/abs/1508.03110
- http://stanford.edu/~boyd/papers/admm_distr_stats.html
- http://spark.apache.org/docs/latest/mllib-optimization.html

Required:
- major in computational/applied mathematics, computer science, or engineering
- at least one course on numerical computing
- interest in and experience with programming (any of Matlab, Python, C, Java, C++, Scala, Spark, ...)

Duration: 6 weeks, starting in December or January

Funding: fully funded ($450/week, $2,700 total)

Please email me if you are interested in this project or have questions about it.
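
To give a flavor of the SGD approach mentioned above, here is a minimal Matlab sketch of SGD applied to low-rank matrix factorization on synthetic data, in the spirit of the recommendation application. The problem sizes, rank, step size, and sampling rate below are illustrative choices, not project specifications.

    % Minimal SGD sketch for low-rank matrix factorization (illustrative
    % only; sizes, rank r, step size eta, and sampling rate are arbitrary).
    rng(0);
    m = 100; n = 80; r = 5;               % toy problem dimensions
    A = randn(m, r) * randn(r, n);        % synthetic low-rank "ratings" matrix
    [I, J] = find(rand(m, n) < 0.3);      % indices of "observed" entries
    U = 0.1 * randn(m, r);                % random factor initializations
    V = 0.1 * randn(n, r);
    eta = 0.01;                           % step size
    for epoch = 1:50
        for k = randperm(numel(I))        % visit observed entries in random order
            i = I(k); j = J(k);
            e = A(i, j) - U(i, :) * V(j, :)';  % residual on one observed entry
            Ui = U(i, :);                      % keep old copy so both factor
            U(i, :) = Ui + eta * e * V(j, :);  % updates use the same point
            V(j, :) = V(j, :) + eta * e * Ui;
        end
    end
    E = A(sub2ind([m n], I, J)) - sum(U(I, :) .* V(J, :), 2);
    fprintf('RMSE on observed entries: %.4f\n', sqrt(mean(E.^2)));

ADMM-based variants of this factorization, and distributed implementations in Spark, would be natural directions within the project.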