
This course offers a broad overview of the computational techniques and mathematical skills that data scientists rely on. It is designed as a “boot camp”: an intensive immersion in the subject over a short period of time. Topics include the Unix shell; version control with Git; IPython; creating web APIs; data structures and algorithms; working with databases; exploratory data analysis using Python and related libraries (pandas, Bokeh); MapReduce, Spark, and Hadoop; an overview of selected machine learning and optimization algorithms (logistic regression, Poisson regression, k-means, neural networks, gradient descent, stochastic gradient descent, L-BFGS); Python libraries for data analysis (scikit-learn, PyTorch, SciPy, NumPy); parallel computing; unit testing; and IEEE 754 floating-point arithmetic (Infinity, NaN, rounding error, overflow, and underflow).
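
As a small illustration of the IEEE 754 topics named in the description, the following Python sketch shows how rounding error, overflow, underflow, Infinity, and NaN can appear in ordinary double-precision arithmetic. It is not taken from the course materials; the variable names are illustrative, and it assumes a standard CPython build where `float` is an IEEE 754 double.

```python
import math
import sys

# Rounding error: 0.1, 0.2 and 0.3 have no exact binary representation,
# so the sum differs from 0.3 by a tiny nonzero amount.
rounding_gap = (0.1 + 0.2) - 0.3

# Overflow: multiplying past the largest finite double yields Infinity.
largest = sys.float_info.max
overflowed = largest * 2.0

# Underflow: halving the smallest positive subnormal double rounds to 0.0.
underflowed = 5e-324 / 2.0

# NaN: Infinity minus Infinity is undefined, and NaN never equals itself.
not_a_number = overflowed - overflowed

print("rounding gap:", rounding_gap)
print("overflowed is infinite:", math.isinf(overflowed))
print("underflowed:", underflowed)
print("NaN equals itself:", not_a_number == not_a_number)
```

Because NaN compares unequal to everything, including itself, checks such as `math.isnan` and tolerance-based comparisons such as `math.isclose` are usually safer than `==` when working with floating-point results.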