The KDD 2016 conference concluded today after 5 days of presentations, workshops, tutorials, keynotes and panel discussions on topics ranging from deep learning to graph theory and experiments. It has been an overwhelming amount of good information, with a fine balance of academic research and industry best practices. Here are a few highlights:
The panel discussion - Is Deep Learning the New 42 - moderated by Andrei Broder with panelists Pedro Domingos, Nando de Freitas, Isabelle Guyon, Jitendra Malik, and Jennifer Neville, covered excellent ground and provided much food for thought. Jitendra Malik spoke of having been a skeptic about deep learning until it was applied to computer vision - a field that has been most transformed by deep learning. A few of the downsides of deep learning were called out: it is data intensive, requiring large volumes of labeled data; it is power hungry; and it is not easily interpretable. According to Pedro Domingos, we are still in the Galileo phase of machine learning and the best is yet to come. ML is still a long way off from understanding, as a child does, the concept of an elephant from a few examples. Important ideas could come from the confluence of developmental psychology, neuroscience, and statistics.
Nando de Freitas’s keynote on Learning to Learn and Compositionality with Deep Recurrent Neural Networks provided a good round-up of the key ideas and innovations in deep learning. The second half of the talk, which covered some of his research on ‘learning to learn’ and neural programmer-interpreters (NPIs), was particularly fascinating. NPIs involve allowing neural networks to learn, for example, how programs go about adding numbers by training them on program execution logs. While the particular example of training a neural network to add two numbers may seem simplistic, the possibilities are endless and could change the way we go about programming.
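To make "program execution logs" a bit more concrete, here is a minimal sketch of the kind of step-by-step trace an ordinary addition routine could emit. This is my own illustration, not code from the talk or from the NPI work: the function name `add_with_trace` and the `ADD1` / `WRITE` / `CARRY` step labels are hypothetical, and no neural network is involved. It only shows what such a log of intermediate steps might look like as training data.

```python
def add_with_trace(a: int, b: int):
    """Add two non-negative integers digit by digit, logging every step.

    Returns (result, trace). The trace format is a made-up illustration
    of an execution log, not the format used by any real system.
    """
    # Least-significant digit first, as in grade-school addition.
    xs = [int(d) for d in str(a)][::-1]
    ys = [int(d) for d in str(b)][::-1]

    trace = []
    digits = []
    carry = 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        trace.append(("ADD1", i, x, y, carry))    # add one column plus carry-in
        total = x + y + carry
        digit, carry = total % 10, total // 10
        trace.append(("WRITE", i, digit))         # write the output digit
        if carry:
            trace.append(("CARRY", i + 1, carry))  # carry into the next column
        digits.append(digit)

    if carry:
        # A leftover carry becomes the most significant digit.
        digits.append(carry)
        trace.append(("WRITE", len(digits) - 1, carry))

    result = int("".join(str(d) for d in reversed(digits)))
    return result, trace
```

For instance, `add_with_trace(96, 7)` returns 103 together with a seven-step log; a learner in the spirit of NPIs would be trained on sequences like that log rather than only on input/output pairs.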
Jerome Friedman’s keynote at the IDEA workshop on Regression Location and Scale Estimation with Application to Censoring was super insightful. He discussed three problems in ML that are not covered in textbooks but need more research - robustness (when data does not follow observed/expected behavior), accuracy (how to compare algorithms), and censoring (less than perfect knowledge of the outcome variable in the training dataset). Statistical models attempt to get prediction accuracy on average and not at the level of individual samples. He suggested a few approaches to handle the above, including iterative estimation of the model function and the scale function (which is assumed to be a constant in standard approaches).
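The idea of alternating between the model (location) function and the scale function can be sketched roughly as follows. To be clear, this is not the procedure presented in the keynote: it is a generic textbook-style alternating scheme with plain linear models, which I am assuming purely for brevity. The function name `fit_location_scale`, the linear form of both functions, and the log-absolute-residual trick for the scale are all my assumptions.

```python
import numpy as np

def fit_location_scale(x, y, n_iter=10):
    """Alternately fit a linear location f(x) and a log-linear scale s(x).

    Assumed model (illustration only): y = f(x) + s(x) * noise,
    with f(x) = beta0 + beta1 * x and log s(x) = gamma0 + gamma1 * x.
    """
    X = np.column_stack([np.ones_like(x), x])
    scale = np.ones_like(y)  # start from the usual constant-scale assumption

    for _ in range(n_iter):
        # Step 1: weighted least squares for the location, down-weighting
        # points where the current scale estimate says noise is large.
        sw = 1.0 / scale
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

        # Step 2: regress log|residual| on x to update the scale.
        # This recovers log s(x) only up to an additive constant, which
        # does not matter for the relative weights used in step 1.
        resid = y - X @ beta
        gamma, *_ = np.linalg.lstsq(X, np.log(np.abs(resid) + 1e-8), rcond=None)
        scale = np.exp(X @ gamma)

    return beta, gamma
```

On synthetic data whose noise grows with `x`, the fitted slope `gamma[1]` comes out positive, which is exactly the information a constant-scale fit throws away.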
This was also the first time KDD had hands-on tutorials. The one on Scalable R on Spark by Microsoft R (formerly Revolution R) was very informative. All in all, an excellent conference and time well spent. While I have in the past been more inclined to skim through relevant papers and talks offline than attend conferences, the sheer amount of interesting and useful stuff one is exposed to when attending in person is beyond compare.