A Plethora of Machine Learning Tools

Machine Learning has two sub-fields, Deep Learning and Shallow Learning, and the split between them has become important over the last couple of years. Deep Learning is responsible for record results in Image Classification and Voice Recognition, and is therefore being spearheaded by large data companies like Google, Facebook, and Baidu. Shallow Learning methods, by contrast, include a variety of less cutting-edge Classification, Clustering, and Boosting techniques, such as Support Vector Machines. They are still widely used in fields such as Natural Language Processing, Brain-Computer Interfacing, and Information Retrieval.


GPU interfacing has become an important feature for Machine Learning tools because it can accelerate large-scale matrix operations. Its importance to Deep Learning methods is apparent. For instance, at the GPU Tech Conference in early May 2015, 39 of the 45 talks given under Machine Learning were about GPU-accelerated Deep Learning applications; these came from 31 major tech companies and 8 universities. The appeal reflects the massive speed improvements that GPUs bring to the training of Deep Networks.
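
To make the link between matrix operations and Deep Networks concrete, here is a minimal sketch of the forward pass of one fully connected layer, written as a plain matrix product. The function name `dense_forward`, the layer sizes, and the values are illustrative assumptions, not taken from any of the tools discussed. The point is only that every output element is an independent dot product, which is the kind of work a GPU can spread across many cores.

```python
# Illustrative sketch only: names, sizes, and values are assumptions.

def dense_forward(inputs, weights):
    """Multiply a batch of input rows by a weight matrix.

    inputs:  list of rows, shape (batch, n_in)
    weights: list of rows, shape (n_in, n_out)
    """
    n_in = len(weights)
    n_out = len(weights[0])
    outputs = []
    for row in inputs:
        # Each output element is an independent dot product, so all of
        # them could be computed at the same time on parallel hardware.
        outputs.append([
            sum(row[k] * weights[k][j] for k in range(n_in))
            for j in range(n_out)
        ])
    return outputs


batch = [[1.0, 2.0], [3.0, 4.0]]              # 2 examples, 2 features each
weights = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # 2 inputs -> 3 outputs
activations = dense_forward(batch, weights)
print(activations)  # [[1.0, 2.0, 3.0], [3.0, 4.0, 7.0]]
```

A real network repeats this product for every layer and every training step, with matrices that have thousands of rows and columns, so the cost of the multiplication dominates training time.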


Information about each tool's ability to distribute computation across clusters through Hadoop or Spark is also given. This has become an important talking point for Shallow Learning techniques, which suit distributed computation. Distributed computation for Deep Networks has likewise become a talking point as new techniques for distributed training have been developed.
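
One reason many learning methods suit distributed computation is that their training objective is a sum over data points, so each machine can process its own shard and the partial results can simply be added together. The sketch below illustrates that idea with a one-parameter least-squares model in plain Python. It does not use Hadoop or Spark, and the model, the data, and the function name `gradient` are assumptions made for illustration.

```python
# Illustrative sketch only: the model, data, and names are assumptions.

def gradient(shard, w):
    """Squared-error gradient for the model y = w * x, summed over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 1.0

# Split the data as if it lived on two separate machines.
shards = [data[:2], data[2:]]

partials = [gradient(shard, w) for shard in shards]  # "map": one result per shard
combined = sum(partials)                             # "reduce": add the results

full = gradient(data, w)  # the same gradient computed on a single machine
print(combined, full)     # -60.0 -60.0
```

The combined result matches the single-machine gradient, which is why the work can be divided across a cluster without changing the answer.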


Lastly, some additional notes are attached about the varying use of these tools in academia and industry. The information that exists was gathered by searching Machine Learning publications, presentations, and distributed code. Some of it was also supported by researchers at Google, Facebook, and Oracle, so many thanks to Greg Mori, Adam Pocock, and Ronan Collobert.


The results of this research show that a number of tools are currently in use, and that it is not yet clear which will win the lion's share of use in industry or across academia.