I'd like to share some news I saw yesterday: the launch of the new Google Prediction API. During the annual Google I/O event, which started yesterday, Google released several new web services, including this API.
So, what is the Google Prediction API? It gives you access to Google's machine learning algorithms to analyze your historical data and predict likely future outcomes. Developers and researchers upload their data to Google Storage for Developers (another service launched during the event) and then use the Prediction API to make real-time decisions such as recommending products, evaluating user sentiment from blogs or even tweets, routing messages, or assessing suspicious activities.
The Prediction API implements supervised learning algorithms as a RESTful web service to let you leverage patterns in your data, providing more relevant information to your users. Run your predictions on Google's infrastructure and scale effortlessly as your data grows in size and complexity.
A simple screenshot (taken from the Google Prediction home page) shows the idea of the service. In this example, it identifies the language of the text passed as a parameter.
|Google Prediction API Workflow|
According to the official home page, the API implements only supervised learning algorithms (no unsupervised ones such as clustering), exposed as a RESTful web service that runs on Google's infrastructure.
They don't say which specific algorithms they use, or how they select one among the several available machine learning techniques (I am very curious about that). It supports the most commonly used types of input: numeric data or unstructured text. The output can be one of hundreds of discrete categories (continuous output is not supported). And best of all, it is accessible from many platforms: Google App Engine, web and desktop apps (are mobile apps included?), and the command line.
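To make "supervised learning" concrete, here is a minimal sketch of the language-detection idea from the screenshot. It is not Google's algorithm (which they have not disclosed) and it does not call the Prediction API; it is a toy character-trigram classifier in plain Python, with made-up training sentences and hypothetical function names (`char_ngrams`, `train`, `predict`), just to show the train-on-labeled-examples, then-predict-a-category workflow.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled training data: (text, label) pairs.
EXAMPLES = [
    ("the quick brown fox jumps over the lazy dog", "English"),
    ("this is a sentence written in the english language", "English"),
    ("el rápido zorro marrón salta sobre el perro perezoso", "Spanish"),
    ("esta es una frase escrita en el idioma español", "Spanish"),
    ("le rapide renard brun saute par-dessus le chien paresseux", "French"),
    ("ceci est une phrase écrite dans la langue française", "French"),
]

def char_ngrams(text, n=3):
    """Split text into overlapping lowercase character n-grams."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(examples):
    """Count character n-grams per label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(char_ngrams(text))
    return counts

def predict(counts, text):
    """Return the label with the highest add-one-smoothed log-likelihood."""
    vocab_size = len({gram for c in counts.values() for gram in c})
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values())
        score = sum(
            math.log((c[gram] + 1) / (total + vocab_size + 1))
            for gram in char_ngrams(text)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(EXAMPLES)
print(predict(model, "the dog is lazy"))
print(predict(model, "el perro es perezoso"))
```

The point is the shape of the workflow: labeled examples go in, a model comes out, and new text gets mapped to one of the known categories. With the real service, the training and prediction steps would happen on Google's side behind the RESTful interface.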
Finally, Google introduced another tool for analyzing your data: BigQuery. This API enables fast, interactive analysis over huge datasets (imagine trillions of records). Using SQL-like commands via a RESTful API, you can quickly explore and understand your massive data. It can help you, for example, analyze your network logs or identify seasonal sales trends.
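To give a feel for what a "seasonal sales trends" question looks like in SQL, here is a small sketch. It does not use BigQuery or its REST interface, and BigQuery's own dialect may differ; it runs against Python's built-in `sqlite3` module as a local stand-in, with an invented `sales` table and invented numbers.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2009-01-15", 120.0),
        ("2009-07-04", 80.0),
        ("2009-12-05", 300.0),
        ("2009-12-20", 450.0),
        ("2010-01-10", 100.0),
        ("2010-12-18", 500.0),
    ],
)

# Total sales per calendar month across all years, highest first.
query = """
    SELECT strftime('%m', sale_date) AS month, SUM(amount) AS total
    FROM sales
    GROUP BY month
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for month, total in rows:
    print(month, total)
```

The query itself is the interesting part: one `GROUP BY` is enough to see which months sell best. The promise of BigQuery is being able to ask the same kind of question over far more rows than a local database could handle.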
My opinion about this? Google has taken a huge step forward in helping today's applications use their historical data to improve usability, make better decisions, and make money, of course! A new generation of applications will appear in the next few years, using Natural Language Processing and Machine Learning to improve their services. A lot of data is available to users, and Google is helping them analyze it so they can make decisions quickly. With this RESTful interface, even a kid with a few lines of code could develop a simple application to predict the weather in their city, or a Twitter spam filter. Imagine the possibilities! Now you don't need to bury yourself under a pile of machine learning and statistics books to give intelligence to your application or to analyze your data.
It's intelligence packaged in black boxes for anyone with basic programming knowledge. Let's see what comes of this step. Either way, Google has moved toward Cloud Data Analysis Computing (CDAC) (I invented this name).
What do you think about it? Let's wait for the next chapters!