Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but the amount of data it can handle is limited by the available memory, and a single machine is often not enough to finish the analyses in a reasonable time. In this project, we combine the strengths of both tools and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.
Radoop
July 25th, 2011
Google Predict Empowering Applications
May 21st, 2011
The Google Prediction API allows you to tap into Google’s machine learning algorithms, which crunch your data and give you possible outcomes, thereby helping you make your applications smarter.
Features
- Lightweight RESTful API.
- Asynchronous training.
- Automatically selects from several available machine learning techniques.
- Supported inputs: numeric data and unstructured text.
- Outputs hundreds of discrete categories, or continuous values.
- Gallery of pre-trained prediction models.
- Ability to add new training data on the fly.
- Accessible from many platforms: Google App Engine, Apps Script (Google Spreadsheets), web & desktop apps, and command line.
Read More: http://code.google.com/apis/predict/
First look at Dhiti
May 21st, 2011
Dhiti offers a RESTful API to our exploratory search platform. In short, our platform allows you to:
- Upload a set of documents (HTML or text) into a session.
- Extract the top topics or concepts for a document or a set of documents.
- Provide relevance feedback about articles, concepts or nuggets you like and dislike. Recommendations subsequently change accordingly.
- Get recommendations of nuggets, articles or categories for a pivot. A pivot can be:
  - a URL (of a document already added)
  - a string (treated as a query)
  - a category
  - your preferences, based on the relevance feedback
- Persist the session, along with your preferences
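To make the four kinds of pivot listed above concrete, here is a minimal Python sketch that models them as small data types. This is only an illustration and not Dhiti's actual API: the class names, the `pivot_params` helper and the `pivot_type`/`value` keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class UrlPivot:
    url: str  # URL of a document already added to the session


@dataclass(frozen=True)
class QueryPivot:
    text: str  # free-form string, treated as a query


@dataclass(frozen=True)
class CategoryPivot:
    name: str  # name of a category


@dataclass(frozen=True)
class PreferencePivot:
    """Pivot on the preferences built up from relevance feedback."""


Pivot = Union[UrlPivot, QueryPivot, CategoryPivot, PreferencePivot]


def pivot_params(pivot: Pivot) -> Dict[str, str]:
    """Turn a pivot into request parameters (hypothetical key names)."""
    if isinstance(pivot, UrlPivot):
        return {"pivot_type": "url", "value": pivot.url}
    if isinstance(pivot, QueryPivot):
        return {"pivot_type": "query", "value": pivot.text}
    if isinstance(pivot, CategoryPivot):
        return {"pivot_type": "category", "value": pivot.name}
    if isinstance(pivot, PreferencePivot):
        # Preferences live in the session, so no extra value is needed.
        return {"pivot_type": "preferences"}
    raise TypeError(f"unsupported pivot: {pivot!r}")
```

Keeping each pivot kind as its own type means a client cannot accidentally pass a category name where a document URL is expected; the real API may of course represent pivots quite differently.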
Some applications of our API:
- Content discovery on publishing sites. Dhiti Dive.
- Explore pages and topics on any page on the web. Drilll
- Convert your incoming twitter stream into a research library. Intweetion
- Get short, relevant previews from a book, e.g. a preview for The Selfish Gene.
More: http://dhiti.com/api/
KNIME Beginner’s Luck
May 21st, 2011
KNIME (Konstanz Information Miner) is a user-friendly and comprehensive open-source data integration, processing, analysis, and exploration platform.
“KNIME Beginner’s Luck” is a quick introduction to KNIME for beginners.
More: Rosaria Silipo is a certified KNIME trainer, and this book grew out of her lessons on KNIME and KNIME Reporting. It gives a detailed overview of the main tools and philosophy of the KNIME data analysis platform. The goal is to give new KNIME users the knowledge they need to start analysing, manipulating, and reporting on even complex data.


