Posts Tagged ‘Tools’
Radoop
Monday, July 25th, 2011
Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but the amount of data it can handle is limited by the available memory, and a single machine is often not enough to run the analyses on time. In this project, we combine the strengths of both tools and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.
RapidAnalytics released at OSBI 2010
Wednesday, December 1st, 2010
Rapid-I now releases the first open source solution for business analytics. The process-oriented approach of RapidMiner and RapidAnalytics allows direct and even real-time integration into business processes.
For more, visit:
http://rapid-i.com/content/view/267/1/
Data Mining Trends
Tuesday, September 21st, 2010
Google Trends statistics for the search volume of "data mining" are available here:
http://www.google.com/trends?q=Data+mining&ctab=0&geo=all&date=all
Topics of Interest:
A: Business Intelligence and Data Mining
B: Data mining tells government and business a lot about you
C: Data mining is commonly used in business to find patterns
D: ‘Data mining’ may implicate innocent people in search for terrorists
E: ‘Data mining’ for drug companies goes to courts
F: IMS Health stock falls, as data mining ban pitched
[Table: ranking of countries (South Asia) by interest in data mining]
A 2010 poll conducted by KDnuggets (http://www.kdnuggets.com/polls/2010/data-mining-analytics-tools.html) asked which data mining tools people used. About 900 unique data miners voted, and each was allowed multiple votes. The number of votes for each tool and the corresponding percentage were as follows:
| Tool (votes) | Share of voters |
| --- | --- |
| RapidMiner (345) | 37.8% |
| R (272) | 29.8% |
| Excel (222) | 24.3% |
| KNIME (175) | 19.2% |
| Your own code (168) | 18.4% |
| Pentaho/Weka (131) | 14.3% |
| SAS (110) | 12.0% |
| MATLAB (84) | 9.2% |
| IBM SPSS Statistics (72) | 7.9% |
| Other free tools (67) | 7.3% |
| IBM SPSS Modeler (former Clementine) (67) | 7.3% |
| Microsoft SQL Server (63) | 6.9% |
| Statsoft Statistica (57) | 6.2% |
| Other commercial tools (56) | 6.1% |
| SAS Enterprise Miner (50) | 5.5% |
| Zementis (34) | 3.7% |
| Orange (25) | 2.7% |
| Oracle DM (19) | 2.1% |
| KXEN (19) | 2.1% |
| Salford CART Mars other (15) | 1.6% |
| VisuaLinks (12) | 1.3% |
| Viscovery (10) | 1.1% |
| Angoss (8) | 0.9% |
| TIBCO Insightful Miner (7) | 0.8% |
| Miner3D (7) | 0.8% |
| REvolution Computing (4) | 0.4% |
| Megaputer Polyanalyst/TextAnalyst (3) | 0.3% |
| Portrait Software (2) | 0.2% |
| Data Applied (2) | 0.2% |
| Centrifuge (2) | 0.2% |
| PRSD Studio (1) | 0.1% |
| Clario Analytics (1) | 0.1% |
| Bayesia (1) | 0.1% |
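Because each voter could pick several tools, the percentages in the table are shares of voters rather than shares of votes, so they do not add up to 100%. The arithmetic can be sketched roughly as below. Note that the exact number of voters is not stated in this post ("about 900"); the value 913 used here is a hypothetical figure chosen only because it is consistent with the rounded percentages, and the function name is illustrative.

```python
# Votes per tool, copied from the first four rows of the poll table.
votes = {"RapidMiner": 345, "R": 272, "Excel": 222, "KNIME": 175}

# ASSUMPTION: the post only says "about 900" voters. 913 is a hypothetical
# value that happens to reproduce the rounded percentages in the table.
ASSUMED_VOTERS = 913


def share_of_voters(tool_votes, voters):
    """Percentage of voters who selected a tool, rounded to one decimal."""
    return round(100.0 * tool_votes / voters, 1)


shares = {tool: share_of_voters(n, ASSUMED_VOTERS) for tool, n in votes.items()}

for tool, pct in shares.items():
    print(f"{tool}: {pct}%")

# Each voter may select several tools, so the total number of votes can
# exceed the number of voters and the shares need not sum to 100%.
total_votes = sum(votes.values())
print("votes counted:", total_votes, "assumed voters:", ASSUMED_VOTERS)
```

With a different assumed voter count the individual percentages shift slightly, but the point stays the same: multiple votes per person push the column total above 100%.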
Open Source Data mining tools:
Open source data mining is also on the rise. Weka, Orange, Rattle and RapidMiner are a few of the open source tools available. Recent trends in the use of data mining software also favor open source in a big way: an analysis by KDnuggets indicates the type of software chosen by users in various countries.
Manu C, Student Content Intern.
IBM upgrades SPSS data mining software
Friday, May 14th, 2010
IBM has updated its SPSS suite with better social media analysis features and enhanced text mining features. According to IBM, these changes should allow customers to directly access text, web and survey data and integrate it into predictive models to better meet their business needs.




