KNIME


KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and the use of JDBC allow users to assemble nodes that blend different data sources, including preprocessing, for modeling, data analysis and visualization with little or no programming.
Since 2006, KNIME has been used in pharmaceutical research. It is also used in other areas such as CRM customer data analysis, business intelligence, text mining and financial data analysis.
KNIME's headquarters are based in Zurich, with additional offices in Konstanz, Berlin, and Austin.

History

The development of KNIME was started in January 2004 by a team of software engineers at the University of Konstanz as a proprietary product. The original developer team, headed by Michael Berthold, came from a company in Silicon Valley that provided software for the pharmaceutical industry. The initial goal was to create a modular, highly scalable and open data processing platform that allowed easy integration of different data loading, processing, transformation, analysis and visual exploration modules without a focus on any particular application area. The platform was intended to be a collaboration and research platform and also to serve as an integration platform for various other data analysis projects.
In 2006 the first version of KNIME was released. Several pharmaceutical companies started using KNIME, and a number of life science software vendors began integrating their tools into it. Later that year, after an article in the German magazine c't, users from a number of other areas came on board. As of 2012, KNIME was in use by over 15,000 actual users, not only in the life sciences but also at banks, publishers, car manufacturers, telcos, consulting firms and various other industries, as well as at a large number of research groups worldwide. The latest updates to KNIME Server and KNIME Big Data Extensions provide support for Apache Spark 2.3, Parquet and HDFS-type storage.
For the sixth year in a row, KNIME has been placed as a leader for Data Science and Machine Learning Platforms in Gartner's Magic Quadrant.

Internals

KNIME allows users to visually create data flows, selectively execute some or all analysis steps, and later inspect the results and models using interactive widgets and views. KNIME is written in Java and based on Eclipse. It makes use of an extension mechanism to add plugins providing additional functionality. The core version already includes hundreds of modules for data integration and data transformation, as well as commonly used methods of statistics, data mining, analysis and text analytics. Visualization is supported through the free Report Designer extension. KNIME workflows can be used as data sets to create report templates that can be exported to document formats such as doc, ppt, xls, pdf and others.
Although KNIME is implemented in Java, it also allows wrappers that call other code, and it provides nodes that run Java, Python, R, Ruby and other code fragments.
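The idea of a node-based data flow with selective execution can be illustrated with a toy sketch. The Python code below is not KNIME's API and does not reflect how KNIME is implemented; the `Node` class, its `execute` method and the sample data are hypothetical names and values chosen only for illustration. It assumes that executing a node first executes any upstream nodes that have not yet run, while downstream nodes stay untouched.

```python
class Node:
    """A hypothetical pipeline step: a name, a function, and its upstream nodes."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = list(inputs)
        self.result = None
        self.executed = False

    def execute(self):
        """Run any upstream nodes that have not run yet, then run this node."""
        if not self.executed:
            upstream_results = [node.execute() for node in self.inputs]
            self.result = self.func(*upstream_results)
            self.executed = True
        return self.result


# A three-step flow (illustrative data): read rows, filter them, aggregate.
reader = Node(
    "reader",
    lambda: [
        {"id": 1, "value": 4.0},
        {"id": 2, "value": -1.5},
        {"id": 3, "value": 2.5},
    ],
)
row_filter = Node(
    "filter",
    lambda rows: [row for row in rows if row["value"] > 0],
    [reader],
)
total = Node("sum", lambda rows: sum(row["value"] for row in rows), [row_filter])

# Selective execution: running the filter also runs the reader, but not the sum.
filtered = row_filter.execute()
```

After this partial run, the intermediate result held in `row_filter.result` can be inspected before deciding whether to call `total.execute()`, which reuses the stored upstream results rather than recomputing them.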

License

As of version 2.1, KNIME is released under GPLv3 with an exception that allows others to use the well-defined node API to add proprietary extensions. This also allows commercial software vendors to add wrappers that call their tools from KNIME.