{"id":13024,"date":"2019-07-31T13:16:32","date_gmt":"2019-07-31T17:16:32","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=13024"},"modified":"2022-05-24T15:17:58","modified_gmt":"2022-05-24T19:17:58","slug":"voracity-knime-node","status":"publish","type":"post","link":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/","title":{"rendered":"Speeding KNIME with Voracity"},"content":{"rendered":"<h4><b>Abstract<\/b><\/h4>\n<p><i><span style=\"font-weight: 400;\"><a href=\"https:\/\/www.knime.com\">KNIME<\/a> is a leading open source analytic and visualization tool for data scientists. Wrangling raw data for KNIME projects is usually done via their intermediate file node, database connectors, or other extensions like Spark. To increase functionality and speed while reducing the complexity of data preparation, IRI created a \u2018job source\u2019 or \u2018data provider\u2019 node to use the CoSort engine. Its <\/span><\/i><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\"><i><span style=\"font-weight: 400;\">SortCL<\/span><\/i><\/a><i><span style=\"font-weight: 400;\"> program, also built in the <a href=\"https:\/\/www.iri.com\/products\/voracity\">IRI Voracity<\/a> data management platform and running in Eclipse with KNIME, simultaneously wrangles and feeds integrated, cleansed, masked, or synthesized data in memory into the KNIME workflow. 
PC benchmarks show that this approach can drive dramatic KNIME performance gains at high volumes.<\/span><\/i><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Group-by-Store-Sales-Average-Times.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13394 size-full aligncenter\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Group-by-Store-Sales-Average-Times.png\" alt=\"\" width=\"600\" height=\"371\" \/><\/a><\/p>\n<h4><b>Background<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">KNIME<\/span><span style=\"font-weight: 400;\">, short for Konstanz Information Miner, is an open source data mining and analytics platform for turning data from multiple sources into charts, images, data models, and outputs, all in a single workflow designed and run from Eclipse. KNIME has become a very powerful and popular ecosystem for predictions, machine learning, and other areas of data science.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">KNIME projects consist of task-specific nodes, which are all the input, analytic, and visualization pieces of the larger job. KNIME provides many free nodes of its own, and a marketplace of nodes built by the <\/span><a href=\"https:\/\/www.knime.com\/community\"><span style=\"font-weight: 400;\">KNIME community<\/span><\/a><span style=\"font-weight: 400;\">. There are nodes for machine and deep learning (AI), predictive analytics, custom code, and nodes for topics like molecular chemistry.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The nodes are strung together in workflows and can also run individually or in groups. But as with other BI and analytic platforms, KNIME jobs run slowly given large data sources. 
Its own data sourcing and transformation nodes are inherently slow, and external nodes like Spark are limited in functional and data source scope.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Fortunately, a new node compatible with the <\/span><a href=\"https:\/\/www.iri.com\/products\/voracity\"><span style=\"font-weight: 400;\">IRI Voracity<\/span><\/a><span style=\"font-weight: 400;\"> data manipulation and management platform &#8212; which is also built on Eclipse but powered by the <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\"><span style=\"font-weight: 400;\">IRI CoSort<\/span><\/a><span style=\"font-weight: 400;\"> data manipulation engine\u00a0<\/span><span style=\"font-weight: 400;\">called <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\"><span style=\"font-weight: 400;\">SortCL<\/span><\/a><span style=\"font-weight: 400;\"> &#8212; can result in KNIME workflows finishing up to 10x faster. CoSort is a proven big data workhorse featuring superior transformation algorithms, optimized I\/O and memory use, and task consolidation.<span id='easy-footnote-1-13024' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#easy-footnote-bottom-1-13024' title=' You can read about the foundations of CoSort and the roles of Voracity in big data &lt;a href=&quot;https:\/\/www.iri.com\/solutions\/big-data&quot;&gt;here&lt;\/a&gt;.'><sup>1<\/sup><\/a><\/span><\/span><\/p>\n<p><span style=\"font-weight: 400;\">The &#8220;Voracity Job Source&#8221; node for KNIME rapidly <\/span><a href=\"https:\/\/www.iri.com\/solutions\/business-intelligence\/bi-tool-acceleration\"><span style=\"font-weight: 400;\">wrangles raw data<\/span><\/a><span style=\"font-weight: 400;\"> using SortCL and pumps its results into a KNIME model in memory. This model can then be used by almost every other node in KNIME. 
Those nodes can thus work immediately with the filtered, transformed, cleansed, and PII-masked data that they need to succeed. This article covers the relative planning and performance of data wrangling in KNIME without, and then with, the Voracity node.<\/span><\/p>\n<h4><b>Benchmarking Environment<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">For testing the node, I used CSV source files of increasing size<span id='easy-footnote-2-13024' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#easy-footnote-bottom-2-13024' title=' Those files were also built in Voracity using SortCL, which is also the parent program for the &lt;a href=&quot;https:\/\/www.iri.com\/products\/rowgen&quot;&gt;IRI RowGen&lt;\/a&gt; random &lt;a href=&quot;http:\/\/www.iri.com\/solutions\/test-data&quot;&gt;test data&lt;\/a&gt; synthesis product.'><sup>2<\/sup><\/a><\/span>\u00a0<\/span><span style=\"font-weight: 400;\">to compare the relative processing speeds and scalability of the single Voracity Job Source node against the typical KNIME nodes provided to read the same input and perform the same data transformations.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">My test files contained four transaction items: department, store number, date, and sales by department. They were chosen to simulate a data set from a department store chain where product sales are tallied by department and by store.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">All tests were performed in the <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> IDE for Voracity with the KNIME Analytics Platform installed in the same Eclipse workspace. 
I will explain how to load KNIME components into the IRI Workbench along with the Voracity node in another blog post.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The hardware used for the benchmarks of both approaches was an Intel hexa-core Windows 10 64-bit desktop with 12GB of RAM. The software versions used were CoSort v10 (in Voracity v2) and KNIME v4.<\/span><\/p>\n<h4><b>Prerequisite KNIME Nodes<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">To perform a<\/span><i><span style=\"font-weight: 400;\"> group by<\/span><\/i><span style=\"font-weight: 400;\"> to find the total sales for each store, I needed to configure, connect, and run five nodes in KNIME to achieve the same result that Voracity produces using only one node.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/Knime_close_up.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13028 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/Knime_close_up.png\" alt=\"KNIME close up\" width=\"644\" height=\"135\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Knime_close_up.png 644w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Knime_close_up-300x63.png 300w\" sizes=\"(max-width: 644px) 100vw, 644px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">The nodes needed for this test:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">CSV Reader: reads the CSV file into KNIME<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Number to String: changes the store numbers to strings<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Column Filter: removes all columns but the store and sales columns<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Round Double: the sales column from the CSV file was automatically read in as an integer. 
This data type is too small for the sum in the next node<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">GroupBy: Sums the totals of all the sales and groups them by store<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Interactive Table: Shows the results in table form<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">These nodes and others can be found in the node repository either by using the search bar or by navigating to, and selecting them from, the respective categories in which they appear.<\/span><\/p>\n<h4><b>Prerequisite Voracity (CoSort SortCL) Data Wrangling Job<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">For the <\/span><i><span style=\"font-weight: 400;\">Voracity Job Source<\/span><\/i><span style=\"font-weight: 400;\"> node to work, there needs to be a preexisting SortCL job script with a stdout (standard output \/ unnamed pipe) target specification. These jobs are also made in IRI Workbench, through a <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\/voracity-gui\/design\"><span style=\"font-weight: 400;\">variety of methods<\/span><\/a><span style=\"font-weight: 400;\"> like automatic job creation wizards, ETL-style mapping diagrams, and a syntax-aware editor.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this example, I am sorting and aggregating the test department store transaction data set in different-sized CSV files. 
This requires a single job script for the SortCL program to parse and run on the command line or, in this case, to be launched in the Voracity provider node for KNIME.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">My job script is shown in IRI Workbench below, along with its dynamic outline and diagram view:<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13029 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage-1024x576.png\" alt=\"Voracity Collage\" width=\"590\" height=\"332\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage-1024x576.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage-300x169.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage-768x432.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage.png 1920w\" sizes=\"(max-width: 590px) 100vw, 590px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">This job is then selected for execution in the one (and only) <\/span><i><span style=\"font-weight: 400;\">Voracity <\/span><\/i><span style=\"font-weight: 400;\">Job Source node required:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/Voracity_close_up.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-13030 size-full\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_close_up-e1564520571932.png\" alt=\"Voracity close up\" width=\"407\" height=\"131\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_close_up-e1564520571932.png 407w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_close_up-e1564520571932-300x97.png 300w\" sizes=\"(max-width: 407px) 100vw, 407px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Note that a SortCL job can not only 
perform multiple transformations in the same script and I\/O pass, but can also take multiple sources (files, tables, pipes, URLs, etc.) and produce multiple targets all at the same time. It uses source field names as the symbolic references for mapping the data to the outputs.<\/span><span style=\"font-weight: 400;\">\u00a0<span id='easy-footnote-3-13024' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#easy-footnote-bottom-3-13024' title=' The task consolidation of its external SortCL engine is one of the reasons Voracity is a &lt;a href=&quot;https:\/\/www.iri.com\/products\/voracity\/why-is-voracity-better&quot;&gt;superior alternative to legacy ETL tools&lt;\/a&gt; in its own right.'><sup>3<\/sup><\/a><\/span><\/span><\/p>\n<p><span style=\"font-weight: 400;\">Remember, the first target you want to feed to KNIME must be designated as stdout (standard output) for the data to flow from SortCL in memory to the KNIME table node. Without that defined pipe, the Voracity node will not work.<\/span><\/p>\n<h4><b>Running the Voracity Job Source Node<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">Using the Voracity Job Source node is similar to using any other node in KNIME. 
Just go to the IRI Voracity Node category and drag the Voracity Job Source into the workflow:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/node_repo.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13031 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/node_repo.png\" alt=\"node repo\" width=\"572\" height=\"354\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/node_repo.png 572w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/node_repo-300x186.png 300w\" sizes=\"(max-width: 572px) 100vw, 572px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Open the node and use the file browser to locate and select the target script (.scl file) that SortCL will run in the Voracity Job Source node for KNIME:\u00a0<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/Voracity_box.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13032 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/Voracity_box.png\" alt=\"Voracity box\" width=\"471\" height=\"328\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_box.png 471w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_box-300x209.png 300w\" sizes=\"(max-width: 471px) 100vw, 471px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">After you select the script, click Apply and run the Voracity node to create and feed the SortCL results directly, through memory, into the target KNIME table. 
The resulting table can then be passed to other analytics, mining, or visualization nodes in KNIME:<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13033 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2-1024x575.png\" alt=\"Voracity collage 2\" width=\"677\" height=\"380\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2-1024x575.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2-300x169.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2-768x432.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage2.png 1600w\" sizes=\"(max-width: 677px) 100vw, 677px\" \/><\/a><\/p>\n<h4><b>Relative Speeds<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">Using the Global Timer node provided by KNIME, we can compare the speed of KNIME with and without the Voracity node in performing the same work. 
The listed times are each node\u2019s individual time in milliseconds from the workflow.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The following screenshots show how the normal Global Timer node\u2019s table would appear.<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">Times for 1 million rows with just KNIME Nodes:<\/span><\/p>\n<p style=\"text-align: center;\"><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/refined_knime_time_table.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13035\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/refined_knime_time_table.png\" alt=\"\" width=\"397\" height=\"193\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/refined_knime_time_table.png 397w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/refined_knime_time_table-300x146.png 300w\" sizes=\"(max-width: 397px) 100vw, 397px\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">Total time for those nodes:<\/span><\/p>\n<p style=\"text-align: center;\"><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/knime_time_table_sum.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13038\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/knime_time_table_sum.png\" alt=\"\" width=\"254\" height=\"105\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">Times for 1 million rows with Voracity in KNIME:<\/span><\/p>\n<p style=\"text-align: center;\"><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/refined_voracity_time_table.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13036\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/refined_voracity_time_table.png\" alt=\"\" width=\"340\" height=\"119\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/refined_voracity_time_table.png 340w, 
https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/refined_voracity_time_table-300x105.png 300w\" sizes=\"(max-width: 340px) 100vw, 340px\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">Total time for Voracity in KNIME:<\/span><\/p>\n<p style=\"text-align: center;\"><a href=\"\/blog\/wp-content\/uploads\/2019\/07\/voracity_time_table_sum.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13037\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/voracity_time_table_sum.png\" alt=\"\" width=\"267\" height=\"103\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">After various tests on the increasing file sizes, below are my final results. The times listed are the average totals of all nodes involved to create the same table at the end of each workflow.<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/KNIME-4.0-times.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13393 size-full aligncenter\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/KNIME-4.0-times.png\" alt=\"\" width=\"559\" height=\"267\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/KNIME-4.0-times.png 559w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/KNIME-4.0-times-300x143.png 300w\" sizes=\"(max-width: 559px) 100vw, 559px\" \/><\/a><\/p>\n<p>These times illustrate a major improvement over the functionally equivalent, required KNIME nodes; i.e., what takes seconds for Voracity takes minutes in KNIME.<\/p>\n<p><span style=\"font-weight: 400;\">More specifically, given the same flat-file CSV sources, Voracity wrangled that data up to 10x faster than KNIME! 
In fact, KNIME needed more time just to read the CSV file than Voracity needed to complete the entire data transformation job and transfer that data to KNIME.<\/span><\/p>\n<h4><b>Bottom Line<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">By pre-processing and calculating data for KNIME <\/span><i><span style=\"font-weight: 400;\">before <\/span><\/i><span style=\"font-weight: 400;\">it has to convert it into its own table, we can greatly increase the speed at which KNIME can perform its total set of data preparation and analytic functions.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In both methods of course, downstream analytics in the workflow proceeds identically, but the major bottleneck associated with them is alleviated when the Voracity node provides the data.<\/span><\/p>\n<h4><b>Next Steps<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">So far, the Voracity node works with KNIME inside native Eclipse builds like, or including, IRI Workbench. Learn how to install it <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/knime-installation-workbench\/\">here<\/a>. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Check out <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/knime-machine-learning\/\">this<\/a><\/span><span style=\"font-weight: 400;\">\u00a0application of the Voracity node feeding data into KNIME machine learning nodes in the evaluation of breast cancer data. Give us your feedback and suggestions in the reply form below.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Abstract KNIME is a leading open source analytic and visualization tool for data scientists. Wrangling raw data for KNIME projects is usually done via their intermediate file node, database connectors, or other extensions like Spark. 
To increase functionality and speed while reducing the complexity of data preparation, IRI created a \u2018job source\u2019 or \u2018data provider\u2019<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\" title=\"Speeding KNIME with Voracity\">Read More<\/a><\/div>\n","protected":false},"author":115,"featured_media":13048,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,32,1,776,217,34,91],"tags":[1426,373,52,44,1383,1163,1425,100,546,789,850,1422,1423,1424,354,68],"class_list":["post-13024","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-business-intelligence","category-data-transformation2","category-etl","category-iri","category-business","category-iri-workbench","tag-artificial-intelligence","tag-bi-tool-acceleration","tag-business-intelligence-2","tag-cosort","tag-data-science","tag-data-wrangling","tag-deep-learning","tag-etl","tag-iri-cosort","tag-iri-voracity","tag-iri-workbench","tag-knime","tag-knime-analytics-platform","tag-machine-learning","tag-predictive-analytics","tag-sortcl"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Speeding KNIME with Voracity - IRI<\/title>\n<meta name=\"description\" content=\"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. 
Wrangling raw data for KNIME projects...\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Speeding KNIME with Voracity - IRI\" \/>\n<meta property=\"og:description\" content=\"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. Wrangling raw data for KNIME projects...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2019-07-31T17:16:32+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-05-24T19:17:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1110\" \/>\n\t<meta property=\"og:image:height\" content=\"624\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Craig Schein\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Craig Schein\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\"},\"author\":{\"name\":\"Craig Schein\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/803e7bda27760374008e0dee86eee2bd\"},\"headline\":\"Speeding KNIME with Voracity\",\"datePublished\":\"2019-07-31T17:16:32+00:00\",\"dateModified\":\"2022-05-24T19:17:58+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\"},\"wordCount\":1475,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png\",\"keywords\":[\"artificial intelligence\",\"bi tool acceleration\",\"business intelligence\",\"CoSort\",\"data science\",\"data wrangling\",\"deep learning\",\"ETL\",\"IRI CoSort\",\"IRI Voracity\",\"IRI Workbench\",\"KNIME\",\"KNIME Analytics Platform\",\"machine learning\",\"predictive analytics\",\"SortCL\"],\"articleSection\":[\"Big Data\",\"Business Intelligence (BI&#041;\",\"Data Transformation\",\"ETL\",\"IRI\",\"IRI Business\",\"IRI 
Workbench\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\",\"name\":\"Speeding KNIME with Voracity - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png\",\"datePublished\":\"2019-07-31T17:16:32+00:00\",\"dateModified\":\"2022-05-24T19:17:58+00:00\",\"description\":\"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. 
Wrangling raw data for KNIME projects...\",\"breadcrumb\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png\",\"width\":1110,\"height\":624},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/beta.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Speeding KNIME with Voracity\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/beta.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/803e7bda27760374008e0dee86eee2bd\",\"name\":\"Craig Schein\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/dd64a949f0641d95a87230cabec062d5?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/dd64a949f0641d95a87230cabec062d5?s=96&d=blank&r=g\",\"caption\":\"Craig Schein\"},\"url\":\"https:\/\/beta.iri.com\/blog\/author\/craigs\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Speeding KNIME with Voracity - IRI","description":"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. Wrangling raw data for KNIME projects...","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/","og_locale":"en_US","og_type":"article","og_title":"Speeding KNIME with Voracity - IRI","og_description":"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. 
Wrangling raw data for KNIME projects...","og_url":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/","og_site_name":"IRI","article_published_time":"2019-07-31T17:16:32+00:00","article_modified_time":"2022-05-24T19:17:58+00:00","og_image":[{"width":1110,"height":624,"url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","type":"image\/png"}],"author":"Craig Schein","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Craig Schein","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#article","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/"},"author":{"name":"Craig Schein","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/803e7bda27760374008e0dee86eee2bd"},"headline":"Speeding KNIME with Voracity","datePublished":"2019-07-31T17:16:32+00:00","dateModified":"2022-05-24T19:17:58+00:00","mainEntityOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/"},"wordCount":1475,"commentCount":1,"publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","keywords":["artificial intelligence","bi tool acceleration","business intelligence","CoSort","data science","data wrangling","deep learning","ETL","IRI CoSort","IRI Voracity","IRI Workbench","KNIME","KNIME Analytics Platform","machine learning","predictive analytics","SortCL"],"articleSection":["Big Data","Business Intelligence (BI&#041;","Data Transformation","ETL","IRI","IRI Business","IRI 
Workbench"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/","url":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/","name":"Speeding KNIME with Voracity - IRI","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","datePublished":"2019-07-31T17:16:32+00:00","dateModified":"2022-05-24T19:17:58+00:00","description":"Abstract KNIME is a leading open source analytic and visualization tool for data scientists. 
Wrangling raw data for KNIME projects...","breadcrumb":{"@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#primaryimage","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","width":1110,"height":624},{"@type":"BreadcrumbList","@id":"https:\/\/beta.iri.com\/blog\/business-intelligence\/voracity-knime-node\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/beta.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Speeding KNIME with Voracity"}]},{"@type":"WebSite","@id":"https:\/\/beta.iri.com\/blog\/#website","url":"https:\/\/beta.iri.com\/blog\/","name":"IRI","description":"Total Data Management Blog","publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/beta.iri.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/beta.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/beta.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/803e7bda27760374008e0dee86eee2bd","name":"Craig Schein","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/dd64a949f0641d95a87230cabec062d5?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/dd64a949f0641d95a87230cabec062d5?s=96&d=blank&r=g","caption":"Craig 
Schein"},"url":"https:\/\/beta.iri.com\/blog\/author\/craigs\/"}]}},"jetpack_featured_media_url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/07\/Voracity_collage_IRI_logo_orca.png","_links":{"self":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13024"}],"collection":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/users\/115"}],"replies":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=13024"}],"version-history":[{"count":21,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13024\/revisions"}],"predecessor-version":[{"id":15862,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13024\/revisions\/15862"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media\/13048"}],"wp:attachment":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=13024"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=13024"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=13024"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}