The T in ETL, the “transform” stage of the “extract, transform, load” data process, rarely gets the spotlight in the Big Data revolution, but it’s one of the most challenging and important steps in turning raw data into intelligence. Pulling entries from various logs, for example, is useless unless you transform the dates and times into a standard timestamp. Capturing, extracting, transforming and loading text from the web can be even trickier, because pages often carry information that is never spelled out in the text yet is highly relevant to analysis. Consider, for example, a page like eBay’s that uses green and red color coding to indicate whether you have the highest bid. Advanced capabilities are required to harvest this type of information.
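To make the log example concrete, here is a minimal sketch in plain Python of normalizing dates and times from several sources into one standard timestamp. It is not Katalyst’s API; the three input formats, the `KNOWN_FORMATS` list, the `to_utc_iso` helper, and the choice to treat timestamps without an offset as UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical input formats; real log sources will need their own list.
KNOWN_FORMATS = [
    "%d/%b/%Y:%H:%M:%S %z",  # web-server style, e.g. 10/Oct/2023:13:55:36 -0700
    "%Y-%m-%dT%H:%M:%S%z",   # ISO 8601 with a UTC offset
    "%Y-%m-%d %H:%M:%S",     # no offset given; assumed to be UTC here
]


def to_utc_iso(raw: str) -> str:
    """Return the timestamp as an ISO 8601 string in UTC."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # not this format, try the next one
        if parsed.tzinfo is None:
            # Assumption: timestamps without an offset are already UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp: {raw!r}")


if __name__ == "__main__":
    for sample in (
        "10/Oct/2023:13:55:36 -0700",
        "2023-10-10T22:55:36+0200",
        "2023-10-10 20:55:36",
    ):
        print(sample, "->", to_utc_iso(sample))
```

All three sample strings describe the same instant, so they normalize to the same value. Anything that matches none of the known formats raises an error rather than being guessed at, which keeps bad rows out of the load stage.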
Kapow Katalyst offers the industry’s most robust, comprehensive data transformation and normalization capabilities to allow you to work with the Web’s unstructured data. Katalyst can handle different data types, formats, spacing, binary data, arithmetic operations, HTML, XTML, tags, and more. All of this can be done conditionally, with unlimited if-then/else conditions for flexibility and the ability to invoke Java, .NET, or other services as required.
All of this logic can be programmed through Katalyst’s Design Studio, a visual integrated development environment (IDE). The transformations can then be tested in real time and shared with other members of the team.
You can learn more about Kapow’s transformation capabilities on the Katalyst Platform’s Business Rules and Data Transformations page.