Update of "Reference: Apache Tika"
Overview

Artifact ID: ce68fc0d5ae87e30fa9e1b14f52f65338cff6eb4
Page Name: Reference: Apache Tika
Date: 2020-01-29 22:42:46
Original User: martin_vahi
Parent: a0f0d070140d0bfd6917019f7778da2936acf9bd
Content

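Apache Tika is a toolkit for extracting text and metadata from many file formats. It is typically used as a search-engine component, for example to turn documents into plain text that an indexer can consume.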

A 2020_01_30 citation from tika.apache.org: "The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more."
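
The "single interface" idea can be illustrated with a minimal sketch that drives the tika-app command-line tool from Python. It assumes that Java is installed and that a tika-app JAR has been downloaded; the JAR path, the document path and the helper names build_tika_command and run_tika are hypothetical and chosen only for illustration. The same call shape is used regardless of the input file type, and only the output mode changes.

```python
import subprocess

# Output modes of the tika-app command-line tool, mapped to its flags.
TIKA_MODES = {
    "text": "--text",          # plain-text content of the document
    "metadata": "--metadata",  # metadata as key: value lines
    "json": "--json",          # metadata as JSON
    "detect": "--detect",      # detected media type only
}


def build_tika_command(jar_path, document_path, mode="text"):
    """Return the argument list for one tika-app invocation.

    jar_path and document_path are assumptions supplied by the caller;
    nothing is checked against the file system here.
    """
    if mode not in TIKA_MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    return ["java", "-jar", str(jar_path), TIKA_MODES[mode], str(document_path)]


def run_tika(jar_path, document_path, mode="text"):
    """Run tika-app and return its standard output as a string.

    Raises subprocess.CalledProcessError if the process exits with an error.
    """
    result = subprocess.run(
        build_tika_command(jar_path, document_path, mode),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical usage, with placeholder paths:
#   print(run_tika("tika-app.jar", "report.pdf", mode="text"))
#   print(run_tika("tika-app.jar", "slides.ppt", mode="metadata"))
```

Building the command separately from running it keeps the sketch easy to inspect without a Java installation. For heavier use, Tika's own Java API or the Tika server may be a better fit than starting one JVM per file.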