RDig 0.2.1
With this release, RDig, my Ferret-based all-in-one site search solution ;-), can also index PDF and MS Word files.
RDig delegates all the hard work to the pdftotext and wvHtml command line utilities, so you need to have the xpdf-utils and wv packages installed to use this feature.
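To give a rough idea of what delegating to a command line utility looks like, here is a minimal sketch for the PDF case. It is not RDig's actual code: the method names pdftotext_command and extract_pdf_text are made up for illustration. The only thing it relies on is that pdftotext writes to standard output when the output file is given as "-".

```ruby
require "open3"

# Hypothetical helper, not part of RDig: builds the pdftotext command line.
# The trailing "-" asks pdftotext to write the text to standard output.
def pdftotext_command(pdf_path)
  ["pdftotext", pdf_path, "-"]
end

# Hypothetical helper: runs the command and returns the extracted text.
# Raises if pdftotext exits with a non-zero status.
def extract_pdf_text(pdf_path)
  output, status = Open3.capture2(*pdftotext_command(pdf_path))
  raise "pdftotext failed for #{pdf_path}" unless status.success?
  output
end
```

Passing the command as an array avoids going through the shell, so file names with spaces or quotes need no escaping. If pdftotext is not installed, Open3.capture2 raises Errno::ENOENT.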
It should also be easier now to plug in custom content extractors: they are auto-discovered and used for the content types they declare they can handle.
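One common way to get this kind of auto-discovery in Ruby is the inherited hook, sketched below. This is only an illustration of the general idea, not RDig's real API: the class names ContentExtractor, PdfExtractor and WordExtractor, and the methods handles and for_content_type, are all assumptions made up for this example.

```ruby
# Hypothetical base class; names are illustrative, not RDig's actual API.
class ContentExtractor
  @registry = []

  class << self
    attr_reader :registry

    # Ruby calls this hook whenever a subclass is defined,
    # so every extractor registers itself without extra wiring.
    def inherited(subclass)
      super
      ContentExtractor.registry << subclass
    end

    # Declares the content types a subclass is able to handle.
    def handles(*types)
      @content_types = types
    end

    def content_types
      @content_types || []
    end

    # Returns the first registered extractor class for a content type, or nil.
    def for_content_type(type)
      ContentExtractor.registry.find { |klass| klass.content_types.include?(type) }
    end
  end
end

# Defining a subclass is all it takes to plug in a new extractor.
class PdfExtractor < ContentExtractor
  handles "application/pdf"
end

class WordExtractor < ContentExtractor
  handles "application/msword"
end
```

With this in place, ContentExtractor.for_content_type("application/pdf") returns PdfExtractor, and an unknown content type returns nil.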
As always, any feedback is very welcome.