Fixing PDF indexing / missing pdf_to_text transform in Plone

If poppler-utils was not installed when a Plone site was created, the PDF transforms are missing. Installing poppler-utils afterwards will not add them automatically. Here is how to fix that from the debug shell.

First, make sure the system dependencies are installed: we need poppler-utils and its dependencies.
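
A quick way to confirm that the binaries are actually visible is to look them up on the PATH. This is only a minimal sketch: the helper name missing_poppler_tools is made up for illustration, and it assumes the PDF transforms rely on the pdftotext and pdftohtml executables that poppler-utils ships.

```python
import shutil


def missing_poppler_tools(tools=("pdftotext", "pdftohtml")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Run it as the same user that starts the Plone instance, since that user's PATH is the one that matters. An empty list means both tools were found.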

To open the debug shell, run ./bin/instance debug. Then run the following Python commands.

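import transaction
from zope.component.hooks import setSite
from Products.PortalTransforms.transforms import initialize

# "Plone" is the ID of the site; adjust it if yours differs.
portal = app.Plone
setSite(portal)
# Re-run the transform registration so the PDF transforms get added.
initialize(portal.portal_transforms)
# Persist the change to the ZODB.
transaction.commit()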

Note

In debug mode, app is your Zope root object. If your Plone site has the ID Plone, you can reach it as app.Plone.
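
If you are not sure what your site is called, you can ask the Zope root for it. The following is a sketch, not part of the original recipe: the helper name plone_site_ids is hypothetical, and it only looks at objects stored directly in the Zope root, so sites nested inside folders will not show up.

```python
def plone_site_ids(app):
    """Return the IDs of Plone sites stored directly in the Zope root.

    Relies on objectIds() accepting a meta_type filter and on Plone
    sites having the meta_type "Plone Site".
    """
    return list(app.objectIds("Plone Site"))
```

In the debug shell, plone_site_ids(app) gives you the candidate IDs, and portal = app[site_id] then fetches the site you want.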

After restarting your instance, you should have pdf_to_text and pdf_to_html in your portal_transforms tool. It makes sense to reindex all your content afterwards, so that existing PDF files get indexed.
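
Both steps can also be done from the debug shell. The sketch below rests on a few assumptions: the helper name missing_pdf_transforms is made up for illustration, portal is set up as in the commands above, and a full catalog rebuild is acceptable on your site (it can take a long time on large sites).

```python
def missing_pdf_transforms(transform_ids, wanted=("pdf_to_text", "pdf_to_html")):
    """Return the wanted transform IDs that are absent from transform_ids."""
    present = set(transform_ids)
    return [name for name in wanted if name not in present]


# In the debug shell, with `portal` set up as shown earlier:
#   missing_pdf_transforms(portal.portal_transforms.objectIds())
#   portal.portal_catalog.clearFindAndRebuild()  # full reindex, can be slow
#   import transaction; transaction.commit()
```

An empty list from missing_pdf_transforms means both transforms are registered. If you would rather not rebuild the whole catalog, reindexing only the File objects should be enough for the PDFs.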

By @MrTango