How to search a post by the attached PDF file

    The Problem

    Sometimes you want to search for posts that have a PDF file attached to them. Assume you don't want to display the attached files themselves in the search results. Instead, you want the parent posts (possibly custom post types) to appear in the results when the query matches the content of their attached files.

    The solution can be complex because there are many ways a file can be attached to a post. For example:

    1. The simplest case: the attachment record stores the post ID in its post_parent field. In this case, WordPress considers the attachment "uploaded to" the post.

    2. The PDF attachment ID is stored in one of the post's meta fields.

    3. A direct PDF link or file path is stored in one of the post's meta fields.

    4. A PDF link appears somewhere in post_content, for example in text that contains one or more direct links to PDF files.

    5. The post_content contains a shortcode that expands into something more elaborate, for example a PDF viewer widget or a real3D book.

    6. Imagine more cases...
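
    To make cases 1 to 4 concrete, here is a minimal, language-neutral sketch in Python. It is not WordPress code: the Post class, its fields, and the find_pdf_sources function are hypothetical stand-ins for post and attachment records, and the meta keys in the usage note are made up. Case 5 (shortcodes) is left out on purpose, because each shortcode would need its own handling.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, simplified stand-in for a post or attachment record.
# This is NOT the real WordPress schema.
@dataclass
class Post:
    id: int
    content: str = ""
    meta: dict = field(default_factory=dict)
    post_parent: int = 0
    mime_type: str = ""
    url: str = ""

# Matches direct http(s) links that end in ".pdf".
PDF_LINK = re.compile(r"""https?://[^\s"'<>\]]+\.pdf""", re.IGNORECASE)

def find_pdf_sources(post, attachments):
    """Return a sorted list of PDF URLs/paths tied to `post` via cases 1-4."""
    by_id = {a.id: a for a in attachments}
    found = set()

    # Case 1: the attachment points at the post through post_parent.
    for a in attachments:
        if a.post_parent == post.id and a.mime_type == "application/pdf":
            found.add(a.url)

    for value in post.meta.values():
        # Case 2: a meta field holds the ID of a PDF attachment.
        if isinstance(value, int) and value in by_id:
            if by_id[value].mime_type == "application/pdf":
                found.add(by_id[value].url)
        # Case 3: a meta field holds a direct PDF link or file path.
        elif isinstance(value, str) and value.lower().endswith(".pdf"):
            found.add(value)

    # Case 4: direct PDF links inside the post content.
    found.update(PDF_LINK.findall(post.content))
    return sorted(found)
```

    For example, a post with ID 10 that has one uploaded PDF, a meta field "brochure_id" pointing at another PDF attachment, a meta field "manual" holding a file path, and one PDF link in its content would yield four sources. An image attachment with the same parent is ignored.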

    So how should we handle all these cases?

    Basic Idea of the Algorithm

    The solution is almost impossible with the usual "direct" WordPress search. Indexed search, however, is a "silver bullet" for this type of task.

    Here is the main idea: we need to extract the content from the attached file(s) and put it into a specific search index cluster of the parent post.
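
    The idea can be illustrated with a toy inverted index, sketched below in Python. This is an assumption-laden illustration, not how any particular search plugin stores its index: the PostIndex class and the cluster names ("post_title", "post_content", "attached_files") are hypothetical, and the PDF text is assumed to have been extracted already. The point is that the file's text is indexed under the parent post's ID, so a match in the file returns the parent post rather than the attachment.

```python
import re
from collections import defaultdict

class PostIndex:
    """Toy inverted index: token -> {post_id -> set of cluster names}."""

    def __init__(self):
        self._postings = defaultdict(lambda: defaultdict(set))

    def add(self, post_id, cluster, text):
        """Index `text` under `post_id`, remembering which cluster it came from."""
        for token in re.findall(r"\w+", text.lower()):
            self._postings[token][post_id].add(cluster)

    def search(self, query):
        """Return sorted IDs of posts that contain every query token in any cluster."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return []
        hits = [set(self._postings.get(t, {})) for t in tokens]
        return sorted(set.intersection(*hits))

    def matched_clusters(self, post_id, query):
        """Return sorted names of the post's clusters that contain any query token."""
        found = set()
        for t in re.findall(r"\w+", query.lower()):
            found |= self._postings.get(t, {}).get(post_id, set())
        return sorted(found)


index = PostIndex()
index.add(10, "post_title", "Annual report")
index.add(10, "post_content", "Download the report below.")
# Text extracted from the PDF is stored under the PARENT post (ID 10).
index.add(10, "attached_files", "Revenue grew in the fourth quarter")
index.add(11, "post_content", "Unrelated news")

print(index.search("revenue quarter"))          # parent post found via its PDF text
print(index.matched_clusters(10, "revenue"))    # which cluster produced the match
```

    Keeping the file text in its own cluster, rather than appending it to the post content, leaves room to weight or exclude file matches separately.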
