
How to search files by content when using the PDFViewer plugin


  • There is a good (albeit somewhat outdated) plugin for embedding PDF files directly into a WordPress page: PDFViewer. With it, you insert a special shortcode into a page or post and put a direct link to the PDF file inside that shortcode. This article shows how to make posts with embedded PDF files searchable by the content of those files.

    It is extremely convenient. However, for the WPFTS plugin to index the contents of the embedded file and include them in the page/post index, a little bit of coding magic is needed. This will make the parent page searchable by the content of the files embedded in it!

    I want to show you how to write this magic code.

    So, we need to add a wpfts_index_post hook handler that first checks the type of the post being processed. If the post_type matches one we want, the handler looks for [pdfviewer] tags in the post content and takes the link to the PDF file from each tag. Next, it pulls the text content of each of these PDFs and adds it to the index of the parent post. Finally, it re-renders the post content with the [pdfviewer] shortcode muted, so that the shortcode itself does not show up in search results.

    Seems simple? Now, look at the code. I have added comments to make it clear what happens where. If you still have questions, ask in the comments.

    add_filter('wpfts_index_post', function($index, $p)
    {
        global $wpfts_core;

        // Check if we are processing the correct post_type
        if (in_array($p->post_type, array('post', 'page'))) {
            global $post;

            // Look for [pdfviewer ...]URL[/pdfviewer] tags using a regular expression.
            // The attribute part is optional, so a bare [pdfviewer] tag matches too.
            if (preg_match_all('~\[pdfviewer(?:\s+[^\]]*)?\]([^\[]*)\[/pdfviewer\]~s', $p->post_content, $zz)) {
                // $zz[0] holds the whole shortcode tags
                // $zz[1] holds the file URLs

                // Let's include WPFTS Utils (we will need it below)
                require_once $wpfts_core->root_dir.'/includes/wpfts_utils.class.php';

                // We are going to collect extracted texts here
                $sum = '';

                foreach ($zz[1] as $url) {
                    $url = trim($url);
                    if (preg_match('~^https?://~i', $url)) {
                        // In case we have a correct URL, let's extract the text from that file.
                        // This method uses caching to prevent repeated extractions
                        $ret = WPFTS_Utils::GetCachedFileContent_ByLocalLink($url);

                        // Append extracted content
                        $sum .= (isset($ret['post_content']) ? trim($ret['post_content']) : '').' ';
                    }
                }

                // Store extracted texts in a separate cluster
                // (start from an empty string if the cluster does not exist yet)
                $index['pdfviewer_content'] = (isset($index['pdfviewer_content']) ? $index['pdfviewer_content'] : '').$sum;

                // But we are not finished yet. Let's remove [shortcodes] from the content
                // to be sure they will not appear in search results
                global $shortcode_tags;

                // Temporarily disable the shortcode processor for [pdfviewer]
                $removed_tmp = array();

                $shortcode_list = array('pdfviewer');

                foreach ($shortcode_list as $dd) {
                    if (isset($shortcode_tags[$dd])) {
                        $removed_tmp[$dd] = $shortcode_tags[$dd];    // Save shortcode function
                        unset($shortcode_tags[$dd]);
                        add_shortcode($dd, function(){ return ''; });    // Dummy function to render empty string for shortcode
                    }
                }

                // Render post content with shortcodes
                $post = get_post($p->ID);
                setup_postdata($post);

                ob_start();
                the_content();

                $r = ob_get_clean();
                // Put a space before each tag so that words do not stick together after stripping
                $r = strip_tags(str_replace('<', ' <', $r));

                // Restore disabled shortcode processors
                foreach ($removed_tmp as $k => $d) {
                    $shortcode_tags[$k] = $d;
                }

                wp_reset_postdata();

                // Okay, we are done. Just store the result in the cluster
                $index['post_content'] = $r;
            }
        }

        return $index;
    }, 3, 2);
    

    Using this method, you can create your own handler for any such shortcodes.
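
    As a rough illustration of adapting the matching step to another shortcode, here is a minimal standalone sketch in plain PHP (no WordPress or WPFTS required). The helper name extract_enclosed_urls() and the sample content are hypothetical, made up for this example, and the pattern is a simplified one that assumes the shortcode wraps a single URL in the form [tag ...]URL[/tag].

```php
<?php
// Hypothetical helper (not part of WPFTS or WordPress): collects the URLs
// enclosed by [tag ...]URL[/tag] shortcodes for a given tag name.
function extract_enclosed_urls(string $content, string $tag): array
{
    // preg_quote() escapes the tag name so it is matched literally
    $quoted = preg_quote($tag, '~');
    $pattern = '~\[' . $quoted . '(?:\s[^\]]*)?\]([^\[]*)\[/' . $quoted . '\]~s';

    $urls = array();
    if (preg_match_all($pattern, $content, $matches)) {
        foreach ($matches[1] as $candidate) {
            $candidate = trim($candidate);
            // Keep only absolute http(s) links
            if (preg_match('~^https?://~i', $candidate)) {
                $urls[] = $candidate;
            }
        }
    }

    return $urls;
}

// Made-up sample content: one tag with attributes, one bare tag, one non-URL
$content = 'Intro [pdfviewer width="600px"]https://example.com/a.pdf[/pdfviewer] '
         . 'and [pdfviewer]https://example.com/b.pdf[/pdfviewer] '
         . 'plus [pdfviewer]not-a-url[/pdfviewer].';

print_r(extract_enclosed_urls($content, 'pdfviewer'));
```

    Inside a wpfts_index_post handler, the returned URLs would then go through the same text extraction and be stored in a cluster of their own, as in the handler above.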

    If you don't want to mess with code, simply download and install this addon:
    WPFTS Addon for PDFViewer (zip)
