Version 1.30.85 - big improvements and fixes
-
A new version of WPFTS with a fundamentally new search algorithm is out now!
Over time, we accumulated a lot of feedback from users asking for two search improvements: significantly faster search over large numbers of documents, and the ability to search for a phrase. This was something of a challenge for us, since it seemed almost impossible to implement a full-fledged indexed search with the limited power of PHP and MySQL.
New Search Algorithm
In the end, we managed to get close to that goal. To achieve this, the search algorithm had to be completely rewritten. Yes, it's still TF-IDF, but with significant additions to the logic and the relevance calculation. In particular, we have added a "phrase bonus", which ranks documents containing similar phrases higher in the results. Note that a phrase match here does not have to be EXACT: partial word matches count too, as do query words that appear near each other without being directly adjacent.
For example, for the search phrase "infinite space", relevance is increased for documents that contain the phrases "infinite distant space" and even "space seems to be infinite". Word order within the phrase is currently ignored. An option to enforce strict word order will appear in a future update.
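To make the idea of a "phrase bonus" more concrete, here is a minimal Python sketch. It is not the plugin's actual code (WPFTS is written in PHP, and its real formula is not described here). The function names, the `max_span` cutoff and the `bonus_weight` multiplier are all assumptions made for illustration. The sketch only handles whole words; partial word matches are left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf_score(query_terms, doc_tokens, doc_freq, n_docs):
    """Plain TF-IDF: sum of tf * idf over the query terms found in the document."""
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        if counts[term] == 0:
            continue
        tf = counts[term] / len(doc_tokens)
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        score += tf * idf
    return score


def phrase_bonus(query_terms, doc_tokens, max_span=6):
    """Hypothetical bonus: find the smallest window that contains every query
    term in any order; the tighter the window, the larger the bonus."""
    wanted = set(query_terms)
    if len(wanted) < 2:
        return 0.0
    last_seen = {}  # query term -> most recent position in the document
    best = None
    for pos, token in enumerate(doc_tokens):
        if token in wanted:
            last_seen[token] = pos
            if len(last_seen) == len(wanted):
                span = pos - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    if best is None or best > max_span:
        return 0.0
    return len(wanted) / best  # 1.0 when the terms are directly adjacent


def rank(query, docs, bonus_weight=1.0):
    """Return document names ordered by TF-IDF boosted with the phrase bonus."""
    q = tokenize(query)
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    doc_freq = Counter(t for toks in tokenized.values() for t in set(toks))
    scores = {}
    for name, toks in tokenized.items():
        base = tf_idf_score(q, toks, doc_freq, len(tokenized))
        scores[name] = base * (1.0 + bonus_weight * phrase_bonus(q, toks))
    return sorted(scores, key=scores.get, reverse=True)
```

With this sketch, "infinite distant space" gets a larger bonus than "space seems to be infinite", because the two query words sit in a tighter window. A document where the words are far apart gets no bonus at all.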
Deep searching is faster
The speed of "deep search" has also been significantly increased. By default, the query "cat" only finds words such as "catty" or "catalog", that is, words that begin with the search term. When you enable "deep search", words such as "subcategory", "procatering" and "mystification" are also found, that is, words in which "cat" occurs in the middle or at the end.
Previously, a "deep" search practically brought the algorithm to a standstill, and a search could take 10, 20, or even more seconds. Now, with our new algorithm, deep search has become much faster: even on a large number of documents, it can take only 2-3 times longer than a regular "non-deep" search.
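The difference between the default prefix match and "deep search" can be sketched in a few lines of Python. This only illustrates the matching rules and why the deep variant tends to cost more. The function names are hypothetical, and how WPFTS actually implements either mode is not shown here.

```python
import bisect


def prefix_matches(query, sorted_words):
    """Default mode: words that START with the query.
    On a sorted list this is a cheap range lookup, similar to how a database
    index can serve a "starts with" condition."""
    q = query.lower()
    lo = bisect.bisect_left(sorted_words, q)
    # "\uffff" acts as an upper bound for ordinary words with this prefix.
    hi = bisect.bisect_left(sorted_words, q + "\uffff")
    return sorted_words[lo:hi]


def deep_matches(query, words):
    """Deep mode: words that contain the query ANYWHERE.
    Sorting does not help here, so every word has to be checked."""
    q = query.lower()
    return [w for w in words if q in w]


words = sorted(["catty", "catalog", "subcategory", "procatering", "mystification", "dog"])
print(prefix_matches("cat", words))
print(deep_matches("cat", words))
```

The prefix lookup touches only a small slice of the sorted list, while the deep lookup scans the whole vocabulary. That is why substring search is usually the slower of the two.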
No character limit anymore
Thanks to the new algorithm, we were able to remove the 3-character minimum on query length. Previously, this limit was necessary because a search for one or two characters touched almost the entire search index, so the query could run for a very long time and even crash the script. Now queries of one or two characters return only a limited number of matches, so they no longer significantly slow down or crash the script.
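As a rough illustration of capping the matches for very short queries, consider the Python sketch below. The threshold names and values (`SHORT_QUERY_LEN`, `MAX_SHORT_MATCHES`) are made up for this example and do not reflect the plugin's real settings.

```python
from itertools import islice

SHORT_QUERY_LEN = 2      # hypothetical: queries this short are treated as "short"
MAX_SHORT_MATCHES = 50   # hypothetical cap on matched words for short queries


def matching_words(query, vocabulary,
                   short_len=SHORT_QUERY_LEN, max_short=MAX_SHORT_MATCHES):
    """Return vocabulary words that start with the query.
    For very short queries, stop after `max_short` hits instead of
    walking through almost the whole vocabulary."""
    q = query.lower()
    hits = (word for word in vocabulary if word.startswith(q))  # lazy generator
    if len(q) <= short_len:
        return list(islice(hits, max_short))
    return list(hits)
```

Because the generator is lazy, a one-character query stops as soon as the cap is reached, so its cost no longer grows with the size of the index.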
Faster with the Word Index
We have introduced one more small index: the word index. It significantly increases search speed on large numbers of documents, since it takes part of the load off MySQL by eliminating the need to join huge tables.
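One way to picture a word index is as a small table of distinct words that is consulted first, so that the large per-document data is only touched for words that actually match. The Python sketch below is an assumption about the general idea, not a description of the plugin's real tables. The class and attribute names are hypothetical.

```python
from collections import defaultdict


class WordIndex:
    """Toy model: a small word -> id table plus postings keyed by word id."""

    def __init__(self):
        self.word_ids = {}                # small table: one entry per distinct word
        self.postings = defaultdict(set)  # large data: word id -> document ids

    def add_document(self, doc_id, text):
        for word in text.lower().split():
            # Assign the next free id the first time a word is seen.
            word_id = self.word_ids.setdefault(word, len(self.word_ids))
            self.postings[word_id].add(doc_id)

    def search_prefix(self, prefix):
        """Resolve the prefix against the small word table first, then fetch
        postings only for the matching word ids."""
        prefix = prefix.lower()
        matched_ids = [wid for word, wid in self.word_ids.items()
                       if word.startswith(prefix)]
        docs = set()
        for wid in matched_ids:
            docs |= self.postings[wid]
        return docs
```

The word table grows with the number of distinct words rather than with the total number of word occurrences, so matching against it stays comparatively cheap as the number of documents grows.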
MyISAM support was dropped
We have dropped support for the MyISAM table type and removed the corresponding option and all associated algorithms from the code. Tables of this type lead to "locking" all too often, and although they sometimes give a 10-20% boost to search speed, we decided to abandon MyISAM in favor of the more modern InnoDB.
Faster re-indexing
Indexing and re-indexing of documents have also been significantly accelerated, although the new "word index" slightly increases the re-indexing time. We are still working on improving this part of the plugin.
Work on improving the plugin is very active: only about 40% of the improvements planned for this year have been implemented so far. Your feedback, and your understanding that bugs may occur, are therefore very important to us. If you notice an error or inaccuracy, please report it right away on the plugin support forum.
Let me remind you that I have a Patreon page. If you want to support the development of this awesome plugin, I would be very grateful.
Thank you for using WPFTS!