Integrating Zend_Lucene with Yii

December 5, 2009 — 23 Comments

I’m not a big fan of using the Zend Framework as my Web development tool, but one of the framework’s nicest features is that you can use only the parts of it you need. I am, however, a big fan of the Yii framework, and one of its many plusses is that you can easily integrate other frameworks and tools into it. Like, for example, the Zend Framework. Yii does not have its own search engine functionality, and Apache’s Lucene is arguably the gold standard (although clearly not the only choice), so tapping into Zend’s Lucene module for a Yii-driven site makes a lot of sense. In this post, I’ll walk you through the steps for integrating Zend_Lucene into Yii. This post does assume familiarity with PHP, MVC, and Yii.

To start, let’s create a spot in the Yii application for the Zend Framework. Create a new directory called vendors within the Yii protected folder. This isn’t required, but as the Zend Framework is a different beast than all the Yii code, I think it’s best to separate it out. Within vendors, create a directory called Zend (or ZendFramework, if you’d rather).

Next, download the Zend Framework. You’ll want to download the latest full package, even though you’ll only use a bit of it. After the download has completed, expand the ZIP or TAR.GZ file (whichever format you chose to download the framework in). The result will be a folder named ZendFramework-x.y.z (where x.y.z represents the full version number). Within that folder, go into library/Zend and copy Exception.php to protected/vendors/Zend. This is the file that the Zend Framework uses to report problems, so you’ll want to include it while developing and debugging Zend_Lucene with Yii. Also copy the Search folder to protected/vendors/Zend. You’ll end up with a protected/vendors/Zend directory that contains Exception.php and the Search folder.

In terms of the MVC architecture, the Zend Framework provides the Model to be used by this search process, but the Controller and View will still be done using Yii. First, let’s write a new Controller for searching:

class SearchController extends CController
{
    private $_indexFile = '../runtime/search';
    public function actionIndex() {}
    public function actionCreate() {}
    public function actionSearch() {}
    public function actionUpdate() {}
}

As with all Yii Controllers, this one extends the base CController class. Within this Controller the various methods are defined, corresponding to the actions that’ll be taken in the search process. The index action is the default and is for accessing the search page without performing an actual search (e.g., clicking on a link to go to the search page). The create action will be used to generate the search database: the series of files that Lucene needs to perform its searches. The search action is for handling submission of the search form (i.e., it does the actual searching). Finally, the update action is for updating the Lucene database files when necessary (like when the site content changes). The class also has one private variable that stores the location on the server of Lucene database files. I chose to put them in a search folder found within runtime (protected/runtime/search). This class member is good to have as multiple methods will need this information but I create it as a private variable as it’s not necessary (nor should it be accessed) outside of the class. As a naming convention, some like to use underscores at the front of private class variables.
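
One caveat: a relative path such as '../runtime/search' is resolved against the working directory of the running script, not against the Controller file, so it may point somewhere unexpected. A minimal sketch of building an absolute path instead follows. The function name is hypothetical (it is not part of Yii or the Zend Framework), and it assumes the index lives in a search folder inside the runtime directory, as described above.

```php
<?php
// Hypothetical helper: builds an absolute location for the Lucene
// index files from the application's runtime directory.
function searchIndexPath($runtimePath)
{
    // Strip any trailing slash so the result never contains a doubled separator.
    return rtrim($runtimePath, '/\\') . DIRECTORY_SEPARATOR . 'search';
}

// In a Yii 1.x application the runtime directory is available from
// Yii::app()->getRuntimePath(), so the Controller could assign:
//     $this->_indexFile = searchIndexPath(Yii::app()->getRuntimePath());
```

Because a property default has to be a constant expression, that assignment would need to happen at run time, for example in the Controller’s init() method, rather than in the property declaration.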

Within three of the methods (not actionIndex()), the Controller will use Zend_Lucene. In order to do so, this script needs access to the Zend files, so import the contents of the vendors directory at the top of this script, just before the class definition begins:

Yii::import('application.vendors.*');

Then, include the Lucene.php file, found within the Zend Framework Search folder:

require_once('Zend/Search/Lucene.php');

Now this Yii Controller can create objects of type Zend_Search_Lucene, which is defined in that file. The actions will use that object type to perform the searches. To start, the index action just renders the index View:

public function actionIndex()
{
    $this->render('index');
}

Presumably the index View file just shows the search form. The search form, by the way, should have an action attribute of http://www.example.com/index.php/search/search, so that it calls the search action of the search Controller. The form should use the GET method, because the search action looks for the terms in $_GET, and it should contain a text input with the name terms.
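
As a rough sketch of the markup that form needs, here is a plain PHP function that returns it. The function name is made up for illustration, and the URL you pass in is whatever routes to the search action in your application. Yii’s CHtml helpers could generate equivalent markup in a real View.

```php
<?php
// Hypothetical helper: returns the markup for a minimal search form.
// $actionUrl is the URL that routes to the search action.
function renderSearchForm($actionUrl)
{
    // Escape the URL before placing it in an HTML attribute.
    $action = htmlspecialchars($actionUrl, ENT_QUOTES, 'UTF-8');

    // method="get" so that the submitted value arrives as $_GET['terms'].
    return '<form action="' . $action . '" method="get">'
         . '<input type="text" name="terms" />'
         . '<input type="submit" value="Search" />'
         . '</form>';
}

// Example use inside the index View:
//     echo renderSearchForm('http://www.example.com/index.php/search/search');
```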

The update action would be used by an administrator to update the search database. Perhaps it’d be called automatically after some content is generated or once per hour or day. It would destroy the existing search database and then invoke the actionCreate() method. Lucene can’t modify an indexed document in place: a changed document has to be deleted and added again, so for a small site the simplest approach is to destroy and recreate the whole database. Which View this action renders doesn’t much matter and depends upon what you want the admin to see. Maybe the View would just show a message indicating that the database has been updated.
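
The destroy-then-recreate idea could be sketched like this. The helper name is hypothetical, and the sketch assumes the index directory holds only Lucene’s own files, all directly inside it (no subfolders).

```php
<?php
// Hypothetical helper: deletes the files directly inside the index
// directory and returns how many were removed. Subdirectories are
// left untouched.
function clearSearchIndex($indexDir)
{
    if (!is_dir($indexDir)) {
        return 0; // Nothing to destroy yet.
    }
    $removed = 0;
    foreach (scandir($indexDir) as $entry) {
        $path = $indexDir . DIRECTORY_SEPARATOR . $entry;
        // is_file() skips the "." and ".." entries and any subfolders.
        if (is_file($path) && unlink($path)) {
            $removed++;
        }
    }
    return $removed;
}

// Inside SearchController, the update action might then read:
//     public function actionUpdate()
//     {
//         clearSearchIndex($this->_indexFile);
//         $this->actionCreate();
//     }
```

Calling actionCreate() directly means the create View is what gets rendered; a dedicated update View would require restructuring the two methods a bit.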

The create action is an important one, and is where real knowledge of Lucene comes into play. The shell of it would look like so:

public function actionCreate() {
    $index = new Zend_Search_Lucene($this->_indexFile, true);
    // Add documents to the database.
    $index->commit();
    $this->render('create');
}

First, a Zend_Search_Lucene object is created (again, this is where Yii is making use of a class defined outside of Yii; it’s a sweet thing). The first argument provided when creating the object is the location of the database files. This is represented by the Controller variable, accessible in $this->_indexFile. The second argument indicates that a fresh database should be created. Next up, you add content to the database. This is complicated and well beyond the scope of what I’m writing here. I’ll try to discuss this, in brief, in a separate post, but I’d recommend you read as much as you can online first. In a very minimalistic way, you could add a single HTML page to the search database by doing this:

$url = 'http://www.example.com/index.php/page/show/id/1';
$doc = Zend_Search_Lucene_Document_Html::loadHTMLFile($url);
$index->addDocument($doc);

Finally the database has to be saved, by invoking the commit() method. And then some View is rendered. As this action would also only be likely called by an administrator or cron, it doesn’t matter much what the View contains.
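
If your content lives in a database rather than in standalone HTML pages, a document can also be assembled field by field. The following is only a sketch: the function name and the field names (title, url, contents) are my own illustrative choices, and it assumes Zend/Search/Lucene.php has already been loaded, as in the Controller above.

```php
<?php
// Sketch only: builds one Lucene document from three pieces of content.
// Assumes the Zend_Search_Lucene classes are already loaded.
function buildPageDocument($title, $url, $contents)
{
    $doc = new Zend_Search_Lucene_Document();

    // Text: indexed and stored, so it is searchable and can be shown in results.
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $title));

    // UnIndexed: stored but not searchable; useful for linking to the page.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('url', $url));

    // UnStored: searchable but not stored, which keeps the index smaller.
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $contents));

    return $doc;
}

// Within actionCreate(), before commit():
//     $index->addDocument(buildPageDocument($title, $url, $contents));
```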

Lastly, there’s the search action. This action would check for search terms, run the search against the database, then send the results on to a View:

public function actionSearch() {
    if (isset($_GET['terms'])) {
        $index = new Zend_Search_Lucene($this->_indexFile);
        $results = $index->find($_GET['terms']);
        $this->render('search', array('results' => $results));
    } else {
        $this->render('index');
    }
}

First the method checks for the presence of search terms in the URL. Then it creates a Zend_Search_Lucene object, which is necessary for both creating and using the search database. This time only the location of the search database is passed when creating the object. The object’s find() method is invoked for performing the search (it can be that simple!). Then the search View is rendered, passing it the results. If no search terms were passed to this page, the index View is rendered instead. As for the search results View, a basic version to get you started might look like this:
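
Note that isset() is also true for an empty string, so submitting a blank form would still run a search. One possible way to tighten that check is sketched below; the helper name is hypothetical and nothing here is required by Yii or Lucene.

```php
<?php
// Hypothetical helper: returns the trimmed search terms from a request
// array, or null when no usable terms were submitted.
function extractSearchTerms(array $query)
{
    // Reject a missing value, or one that is not a string (e.g. terms[]=x).
    if (!isset($query['terms']) || !is_string($query['terms'])) {
        return null;
    }
    $terms = trim($query['terms']);

    // Treat whitespace-only input the same as no input.
    return ($terms === '') ? null : $terms;
}

// In actionSearch(), the isset() test could then become:
//     $terms = extractSearchTerms($_GET);
//     if ($terms !== null) { /* run $index->find($terms) */ }
```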

<h2>Search Results for "<?php echo CHtml::encode($_GET['terms']); ?>"</h2>
<?php if ($results): ?>
    <?php foreach($results as $result): ?>
        <p><?php echo CHtml::encode($result->title); ?></p>
    <?php endforeach; ?>
<?php else: ?>
    <p class="error">No results matched your search terms.</p>
<?php endif; ?>

That’s largely the logic and structure of a search results View. It displays the provided search terms and checks for results. If there were some, each result title is printed. In a real application, you’d likely link the title to a URL or whatever but I don’t want to get too messy here. If you do print_r($result), you’ll see a bunch of information there that you can use.
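
Linking each title could look something like the sketch below. It assumes every result exposes a url value alongside title; that is an assumption, since nothing in this post stores a url field in the index, so you would have to add one when creating the documents. The function name is hypothetical.

```php
<?php
// Hypothetical helper: turns search results into a list of linked titles.
// Assumes each result has "title" and "url" properties.
function renderResultLinks($results)
{
    $html = '';
    foreach ($results as $result) {
        // Escape both values before placing them in the markup.
        $url = htmlspecialchars($result->url, ENT_QUOTES, 'UTF-8');
        $title = htmlspecialchars($result->title, ENT_QUOTES, 'UTF-8');
        $html .= '<p><a href="' . $url . '">' . $title . '</a></p>' . "\n";
    }
    return $html;
}

// In the search View, in place of the foreach loop:
//     echo renderResultLinks($results);
```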

So those are the steps you need to take to get started using Zend_Lucene within your Yii application. These steps provide functionality; mastering Lucene is how you make this more professional. I’ll try to write more about defining a Lucene search database in subsequent posts towards that end. If you have any comments, questions, or requests, let me know.

Thanks,

Larry


23 responses to Integrating Zend_Lucene with Yii

  1. Wonderful post! I used Lucene behind Hibernate on the Java platform but now I’m finally able to use it in my PHP-based sites also…

    • Thanks for the nice words. Much appreciated. I’m working on a post on Lucene in general that will help augment the information covered here, although it sounds like you already know Lucene pretty well. Good luck with your PHP!

  2. Thanks, Larry, for these wonderful tutorials, though I have a question about the website I’m working on. I’m using the Yii framework to develop a site, and the client needs me to use PHPBB3 as his forum and Sphider as the PHP search tool for the website. How can I integrate these two applications into the Yii site I’m working on, and in which directory should I place them? Thanks

    • You’re quite welcome and thanks for the nice words. The question really is a matter of how integrated you need this all to be. For Sphider, it should be relatively easy to integrate the public side of that with Yii by creating an actionSearch() method of your SiteController that performs the actual search. The Sphider admin could stay separate, unless you really need it to be homogeneous on the admin side. For PHPBB3, if information needs to be shared between Yii and PHPBB, that’s a bit trickier; otherwise you can put PHPBB in its own directory and then just create a template based upon the primary layout that the Yii pages use.

  3. Do I need a directory such as protected/vendors for this integration?

    • Most likely not, but it depends upon how you’re doing the integration. The protected/vendors is normally where you’d put third-party, separate code libraries to be used through the Yii system. If you’re not using the Yii MVC system, then you wouldn’t put PHPBB and Sphider there.

  4. Thanks again, Larry, for your help, but I have one more question. Can you please show me where to place these applications (phpBB3 and Sphider) in the Yii framework directory? I mean, what should the directory structure look like? Do I need to create something like protected/vendors/phpBB3? Thanks again

    • Please try to be patient. I answer comments as I get to them, but that won’t necessarily be as fast as you would prefer. I just answered this question for you but cannot give you a more specific answer because I don’t know how tightly integrated you’re trying to make the three components (Yii, PHPBB, and Sphider).

  5. Hello Mr. Larry, can you give me a code snippet showing how to save the search results?

  6. Hi Larry, you have given me by far the most insight on integrating Lucene into Yii, thank you for that.

    What I am doing is using the Zend_Search_Lucene_Document_Html class when creating the indexes, because I want my users to be able to search the static content.

    When I display the results, is there any way to display a snippet of the content with the searched word highlighted? I don’t see where I can get a snippet of the surrounding content.

    Any help would be very appreciated.

  7. I would like to ask you many questions, as I am having trouble with the Lucene integration, but instead I will say thank you so much for taking the time to write these posts. You really have helped me out with learning Yii and PHP!
    I am looking forward to the upcoming book too, cheers.

    • Got Lucene running in Yii now with custom indexing. Powerful stuff. For anyone else out there who is having difficulty, check the path the script is executing from before setting the class variable path.

      Basically, I used echo getcwd() as a debug statement in the SearchController’s actionCreate method to help me through this. (I was getting mkdir() permission denied errors even though I was certain that the search directory was writable).

      I altered the path to ‘protected/runtime/search’ as opposed to ‘../runtime/search’. Perhaps, this is due to not having an .htaccess file to rewrite the index.php portion of url yet on a newer Yii install. Anyway, thanks again, this is awesome!

    • Thanks, Brian, for the nice words and the interest in my work. Good luck with your projects!

  8. The structure of the Zend framework seems to have changed.

    I had to go to here to get to/find the search folder.
    http://framework.zend.com/svn/framework/standard/trunk/library/Zend/Search/

    I haven’t finished the tutorial, but I hope I now have the correct files :)

    Side note: there is a spelling error in the first paragraph:
    Yii.To

  9. In ZF2, the search library is not in the library/Zend directory. You can download it from https://github.com/zendframework/ZendSearch.
