Yahoo's Build your Own Search Service in Django
In this tutorial we are going to look at building a simple Django application that integrates with the Yahoo BOSS search framework. More specifically, we're going to be using the BOSS Mashup Framework.
First, let's address the most pressing question: What the hell is Yahoo BOSS? BOSS is Build Your Own Search Service, and it presents us with a fairly low-level interface to Yahoo's search engine, not just to search our own site, but to search pretty much anything. The BOSS Mashup Framework, which is what we are going to be using, is open to any developer and has very few restrictions.
Fussy Details
First, let's get all the little configuration stuff out of the way. There is a fair bit, but none of it is very difficult. As a warning, I'll point out that the BOSS Mashup Framework requires Python 2.5, and won't work with previous versions without some changes [1].
Create a new Django project; let's call it my_search.

    django-admin.py startproject my_search
Create a Django app inside my_search; let's name it yahoo_search.

    python2.5 manage.py startapp yahoo_search
Download the BOSS Mashup Framework, unzip it into the my_search/yahoo_search folder, and rename it to boss.

    unzip boss_mashup_framework_0.1.zip
    rm boss_mashup_framework_0.1.zip
    mv boss_mashup_framework_0.1 boss
Yahoo didn't do a great job of packaging something that just works, so we have to go through a few steps to build the framework. (These sub-instructions are lifted almost directly from the included README file, so it's not that they didn't document it, just that it's a bit of a pain to get working.) In Yahoo's defense, I think the reason they did a 'bad' job of packaging is that they probably ran into some incompatible licenses.

Install Simple JSON if you don't have it installed. You can check by opening a Python 2.5 prompt and typing

    import simplejson

If that didn't work, download Simple JSON and then install it.

    python2.5 setup.py build
    python2.5 setup.py install
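If you would rather run that check from a script than from the interactive prompt, something along these lines should do. It is only a sketch: has_module is a name made up for this example and is not part of the BOSS framework or Simple JSON.

```python
def has_module(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # has_module is a hypothetical helper written for this tutorial only.
    if has_module("simplejson"):
        print("simplejson is installed")
    else:
        print("simplejson is missing; install it before building the framework")
```

Run it with the same interpreter you plan to build the framework with, since each Python installation keeps its own set of packages.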
Create the folder my_search/yahoo_search/boss/deps/. Download dict2xml and xml2dict, extract them into the deps folder, remove the .tgz files, and return to the boss directory.

    tar -xzvf dict2xml.tgz
    tar -xzvf xml2dict.tgz
    rm *.tgz
    cd ..
Now we can finally build the framework.
    python2.5 setup.py build
    python2.5 setup.py install
Next, we have to update the settings in boss/config.json. I only changed the first three settings: appid, email, and org. The appid is the one you were given upon signing up for BOSS.

Check that it all worked by running (from within the boss directory):

    python2.5 examples/ex3.py
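Before running the example script, it can help to confirm that the three settings are actually filled in. The sketch below assumes that appid, email, and org sit at the top level of the JSON object in config.json; the function name missing_settings and the sample values are invented for illustration.

```python
import json  # on Python 2.5, use "import simplejson as json" instead

REQUIRED_KEYS = ("appid", "email", "org")


def missing_settings(config_text):
    """Return the required keys that are absent or blank in the JSON text."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS
            if not str(config.get(key) or "").strip()]


# Assumed layout: the three settings are top-level keys of one JSON object.
sample = '{"appid": "my-app-id", "email": "", "org": "Example Org"}'
print(missing_settings(sample))  # prints ['email']
```

To check the real file, read boss/config.json and pass its contents to missing_settings; an empty list means all three values are set.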
From here on things are going to deviate from the README a bit. We are going to move examples and yos into our yahoo_search directory, move config.json into our my_search directory, and get rid of everything else (well, you might want to keep the examples folder for your own benefit).

    mv config.json ../../
    mv yos ../
    mv examples ../
    cd ..
    rm -r boss
Okay, now we're all done with the setup, and are ready to move on to putting together a simple Django application that uses the BOSS Mashup Framework.
Defining our App
Now that we have all the setup out of the way, we need to decide exactly what our app is going to do. To begin with (fear not, this is poised to turn into a multi-part series where we gradually put together a more interesting app) we're going to do something really simple: search Yahoo News based on the results of a posted form.
Yep. As simple as you can get. We'll make it more interesting afterwards, when we have something that works.
URLs
First, let's edit our project's urls.py to include URLs from our yahoo_search application. my_search/urls.py should look like this:
    from django.conf.urls.defaults import *

    urlpatterns = patterns('',
        (r'^', include('my_search.yahoo_search.urls')),
    )
However, we haven't actually created my_search/yahoo_search/urls.py yet, so let's do that real quick.
    from django.conf.urls.defaults import *

    urlpatterns = patterns('',
        (r'^$', 'my_search.yahoo_search.views.index'),
    )
As you can see by looking at urlpatterns, we're only going to have one view, index, and it is going to be handling everything for us.
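To see why only the site root reaches index, here is a rough simulation of the two patterns using nothing but the re module. It is a simplification rather than Django's actual resolver: it assumes the path has already had its leading slash removed and that include() passes whatever follows the matched prefix on to the app's patterns. The function name resolves_to_index is made up for this sketch.

```python
import re

PROJECT_PATTERN = r'^'   # the pattern in my_search/urls.py
APP_PATTERN = r'^$'      # the pattern in my_search/yahoo_search/urls.py


def resolves_to_index(path):
    """Mimic the two-step match for a path with its leading slash removed."""
    project_match = re.search(PROJECT_PATTERN, path)
    if project_match is None:
        return False
    # Whatever follows the matched prefix is handed to the app's patterns.
    remainder = path[project_match.end():]
    return re.search(APP_PATTERN, remainder) is not None


print(resolves_to_index(""))       # the site root: True
print(resolves_to_index("news/"))  # any other path: False
```

The project pattern matches the empty prefix of every path, so the app pattern sees the whole path and accepts it only when it is empty.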
The index view
Now we're going to write the index view, which will be handling everything for us. Start out by opening my_search/yahoo_search/views.py. Let's begin with all the imports we're going to need.
    from django.shortcuts import render_to_response
    from django import newforms as forms
    from yos.boss import ysearch
    from yos.yql import db
We're going to use render_to_response to render templates, newforms to ask our user for their search terms, ysearch for retrieving data from BOSS, and db to format those retrieved results into something a bit more manageable.
Writing the search function
Now let's write a simple search function we'll use for querying BOSS.
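    def search(str):
        # Ask BOSS for up to ten results from the news vertical.
        data = ysearch.search(str, vertical="news", count=10)
        # Load the raw response into a table whose rows are plain dictionaries.
        news = db.create(data=data)
        return news.rows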
Brief Aside
If you wanted to search Yahoo's web results instead of their news, you'd simply change the line

    data = ysearch.search(str, vertical="news", count=10)

to

    data = ysearch.search(str, count=10)
The data returned by the search function is a list of dictionaries that look like this:
    {u'sourceurl': u'http://www.channelweb.com/',
     u'language': u'en english',
     u'title': u'Google Works With eBay And PayPal To Curtail Phishing',
     u'url': u'http://www.crn.com/security/208808698?cid=ChannelWebBreakingNews',
     u'abstract': u'Google Gmail requires eBay and PayPal to use DomainKeys to authenticate mail in an anti-phish effort',
     u'clickurl': u'http://www.crn.com/security/208808698?cid=ChannelWebBreakingNews',
     u'source': u'ChannelWeb',
     u'time': u'22:26:08',
     u'date': u'2008/07/11'}
The search function is very basic, but will be enough for this initial version of the application. Let's move forward.
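Because each result is a plain dictionary, ordinary Python is enough to post-process the list. As a small sketch, the helpers below combine the date and time fields into a single datetime and sort results newest first. They assume every result carries both fields in the formats shown in the sample above; published_at and newest_first are names invented for this example, not part of the BOSS library.

```python
from datetime import datetime


def published_at(result):
    """Combine a result's 'date' and 'time' fields into one datetime."""
    stamp = "%s %s" % (result["date"], result["time"])
    return datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")


def newest_first(results):
    """Return a new list of results sorted from most recent to oldest."""
    return sorted(results, key=published_at, reverse=True)


# Hypothetical rows carrying only the two fields the helpers rely on.
rows = [
    {"title": "older story", "date": "2008/07/10", "time": "09:15:00"},
    {"title": "newer story", "date": "2008/07/11", "time": "22:26:08"},
]
for row in newest_first(rows):
    print(row["title"])
```

If BOSS ever returns a row without one of those fields, published_at will raise a KeyError, so a real application would want to guard against that.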
A simple newform
Next we need to create a (very) simple newform that we will use to ask our users for their search terms.
    class SearchForm(forms.Form):
        search_terms = forms.CharField(max_length=200)
That's all we'll need for now, so carry on. (I said it was simple.)
Actually implementing the index view
Okay, now let's stop for a moment and consider what the index view needs to accomplish.
- It needs to check if there are any incoming POST parameters.
- If there are POST parameters, it needs to validate them using SearchForm, and then use search to put together the results.
- It needs to use render_to_response to render a template containing a SearchForm, and any search results (if applicable).
Okay, translating that into Python we get our index function:
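    def index(request):
        results = None
        if request.method == "POST":
            # Bind the submitted data to the form and validate it.
            form = SearchForm(request.POST)
            if form.is_valid():
                # The key must match the field name declared on SearchForm.
                search_terms = form.cleaned_data["search_terms"]
                results = search(search_terms)
        else:
            # A plain GET just shows an empty form.
            form = SearchForm()
        return render_to_response('yahoo_search/index.html',
                                  {'form': form, 'results': results})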
Admittedly we haven't written the index.html template yet; that will be our next task. Beyond that, this is a pretty standard Django view.
Filling in the index.html template
First, we need to create the template directory for our yahoo_search application. From inside the my_search/yahoo_search directory:
    mkdir templates
    mkdir templates/yahoo_search
And then create the file templates/yahoo_search/index.html, and open it up in your editor. This is going to be a simple template, containing only an input box for searching, and a listing of the results.
It'll look like this:
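    <html>
      <head>
        <title>My Search</title>
      </head>
      <body>
        <h1>My Search</h1>
        <form action="/" method="post">
          {{ form.as_p }}
          <input type="submit" value="Search">
        </form>
        {% if results %}
          {% for result in results %}
            {% comment %}
            Notice we are using {{ result.clickurl }} instead of
            {{ result.url }}. You might wonder why we are doing
            that, and the answer is pretty simple: because that's
            what Yahoo is asking us to.
            http://developer.yahoo.com/search/boss/boss_guide/univer_api_query.html#url_vs_clickurl
            {% endcomment %}
            <div class="result">
              <div class="title"><a href="{{ result.clickurl }}">{{ result.title }}</a></div>
              <span class="date">{{ result.date }}</span>
              <span class="time">{{ result.time }}</span>
              <span class="source"><a href="{{ result.sourceurl }}">{{ result.source }}</a></span>
              <p class="abstract">{{ result.abstract }}</p>
            </div>
          {% endfor %}
        {% endif %}
      </body>
    </html>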
Download Zip of Files
If you haven't been keeping up, or if your code is behaving strangely, you can grab a zip of all these files. Just unzip these somewhere, fill in the first three entries (your BOSS appid, email, and org) in my_search/config.json, and you'll be ready to take a look at the app in the next step.
Update 7/12/2008: Unfortunately, because of the way the BOSS library is built, it isn't enough to simply copy over the yos folder; instead you will need to follow the installation steps for the BOSS Framework listed above (step #6). Specifically, you need to work through those steps and finish with:

    python2.5 setup.py build
    python2.5 setup.py install

It's a bit of a pain, and I'll see if I can clean things up to make it simpler.
Seeing it work
Now that we've finished building the app, let's fire it up.
python2.5 manage.py runserver
Navigate over to http://127.0.0.1:8000/, and you'll see a friendly search box waiting for you. Type in a search term, hit enter, and voila, you'll see a list of your results. I searched for iPhone and got back a page of news results.
One gotcha I'll point out is that the helper library Yahoo has supplied relies on config.json being in the base directory where Python is being run from. This will be true for your development setup, but won't necessarily be the case on your deployment server. I believe the best solution here would be to add the contents of config.json to your project's settings.py file and tweak the yos/boss/ysearch.py file to load the settings using django.conf.settings instead of from disk.
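A lighter-weight alternative to the settings.py approach (a different technique, not the one proposed above) is to stop depending on the working directory and build an absolute path from a known file instead. The sketch below shows only the path arithmetic; config_path is a made-up helper, and you would still have to edit the place where the library opens config.json to use it, which is not shown here.

```python
import os


def config_path(anchor_file, filename="config.json"):
    """Build an absolute path to filename in the same directory as anchor_file."""
    base_dir = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.join(base_dir, filename)


# Inside a module you would normally pass __file__ as the anchor.
# The relative path below is only an illustration.
print(config_path(os.path.join("my_search", "settings.py")))
```

Because the path is derived from a file's own location, it stays the same no matter which directory the server process happens to be started from.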
Let me know if you have any questions, and I'll try to answer them. Time permitting, I'll continue with another segment or two working on building a slightly more compelling search service than what we have created so far.
Update 7/12: Thanks to Wayne's comments I was able to simplify the search function quite a bit. Specifically, he pointed out that I was using the library to prepend ynews$ to all the dictionaries' keys, then getting upset it was there and removing it manually. Whoops.
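For anyone who does end up with prefixed keys, this is roughly what stripping them by hand involves. It is an illustrative sketch only, not the code that was removed from this post; strip_prefix and the sample row are hypothetical.

```python
def strip_prefix(row, prefix="ynews$"):
    """Return a copy of row with prefix removed from any key that starts with it."""
    cleaned = {}
    for key, value in row.items():
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Hypothetical row with prefixed keys, as described in the update above.
row = {"ynews$title": "Some headline", "ynews$date": "2008/07/11"}
print(sorted(strip_prefix(row).keys()))  # prints ['date', 'title']
```

Not asking the library to add the prefix in the first place, as the simplified search function does, avoids the extra pass entirely.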
[1] I accidentally installed it under Python 2.4 at first, and the first problem it runs into is the renaming of the ElementTree package between 2.4 and 2.5. I didn't go any further with that, so I'm unsure if there is anything else causing problems.