This is Info file pm.info, produced by Makeinfo version 1.68 from the
input file bigpm.texi.
File: pm.info, Node: WWW/Search/AOL/Classifieds/Employment, Next: WWW/Search/AlltheWeb, Prev: WWW/Search, Up: Module List
class for searching Jobs Classifieds on AOL
*******************************************
NAME
====
WWW::Search::AOL::Classifieds::Employment - class for searching Jobs
Classifieds on AOL
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('AOL::Classifieds::Employment');
my $sQuery = WWW::Search::escape_query("unix c++ java");
$oSearch->native_query($sQuery,
{'qcqs' => ':ca:'});
while (my $res = $oSearch->next_result()) {
print $res->company . "\t" . $res->title . "\t" . $res->change_date
. "\t" . $res->location . "\n";
}
DESCRIPTION
===========
This class is an AOL specialization of WWW::Search. It handles making
and interpreting AOL searches at `http://classifiedplus.aol.com' in
category employment->JobSearch.
The returned WWW::SearchResult objects contain url, title, *company*,
location and change_date fields.
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
Format / Treatment of Query Terms
---------------------------------
The default is to match ALL keywords in your query.
{'QY' => 2} - to match at least one word
{'QY' => 5} - to match exact phrase
Restrict by Job Category
------------------------
No restriction by default. To select jobs from a specific job category
use the following option:
{'QVSSCAT' => $job_category}
Possible values of $job_category are the following:
* 10 Accounting/Finance/Banking/Insurance
* 20 Administrative/Clerical
* 30 Creative Arts/Media
* 40 Education/Training
* 50 Engineering/Architecture/Design
* 60 Human resources
* 70 Information Technology/Computer
* 80 Legal/Law Enforcement/Security
* 90 Marketing/Public relations/Advertising
* 100 Medical/Health Care/Dental
* 110 Online/Internet/New Media
* 120 Sales/Customer Service/Sales Management
* 130 Sports
* 140 Travel/Hospitality/Restaurant/Transportation
* 150 Other
Restrict by Company Name
------------------------
{'QM' => $pattern}
Display jobs where company name matches $pattern.
Restrict by Location
--------------------
No preference by default. Several options can restrict your search.
Only one of the below listed options can be enabled at a time.
{'QREG' => $region} - to select a region
Regions can be:
* 1 Mid-Atl
* 2 Midwest
* 3 Northeast
* 4 Northwest
* 5 Southeast
* 6 Southwest
* 7 West
* 8 Outside USA
* 9999 National
{'qcqs' => $state_or_city} - more detailed selection
There are too many possible values to be listed here. See
`http://classifiedplus.aol.com' in category employment->JobSearch for
a full list. Here are some examples from that list: to select jobs
only from California use {'qcqs' => ':ca:'}, for jobs from San
Francisco use {'qcqs' => 'san francisco:ca:807'}.
{'QZ' => $zip_code} - restrict by zip code.
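As a minimal sketch of how the options above might be combined, the following builds the hash that is passed as the second argument to native_query(). The helper name `build_aol_options' is hypothetical and not part of WWW::Search; it only enforces the rule that at most one of QREG, qcqs and QZ is set at a time.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Search): collect option pairs
# into one hash reference, refusing more than one location restriction.
sub build_aol_options {
    my (%opts) = @_;
    my @location = grep { exists $opts{$_} } qw(QREG qcqs QZ);
    die "only one of QREG, qcqs, QZ may be set (got: @location)\n"
        if @location > 1;
    return \%opts;
}

# Exact-phrase match, Information Technology jobs, California only.
my $options = build_aol_options(
    QY      => 5,
    QVSSCAT => 70,
    qcqs    => ':ca:',
);

print join(', ', map { "$_=$options->{$_}" } sort keys %$options), "\n";
```

The resulting reference would then be handed to the search object as $oSearch->native_query($sQuery, $options).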
AUTHOR
======
`WWW::Search::AOL::Classifieds::Employment' is written and maintained by
Alexander Tkatchev (Alexander.Tkatchev@cern.ch).
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AlltheWeb, Next: WWW/Search/AltaVista, Prev: WWW/Search/AOL/Classifieds/Employment, Up: Module List
class for searching AlltheWeb
*****************************
NAME
====
WWW::Search::AlltheWeb - class for searching AlltheWeb
SYNOPSIS
========
use WWW::Search;
my $query = "sprinkler system installation how to";
my $search = new WWW::Search('AlltheWeb');
$search->native_query(WWW::Search::escape_query($query));
$search->maximum_to_retrieve(100);
while (my $result = $search->next_result()) {
my $url = $result->url;
my $title = $result->title;
my $desc = $result->description;
# $source and $date were never set in the original snippet, so print
# only the fields fetched above.
print "$title\n$url\n$desc\n\n";
}
DESCRIPTION
===========
AlltheWeb is a class specialization of WWW::Search. It handles making
and interpreting AlltheWeb searches. This is one of the fastest and
largest search engines around. `http://www.alltheweb.com'.
This class exports no public interface; all interaction should be done
through *Note WWW/Search: WWW/Search, objects. See SYNOPSIS.
AUTHOR
======
`WWW::Search::AlltheWeb' is written by Jim Smyser.
COPYRIGHT
=========
Copyright (c) 1996-1999 University of Southern California. All rights
reserved.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AltaVista, Next: WWW/Search/AltaVista/AdvancedNews, Prev: WWW/Search/AlltheWeb, Up: Module List
class for searching Alta Vista
******************************
NAME
====
WWW::Search::AltaVista - class for searching Alta Vista
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('AltaVista');
DESCRIPTION
===========
This class is an AltaVista specialization of WWW::Search. It handles
making and interpreting AltaVista searches at `http://www.altavista.com'.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
OPTIONS
=======
The default is for simple web queries. Specialized back-ends for
simple and advanced web and news searches are available (see *Note
WWW/Search/AltaVista/Web: WWW/Search/AltaVista/Web,, *Note
WWW/Search/AltaVista/AdvancedWeb: WWW/Search/AltaVista/AdvancedWeb,, *Note
WWW/Search/AltaVista/News: WWW/Search/AltaVista/News,, *Note
WWW/Search/AltaVista/AdvancedNews: WWW/Search/AltaVista/AdvancedNews,).
These back-ends set different combinations of the following options.
search_url=URL
Specifies which host to query with the AltaVista protocol. The default is
`http://www.altavista.com/cgi-bin/query'; you may wish to retarget it
to `http://www.altavista.telia.com/cgi-bin/query' or other hosts if
you think that they're "closer".
search_debug, search_parse_debug, search_ref
Specified at *Note WWW/Search: WWW/Search,.
pg=aq
Do advanced queries. (It defaults to simple queries.)
what=news
Search Usenet instead of the web. (It defaults to search the web.)
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,, or the
specialized AltaVista searches described in options.
HOW DOES IT WORK?
=================
`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a URL
to the first results page in `{_next_url}'.
`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the page
specified by `{_next_url}'. It parses this page, appending any search
hits it finds to `{cache}'. If it finds a "next" button in the text, it
sets `{_next_url}' to point to the page for the next set of results,
otherwise it sets it to undef to indicate we're done.
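The paging behaviour described above can be sketched roughly as follows. This is not the module's code: the %pages table stands in for the remote engine, retrieve_some() is a simplified stand-in for `native_retrieve_some', and the page names and hits are made up for illustration. No LWP fetch or HTML parsing is performed.

```perl
use strict;
use warnings;

# Stand-in for the remote engine: URL => [ hits on that page, next URL ].
# These page names and hits are invented for the example.
my %pages = (
    'page1' => [ [ 'hit-a', 'hit-b' ], 'page2' ],
    'page2' => [ [ 'hit-c' ],          undef   ],
);

# State that a back-end would keep in its object hash.
my %self = ( _next_url => 'page1', cache => [] );

# Simplified analogue of native_retrieve_some: take one page, append its
# hits to {cache}, and advance {_next_url}. Returns the number of new hits.
sub retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};    # nothing left to fetch
    my ( $hits, $next ) = @{ $pages{ $self->{_next_url} } };
    push @{ $self->{cache} }, @$hits;
    $self->{_next_url} = $next;                    # undef means we are done
    return scalar @$hits;
}

# Keep asking for more until the back-end signals completion.
while ( defined $self{_next_url} ) {
    retrieve_some( \%self );
}
print "collected: @{ $self{cache} }\n";
```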
AUTHOR and CURRENT VERSION
==========================
`WWW::Search::AltaVista' is written and maintained by John Heidemann.
The best place to obtain `WWW::Search::AltaVista' is from Martin
Thurn's WWW::Search releases on CPAN. Because AltaVista sometimes changes
its format in between his releases, sometimes more up-to-date versions can
be found at
`http://www.isi.edu/~johnh/SOFTWARE/WWW_SEARCH_ALTAVISTA/index.html'.
COPYRIGHT
=========
Copyright (c) 1996-1998 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AltaVista/AdvancedNews, Next: WWW/Search/AltaVista/AdvancedWeb, Prev: WWW/Search/AltaVista, Up: Module List
class for advanced Alta Vista news searching
********************************************
NAME
====
WWW::Search::AltaVista::AdvancedNews - class for advanced Alta Vista
news searching
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('AltaVista::AdvancedNews');
DESCRIPTION
===========
This class implements the advanced AltaVista news search (specializing
AltaVista and WWW::Search). It handles making and interpreting AltaVista
advanced news searches at `http://www.altavista.com'.
Details of AltaVista can be found at *Note WWW/Search/AltaVista:
WWW/Search/AltaVista,.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
AUTHOR
======
`WWW::Search' is written by John Heidemann.
COPYRIGHT
=========
Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AltaVista/AdvancedWeb, Next: WWW/Search/AltaVista/Careers, Prev: WWW/Search/AltaVista/AdvancedNews, Up: Module List
class for advanced Alta Vista web searching
*******************************************
NAME
====
WWW::Search::AltaVista::AdvancedWeb - class for advanced Alta Vista web
searching
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('AltaVista::AdvancedWeb');
DESCRIPTION
===========
Class hack for the Advanced AltaVista web search mode originally written
by John Heidemann `http://www.altavista.com'.
This hack now allows AltaVista AdvancedWeb search results to be sorted
so that the most relevant results are returned first. Initially, this
class skipped the 'r' option, which AltaVista uses to sort search results
by relevance. Sending an advanced query using only the 'q' option resulted
in randomly ordered search results, which made it impossible to view the
best-scored results first.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
USAGE
=====
Advanced AltaVista searching uses the Boolean operators AND, OR, AND
NOT and NEAR, which AltaVista expects in all uppercase. Phrases must be
enclosed in parentheses ( ) instead of double quotes. Some examples:
(John Heidemann) AND (lsam OR replication) AND NOT (somestupiedword OR
thisone)
(lsam OR replication) AND (John Heidemann) AND NOT (somestupiedword OR
thisone)
Batman and Robin and not Joker
Batman and Robin and not (joker or riddler)
Comments: For ideal results, start your query with the words that matter
most. This module will take those and apply them first for sorting
purposes.
Case no longer matters for the Boolean operators: a lowercase operator
such as 'and' will be uppercased to 'AND'. This makes constructing complex
queries easier.
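The uppercasing of operators can be illustrated with a short sketch. This is an assumption about how such a normalization could be written, not the module's actual code; the function name uppercase_operators is made up for the example.

```perl
use strict;
use warnings;

# Illustrative only: uppercase the Boolean operators of an advanced query.
# "and not" is tried before "and" so the two-word operator stays together.
sub uppercase_operators {
    my ($query) = @_;
    $query =~ s/\b(and\s+not|and|or|near)\b/\U$1/gi;
    return $query;
}

print uppercase_operators('Batman and Robin and not (joker or riddler)'), "\n";
```

Note that a plain substitution like this also uppercases the words "and", "or" and "near" when they occur inside a phrase, which may or may not be what a real back-end wants.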
AUTHOR
======
`WWW::Search' hack by Jim Smyser.
COPYRIGHT
=========
Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
VERSION HISTORY
===============
2.01 - Additional query modifiers added for even better results.
2.0 - Minor change to set lowercase Boolean operators to uppercase.
1.9 - First hack version release.
File: pm.info, Node: WWW/Search/AltaVista/Careers, Next: WWW/Search/AltaVista/Intranet, Prev: WWW/Search/AltaVista/AdvancedWeb, Up: Module List
class for searching www.altavistacareers.com
********************************************
NAME
====
WWW::Search::AltaVista::Careers - class for searching
www.altavistacareers.com
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('AltaVista::Careers');
my $sQuery = WWW::Search::escape_query("java c++");
$oSearch->native_query($sQuery,
{'state' => 'CA'});
while (my $res = $oSearch->next_result()) {
print $res->title . "\t" . $res->change_date
. "\t" . $res->location . "\t" . $res->url . "\n";
}
DESCRIPTION
===========
This class is an AltaVistaCareers specialization of WWW::Search. It
handles making and interpreting AltaVistaCareers searches
`http://careers.altavista.com'.
The returned WWW::SearchResult objects contain url, title, location and
change_date fields.
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
The only available options are to select a specific location. The
default is to search all locations. To change it use
{'state' => $state} - Only jobs in state $state.
{'city' => $city} - Only jobs in a specific $city.
AUTHOR
======
`WWW::Search::AltaVista::Careers' is written and maintained by Alexander
Tkatchev (Alexander.Tkatchev@cern.ch).
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AltaVista/Intranet, Next: WWW/Search/AltaVista/News, Prev: WWW/Search/AltaVista/Careers, Up: Module List
class for searching via AltaVista Search Intranet 2.3
*****************************************************
NAME
====
WWW::Search::AltaVista::Intranet - class for searching via AltaVista
Search Intranet 2.3
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('AltaVista::Intranet',
(_host => 'copper', _port => 9000),);
my $sQuery = WWW::Search::escape_query("+investment +club");
$oSearch->native_query($sQuery);
while (my $oResult = $oSearch->next_result())
{ print $oResult->url, "\n"; }
DESCRIPTION
===========
This class implements a search on AltaVista's Intranet ("AVI") Search.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
NOTES
=====
If your query includes characters outside 7-bit ASCII, you must
tell AVI how to interpret 8-bit characters. Add an option for 'enc' to
the native_query() call:
$oSearch->native_query(WWW::Search::escape_query('Zürich'),
{ 'enc' => 'iso88591'},
);
Hopefully the correct values for various languages can be found in the
AVI documentation (sorry, I haven't looked).
TESTING
=======
There is no standard built-in test mechanism for this module, because
very few users of WWW::Search will have AVI installed on their intranet.
(How's that for an excuse? ;-)
AUTHOR
======
`WWW::Search::AltaVista::Intranet' was written by Martin Thurn.
COPYRIGHT
=========
Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
VERSION HISTORY
===============
If it's not listed here, then it wasn't a meaningful or released
revision.
2.04, 2000-03-09
----------------
Added pod for selecting query language encoding
2.03, 2000-02-14
----------------
Added support for score/rank (thanks to Peter bon Burg)
2.02, 1999-11-29
----------------
Fixed to work with latest version of AltaVista.pm
1.03, 1999-06-20
----------------
First publicly-released version.
File: pm.info, Node: WWW/Search/AltaVista/News, Next: WWW/Search/AltaVista/Web, Prev: WWW/Search/AltaVista/Intranet, Up: Module List
class for Alta Vista news searching
***********************************
NAME
====
WWW::Search::AltaVista::News - class for Alta Vista news searching
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('AltaVista::News');
DESCRIPTION
===========
This class implements the AltaVista news search (specializing AltaVista
and WWW::Search). It handles making and interpreting AltaVista news
searches `http://www.altavista.com'.
Details of AltaVista can be found at *Note WWW/Search/AltaVista:
WWW/Search/AltaVista,.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
AUTHOR
======
`WWW::Search' is written by John Heidemann.
COPYRIGHT
=========
Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/AltaVista/Web, Next: WWW/Search/Crawler, Prev: WWW/Search/AltaVista/News, Up: Module List
class for Alta Vista web searching
**********************************
NAME
====
WWW::Search::AltaVista::Web - class for Alta Vista web searching
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('AltaVista::Web');
DESCRIPTION
===========
This class implements the AltaVista web search (specializing AltaVista
and WWW::Search). It handles making and interpreting AltaVista web
searches `http://www.altavista.com'.
Details of AltaVista can be found at *Note WWW/Search/AltaVista:
WWW/Search/AltaVista,.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
AUTHOR
======
`WWW::Search' is written by John Heidemann.
COPYRIGHT
=========
Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/Crawler, Next: WWW/Search/Dice, Prev: WWW/Search/AltaVista/Web, Up: Module List
class for searching Crawler
***************************
NAME
====
WWW::Search::Crawler - class for searching Crawler
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('Crawler');
DESCRIPTION
===========
This class is a Crawler specialization of WWW::Search. It handles
making and interpreting Crawler searches at `http://www.crawler.de'.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
HOW DOES IT WORK?
=================
`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a URL
to the first results page in `{_next_url}'.
`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the page
specified by `{_next_url}'. It parses this page, appending any search
hits it finds to `{cache}'. If it finds a "next" button in the text, it
sets `{_next_url}' to point to the page for the next set of results,
otherwise it sets it to undef to indicate we're done.
AUTHOR
======
`WWW::Search::Crawler' has been shamelessly copied by Andreas Borchert
from `WWW::Search::AltaVista' by John Heidemann.
COPYRIGHT
=========
The original parts from John Heidemann are subject to the following
copyright notice:
Copyright (c) 1996-1998 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/Dice, Next: WWW/Search/Excite, Prev: WWW/Search/Crawler, Up: Module List
class for searching Dice
************************
NAME
====
WWW::Search::Dice - class for searching Dice
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('Dice');
my $sQuery = WWW::Search::escape_query("unix and (c++ or java)");
$oSearch->native_query($sQuery,
{'method' => 'bool',
'state' => 'CA',
'daysback' => 14});
while (my $res = $oSearch->next_result()) {
if(isHitGood($res->url)) {
my ($company,$title,$date,$location) =
$oSearch->getMoreInfo($res->url);
print "$company $title $date $location " . $res->url . "\n";
}
}
sub isHitGood {return 1;}
DESCRIPTION
===========
This class is a Dice specialization of WWW::Search. It handles making
and interpreting Dice searches at `http://www.dice.com'.
By default, returned WWW::SearchResult objects contain only url, title
and description which is a mixture of location and skills wanted.
Function *getMoreInfo( $url )* provides more specific info - it has to be
used as
my ($company,$title,$date,$location) =
$oSearch->getMoreInfo($res->url);
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
Format / Treatment of Query Terms
---------------------------------
The default is to treat the entire query as a Boolean expression with
AND, OR, NOT and parentheses.
{'method' => 'and'}
Logical AND of all the query terms.
{'method' => 'or'}
Logical OR of all the query terms.
{'method' => 'bool'}
Treat the entire query as a Boolean expression with AND, OR, NOT and
parentheses. This is the default option.
Restrict by Date
----------------
The default is to return jobs posted in the last 30 days.
{'daysback' => $number}
Display jobs posted in the last $number days.
Restrict by Location
--------------------
The default is "ALL" which means all US states
{'state' => $state} - Only jobs in state $state.
{'state' => 'CDA'} - Only jobs in Canada.
{'state' => 'INT'} - To select international jobs.
{'state' => 'TRV'} - Require travel.
{'state' => 'TEL'} - Display telecommute jobs.
Multiple selections are possible. To do so, add a "+" sign between
desired states, e.g. {'state' => 'NY+NJ+CT'}
You can also restrict by 3-digit area codes. The following option does
that:
{'acode' => $area_code}
Multiple area codes (up to 5) are supported.
Restrict by Job Term
--------------------
No restrictions by default.
{'term' => 'CON'} - contract jobs
{'term' => 'C/H'} - contract to hire
{'term' => 'FTE'} - full time
Use a '+' sign for multiple selection.
There is also a switch to select either W2 or Independent:
{'addterm' => 'W2ONLY'} - W2 only
{'addterm' => 'INDOK'} - Independent ok
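A minimal sketch of building such an options hash with multiple selections follows. The helper name multi() is hypothetical and only joins codes with "+", as described for the state and term options above; nothing here contacts Dice.

```perl
use strict;
use warnings;

# Hypothetical helper: join several codes with "+" for a multiple selection.
sub multi {
    my @codes = @_;
    return join '+', @codes;
}

# Options for contract or contract-to-hire jobs in three states,
# W2 only, posted in the last 14 days.
my $options = {
    method   => 'bool',
    state    => multi(qw(NY NJ CT)),
    term     => multi(qw(CON C/H)),
    addterm  => 'W2ONLY',
    daysback => 14,
};

print "$_ => $options->{$_}\n" for sort keys %$options;
```

The hash reference would be passed as the second argument of native_query(), as in the SYNOPSIS.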
Restrict by Job Type
--------------------
No restriction by default. To select jobs with specific job type use the
following option:
{'jtype' => $jobtype}
Here $jobtype (according to `http://www.dice.com') can be one or more
of the following:
* ANL - Business Analyst/Modeler
* COM - Communications Specialist
* DBA - Database Administrator
* ENG - Other types of Engineers
* FIN - Finance / Accounting
* GRA - Graphics/CAD/CAM
* HWE - Hardware Engineer
* INS - Instructor/Trainer
* LAN - LAN/Network Administrator
* MGR - Manager/Project leader
* OPR - Data Processing Operator
* PA - Application Programmer/Analyst
* QA - Quality Assurance/Tester
* REC - Recruiter
* SLS - Sales/Marketing
* SWE - Software Engineer
* SYA - Systems Administrator
* SYS - Systems Programmer/Support
* TEC - Custom/Tech Support
* TWR - Technical Writer
* WEB - Web Developer / Webmaster
Limit total number of hits
--------------------------
The default is to stop searching after 500 hits.
{'num_to_retrieve' => $num_to_retrieve}
Changes the default to $num_to_retrieve.
AUTHOR
======
`WWW::Search::Dice' is written and maintained by Alexander Tkatchev
(Alexander.Tkatchev@cern.ch).
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/Excite, Next: WWW/Search/Excite/News, Prev: WWW/Search/Dice, Up: Module List
backend for searching www.excite.com
************************************
NAME
====
WWW::Search::Excite - backend for searching www.excite.com
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('Excite');
my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio");
$oSearch->native_query($sQuery);
while (my $oResult = $oSearch->next_result())
{ print $oResult->url, "\n"; }
DESCRIPTION
===========
This class is an Excite specialization of WWW::Search. It handles
making and interpreting Excite searches `http://www.excite.com'.
This class exports no public interface; all interaction should be done
through *Note WWW/Search: WWW/Search, objects.
NOTES
=====
www.excite.com does not report the approximate result count.
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
CAVEATS
=======
Only returns results from Excite's "Web Results". Ignores all other
sections of Excite's query results.
BUGS
====
Please tell the author if you find any!
AUTHOR
======
As of 1998-03-23, `WWW::Search::Excite' is maintained by Martin Thurn
(MartinThurn@iname.com).
`WWW::Search::Excite' was originally written by Martin Thurn based on
`WWW::Search::HotBot'.
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
VERSION HISTORY
===============
2.16, 2000-11-02
----------------
No change in functionality, but parser was totally rewritten using
HTML::TreeBuilder
2.14, 2000-
-----------
BUGFIX for missing result-count sometimes;
2.13, 2000-10-10
----------------
BUGFIX for missing result-count sometimes; BUGFIX for missing END of
results; BUGFIX for mis-parsing URLs
2.12, 2000-09-18
----------------
BUGFIX for still missing the result-count; BUGFIX for missing all
results sometimes
2.11, 2000-09-05
----------------
BUGFIX for still missing some header formats
2.07, 2000-03-29
----------------
BUGFIX for sometimes missing header (and getting NO results)
2.06, 2000-03-02
----------------
BUGFIX for bungled next_url
2.05, 2000-02-08
----------------
testing now uses WWW::Search::Test module; www.excite.com only allows
(up to) 50 per page (and no odd numbers)
2.04, 2000-01-28
----------------
www.excite.com changed their output format slightly
2.03, 1999-10-20
----------------
www.excite.com changed their output format slightly; use strip_tags()
on title and description results
2.02, 1999-10-05
----------------
now uses hash_to_cgi_string()
1.12, 1999-06-29
----------------
updated test cases
1.10, 1999-06-11
----------------
fixed a BUG where returned URLs were garbled (maybe this was because
www.excite.com changed their links)
1.08, 1998-11-06
----------------
www.excite.com changed their output format slightly (thank you Jim
(jsmyser@bigfoot.com) for pointing it out!)
1.7, 1998-10-09
---------------
use new split_lines function
1.5
---
\n changed to \012 for MacPerl compatibility
1.4
---
Modified for new Excite output format.
1.2
---
First publicly-released version.
File: pm.info, Node: WWW/Search/Excite/News, Next: WWW/Search/ExciteForWebServers, Prev: WWW/Search/Excite, Up: Module List
class for searching ExciteNews
******************************
NAME
====
WWW::Search::Excite::News - class for searching ExciteNews
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('Excite::News');
DESCRIPTION
===========
Class for searching Excite News `http://www.excite.com'. Excite has
one of the best news bots on the web.
The following result fields are available for printing:
$result->{'description'} - the description, if any
$result->{'source'} - the article's news source
$result->{'date'} - the article's date
This class exports no public interface; all interaction should be done
through WWW::Search objects.
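As a rough illustration of printing these fields, the sketch below formats one result and skips the description when it is absent. The record is a plain hash with made-up values standing in for a real result object, and format_result() is a hypothetical helper; the url and title fields are assumed to be present as in other WWW::Search back-ends.

```perl
use strict;
use warnings;

# Made-up record standing in for a search result. It deliberately has
# no 'description' key, since a description is returned only "if any".
my $result = {
    url    => 'http://example.com/story',
    title  => 'Example headline',
    source => 'Example Wire',
    date   => '2000-03-21',
};

# Hypothetical helper: format one result, omitting a missing description.
sub format_result {
    my ($r) = @_;
    my @lines = ( "$r->{title} ($r->{source}, $r->{date})", $r->{url} );
    push @lines, $r->{description} if defined $r->{description};
    return join "\n", @lines;
}

print format_result($result), "\n";
```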
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
HOW DOES IT WORK?
=================
`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a URL
to the first results page in `{_next_url}'.
`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the page
specified by `{_next_url}'. It parses this page, appending any search
hits it finds to `{cache}'. If it finds a "next" button in the text, it
sets `{_next_url}' to point to the page for the next set of results,
otherwise it sets it to undef to indicate we are done.
AUTHOR
======
Maintained by Jim Smyser
TESTING
=======
This module adheres to the `WWW::Search' test suite mechanism. See
$TEST_CASES below.
VERSION HISTORY
===============
2.03, 2000-03-21
----------------
New format changes
2.02, 1999-10-05
----------------
Misc. formatting changes
2.01, 1999-07-13
----------------
New test mechanism
COPYRIGHT
=========
The original parts from John Heidemann are subject to the following
copyright notice:
Copyright (c) 1996-1998 University of Southern California. All rights
reserved.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/ExciteForWebServers, Next: WWW/Search/Fireball, Prev: WWW/Search/Excite/News, Up: Module List
class for searching ExciteforWeb engine
***************************************
NAME
====
WWW::Search::ExciteForWebServers - class for searching ExciteforWeb
engine
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('ExciteForWebServers');
DESCRIPTION
===========
This class is a specialization of WWW::Search for search indices built
using Excite for Web Servers (available from `http://www.excite.com').
This class exports no public interface; all interaction should be done
through WWW::Search objects.
This object interprets the WWW::Search `search_how' attribute as
follows:
match_any     = concept search
match_all     = keyword (simple) search
match_phrase  = error condition
match_boolean = error condition
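One way to picture this mapping is a lookup table that rejects the two
unsupported modes. The sketch below is an assumption for illustration
only: the `%mode_for' table, the `mode_for_search_how' helper and the
strings `concept' and `keyword' are hypothetical and are not taken from
the module.

```perl
use strict;
use warnings;

# Hypothetical lookup table mirroring the mapping above; undef marks the
# modes this back-end treats as errors.
my %mode_for = (
    match_any     => 'concept',
    match_all     => 'keyword',
    match_phrase  => undef,
    match_boolean => undef,
);

# Returns the search mode for a search_how value, or dies if unsupported.
sub mode_for_search_how {
    my ($search_how) = @_;
    my $mode = $mode_for{$search_how};
    die "unsupported search_how: $search_how\n" unless defined $mode;
    return $mode;
}

print mode_for_search_how('match_any'), "\n";
my $ok = eval { mode_for_search_how('match_phrase'); 1 };
print "match_phrase rejected\n" unless $ok;
```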
AUTHOR
======
`WWW::Search::ExciteForWebServers' is written by Paul Lindner.
COPYRIGHT
=========
Copyright (c) 1997,98 by the United Nations Administrative Committee on
Coordination (ACC)
All rights reserved.
File: pm.info, Node: WWW/Search/Fireball, Next: WWW/Search/FirstGov, Prev: WWW/Search/ExciteForWebServers, Up: Module List
class for searching Fireball
****************************
NAME
====
WWW::Search::Fireball - class for searching Fireball
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('Fireball');
DESCRIPTION
===========
This class is a Fireball specialization of WWW::Search. It handles
making and interpreting Fireball searches at `http://www.fireball.de'.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
HOW DOES IT WORK?
=================
`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a URL
to the first results page in `{_next_url}'.
`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the page
specified by `{_next_url}'. It parses this page, appending any search
hits it finds to `{cache}'. If it finds a "next" button in the text, it
sets `{_next_url}' to point to the page for the next set of results,
otherwise it sets it to undef to indicate we're done.
AUTHOR
======
`WWW::Search::Fireball' has been shamelessly copied by Andreas
Borchert from `WWW::Search::AltaVista' by John Heidemann.
COPYRIGHT
=========
The original parts from John Heidemann are subject to the following
copyright notice:
Copyright (c) 1996-1998 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are duplicated
in all such forms and that any documentation, advertising materials, and
other materials related to such distribution and use acknowledge that the
software was developed by the University of Southern California,
Information Sciences Institute. The name of the University may not be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/FirstGov, Next: WWW/Search/FolioViews, Prev: WWW/Search/Fireball, Up: Module List
class for searching http://www.firstgov.gov
*******************************************
NAME
====
WWW::Search::FirstGov - class for searching http://www.firstgov.gov
SYNOPSIS
========
use WWW::Search;
my $search = new WWW::Search('FirstGov'); # cAsE matters
my $query = WWW::Search::escape_query("uncle sam");
$search->native_query($query);
while (my $result = $search->next_result()) {
print $result->url, "\n";
}
DESCRIPTION
===========
Class specialization of WWW::Search for searching
`http://www.firstgov.gov'.
FirstGov.gov can return up to 100 hits per page.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
{ 'begin_at' => '100' }
Retrieve results starting at the 100th match.
{ 'pl' => 'domain', 'domain' => 'osec.doc.gov+itd.doc.gov' }
The query is limited to searching the domains osec.doc.gov and
itd.doc.gov.
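The domain list in the example above is a single string with the
domains joined by `+'. A minimal sketch of building that options hash
from a Perl list follows; the `domain_options' helper is hypothetical
and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: build the options hash for a domain-limited search.
sub domain_options {
    my (@domains) = @_;
    return { 'pl' => 'domain', 'domain' => join('+', @domains) };
}

my $opts = domain_options('osec.doc.gov', 'itd.doc.gov');
print "$opts->{'domain'}\n";
# With WWW::Search::FirstGov installed, the hash would be passed as:
#   $search->native_query($query, $opts);
```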
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,, or the
specialized AltaVista searches described in options.
See http://www.fed-search.org/specialized.html to learn more about
specialized FirstGov searches.
HOW DOES IT WORK?
=================
`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a URL
to the first results page in `{_next_url}'.
`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the page
specified by `{_next_url}'. It parses this page, appending any search
hits it finds to `{cache}'. If it finds a "next" button in the text, it
sets `{_next_url}' to point to the page for the next set of results,
otherwise it sets it to undef to indicate we're done.
AUTHOR
======
`WWW::Search::FirstGov' is written and maintained by Dennis Sutch.
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
BUGS
====
None reported.
VERSION HISTORY
===============
1.03 2001-03-01 - Removed 'require 5.005_62;'.
1.02 2001-03-01 - Removed 'my' declarations for package variables.
1.01 2001-02-26 - Fixed problem with quoted string on MSWin.
Removed 'our' declarations.
1.00 2001-02-23 - First publicly-released version.
File: pm.info, Node: WWW/Search/FolioViews, Next: WWW/Search/Go, Prev: WWW/Search/FirstGov, Up: Module List
class for searching Folio Views
*******************************
NAME
====
WWW::Search::FolioViews - class for searching Folio Views
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('FolioViews');
DESCRIPTION
===========
This class is a Folio Views specialization of WWW::Search. It queries
and interprets searches based on Folio Views, which is available at
`http://www.openmarket.com'.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
OPTIONS
=======
This search supports standard WWW::Search arguments:
search_url
The Folio Views URL to search. This usually looks like
`http://somehost/.../cgi-bin/search2.pl'
search_args
The arguments passed to the search engine, separated by &.
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
AUTHOR
======
`WWW::Search::FolioViews' is written by Paul Lindner and
Nicholas Sapirie.
COPYRIGHT
=========
Copyright (c) 1998 by the United Nations Administrative Committee on
Coordination (ACC)
All rights reserved.
File: pm.info, Node: WWW/Search/Go, Next: WWW/Search/Gopher, Prev: WWW/Search/FolioViews, Up: Module List
backend class for searching with go.com
***************************************
NAME
====
WWW::Search::Go - backend class for searching with go.com
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('Go');
DESCRIPTION
===========
This class is a Go specialization of WWW::Search. It handles making
and interpreting Go searches at `http://www.Go.com', the former Infoseek
search engine.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
USAGE EXAMPLE
=============
use WWW::Search;
my $oSearch = new WWW::Search('Go');
$oSearch->maximum_to_retrieve(100);
#$oSearch ->{_debug}=1;
my $sQuery = WWW::Search::escape_query("cgi");
$oSearch->gui_query($sQuery);
while (my $oResult = $oSearch->next_result())
{
print $oResult->url,"\t",$oResult->title,"\n";
}
AUTHOR
======
`WWW::Search::Go' is written by Alain BARBET, alian@alianwebserver.com
File: pm.info, Node: WWW/Search/Gopher, Next: WWW/Search/HeadHunter, Prev: WWW/Search/Go, Up: Module List
class for searching Gopher pages
********************************
NAME
====
WWW::Search::Gopher - class for searching Gopher pages
SYNOPSIS
========
require WWW::Search;
$search = new WWW::Search('Gopher');
DESCRIPTION
===========
This class is a specialization of WWW::Search that searches Gopher
index items.
This class exports no public interface; all interaction should be done
through WWW::Search objects.
AUTHOR
======
`WWW::Search::Gopher' is written by Paul Lindner.
COPYRIGHT
=========
Copyright (c) 1997,98 by the United Nations Administrative Committee on
Coordination (ACC)
All rights reserved.
File: pm.info, Node: WWW/Search/HeadHunter, Next: WWW/Search/HotBot, Prev: WWW/Search/Gopher, Up: Module List
class for searching HeadHunter
******************************
NAME
====
WWW::Search::HeadHunter - class for searching HeadHunter
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('HeadHunter');
my $sQuery = WWW::Search::escape_query("unix and (c++ or java)");
$oSearch->native_query($sQuery,
{'SID' => 'CA',
'Freshness' => 14});
while (my $res = $oSearch->next_result()) {
print $res->company . "\t" . $res->title . "\t" . $res->change_date
. "\t" . $res->location . "\t" . $res->url . "\n";
}
DESCRIPTION
===========
This class is a HeadHunter specialization of WWW::Search. It handles
making and interpreting HeadHunter searches at
`http://www.HeadHunter.net'. HeadHunter supports Boolean logic with the
"and" and "or" operators. See
`http://www.HeadHunter.net/Help/jobquerylang.htm' for a full description
of the query language.
The returned WWW::SearchResult objects contain url, title, *company*,
location and change_date fields.
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
Restrict by Date
----------------
The default is to return jobs posted in the last 30 days (this limit is
applied internally by the HeadHunter search engine).
{'Freshness' => $number}
Display jobs posted in the last $number days.
Restrict by Location
--------------------
No restriction by default.
{'Town' => $town}
To select jobs from approximately 30 miles around the city.
{'SID' => $loc}
Only jobs in state/province $loc (two letters only).
{'CID' => 'US'}
To view only US jobs. To see jobs from other countries, check out the
acceptable country list at `http://www.Headhunter.net/listcoun.htm'.
Restrict by Salary
------------------
No restrictions by default.
{'Pay' => 'P1'} - less than $15,000 Per Year
{'Pay' => 'P2'} - $15,000 - $30,000 Per Year
{'Pay' => 'P3'} - $30,000 - $50,000 Per Year
{'Pay' => 'P4'} - $50,000 - $75,000 Per Year
{'Pay' => 'P5'} - $75,000 - $100,000 Per Year
{'Pay' => 'P6'} - more than $100,000 Per Year
To select several pay ranges use a '+' sign, e.g. {'Pay' => 'P3+P4'}
Restrict by Employment Type
---------------------------
No restrictions by default.
{'EmpType' => 'Typ1'} - Employee
{'EmpType' => 'Typ2'} - Contract
{'EmpType' => 'Typ3'} - Employee or Contract
{'EmpType' => 'Typ4'} - Intern
Restrict by Job Category
------------------------
No restriction by default. To select jobs from a specific job category
use the following option:
{'Cats' => $job_category}
The acceptable values of $job_category are listed below. Up to five
categories can be selected at once by joining them with a '+' sign, e.g.
{'Cats' => 'Cat001+Cat002'}.
* Cat001 - Accounting
* Cat002 - Activism
* Cat003 - Administration
* Cat004 - Advertising
* Cat005 - Aerospace
* Cat110 - Agriculture
* Cat006 - Air Conditioning
* Cat007 - Airlines
* Cat008 - Apartment Management
* Cat009 - Architecture
* Cat010 - Art
* Cat011 - Automotive
* Cat012 - Aviation
* Cat013 - Banking
* Cat015 - Bilingual
* Cat111 - Biotechnology
* Cat016 - Bookkeeping
* Cat017 - Broadcasting
* Cat018 - Care Giving
* Cat112 - Carpentry
* Cat113 - Chemistry
* Cat019 - Civil Service
* Cat020 - Clerical
* Cat021 - College
* Cat114 - Communication
* Cat022 - Computer
* Cat023 - Construction
* Cat125 - Consulting
* Cat024 - Counseling
* Cat025 - Customer Service
* Cat026 - Decorating
* Cat027 - Dental
* Cat028 - Design
* Cat029 - Driving
* Cat030 - Education
* Cat031 - Electronic
* Cat032 - Emergency
* Cat033 - Employment
* Cat034 - Engineering
* Cat035 - Entertainment
* Cat036 - Environmental
* Cat037 - Executive
* Cat115 - Fabrication
* Cat116 - Facilities
* Cat038 - Fashion/Apparel
* Cat039 - Financial
* Cat040 - Food Services
* Cat042 - Fundraising
* Cat044 - General Office
* Cat126 - Government
* Cat045 - Graphics
* Cat046 - Grocery
* Cat047 - Health/Medical
* Cat048 - Home Services
* Cat049 - Hospital
* Cat050 - Hotel/Motel
* Cat052 - Human Resources
* Cat053 - HVAC
* Cat054 - Import/Export
* Cat117 - Industrial
* Cat055 - Installer
* Cat056 - Insurance
* Cat118 - Internet
* Cat057 - Janitorial
* Cat119 - Journalism
* Cat058 - Law Enforcement
* Cat059 - Legal
* Cat060 - Maintenance
* Cat061 - Management
* Cat062 - Manufacturing
* Cat063 - Marketing
* Cat064 - Mechanical
* Cat065 - Media
* Cat066 - Merchandising
* Cat127 - Military
* Cat067 - Mining
* Cat128 - Mortgage
* Cat069 - Multimedia
* Cat070 - Nursing
* Cat071 - Nutrition
* Cat121 - Packaging
* Cat122 - Painting
* Cat073 - Pest Control
* Cat129 - Pharmaceutical
* Cat075 - Photography
* Cat076 - Plumbing
* Cat077 - Printing
* Cat078 - Professional
* Cat079 - Property Management
* Cat080 - Public Relations
* Cat081 - Publishing
* Cat082 - Purchasing
* Cat083 - Quality Control
* Cat123 - Radio
* Cat084 - Real Estate
* Cat085 - Recreation
* Cat086 - Research
* Cat087 - Restaurant
* Cat088 - Retail
* Cat089 - Sales
* Cat090 - Science
* Cat124 - Secretarial
* Cat091 - Security
* Cat092 - Services
* Cat093 - Shipping/Receiving
* Cat094 - Social Services
* Cat130 - Supply Chain
* Cat095 - Teaching
* Cat096 - Technical
* Cat097 - Telecommunications
* Cat098 - Telemarketing
* Cat099 - Television
* Cat100 - Textile
* Cat101 - Trades
* Cat102 - Training
* Cat103 - Transportation
* Cat104 - Travel
* Cat105 - Utilities
* Cat106 - Warehouse
* Cat107 - Waste Management
* Cat108 - Word Processing
* Cat109 - Work From Home
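Several categories from the list above can be combined into one `Cats'
value. The sketch below joins the codes with `+' and enforces the limit
of five. The `cats_option' helper and its format check are hypothetical
conveniences, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: join category codes with '+', enforcing the
# five-category limit described above.
sub cats_option {
    my (@cats) = @_;
    die "at most five categories may be combined\n" if @cats > 5;
    for my $cat (@cats) {
        # Assumption: every code is 'Cat' followed by three digits.
        die "malformed category code: $cat\n" unless $cat =~ /^Cat\d{3}$/;
    }
    return { 'Cats' => join('+', @cats) };
}

my $opts = cats_option('Cat022', 'Cat034', 'Cat118');
print "$opts->{'Cats'}\n";
# With WWW::Search::HeadHunter installed, the hash would be passed as:
#   $oSearch->native_query($sQuery, $opts);
```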
AUTHOR
======
`WWW::Search::HeadHunter' is written and maintained by Alexander
Tkatchev (Alexander.Tkatchev@cern.ch).
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
File: pm.info, Node: WWW/Search/HotBot, Next: WWW/Search/HotFiles, Prev: WWW/Search/HeadHunter, Up: Module List
backend for searching hotbot.lycos.com
**************************************
NAME
====
WWW::Search::HotBot - backend for searching hotbot.lycos.com
SYNOPSIS
========
use WWW::Search;
my $oSearch = new WWW::Search('HotBot');
my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio");
$oSearch->native_query($sQuery);
while (my $oResult = $oSearch->next_result())
{ print $oResult->url, "\n"; }
DESCRIPTION
===========
This class is a HotBot specialization of WWW::Search. It handles
making and interpreting HotBot searches at `http://www.hotbot.com'.
This class exports no public interface; all interaction should be done
through *Note WWW/Search: WWW/Search, objects.
By default, WWW::Search::HotBot uses hotbot.com's text-only interface,
which can be found at http://hotbot.lycos.com/text . If you want to
perform a query with the same default options as if a user typed it in the
browser window (i.e. at http://www.hotbot.com), call
$oSearch->gui_query($sQuery) instead of ->native_query().
The default behavior is for HotBot to look for "any of" the query terms:
$oSearch->native_query(WWW::Search::escape_query('Dorothy Oz'));
If you want "all of", call native_query like this:
$oSearch->native_query(WWW::Search::escape_query('Dorothy Oz'), {'SM' => 'MC'});
If you want to send HotBot a boolean phrase, call native_query like
this:
$oSearch->native_query(WWW::Search::escape_query('Oz AND Dorothy NOT Australia'), {'SM' => 'B'});
See below for other query-handling options.
OPTIONS
=======
The following search options can be activated by sending a hash as the
second argument to native_query().
Format / Treatment of Query Terms
---------------------------------
The default is logical OR of all the query terms.
{'SM' => 'MC'}
"Must Contain": logical AND of all the query terms.
{'SM' => 'SC'}
"Should Contain": logical OR of all the query terms. This is the
default.
{'SM' => 'B'}
"Boolean": the entire query is treated as a boolean expression with
AND, OR, NOT, and parentheses.
{'SM' => 'name'}
The entire query is treated as a person's name.
{'SM' => 'phrase'}
The entire query is treated as a phrase.
{'SM' => 'title'}
The query is applied to the page title. (I assume the logical OR of
the query terms will be applied to the page title.)
{'SM' => 'url'}
The query is assumed to be a URL, and the results will be pages that
link to the query URL.
Restricting Search to a Date Range
----------------------------------
The default is no date restrictions.
{'date' => 'within', 'DV' => 90}
Only return pages updated within 90 days of today. (Substitute any
integer in place of 90.)
{'date' => 'range', 'DR' => 'newer', 'DY' => 97, 'DM' => 12, 'DD' => 25}
Only return pages updated after Christmas 1997. (Substitute any
year, month, and day for 97, 12, 25.)
{'date' => 'range', 'DR' => 'older', 'DY' => 97, 'DM' => 12, 'DD' => 25}
Only return pages updated before Christmas 1997. (Substitute any
year, month, and day for 97, 12, 25.)
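The date-range options above can be built from an ordinary calendar
date. The sketch below is illustrative only: the `date_range_options'
helper is hypothetical, and reducing the year to two digits is an
assumption based solely on the `'DY' => 97' examples.

```perl
use strict;
use warnings;

# Hypothetical helper: build the date-range options from a four-digit year.
sub date_range_options {
    my ($direction, $year, $month, $day) = @_;
    die "direction must be 'newer' or 'older'\n"
        unless $direction eq 'newer' || $direction eq 'older';
    return {
        'date' => 'range',
        'DR'   => $direction,
        'DY'   => $year % 100,   # assumption: two-digit year, as in the examples
        'DM'   => $month,
        'DD'   => $day,
    };
}

my $opts = date_range_options('newer', 1997, 12, 25);
print join(' ', map { "$_=$opts->{$_}" } sort keys %$opts), "\n";
```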
Restricting Search to a Geographic Area
---------------------------------------
The default is no restriction to geographic area.
{'RD' => 'AN'}
Return pages from anywhere. This is the default.
{'RD' => 'DM', 'Domain' => 'microsoft.com, .cz'}
Restrict search to pages located in the listed domains. (Substitute
any list of domain substrings.)
{'RD' => 'RG', 'RG' => '.com'}
Restrict search to North American commercial web sites.
{'RD' => 'RG', 'RG' => '.edu'}
Restrict search to North American educational web sites.
{'RD' => 'RG', 'RG' => '.gov'}
Restrict search to United States Government web sites.
{'RD' => 'RG', 'RG' => '.mil'}
Restrict search to United States military web sites.
{'RD' => 'RG', 'RG' => '.net'}
Restrict search to North American '.net' web sites.
{'RD' => 'RG', 'RG' => '.org'}
Restrict search to North American organizational web sites.
{'RD' => 'RG', 'RG' => 'NA'}
"North America": Restrict search to all of the above types of web
sites.
{'RD' => 'RG', 'RG' => 'AF'}
Restrict search to web sites in Africa.
{'RD' => 'RG', 'RG' => 'AS'}
Restrict search to web sites in India and Asia.
{'RD' => 'RG', 'RG' => 'CA'}
Restrict search to web sites in Central America.
{'RD' => 'RG', 'RG' => 'DU'}
Restrict search to web sites in Oceania.
{'RD' => 'RG', 'RG' => 'EU'}
Restrict search to web sites in Europe.
{'RD' => 'RG', 'RG' => 'ME'}
Restrict search to web sites in the Middle East.
{'RD' => 'RG', 'RG' => 'SE'}
Restrict search to web sites in Southeast Asia.
Requesting Certain Multimedia Data Types
----------------------------------------
The default is not specifically requesting any multimedia types
(presumably, this will NOT restrict the search to NON-multimedia pages).
{'FAC' => 1}
Return pages which contain Adobe Acrobat PDF data.
{'FAX' => 1}
Return pages which contain ActiveX.
{'FJA' => 1}
Return pages which contain Java.
{'FJS' => 1}
Return pages which contain JavaScript.
{'FRA' => 1}
Return pages which contain audio.
{'FSU' => 1, 'FS' => '.txt, .doc'}
Return pages which have one of the listed extensions. (Substitute
any list of DOS-like file extensions.)
{'FSW' => 1}
Return pages which contain ShockWave.
{'FVI' => 1}
Return pages which contain images.
{'FVR' => 1}
Return pages which contain VRML.
{'FVS' => 1}
Return pages which contain VB Script.
{'FVV' => 1}
Return pages which contain video.
Requesting Pages at Certain Depths on Website
---------------------------------------------
The default is pages at any level on their website.
{'PS'=>'A'}
Return pages at any level on their website. This is the default.
{'PS' => 'D', 'D' => 3 }
Return pages within 3 links of "top" page of their website.
(Substitute any integer in place of 3.)
{'PS' => 'F'}
Only return pages that are the "top" page of their website.
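Options from the different groups above can be collected into a single
hash before it is handed to native_query(). The sketch below assumes
that groups using different keys can be combined freely. The
documentation only shows each group on its own, so treat the combination
as untested.

```perl
use strict;
use warnings;

# Each hash below is copied from one of the option groups above.
my %mode   = ('SM' => 'MC');                   # all of the query terms
my %region = ('RD' => 'RG', 'RG' => '.edu');   # educational web sites
my %depth  = ('PS' => 'F');                    # "top" pages only

# Assumption: groups that use different keys can share one options hash.
my %options = (%mode, %region, %depth);

print join(' ', map { "$_=$options{$_}" } sort keys %options), "\n";
# With WWW::Search::HotBot installed, the hash would be passed as:
#   $oSearch->native_query($sQuery, \%options);
```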
SEE ALSO
========
To make new back-ends, see *Note WWW/Search: WWW/Search,.
CAVEATS
=======
When www.hotbot.com reports a "Mirror" URL, WWW::Search::HotBot ignores
it. Therefore, the number of URLs returned by WWW::Search::HotBot might
not agree with the value returned in approximate_result_count.
BUGS
====
Please tell the author if you find any!
AUTHOR
======
As of 1998-02-02, `WWW::Search::HotBot' is maintained by Martin Thurn
(MartinThurn@iname.com).
`WWW::Search::HotBot' was originally written by Wm. L. Scheding, based
on `WWW::Search::AltaVista'.
LEGALESE
========
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
VERSION HISTORY
===============
If a revision is not listed here, it was neither meaningful nor
released.
2.21, 2000-12-11
----------------
new URL for advanced search
2.19, 2000-10-11
----------------
added AM1=MC to all URLs in GUI mode (hotbot.com seems to "randomly"
add this if you search manually at their site)
2.18, 2000-06-26
----------------
fix for only one page of gui results; and "next" link in new place
2.17, 2000-05-24
----------------
was still missing first URL of non-gui(?) results!
2.16, 2000-05-17
----------------
was missing first URL of gui results
2.15, 2000-04-03
----------------
fixed gui_query()
2.14, 2000-02-01
----------------
testing now uses WWW::Search::Test module
2.13, 2000-01-31
----------------
bugfix: was missing title
2.12, 2000-01-19
----------------
new function gui_query(), and handle output from it
2.10, 1999-12-22
----------------
handle new result format
2.09, 1999-12-15
----------------
handle new result count format
2.08, 1999-12-10
----------------
handle new output format
2.07, 1999-11-12
----------------
BUGFIX for domain-limited URL parsing (thanks to Leon Brocard)
2.06, 1999-10-18
----------------
www.hotbot.com changed their output format slightly; now uses
strip_tags() on title and description
2.05, 1999-10-05
----------------
now uses hash_to_cgi_string(); new test cases
2.03, 1999-09-28
----------------
BUGFIX: was missing the "Next page" link sometimes.
2.02, 1999-08-17
----------------
Now is able to parse "URL-only" format (i.e. {'DE' => 0}) and "brief
description" format (i.e. {'DE' => 1}) if the user so desires.
1.34, 1999-07-01
----------------
New test cases.
1.32, 1999-06-20
----------------
Now unescapes the URLs before returning them.
1.31, 1999-06-11
----------------
www.hotbot.com changed their output format ever so slightly. (Thanks
to Jim jsmyser@bigfoot.com for pointing it out)
1.30, 1999-04-12
----------------
BUG FIX: results for domain-limited search were not parsed. (Thanks to
Christopher York yorkc@ccwf.cc.utexas.edu for pointing it out)
1.29, 1999-02-22
----------------
www.hotbot.com changed their output format. (Thanks to Tim Chklovski
timc@mit.edu for pointing it out)
1.27, 1998-11-06
----------------
HotBot changed their output format(?). HotBot.pm now uses hotbot.com's
text-only search results format. Minor documentation changes.
1.25, 1998-09-11
----------------
HotBot changed their output format ever so slightly. Documentation
added for all known HotBot query options!
1.23
----
Better documentation for boolean queries. (Thanks to Jason Titus
jason_titus@odsnet.com)
1.22
----
www.hotbot.com changed their output format.
1.21
----
www.hotbot.com changed their output format.
1.17
----
www.hotbot.com changed their search script location and output format
on 1998-05-21. Also, as many as 6 fields of each SearchResult are now
filled in.
1.13
----
Fixed the maximum_to_retrieve off-by-one problem. Updated test cases.
1.12
----
www.hotbot.com does not do truncation. Therefore, if the query contains
truncation characters (i.e. '*' at end of words), they are simply deleted
before the query is sent to www.hotbot.com.
1.11, 1998-02-05
----------------
Fixed and revamped by Martin Thurn.