This is Info file pm.info, produced by Makeinfo version 1.68 from the
input file bigpm.texi.

File: pm.info, Node: WWW/Search/AOL/Classifieds/Employment, Next: WWW/Search/AlltheWeb, Prev: WWW/Search, Up: Module List

class for searching Jobs Classifieds on AOL
*******************************************

NAME
====

WWW::Search::AOL::Classifieds::Employment - class for searching Jobs
Classifieds on AOL

SYNOPSIS
========

     use WWW::Search;
     my $oSearch = new WWW::Search('Aol');
     my $sQuery = WWW::Search::escape_query("unix c++ java");
     $oSearch->native_query($sQuery, {'qcqs' => ':ca:'});
     while (my $res = $oSearch->next_result()) {
         print $res->company . "\t" . $res->title . "\t" . $res->change_date
             . "\t" . $res->location . "\n";
     }

DESCRIPTION
===========

This class is an AOL specialization of WWW::Search. It handles making
and interpreting AOL searches at `http://classifiedplus.aol.com' in
category employment->JobSearch.

The returned WWW::SearchResult objects contain url, title, *company*,
location and change_date fields.

OPTIONS
=======

The following search options can be activated by sending a hash as the
second argument to native_query().

Format / Treatment of Query Terms
---------------------------------

The default is to match ALL keywords in your query.

     {'QY' => 2} - to match at least one word
     {'QY' => 5} - to match exact phrase

Restrict by Job Category
------------------------

No restriction by default. To select jobs from a specific job category
use the following option:

     {'QVSSCAT' => $job_category}

Possible values of $job_category are the following:

   * 10 Accounting/Finance/Banking/Insurance
   * 20 Administrative/Clerical
   * 30 Creative Arts/Media
   * 40 Education/Training
   * 50 Engineering/Architecture/Design
   * 60 Human resources
   * 70 Information Technology/Computer
   * 80 Legal/Law Enforcement/Security
   * 90 Marketing/Public relations/Advertising
   * 100 Medical/Health Care/Dental
   * 110 Online/Internet/New Media
   * 120 Sales/Customer Service/Sales Management
   * 130 Sports
   * 140 Travel/Hospitality/Restaurant/Transportation
   * 150 Other

Restrict by Company Name
------------------------

     {'QM' => $pattern}

Display jobs where the company name matches $pattern.

Restrict by Location
--------------------

No preference by default. Several options can restrict your search, but
only one of the options listed below can be enabled at a time.

     {'QREG' => $region} - to select a region

Regions can be:

   * 1 Mid-Atl
   * 2 Midwest
   * 3 Northeast
   * 4 Northwest
   * 5 Southeast
   * 6 Southwest
   * 7 West
   * 8 Outside USA
   * 9999 National

     {'qcqs' => $state_or_city} - more detailed selection

There are too many possible values to list here. See
`http://classifiedplus.aol.com' in category employment->JobSearch for a
full list. Here are some examples from that list: to select jobs only
from California use {'qcqs' => ':ca:'}; for jobs from San Francisco use
{'qcqs' => 'san francisco:ca:807'}.

     {'QZ' => $zip_code} - restrict by zip code

AUTHOR
======

`WWW::Search::Aol' is written and maintained by Alexander Tkatchev
(Alexander.Tkatchev@cern.ch).

LEGALESE
========

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
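As a minimal sketch of the location rule for WWW::Search::AOL::Classifieds::Employment (only one of QREG, qcqs and QZ may be set at a time), the helper below assembles the option hash that is passed as the second argument to native_query(). The name `aol_options` is hypothetical and not part of WWW::Search; the option keys and values are the ones listed under OPTIONS.

```perl
use strict;
use warnings;

# Location keys documented as mutually exclusive.
my @LOCATION_KEYS = qw(QREG qcqs QZ);

# Hypothetical helper (not part of WWW::Search): validate and return
# the option hash for native_query().
sub aol_options {
    my (%opt) = @_;
    my @loc = grep { exists $opt{$_} } @LOCATION_KEYS;
    die "only one of (@LOCATION_KEYS) may be set, got: @loc\n" if @loc > 1;
    return \%opt;
}

# Exact phrase, Information Technology jobs, restricted to California.
my $options = aol_options(QY => 5, QVSSCAT => 70, qcqs => ':ca:');
print join(', ', map { "$_=$options->{$_}" } sort keys %$options), "\n";
```

The returned reference could then be handed to `$oSearch->native_query($sQuery, $options)`; rejecting conflicting keys up front avoids sending a request whose location filter is ambiguous.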

File: pm.info, Node: WWW/Search/AlltheWeb, Next: WWW/Search/AltaVista, Prev: WWW/Search/AOL/Classifieds/Employment, Up: Module List

class for searching AlltheWeb
*****************************

NAME
====

WWW::Search::AlltheWeb - class for searching AlltheWeb

SYNOPSIS
========

     use WWW::Search;
     $query = "sprinkler system installation how to";
     $search = new WWW::Search('AlltheWeb');
     $search->native_query(WWW::Search::escape_query($query));
     $search->maximum_to_retrieve(100);
     while (my $result = $search->next_result()) {
         $url = $result->url;
         $title = $result->title;
         $desc = $result->description;
         # One result per block, separated by a blank line.
         print "$title\n$url\n$desc\n\n";
     }

DESCRIPTION
===========

AlltheWeb is a class specialization of WWW::Search. It handles making
and interpreting AlltheWeb searches. AlltheWeb
(`http://www.alltheweb.com') is one of the fastest and largest search
engines around.

This class exports no public interface; all interaction should be done
through *Note WWW/Search: WWW/Search, objects.

See SYNOPSIS.

AUTHOR
======

`WWW::Search::AlltheWeb' is written by Jim Smyser.

COPYRIGHT
=========

Copyright (c) 1996-1999 University of Southern California. All rights
reserved.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

File: pm.info, Node: WWW/Search/AltaVista, Next: WWW/Search/AltaVista/AdvancedNews, Prev: WWW/Search/AlltheWeb, Up: Module List

class for searching Alta Vista
******************************

NAME
====

WWW::Search::AltaVista - class for searching Alta Vista

SYNOPSIS
========

     require WWW::Search;
     $search = new WWW::Search('AltaVista');

DESCRIPTION
===========

This class is an AltaVista specialization of WWW::Search. It handles
making and interpreting AltaVista searches at
`http://www.altavista.com'.

This class exports no public interface; all interaction should be done
through WWW::Search objects.

OPTIONS
=======

The default is for simple web queries. Specialized back-ends for simple
and advanced web and news searches are available (see *Note
WWW/Search/AltaVista/Web: WWW/Search/AltaVista/Web,, *Note
WWW/Search/AltaVista/AdvancedWeb: WWW/Search/AltaVista/AdvancedWeb,,
*Note WWW/Search/AltaVista/News: WWW/Search/AltaVista/News,, *Note
WWW/Search/AltaVista/AdvancedNews: WWW/Search/AltaVista/AdvancedNews,).
These back-ends set different combinations of the following options.

search_url=URL
     Specifies which host to query with the AltaVista protocol.
The default is at `http://www.altavista.com/cgi-bin/query'; you may wish to retarget it to `http://www.altavista.telia.com/cgi-bin/query' or other hosts if you think that they're "closer". search_debug, search_parse_debug, search_ref Specified at *Note WWW/Search: WWW/Search,. pg=aq Do advanced queries. (It defaults to simple queries.) what=news Search Usenet instead of the web. (It defaults to search the web.) SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,, or the specialized AltaVista searches described in options. HOW DOES IT WORK? ================= `native_setup_search' is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in `{_next_url}'. `native_retrieve_some' is called (from `WWW::Search::retrieve_some') whenever more hits are needed. It calls the LWP library to fetch the page specified by `{_next_url}'. It parses this page, appending any search hits it finds to `{cache}'. If it finds a "next" button in the text, it sets `{_next_url}' to point to the page for the next set of results, otherwise it sets it to undef to indicate we're done. AUTHOR and CURRENT VERSION ========================== `WWW::Search::AltaVista' is written and maintained by John Heidemann, . The best place to obtain `WWW::Search::AltaVista' is from Martin Thurn's WWW::Search releases on CPAN. Because AltaVista sometimes changes its format in between his releases, sometimes more up-to-date versions can be found at `http://www.isi.edu/~johnh/SOFTWARE/WWW_SEARCH_ALTAVISTA/index.html'. COPYRIGHT ========= Copyright (c) 1996-1998 University of Southern California. All rights reserved. 
Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/AltaVista/AdvancedNews, Next: WWW/Search/AltaVista/AdvancedWeb, Prev: WWW/Search/AltaVista, Up: Module List class for advanced Alta Vista news searching ******************************************** NAME ==== WWW::Search::AltaVista::AdvancedNews - class for advanced Alta Vista news searching SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('AltaVista::AdvancedNews'); DESCRIPTION =========== This class implements the advanced AltaVista news search (specializing AltaVista and WWW::Search). It handles making and interpreting AltaVista web searches `http://www.altavista.com'. Details of AltaVista can be found at *Note WWW/Search/AltaVista: WWW/Search/AltaVista,. This class exports no public interface; all interaction should be done through WWW::Search objects. AUTHOR ====== `WWW::Search' is written by John Heidemann, . COPYRIGHT ========= Copyright (c) 1996 University of Southern California. All rights reserved. 
Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation, advertising
materials, and other materials related to such distribution and use
acknowledge that the software was developed by the University of
Southern California, Information Sciences Institute. The name of the
University may not be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

File: pm.info, Node: WWW/Search/AltaVista/AdvancedWeb, Next: WWW/Search/AltaVista/Careers, Prev: WWW/Search/AltaVista/AdvancedNews, Up: Module List

class for advanced Alta Vista web searching
*******************************************

NAME
====

WWW::Search::AltaVista::AdvancedWeb - class for advanced Alta Vista web
searching

SYNOPSIS
========

     require WWW::Search;
     $search = new WWW::Search('AltaVista::AdvancedWeb');

DESCRIPTION
===========

Class hack for the Advanced AltaVista web search mode
(`http://www.altavista.com'), originally written by John Heidemann.

This hack allows AltaVista AdvancedWeb search results to be sorted so
that the most relevant results are returned first. Initially, this
class skipped the 'r' option, which AltaVista uses to sort search
results by relevance. Sending an advanced query using only the 'q'
option returned results in random order, which made it impossible to
view the best-scored results first.

This class exports no public interface; all interaction should be done
through WWW::Search objects.

USAGE
=====

Advanced AltaVista searching requires the boolean operators AND, OR,
AND NOT and NEAR, which AltaVista expects in all uppercase. Phrases
must be enclosed in parentheses ( ) instead of double quotes.

Some examples:

     (John Heidemann) AND (lsam OR replication) AND NOT (somestupidword OR thisone)
     (lsam OR replication) AND (John Heidemann) AND NOT (somestupidword OR thisone)
     Batman and Robin and not Joker
     Batman and Robin and not (joker or riddler)

Comments: For ideal results, start your query with the words that
matter most. This module takes those words and applies them first for
sorting purposes.

Case no longer matters for the Boolean operators when you use this
module: a lowercase 'and', for example, is uppercased to 'AND'. This
makes constructing complex queries easier.

AUTHOR
======

`WWW::Search' hack by Jim Smyser.

COPYRIGHT
=========

Copyright (c) 1996 University of Southern California. All rights
reserved.

Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation, advertising
materials, and other materials related to such distribution and use
acknowledge that the software was developed by the University of
Southern California, Information Sciences Institute. The name of the
University may not be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

VERSION HISTORY
===============

2.01 - Additional query modifiers added for even better results.

2.0 - Minor change to set lowercase Boolean operators to uppercase.

1.9 - First hack version release.
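The USAGE notes for WWW::Search::AltaVista::AdvancedWeb say that lowercase Boolean operators are converted to uppercase. The following is a minimal sketch of how such a normalization could be done; it is an illustration only and is not taken from the module's source. The function name `uppercase_operators` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: uppercase the words and, or, not and near
# wherever they appear as whole words in the query.
sub uppercase_operators {
    my ($query) = @_;
    $query =~ s/\b(and|or|not|near)\b/\U$1/gi;
    return $query;
}

print uppercase_operators('Batman and Robin and not (joker or riddler)'), "\n";
```

A whole-word substitution like this also uppercases an "and" or "or" that was meant as a search term, so a phrase containing those words would be altered by this sketch.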

File: pm.info, Node: WWW/Search/AltaVista/Careers, Next: WWW/Search/AltaVista/Intranet, Prev: WWW/Search/AltaVista/AdvancedWeb, Up: Module List

class for searching www.altavistacareers.com
********************************************

NAME
====

WWW::Search::AltaVista::Careers - class for searching
www.altavistacareers.com

SYNOPSIS
========

     use WWW::Search;
     my $oSearch = new WWW::Search('AltaVista::Careers');
     my $sQuery = WWW::Search::escape_query("java c++");
     $oSearch->native_query($sQuery, {'state' => 'CA'});
     while (my $res = $oSearch->next_result()) {
         print $res->title . "\t" . $res->change_date . "\t"
             . $res->location . "\t" . $res->url . "\n";
     }

DESCRIPTION
===========

This class is an AltaVistaCareers specialization of WWW::Search. It
handles making and interpreting AltaVistaCareers searches at
`http://careers.altavista.com'.

The returned WWW::SearchResult objects contain url, title, location and
change_date fields.

OPTIONS
=======

The following search options can be activated by sending a hash as the
second argument to native_query().

The only available options select a specific location. The default is
to search all locations. To change it use

     {'state' => $state} - Only jobs in state $state.
     {'city' => $city} - Only jobs in city $city.

AUTHOR
======

`WWW::Search::AltaVistaCareers' is written and maintained by Alexander
Tkatchev (Alexander.Tkatchev@cern.ch).

LEGALESE
========

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
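The 'state' and 'city' options of WWW::Search::AltaVista::Careers are passed as a hash to native_query(). As an illustration only, and assuming the back-end sends such options to the server as ordinary request parameters (the module's real URL construction may differ), the sketch below serializes an option hash into a query string. Both helper names, `url_encode` and `options_to_query`, are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except unreserved
# URL characters.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical helper: join the options as key=value pairs, sorted by
# key so the output is stable.
sub options_to_query {
    my ($options) = @_;
    return join '&',
        map { url_encode($_) . '=' . url_encode($options->{$_}) }
        sort keys %$options;
}

my $params = options_to_query({ state => 'CA', city => 'San Jose' });
print "$params\n";
```

Sorting the keys is a design choice for reproducible output; a server reading named parameters would not normally depend on their order.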
File: pm.info, Node: WWW/Search/AltaVista/Intranet, Next: WWW/Search/AltaVista/News, Prev: WWW/Search/AltaVista/Careers, Up: Module List class for searching via AltaVista Search Intranet 2.3 ***************************************************** NAME ==== WWW::Search::AltaVista::Intranet - class for searching via AltaVista Search Intranet 2.3 SYNOPSIS ======== use WWW::Search; my $oSearch = new WWW::Search('AltaVista::Intranet', (_host => 'copper', _port => 9000),); my $sQuery = WWW::Search::escape_query("+investment +club"); $oSearch->native_query($sQuery); while (my $oResult = $oSearch->next_result()) { print $oResult->url, "\n"; } DESCRIPTION =========== This class implements a search on AltaVista's Intranet ("AVI") Search. This class exports no public interface; all interaction should be done through WWW::Search objects. NOTES ===== If your query includes characters outside the 7-bit ascii, you must tell AVI how to interpret 8-bit characters. Add an option for 'enc' to the native_query() call: $oSearch->native_query(WWW::Search::escape_query('Zürich'), { 'enc' => 'iso88591'}, ); Hopefully the correct values for various languages can be found in the AVI documentation (sorry, I haven't looked). TESTING ======= There is no standard built-in test mechanism for this module, because very few users of WWW::Search will have AVI installed on their intranet. (How's that for an excuse? ;-) AUTHOR ====== `WWW::Search::AltaVista::Intranet' was written by Martin Thurn COPYRIGHT ========= Copyright (c) 1996 University of Southern California. All rights reserved. Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. 
The name of the University may not be used to endorse or promote
products derived from this software without specific prior written
permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

VERSION HISTORY
===============

If it's not listed here, then it wasn't a meaningful or released
revision.

2.04, 2000-03-09
----------------

Added pod for selecting query language encoding

2.03, 2000-02-14
----------------

Added support for score/rank (thanks to Peter bon Burg)

2.02, 1999-11-29
----------------

Fixed to work with latest version of AltaVista.pm

1.03, 1999-06-20
----------------

First publicly-released version.

File: pm.info, Node: WWW/Search/AltaVista/News, Next: WWW/Search/AltaVista/Web, Prev: WWW/Search/AltaVista/Intranet, Up: Module List

class for Alta Vista news searching
***********************************

NAME
====

WWW::Search::AltaVista::News - class for Alta Vista news searching

SYNOPSIS
========

     require WWW::Search;
     $search = new WWW::Search('AltaVista::News');

DESCRIPTION
===========

This class implements the AltaVista news search (specializing AltaVista
and WWW::Search). It handles making and interpreting AltaVista news
searches at `http://www.altavista.com'.

Details of AltaVista can be found at *Note WWW/Search/AltaVista:
WWW/Search/AltaVista,.

This class exports no public interface; all interaction should be done
through WWW::Search objects.

AUTHOR
======

`WWW::Search' is written by John Heidemann.

COPYRIGHT
=========

Copyright (c) 1996 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/AltaVista/Web, Next: WWW/Search/Crawler, Prev: WWW/Search/AltaVista/News, Up: Module List class for Alta Vista web searching ********************************** NAME ==== WWW::Search::AltaVista::Web - class for Alta Vista web searching SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('AltaVista::Web'); DESCRIPTION =========== This class implements the AltaVista web search (specializing AltaVista and WWW::Search). It handles making and interpreting AltaVista web searches `http://www.altavista.com'. Details of AltaVista can be found at *Note WWW/Search/AltaVista: WWW/Search/AltaVista,. This class exports no public interface; all interaction should be done through WWW::Search objects. AUTHOR ====== `WWW::Search' is written by John Heidemann, . COPYRIGHT ========= Copyright (c) 1996 University of Southern California. All rights reserved. Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. 
The name of the University may not be used to endorse or promote
products derived from this software without specific prior written
permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

File: pm.info, Node: WWW/Search/Crawler, Next: WWW/Search/Dice, Prev: WWW/Search/AltaVista/Web, Up: Module List

class for searching Crawler
***************************

NAME
====

WWW::Search::Crawler - class for searching Crawler

SYNOPSIS
========

     require WWW::Search;
     $search = new WWW::Search('Crawler');

DESCRIPTION
===========

This class is a Crawler specialization of WWW::Search. It handles
making and interpreting Crawler searches at `http://www.crawler.de'.

This class exports no public interface; all interaction should be done
through WWW::Search objects.

SEE ALSO
========

To make new back-ends, see *Note WWW/Search: WWW/Search,.

HOW DOES IT WORK?
=================

`native_setup_search' is called before we do anything. It initializes
our private variables (which all begin with underscores) and sets up a
URL to the first results page in `{_next_url}'.

`native_retrieve_some' is called (from `WWW::Search::retrieve_some')
whenever more hits are needed. It calls the LWP library to fetch the
page specified by `{_next_url}'. It parses this page, appending any
search hits it finds to `{cache}'. If it finds a "next" button in the
text, it sets `{_next_url}' to point to the page for the next set of
results; otherwise it sets it to undef to indicate we're done.

AUTHOR
======

`WWW::Search::Crawler' has been shamelessly copied by Andreas Borchert
from `WWW::Search::AltaVista' by John Heidemann.

COPYRIGHT
=========

The original parts from John Heidemann are subject to the following
copyright notice:

Copyright (c) 1996-1998 University of Southern California. All rights
reserved.
Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/Dice, Next: WWW/Search/Excite, Prev: WWW/Search/Crawler, Up: Module List class for searching Dice ************************ NAME ==== WWW::Search::Dice - class for searching Dice SYNOPSIS ======== use WWW::Search; my $oSearch = new WWW::Search('Dice'); my $sQuery = WWW::Search::escape_query("unix and (c++ or java)"); $oSearch->native_query($sQuery, {'method' => 'bool', 'state' => 'CA', 'daysback' => 14}); while (my $res = $oSearch->next_result()) { if(isHitGood($res->url)) { my ($company,$title,$date,$location) = $oSearch->getMoreInfo($res->url); print "$company $title $date $location " . $res->url . "\n"; } } sub isHitGood {return 1;} DESCRIPTION =========== This class is a Dice specialization of WWW::Search. It handles making and interpreting Dice searches at `http://www.dice.com'. By default, returned WWW::SearchResult objects contain only url, title and description which is a mixture of location and skills wanted. Function *getMoreInfo( $url )* provides more specific info - it has to be used as my ($company,$title,$date,$location) = $oSearch->getMoreInfo($res->url); OPTIONS ======= The following search options can be activated by sending a hash as the second argument to native_query(). 
Format / Treatment of Query Terms --------------------------------- The default is to treat entire query as a boolean expression with AND, OR, NOT and parentheses {'method' => 'and'} Logical AND of all the query terms. {'method' => 'or'} Logical OR of all the query terms. {'method' => 'bool'} treat entire query as a boolean expression with AND, OR, NOT and parentheses. This is the default option. Restrict by Date ---------------- The default is to return jobs posted in last 30 days {'daysback' => $number} Display jobs posted in last $number days Restrict by Location -------------------- The default is "ALL" which means all US states {'state' => $state} - Only jobs in state $state. {'state' => 'CDA'} - Only jobs in Canada. {'state' => 'INT'} - To select international jobs. {'state' => 'TRV'} - Require travel. {'state' => 'TEL'} - Display telecommute jobs. Multiple selections are possible. To do so, add a "+" sign between desired states, e.g. {'state' => 'NY+NJ+CT'} You can also restrict by 3-digit area codes. The following option does that: {'acode' => $area_code} Multiple area codes (up to 5) are supported. Restrict by Job Term -------------------- No restrictions by default. {'term' => 'CON'} - contract jobs {'term' => 'C/H'} - contract to hire {'term' => 'FTE'} - full time Use a '+' sign for multiple selection. There is also a switch to select either W2 or Independent: {'addterm' => 'W2ONLY'} - W2 only {'addterm' => 'INDOK'} - Independent ok Restrict by Job Type -------------------- No restriction by default. 
To select jobs with specific job type use the following option: {'jtype' => $jobtype} Here $jobtype (according to `http://www.dice.com') can be one or more of the following: * ANL - Business Analyst/Modeler * COM - Communications Specialist * DBA - Data Base Administrator * ENG - Other types of Engineers * FIN - Finance / Accounting * GRA - Graphics/CAD/CAM * HWE - Hardware Engineer * INS - Instructor/Trainer * LAN - LAN/Network Administrator * MGR - Manager/Project leader * OPR - Data Processing Operator * PA - Application Programmer/Analyst * QA - Quality Assurance/Tester * REC - Recruiter * SLS - Sales/Marketing * SWE - Software Engineer * SYA - Systems Administrator * SYS - Systems Programmer/Support * TEC - Custom/Tech Support * TWR - Technical Writer * WEB - Web Developer / Webmaster Limit total number of hits -------------------------- The default is to stop searching after 500 hits. {'num_to_retrieve' => $num_to_retrieve} Changes the default to $num_to_retrieve. AUTHOR ====== `WWW::Search::Dice' is written and maintained by Alexander Tkatchev (Alexander.Tkatchev@cern.ch). LEGALESE ======== THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/Excite, Next: WWW/Search/Excite/News, Prev: WWW/Search/Dice, Up: Module List backend for searching www.excite.com ************************************ NAME ==== WWW::Search::Excite - backend for searching www.excite.com SYNOPSIS ======== use WWW::Search; my $oSearch = new WWW::Search('Excite'); my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio"); $oSearch->native_query($sQuery); while (my $oResult = $oSearch->next_result()) { print $oResult->url, "\n"; } DESCRIPTION =========== This class is a Excite specialization of WWW::Search. It handles making and interpreting Excite searches `http://www.excite.com'. 
This class exports no public interface; all interaction should be done through *Note WWW/Search: WWW/Search, objects. NOTES ===== www.excite.com does not report the approximate result count. SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,. CAVEATS ======= Only returns results from Excite's "Web Results". Ignores all other sections of Excite's query results. BUGS ==== Please tell the author if you find any! AUTHOR ====== As of 1998-03-23, `WWW::Search::Excite' is maintained by Martin Thurn (MartinThurn@iname.com). `WWW::Search::Excite' was originally written by Martin Thurn based on `WWW::Search::HotBot'. LEGALESE ======== THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. VERSION HISTORY =============== 2.16, 2000-11-02 ---------------- No change in functionality, but parser was totally rewritten using HTML::TreeBuilder 2.14, 2000- ----------- BUGFIX for missing result-count sometimes; 2.13, 2000-10-10 ---------------- BUGFIX for missing result-count sometimes; BUGFIX for missing END of results; BUGFIX for mis-parsing URLs 2.12, 2000-09-18 ---------------- BUGFIX for still missing the result-count; BUGFIX for missing all results sometimes 2.11, 2000-09-05 ---------------- BUGFIX for still missing some header formats 2.07, 2000-03-29 ---------------- BUGFIX for sometimes missing header (and getting NO results) 2.06, 2000-03-02 ---------------- BUGFIX for bungled next_url 2.05, 2000-02-08 ---------------- testing now uses WWW::Search::Test module; www.excite.com only allows (up to) 50 per page (and no odd numbers) 2.04, 2000-01-28 ---------------- www.excite.com changed their output format slightly 2.03, 1999-10-20 ---------------- www.excite.com changed their output format slightly; use strip_tags() on title and description results 2.02, 1999-10-05 ---------------- now uses hash_to_cgi_string() 1.12, 
1999-06-29 ---------------- updated test cases 1.10, 1999-06-11 ---------------- fixed a BUG where returned URLs were garbled (maybe this was because www.excite.com changed their links) 1.08, 1998-11-06 ---------------- www.excite.com changed their output format slightly (thank you Jim (jsmyser@bigfoot.com) for pointing it out!) 1.7, 1998-10-09 --------------- use new split_lines function 1.5 --- \n changed to \012 for MacPerl compatibility 1.4 --- Modified for new Excite output format. 1.2 --- First publicly-released version.  File: pm.info, Node: WWW/Search/Excite/News, Next: WWW/Search/ExciteForWebServers, Prev: WWW/Search/Excite, Up: Module List class for searching ExciteNews ****************************** NAME ==== WWW::Search::Excite::News - class for searching ExciteNews SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('Excite::News'); DESCRIPTION =========== Class for searching Excite News `http://www.excite.com'. Excite has one of the best news bot on the web. Following results returned for printing are: $result->{'description'} will return description if any $result->{'source'} articles news source $result->{'date'} articles date This class exports no public interface; all interaction should be done through WWW::Search objects. SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,. HOW DOES IT WORK? ================= `native_setup_search' is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in `{_next_url}'. `native_retrieve_some' is called (from `WWW::Search::retrieve_some') whenever more hits are needed. It calls the LWP library to fetch the page specified by `{_next_url}'. It parses this page, appending any search hits it finds to `{cache}'. If it finds a "next" button in the text, it sets `{_next_url}' to point to the page for the next set of results, otherwise it sets it to undef to indicate we are done. 
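The paging scheme described under HOW DOES IT WORK? can be sketched without any network access. In the sketch below a small in-memory table stands in for the pages that the real back-end fetches with LWP, and the functions `setup_search` and `retrieve_some` are simplified stand-ins for `native_setup_search' and `native_retrieve_some', not the module's actual code. The page names and hit values are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the remote site: each "URL" maps to the hits on that
# page and the link behind its "next" button (undef on the last page).
my %PAGES = (
    'page1' => { hits => [qw(a b)], next => 'page2' },
    'page2' => { hits => [qw(c)],   next => undef   },
);

# Like native_setup_search: initialise private state and point
# {_next_url} at the first results page.
sub setup_search {
    my ($self) = @_;
    $self->{cache}     = [];
    $self->{_next_url} = 'page1';
    return;
}

# Like native_retrieve_some: fetch one page, append its hits to
# {cache}, then follow the "next" link or set {_next_url} to undef.
sub retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};
    my $page = $PAGES{ $self->{_next_url} };   # stands in for an LWP fetch
    push @{ $self->{cache} }, @{ $page->{hits} };
    $self->{_next_url} = $page->{next};
    return scalar @{ $page->{hits} };
}

my $search = {};
setup_search($search);
retrieve_some($search) while defined $search->{_next_url};
print "@{ $search->{cache} }\n";
```

Keeping the "where to go next" state in `{_next_url}` means the caller only has to ask for more hits; it never needs to know how the engine numbers its result pages.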
AUTHOR ====== Maintained by Jim Smyser TESTING ======= This module adheres to the `WWW::Search' test suite mechanism. See $TEST_CASES below. VERSION HISTORY =============== 2.03, 2000-03-21 ---------------- New format changes 2.02, 1999-10-5 --------------- Misc. formatting changes 2.01, 1999-07-13 ---------------- New test mechanism COPYRIGHT ========= The original parts from John Heidemann are subject to following copyright notice: Copyright (c) 1996-1998 University of Southern California. All rights reserved. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/ExciteForWebServers, Next: WWW/Search/Fireball, Prev: WWW/Search/Excite/News, Up: Module List class for searching ExciteforWeb engine *************************************** NAME ==== WWW::Search::ExciteForWebServers - class for searching ExciteforWeb engine SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('ExciteForWebServers'); DESCRIPTION =========== This class is a specialization of WWW::Search for search indices built using Excite for Web Servers (available from `http://www.excite.com'). This class exports no public interface; all interaction should be done through WWW::Search objects. This object interprets the WWW::Search `search_how' in this node attribute as follows: match_any = concept search match_all = keyword (simple) search match_phrase = error condition match_boolean= error condition AUTHOR ====== `WWW::Search::ExciteForWebServers' is written by Paul Lindner, COPYRIGHT ========= Copyright (c) 1997,98 by the United Nations Administrative Committee on Coordination (ACC) All rights reserved.  
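The `search_how' mapping in the DESCRIPTION of WWW::Search::ExciteForWebServers (match_any and match_all are supported, match_phrase and match_boolean are error conditions) can be expressed as a small lookup table. This is a sketch only: the function name `excite_mode` is hypothetical, and the strings it returns are the labels from the documentation, not the request values the real back-end sends.

```perl
use strict;
use warnings;

# Supported search_how values and the search type each one selects.
my %MODE_FOR = (
    match_any => 'concept search',
    match_all => 'keyword (simple) search',
);

# Hypothetical helper: return the search type, or die for the values
# documented as error conditions (and for anything unknown).
sub excite_mode {
    my ($search_how) = @_;
    die "search_how '$search_how' is an error condition for this back-end\n"
        unless exists $MODE_FOR{$search_how};
    return $MODE_FOR{$search_how};
}

for my $how (qw(match_any match_all match_phrase match_boolean)) {
    my $mode = eval { excite_mode($how) };
    printf "%-13s => %s\n", $how, defined $mode ? $mode : 'error';
}
```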
File: pm.info, Node: WWW/Search/Fireball, Next: WWW/Search/FirstGov, Prev: WWW/Search/ExciteForWebServers, Up: Module List class for searching Fireball **************************** NAME ==== WWW::Search::Fireball - class for searching Fireball SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('Fireball'); DESCRIPTION =========== This class is a Fireball specialization of WWW::Search. It handles making and interpreting Fireball searches at `http://www.fireball.de'. This class exports no public interface; all interaction should be done through WWW::Search objects. SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,. HOW DOES IT WORK? ================= `native_setup_search' is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in `{_next_url}'. `native_retrieve_some' is called (from `WWW::Search::retrieve_some') whenever more hits are needed. It calls the LWP library to fetch the page specified by `{_next_url}'. It parses this page, appending any search hits it finds to `{cache}'. If it finds a "next" button in the text, it sets `{_next_url}' to point to the page for the next set of results; otherwise it sets it to undef to indicate we're done. AUTHOR ====== `WWW::Search::Fireball' has been shamelessly copied by Andreas Borchert from `WWW::Search::AltaVista' by John Heidemann. COPYRIGHT ========= The original parts from John Heidemann are subject to the following copyright notice: Copyright (c) 1996-1998 University of Southern California. All rights reserved. Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. 
The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  File: pm.info, Node: WWW/Search/FirstGov, Next: WWW/Search/FolioViews, Prev: WWW/Search/Fireball, Up: Module List class for searching http://www.firstgov.gov ******************************************* NAME ==== WWW::Search::FirstGov - class for searching http://www.firstgov.gov SYNOPSIS ======== use WWW::Search; my $search = new WWW::Search('FirstGov'); # cAsE matters my $query = WWW::Search::escape_query("uncle sam"); $search->native_query($query); while (my $result = $search->next_result()) { print $result->url, "\n"; } DESCRIPTION =========== Class specialization of WWW::Search for searching `http://www.firstgov.gov'. FirstGov.gov can return up to 100 hits per page. This class exports no public interface; all interaction should be done through WWW::Search objects. OPTIONS ======= The following search options can be activated by sending a hash as the second argument to native_query(). { 'begin_at' => '100' } Retrieve results starting at 100th match. { 'pl' => 'domain', 'domain' => 'osec.doc.gov+itd.doc.gov' } The query is limited to searching the domains osec.doc.gov and itd.doc.gov. SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,, or the specialized AltaVista searches described in options. See http://www.fed-search.org/specialized.html to learn more about specialized FirstGov searches. HOW DOES IT WORK? ================= `native_setup_search' is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in `{_next_url}'. `native_retrieve_some' is called (from `WWW::Search::retrieve_some') whenever more hits are needed. 
It calls the LWP library to fetch the page specified by `{_next_url}'. It parses this page, appending any search hits it finds to `{cache}'. If it finds a "next" button in the text, it sets `{_next_url}' to point to the page for the next set of results; otherwise it sets it to undef to indicate we're done. AUTHOR ====== `WWW::Search::FirstGov' is written and maintained by Dennis Sutch. LEGALESE ======== THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. BUGS ==== None reported. VERSION HISTORY =============== 1.03 2001-03-01 - Removed 'require 5.005_62;'. 1.02 2001-03-01 - Removed 'my' declarations for package variables. 1.01 2001-02-26 - Fixed problem with quoted string on MSWin. Removed 'our' declarations. 1.00 2001-02-23 - First publicly-released version.  File: pm.info, Node: WWW/Search/FolioViews, Next: WWW/Search/Go, Prev: WWW/Search/FirstGov, Up: Module List class for searching Folio Views ******************************* NAME ==== WWW::Search::FolioViews - class for searching Folio Views SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('FolioViews'); DESCRIPTION =========== This class is a Folio Views specialization of WWW::Search. It queries and interprets searches based on Folio Views, which is available at `http://www.openmarket.com'. This class exports no public interface; all interaction should be done through WWW::Search objects. OPTIONS ======= This search supports the standard WWW::Search arguments: search_url The Folio Views URL to search. This usually looks like `http://somehost/.../cgi-bin/search2.pl' search_args The arguments used for the search engine, separated by &.
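As a rough illustration of the `search_args' format, the sketch below joins name=value pairs with `&'. The argument names and values (`infobase', `softpage') are hypothetical; a real Folio Views installation defines its own, and values containing special characters would also need URL escaping, which is not shown here.

```perl
use strict;
use warnings;

# Hypothetical argument names and values, for illustration only.
my %args = (
    infobase => 'handbook',
    softpage => 'Browse_Frame',
);

# Build the '&'-separated string expected by search_args.
# Keys are sorted only to make the result reproducible.
my $search_args = join '&', map { "$_=$args{$_}" } sort keys %args;

print "$search_args\n";
```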
SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,. AUTHOR ====== `WWW::Search::FolioViews' is written by Paul Lindner and Nicholas Sapirie. COPYRIGHT ========= Copyright (c) 1998 by the United Nations Administrative Committee on Coordination (ACC) All rights reserved.  File: pm.info, Node: WWW/Search/Go, Next: WWW/Search/Gopher, Prev: WWW/Search/FolioViews, Up: Module List backend class for searching with go.com *************************************** NAME ==== WWW::Search::Go - backend class for searching with go.com SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('Go'); DESCRIPTION =========== This class is a Go specialization of WWW::Search. It handles making and interpreting Go searches at `http://www.Go.com', the former Infoseek search engine. This class exports no public interface; all interaction should be done through WWW::Search objects. USAGE EXAMPLE ============= use WWW::Search; my $oSearch = new WWW::Search('Go'); $oSearch->maximum_to_retrieve(100); #$oSearch ->{_debug}=1; my $sQuery = WWW::Search::escape_query("cgi"); $oSearch->gui_query($sQuery); while (my $oResult = $oSearch->next_result()) { print $oResult->url,"\t",$oResult->title,"\n"; } AUTHOR ====== `WWW::Search::Go' is written by Alain BARBET, alian@alianwebserver.com  File: pm.info, Node: WWW/Search/Gopher, Next: WWW/Search/HeadHunter, Prev: WWW/Search/Go, Up: Module List class for searching Gopher pages ******************************** NAME ==== WWW::Search::Gopher - class for searching Gopher pages SYNOPSIS ======== require WWW::Search; $search = new WWW::Search('Gopher'); DESCRIPTION =========== This class is a specialization of WWW::Search that searches Gopher index items. This class exports no public interface; all interaction should be done through WWW::Search objects.
AUTHOR ====== `WWW::Search::Gopher' is written by Paul Lindner. COPYRIGHT ========= Copyright (c) 1997,98 by the United Nations Administrative Committee on Coordination (ACC) All rights reserved.  File: pm.info, Node: WWW/Search/HeadHunter, Next: WWW/Search/HotBot, Prev: WWW/Search/Gopher, Up: Module List class for searching HeadHunter ****************************** NAME ==== WWW::Search::HeadHunter - class for searching HeadHunter SYNOPSIS ======== use WWW::Search; my $oSearch = new WWW::Search('HeadHunter'); my $sQuery = WWW::Search::escape_query("unix and (c++ or java)"); $oSearch->native_query($sQuery, {'SID' => 'CA', 'Freshness' => 14}); while (my $res = $oSearch->next_result()) { print $res->company . "\t" . $res->title . "\t" . $res->change_date . "\t" . $res->location . "\t" . $res->url . "\n"; } DESCRIPTION =========== This class is a HeadHunter specialization of WWW::Search. It handles making and interpreting HeadHunter searches at `http://www.HeadHunter.net'. HeadHunter supports Boolean logic with "and"s and "or"s. See `http://www.HeadHunter.net/Help/jobquerylang.htm' for a full description of the query language. The returned WWW::SearchResult objects contain url, title, *company*, location and change_date fields. OPTIONS ======= The following search options can be activated by sending a hash as the second argument to native_query(). Restrict by Date ---------------- The default is to return jobs posted in the last 30 days (internally done by the HeadHunter search engine). {'Freshness' => $number} Display jobs posted in the last $number days Restrict by Location -------------------- No restriction by default. {'Town' => $town} To select jobs from approximately 30 miles around the city. {'SID' => $loc} Only jobs in state/province $loc (two letters only). {'CID' => 'US'} To view only US jobs. To see jobs from other countries, check out the acceptable country list at `http://www.Headhunter.net/listcoun.htm'. Restrict by Salary ------------------ No restrictions by default.
{'Pay' => 'P1'} - less than $15,000 Per Year {'Pay' => 'P2'} - $15,000 - $30,000 Per Year {'Pay' => 'P3'} - $30,000 - $50,000 Per Year {'Pay' => 'P4'} - $50,000 - $75,000 Per Year {'Pay' => 'P5'} - $75,000 - $100,000 Per Year {'Pay' => 'P6'} - more than $100,000 Per Year To select several pay ranges use a '+' sign, e.g. {'Pay' => 'P3+P4'} Restrict by Employment Type --------------------------- No restrictions by default. {'EmpType' => 'Typ1'} - Employee {'EmpType' => 'Typ2'} - Contract {'EmpType' => 'Typ3'} - Employee or Contract {'EmpType' => 'Typ4'} - Intern Restrict by Job Category ------------------------ No restriction by default. To select jobs from a specific job category use the following option: {'Cats' => $job_category} See below the list of acceptable values of $job_category. Multiple selections are possible (up to five) using a '+' sign, e.g. {'Cats' => 'Cat001+Cat002'}. * Cat001 - Accounting * Cat002 - Activism * Cat003 - Administration * Cat004 - Advertising * Cat005 - Aerospace * Cat110 - Agriculture * Cat006 - Air Conditioning * Cat007 - Airlines * Cat008 - Apartment Management * Cat009 - Architecture * Cat010 - Art * Cat011 - Automotive * Cat012 - Aviation * Cat013 - Banking * Cat015 - Bilingual * Cat111 - Biotechnology * Cat016 - Bookkeeping * Cat017 - Broadcasting * Cat018 - Care Giving * Cat112 - Carpentry * Cat113 - Chemistry * Cat019 - Civil Service * Cat020 - Clerical * Cat021 - College * Cat114 - Communication * Cat022 - Computer * Cat023 - Construction * Cat125 - Consulting * Cat024 - Counseling * Cat025 - Customer Service * Cat026 - Decorating * Cat027 - Dental * Cat028 - Design * Cat029 - Driving * Cat030 - Education * Cat031 - Electronic * Cat032 - Emergency * Cat033 - Employment * Cat034 - Engineering * Cat035 - Entertainment * Cat036 - Environmental * Cat037 - Executive * Cat115 - Fabrication * Cat116 - Facilities * Cat038 - Fashion/Apparel * Cat039 - Financial * Cat040 - Food Services * Cat042 - Fundraising * Cat044 - General Office * 
Cat126 - Government * Cat045 - Graphics * Cat046 - Grocery * Cat047 - Health/Medical * Cat048 - Home Services * Cat049 - Hospital * Cat050 - Hotel/Motel * Cat052 - Human Resources * Cat053 - HVAC * Cat054 - Import/Export * Cat117 - Industrial * Cat055 - Installer * Cat056 - Insurance * Cat118 - Internet * Cat057 - Janitorial * Cat119 - Journalism * Cat058 - Law Enforcement * Cat059 - Legal * Cat060 - Maintenance * Cat061 - Management * Cat062 - Manufacturing * Cat063 - Marketing * Cat064 - Mechanical * Cat065 - Media * Cat066 - Merchandising * Cat127 - Military * Cat067 - Mining * Cat128 - Mortgage * Cat069 - Multimedia * Cat070 - Nursing * Cat071 - Nutrition * Cat121 - Packaging * Cat122 - Painting * Cat073 - Pest Control * Cat129 - Pharmaceutical * Cat075 - Photography * Cat076 - Plumbing * Cat077 - Printing * Cat078 - Professional * Cat079 - Property Management * Cat080 - Public Relations * Cat081 - Publishing * Cat082 - Purchasing * Cat083 - Quality Control * Cat123 - Radio * Cat084 - Real Estate * Cat085 - Recreation * Cat086 - Research * Cat087 - Restaurant * Cat088 - Retail * Cat089 - Sales * Cat090 - Science * Cat124 - Secretarial * Cat091 - Security * Cat092 - Services * Cat093 - Shipping/Receiving * Cat094 - Social Services * Cat130 - Supply Chain * Cat095 - Teaching * Cat096 - Technical * Cat097 - Telecommunications * Cat098 - Telemarketing * Cat099 - Television * Cat100 - Textile * Cat101 - Trades * Cat102 - Training * Cat103 - Transportation * Cat104 - Travel * Cat105 - Utilities * Cat106 - Warehouse * Cat107 - Waste Management * Cat108 - Word Processing * Cat109 - Work From Home AUTHOR ====== `WWW::Search::HeadHunter' is written and maintained by Alexander Tkatchev (Alexander.Tkatchev@cern.ch). LEGALESE ======== THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  
File: pm.info, Node: WWW/Search/HotBot, Next: WWW/Search/HotFiles, Prev: WWW/Search/HeadHunter, Up: Module List backend for searching hotbot.lycos.com ************************************** NAME ==== WWW::Search::HotBot - backend for searching hotbot.lycos.com SYNOPSIS ======== use WWW::Search; my $oSearch = new WWW::Search('HotBot'); my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio"); $oSearch->native_query($sQuery); while (my $oResult = $oSearch->next_result()) { print $oResult->url, "\n"; } DESCRIPTION =========== This class is a HotBot specialization of WWW::Search. It handles making and interpreting HotBot searches `http://www.hotbot.com'. This class exports no public interface; all interaction should be done through *Note WWW/Search: WWW/Search, objects. By default, WWW::Search::HotBot uses hotbot.com's text-only interface, which can be found at http://hotbot.lycos.com/text . If you want to perform a query with the same default options as if a user typed it in the browser window (i.e. at http://www.hotbot.com), call $oSearch->gui_query($sQuery) instead of ->native_query(). The default behavior is for HotBot to look for "any of" the query terms: $oSearch->native_query(escape_query('Dorothy Oz')); If you want "all of", call native_query like this: $oSearch->native_query(escape_query('Dorothy Oz'), {'SM' => 'MC'}); If you want to send HotBot a boolean phrase, call native_query like this: $oSearch->native_query(escape_query('Oz AND Dorothy NOT Australia'), {'SM' => 'B'}); See below for other query-handling options. OPTIONS ======= The following search options can be activated by sending a hash as the second argument to native_query(). Format / Treatment of Query Terms --------------------------------- The default is logical OR of all the query terms. {'SM' => 'MC'} "Must Contain": logical AND of all the query terms. {'SM' => 'SC'} "Should Contain": logical OR of all the query terms. This is the default. 
{'SM' => 'B'} "Boolean": the entire query is treated as a boolean expression with AND, OR, NOT, and parentheses. {'SM' => 'name'} The entire query is treated as a person's name. {'SM' => 'phrase'} The entire query is treated as a phrase. {'SM' => 'title'} The query is applied to the page title. (I assume the logical OR of the query terms will be applied to the page title.) {'SM' => 'url'} The query is assumed to be a URL, and the results will be pages that link to the query URL. Restricting Search to a Date Range ---------------------------------- The default is no date restrictions. {'date' => 'within', 'DV' => 90} Only return pages updated within 90 days of today. (Substitute any integer in place of 90.) {'date' => 'range', 'DR' => 'newer', 'DY' => 97, 'DM' => 12, 'DD' => 25} Only return pages updated after Christmas 1997. (Substitute any year, month, and day for 97, 12, 25.) {'date' => 'range', 'DR' => 'older', 'DY' => 97, 'DM' => 12, 'DD' => 25} Only return pages updated before Christmas 1997. (Substitute any year, month, and day for 97, 12, 25.) Restricting Search to a Geographic Area --------------------------------------- The default is no restriction to geographic area. {'RD' => 'AN'} Return pages from anywhere. This is the default. {'RD' => 'DM', 'Domain' => 'microsoft.com, .cz'} Restrict search to pages located in the listed domains. (Substitute any list of domain substrings.) {'RD' => 'RG', 'RG' => '.com'} Restrict search to North American commercial web sites. {'RD' => 'RG', 'RG' => '.edu'} Restrict search to North American educational web sites. {'RD' => 'RG', 'RG' => '.gov'} Restrict search to United States Government web sites. {'RD' => 'RG', 'RG' => '.mil'} Restrict search to United States military web sites. {'RD' => 'RG', 'RG' => '.net'} Restrict search to North American '.net' web sites. {'RD' => 'RG', 'RG' => '.org'} Restrict search to North American organizational web sites.
{'RD' => 'RG', 'RG' => 'NA'} "North America": Restrict search to all of the above types of web sites. {'RD' => 'RG', 'RG' => 'AF'} Restrict search to web sites in Africa. {'RD' => 'RG', 'RG' => 'AS'} Restrict search to web sites in India and Asia. {'RD' => 'RG', 'RG' => 'CA'} Restrict search to web sites in Central America. {'RD' => 'RG', 'RG' => 'DU'} Restrict search to web sites in Oceania. {'RD' => 'RG', 'RG' => 'EU'} Restrict search to web sites in Europe. {'RD' => 'RG', 'RG' => 'ME'} Restrict search to web sites in the Middle East. {'RD' => 'RG', 'RG' => 'SE'} Restrict search to web sites in Southeast Asia. Requesting Certain Multimedia Data Types ---------------------------------------- The default is not specifically requesting any multimedia types (presumably, this will NOT restrict the search to NON-multimedia pages). {'FAC' => 1} Return pages which contain Adobe Acrobat PDF data. {'FAX' => 1} Return pages which contain ActiveX. {'FJA' => 1} Return pages which contain Java. {'FJS' => 1} Return pages which contain JavaScript. {'FRA' => 1} Return pages which contain audio. {'FSU' => 1, 'FS' => '.txt, .doc'} Return pages which have one of the listed extensions. (Substitute any list of DOS-like file extensions.) {'FSW' => 1} Return pages which contain ShockWave. {'FVI' => 1} Return pages which contain images. {'FVR' => 1} Return pages which contain VRML. {'FVS' => 1} Return pages which contain VB Script. {'FVV' => 1} Return pages which contain video. Requesting Pages at Certain Depths on Website --------------------------------------------- The default is pages at any level on their website. {'PS'=>'A'} Return pages at any level on their website. This is the default. {'PS' => 'D', 'D' => 3 } Return pages within 3 links of "top" page of their website. (Substitute any integer in place of 3.) {'PS' => 'F'} Only return pages that are the "top" page of their website. SEE ALSO ======== To make new back-ends, see *Note WWW/Search: WWW/Search,. 
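The option groups listed under OPTIONS can presumably be combined in one hash, since they use distinct keys. The sketch below merges four of them and prints the result for inspection; it does not contact HotBot, and whether the site honoured every combination is an assumption rather than something stated above. The variable names are hypothetical.

```perl
use strict;
use warnings;

# One option group per restriction, copied from the OPTIONS lists.
my %all_terms    = ('SM' => 'MC');                     # "Must Contain": logical AND
my %last_90_days = ('date' => 'within', 'DV' => 90);   # updated within 90 days
my %edu_only     = ('RD' => 'RG', 'RG' => '.edu');     # educational web sites
my %top_pages    = ('PS' => 'F');                      # only "top" pages

# Merge the groups into the single hash passed as the second argument
# to native_query(). Later groups would override earlier ones on a key clash.
my %options = (%all_terms, %last_90_days, %edu_only, %top_pages);

# Show the merged options in a stable order.
print join(', ', map { "$_=$options{$_}" } sort keys %options), "\n";

# With WWW::Search installed, the hash would be passed by reference:
#   $oSearch->native_query(WWW::Search::escape_query('Dorothy Oz'), \%options);
```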
CAVEATS ======= When www.hotbot.com reports a "Mirror" URL, WWW::Search::HotBot ignores it. Therefore, the number of URLs returned by WWW::Search::HotBot might not agree with the value returned in approximate_result_count. BUGS ==== Please tell the author if you find any! AUTHOR ====== As of 1998-02-02, `WWW::Search::HotBot' is maintained by Martin Thurn (MartinThurn@iname.com). `WWW::Search::HotBot' was originally written by Wm. L. Scheding, based on `WWW::Search::AltaVista'. LEGALESE ======== THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. VERSION HISTORY =============== If it is not listed here, then it was not a meaningful nor released revision. 2.21, 2000-12-11 ---------------- new URL for advanced search 2.19, 2000-10-11 ---------------- added AM1=MC to all URLs in GUI mode (hotbot.com seems to "randomly" add this if you search manually at their site) 2.18, 2000-06-26 ---------------- fix for only one page of gui results; and "next" link in new place 2.17, 2000-05-24 ---------------- was still missing first URL of non-gui(?) results! 
2.16, 2000-05-17 ---------------- was missing first URL of gui results 2.15, 2000-04-03 ---------------- fixed gui_query() 2.14, 2000-02-01 ---------------- testing now uses WWW::Search::Test module 2.13, 2000-01-31 ---------------- bugfix: was missing title 2.12, 2000-01-19 ---------------- new function gui_query(), and handle output from it 2.10, 1999-12-22 ---------------- handle new result format 2.09, 1999-12-15 ---------------- handle new result count format 2.08, 1999-12-10 ---------------- handle new output format 2.07, 1999-11-12 ---------------- BUGFIX for domain-limited URL parsing (thanks to Leon Brocard) 2.06, 1999-10-18 ---------------- www.hotbot.com changed their output format slightly; now uses strip_tags() on title and description 2.05, 1999-10-05 ---------------- now uses hash_to_cgi_string(); new test cases 2.03, 1999-09-28 ---------------- BUGFIX: was missing the "Next page" link sometimes. 2.02, 1999-08-17 ---------------- Now is able to parse "URL-only" format (i.e. {'DE' => 0}) and "brief description" format (i.e. {'DE' => 1}) if the user so desires. 1.34, 1999-07-01 ---------------- New test cases. 1.32, 1999-06-20 ---------------- Now unescapes the URLs before returning them. 1.31, 1999-06-11 ---------------- www.hotbot.com changed their output format ever so slightly. (Thanks to Jim jsmyser@bigfoot.com for pointing it out) 1.30, 1999-04-12 ---------------- BUG FIX: results for domain-limited search were not parsed. (Thanks to Christopher York yorkc@ccwf.cc.utexas.edu for pointing it out) 1.29, 1999-02-22 ---------------- www.hotbot.com changed their output format. (Thanks to Tim Chklovski timc@mit.edu for pointing it out) 1.27, 1998-11-06 ---------------- HotBot changed their output format(?). HotBot.pm now uses hotbot.com's text-only search results format. Minor documentation changes. 1.25, 1998-09-11 ---------------- HotBot changed their output format ever so slightly. Documentation added for all known HotBot query options! 
1.23 ---- Better documentation for boolean queries. (Thanks to Jason Titus jason_titus@odsnet.com) 1.22 ---- www.hotbot.com changed their output format. 1.21 ---- www.hotbot.com changed their output format. 1.17 ---- www.hotbot.com changed their search script location and output format on 1998-05-21. Also, as many as 6 fields of each SearchResult are now filled in. 1.13 ---- Fixed the maximum_to_retrieve off-by-one problem. Updated test cases. 1.12 ---- www.hotbot.com does not do truncation. Therefore, if the query contains truncation characters (i.e. '*' at end of words), they are simply deleted before the query is sent to www.hotbot.com. 1.11, 1998-02-05 ---------------- Fixed and revamped by Martin Thurn.