Better Performance, September 2000, Vol. 6 Issue 9, Page(s) 36-43 in print issue
Searching: Internet Scouring Techniques That Go Beyond The Basics
Dylan Hunter, product manager of Internet search for Fast Search (http://www.alltheweb.com), explains that when search engines look for Web pages, they send out tools called crawlers (also known as spiders, robots, or bots). These crawlers begin by finding a number of seed pages that include a lot of links. Crawlers follow these links, retrieving the linked pages, then retrieving the pages that those pages link to, and so forth, until they come back with a raw index of pages. Hunter says that Fast Search's most recent crawl retrieved 1.1 billion pages. The crawler also had another 1.3 billion in a queue to retrieve when Fast Search turned the crawler off. Hunter points out, however, that this is not an entirely accurate count because as many as two-thirds of the pages were duplicates. But even without the duplicates, there are still an awful lot of pages to scour through in order to find the gems you really want.
Besides the traditional search engine, another type of search tool is called a Web directory. These sites are created by human editors who define categories, then assign different Web sites to each category. Yahoo! (http://www.yahoo.com) is an example of this type of search tool. Instead of using searching techniques (although you can enter keyword searches), you drill down through a series of categories to find information about a particular subject.
Although most of us use search tools to find information on the Web, we still employ the most primitive search techniques, entering only a few keywords. There are better, more sophisticated ways to find the information you seek. This article will present some advanced searching techniques, perhaps ones with which you were not familiar, and provide expert tips along the way.
Refine your search. Danny Sullivan, editor of Search Engine Watch (http://www.searchenginewatch.com), a site devoted to search engine research and searching techniques, says that perhaps the most effective way to find information is to think about your query and refine your search as needed.
Try another search engine. Greg Notess, reference librarian at Montana State University and Webmaster at http://www.notess.com, a Web site devoted to search engine use and statistics, suggests that if you cannot find your results in one search engine, then try a different one. Notess says that databases are constantly changing, so you should try different ones to see if you get dramatically better results by entering the same search in a different engine.
Sullivan compares it to wearing a pair of shoes. He says, "You don't wear all the same shoes all the time." There are different shoes for different types of activities, just as there are different search engines for different types of searches. He compares wearing hiking boots to searching one of the broad search engines such as AltaVista (http://www.altavista.com), while a brief index search such as visiting Yahoo! is more like wearing a pair of running shoes because you want to conduct a search quickly by using the index entries to guide you.
With the French wine example we used earlier, you can go to Yahoo! and rapidly move through the index to locate information by selecting Countries on the Yahoo! home page under the Regional category. Next, select France from the list of Countries, then select Society And Culture from the France subtopic page, followed by Food And Drink, and finally Wines And Food From France. It might sound complicated, but as you go through such a process, swiftly clicking your way through the index, you'll find a relevant site rather quickly. If you want a larger variety of results, go to AltaVista and type French Wine to receive 2,500 results that include a range of related topics such as French wine regions, wine seminars, wine tours, and much more. Some users might just be content to find a couple of results with a Yahoo! type of search, but if this is not what you are looking for, you might need to cast a wider net with the AltaVista search.
Look beyond. Notess suggests looking beyond the initial results. Sometimes the search engine points you in the right direction to a site related to your search, but at first glance, it may not seem to be what you are looking for. Notess says you may need to look a little deeper into a site to find what you need. To do that, he suggests checking the site map. Maybe you can find your answer in another part of the site. Or, check the navigation bar or menus. You can often find material related to your search by looking at the subjects in a site's navigation bar or drop-down menus. Or, better yet, try using the site's own internal search engine. Many sites have such a search function. Enter your keywords to see if what you want is elsewhere on the site.
For example, let's assume that you want to find a wine dealer. You went to your favorite search engine, typed French Wine, and found a page titled Find Your Favorite French Wine. You click the link to display Wine Searcher.com (http://www.winevin.com/french.html) on-screen. This page allows you to search for different wines, but you want to see a list of wine dealers. At first glance, you might discard this page thinking it is not what you are looking for, but after looking at the navigation bar at the top of the page, you decide to click Our Merchants. As a result, you find the list of wine dealers you were looking for. You can often find more specific information in this manner.
Use search engine math. When most people think of Boolean logic, their eyes glaze over. Many search engines have tried to compensate for this by simplifying the use of the Boolean search words (AND, NOT, OR) in combination with a group of keywords to find what you are looking for. Sullivan has come up with a concept to simplify the process of using and thinking about Boolean logic. He calls it "search engine math." 
You can learn about it in detail at his Web site (http://www.searchenginewatch.com/facts/math.html), but it basically involves building equations using the plus (+) or minus (-) signs to clarify your query. The sign you use can have a dramatic effect on your results. "When you say Boolean, most people get scared," Sullivan says, "but [everyone] knows how to add and subtract using a plus or minus sign." For example, if you wanted to find information on Microsoft and you went to Northern Light, typing Microsoft into its search field would generate more than 5.5 million results. But suppose you wanted specific information about the Microsoft lawsuit. Try typing Microsoft + Lawsuit. That brings you down to 74,924 results. Next, let's assume that you wanted to see documents that contained references to Judge Jackson, the presiding judge in the Microsoft antitrust lawsuit. Type Microsoft + Lawsuit + Jackson in the search field, and this modified query displays only 24,888 matches.
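The plus and minus idea can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not how Northern Light or any other engine is implemented; the `PAGES` collection and the `parse_query` and `search` helpers are hypothetical names made up for this example.

```python
import re

# Hypothetical mini "index": page name -> page text (illustrative data only).
PAGES = {
    "news-1": "Microsoft releases new software",
    "news-2": "Microsoft lawsuit heads to court",
    "news-3": "Judge Jackson rules in Microsoft lawsuit",
}


def parse_query(query):
    """Split a +/- style query into required and excluded word sets."""
    required, excluded = set(), set()
    sign = "+"  # a bare keyword is treated as required
    for token in query.lower().split():
        if token in ("+", "-"):
            sign = token  # a standalone sign applies to the next keyword
            continue
        if token[0] in "+-" and len(token) > 1:
            sign, token = token[0], token[1:]  # sign attached to the keyword
        (required if sign == "+" else excluded).add(token)
        sign = "+"
    return required, excluded


def search(query, pages):
    """Return names of pages with every required word and no excluded word."""
    required, excluded = parse_query(query)
    hits = []
    for name, text in pages.items():
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        if required <= words and not (excluded & words):
            hits.append(name)
    return hits


print(search("Microsoft", PAGES))
print(search("Microsoft + Lawsuit", PAGES))
print(search("Microsoft + Lawsuit + Jackson", PAGES))
print(search("Microsoft -lawsuit", PAGES))
```

Each added plus term can only shrink the list of matches, which is the same narrowing effect seen in the Microsoft example above, while a minus term removes pages that mention the unwanted word.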
Put it in quotes. Another way to find exactly what you want is to put your search query inside quotes. According to Sullivan in his article "Power Searching For Anyone" (http://www.searchenginewatch.com/facts/powersearch.html), when you put a phrase in quotes, it tells the search engine that you want to find this exact phrase on any Web page in the exact order you specified. This provides a good method for finding pages mentioning a brand name, specific title, or a person. If you type Boston Red Sox in the search field, you probably want to find information about the baseball team, but you could end up with matches that include the words Boston, Red, or Sox, many of which have nothing to do with the baseball team. However, by placing the phrase in quotes, the query tells the search engine that these three words must appear in succession on Web pages to be valid results. And by specifying this, you won't receive bogus results for pages featuring the Boston Massacre, the Boston Tea Party, Boston tourism information, the Red Tide, Red Lobster, etc. We should also point out that many search engines are getting better at understanding this type of query without quotes.
Learn about advanced techniques. Richard Seltzer (http://www.samizdat.com), a consultant and co-author (with Eric J. Ray and Deborah S. Ray) of "The AltaVista Search Revolution," recommends looking for an advanced search page or trying out the advanced searching functions of any search engine's site. Seltzer wrote a tutorial on how to search effectively using AltaVista's advanced search page, which you can find at the AltaVista Web site (http://www.altavista.com; click the Advanced Search tab, then click the Advanced Search Tutorial link on the right side of the page). Seltzer says research shows that only 10% of the people who visit the AltaVista search engine actually try the advanced search functions. 
The only ones taking advantage of this advanced functionality seem to be professional researchers, but if the pros are using it, all of us should be, too. Seltzer says, "Very often, any complex piece of software does more than anyone knows." The same is true for search engines. Seltzer also says, "Search engines are powerful and complex, yet people assume you type a few words. If you know what you're doing, you can get at much more information." Sullivan agrees and encourages people to learn about the different commands. He also recommends that people check out the menus and options, as well as read related help documents and FAQs (frequently asked questions). He says that by taking the time to read about the searching information, you'll learn how that specific search engine works. Although all search engines work similarly, they each have different ways of presenting search options. Reading about a search site can go a long way toward helping you be a more effective searcher.
Know when to use truncation. Librarian and researcher Notess recommends using truncation as a way to broaden your results. Truncation allows you to find alternate spellings or forms of words using an asterisk to represent the truncated characters. Not all sites accept this format, but for those that do (such as AltaVista and Northern Light), truncation can be an effective way to conduct a search. For example, if you were doing a search that included the word color, you could type colo*r in the search field to find both the British and American English spellings. This is most effective, Notess says, when used in combination with a phrase search. Keep in mind that this technique does not work in directories such as Yahoo!, although Yahoo! performs automatic truncation. For example, if you type Straw into a Yahoo! 
search field, you'll see results that include Straw, Strawn (a town in Texas), The Strawbs (a band), and Strawberry (the fruit, along with the baseball player Darryl Strawberry). If you want to limit this to just straw, you could negate the truncation by including quotes around your entry (and typing "straw"), thus restricting results to include only the actual word straw.
Try using a nested Boolean query. As you grow more comfortable and familiar with advanced searching techniques such as Boolean logic, you can begin to use these techniques in very powerful ways to find the information you want with pinpoint precision. Nested queries use a combination of words, joined together using AND, NOT, or OR, along with a parenthetic expression. Notess suggests that a recipe may be the simplest way to illustrate this. So, for this example, let's assume that you look around your kitchen and find the following ingredients: chicken, rice, basil, oregano, and thyme. You want to find a recipe that will use rice and chicken in combination with one of the herbs you have available. To do this, you open a search engine site in your browser window and use a nested Boolean query to find only recipes that match these search criteria. To achieve this, you would type the query as follows: Rice AND chicken AND (Basil OR Oregano OR Thyme). The search engine would locate any Web pages containing the words rice and chicken, then keep only those that also contain at least one of the three herbs: basil, oregano, or thyme.
Limit the search to just titles or URLs. While you are conducting a search, you might be overwhelmed by the number of results. And even if you manage to reduce this overall number, you might still find that you're hitting every common occurrence of your word or phrase on every Web page in the search engine's database of Web pages. One good way to narrow this search, Notess says, is to limit the search to only titles by typing the title: command followed by your entry. 
As with the truncation or other advanced techniques, this method works best in search engines (rather than Web directories). It will have no effect in a directory such as Yahoo! where you'll generate the same results, regardless of whether you put title in front of your phrase or not.
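The truncation, nested Boolean, and title-only ideas from the last few tips can be combined in a small Python sketch. It assumes a hypothetical in-memory list of pages with `title` and `body` fields, and the helper names (`term`, `all_of`, `any_of`, `find`) are invented for this illustration; real search engines parse the typed query and consult an index rather than scanning pages like this.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical sample pages (illustrative data only).
PAGES = [
    {"title": "Basil Chicken and Rice",
     "body": "Simmer chicken with rice and fresh basil."},
    {"title": "Plain Fried Rice",
     "body": "Fry rice with egg and peas."},
    {"title": "Thyme Roast Chicken",
     "body": "Roast chicken with thyme and serve over rice."},
    {"title": "Oregano Pasta",
     "body": "Toss pasta with oregano and olive oil."},
]


def words_of(text):
    """Lowercase a text and split it into plain words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term(pattern, field="body"):
    """Match pages where a word in `field` fits `pattern` ('*' = truncation)."""
    pattern = pattern.lower()
    return lambda page: any(
        fnmatchcase(word, pattern) for word in words_of(page[field])
    )


def all_of(*checks):
    """Boolean AND over several checks."""
    return lambda page: all(check(page) for check in checks)


def any_of(*checks):
    """Boolean OR over several checks (the parenthesized part of a query)."""
    return lambda page: any(check(page) for check in checks)


def find(check, pages):
    """Return the titles of the pages that satisfy a check."""
    return [page["title"] for page in pages if check(page)]


# Rice AND chicken AND (Basil OR Oregano OR Thyme)
recipe_query = all_of(
    term("rice"),
    term("chicken"),
    any_of(term("basil"), term("oregano"), term("thyme")),
)

print(find(recipe_query, PAGES))
print(find(term("rice", field="title"), PAGES))  # title-only search
```

In this sketch a pattern such as `colo*r` matches both color and colour, while a plain `straw` matches only the exact word, which mirrors the effect of negating truncation with quotes.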
Use real names. Sullivan suggests perhaps the simplest of all "advanced" techniques by recommending that you use real names. Many searchers have been frustrated in the past when looking for the Web site of a company with a well-known brand name. You type United Airlines into a search engine field expecting to find the airline's main Web site, but instead you find hundreds of pages devoted to the subject of United Airlines, with only a few intermixed pages that take you to a page at the airline's site. With the advent of RealNames (http://www.realnames.com), this has all changed. You can now type a real name such as United Airlines and go directly to the home page of the United Airlines Web site (http://www.ual.com), or at the very least, generate a list of results with the airline's home page listed at the top as a keyword link. With this technique, you do not even have to open a search engine site; you can type the real name directly into the address field in Internet Explorer or Netscape Navigator. For example, if you type United Airlines directly into the Internet Explorer address field, the MSN (The Microsoft Network) search results window opens with the United Airlines "keyword" at the top of the page by default. As the link text indicates, this keyword link leads directly to the airline's Web site. Since most people do not know instinctively that United Airlines uses UAL in its URL, this method provides users with an easy way to find its Web site quickly. This same technique works for other company "real names," and in many other search engines, too (including Google and AltaVista).
Use phrase searches in a directory. Most of the tips up to this point work best in search engines, rather than Web directories. So, as you may have noticed, if you go to a directory such as Yahoo!, many of these tips will not work (or at least not as well). Notess suggests that you fashion your query in a special way when searching in a directory by using a phrase search. 
For example, if you wanted to find information about a physician in Massachusetts, you could type health care providers in Massachusetts in the search field. Yahoo! would then check its Massachusetts directory for health care providers. This type of search gives you a broad view of health care providers in a particular place. This is a more effective method for directories than trying one of the other search techniques discussed previously.
View your results in a new window. This is not a search technique as much as a way to better manage your results. Once you generate a list of results, you generally review the list, then click one to see if it includes information on the subject of your search. When you do this, the search results window is replaced by the selected result. So, if you start clicking around the first site chosen from the list of results, you might end up moving to several different sites, and before you know it, that original results list is buried pretty far back in the browser's pages. Some sites compound this problem by refreshing each time you click the Back button, rather than letting you go back to the previous screen of results. You can try to find your original list of search results by clicking the Back button's drop-down list, but if you have conducted more than one search or have viewed more than one page of results, it can be confusing to find the exact page where you began. A good way to avoid this, Notess says, is to take advantage of some little-known functionality in your browser that allows you to view sites from the results in a separate browser window. Instead of clicking a particular result, right-click it and select Open In New Window from the shortcut menu. The result site then opens in a separate window, allowing you to browse it while keeping your original results list window open, as well. This will help prevent you from getting lost and having trouble returning to the results page where you started. 
This technique works in both Internet Explorer and Netscape Navigator.
Save the best results with MSN. This tip is specific to the MSN search engine (http://www.msn.com). Oftentimes, when you conduct a search, you might find a particular site that suits your needs, but you want to continue to look at other results. It would be useful in this situation to save the results you like, while continuing to review other results in the results list, but most search engines will not let you do this. MSN, however, allows you to do just that by providing a Save function next to each result. Sullivan says that you can use this feature as long as you are using MSN in conjunction with the Internet Explorer 5.0 browser. (This function is not available with any other browser.) After you conduct your search, each search result has a small diskette icon next to it. Click the diskette to save the result. To view your saved results, simply click the Saved Results selection in the MSN navigation bar, and a window opens with your saved results. If you have saved more than one set of results, you also can view each set from this screen. To delete an individual result, click the Delete checkbox next to each item you want to delete and then click the Delete button. If you want to delete an entire set of results, click the Delete checkbox next to the search name.
Follow the links. As mentioned in a previous tip, sometimes it pays to look around a site to find information. Along the same lines, it also pays to follow related links. Frequently you might find that the page in the results list does not have the information you need, but if you look around the site, you might find a page of related links that lead to the information you're seeking. Web designers often include links to related sites that visitors might find useful. For example, let's assume you are searching in Google for information about online investing. You type Online Investing Sites and peruse the results. 
You find the match CyberInvest.com (http://www.cyberinvest.com) and click it. When the page appears on-screen, look at the list of topics in the navigation bar at the top of the page. If you click the LINKSOUP option, you'll see that it leads to a page that's chock full of Web sites devoted to online investing, covering every subject you can imagine. Using this technique, you can find your way to other pages of links and discover numerous other related sites.
Use a search agent. Search agent software is a meta search tool that uses a variety of engines and directories. It checks the search engines for answers to your queries, rather than maintaining its own directory or database of pages. Mata Hari (http://www.thewebtools.com, soon to be known as Lexibot) is one such search tool. Jerry Tardiff, founder of VisualMetrics (http://www.visualmetrics.com), the company that produces Mata Hari/Lexibot, says the program sends your query to each of the search engines you select. With the new Lexibot product, you can customize the list to include databases you subscribe to such as LEXIS-NEXIS (http://www.lexis-nexis.com/lncc). Lexibot allows you to fashion your query any way you like and includes a Boolean "checker" to make sure you have entered your Boolean query correctly. It then downloads every page of text that matches your search criteria, leaving out any graphics or files, which Tardiff says amounts to 20KB (kilobytes) to 40KB of data per page. Once you have the results, you can use the powerful sorting tools in Lexibot to pinpoint the exact information you want. Plus, to speed up your research time, you can filter out any search engines that are not likely to include the types of information you frequently search for. Search agent tools take time to conduct searches, but they provide powerful features and techniques that are not available with most online search engines, so you can organize your results more efficiently.
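The general meta search idea (send one query to several engines, then combine what comes back) can be sketched as follows. This is not a description of how Mata Hari or Lexibot works internally; the two engine functions are hypothetical stand-ins that return canned results, and the merge strategy shown (interleave by rank, drop duplicates) is just one simple choice among many.

```python
from itertools import zip_longest


def engine_a(query):
    """Hypothetical engine: returns a ranked list of URLs for the query."""
    return ["http://example.com/wine", "http://example.com/france"]


def engine_b(query):
    """Hypothetical engine with partly overlapping results."""
    return [
        "http://example.com/france",
        "http://example.org/cellar",
        "http://example.net/tours",
    ]


def meta_search(query, engines):
    """Query every engine, then interleave results by rank without duplicates."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # zip_longest walks rank 1 of every engine, then rank 2, and so on,
    # padding shorter lists with None.
    for rank_group in zip_longest(*result_lists):
        for url in rank_group:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


for url in meta_search("French wine", [engine_a, engine_b]):
    print(url)
```

Filtering out engines that rarely help, as the article suggests, amounts to passing a shorter list of engines to `meta_search`.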
VisualMetrics is also working on a new tool called CompletePlanet. Tardiff says it will allow users to find what his company calls the "deep Web": sites that are created dynamically based on visitor input and retrieved from information stored in back-end databases. This new tool will allow users to access these hidden database pages, providing far wider access than is available from current Web searching tools.
With these tips, and the ones you see in the accompanying sidebars, you'll be able to increase your searching efficiency and better manage your results. By taking the time to familiarize yourself with these advanced techniques and understanding how your favorite search engines and Web directories work, you can generate better results and use your Internet research time more resourcefully than ever before.
by Ron Miller