The Hidden Web

By Dan Giancaterino, Internet Librarian
24 Pennsylvania Law Weekly 1022 (September 3, 2001)
Updated for Web usage - July 16, 2003

General-purpose Web search engines such as AltaVista and Google are great for many types of searches. They're also rather large - they index anywhere from 500 million to 1.3 billion Web pages. But they aren't appropriate for every type of search. This leads to an important question: Why don't Google or Fast or Excite or any of the other commercial search engines index the entire World Wide Web?

The Hidden Web

Though general-purpose search tools are pretty big, there are large areas of the Web that their indexing software - called crawlers, spiders, bots, etc. - cannot access. Crawlers index pages by following one hypertext link to another. If a source does not have a permanent address or for some reason blocks the crawler, that source won't be indexed. Examples of this are:
* Searchable databases

This content is called the "Hidden Web," the "Invisible Web" or even the "Deep Web." Whatever it is called, it is sobering to consider some conclusions drawn by BrightPlanet.com in its July 2000 "White Paper" (available at www.brightplanet.com/deepcontent/index.asp):
* The Hidden Web (HW) is more than 500 times larger than the known, indexed portion of the Web.

Much HW content exists in the areas of business, law and government, and people, and is therefore of interest to attorneys. This article will cover some of the most valuable free HW content in these fields, as well as HW search engines and alerting services.

Business Resources

Three of the most popular Web sites featuring searchable databases of company information are Hoover's Online (www.hoovers.com), EDGAR (www.sec.gov) and the U.S. Patent and Trademark Office (www.uspto.gov). They are all part of the HW because search engines do not index their information. However, since these three sites are well known, they will not be discussed in this article.

A good source for company profiles - especially for privately held firms - is Business.com (www.business.com). The site provides detailed information, including officers, competitors, financials and news, for more than 10,000 U.S. public, 44,000 U.S. private, and 14,000 international companies. You can search either by company name or ticker symbol.

If you need information on international companies, try Corporate Information (www.corporateinformation.com), an index to more than 350,000 company profiles worldwide.

To research a non-profit organization, visit GuideStar (www.guidestar.com). Search the database of 700,000 non-profit organizations by name, location or keywords. The data is obtained from the U.S. Internal Revenue Service.

For information on customer complaints, try the Better Business Bureau, Mid-Atlantic Region (www.easternpa.bbb.org/search.html). The searchable database provides access to BBB reports on companies located in the Washington, D.C., and Philadelphia metropolitan areas. Data is based on customer feedback that the BBB has received during the three previous years. Search by company name, address, or telephone number.

News sites are very useful when researching companies.
You can search Bizjournals.com (bizjournals.bcentral.com/search.html) for articles in business journals covering 40 major U.S. markets - including Philadelphia - from 1996 to the present. You can also use the Financial Times' (www.ft.com) global archive, a unique free source containing more than 10 million articles from more than 2,000 European, Asian and American news sources. The database is updated on a 24-hour, seven-days-a-week basis - more than 1,200 articles are added every hour!

Finally, many state home pages allow you to search corporate records online. Corporate records provide date of incorporation, status (active or expired), name of registered agent, and address, among other things. They will generally be available from the Department of State. Unfortunately, Delaware and Pennsylvania do not offer online corporate records. New Jersey allows you to search records for free, but you must pay a fee to see the complete record.

Law and Government

The following HW resources are probably familiar to most legal researchers, but it doesn't hurt to mention them one more time.

For federal legislation, visit THOMAS (thomas.loc.gov). THOMAS features a searchable database containing the full text of bills and laws from 1989 to the present and bill summaries back to 1973. If you need to do a legislative history, the full text of the Congressional Record (from 1989 to the present), roll call votes and committee reports (from 1995 to the present) are available as well. You can search by keywords, words in the text or title, sponsor, or bill number.

An excellent, searchable version of the U.S. Code is available from the U.S. House of Representatives Web site at uscode.house.gov/usc.htm.

If you need to keep up with federal rules and regulations, GPO Access (www.access.gpo.gov) will be invaluable to you. GPO Access features the full text of the Code of Federal Regulations from 1996 to the present.
GPO Access also has the full text of the Federal Register from 1995 to the present.

There are two excellent sources of free case law on the Web. FindLaw (www.findlaw.com) provides searchable access to the full text of Supreme Court opinions from 1893 to the present and appellate court decisions from 1995 to the present. West Group owns FindLaw. lexisONE (www.lexisone.com) features the full text of Supreme Court opinions from 1790 to the present and federal circuit and state decisions from Jan. 1, 1996 to the present. Registration is required for lexisONE.

At the state level, both the Pennsylvania Administrative Code (www.pacode.com) and the Pennsylvania Bulletin (www.pabulletin.com) are searchable online.

Finally, a searchable version of the Philadelphia Municipal Code is available at www.amlegal.com/philadelphia_pa/. [Editor's note: On 7/15/2003 the URL for the Philadelphia Code changed to municipalcodes.lexisnexis.com/codes/philadelphia/.] (For a good collection of free municipal codes available on the Web, plus links to codes produced by six publishers, visit the Seattle Public Library's Web site at www.spl.org/default.asp?pageID=collection_municodes.)

People

When researching individuals, the first resource to use is a telephone directory such as AnyWho (www.anywho.com) or InfoSpace (www.infospace.com). You can search yellow pages, white pages, and email addresses. The reverse lookup feature allows you to find a person by entering a telephone number. Maps and driving directions are also provided.

No luck? That's not surprising, since white pages databases contain only listed telephone numbers, and email databases are created primarily from Usenet postings. Plus, individuals' names often change due to marriage or divorce.

What additional resources can you use? For starters, there's AlumniNet (www.alumni.net), which allows you to search by school, location, or person's name.
Alumni.net claims to have more than 1.8 million members and is a free service, but you must register to use it.

Many state home pages allow you to search unclaimed property online. Often the records will include a last known address or an address for a beneficiary, especially for such things as insurance premiums. Unclaimed property is generally available from the Department of the Treasury. Delaware, Pennsylvania, and New Jersey all allow you to search unclaimed property online.

Has the individual been incarcerated? The Pennsylvania Inmate Locator (www.cor.state.pa.us/portal/cwp/view.asp?a=380&q=126864) is a database containing information about each inmate currently under the jurisdiction of the Pennsylvania Department of Corrections. Search by name or date of birth; limit by race, sex or committing county. Records contain both real names and aliases.

Don't forget to check the Social Security Death Index, available from www.ancestry.com. The SSDI is a free, searchable database of individuals whose deaths have been reported to the U.S. Social Security Administration since 1962. Information provided includes full name, Social Security number, last known address, and dates of birth and death. The database contains more than 64 million records.

Looking for a professional? You can use the American Medical Association's doctor finder (www.ama-assn.org), a database of almost 700,000 licensed doctors of medicine or doctors of osteopathic medicine. Information includes medical school, residency information and office information. State Web sites often feature searchable databases of licensed professionals. For example, the Maryland Division of Occupational and Professional Licensing (www.dllr.state.md.us/query/index.html) lets you search for licensed accountants, architects, engineers, plumbers, and more.

Finally, if you need a property assessment for a person living in Maryland, New Jersey, or Pennsylvania's Delaware or Montgomery counties, you can search Taxrecords.com (www.taxrecords.com). Records include owner(s), property location, assessment, and date. You will need to know the individual's address in order to use this service.

Search Engines

There are many HW search tools that can help you find other "hidden" gems like the ones mentioned above.

Direct Search (gwis2.circ.gwu.edu/~gprice/direct.htm) is a site developed and maintained by Gary Price, a librarian at George Washington University. Price is regarded as an expert in HW resources. Direct Search is impressive in its depth and breadth, but the site is hard to scan and loads slowly. The site's search feature helps overcome its organizational shortcomings. [Editor's note: Site taken down 6/4/2002.]

The Librarians' Index to the Internet (www.lii.org), from the Library of California, is also very valuable. When searching, you can limit your results specifically to databases.

For public records, Search Systems (www.searchsystems.net) is the place to go. It provides links to more than 7,700 free searchable databases worldwide. U.S. coverage includes both national and state resources.

And of course, I can't forget to mention Jenkins Law Library (www.jenkinslaw.org)! The Research Links pages contain hundreds of useful legal, government, business, and people-finding resources that aren't indexed by popular search tools. Research Links are regularly reviewed and updated by Jenkins research specialists.

Other useful search engines include CompletePlanet (www.completeplanet.com), IncyWincy (www.incywincy.com), InvisibleWeb.com (www.invisibleweb.com), and SearchPower.com (www.searchpower.com).

Alerting Services

Keeping up with HW developments can seem daunting. However, the following alerting services will do most of the legwork for you.
The best overall source for HW news is Search Engine Watch (www.searchenginewatch.com). Click on the "Search Engine Headlines" link to view recent articles and breaking headlines from around the Web. Information is updated daily.

You can also visit the Librarians' Index to the Internet (www.lii.org) to get a weekly update on new Web sites. Click on the "New on ..." link to see a list of resources organized by type; jump to the "Databases" heading to look for HW resources.

Research Buzz (www.researchbuzz.com) covers new developments in all aspects of Web searching. According to its Web site, "If in doubt, the final question is, 'Would a reference librarian find it useful?' If the answer's yes, in it goes!"

For new government information, go to GPO Access' New Electronic Titles (www.access.gpo.gov/su_docs/locators/net/).

For weekly news of interest to legal researchers, try LLRX Buzz (www.llrx.com/buzz/buzz.htm).

Conclusion

An effective legal researcher cannot always rely on using a "one size fits all" search engine. Attorneys and searchers need to take the time to learn relevant resources and search tools that specialize in HW content. Given that the HW is growing faster than the indexed Web, it will be time well spent.

Dan Giancaterino is the Internet Librarian at Jenkins Law Library in Philadelphia. He teaches seven hands-on CLE Internet courses at Jenkins and also has presented for other CLE and non-CLE programs.
This page was last updated 13-May-05 10:18:01 EDT. Copyright © 1996 - 2010, Jenkins Law Library. All rights reserved.