I know, I know: You already know how to find websites.
But maybe there will be something new in this lesson that will surprise
you. Most students, for example, have never heard of Lii.org! Enjoy!
LEARNING OBJECTIVES
* To understand the difference between subject directories and search
engines and know the appropriate ways to use them in research.
* To recognize the 3 basic parts of a URL.
* To recognize domain names.
* To know the difference between general search engines and site-specific
search engines.
* To be aware of the common features of web search engines.
LESSON FOUR TABLE OF CONTENTS
1. Preface
2. Websites, URLs, and Domain Names
3. General Web Surfing
4. Subject Directories and Selective Directories
5. General Search Engines and Site-specific Search Engines
6. Common Features of General Search Engines
7. Key Points to Remember
1. PREFACE
Thus far in the course, we have examined two of the most widely used
information sources on the Internet: books and periodicals. We have used
online catalogs (to find books) and web databases (to find periodical
articles). In this lesson, we expand our search for information sources
to include information found in websites.
2. WEBSITES, URLs, AND DOMAIN NAMES
You may recall from Lesson 1 our definition of a website: a coherent
collection of Web pages linked together.
Every website (and every Web page) has a unique address known as a URL
(Uniform Resource Locator) which identifies where it is located on the
Web. For example, here is the URL for Cañada Library’s home
page:
http://www.canadacollege.net/library/
URLs have three basic parts: the protocol, the server name and the resource
ID. These parts provide "clues" to where a Web page originates
and who might be responsible for the information at that page or site.
Let's look at each part:
• PROTOCOL: appears at the start of the URL before the double slash
and identifies the method (set of rules) by which the resource is transmitted.
Web pages are transmitted using HyperText Transfer Protocol (HTTP) or its
secure version, HTTPS. Thus, Web URLs begin with http:// or https://.
• SERVER NAME: appears between the double slash (//) and the first
single slash (/)
The server name for the Cañada Library URL is: www.canadacollege.net
The server name identifies the computer on which the resource is found.
(Computers that store and "serve up" Web pages are called servers.)
This part of the URL commonly identifies which organization or company
is either directly responsible for the information or simply providing
the computer space where the information is stored.
The server name always ends with a dot and a three-letter or two-letter
extension called the domain name (sometimes called the domain type). The
domain is important because it usually identifies the type of organization
that created or sponsored the resource. Sometimes it indicates the country
where the server is located. The most common domain names are:
.com for company or commercial sites
.org for non-profit organization sites
.edu for educational sites (most commonly four-year universities)
.gov for government sites
.net for Internet service providers or other types of networks
.mil for a military body
If the domain name is two letters, it identifies a country, e.g. .us for
the United States, .uk for the United Kingdom, .au for Australia, .mx
for Mexico or .ca for Canada.
• RESOURCE ID: everything after the first single slash (/)
The resource ID for the Cañada Library URL is: library/
The resource ID contains directories and subdirectories, thereby giving
you the exact location of the document on the server. Following the last
slash (/), you are usually given the file name for the specific page. (The
Cañada Library URL above ends with a slash and names no file; in that case
the server delivers a default page, commonly named index.html.) The file name
ends with a three- or four-letter designation that specifies the file type
(e.g., .htm or .html for a standard Web page, .jpg or .gif for common graphic
files).
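If you are curious how a computer would pick apart the three parts of a URL, here is a minimal sketch in Python. It uses the standard library's urlsplit function; the describe_url function and the DOMAIN_TYPES table are names made up for this illustration, and the table covers only the domains listed in this lesson.

```python
from urllib.parse import urlsplit

# Meanings of the common three-letter domains listed in this lesson.
DOMAIN_TYPES = {
    "com": "company or commercial site",
    "org": "non-profit organization site",
    "edu": "educational site",
    "gov": "government site",
    "net": "Internet service provider or other network",
    "mil": "military body",
}


def describe_url(url):
    """Split a URL into the three basic parts described in this lesson."""
    parts = urlsplit(url)
    server_name = parts.hostname or ""
    # The domain is whatever follows the last dot in the server name.
    domain = server_name.rsplit(".", 1)[-1]
    if domain in DOMAIN_TYPES:
        domain_type = DOMAIN_TYPES[domain]
    elif len(domain) == 2:
        domain_type = "country code"
    else:
        domain_type = "unknown"
    return {
        "protocol": parts.scheme,                   # before the double slash
        "server_name": server_name,                 # between // and the first /
        "resource_id": parts.path.lstrip("/"),      # after the first single slash
        "domain": "." + domain,
        "domain_type": domain_type,
    }


info = describe_url("http://www.canadacollege.net/library/")
for part, value in info.items():
    print(part, "=", value)
```

For the Cañada Library URL, this reports http as the protocol, www.canadacollege.net as the server name, and library/ as the resource ID.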
3. GENERAL WEB SURFING
At some point in your research -- usually after searching the Deep Web
using Web databases and online catalogs -- you may want to look for information
and opinion found on free websites within the Visible Web. This is often
referred to as general Web surfing. Be cautious, however, when searching
the Visible Web because no quality control is in effect here. You may
find highly accurate and reliable information at one website, and complete
falsehoods at another.
Two types of Web search tools are available to help you find websites
and/or web pages: subject directories and search engines. Let's examine
each separately.
4. SUBJECT DIRECTORIES AND SELECTIVE DIRECTORIES
Web subject directories (such as Yahoo!, LookSmart, LII, and many others)
provide lists of websites arranged by subject category. The websites included
in a subject directory are chosen by people known as indexers. Each site
in the directory is listed under one or more subject categories, as determined
by the directory's indexers. A brief description of each site listed is
usually included.
Directories are often a good place to start when you’re looking
for information on relatively general subjects or if you want an overview
of what’s available on the Web on a given subject.
Thus, to find websites on general subjects using subject directories:
• browse through the directory’s list of subject categories,
OR
• do a keyword search using search terms that describe your general
subject.
If, however, you already have a specific research question in mind, a
different approach can be used. To find websites on a specific research
question using subject directories:
• do a keyword search using terms that describe the overall general
subject under which your topic falls (click here for an example, use the
Back button to return here)
• choose a general website on that subject from your results list
(click here for an example, use the Back button to return here)
• use that general website and see if it has a site-specific search
engine that allows you to search that website’s collection of information
(click here for an example, use the Back button to return here), AND/OR
• look for links provided by the general website to websites or
webpages that focus on specific topics (click here for an example, use
the Back button to return here).
There is wide variation in the number and quality of sites included in
different Web subject directories. Many of the best-known directories,
such as Yahoo! or Excite, try to be as comprehensive as possible, with
very extensive listings. However, one disadvantage of these large directories
is that they usually do little, if any, evaluation of the quality of the
sites they list, thus making it difficult to find the best sites in a
particular subject area.
For that reason, you are wise to use a subject directory that only lists
sites known to be high quality. These directories are known as selective
directories. In addition to only indexing credible websites, selective
directories often provide links to other specialized sites, which in turn,
provide links to even more specific high-quality documents on a particular
subject or topic.
Recommended selective directories:
Librarians' Internet Index (http://www.lii.org) -- high-quality resources
on a range of general subjects
AcademicInfo (http://www.academicinfo.net) -- scholarly sites on a wide
range of subjects
InfoMine (http://infomine.ucr.edu/) -- academic resources
Scout Report Archives (http://scout.cs.wisc.edu/archives/) -- academic
resources
Tutorial: Using a Selective Directory
5. GENERAL SEARCH ENGINES AND SITE-SPECIFIC SEARCH ENGINES
Web search engines (such as AskJeeves, Google, AltaVista, and many others)
allow you to search through millions of websites using your own keyword(s).
Websites gathered and indexed by search engines are not selected, organized
or previewed by humans. Instead, their collection of websites is created
entirely by computer programs called spiders (also known as robots) that
continuously scan the Internet looking for sites to add to their index.
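To get a feel for how a spider builds its collection, here is a toy sketch in Python. It does not touch the real Internet: the LINKS table is a made-up set of pages and links, and the crawl function is a hypothetical name for this illustration. A real spider would download each page and pull the links out of it, and real search engines do not publish exactly how theirs work.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
# A real spider would download each page and extract its links instead.
LINKS = {
    "site-a.example/home": ["site-a.example/news", "site-b.example/home"],
    "site-a.example/news": ["site-a.example/home"],
    "site-b.example/home": ["site-c.example/home"],
    "site-c.example/home": [],
}


def crawl(start_page, links):
    """Follow links outward from start_page; return pages in the order found."""
    index = []
    seen = {start_page}
    queue = deque([start_page])
    while queue:
        page = queue.popleft()
        index.append(page)  # add the page to the search engine's index
        for linked_page in links.get(page, []):
            if linked_page not in seen:  # never visit the same page twice
                seen.add(linked_page)
                queue.append(linked_page)
    return index


found = crawl("site-a.example/home", LINKS)
print(found)
```

Notice that no person chooses or reviews the pages: the program simply adds every page it can reach by following links.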
Since the collections of websites indexed by search engines are huge (numbering
in the millions) and have no subject organization at all, it is very important
to think carefully about what search words to use and be aware of the
various search features available before performing a search. Always look
for the "Search Help," "Search Tips," or other pages
that explain the features of the search engine you're using. Remember
that Web search engines, unlike library online catalogs, do not use a
common set of subject headings. Therefore, to use search engines effectively,
it is usually best to use very precise search words or phrases, or combine
several search terms using Boolean logic (as discussed in Lesson 3).
Search engines should be used when you have a focused research question
in mind or when you’re looking for a specific item of information,
such as a known document (e.g. the U.S. Declaration of Independence),
image, etc. They're not recommended for finding sites on broad subjects,
such as "astronomy" or "history." As discussed earlier,
Web subject directories should be used to find sites on general subjects.
Finally, there is a special type of search engine you should be aware
of. Sometimes, websites offer their own internal search engine that allows
you to search just that website’s collection of information. These
are known as site-specific search engines. Click HERE to see an example
of a website that contains a site-specific search engine.
6. COMMON FEATURES OF GENERAL SEARCH ENGINES
Listed below are features common to many search engines. Keep in mind,
however, that these features may not work the same -- or even be available
-- on every search engine.
AND: many search engines use the + sign (often called the "require"
sign) in front of words that must be included in the search results. For
example, +immigration +economy is often used instead of immigration AND
economy. Some search engines that allow the use of AND and OR require
that they be capitalized. (Thus, it's a good idea to always capitalize
these connectors if you use them.) Finally, some search engines, such
as Google, assume that a typed space equals AND. For example, immigration
economy would automatically be understood as immigration AND economy.
OR: some search engines assume that a typed space between search terms
equals OR. For example, economy business would automatically be understood
as economy OR business.
Phrase searching: putting a phrase in quotation marks retrieves documents
that contain that exact phrase. For example: "illegal immigration" will
retrieve documents containing those two words next to each other as a
phrase.
Truncation: a symbol (usually an asterisk) that allows you to search for
all variations of a common root. For example, econom* finds: economy,
economic, economics, economist, etc.
Parentheses: to designate which operations are to be carried out first.
For example, in this search statement:
("illegal immigra*" OR "undocumented workers") AND econom*
a search engine would first search for ("illegal immigra*" OR
"undocumented workers"). That result would then be ANDed with econom*.
Relevance ranking: a programming method that attempts to rank search results
based on various factors. Different search engines use different ranking
systems. Documents returned from a search can be ranked on such factors
as
• frequency of search words in document
• words found in title or near beginning of document
• search words found close to one another
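The features above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any particular search engine works: the four sample documents in DOCS are invented, the function names are hypothetical, and the ranking uses just one factor (how often the search words appear).

```python
import re

# Invented sample documents for this illustration.
DOCS = {
    "A": "Illegal immigration and the economy: economists debate how "
         "immigration affects economic growth.",
    "B": "Undocumented workers pay taxes, a study of the state economy found.",
    "C": "The economy grew last year.",
    "D": "Illegal immigration was discussed at the meeting.",
}

TERMS = ["illegal immigra*", "undocumented workers", "econom*"]


def term_pattern(term):
    """Turn a search term into a pattern; a trailing * means truncation."""
    words = []
    for word in term.lower().split():
        if word.endswith("*"):
            words.append(re.escape(word[:-1]) + r"\w*")
        else:
            words.append(re.escape(word))
    # Words in a phrase must appear next to each other, in order.
    return re.compile(r"\b" + r"\s+".join(words) + r"\b")


def count(term, text):
    """Count how many times a term (or phrase) appears in the text."""
    return len(term_pattern(term).findall(text.lower()))


def matches(text):
    """("illegal immigra*" OR "undocumented workers") AND econom*"""
    in_parentheses = (count("illegal immigra*", text) > 0
                      or count("undocumented workers", text) > 0)
    return in_parentheses and count("econom*", text) > 0


def score(text):
    """Relevance score: total frequency of all search terms."""
    return sum(count(term, text) for term in TERMS)


hits = [name for name, text in DOCS.items() if matches(text)]
ranked = sorted(hits, key=lambda name: score(DOCS[name]), reverse=True)
print(ranked)  # ['A', 'B']
```

Document C is left out because it matches nothing inside the parentheses, and document D is left out because it never mentions a word beginning with econom. Document A ranks above B because the search words appear in it more often.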
7. KEY POINTS TO REMEMBER
• A website is a coherent collection of Web pages linked together.
• URLs have 3 basic parts: the protocol, the server name, and the
resource ID.
• The server name always ends with a dot and a 3-letter or 2-letter
extension called the domain name (or domain type). The domain name is
important because it usually identifies the type of organization that
created or sponsored the website.
• Looking for information and opinion found on free websites within
the Visible Web (as opposed to the Deep Web) is known as general Web surfing.
Be cautious, however, because surfing can uncover highly credible sites
as well as sites containing very questionable or false information.
• Two types of Web search tools are available to help you find websites
and/or web pages: subject directories and search engines.
• Web subject directories provide lists of websites arranged by
subject category. The websites included in a subject directory are selected,
organized, and previewed by human beings. They’re often a good place
to start when you’re looking for information on relatively general
subjects or if you want an overview of what is available on the Web on
a given subject.
• Selective directories, such as the Librarians' Internet Index,
are a type of subject directory that only lists sites recognized to be
high in academic quality.
• Web search engines (such as AskJeeves, Google, AltaVista, and
many others) allow you to search through millions of websites using your
own keyword(s). Computer programs known as spiders collect and index the
websites found with a search engine. It is appropriate to use search engines
when you have a focused research question in mind rather than a broad
subject.
• Sometimes, websites offer their own internal search engine that
allows you to search just that website’s collection of information.
These are known as site-specific search engines.
Lesson 4 Assignment
DUE:
NAME:
Each question is worth 1 point unless otherwise noted. Total points for
this assignment: 35
Please answer all of the questions below and then email your completed
assignment to Dave Patterson at pattersond@smccd.edu.
1. Identify the type of organization that is responsible for each of
the following web pages:
a. http://www.whitehouse.gov/WH/html/handbook.html
b. http://www.sfsu.edu/online/clssch.htm
c. http://www.genentech.com/careers/college/internship.html
2. What is the difference between a subject directory and a search engine?
When is it appropriate to use a subject directory and when is it appropriate
to use a search engine? (4 points)
3. Why are selective directories valuable for the researcher?
4. What is a site-specific search engine?
The rest of this assignment asks you to find websites on your topic using
a selective directory (LII) and a general Web search engine (Google).
Begin by writing the exact current wording of your research question.
(This should be the same research question you used for the Lesson 3 Assignment,
unless the instructor specifically requested that you change your question
in some way.)
Using the search worksheet below, write the concepts and search terms
for your research question.
Include any corrections you received from the instructor on your Assignment
3.
(Use only those boxes that you need for your topic.)
[See section 10, “Advanced Search Strategy,” in Lesson 3 for
a review of search strategy and an explanation of using a search worksheet.]
(3 points)
Concept #    Search Terms
1            . . . .
2            . . . .
3            . . . .
5. Use the Librarians’ Internet Index (LII) to find websites related
to your research question. LII is a selective subject directory of about
17,000 websites. (You may want to read the LII “Help” screen
for a complete explanation of how to search this directory.)
You may recall from the reading that subject directories are used to
find high-quality general websites on a subject, and that you can then use
these websites to “drill down” and find websites that focus on a specific
topic.
Therefore, begin your search by typing in a word or phrase that describes
the general subject under which your specific research question falls.
For example:
Research question: How will increased use of genetically engineered crops
affect food safety?
General subject: genetic engineering
Type in the search box on LII: genetic engineering
Click here to see a search example in LII.
a. To search for sites on your general subject, what were the exact subject
word(s) you used in LII?
b. How many sites were found from that search? (See "Viewing 1 to … of …"
at the top right of the results page.) Click here to see an example of
search results in LII.
c. What is the title of the best general website on your subject that
you found using the Librarians' Internet Index? (Make your selection based
on your reading of the descriptions of the websites on the LII Results
page and from browsing some of those websites.)
d. Copy and paste below the full URL (Internet address) for the page
you selected. (When copying the URL, please include the complete address
beginning with: http:// )
Using the website(s) that you found on your general subject (i.e. your
answer to question 5c), find a web page that focuses more directly on
your specific research question. (This is the “drilling down”
part.) Do this by using one or both of the following methods:
Method #1: Look to see if the website has a site-specific search engine.
If it does, use this search engine to try to find web pages on your specific
research question. Most likely you will do this search by typing in a
search term from one or more of your other concepts. For example:
Research question: How will increased use of genetically engineered crops
affect food safety?
Search terms to use on a site-specific search engine: food safety
Click here to see an example of a site-specific search.
Method #2: Look for links given by the general website you’ve chosen.
These links usually take you to web pages that discuss specific aspect(s)
of the general subject. Thus, some of these links may pertain to your
specific research question. Click here to see an example of using a
link to find a more specific webpage.
6. What is the title of a webpage that focuses more directly on your
specific research question?
a. Copy and paste below the full URL (Internet address) of the page you
selected that discusses your specific research question. (When copying
the URL, please include the complete address beginning with: http:// )
7. A different approach to finding websites is to use a general Web search
engine to go directly to websites that are as closely related as possible
to your specific research question.
Use Google for this search. All your concepts should be included when
using a general Web search engine, ORs should be used between search words
for the same concept and quote marks should be used around phrases. (See
below.)
(You may want to read the U.C. Berkeley Google tutorial before completing
this question.)
a. What was the exact search statement you used? (Please follow the examples
below.) (2 points)
When using Google, please note the following:
• Be as precise as possible -- use at least one search term from
each of your concepts.
• Put quotation marks around phrases (more than one word).
• ANDs are not necessary, but ORs are required when linking synonymous
search terms for the same concept. ORs must be capitalized in Google.
• Truncation symbols (*) are not used in Google. Do not use them.
• You may add parentheses and ANDs to make it easier to see and
organize your concepts, but they are not necessary. Some examples:
("genetically engineered crops" OR "genetically modified foods") AND
("food safety" OR "food quality" OR "food contamination")
("illegal immigration" OR "illegal aliens") AND (economy)
AND ("United States" OR U.S.)
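If it helps to see the rules above applied mechanically, here is a small Python sketch that turns a search worksheet into a search statement. The function names and the worksheet list are made up for this illustration; each inner list holds the search terms for one concept. The sketch always adds the optional parentheses and ANDs so the concepts are easy to see.

```python
def quote_if_phrase(term):
    """Put quotation marks around terms of more than one word."""
    return f'"{term}"' if " " in term else term


def build_search_statement(concepts):
    """Join synonyms with OR inside parentheses, then join concepts with AND."""
    groups = []
    for terms in concepts:
        joined = " OR ".join(quote_if_phrase(term) for term in terms)
        groups.append(f"({joined})")
    return " AND ".join(groups)


# One inner list per concept on the search worksheet.
worksheet = [
    ["illegal immigration", "illegal aliens"],
    ["economy"],
    ["United States", "U.S."],
]
statement = build_search_statement(worksheet)
print(statement)
# ("illegal immigration" OR "illegal aliens") AND (economy) AND ("United States" OR U.S.)
```

The printed statement is the same as the second example above. You still need to choose good search terms yourself; the sketch only handles the punctuation.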
b. How many web pages/sites did Google find using your search terms?
(See "Results 1 - 10 of about ---" at the top of the results
page.)
c. What are the titles of the two best web pages or websites that are
relevant to your research question that you found using Google? (2 points)
d. Copy and paste below the full Internet addresses for the pages/sites
you selected. (When copying the address, please include the complete address
beginning with: http:// ) (2 points)
8. Compare your searches and search results on Google and the Librarians'
Internet Index.
a. What were the advantages and disadvantages of each search tool in terms
of ease of searching and quality and quantity of the search results? (4
points)
b. Compare Google and the Librarians' Internet Index to the periodical
and book databases used in the previous assignments (i.e. the InfoTrac
databases and the PLS online catalog) in terms of:
- quality of the information you found
- relevancy to your research question
- ease of use
(5 points)
Revised for Cañada College Library by Lynne Vieth and Dave Patterson,
July 2006.
These materials may be used for educational purposes if you inform and
credit the author, Eric Brenner, and cite the source as “LSCI 100,
Introduction to Information Research.” All commercial rights are
reserved. To contact the author, send comments/suggestions to Brenner@smccd.net.