FOIOTI: An implementation of the conceptualist approach to Internet Information Retrieval

The objective of this research project was to evaluate the searching methodologies used by undergraduate learners when searching for academic information, and to design an aid if one was required. Literature surveys indicated that the sheer size of the Internet and the lack of categorization of the available information make finding relevant information a daunting task. Other problems include a lack of clear search specification formulation, and the inefficient use of time and computing power caused by loading and using one search engine at a time. It was also clear from the literature that Internet searchers in general have difficulty locating relevant information. The methodology included a series of empirical experiments involving a total of 1109 learners. Their failure/success, methodology and a number of other factors were measured, and an instrument was designed to overcome these problems. The main conclusion was that the use of this instrument (called FOIOTI: Finder Of Information On The Internet) dramatically increased the chances of success under controlled circumstances. This was achieved by hiding the operationalist detail from the user, allowing him/her to concentrate on conceptualizing the topic.


1 Introduction
The Internet offers its users the world's largest and most complex, chaotic and unstructured search space (Sherman 1999: 54). Although programs such as friendly browsers and free search engines exist to assist the user to find his/her way around this unknown data repository, general consensus exists that navigating the Internet is not a straightforward task (Voorbij 1999: 598). One specific skill, which eludes many of these average users, is the finding of relevant information on the Internet in a short time (Brewer 2001: 54).
An Internet search engine is a program which allows the user either to specify (a) keyword(s), or to drill down into a topic through various levels of directories. Both approaches should lead the user to 'the perfect website', which will contain the exact information required. Commonly used search engines include Google, AllTheWeb, Yahoo! and MSN Search (Sullivan 2003). These search engines each contain a front end, an index and a set of collectors. The front end is the human interface, i.e. what the user interacts with on the screen during the searching process. The index is a large file which contains detail about millions of websites, and it is this file (not the Internet itself) which is queried when a user searches for information via a search engine. The collectors are either human editors (as with Yahoo!) or automated programs called spiders, bots or crawlers (as used by Google and AltaVista). Both types of collectors gather information about available websites and build it into the index file (Sullivan 1999: 34, Sullivan 2002).
All search engines offer a number of features to enable the user to find information with ease. However, it has been proven that most users do not avail themselves of these features (Weideman 2001: 197). The purpose of this article is to introduce the reader to an alternative approach to Internet searching, one that does not presuppose any knowledge of search engines or their syntax and operators. This approach is aimed at the typical IS/IT (Information Systems / Information Technology) university or technikon student who is not familiar with the detailed syntax of search engines.
The most basic form of Internet searching (inherited from pre-Internet systems) is to load a search engine, type in a single word, and instruct the search engine to find the information. Not surprisingly, this method seldom delivers relevant answers, especially if the word has many different interpretations (Siegfried et al. 1993: 273). As an example, common words like 'religion,' 'computers,' 'sport' and 'weather' produced the following approximate numbers of answers in four separate searches on Google: 12 000 000, 20 200 000, 23 100 000 and 28 000 000.
A logical next step would be to use either a very specific single word, or more than one word as a search query. Multiple-word queries fall into one of a number of categories: phrase searching, Boolean operators, inclusion/exclusion, combinational operators, and other methods. One method which focuses results is the use of the inclusion (+) and exclusion (-) operators. By using these operators, the presence or absence of certain words can be enforced. The information need of a user is clear from the following search query: +surf +board +specs -Internet -screen -website.
As a user progresses from simple to more advanced searching, operators can be combined to form very effective filters and produce a strongly focused search specification, for example: (+Mathematics +formula) OR (+science +formula) AND 'volume of a sphere'. However, a recent study claimed to have found that '... most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features' (Spink et al. 2001: 226).
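As an illustration only (the helper function below is a hypothetical sketch, not part of the original study), the combination of phrase, inclusion and exclusion operators into a single focused query can be expressed as:

```python
def build_query(phrases=(), include=(), exclude=()):
    """Combine phrase, inclusion (+) and exclusion (-) operators
    into one focused search query string (hypothetical helper)."""
    parts = [f"'{p}'" for p in phrases]      # exact phrases, quoted
    parts += [f"+{w}" for w in include]      # words that must appear
    parts += [f"-{w}" for w in exclude]      # words that must not appear
    return " ".join(parts)

# The surfboard example from the text:
print(build_query(include=["surf", "board", "specs"],
                  exclude=["Internet", "screen", "website"]))
# +surf +board +specs -Internet -screen -website
```

The exact operator syntax accepted by any given search engine differs, which is precisely the burden this article argues should be lifted from the user.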
2 Literature review

2.1 General searching

Lancaster made a prediction in 1978 about the way in which humans and computers would interact: 'The scientist of the year 2000 will use a terminal in many different ways: to receive, transmit, compose and search for text, seek answers to factual questions, build files and converse with colleagues. The terminal provides a single entry to a wide range of capabilities that will substitute, wholly or in part, for many activities now handled in different ways' (Lancaster 1978: 176).
Lancaster's 'scientist' is today's average PC (Personal Computer) user, his 'terminal' has become the desktop computer and 'conversing' has evolved into e-mail.The 'different ways' could be the shift from using an intermediate searching expert, as was done in the traditional library environment, to doing the search oneself.This moves the focus of this project to Internet search engines.
Schwartz stated that there could be a change towards smaller but better resources operating in a richer environment in the future (Schwartz 1998: 973), while Feldman claimed that the quality of results appeared to increase from one year to the next (Feldman 2000). However, it has also been claimed that search engines are complex, trusted without being understood, and that users simply deal with their answers without understanding why they receive those answers (Lynch 2001: 17). These disparate views require further investigation.

Search engines and operators
The design of search engine interfaces was criticised by Large and Beheshti, after they had done some searching experiments with 50 grade six schoolchildren. They believed that long lists of irrelevant answers could be suppressed to enhance a child's experience of searching the WWW (World Wide Web) for relevant information (Large and Beheshti 2000: 1079).
The syntax of search engines differs widely, and a wide variety of operators exists to allow focusing of a search query. Lancaster referred to natural language searching, where no operators are used, in an early reference (Lancaster 1978: 279). As an alternative, commonly used operators include phrases (using quotes), the inclusion operator (+) and the exclusion operator (-). Other, more advanced and less often used ones include Boolean operators (AND, OR, NOT), proximity operators (NEAR), stemming (*) and field limiters (DOMAIN, TITLE). These operators are discussed by a variety of authors (Boulton 2002; Habib and Balliot 1999; Hock 1999: 25; Notess 1997: 65; Notess 1999a: 64, 65; Sullivan 1999: 38). However, Jansen and Pooch quote a number of other studies which report a low percentage of real-life searches having included Boolean operators: the Fireball study (less than 3%, done in 1998), the Webcrawler study (12%, done in 1997), Infoseek (10%, done in 1998) and Excite (29%, done in 2000) (Jansen and Pooch 2001: 237, 238).
During a survey of 316 Excite users, Spink, Bateman and Jansen stated that few users employed logical operators and even fewer used the syntax correctly. The users also had problems with search phrases and with the construction of 'good search terms' and 'complex search queries' (Spink et al. 1999: 123, 125).
It can thus be concluded that the best way to construct a search query during an Internet search is not obvious to the average searcher. Where some search engines would add an implied OR operator between words (Infoseek), others will insert an AND (AltaVista and Google in their basic searching modes), and some might even treat the words as a phrase (Looksmart). These three widely differing approaches would produce very different results, which would confuse rather than enlighten an average searcher. An Internet search where five words with AND operators between them are used will produce x answers from a given database. The same search using the same five terms and the same database, but using OR operators between the words, will produce y answers, where x << y.

Multiple databases
Spink has confirmed that some users (without any advice or training) are doing more than one search on a variety of sources on OPACs (Online Public Access Catalog) and CD-ROM databases for better search results: 57% of a sample of 200 academic end-users conducted multiple search sessions (Spink 1996: 603).
The value of this richness of information is confirmed by Chun quoting Reid in saying that the more pieces of information available to choose from, the higher the chance of finding the correct one (Chun 1999: 137).The fact that those pieces of information could be spread over more than one database, assuming that the databases are all of the same quality, does not affect this basic truth.
A breed of search engine called a meta search engine makes this multiple searching possible in one Internet searching session. A meta search engine simply takes the user's query and submits it to the indices of a number of search engines simultaneously. Once the results have been received, website summaries are combined and one set of answers is produced.
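This fan-out-and-merge behaviour can be sketched in a few lines (a minimal illustration only: the `search` stub and engine names are placeholders, and a real meta search engine would query each engine's index over HTTP and parse the responses):

```python
from concurrent.futures import ThreadPoolExecutor

def search(engine, query):
    """Placeholder for querying a single engine's index; a real
    implementation would issue an HTTP request and parse results."""
    return [f"{engine}: summary of a site matching '{query}'"]

def meta_search(query, engines):
    """Submit one query to several engines simultaneously and
    combine the returned website summaries into one answer set."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = pool.map(lambda e: search(e, query), engines)
    merged, seen = [], set()
    for results in result_lists:
        for summary in results:          # drop duplicate summaries
            if summary not in seen:
                seen.add(summary)
                merged.append(summary)
    return merged

hits = meta_search("surf board specs", ["Hotbot", "Excite", "Webcrawler"])
```

The key point is the simultaneity: the user waits once for all engines, rather than loading and querying them one at a time.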
Garman in fact built a case for the use of meta search engines by strongly advising users to search multiple databases to satisfy the same query (Garman 1999: 75). Other authors confirmed the validity of this advice (Jacso 1999a: 227; Jacso 1999b: 99; Mickey 1999: 79; Williams 1996: 246).

Searching success
Searching success, in the context used here, can be described as finding information which satisfies the need, within a reasonable time. Lancaster discussed some factors which influence success in online searching many years before the advent of Internet search engines (Lancaster 1978: 331). However, lately many authors agree that learners should know how to retrieve information from electronic sources (Cronje and Clarke 1999: 215; De Jager and Sayed 1998: 197; Edling 2000: 11; Gan 1998: 29; Marais and Marais 1999: 85; Wong 1998: 5). IS managers consider the process of letting users (i.e. what the learners of today will become) find information for themselves as liberating to IS staff (Foley 1996: 45).
Some authors claim that, in general, finding relevant information in an information source is not a difficult task:
• '... the majority of Australian academics have a high expectation of success as they engage in information seeking on the Internet' (Bruce 1998: 555).
• '... Australian academic users generally ... are satisfied with information seeking ...' (Bruce 1998: 553).
• '... even the novice information searcher is able to extract relevant documents in rank order of calculated relevance' (Ross and Wolfram 2000: 949).
However, other authors claim that the opposite is true:
• 'The very size of digital libraries begs the question of how average users will continue to find context and meaning' (Huwe 1999: 68).
• 'Using Internet search engines to retrieve information can sometimes be a slow process ... and can result in far too many "hits" so that picking out relevant ones can be difficult' (Large et al. 1999: 5).
• 'Our customers (where customers are library users) understand their information needs, but they don't understand the tools that can satisfy them' (O'Leary 2000: 22).
• '... some respondents seemed confused about what they were to report when asked to list query terms for their search' (Spink et al. 1999: 122).
• '... the user's ability to specify good search terms and create complex search queries to clearly and precisely capture relevant retrieval seems rather low' (Spink et al. 1999: 129).
• 'Only 33% of the Internet users agree or strongly agree with the statement "it is easy to perform subject searches on the Internet"' (Voorbij 1999: 604).
• 'I find it difficult to search information on the Internet ...' (Voorbij 1999: 605).
• '... information seeking is a complex and difficult process for these students, who seek to reduce the task to finding an obvious answer or finding a good website ...' (Wallace et al. 2000: 75).
• '... both novice and experienced searchers were overconfident in their performance' (Wolfram and Dimitroff 1997: 1145).
In conclusion, it became clear from the literature that many users found the formulation of a search query difficult. It also appears as if the general success rate of Internet searching is low. Lancaster proved as far back as 1968 that the low quality of query formulations was the main reason for the failure of a search (Frants and Shapiro 1991: 17).
The following points have emerged as fundamental issues from the literature survey:
• the sheer amount of information available on the Internet is overwhelming,
• many features exist in search engines to help users find relevant data,
• users find it difficult to construct a search query from their information need,
• the use of more than one database seems to be advisable, and
• many authors agree that learners do not appear to be capable searchers.
These apparent problems that searchers experienced with searching the Internet in general, and with query formulation and syntax specifically, gave reason for concern. An attempt was therefore made to increase the success rate of Internet searching by undergraduate academic users at universities and technikons, by designing, testing and implementing a model, as discussed below.

Population and sample
In line with the standard approach, the whole (population) from which the part (sample) had to be taken was now defined (Forcese and Richer 1973: 121). The population was to be those who fall into the category of concern, viz. IT/IS learners. For the purposes of this study, a learner registered for at least one subject containing a major IT/IS component would be part of the target population. Although this study was not limited to South African learners, budgetary constraints prescribed that they would form the bulk of the population.

International institutes
University of Maribor (Slovenia), University of Dallas (USA) and Columbia University (USA).
According to Alreck and Settle, a correctly chosen sample would present an accurate extract of the population, without bias or error in the results (Alreck and Settle 1985: 63). Kothari confirms that the choice of sample size is crucial: if too small, it may not achieve the objective of accurate measurement, and if too large, wastage of resources and huge costs may be incurred (Kothari 1991: 121). Thus accuracy of the measurement must be balanced against cost and feasibility (Sapsford and Jupp 1996: 28).
Furthermore, as a result of the complex racial composition of the inhabitants of the country, it was considered necessary to ensure that all easily definable race groups be represented in the sample (in alphabetical sequence: African, Asian, Coloured and White).These race categories are commonly used in South Africa as a means of determining to what extent the government policies of Equal Opportunity and Redress are being followed.This study does not focus on racial issues, and the race of participants was considered merely to ensure that the results would be representative for the whole of South Africa.
Janes claimed that using a relatively small sample size can still provide useful answers, and provided a formula for the calculation of sample size when the population size and other variables are known. This formula to calculate sample size (n) was used as a control measure to determine the minimum sampling size (Janes 2000: 100). It is based on population size (N), measure of error (e), measure of confidence (Z) and assumed proportion in population (p):

n = [N · Z² · p(1 − p)] / [(N − 1) · e² + Z² · p(1 − p)]

A population size of 14 400 was obtained as follows: the estimated average number of IT/IS learners per institution was taken to be 400, which is a typical figure for the Cape Technikon. There are 21 universities and 15 technikons in South Africa, for a total of 36 institutions. Therefore: N = 400 × 36 = 14 400.
The values for e, Z and p were taken as recommended by Janes (e = 0.04, Z = 1.96, p = 0.5). His claim that changes in N have little effect on n was confirmed by the figures in the table below. This formula produced an initial value of n = 576 (with N = 14 400) learners for this study. A further control measure was the e-mail based statements by 28 colleagues from higher education institutions all over the world (mostly the USA, but also Argentina, Mexico, New Zealand, Norway, Romania and Venezuela) stating the number of IT/IS learners in their institution. The average number of IT/IS learners from these messages was calculated to be 660. A recalculation of N (population size) was done, using this new average of 660 instead of 400 as above, and using the same number of institutions of 36. This new calculation yielded an N of 23 760, which produced a new value for n of 585.
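These calculations can be verified with a short script (a sketch assuming the standard finite-population sample-size formula, which reproduces the figures 576 and 585 reported in the text):

```python
def sample_size(N, e=0.04, Z=1.96, p=0.5):
    """Minimum sample size n for a population of N, margin of error e,
    confidence multiplier Z and assumed population proportion p,
    using the finite-population form of the sample-size formula."""
    pq = Z**2 * p * (1 - p)                 # Z^2 * p * (1 - p)
    return round(N * pq / ((N - 1) * e**2 + pq))

print(sample_size(14400))   # 576  (N = 400 x 36 institutions)
print(sample_size(23760))   # 585  (N = 660 x 36 institutions)
```

Note how little n grows (576 to 585) when N increases by some 65%, confirming Janes's claim that changes in N have little effect on n.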
The author would attempt to involve at least 585 learners in the searching experiments.

Initial searching experiments
An extensive series of controlled experiments was done to establish how, where and for how long users search for information, given full Internet access, 30 minutes and one academic topic they could choose themselves. A total of 1 109 students, spread across 20 Higher Education Institutes, were involved in 46 separate searching sessions in this endeavour.
In the first phase, some pilot tests were done on 105 students to weed out any basic problems in the research methodology, and a test instrument was designed to measure searching success. Secondly, 293 other students used this instrument, and the results were recorded. The test instrument consisted of a flowchart combined with descriptive text boxes, which led the learners through a series of simple steps, guiding them in decisions on choice of search engines, operators and searching methodology. At this point, a total of only 39.1% of the participants claimed success in finding their information. This claim of success (or failure) was based on the learners' experience of finding/not finding relevant information to satisfy their information need within the allotted time. The basis of this decision was clearly explained to all participants. Some of the most common problems found include:
• users were not capable of using the operators at their disposal,
• users could not formulate their search specification, and
• users wasted time by waiting for only one search at a time to produce answers.
These findings correlate well with those of other researchers mentioned earlier. The test instrument evolved into a model which users could use to focus their search, and this model underwent a series of refinements to enhance its operation.

Further experiments and model refinement
During the third phase, another 90 students were tested, and the results were used to further adjust the model used to assist them in searching. The second version was still paper-based, and was simpler but more focused than the first. It required the user to specify the information need, then choose any number of search engines from a list. Next, guidance was given on search query construction. It was found that many users experienced problems with the formulation of their information need, as well as with converting the information need to a query. At this stage, it was decided to code the final model as a web-based program which assisted the user in formulating the query. This had the advantage of being usable at remote sites, and also made it more easily accessible to other audiences. This program was called FOIOTI: Finder Of Information On The Internet. A total of 621 students participated in the final phase of testing.

3.4 The operationalist vs the conceptualist approach

FOIOTI's design is based on Fidel's definition of two distinct modes of searching: the operationalist versus the conceptualist approach (Fidel 1991: 520, 521). The operationalist decides on an information need, and then concentrates on trying to find the best mix of search engine operators to achieve success. The searcher could even use different search engines (a good idea in principle) in the search for that ideal website. This approach is clearly favored by the technically minded searcher. However, the conceptualist attempts to think around the information need, and changes it as the search progresses. This person realizes that the formulation and subsequent fine-tuning of the search query is crucial to searching success.
Heinstrom compares the conceptualist approach to the wholist learning style, where the user tries to place the search within a context. The most important aspect of the request is taken as the starting point, and the search is then widened, which produces good recall. Operationalists concentrate mostly on high precision. Their searches tend to be narrow and exact, and they aim at finding only the necessary information to solve their problem (Heinstrom 2003). Definite positive aspects characterize both approaches. An in-depth understanding of the search operators of a few well-known search engines is a formidable weapon in the constant drive to find relevant information on the Internet. Similarly, the ability to manipulate the definition of an information need in order to filter out unwanted answers also goes a long way towards finding the exact information that is required. FOIOTI was designed to hide the technicalities of search engine operators and the construction of the search query from the user. This 'construction' process is actually a basic form of computer programming: both involve instructions in a non-natural language given to a computer. FOIOTI further simplifies the choice of search engine (which in itself could be a topic for research) to the click of a button.
The user types in the information need, extracts the required components from it, and initiates the search. Since no 'programming' is involved, the user is freed from the technicalities (FOIOTI takes over the operationalist component), and finds it easier to concentrate on the information need (the conceptualist approach).
As an example, assume your information need is: 'I need to find the final score in the 1995 World Cup rugby final.' The operationalist could submit a variety of queries, such as: 'World AND rugby AND final AND 1995 +score,' or '+rugby +score +1995 +final -football,' etc. If unsuccessful, he/she could change to another search engine.
However, the conceptualist might remember a photograph on a newspaper front page, showing Nelson Mandela handing over the cup to Francois Pienaar. These two persons have very little in common: the first is an ex-political prisoner who became a much-honored (now retired) President, while the second is a rugby captain. By changing the information need to 'I need an article where the names Nelson Mandela and Francois Pienaar appear in close proximity,' the chances of success are increased. A resultant search query like 'Nelson Mandela' NEAR 'Francois Pienaar' AND 1995 could be successful.
Singh further discussed these two concepts, and highlighted the fact that the operational searcher concentrates on high-precision searches by manipulating the system and its operators. The conceptualist chases a higher recall figure by spending more time on concepts and terminology, and generates subsets of his/her results (Singh 2003).

Population groups
As indicated earlier, an attempt was made to ensure that the major South African population groups would all be represented in the sample. It was practically impossible to mirror the exact distribution of the South African population across the four generally accepted categories in this experiment. However, Graph 1 indicates the actual figures of the race groups as covered by this experiment.

Search Engine choice
A decision had to be taken on which search engines were to be included in the final version of the model.A weighting system was used to combine search engines identified as being "good" choices based on the literature survey with those chosen by the learners.
A total of 20 search engines were selected, based on positive comments made by a variety of authors listed in the literature survey. A score of 10.5 points was allocated to each of them (10.5 being the average of 20 + 19 + 18 + ... + 2 + 1). These scores are based on the assumption that scores between 1 and 20 would have been allocated to them had they been ranked on preference; this score is called Score 1 (see Table 2).
Secondly, all the search engines chosen by the learners received a score, ranked according to the number of learners choosing them (column marked # USERS 2 in Table 2). Since there are 22 search engines in this group, the search engine chosen by the most learners received a score of 22, then 21, etc. Those with an identical number of votes received identical scores. This figure was called Score 2 in Table 2. The last column indicates the sum of Score 1 and Score 2, where a higher figure indicates a better choice of search engine. Based on Table 2, all search engines with a total (Score 1 + Score 2) of 20 and higher were chosen to be subjected to further tests. The search engines identified this way are listed in Table 3.

4.3 Search Engine operator choice

A summary was made of the methodology used by all those learners who did not find any information without a model, but did find it by using the model. Since these learners used some of the operators suggested in the initial versions of the model to achieve success, it was assumed that these operators have a higher probability of generating successful results for other searchers as well. Their queries were inspected, and the operators used by these successful searchers are summarized below. These findings confirmed those of Spink, Bateman and Jansen as well as Voorbij, who stated that searchers seem to favour simple operators and avoid more complex ones (Spink et al. 1999: 123; Voorbij 1999: 603). It was decided to do further tests on these operators, to determine which one(s) produced the most relevant results when used in combination with the search engines listed in Table 3.
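The two-part weighting scheme used above for search engine selection can be sketched as follows (a hypothetical illustration: the engine names and vote counts are invented, and the tie-handling is one plausible reading of 'identical votes receive identical scores'):

```python
# Score 1: engines recommended in the literature each receive 10.5 points.
literature_picks = {"Hotbot", "Excite", "Webcrawler", "Google"}
score1 = {engine: 10.5 for engine in literature_picks}

# Score 2: rank learner-chosen engines by votes; the most popular scores 22,
# the next 21, etc., with tied vote counts receiving identical scores.
learner_votes = {"Google": 140, "Yahoo": 95, "Hotbot": 95, "AltaVista": 40}
score2, rank = {}, 22
for votes in sorted(set(learner_votes.values()), reverse=True):
    for engine, v in learner_votes.items():
        if v == votes:
            score2[engine] = rank
    rank -= 1

# Total: engines scoring 20 or higher go through to further testing.
totals = {e: score1.get(e, 0) + score2.get(e, 0)
          for e in set(score1) | set(score2)}
shortlist = sorted(e for e, s in totals.items() if s >= 20)
```

With these invented figures, Google totals 10.5 + 22 = 32.5 and Hotbot 10.5 + 21 = 31.5, so both would pass the cut-off of 20.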

Search Engines/operator experiments
The next step was to combine the use of the search engines identified in Table 3 with the operators identified in Table 4 above. Since this research project is aimed at an audience of real-world users, a real-world testing scenario was preferred above a theoretical one. The author did over 300 detailed Internet searches, recorded the results, and a final decision about the operation of FOIOTI was taken. One example taken from these searches is given below. The searches each start with a REQUEST, which is based on a perceived information need (PIN) of a typical IT/IS learner (Mizzaro 1998: 306). A variety of QUERIES were then constructed, based on the operators identified in Table 4 above. Half of the queries were constructed from the popular operators (the first two rows of Table 4: Inclusion and Phrases), and the other half from the least popular operators (the last three rows of Table 4: Logical and Field).
Rows three and four (Stemming and Exclusion) were considered useful mostly on repeated successive searches on the same topic. For example, a user does a search on Intel processors, using the query: 'Intel' +processors. Only after receiving some incomplete answers does the user realize that the last 's' in the query could filter out wanted answers, and then changes the query to 'Intel' +processor*. Similarly, a search for password rules in Windows 2000 using the query: Windows AND 2000 AND password could produce superfluous answers covering all other Windows versions. Afterwards the user might adjust the query to: Windows AND 2000 AND password -NT -Me -XP -98 -95 -3.* to exclude unwanted version numbers. For this reason these two operators were excluded from the experiments.
Although the first ten queries were generated by the author, each one's topic was based on query topics of some of the participants up to this point. The author constructed each query to be generic in nature, and did not consider the exact syntax of any specific search engine. This was done to mimic the final version of the model, which treats the components of the query, as submitted by the user, in the same way for every search engine. Each query would be submitted to each one of the search engines identified in Table 3 above.
The first 10 results of each of the searches would be inspected in each case, since various authors noted that most users view only the first few results (Courtois and Berry 1999: 44; Notess 1999b: 84). A decision would then be taken about whether or not a relevant answer to the query was found amongst these ten websites. The author's own relevance judgement was considered reliable enough for this purpose, since all the REQUESTS fall within his field of knowledge.

Details about the interpretation of the table and the QUERIES below follow.
• The first row in each table contains an abbreviation for each search engine listed in Table 3 above.
• The first column in each table lists the queries described directly above that table.
• The numbers in the other columns indicate the number of answers received. A lowercase t indicates a time-out error given by the browser, with no figure as a result. A lowercase x indicates an error message from the search engine concerning the query syntax, with no figure as a result. The figures in bold include at least one relevant site under the first 10 (i.e. success), as judged by the author.
• Queries based on the Inclusion and Phrase operators are: Q1.3, 1.5, 2.3, 2.5, 3.3, 3.5, 4.3, 4.5, 5.3, 5.5, 6.1, 6.3, 7.1, 7.3, 8.1, 8.3, 9.1, 9.3, 10.1, 10.3, plus all queries from 11 to 20, both included.
• Queries based on Logical and Field operators are all those not listed above.
Only one of the 20 queries done is listed as an example below. The results of these 20 sets of searches showed that the use of simple phrases combined with the inclusion operator, for the three chosen search engines (identified below), consistently provided a higher success rate. For example, Q4.3 and Q4.5 (see Table 5) each has a relatively high number of bold figures, which is an indication of success rate. This trend was also evident in the remaining queries. It was therefore concluded that the operation of FOIOTI would be based on the following:
• overcoming common errors identified in searching strategy,
• the most successful operators, i.e. inclusion and phrases, and
• the three most successful search engines when used in combination with these operators, i.e. Hotbot, Webcrawler and Excite (Table 6).
NOTE: Subsequent to the completion of these experiments, the Webcrawler search engine has been replaced in FOIOTI by Google, as part of a further refinement experiment.
FOIOTI's actual operation consists of the following simple algorithm:
• ignore the contents of the specification box (it served its purpose of forcing the user to think clearly about the information need),
• surround the contents of the caps and phrase boxes with quotes to preserve capitalization and sequence,
• concatenate the last four boxes into a sequence of keywords, and
• precede each one of the four components with an inclusion sign.
For example, assume a user completes the FOIOTI boxes as follows:
CAPS BOX: Turkey
PHRASE BOX: death toll
WORD BOX: earthquake
WORD BOX: 1999
FOIOTI will then produce the following query from these boxes, and submit it to the search engine: +'Turkey' +'death toll' +earthquake +1999
FOIOTI's design is therefore based on Fidel's statement that '... analyzing users' seeking and searching behavior as it occurs in actual situations is a promising method for ... suggesting improvements in system design and in search environments' (Fidel et al. 1999: 36).
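The query-construction step described above amounts to a few lines of string handling (a sketch only: the actual FOIOTI was a web-based program, and the function name and specification-box text here are assumptions):

```python
def foioti_query(spec_box, caps_box, phrase_box, word_box1, word_box2):
    """Build the query FOIOTI submits: the specification box is ignored,
    the caps and phrase boxes are quoted to preserve capitalization and
    word sequence, and every component is preceded by an inclusion sign."""
    components = [f"'{caps_box}'", f"'{phrase_box}'", word_box1, word_box2]
    return " ".join(f"+{c}" for c in components)

print(foioti_query("Deaths in the 1999 Turkish earthquake",  # ignored
                   "Turkey", "death toll", "earthquake", "1999"))
# +'Turkey' +'death toll' +earthquake +1999
```

Note that the specification box contributes nothing to the query string itself; its only role is conceptual, forcing the user to articulate the information need before extracting the components.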

FOIOTI usage
FOIOTI is available at www.mwe.co.za. Follow the links from the homepage (Java must be enabled on the browser to make FOIOTI visible, and Ctrl should be held down while clicking to disable any pop-up blockers).
The success of this model is based on the fact that the user is shielded from search engine syntax and other technicalities. The only two prerequisites are a clear understanding of exactly what the search topic is, and basic mastery of the English language. The reader is reminded that FOIOTI was designed for 'the average' undergraduate IS/IT learner.
The large gray FOIOTI box (see Figure 1) requests that the user complete all five rectangular white boxes before clicking on any one (or more) of the three search engine buttons at the bottom. FOIOTI will then construct a query using the information just supplied, load the chosen search engine(s), and feed the query to it. The result screens are presented to the user in the same way as if he/she had simply accessed the search engine the normal way (Weideman 2002a).
The first box requests the user to type out the information need in a full English sentence. This is a crucial first step, since it forces him/her to think carefully about what the information need actually is. Specific words and phrases should be used, for example 'BMW' rather than 'motorcar,' 'standard marathon' rather than 'jogging,' etc. Although this first box plays no part in FOIOTI's final construction of the search query, it does force the user to think about the topic formulation. It also produces the word(s) required by the other four boxes. If the user has trouble in completing the information requested in the remaining four boxes, it normally implies that the information need in the first box is incomplete. The user may at any time simply come back to this box, add more specific terms/phrases, update the other four boxes, and try again.
The second box requires that a word be entered (it could be more than one) which is normally spelt using (a) capital letter(s). First names and abbreviations immediately come to mind. Users should take care to type this word the way it is used in everyday terms, i.e. 'MICROSOFT' is correctly spelt, but 'Microsoft' is a more common way of using this word. Other examples: FBI, DVD, Monaco and Charles. FOIOTI preserves the capitalization of this box before the query is submitted. If no caps words can be deduced from the spec box, any lowercase meaningful word could be used (stop words such as 'the', 'that', 'in', 'information', 'computer', etc. should be avoided).
The third box requests a phrase. FOIOTI's definition of a phrase is: 'Two or more words which have been copied from a grammatically correct English sentence (in the spec box) without disturbing the order.' Users should avoid general phrases such as 'information about computers,' 'John Smith,' 'world record,' etc. Better examples are 'lower Manhattan,' 'Tour de France' and 'UK economic indicators.' An important feature of the contents of this box is that it can be used effectively to focus the scope of a search. FOIOTI preserves the sequence, capitalization and spelling of this phrase in the search query.
The phrase should be used to widen or narrow the scope of the search. If too many answers are received, the phrase should be lengthened. Similarly, if zero answers are produced while using a six-word phrase, cut off one word at a time to broaden the focus, which should produce more answers. The following four phrases, as an example, could be used as part of a complete FOIOTI specification to successively apply a sharper focus to a search:
• 'teaching materials'
• 'teaching materials for English'
• 'teaching materials for English literature'
• 'teaching materials for Shakespearean English literature'
The final two boxes require any two words related to the topic to be typed in. Again, an attempt should be made to refrain from using general terms (such as 'bicycle' or 'President'); substitute these with specific words (such as 'Cannondale' and 'Putin'). Once all five boxes have been completed, the user should click on one or more of the search engine buttons at the bottom of the gray box. When visiting the second site from the result summaries, a document is found which contains a full explanation of the original information need; see Figure 3 for an excerpt.
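The broadening strategy described above (cutting one word off the end of the phrase at a time until the search yields results) can be sketched as a simple loop. Here `search_hits` is a hypothetical stand-in for a real search-engine call, introduced purely for illustration:

```python
def broaden_phrase(phrase, search_hits):
    """Drop trailing words from the phrase one at a time until the
    search returns at least one hit, then return the phrase used.
    Returns None if even a single word produces no hits."""
    words = phrase.split()
    while words:
        candidate = " ".join(words)
        if search_hits(candidate) > 0:
            return candidate
        words.pop()  # cut off the last word to broaden the focus
    return None

# Simulated engine: only the two-word phrase yields results.
hits = {"teaching materials": 120}
result = broaden_phrase("teaching materials for Shakespearean English literature",
                        lambda p: hits.get(p, 0))
print(result)  # teaching materials
```

The loop mirrors the four example phrases above, traversed from the most specific to the most general.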

[Figure 3 (excerpt from a results page): a 'Related Links' panel reading 'The following websites are related to the current topic "OSI Reference Model". Feel free to click on them for more ...']

Final results
The most basic purpose of this research project was to enable a learner to find relevant information on the Internet. One way to measure FOIOTI's success in achieving this goal is to compare the number of Yes answers (i.e. 'Yes, I have found relevant information within the time and other constraints of the experiment') before FOIOTI with the number of Yes answers thereafter. This led to the definition of four possible classes, depending on the final answers given for each experiment (before/after): No/No, No/Yes, Yes/No and Yes/Yes. However, since the two related experiments were often done on separate days at the same institution, it could happen that a learner attended only one of the two sessions. If one of the two answers was absent, that participant's contribution would become invalid, producing a fifth class. Other reasons for classifying a response as invalid included incomplete forms and invalid Yes/No answers, pointed out by other information supplied on the same form. These five classes were now formalised. For ease of interpretation, they were arranged from most wanted result (Class 1) to least wanted result (Class 4). Overall, FOIOTI thus improved the searching ability of 71.0% of learners, it had no positive effect on 25.6% of learners, and it had a negative effect on 3.4% of learners.
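The before/after tallying described above can be sketched as a short script. This is an illustration only: the exact class-to-pair mapping is an inference from the surrounding discussion (not copied from Table 7), and the sample data is invented:

```python
from collections import Counter

# Assumed mapping of (before, after) answer pairs to classes, inferred
# from the text: class 1 = improvement, class 2 = already succeeding,
# class 3 = unchanged failure, class 4 = regression.
CLASSES = {("No", "Yes"): 1, ("Yes", "Yes"): 2,
           ("No", "No"): 3, ("Yes", "No"): 4}

def tally(responses):
    """Count classes; any unmatched pair (missing or invalid answer)
    falls into class 5. Percentages are reported over classes 1, 3
    and 4 only, since classes 2 and 5 say nothing about FOIOTI's
    effect on searching ability."""
    counts = Counter(CLASSES.get(pair, 5) for pair in responses)
    valid = counts[1] + counts[3] + counts[4]
    return {c: round(100 * counts[c] / valid, 1) for c in (1, 3, 4)}

# Invented sample: 5 improved, 2 unchanged failures, 1 regression.
sample = [("No", "Yes")] * 5 + [("No", "No")] * 2 + [("Yes", "No")]
print(tally(sample))  # {1: 62.5, 3: 25.0, 4: 12.5}
```

With the article's real data the same calculation yields the reported 71.0% / 25.6% / 3.4% split.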

Conclusion
Considering the limits of the study (the definition of the population, the practical issues of working in computer laboratories, etc.), the results showed that FOIOTI can be successful in increasing the success rate of the population sample.
It is doubtful whether there will ever be an up-to-date, central index from which users can access all the information on the Internet in an orderly and consistent way. The freedom which users have to add and modify information on websites makes it virtually impossible to manage and control an index of this kind.
Search engine databases are in a constant state of change, meaning that the same search on the same search engine on two consecutive days often yields different results.Interfaces and even searching algorithms are updated, forcing the user to alter his/her searching strategy.
The result is that it is up to the user to acquire a certain level of navigational skill, and specifically searching competence.A basic understanding of the alternative search engine operators outlined earlier is essential for survival in a world where information overload commonly occurs.Carefully focussed searches using more advanced operators and concepts are becoming almost a necessity and not the exception.However, extensive searching experience and consultation by the author have shown that most users have a strong resistance to: the technicalities of searching, changing to new search engines and the learning curve of search strategies.Products like FOIOTI could solve this problem.
A number of dedicated databases and/or search engines have been in existence for some time, servicing some of the major disciplines such as law, medicine, finance, etc. (Weideman 2002b). These search engines focus on one specific area only, and offer the user a selection of links in a very specific and narrow area. However, most of them still slow the user down by offering unfriendly interfaces, technical procedures and limited-depth databases. The FOIOTI concept could be applied to one of these search engines, allowing the user easier access to a selected, focused set of information.
It is finally concluded that, if used by an audience as defined, there is a strong possibility that FOIOTI will increase the success rate of Internet searching.
A new research direction which the author has embarked on, flowing from the research described here, is website visibility. This article (and other work done at this point) looks at the website from the user's side: how can the user easily find the website? Website visibility looks at the same website from a different angle: how can the website owner ensure that, once a search engine has found the website, it will be indexed properly? Work is currently in progress towards defining a best practice approach to designing websites for search engines.

Acknowledgements
The author wants to acknowledge the Cape Technikon for partial funding and leave to undertake this research project, as well as Hewlett Packard and Acer Africa for further funding.My thanks also go to Wouter Kritzinger for proofreading and finalizing the document.

REQUEST 4
I need a description of the operation of the Melissa computer virus.
(Virus OR URL:Virus) AND (+Melissa +computer +virus +description)
The two least successful search engines (Yahoo! and Lycos) were subsequently dropped from the last stage of experimentation. Furthermore, the successes (bold figures) per search engine were added, producing the following final number of successes:

Figure 3. Excerpt from explanation
However, only classes 1, 3 and 4 provide a true indication of FOIOTI's performance. In the case of class 2, FOIOTI could not have any positive effect on the learner's searching ability, and it did not have a negative effect. Similarly, class 5 is invalid for reasons outside the control of the author, and it has no bearing on FOIOTI's performance. Therefore both classes 2 and 5 are omitted from the final evaluation. FOIOTI's performance-based results of the last two sets of experiments are listed in Table 9.
Universities and technikons have traditionally supplied the local market with IT/IS learners, via qualifications such as a B.Sc. (Computer Science), B.Comm., National Diploma in Information Technology and National Diploma in Financial Information Systems. As a result, local universities and technikons, plus similar international institutions, would form the population for this study. Those institutions where a controller could be identified to assist would be included in the study. The controller would be necessary to arrange computer rooms, groups of learners, Internet access and other facilities ahead of time. Eventually a total of six out of 15 technikons, and 11 out of 21 universities in South Africa were included in the sample. These institutes are: Technikons: Cape, Natal, Wits, Port Elizabeth, Orange Free State and Border. Universities: Stellenbosch, Durban-Westville, Natal, Rand Afrikaans, Orange Free State, plus the following Vista University campuses: Mamelodi, Port Elizabeth, Bloemfontein, Welkom, East Rand and Soweto.

Table 1
Interaction of Sample and Population size

Table 2
Search Engine Choice

Table 3
Highest Search Engine Scores

Table 4
Operators used by Searchers

Table 7
FOIOTI Class Definition

Table 8
Reasons for Invalid definition

Table 9
FOIOTI Class Results