   Investigating   

 Keys: investigate, search, find, internet, network, forums, email, webpages, keywords


Presentation:
Investigating on the internet sometimes means looking for a needle in a haystack. Information is often hard to find, either because it is rare (few webpages talk about it) or because it is lost among thousands of other pages which deal with similar ideas or similar "keywords". Here I'll try to explain my methodology for finding information effectively on the net.

My advice:
Preparing: How to select your arsenal of tools
What you need for effective searches is an arsenal of tools ready to use.
Search tools on the net include:
- search-engines
Right now the most powerful search-engine is definitely Google.com, which allows web searching of text, images, news and newsgroup messages (ex-Deja.com). You may only need to look at other search-engines once Google's results have been fully exploited, but it is still good to know 2 or 3 others. In some places it is even compulsory: in China, for instance, Google is often blocked by the authorities! Abondance.com (in French) gives you good lists of search-engines & directories, and their specific features.

Notes:
- Yahoo uses the same index as Google, so don't expect much from Yahoo if you have access to Google.
- You can also use special software as a search-engine, which gathers the results from many search-engines. Copernic is a good example of that.

Tips with Google:
- To save time, don't click directly on the links in Google's results, but (under Windows) right-click and choose "Open link in a New window"; you can open several results very quickly, and while you're browsing the first page, the other results have time to finish downloading.
You can also change the Preferences of Google (specific to the computer you're using) so that any click on a result opens a new window automatically.
- Google keeps a cache of webpages, which means that if you click on a link from Google's results and get a "Page cannot be displayed", you can go back to the results and open the "cached" copy of the page to get the information! This is very powerful and useful.
- With Google you can use several interesting tips to make your keywords more precise. These tips are described here. The most useful are maybe the "...", which keep only the pages containing the exact group of words that appears between the quotation marks, and the minus sign, which removes all the pages containing a given word: for instance, if your results are polluted by porn pages, just add '-sex' (without the '), etc.


- newsgroups & forums
There are different ways to browse newsgroups. One is to use a website (like Google, in the "Newsgroups" tab), the other is to use a news client (for instance Outlook Express offers you both email and newsgroup services). Searching newsgroups can take a long time; the best approach is to ask a question in an active newsgroup which deals with the topic you're looking for information on. This means that you need to know which newsgroups cover this topic... thanks to the name of the newsgroup or to the results of Google.
The problem is about the same for forums: you need to know the websites which offer active forums on your topic. I personally use newsgroups & forums for my regular (not new) topics, because I know the associated newsgroups & forums well.

- e(mail)-penpals
Your network of e-penpals can also be a very good way to get a question answered... sometimes we don't think of them! And if you feel that yours can't help, develop your penpal network!

- web chatrooms & software-based chat
To chat with people you can either go to websites that offer this service (it will depend on the topic of your research), or use the following software: ICQ, MSN Messenger, Yahoo Messenger, AOL Instant Messenger, IRC. I personally use Trillian, which merges all of these into one interface: very convenient.

- P2P Software
When you are looking for ebooks, video or music, P2P (peer-to-peer) programs are effective tools. The most famous are Napster, Kazaa, WinMX, Gnutella, etc., and the recent Tesla.

Using Search-engines: Choosing the right keywords
The first problem that most internet users face is bad or poor search results. The reason is, 9 times out of 10, a bad choice of keywords. Keywords that are too generic will bring you too many pages with very little connection to what you're looking for. Too many precise keywords will bring you empty results.

To choose the right keywords, you need to:
- try to identify an "effective keyword", by finding a word which is rare enough to give limited results but which is 100% related to the information you want.
For instance, if you're looking for a table to convert Korean characters into the Latin alphabet, the word "romanization" is exactly what you need: it is not very common, but it describes exactly what you're looking for.
Request: romanization table Korean
- use unambiguous words: choose words that do not have a double meaning, and if they do, narrow them down with another word (using "..." if you can, when the words form a common combination).
For instance, if you're looking for information on a company producing TV shows called Case Production, the words "case" and "production" are too generic on their own.
Request: "case production" entertainment
- think of words that have to be on the page where the information you're looking for is. You have to imagine the context in which the information can be found.
For instance, if you're looking for the lyrics of "Music" (Madonna), since both the title and the artist are very common words, add one of the lines of the song.
Request: music madonna lyrics "I wanna dance with my baby"
- filter by adding/removing keywords in your request.
For instance, if you're looking for information on a rare strategy book written by Toshishiro Obata, you may start with "toshishiro obata", but to remove all the websites which deal with generic information on this Shinkendo master, you'll add:
Request: "toshishiro obata" +strategy +book
Note: the + is optional in Google; it is put here to highlight the fact that you want only the webpages that have the words "strategy" and "book" in their content.
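
If you build many requests, the rules above can be sketched in a few lines of Python. This is only a minimal sketch: the function name build_query and its parameters are my own invention for illustration, not part of any search-engine's interface, and the "-aikido" exclusion in the last call is a hypothetical example.

```python
def build_query(words=(), phrases=(), exclude=()):
    """Assemble a search-engine request string (hypothetical helper).

    words   -- plain keywords, e.g. "romanization"
    phrases -- exact groups of words, wrapped in double quotes
    exclude -- words that must not appear, prefixed with a minus
    """
    parts = [f'"{phrase}"' for phrase in phrases]
    parts += list(words)
    parts += [f"-{word}" for word in exclude]
    return " ".join(parts)


# The requests used as examples in this section:
print(build_query(words=["romanization", "table", "Korean"]))
print(build_query(phrases=["case production"], words=["entertainment"]))
print(build_query(phrases=["toshishiro obata"], words=["strategy", "book"]))
# Hypothetical exclusion, to drop pages about another topic:
print(build_query(words=["bwang", "martial"], exclude=["aikido"]))
```

The order of the keywords does not matter much here; the point is simply to keep exact phrases, plain words and exclusions separate while you refine a request.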

Surfing: How to get more information from information
Once you have found information which is related to what you're looking for, but you want to know more, you can:
- explore the website more deeply, by any means: following all the links, downloading the whole website with an offline browser and searching the text of its pages locally with Windows Search, etc. One technique to try is to remove the file name at the end of the web address, in order to get the directory listing of the folder where the page is stored on the server. Sometimes it shows you other pages that would not be easy to find by following the links of the website. You can also remove the last folder name, and so on, to explore all the folders of the website.
- email the webmaster and ask directly for the information you're looking for, which costs nothing. It does not work every time, but it is not rare to find very nice people who will answer you.
- and above all, when you have learned some more details about the information you're looking for, don't forget to re-use what you have discovered in your search-engine request, in order to filter the pages again and get more precise results!
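
The "remove the file name, then the last folder" technique can be written down as a small Python sketch. The function name parent_urls is an assumption of mine, and the sketch only lists the addresses to try; it does not download anything, and many servers will refuse to show a directory listing.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """List the folder addresses above a page, from the closest folder up to the site root."""
    parts = urlsplit(url)
    # Drop the last path element (the file name) and keep the folder names.
    folders = [name for name in parts.path.split("/")[:-1] if name]
    results = []
    while True:
        path = "/" + "/".join(folders) + ("/" if folders else "")
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
        if not folders:
            break
        folders.pop()  # remove the last folder name and try again
    return results


for address in parent_urls("http://www.uog.edu/up/micronesica/indexes/toc.htm"):
    print(address)
```

Each printed address is one you could paste into the browser, starting with the folder of the page and ending with the home page of the site.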

Searching into webpages
This tip is very trivial, but I so often see people who don't use it... When you look for precise info on the internet, you often find webpages with long texts. As soon as the webpage is downloaded (or even before, if part of it is already loaded), you should use the Find tool of your browser (Control-F in Internet Explorer) to go straight to the word that interests you and see whether the information on this page is valuable, or whether you can skip it and try another one.
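
If you have saved the text of many pages, the same idea can be applied with a short Python sketch. It assumes the text of the page is already in a string; the function name find_in_text and the width parameter are hypothetical, chosen only for this illustration.

```python
import re


def find_in_text(text, word, width=30):
    """Return a short excerpt around each occurrence of word, ignoring case."""
    if not word:
        return []
    excerpts = []
    for match in re.finditer(re.escape(word), text, flags=re.IGNORECASE):
        begin = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        excerpts.append(text[begin:end])
    return excerpts


page_text = "One article was mentioned: Bwang, A Martial Art of the Caroline Islands."
for excerpt in find_in_text(page_text, "martial"):
    print("..." + excerpt + "...")
```

An empty result tells you right away that the page can be skipped.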

Looking into the Source of the page
Looking into the source of the page (menu View/Source in Internet Explorer) means looking at the code of the page. It is scary for people who have never done any programming, but it can be useful. If you know a bit of HTML, it will for instance enable you to grab any picture, even if it is protected by the website. For people without IT skills, it can help to find the webmaster's name/email in the first lines of the page, or to see the list of keywords that were used to register this page in the search-engines (all this info starts with "<meta name=...>"), which can help you find more precise keywords.
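
For those who prefer not to read the source by eye, here is a minimal Python sketch that lists the "<meta name=...>" information of a page using the standard html.parser module. The class and function names, and the sample page, are made up for the illustration; real pages may have no meta tags at all.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the name/content pairs of the <meta> tags of a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.meta[name.lower()] = content


def read_meta(html_source):
    """Return a dictionary such as {"keywords": "...", "author": "..."}."""
    collector = MetaCollector()
    collector.feed(html_source)
    return collector.meta


# Hypothetical page source, for illustration only.
sample = """<html><head>
<meta name="author" content="J. Doe">
<meta name="keywords" content="bwang, martial arts, micronesia">
</head><body>...</body></html>"""

for name, content in read_meta(sample).items():
    print(name, "=", content)
```

The "keywords" entry, when it exists, is a ready-made list of candidate words for your next request.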

To find all the email addresses of a page (sometimes they're hidden in links or forms), just search for the @ character in the source of the page.
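
The search for the @ character can also be automated. The following is a rough Python sketch, assuming the source of the page is in a string; the pattern is a deliberately simple approximation of an email address (it will miss unusual addresses and obfuscated ones), and the sample source is invented.

```python
import re

# Simplified pattern: something@domain.tld (an approximation, not a full validator).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(html_source):
    """Return the email-like strings of a page source, without duplicates, in order of appearance."""
    seen = []
    for address in EMAIL_PATTERN.findall(html_source):
        if address not in seen:
            seen.append(address)
    return seen


# Hypothetical page source: one address in a link, one hidden in a form.
sample_source = (
    '<a href="mailto:webmaster@example.org">Contact</a>'
    '<input type="hidden" name="to" value="info@example.org">'
    '<a href="mailto:webmaster@example.org">Write to us</a>'
)
print(find_emails(sample_source))
```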

Searching in multi-languages
One limitation of your search is the set of languages you have mastered. But you don't need to speak a language to find information in that language! Today, many "translation assistants" can immediately (and for free) translate a webpage for you. Of course the translation is not very accurate, and its many mistakes can make it hard to re-use as it is (it is only an "assistant"), but it can give a good overall idea of what the webpage is about. And afterwards, why not try to contact the webmaster for more details, in English!

Translation assistants: AltaVista Babel Fish, Systran, Reverso, WorldLingo, Google Language Tools, etc.

Case study: Bwang, a martial art from Micronesia
Here is a concrete example of an investigation on the internet, which happened to me a few years ago. At the editorial office of Karate-Bushido magazine, the editor, Patrick Lombardo, mentioned a martial art he had heard of in the past and had had no news of since... it was called Bwang. It was impossible to find anything in the different resources we had. Back home, I decided to surf the internet. At that time Google was not yet born and Altavista was the most efficient search-engine. But a search on Bwang (Request: bwang) did not give anything except links to people called B.Wang or things like that. To filter out these worthless results, I added the word "martial" (Request: bwang martial) and got a few pages. Some of them were things like the resume of some Mr B.Wang who had practised Aikido when he was young... and only one page talked about what interested me.
This page was a very simple HTML page, with no links, only text: a bibliography (http://www.uog.edu/up/micronesica/indexes/toc.htm, today the page has changed). In this list of references, one article was mentioned: "Bwang, A Martial Art of the Caroline Islands", by William A. Lessa & Carlos G. Velez-I. This was very interesting information: it meant that Bwang was the right word, with no mistake. This page was the only one and was not linked to anything.
So from that point, my methodology gave me several ways to continue. One was to go back to the search-engine with the names of the authors of this article, in order maybe to contact them. Another was to ask my penpals who know about that subject (difficult in this case: the topic is too precise, too rare). What I did first was to look at the source of the page, where I found the name of the person who had created the webpage, but no email. Then I removed the name of the HTML file from the URL (http://www.uog.edu/up/micronesica/indexes/). The website was that of the University of Guam, an island of Micronesia.
By going up through the website, I managed to find lists of people at the University and their email contacts. Then it was easy: I even managed to find the person who had made the webpage and contacted him. I managed to get a photocopy of this article from the issue of Micronesica, the Journal of the University of Guam (dating from 1978), for the price of the postage, and received a very nice article of several pages with technical pictures of Bwang :-)

Note that the results have changed since that time, because my website and other websites which have visibly read my story now appear in the first ranks with info on Bwang. Moreover (and fortunately) the University of Guam website has been totally re-designed since then.




 

 


"When you seek it, you cannot find it."
cf. Anonymous (popular saying).

Some people say it is a Zen riddle... Anyway it is so true sometimes, particularly when you're looking for people or abstract things. But when seeking on the internet, I hope you'll manage to turn this quote into a lie.

 

 
 
© 2000-2008 Guillaume Morel - http://www.guillaumemorel.com