Dear friends,

Please find an interesting article related to Information Retrieval.

Source: http://www.hinduonnet.com/stories/2003072800120200.htm

Aman Kumar Jha
Librarian
South Asia Human Rights Documentation Centre
New Delhi

Information retrieval with cURL

THIS WEEK NetSpeak focuses on a tool for automatically transferring files from the Net using a variety of Net protocols.

The innumerable and wide-ranging Net resources available on servers across the Net are stored and retrieved using different well-defined methods or protocols. The web protocol HTTP (Hyper Text Transfer Protocol), FTP (File Transfer Protocol) and the Gopher protocol are three popular examples. To retrieve resources stored on various servers, many tools that understand these protocols have been developed; these are called Net clients. The web browser, which understands the HTTP protocol, is a client tool for retrieving resources from a web server. Similarly, there are clients for other protocols such as FTP and POP.

The drawback of most of these tools is that, though quite easy to operate, they lack flexibility. For example, the browser has no easy mechanism for selectively downloading a few web pages from a site. Again, if you want to regularly and automatically download all newly updated files with a specific extension from a site, current browser features are inadequate. Here is another scenario: a Net resource is mirrored at several locations and, to speed up the download, you want to split the file into multiple parts and fetch each part from a different location, instead of downloading the whole file from one place. None of the current browsers can do this. All of this points to the need for a better information retrieval tool, one that helps automate download tasks and lets you download material exactly as required. One such client tool is cURL, which can be used to download Net resources automatically with extreme ease and flexibility.

cURL

As per its web site (http://curl.haxx.se/), cURL (which stands for `client URL') is a command-line utility for transferring files to and from diverse Net servers using protocols such as FTP, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. Any Net resource that uses the standard URL format (like http:// or ftp://) can be retrieved with this tool. Please note that cURL has nothing to do with `Curl', the web programming language discussed in this column (http://www.hinduonnet.com/thehindu/biz/2001/12/13/stories/2001121300340100.htm) long ago. You may also note that this free tool, which runs on platforms such as Windows and Linux, not only lets you download files but can also be used to upload files to web and FTP servers. Another highlight is the free curl library, libcurl, which can be integrated with popular programming languages such as Basic, C, C++, Python and PHP to develop curl-based applications.

As usual, to get a good hold on the capabilities of this program and internalise them, let us go through a few examples. The command `curl http://www.hinduonnet.com' will download the home page of The Hindu's site and display it on your screen. If you want to view the HTTP exchange, the conversation that takes place between the client and the web server while the page is downloaded, execute the command with the `-v' (verbose) option, like this: curl -v http://www.hinduonnet.com

To upload a file to an FTP server, use curl with the `-T' option, as follows:

curl -T readme-file ftp://user:password@ftp-domain-name.com

The above command uploads the file `readme-file' to the FTP server. If a file is stored at two locations, the `-r' (range) option lets you retrieve different parts of it from each server simultaneously. For example, suppose a file called `example.doc', 39424 bytes in size, is stored on two servers called server1 and server2. To download the file in two parts from these servers, use the following commands:

curl -r 0-1499 -o example1 http://server1/example.doc (this downloads the first 1500 bytes of the file from server1 and stores them as `example1')

curl -r 1500- -o example2 http://server2/example.doc (this downloads the rest of the file from server2 and stores it as `example2')

Combining the two downloaded parts gives back the original file.
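Taken together, the basic commands quoted above look like this; the FTP host, user name and password are placeholders, and `-o' (write the output to a named file) is the same flag used in the split-download commands:

    # Fetch a page and print it to the screen
    curl http://www.hinduonnet.com

    # The same fetch, showing the client/server conversation
    curl -v http://www.hinduonnet.com

    # Save the page to a file instead of the screen
    curl -o hindu-home.html http://www.hinduonnet.com

    # Upload a local file to an FTP server
    curl -T readme-file ftp://user:password@ftp-domain-name.com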
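And here is the split-download scenario as a runnable sketch, assuming a Unix-like shell; the column does not say how the two parts are to be combined, so the `cat' step below is one common way of doing it (on Windows, `copy /b example1+example2 example.doc' has the same effect):

    # Fetch bytes 0-1499 (the first 1500 bytes) from the first mirror
    curl -r 0-1499 -o example1 http://server1/example.doc

    # Fetch the remainder, from byte 1500 to the end, from the second mirror
    curl -r 1500- -o example2 http://server2/example.doc

    # Concatenate the two parts to reconstruct the original 39424-byte file
    cat example1 example2 > example.doc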
Programs based on libcurl

As already mentioned, along with the cURL client program, a curl `program library' that can be used to create cURL-based applications is also available at the cURL site. The free program Getleft, developed for downloading complete web sites, is a good example of a product created with the help of this library. Getleft, written in Tcl (Tool Command Language, http://tcl.activestate.com/), can download a web site completely when you simply enter the site's address in its input box. During the download, it alters the links in the pages so that you can browse the site locally without any trouble. For more details, check out: http://personal1.iddeo.es/andresgarci/getleft/english/

Bootdisk

If you are a regular computer user, it is likely that you have faced a hard disk failure at least once. Once the machine fails to boot from the hard disk, the next feasible solution is to attempt to boot from a floppy disk. If you failed, or forgot, to create a boot disk during installation, this course will not be open to you. If you land in such a situation, check out the site BootDisk (http://www.bootdisk.com), which hosts many programs for creating boot disks. For example, to create a DOS 6.22 boot disk, access the site, click on the `DOS/Windows...' option and download the appropriate boot software. Then insert a fresh floppy disk and run the downloaded program, which turns the floppy into a bootable disk by transferring the DOS system files on to it. Apart from the several `boot' programs, the site carries much other valuable material, including information on hard disk partitioning, networking and several `how-to' guides.

Free web space index

It is quite likely that many of you have your own web sites or are planning to launch one. One easy route is to use the services of a free web space provider and build your site on its server. There are many free web space providers on the Net, such as Tripod (http://www.tripod.com) and Netfirms (http://netfirms.com/), offering varying tools and facilities. For the latest information on free web space providers, check out the search service `Free web hosting space finder' at: http://www.free-web-space-finder.com/

J. Murali