 Analog 5.32: Search arguments
Search arguments are the part of a URL after the question mark. For example, a request for
/cgi-bin/script.pl?x=1&y=2
runs the /cgi-bin/script.pl program with arguments x=1 and y=2. (Sometimes the server records these arguments in a separate field in the logfile. If so, you can use the %q field in the LOGFORMAT command, and analog will translate the filename to the above format.)
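To make the split between filename and arguments concrete, here is a minimal Python sketch. It is only an illustration built on Python's standard library, not analog's own code, and the helper name split_request is hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

def split_request(request):
    """Split a logged request into its filename and its search arguments.

    Hypothetical helper for illustration only; analog does this internally.
    """
    parts = urlsplit(request)
    # keep_blank_values=True keeps arguments such as "x=" that have no value
    return parts.path, parse_qsl(parts.query, keep_blank_values=True)

filename, args = split_request("/cgi-bin/script.pl?x=1&y=2")
print(filename)  # /cgi-bin/script.pl
print(args)      # [('x', '1'), ('y', '2')]
```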
You can tell analog either to read or to ignore the arguments using the commands ARGSINCLUDE and ARGSEXCLUDE which we'll discuss in a minute. But by default, all arguments are read, and as this is usually what you want, you don't usually need those commands.
You don't always see the arguments in the reports, even if they're being read, because analog doesn't show them if there aren't enough of them. In order to see them, you have to set the corresponding ARGSFLOOR parameter low enough.
Also note that within a report, the search arguments are listed immediately under the file to which they refer. This temporarily interrupts the normal order of the files. It may be clearer if you turn the N column on.
When arguments are being read, they count as part of the filename, so you need to allow for them in your other commands. For example, the command
FILEINCLUDE /cgi-bin/script.pl
doesn't match the file /cgi-bin/script.pl?x=1&y=2. To match that, you would have to use something like
FILEINCLUDE /cgi-bin/script.pl*
instead. Similarly
FILEALIAS /cgi-bin/script.pl /script.pl
will change /cgi-bin/script.pl itself, but not /cgi-bin/script.pl?x=1&y=2. You might want to use something like
FILEALIAS /cgi-bin/script.pl?* /script.pl?$1
as well. (However, PAGEINCLUDE and PAGEEXCLUDE always refer to the part of the filename before the question mark.)
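The matching behaviour described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not analog's implementation: it assumes * is the only wildcard, that ? is matched literally, and that $1 stands for the text matched by the first *. The function names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Compile a pattern in which * is the only wildcard.

    Assumption: every other character, including ?, is matched literally.
    """
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile("(.*)".join(pieces), re.DOTALL)

def matches(pattern, name):
    """True if the whole of name matches pattern."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None

def alias(pattern, replacement, name):
    """Rewrite name if it matches pattern; $1, $2, ... are the texts
    matched by the first, second, ... * in the pattern."""
    m = wildcard_to_regex(pattern).fullmatch(name)
    if m is None:
        return name  # no match: the name is left alone
    return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))), replacement)

request = "/cgi-bin/script.pl?x=1&y=2"
print(matches("/cgi-bin/script.pl", request))    # False
print(matches("/cgi-bin/script.pl*", request))   # True
print(alias("/cgi-bin/script.pl", "/script.pl", request))        # unchanged
print(alias("/cgi-bin/script.pl?*", "/script.pl?$1", request))   # /script.pl?x=1&y=2
```

The pattern without a trailing * fails because it must match the whole name, arguments included.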
Conversely, because in the Request Report files with arguments are only included if their parent file is included, you can't just
REQINCLUDE /cgi-bin/script.pl?*x=1*
or you will end up with nothing listed. You have to
REQINCLUDE /cgi-bin/script.pl
as well.
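The parent-file rule can be modelled roughly as follows. This Python sketch assumes that an entry with arguments is listed only when both it and its parent file match some include pattern, which is how the paragraph above describes the Request Report. The functions matches and listed are hypothetical, and the wildcard handling (only * is special) is a simplification.

```python
import re

def matches(pattern, name):
    """Whole-name match in which * is the only wildcard (an assumption)."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, name, re.DOTALL) is not None

def listed(name, include_patterns):
    """Would name appear in the report, given only these include patterns?"""
    def included(candidate):
        return any(matches(p, candidate) for p in include_patterns)
    # The parent is the filename with the arguments removed.
    parent = name.split("?", 1)[0]
    return included(parent) and included(name)

request = "/cgi-bin/script.pl?x=1&y=2"
print(listed(request, ["/cgi-bin/script.pl?*x=1*"]))                        # False
print(listed(request, ["/cgi-bin/script.pl?*x=1*", "/cgi-bin/script.pl"]))  # True
```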
For example, if
ARGSEXCLUDE /cgi-bin/script.pl
were given, analog would ignore the arguments to that file, and so read /cgi-bin/script.pl?x=1&y=2 as just /cgi-bin/script.pl. On the other hand, if
ARGSINCLUDE /cgi-bin/script.pl
were specified, analog would read the arguments, and so treat /cgi-bin/script.pl?x=1&y=2 as a different file from /cgi-bin/script.pl. REFARGSINCLUDE and REFARGSEXCLUDE are the same for referrers.
Technical note: the check for whether the arguments should be included happens before the filename has been subject to either built-in or user-specified aliases. So you have to use the unaliased name, exactly as it occurs in the logfile. For example, ARGSINCLUDE /~sret1/script.pl won't match /%7Esret1/script.pl even though they are really the same file. It also means that you can't use "pages" in the ARGSINCLUDE or ARGSEXCLUDE command, because we don't know whether a file is a page until after it's been aliased.
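The effect of excluding arguments, and the fact that the check uses the unaliased name, can be sketched in Python like this. It is a simplified model rather than analog's code: the function read_filename is hypothetical, and the excluded files are given as a set of exact names, with no wildcards.

```python
def read_filename(request, args_excluded):
    """Return the name under which a request would be counted.

    args_excluded is a set of exact, unaliased filenames whose arguments
    should be ignored (a simplification made for this sketch).
    """
    base, question_mark, _arguments = request.partition("?")
    if question_mark and base in args_excluded:
        return base      # arguments ignored
    return request       # arguments read, so this is a separate file

excluded = {"/cgi-bin/script.pl", "/~sret1/script.pl"}
print(read_filename("/cgi-bin/script.pl?x=1&y=2", excluded))  # /cgi-bin/script.pl
print(read_filename("/cgi-bin/script.pl?x=1&y=2", set()))     # /cgi-bin/script.pl?x=1&y=2
# The comparison is made on the raw logfile name, before any aliasing,
# so the %7E form is not recognised as /~sret1/script.pl:
print(read_filename("/%7Esret1/script.pl?x=1", excluded))     # /%7Esret1/script.pl?x=1
```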
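The SEARCHENGINE command tells analog which field of a search engine's URL holds the search term. For example, a referrer from AltaVista might look like
http://www.altavista.com/cgi-bin/query?pg=q&kl=XX&q=carrot+cake
The search term is in the field q= so the appropriate SEARCHENGINE command is
SEARCHENGINE http://www.altavista.com/cgi-bin/query q
(or even better
SEARCHENGINE http://*altavista.*/* q
to allow for all their mirror sites in different countries).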
The command INTSEARCHENGINE is the same for search engines, or other scripts which take arguments, within your site. For example, you might have requests for files like
/cgi-bin/search?trm=chocolate+cake
in which case you would specify
INTSEARCHENGINE /cgi-bin/search trm
and (assuming you haven't done an ARGSEXCLUDE for that file) "chocolate cake" would then appear in your Internal Search Query Report.
Sometimes a search engine has two or more possible fields for the search term. In that case you can list all of them separated by commas, like this:
SEARCHENGINE http://*webcrawler.*/* search,searchText
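As an illustration of what naming a field means, the following Python sketch pulls the search term out of a referrer, trying each field of a comma-separated list in turn. It uses Python's standard library and is not how analog is implemented. The function search_term is hypothetical, and so is the WebCrawler URL in the second call.

```python
from urllib.parse import urlsplit, parse_qs

def search_term(referrer, fields):
    """Return the search term held in the first of the listed fields that
    is present in the referrer, or None if none of them is present.

    fields is a comma-separated list, as in the SEARCHENGINE command.
    """
    params = parse_qs(urlsplit(referrer).query)  # also turns + into a space
    for field in fields.split(","):
        if field in params:
            return params[field][0]
    return None

print(search_term(
    "http://www.altavista.com/cgi-bin/query?pg=q&kl=XX&q=carrot+cake", "q"))
# carrot cake

# Hypothetical referrer, made up to show the two-field case:
print(search_term(
    "http://www.webcrawler.com/cgi-bin/WebQuery?searchText=chocolate+cake",
    "search,searchText"))
# chocolate cake
```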
I said previously that %7E in a URL is automatically converted to ~, etc. In fact this is only done to the ASCII-printable characters %20-%7E, because these are the only characters that are the same in every character set. (In fact, even that isn't true. Experts might want to know that ?, &, ; and = aren't converted either, to distinguish them from query-string delimiters: an encoded ?, &, ; or = is one that is not intended to be a delimiter. Also % isn't converted, to avoid confusing %25nm with %nm.)
But in the Search Query Report and Search Word Report it is useful to be able to convert non-ASCII characters too, so that you can see the actual words people typed, rather than get the %nm codes in place of all accented letters. So in these reports analog also converts characters %A0-%FF (if you are using an ISO-8859-* character set) or %80-%FF (for most other character sets).
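The conversion rules in the last two paragraphs can be summarised in a short Python sketch. It is a model of the rules as stated here, not analog's source. The function convert and its parameters are hypothetical, and converted high characters are displayed as ISO-8859-1 purely for the sake of the example.

```python
import re

# Encoded forms of these characters are left alone, as described above.
KEPT_ENCODED = set("?&;=%")

def convert(text, search_report=False, iso_8859=True):
    """Convert %nm codes in text according to the rules described above."""
    high_start = 0xA0 if iso_8859 else 0x80

    def replace(match):
        code = int(match.group(1), 16)
        char = chr(code)  # assumption: show high characters as ISO-8859-1
        if 0x20 <= code <= 0x7E:
            # Printable ASCII is always converted, apart from the exceptions.
            return match.group(0) if char in KEPT_ENCODED else char
        if search_report and high_start <= code <= 0xFF:
            # Only the search reports also convert the high range.
            return char
        return match.group(0)

    return re.sub(r"%([0-9A-Fa-f]{2})", replace, text)

print(convert("/%7Esret1/a%3Fb"))              # /~sret1/a%3Fb
print(convert("%2541"))                        # %2541
print(convert("caf%E9"))                       # caf%E9
print(convert("caf%E9", search_report=True))   # café
```

The second call shows why % itself is not converted: turning %25 into % would make %2541 read as %41.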
However, there are reasons why you might not want this feature, and you can turn it off with the command
SEARCHCHARCONVERT OFF
Stephen Turner