From 582b500cd996c96054615870fd13d6ab0ea77428 Mon Sep 17 00:00:00 2001
From: Jay Berkenbilt <ejb@ql.org>
Date: Sat, 10 Oct 2009 15:10:05 +0000
Subject: start integrating windows port

git-svn-id: svn+q:///qpdf/trunk@757 71b93d88-0707-0410-a8cf-f5a4172ac649
---
 external-libs/pcre/doc/pcretest.txt | 357 ++++++++++++++++++++++++++++++++++++
 1 file changed, 357 insertions(+)
 create mode 100644 external-libs/pcre/doc/pcretest.txt

(limited to 'external-libs/pcre/doc/pcretest.txt')

diff --git a/external-libs/pcre/doc/pcretest.txt b/external-libs/pcre/doc/pcretest.txt
new file mode 100644
index 00000000..0e9cd138
--- /dev/null
+++ b/external-libs/pcre/doc/pcretest.txt
@@ -0,0 +1,357 @@
+PCRETEST(1)                                                        PCRETEST(1)
+
+
+
+NAME
+       pcretest - a program for testing Perl-compatible regular expressions.
+
+SYNOPSIS
+       pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source] [destination]
+
+       pcretest  was written as a test program for the PCRE regular expression
+       library itself, but it can also be used for experimenting with  regular
+       expressions.  This document describes the features of the test program;
+       for details of the regular expressions themselves, see the  pcrepattern
+       documentation.  For  details  of  PCRE and its options, see the pcreapi
+       documentation.
+
+
+OPTIONS
+
+
+       -C        Output the version number of the PCRE library, and all avail-
+                 able   information  about  the  optional  features  that  are
+                 included, and then exit.
+
+       -d        Behave as if each regex had the /D modifier (see below);  the
+                 internal form is output after compilation.
+
+       -i        Behave  as  if  each  regex  had the /I modifier; information
+                 about the compiled pattern is given after compilation.
+
+       -m        Output the size of each compiled pattern after  it  has  been
+                 compiled.  This  is  equivalent  to adding /M to each regular
+                 expression.  For  compatibility  with  earlier  versions   of
+                 pcretest, -s is a synonym for -m.
+
+       -o osize  Set  the number of elements in the output vector that is used
+                 when calling PCRE to be osize. The default value is 45, which
+                 is  enough  for  14 capturing subexpressions. The vector size
+                 can be changed for individual matching calls by including  \O
+                 in the data line (see below).
+
+       -p        Behave  as  if  each regex has /P modifier; the POSIX wrapper
+                 API is used to call PCRE. None of the other options  has  any
+                 effect when -p is set.
+
+       -t        Run  each  compile, study, and match many times with a timer,
+                 and output resulting time per compile or match (in  millisec-
+                 onds).  Do  not set -t with -m, because you will then get the
+                 size output 20000 times and the timing will be distorted.
+
+
+DESCRIPTION
+
+       If pcretest is given two filename arguments, it reads  from  the  first
+       and writes to the second. If it is given only one filename argument, it
+       reads from that file and writes to stdout.  Otherwise,  it  reads  from
+       stdin  and  writes to stdout, and prompts for each line of input, using
+       "re>" to prompt for regular expressions, and "data>" to prompt for data
+       lines.
+
+       The program handles any number of sets of input on a single input file.
+       Each set starts with a regular expression, and continues with any  num-
+       ber of data lines to be matched against the pattern.
+
+       Each  line  is  matched separately and independently. If you want to do
+       multiple-line matches, you have to use the \n escape sequence in a sin-
+       gle  line of input to encode the newline characters. The maximum length
+       of data line is 30,000 characters.
+
+       An empty line signals the end of the data lines, at which point  a  new
+       regular  expression is read. The regular expressions are given enclosed
+       in any non-alphameric delimiters other than backslash, for example
+
+         /(a|bc)x+yz/
+
+       White space before the initial delimiter is ignored. A regular  expres-
+       sion  may be continued over several input lines, in which case the new-
+       line characters are included within it. It is possible to  include  the
+       delimiter within the pattern by escaping it, for example
+
+         /abc\/def/
+
+       If  you  do  so, the escape and the delimiter form part of the pattern,
+       but since delimiters are always non-alphameric, this  does  not  affect
+       its  interpretation.   If the terminating delimiter is immediately fol-
+       lowed by a backslash, for example,
+
+         /abc/\
+
+       then a backslash is added to the end of the pattern. This  is  done  to
+       provide  a  way of testing the error condition that arises if a pattern
+       finishes with a backslash, because
+
+         /abc\/
+
+       is interpreted as the first line of a pattern that starts with  "abc/",
+       causing pcretest to read the next line as a continuation of the regular
+       expression.
+
+
+PATTERN MODIFIERS
+
+       The pattern may be followed by i, m, s, or x to set the  PCRE_CASELESS,
+       PCRE_MULTILINE,  PCRE_DOTALL,  or  PCRE_EXTENDED options, respectively.
+       For example:
+
+         /caseless/i
+
+       These modifier letters have the same effect as they do in  Perl.  There
+       are  others that set PCRE options that do not correspond to anything in
+       Perl: /A, /E, /N, /U, and /X  set  PCRE_ANCHORED,  PCRE_DOLLAR_ENDONLY,
+       PCRE_NO_AUTO_CAPTURE, PCRE_UNGREEDY, and PCRE_EXTRA respectively.
+
+       Searching  for  all  possible matches within each subject string can be
+       requested by the /g or /G modifier. After  finding  a  match,  PCRE  is
+       called again to search the remainder of the subject string. The differ-
+       ence between /g and /G is that the former uses the startoffset argument
+       to  pcre_exec()  to  start  searching  at a new point within the entire
+       string (which is in effect what Perl does), whereas the  latter  passes
+       over  a  shortened  substring.  This makes a difference to the matching
+       process if the pattern begins with a lookbehind assertion (including \b
+       or \B).
+
+       If  any  call  to  pcre_exec()  in a /g or /G sequence matches an empty
+       string, the next call is done with the PCRE_NOTEMPTY and  PCRE_ANCHORED
+       flags  set in order to search for another, non-empty, match at the same
+       point.  If this second match fails, the start  offset  is  advanced  by
+       one,  and  the normal match is retried. This imitates the way Perl han-
+       dles such cases when using the /g modifier or the split() function.
+
+       There are a number of other modifiers for controlling the way  pcretest
+       operates.
+
+       The  /+ modifier requests that as well as outputting the substring that
+       matched the entire pattern, pcretest  should  in  addition  output  the
+       remainder  of  the  subject  string. This is useful for tests where the
+       subject contains multiple copies of the same substring.
+
+       The /L modifier must be followed directly by the name of a locale,  for
+       example,
+
+         /pattern/Lfr
+
+       For  this reason, it must be the last modifier letter. The given locale
+       is set, pcre_maketables() is called to build a set of character  tables
+       for  the locale, and this is then passed to pcre_compile() when compil-
+       ing the regular expression. Without an /L modifier, NULL is  passed  as
+       the tables pointer; that is, /L applies only to the expression on which
+       it appears.
+
+       The /I modifier requests that pcretest  output  information  about  the
+       compiled  expression (whether it is anchored, has a fixed first charac-
+       ter, and so on). It does this by calling pcre_fullinfo() after  compil-
+       ing  an expression, and outputting the information it gets back. If the
+       pattern is studied, the results of that are also output.
+
+       The /D modifier is a PCRE debugging feature, which also assumes /I.  It
+       causes  the  internal form of compiled regular expressions to be output
+       after compilation. If the pattern was studied, the information returned
+       is also output.
+
+       The  /S  modifier causes pcre_study() to be called after the expression
+       has been compiled, and the results used when the expression is matched.
+
+       The  /M  modifier causes the size of memory block used to hold the com-
+       piled pattern to be output.
+
+       The /P modifier causes pcretest to call PCRE via the POSIX wrapper  API
+       rather  than  its  native  API.  When this is done, all other modifiers
+       except /i, /m, and /+ are ignored. REG_ICASE is set if /i  is  present,
+       and  REG_NEWLINE  is  set if /m is present. The wrapper functions force
+       PCRE_DOLLAR_ENDONLY always, and PCRE_DOTALL unless REG_NEWLINE is  set.
+
+       The  /8 modifier causes pcretest to call PCRE with the PCRE_UTF8 option
+       set. This turns on support for UTF-8 character handling in  PCRE,  pro-
+       vided  that  it  was  compiled with this support enabled. This modifier
+       also causes any non-printing characters in output strings to be printed
+       using the \x{hh...} notation if they are valid UTF-8 sequences.
+
+       If  the  /?  modifier  is  used  with  /8,  it  causes pcretest to call
+       pcre_compile() with the  PCRE_NO_UTF8_CHECK  option,  to  suppress  the
+       checking of the string for UTF-8 validity.
+
+
+CALLOUTS
+
+       If  the pattern contains any callout requests, pcretest's callout func-
+       tion will be called. By default, it displays the  callout  number,  and
+       the  start  and  current positions in the text at the callout time. For
+       example, the output
+
+         --->pqrabcdef
+           0    ^  ^
+
+       indicates that callout number 0 occurred for a match  attempt  starting
+       at  the fourth character of the subject string, when the pointer was at
+       the seventh character. The callout  function  returns  zero  (carry  on
+       matching) by default.
+
+       Inserting  callouts may be helpful when using pcretest to check compli-
+       cated regular expressions. For further information about callouts,  see
+       the pcrecallout documentation.
+
+       For  testing  the PCRE library, additional control of callout behaviour
+       is available via escape sequences in the data, as described in the fol-
+       lowing  section.  In  particular, it is possible to pass in a number as
+       callout data (the default is zero). If the callout function receives  a
+       non-zero number, it returns that value instead of zero.
+
+
+DATA LINES
+
+       Before  each  data  line is passed to pcre_exec(), leading and trailing
+       whitespace is removed, and it is then scanned for \  escapes.  Some  of
+       these  are  pretty esoteric features, intended for checking out some of
+       the more complicated features of PCRE. If you are just  testing  "ordi-
+       nary"  regular  expressions,  you probably don't need any of these. The
+       following escapes are recognized:
+
+         \a         alarm (= BEL)
+         \b         backspace
+         \e         escape
+         \f         formfeed
+         \n         newline
+         \r         carriage return
+         \t         tab
+         \v         vertical tab
+         \nnn       octal character (up to 3 octal digits)
+         \xhh       hexadecimal character (up to 2 hex digits)
+         \x{hh...}  hexadecimal character, any number of digits
+                      in UTF-8 mode
+         \A         pass the PCRE_ANCHORED option to pcre_exec()
+         \B         pass the PCRE_NOTBOL option to pcre_exec()
+         \Cdd       call pcre_copy_substring() for substring dd
+                      after a successful match (any decimal number
+                      less than 32)
+         \Cname     call pcre_copy_named_substring() for substring
+                      "name" after a successful match (name termin-
+                      ated by next non alphanumeric character)
+         \C+        show the current captured substrings at callout
+                      time
+         \C-        do not supply a callout function
+         \C!n       return 1 instead of 0 when callout number n is
+                      reached
+         \C!n!m     return 1 instead of 0 when callout number n is
+                      reached for the nth time
+         \C*n       pass the number n (may be negative) as callout
+                      data
+         \Gdd       call pcre_get_substring() for substring dd
+                      after a successful match (any decimal number
+                      less than 32)
+         \Gname     call pcre_get_named_substring() for substring
+                      "name" after a successful match (name termin-
+                      ated by next non-alphanumeric character)
+         \L         call pcre_get_substringlist() after a
+                      successful match
+         \M         discover the minimum MATCH_LIMIT setting
+         \N         pass the PCRE_NOTEMPTY option to pcre_exec()
+         \Odd       set the size of the output vector passed to
+                      pcre_exec() to dd (any number of decimal
+                      digits)
+         \S         output details of memory get/free calls during matching
+         \Z         pass the PCRE_NOTEOL option to pcre_exec()
+         \?         pass the PCRE_NO_UTF8_CHECK option to
+                      pcre_exec()
+
+       If \M is present, pcretest calls pcre_exec() several times,  with  dif-
+       ferent  values  in  the match_limit field of the pcre_extra data struc-
+       ture, until it finds the minimum number that is needed for  pcre_exec()
+       to  complete.  This  number is a measure of the amount of recursion and
+       backtracking that takes place, and checking it out can be  instructive.
+       For  most  simple  matches, the number is quite small, but for patterns
+       with very large numbers of matching possibilities, it can become  large
+       very quickly with increasing length of subject string.
+
+       When  \O is used, it may be higher or lower than the size set by the -O
+       option (or defaulted to 45); \O applies only to the call of pcre_exec()
+       for the line in which it appears.
+
+       A  backslash  followed by anything else just escapes the anything else.
+       If the very last character is a backslash, it is ignored. This gives  a
+       way  of  passing  an empty line as data, since a real empty line termi-
+       nates the data input.
+
+       If /P was present on the regex, causing the POSIX  wrapper  API  to  be
+       used,  only  0  causing  REG_NOTBOL  and  REG_NOTEOL  to  be  passed to
+       regexec() respectively.
+
+       The use of \x{hh...} to represent UTF-8 characters is not dependent  on
+       the  use  of  the  /8 modifier on the pattern. It is recognized always.
+       There may be any number of hexadecimal digits inside  the  braces.  The
+       result  is from one to six bytes, encoded according to the UTF-8 rules.
+
+
+OUTPUT FROM PCRETEST
+
+       When a match succeeds, pcretest outputs the list of captured substrings
+       that  pcre_exec()  returns,  starting with number 0 for the string that
+       matched the whole  pattern.  Here  is  an  example  of  an  interactive
+       pcretest run.
+
+         $ pcretest
+         PCRE version 4.00 08-Jan-2003
+
+           re> /^abc(\d+)/
+         data> abc123
+          0: abc123
+          1: 123
+         data> xyz
+         No match
+
+       If  the strings contain any non-printing characters, they are output as
+       \0x escapes, or as \x{...} escapes if the /8 modifier  was  present  on
+       the  pattern.  If  the pattern has the /+ modifier, then the output for
+       substring 0 is followed by the the rest of the subject string,  identi-
+       fied by "0+" like this:
+
+           re> /cat/+
+         data> cataract
+          0: cat
+          0+ aract
+
+       If  the  pattern  has  the /g or /G modifier, the results of successive
+       matching attempts are output in sequence, like this:
+
+           re> /\Bi(\w\w)/g
+         data> Mississippi
+          0: iss
+          1: ss
+          0: iss
+          1: ss
+          0: ipp
+          1: pp
+
+       "No match" is output only if the first match attempt fails.
+
+       If any of the sequences \C, \G, or \L are present in a data  line  that
+       is  successfully  matched,  the substrings extracted by the convenience
+       functions are output with C, G, or L after the string number instead of
+       a colon. This is in addition to the normal full list. The string length
+       (that is, the return from the extraction function) is given  in  paren-
+       theses after each string for \C and \G.
+
+       Note  that  while patterns can be continued over several lines (a plain
+       ">" prompt is used for continuations), data lines may not. However new-
+       lines can be included in data by means of the \n escape.
+
+
+AUTHOR
+
+       Philip Hazel <ph10@cam.ac.uk>
+       University Computing Service,
+       Cambridge CB2 3QG, England.
+
+Last updated: 09 December 2003
+Copyright (c) 1997-2003 University of Cambridge.
-- 
cgit v1.2.3-54-g00ecf