Index of Section 3 Manual Pages

Interix / SUApcrebuild.3Interix / SUA

PCRE(3)                                                   PCRE(3)



NAME
       PCRE - Perl-compatible regular expressions

PCRE BUILD-TIME OPTIONS

       This document describes the optional features of PCRE that
       can be selected when the library is compiled. They are all
       selected,  or deselected, by providing options to the con-
       figure script which is run before the  make  command.  The
       complete list of options for configure (which includes the
       standard ones such as the selection  of  the  installation
       directory) can be obtained by running

         ./configure --help

       The  following  sections  describe  certain  options whose
       names begin with --enable  or  --disable.  These  settings
       specify changes to the defaults for the configure command.
       Because of the way  that  configure  works,  --enable  and
       --disable  always  come  in  pairs,  so  the complementary
       option always exists as well,  but  as  it  specifies  the
       default, it is not described.


UTF-8 SUPPORT

       To  build  PCRE  with support for UTF-8 character strings,
       add

         --enable-utf8

       to the configure command. Of itself, this  does  not  make
       PCRE  treat  strings  as  UTF-8. As well as compiling PCRE
       with this option, you also have have to set the  PCRE_UTF8
       option when you call the pcre_compile() function.


CODE VALUE OF NEWLINE

       By  default,  PCRE  treats  character 10 (linefeed) as the
       newline character. This is the normal newline character on
       Unix-like  systems.  You can compile PCRE to use character
       13 (carriage return) instead by adding

         --enable-newline-is-cr

       to the configure command. For completeness there is also a
       --enable-newline-is-lf  option, which explicitly specifies
       linefeed as the newline character.


BUILDING SHARED AND STATIC LIBRARIES

       The PCRE building  process  uses  libtool  to  build  both
       shared  and static Unix libraries by default. You can sup-
       press one of these by adding one of

         --disable-shared
         --disable-static

       to the configure command, as required.


POSIX MALLOC USAGE

       When PCRE is called through the POSIX interface  (see  the
       pcreposix  documentation),  additional  working storage is
       required for holding the pointers to capturing  substrings
       because   PCRE  requires  three  integers  per  substring,
       whereas the POSIX interface provides only two. If the num-
       ber  of expected substrings is small, the wrapper function
       uses space on the stack, because this is faster than using
       malloc()  for each call. The default threshold above which
       the stack is no longer used is 10; it can  be  changed  by
       adding a setting such as

         --with-posix-malloc-threshold=20

       to the configure command.


LIMITING PCRE RESOURCE USAGE

       Internally,  PCRE  has  a function called match() which it
       calls repeatedly (possibly recursively) when performing  a
       matching  operation.  By limiting the number of times this
       function may be called, a  limit  can  be  placed  on  the
       resources  used by a single call to pcre_exec(). The limit
       can be changed at run time, as described  in  the  pcreapi
       documentation.  The default is 10 million, but this can be
       changed by adding a setting such as

         --with-match-limit=500000

       to the configure command.


HANDLING VERY LARGE PATTERNS

       Within a compiled pattern, offset values are used to point
       from  one  part  to  another (for example, from an opening
       parenthesis to an alternation metacharacter).  By  default
       two-byte  values  are used for these offsets, leading to a
       maximum size for a compiled pattern of around 64K. This is
       sufficient  to  handle all but the most gigantic patterns.
       Nevertheless, some people do want to process enormous pat-
       terns, so it is possible to compile PCRE to use three-byte
       or four-byte offsets by adding a setting such as

         --with-link-size=3

       to the configure command. The value given must be 2, 3, or
       4.  Using  longer offsets slows down the operation of PCRE
       because it has to  load  additional  bytes  when  handling
       them.

       If you build PCRE with an increased link size, test 2 (and
       test 5 if you are using UTF-8) will fail. Part of the out-
       put  of  these  tests  is a representation of the compiled
       pattern, and this changes with the link size.


AVOIDING EXCESSIVE STACK USAGE

       PCRE implements  backtracking  while  matching  by  making
       recursive calls to an internal function called match(). In
       environments where the size of the stack is limited,  this
       can severely limit PCRE's operation. (The Unix environment
       does not usually suffer from this problem.) An alternative
       approach  that uses memory from the heap to remember data,
       instead of using recursive function calls, has been imple-
       mented  to work round this problem. If you want to build a
       version of PCRE that works this way, add

         --disable-stack-for-recursion

       to the configure command. With  this  configuration,  PCRE
       will  use  the pcre_stack_malloc and pcre_stack_free vari-
       ables to call memory management functions. Separate  func-
       tions  are provided because the usage is very predictable:
       the block sizes requested are always  the  same,  and  the
       blocks  are  always freed in reverse order. A calling pro-
       gram might be able to implement optimized  functions  that
       perform better than the standard malloc() and free() func-
       tions. PCRE runs noticeably more slowly when built in this
       way.


USING EBCDIC CODE

       PCRE assumes by default that it will run in an environment
       where the character code is ASCII (or UTF-8,  which  is  a
       superset  of ASCII). PCRE can, however, be compiled to run
       in an EBCDIC environment by adding

         --enable-ebcdic

       to the configure command.

Last updated: 09 December 2003
Copyright (c) 1997-2003 University of Cambridge.



                                                          PCRE(3)

Interix / SUAHosted at SUA Community for Interix, SUA and SFUInterix / SUA