Developing applications against the Subversion library APIs
      is fairly straightforward.  Subversion is primarily a set of C
      libraries, with header (.h) files that live
      in the subversion/include directory of the
      source tree.  These headers are copied into your system
      locations (e.g., /usr/local/include)
      when you build and install Subversion itself from source.  These
      headers represent the entirety of the functions and types meant
      to be accessible by users of the Subversion libraries.  The
      Subversion developer community is meticulous about ensuring that
      the public API is well documented—refer directly to the
      header files for that documentation.
When examining the public header files, the first thing you
      might notice is that Subversion's datatypes and functions are
      namespace-protected.  That is, every public Subversion symbol
      name begins with svn_, followed by a short
      code for the library in which the symbol is defined (such as
      wc, client,
      fs, etc.), followed by a single underscore
      (_), and then the rest of the symbol name.
      Semipublic functions (used among source files of a given
      library but not by code outside that library, and found inside
      the library directories themselves) differ from this naming
      scheme in that instead of a single underscore after the library
      code, they use a double underscore
      (__).  Functions that are private to
      a given source file have no special prefixing and are declared
      static.  Of course, a compiler isn't
      interested in these naming conventions, but they help to clarify
      the scope of a given function or datatype.
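To make the naming scheme concrete, here is a minimal Python 3 sketch (standard library only) that sorts symbol names into the scopes just described. The function name classify_symbol and the symbol svn_wc__example_helper are hypothetical, invented purely for illustration. The sketch also assumes that a library code never contains an underscore, which is a simplification.

```python
def classify_symbol(name):
    """Return (scope, library_code) for NAME under the naming scheme above.

    Assumption: the library code is the text between "svn_" and the next
    underscore, so library codes that contain underscores are not handled.
    """
    prefix = "svn_"
    if not name.startswith(prefix):
        # No svn_ prefix: file-private (static) or not a Subversion symbol.
        return ("other", None)
    library, _, remainder = name[len(prefix):].partition("_")
    if not library or not remainder:
        return ("other", None)
    # A second underscore right after the library code marks a semipublic symbol.
    scope = "semipublic" if remainder.startswith("_") else "public"
    return (scope, library)


for symbol in ("svn_fs_youngest_rev",      # public filesystem function
               "svn_wc__example_helper",   # hypothetical semipublic name
               "make_new_directory"):      # file-private helper
    print(symbol, classify_symbol(symbol))
```

As the text says, a compiler ignores these conventions entirely; a check like this is only useful as a reading aid or a lint-style sanity test.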
Another good source of information about programming against the Subversion APIs is the project's own hacking guidelines, which you can find at http://subversion.tigris.org/hacking.html. This document contains useful information, which, while aimed at developers and would-be developers of Subversion itself, is equally applicable to folks developing against Subversion as a set of third-party libraries. [53]
Along with Subversion's own datatypes, you will see many
        references to datatypes that begin with
        apr_—symbols from the Apache Portable
        Runtime (APR) library.  APR is Apache's portability library,
        originally carved out of its server code as an attempt to
        separate the OS-specific bits from the OS-independent portions
        of the code.  The result was a library that provides a generic
        API for performing operations that differ mildly—or
        wildly—from OS to OS.  While the Apache HTTP Server was
        obviously the first user of the APR library, the Subversion
        developers immediately recognized the value of using APR as
        well.  This means that there is practically no OS-specific
        code in Subversion itself.  Also, it means that the Subversion
        client compiles and runs anywhere that the Apache HTTP Server
        does.  Currently, this list includes all flavors of Unix,
        Win32, BeOS, OS/2, and Mac OS X.
In addition to providing consistent implementations of
        system calls that differ across operating systems,
        [54]
        APR gives Subversion immediate access to many custom
        datatypes, such as dynamic arrays and hash tables.  Subversion
        uses these types extensively.  But
        perhaps the most pervasive APR datatype, found in nearly every
        Subversion API prototype, is the
        apr_pool_t—the APR memory pool.
        Subversion uses pools internally for all its memory allocation
        needs (unless an external library requires a different memory
        management mechanism for data passed through its API),
        [55]
        and while a person coding against the Subversion APIs is not
        required to do the same, she is
        required to provide pools to the API functions that need them.
        This means that users of the Subversion API must also link
        against APR, must call apr_initialize()
        to initialize the APR subsystem, and then must create and
        manage pools for use with Subversion API calls, typically by
        using svn_pool_create(),
        svn_pool_clear(), and
        svn_pool_destroy().
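The create, clear, and destroy life cycle can be hard to picture without code. What follows is a toy model in Python 3, not APR and not the Subversion API: the class ToyPool and its methods are invented for illustration, and "allocation" is reduced to remembering objects in a list. It mirrors only the ownership rules, namely that clearing a pool releases what it holds (including its subpools) while leaving the pool usable, and that destroying a pool also detaches it from its parent.

```python
class ToyPool:
    """A toy stand-in for a memory pool: it owns objects and child pools."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.objects = []
        self.destroyed = False
        if parent is not None:
            parent.children.append(self)

    def alloc(self, obj):
        """Record OBJ as owned by this pool and hand it back."""
        self.objects.append(obj)
        return obj

    def clear(self):
        """Release everything in the pool, but keep the pool usable."""
        for child in list(self.children):
            child.destroy()
        self.objects.clear()

    def destroy(self):
        """Clear the pool and detach it from its parent."""
        self.clear()
        if self.parent is not None:
            self.parent.children.remove(self)
        self.destroyed = True


root = ToyPool()                 # plays the role of a top-level pool
scratch = ToyPool(root)          # a subpool for short-lived data
for name in ["trunk", "branches", "tags"]:
    scratch.clear()              # drop the previous iteration's data
    scratch.alloc("temporary data for " + name)
scratch.destroy()
root.destroy()
```

In a real C program the same roles are played by svn_pool_create(), svn_pool_clear(), and svn_pool_destroy(), after apr_initialize() has been called.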
With remote version control operation as the whole point
        of Subversion's existence, it makes sense that some attention
        has been paid to internationalization (i18n) support.  After
        all, while “remote” might mean “across the
        office,” it could just as well mean “across the
        globe.” To facilitate this, all of Subversion's public
        interfaces that accept path arguments expect those paths to be
        canonicalized—which is most easily accomplished by passing
        them through the svn_path_canonicalize()
        function—and encoded in UTF-8.  This means, for example, that
        any new client binary that drives the
        libsvn_client interface needs to first
        convert paths from the locale-specific encoding to UTF-8
        before passing those paths to the Subversion libraries, and
        then reconvert any resultant output paths from Subversion
        back into the locale's encoding before using those paths for
        non-Subversion purposes.  Fortunately, Subversion provides a
        suite of functions (see
        subversion/include/svn_utf.h) that 
        any program can use to do these conversions.
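The round trip described above can be sketched in a few lines of Python 3. This is only an illustration of the idea using Python byte strings, not the functions declared in svn_utf.h; the helper names to_utf8 and from_utf8 and the choice of ISO-8859-1 as the locale encoding are assumptions made for the example.

```python
def to_utf8(raw_path, locale_encoding):
    """Re-encode RAW_PATH (bytes in LOCALE_ENCODING) as UTF-8 bytes."""
    return raw_path.decode(locale_encoding).encode("utf-8")


def from_utf8(utf8_path, locale_encoding):
    """Re-encode UTF8_PATH (UTF-8 bytes) into LOCALE_ENCODING bytes."""
    return utf8_path.decode("utf-8").encode(locale_encoding)


# A path as it might arrive from a shell running in an ISO-8859-1 locale.
native_path = "caf\u00e9/notes.txt".encode("iso-8859-1")

# Convert before handing the path to the libraries ...
utf8_path = to_utf8(native_path, "iso-8859-1")

# ... and convert results back before using them for anything else.
round_tripped = from_utf8(utf8_path, "iso-8859-1")
```

The accented character occupies one byte in the locale encoding and two bytes in UTF-8, which is exactly the kind of difference that goes unnoticed with pure ASCII test data.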
Also, Subversion APIs require all URL parameters to be
        properly URI-encoded.  So, instead of passing
        file:///home/username/My File.txt as the URL of a
        file named My File.txt, you need to pass
        file:///home/username/My%20File.txt.  Again,
        Subversion supplies helper functions that your application can
        use—svn_path_uri_encode() and
        svn_path_uri_decode(), for URI encoding
        and decoding, respectively.
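For a quick feel of what URI encoding does to the example above, here is a small Python 3 sketch that uses the standard library's urllib.parse module as a stand-in. It is not svn_path_uri_encode(), and the exact set of characters that Subversion leaves unescaped may differ; the helper name uri_encode_path is invented for the example.

```python
from urllib.parse import quote, unquote


def uri_encode_path(url):
    """Percent-encode URL, leaving the structural '/' and ':' untouched."""
    return quote(url, safe="/:")


encoded = uri_encode_path("file:///home/username/My File.txt")
print(encoded)   # prints file:///home/username/My%20File.txt

# Decoding reverses the transformation.
decoded = unquote(encoded)
print(decoded)   # prints file:///home/username/My File.txt
```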
If you are interested in using the Subversion libraries in
        conjunction with something other than a C program—say, a
        Python or Perl script—Subversion has some support for this
        via the Simplified Wrapper and Interface Generator (SWIG).  The
        SWIG bindings for Subversion are located in
        subversion/bindings/swig.  They are still
        maturing, but they are usable.  These bindings allow you
        to call Subversion API functions indirectly, using wrappers that
        translate the datatypes native to your scripting language into
        the datatypes needed by Subversion's C libraries.
Significant efforts have been made toward creating functional SWIG-generated bindings for Python, Perl, and Ruby. To some extent, the work done preparing the SWIG interface files for these languages is reusable in efforts to generate bindings for other languages supported by SWIG (which include versions of C#, Guile, Java, MzScheme, OCaml, PHP, and Tcl, among others). However, some extra programming is required to compensate for complex APIs that SWIG needs some help translating between languages. For more information on SWIG itself, see the project's web site at http://www.swig.org/.
Subversion also has language bindings for Java.  The
        javahl bindings (located in
        subversion/bindings/java in the
        Subversion source tree) aren't SWIG-based, but are instead a
        mixture of Java and hand-coded JNI.  Javahl covers most
        Subversion client-side APIs and is specifically targeted at
        implementors of Java-based Subversion clients and IDE
        integrations.
Subversion's language bindings tend to lack the level of developer attention given to the core Subversion modules, but can generally be trusted as production-ready. A number of scripts and applications, alternative Subversion GUI clients, and other third-party tools are successfully using Subversion's language bindings today to accomplish their Subversion integrations.
It's worth noting here that there are other options for interfacing with Subversion using other languages: alternative bindings for Subversion that aren't provided by the Subversion development community at all. You can find links to these alternative bindings on the Subversion project's links page (at http://subversion.tigris.org/links.html), but there are a couple of popular ones we feel are especially noteworthy. First, Barry Scott's PySVN bindings (http://pysvn.tigris.org/) are a popular option for binding with Python. PySVN boasts a more Pythonic interface than the more C-like APIs provided by Subversion's own Python bindings. And if you're looking for a pure Java implementation of Subversion, check out SVNKit (http://svnkit.com/), which is Subversion rewritten from the ground up in Java.
Example 8.1, “Using the Repository Layer”
        contains a code segment (written in C) that illustrates some
        of the concepts we've been discussing.  It uses both the
        repository and filesystem interfaces (as can be determined by
        the prefixes svn_repos_ and
        svn_fs_ of the function names,
        respectively) to create a new revision in which a directory is
        added.  You can see the use of an APR pool, which is passed
        around for memory allocation purposes.  Also, the code reveals
        a somewhat obscure fact about Subversion error
        handling—all Subversion errors must be explicitly
        handled to avoid memory leakage (and in some cases,
        application failure).
Example 8.1. Using the Repository Layer
/* Convert a Subversion error into a simple boolean error code:  if
 * EXPR yields an error, clear it and return 1 from the calling
 * function; otherwise fall through so the caller can continue.
 *
 * NOTE:  Subversion errors must be cleared (using svn_error_clear())
 *        because they are allocated from the global pool, else memory
 *        leaking occurs.
 */
#define INT_ERR(expr)                           \
  do {                                          \
    svn_error_t *__temperr = (expr);            \
    if (__temperr)                              \
      {                                         \
        svn_error_clear(__temperr);             \
        return 1;                               \
      }                                         \
  } while (0)
/* Create a new directory at the path NEW_DIRECTORY in the Subversion
 * repository located at REPOS_PATH.  Perform all memory allocation in
 * POOL.  This function will create a new revision for the addition of
 * NEW_DIRECTORY.  Return zero if the operation completes
 * successfully, nonzero otherwise.
 */
static int
make_new_directory(const char *repos_path,
                   const char *new_directory,
                   apr_pool_t *pool)
{
  svn_error_t *err;
  svn_repos_t *repos;
  svn_fs_t *fs;
  svn_revnum_t youngest_rev;
  svn_fs_txn_t *txn;
  svn_fs_root_t *txn_root;
  const char *conflict_str;
  /* Open the repository located at REPOS_PATH.
   */
  INT_ERR(svn_repos_open(&repos, repos_path, pool));
  /* Get a pointer to the filesystem object that is stored in REPOS.
   */
  fs = svn_repos_fs(repos);
  /* Ask the filesystem to tell us the youngest revision that
   * currently exists.
   */
  INT_ERR(svn_fs_youngest_rev(&youngest_rev, fs, pool));
  /* Begin a new transaction that is based on YOUNGEST_REV.  We are
   * less likely to have our later commit rejected as conflicting if we
   * always try to make our changes against a copy of the latest snapshot
   * of the filesystem tree.
   */
  INT_ERR(svn_repos_fs_begin_txn_for_commit2(&txn, repos, youngest_rev,
                                             apr_hash_make(pool), pool));
  /* Now that we have started a new Subversion transaction, get a root
   * object that represents that transaction.
   */
  INT_ERR(svn_fs_txn_root(&txn_root, txn, pool));

  /* Create our new directory under the transaction root, at the path
   * NEW_DIRECTORY.
   */
  INT_ERR(svn_fs_make_dir(txn_root, new_directory, pool));
  /* Commit the transaction, creating a new revision of the filesystem
   * which includes our added directory path.
   */
  err = svn_repos_fs_commit_txn(&conflict_str, repos,
                                &youngest_rev, txn, pool);
  if (! err)
    {
      /* No error?  Excellent!  Print a brief report of our success.
       */
      printf("Directory '%s' was successfully added as new revision "
             "'%ld'.\n", new_directory, youngest_rev);
    }
  else if (err->apr_err == SVN_ERR_FS_CONFLICT)
    {
      /* Uh-oh.  Our commit failed as the result of a conflict
       * (someone else seems to have made changes to the same area
       * of the filesystem that we tried to modify).  Print an error
       * message.
       */
      printf("A conflict occurred at path '%s' while attempting "
             "to add directory '%s' to the repository at '%s'.\n",
             conflict_str, new_directory, repos_path);
    }
  else
    {
      /* Some other error has occurred.  Print an error message.
       */
      printf("An error occurred while attempting to add directory '%s' "
             "to the repository at '%s'.\n",
             new_directory, repos_path);
    }
  /* Clear ERR (if any) and report failure; otherwise report success.
   */
  INT_ERR(err);
  return 0;
}
Note that in Example 8.1, “Using the Repository Layer”, the code could
        just as easily have committed the transaction using
        svn_fs_commit_txn().  But the filesystem
        API knows nothing about the repository library's hook
        mechanism.  If you want your Subversion repository to
        automatically perform some set of non-Subversion tasks every
        time you commit a transaction (e.g., sending an
        email that describes all the changes made in that transaction
        to your developer mailing list), you need to use the
        libsvn_repos-wrapped version of that
        function, which adds the hook triggering
        functionality—in this case,
        svn_repos_fs_commit_txn().  (For more
        information regarding Subversion's repository hooks, see the section called “Implementing Repository Hooks”.)
Now let's switch languages. Example 8.2, “Using the Repository layer with Python” is a sample program that uses Subversion's SWIG Python bindings to recursively crawl the youngest repository revision, and to print the various paths reached during the crawl.
Example 8.2. Using the Repository layer with Python
#!/usr/bin/python
"""Crawl a repository, printing versioned object path names."""
import sys
import os.path
import svn.fs, svn.core, svn.repos
def crawl_filesystem_dir(root, directory):
    """Recursively crawl DIRECTORY under ROOT in the filesystem, printing
    the path of every entry at or below DIRECTORY."""
    # Print the name of this path.
    print(directory + "/")
    
    # Get the directory entries for DIRECTORY.
    entries = svn.fs.svn_fs_dir_entries(root, directory)
    # Loop over the entries.
    names = entries.keys()
    for name in names:
        # Calculate the entry's full path.
        full_path = directory + '/' + name
        # If the entry is a directory, recurse.  The recursion prints the
        # entry and all of its children.
        if svn.fs.svn_fs_is_dir(root, full_path):
            crawl_filesystem_dir(root, full_path)
        else:
            # Else it's a file, so print its path here.
            print(full_path)
def crawl_youngest(repos_path):
    """Open the repository at REPOS_PATH, and recursively crawl its
    youngest revision."""
    
    # Open the repository at REPOS_PATH, and get a reference to its
    # versioning filesystem.
    repos_obj = svn.repos.svn_repos_open(repos_path)
    fs_obj = svn.repos.svn_repos_fs(repos_obj)
    # Query the current youngest revision.
    youngest_rev = svn.fs.svn_fs_youngest_rev(fs_obj)
    
    # Open a root object representing the youngest (HEAD) revision.
    root_obj = svn.fs.svn_fs_revision_root(fs_obj, youngest_rev)
    # Do the recursive crawl.
    crawl_filesystem_dir(root_obj, "")
    
if __name__ == "__main__":
    # Check for sane usage.
    if len(sys.argv) != 2:
        sys.stderr.write("Usage: %s REPOS_PATH\n"
                         % (os.path.basename(sys.argv[0])))
        sys.exit(1)
    # Canonicalize the repository path.
    repos_path = svn.core.svn_path_canonicalize(sys.argv[1])
    # Do the real work.
    crawl_youngest(repos_path)
This same program in C would need to deal with APR's memory pool system. But Python handles memory usage automatically, and Subversion's Python bindings adhere to that convention. In C, you'd be working with custom datatypes (such as those provided by the APR library) for representing the hash of entries and the list of paths, but Python has hashes (called “dictionaries”) and lists as built-in datatypes, and it provides a rich collection of functions for operating on those types. So SWIG (with the help of some customizations in Subversion's language bindings layer) takes care of mapping those custom datatypes into the native datatypes of the target language. This provides a more intuitive interface for users of that language.
The Subversion Python bindings can be used for working
        copy operations, too.  In the previous section of this
        chapter, we mentioned the libsvn_client
        interface and how it exists for the sole purpose of
        simplifying the process of writing a Subversion client.  Example 8.3, “A Python status crawler” is a brief
        example of how that library can be accessed via the SWIG
        Python bindings to re-create a scaled-down version of the
        svn status command.
Example 8.3. A Python status crawler
#!/usr/bin/env python
"""Crawl a working copy directory, printing status information."""
import sys
import os.path
import getopt
import svn.core, svn.client, svn.wc
def generate_status_code(status):
    """Translate a status value into a single-character status code,
    using the same logic as the Subversion command-line client."""
    code_map = { svn.wc.svn_wc_status_none        : ' ',
                 svn.wc.svn_wc_status_normal      : ' ',
                 svn.wc.svn_wc_status_added       : 'A',
                 svn.wc.svn_wc_status_missing     : '!',
                 svn.wc.svn_wc_status_incomplete  : '!',
                 svn.wc.svn_wc_status_deleted     : 'D',
                 svn.wc.svn_wc_status_replaced    : 'R',
                 svn.wc.svn_wc_status_modified    : 'M',
                 svn.wc.svn_wc_status_merged      : 'G',
                 svn.wc.svn_wc_status_conflicted  : 'C',
                 svn.wc.svn_wc_status_obstructed  : '~',
                 svn.wc.svn_wc_status_ignored     : 'I',
                 svn.wc.svn_wc_status_external    : 'X',
                 svn.wc.svn_wc_status_unversioned : '?',
               }
    return code_map.get(status, '?')
def do_status(wc_path, verbose):
    # Build a client context baton.
    ctx = svn.client.svn_client_ctx_t()
    def _status_callback(path, status):
        """A callback function for svn_client_status."""
        # Print the text status code, the property status code, and
        # the path of the item.
        text_status = generate_status_code(status.text_status)
        prop_status = generate_status_code(status.prop_status)
        print('%s%s  %s' % (text_status, prop_status, path))
        
    # Do the status crawl, using _status_callback() as our callback function.
    revision = svn.core.svn_opt_revision_t()
    revision.type = svn.core.svn_opt_revision_head
    svn.client.svn_client_status2(wc_path, revision, _status_callback,
                                  svn.core.svn_depth_infinity, verbose,
                                  0, 0, 1, ctx)
def usage_and_exit(errorcode):
    """Print usage message, and exit with ERRORCODE."""
    stream = sys.stderr if errorcode else sys.stdout
    stream.write("""Usage: %s OPTIONS WC-PATH
Options:
  --help, -h    : Show this usage message
  --verbose, -v : Show all statuses, even uninteresting ones
""" % (os.path.basename(sys.argv[0])))
    sys.exit(errorcode)
    
if __name__ == '__main__':
    # Parse command-line options.
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hv", ["help", "verbose"])
    except getopt.GetoptError:
        usage_and_exit(1)
    verbose = 0
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage_and_exit(0)
        if opt in ("-v", "--verbose"):
            verbose = 1
    if len(args) != 1:
        usage_and_exit(2)
            
    # Canonicalize the working copy path.
    wc_path = svn.core.svn_path_canonicalize(args[0])
    # Do the real work.
    try:
        do_status(wc_path, verbose)
    except svn.core.SubversionException as e:
        sys.stderr.write("Error (%d): %s\n" % (e.apr_err, e.message))
        sys.exit(1)
As was the case in Example 8.2, “Using the Repository layer with Python”, this
        program is pool-free and uses, for the most part, normal
        Python datatypes.  The call to
        svn_client_ctx_t() is deceiving because
        the public Subversion API has no such function—this just
        happens to be a case where SWIG's automatic language
        generation bleeds through a little bit (the function is a sort
        of factory function for Python's version of the corresponding
        complex C structure).  Also note that the path passed to this
        program (like the one in the previous example) is run through
        svn_path_canonicalize().  Skipping that step risks tripping
        the assertions that the underlying Subversion C library makes
        about the form of its path arguments, and a failed assertion
        aborts the program immediately and unceremoniously.