module Xdr_mstring: sig..end
A managed string ms is declared in the XDR file as in
typedef _managed string ms<>;
In the encoded XDR stream there is no difference between strings and
managed strings, i.e. the wire representation is identical. Only
the Ocaml type to which the managed string is mapped differs. This
type is Xdr_mstring.mstring (below).
In the RPC context there is often the problem that the I/O backend would profit from a different string representation than the user of the RPC layer. To bridge this gap, managed strings have been invented. Generally, the user can determine how to represent strings (usually either as an Ocaml string, or as memory), and the I/O backend can request to transform to a different representation when this leads to an improvement (i.e. copy operations can be saved).
A speedup of the program can only be expected for large managed strings (at least several kilobytes).
There are two cases: The encoding case, and the decoding case.
In the encoding case the mstring object is created by the user
and passed to the RPC library. This happens when a client prepares
an argument for calling a remote procedure, or when the server
sends a response back to the caller. In the decoding case the client
analyzes the response from an RPC call, or the server looks at the
arguments of an RPC invocation. The difference here is that in the
encoding case user code can directly create mstring objects by
calling functions of this module, whereas in the decoding case the
RPC library creates the mstring objects.
For simplicity, let us only look at this problem from the perspective of an RPC client.
Encoding. Imagine a client wants to call an RPC, and one of the
arguments is a managed string. This means we finally need an mstring
object that can be put into the argument list of the call.
This library gives special support to two string representations: the normal
Ocaml string type, and Netsys_mem.memory, which is actually just
a bigarray of chars. There are two factories fac,
string_based_mstrings and memory_based_mstrings (both below),
and both can be used to create the mstring to pass to the
RPC layer. It should be noted that this layer can process the
memory representation a bit better. So, if the original data
value is a string, the factory for string should be used, and
if it is a char bigarray, the factory for memory should be used.
Now, the mstring object is created by
let mstring = fac # create_from_string data pos len copy_flag, or by
let mstring = fac # create_from_memory data pos len copy_flag.
If fac is the factory for strings, the create_from_string
method works better, and if fac is for memory, the create_from_memory
method works better. pos and len can select a substring of data.
If copy_flag is false, the mstring object does not copy the data
if possible, but just keeps a reference to data until it is accessed;
otherwise if copy_flag is true, a copy is made immediately.
Of course, delaying the copy is better, but this requires that data
is not modified until the RPC call is completed.
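As an illustration of the two creation paths described above, here is a minimal sketch. It assumes that Ocamlnet is installed and linked, and that Netsys_mem.memory is the char bigarray type produced by Bigarray.Array1.create with Bigarray.char and Bigarray.c_layout. The sample values (data, mem) are made up for the example.

```ocaml
(* Sketch only: requires Ocamlnet, which provides Xdr_mstring. *)

(* Case 1: the original data is an ordinary Ocaml string. *)
let data = "0123456789"

let ms_string =
  (* pos = 2 and len = 5 select the substring "23456".
     copy_flag = true: the copy is made immediately, so [data]
     may be changed afterwards. *)
  Xdr_mstring.string_based_mstrings # create_from_string data 2 5 true

(* Case 2: the original data is a char bigarray (memory). *)
let mem = Bigarray.Array1.create Bigarray.char Bigarray.c_layout 10
let () = Bigarray.Array1.fill mem 'x'

let ms_memory =
  (* copy_flag = false: no copy is made if possible, so [mem] must
     not be modified until the RPC call is completed. *)
  Xdr_mstring.memory_based_mstrings # create_from_memory mem 0 10 false
```

Either value can then be put into the argument list of the call wherever the XDR type expects a managed string.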
Decoding. Now, the call is done, and the client looks at the
result. There is also an mstring object in the result. As noted
above, this mstring object was already created by the RPC library
(and currently this library prefers string-based objects if not
told otherwise). The user code can now access this mstring
object with the access methods of the mstring class (see below).
As these methods are quite limited, it normally only makes sense
to output the mstring contents to a file descriptor.
The user can request a different factory for managed strings. The
function Rpc_client.set_mstring_factories can be used for this
purpose. (Similar ways exist for managed clients, and for RPC servers.)
Potential. Before introducing managed strings, a careful analysis
was done of how many copy operations can be avoided by using this
technique. Example: The first N bytes of a file are taken as
argument of an RPC call. Instead of reading these bytes into a
normal Ocaml string, an optimal implementation now uses a memory
buffer for this purpose. With this change the data is copied only
twice: the first copy reads it from the file into the input buffer
(a memory value), and the second copy
writes the data into the socket. Unix.read and Unix.write
do a completely avoidable copy of the data, which is prevented by
switching to Netsys_mem.mem_read and Netsys_mem.mem_write,
respectively. The latter two functions exploit an optimization
that is only possible when the data is memory-typed.
The possible optimizations for the decoding side of the problem
are slightly less impressive, but still worthwhile.
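The file example from the paragraph above might be sketched as follows. This is not taken from the library sources. It assumes that Netsys_mem.mem_read takes its arguments in the same order as Unix.read (descriptor, buffer, position, length) and returns the number of bytes read. The helper name mstring_of_file_prefix is hypothetical.

```ocaml
(* Sketch only: requires Ocamlnet (Netsys_mem, Xdr_mstring) and unix. *)

(* Read up to [n] bytes from [fd] into a fresh memory buffer and wrap
   them as an mstring. ASSUMPTION: mem_read mirrors Unix.read's
   argument order and return value. *)
let mstring_of_file_prefix fd n =
  let buf = Bigarray.Array1.create Bigarray.char Bigarray.c_layout n in
  (* A single read may return fewer than [n] bytes. *)
  let got = Netsys_mem.mem_read fd buf 0 n in
  (* Wrap only the bytes that were actually read. *)
  Xdr_mstring.memory_to_mstring ~len:got buf
```

A production version would loop until n bytes have arrived or end of file is reached. The single read is kept here to make the copy into the buffer easy to see.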
class type mstring = object..end
class type mstring_factory = object..end
mstring objects
val string_based_mstrings : mstring_factory
val string_to_mstring : ?pos:int -> ?len:int -> string -> mstring
val memory_based_mstrings : mstring_factory
Bigarray.Array1.create
val memory_to_mstring : ?pos:int -> ?len:int -> Netsys_mem.memory -> mstring
val paligned_memory_based_mstrings : mstring_factory
Netsys_mem.alloc_memory_pages if available, and
Bigarray.Array1.create if not.
val memory_pool_based_mstrings : Netsys_mem.memory_pool -> mstring_factory
val length_mstrings : mstring list -> int
val concat_mstrings : mstring list -> string
val prefix_mstrings : mstring list -> int -> string
prefix_mstrings l n: returns the first n chars of the
concatenated mstrings l as a single string
val blit_mstrings_to_memory : mstring list -> Netsys_mem.memory -> unit
val shared_sub_mstring : mstring -> int -> int -> mstring
shared_sub_mstring ms pos len: returns an mstring that includes
a substring of ms, starting at pos, and with len bytes.
The returned mstring shares the buffer with the original mstring ms
: mstring list -> int -> int -> mstring list
val copy_mstring : mstring -> mstring
val copy_mstrings : mstring list -> mstring list
type named_mstring_factories = (string, mstring_factory) Hashtbl.t
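A short sketch of how some of the helper functions listed above fit together. It uses only signatures shown in this module and assumes Ocamlnet is installed. The sample strings are invented for the illustration.

```ocaml
(* Sketch only: requires Ocamlnet, which provides Xdr_mstring. *)

(* Two mstrings; the second one covers only the first 5 chars
   of its source string. *)
let parts =
  [ Xdr_mstring.string_to_mstring "Hello, ";
    Xdr_mstring.string_to_mstring ~pos:0 ~len:5 "world!!!" ]

(* Total length of all mstrings in the list. *)
let total = Xdr_mstring.length_mstrings parts

(* The whole list as one Ocaml string (this copies the data). *)
let all = Xdr_mstring.concat_mstrings parts

(* Only the first 5 chars of the concatenation. *)
let head = Xdr_mstring.prefix_mstrings parts 5

(* A view on the first 5 bytes of the first mstring; it shares the
   buffer with the original instead of copying it. *)
let sub = Xdr_mstring.shared_sub_mstring (List.hd parts) 0 5
```

Because the result of shared_sub_mstring shares its buffer with the original, copy_mstring is the safer choice when the original may change later.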