| Copyright | (c) 2011 MailRank, Inc. (c) 2011-2012 Leon P Smith |
|---|---|
| License | BSD3 |
| Maintainer | Leon P Smith <leon@melding-monads.com> |
| Stability | experimental |
| Safe Haskell | None |
| Language | Haskell98 |
Database.PostgreSQL.Simple
Contents
Description
A mid-level client library for the PostgreSQL database, aimed at ease of use and high performance.
- data Connection
- data Query
- class ToRow a
- class FromRow a
- newtype In a = In a
- newtype Binary a = Binary {
- fromBinary :: a
- newtype Only a = Only {
- fromOnly :: a
- data h :. t = h :. t
- data SqlError = SqlError {}
- data ExecStatus :: *
- data FormatError
- data QueryError
- data ResultError
- connectPostgreSQL :: ByteString -> IO Connection
- close :: Connection -> IO ()
- connect :: ConnectInfo -> IO Connection
- data ConnectInfo = ConnectInfo {}
- defaultConnectInfo :: ConnectInfo
- postgreSQLConnectionString :: ConnectInfo -> ByteString
- query :: (ToRow q, FromRow r) => Connection -> Query -> q -> IO [r]
- query_ :: FromRow r => Connection -> Query -> IO [r]
- queryWith :: ToRow q => RowParser r -> Connection -> Query -> q -> IO [r]
- queryWith_ :: RowParser r -> Connection -> Query -> IO [r]
- data FoldOptions = FoldOptions {}
- data FetchQuantity
- defaultFoldOptions :: FoldOptions
- fold :: (FromRow row, ToRow params) => Connection -> Query -> params -> a -> (a -> row -> IO a) -> IO a
- foldWithOptions :: (FromRow row, ToRow params) => FoldOptions -> Connection -> Query -> params -> a -> (a -> row -> IO a) -> IO a
- fold_ :: FromRow r => Connection -> Query -> a -> (a -> r -> IO a) -> IO a
- foldWithOptions_ :: FromRow r => FoldOptions -> Connection -> Query -> a -> (a -> r -> IO a) -> IO a
- forEach :: (ToRow q, FromRow r) => Connection -> Query -> q -> (r -> IO ()) -> IO ()
- forEach_ :: FromRow r => Connection -> Query -> (r -> IO ()) -> IO ()
- returning :: (ToRow q, FromRow r) => Connection -> Query -> [q] -> IO [r]
- execute :: ToRow q => Connection -> Query -> q -> IO Int64
- execute_ :: Connection -> Query -> IO Int64
- executeMany :: ToRow q => Connection -> Query -> [q] -> IO Int64
- withTransaction :: Connection -> IO a -> IO a
- withSavepoint :: Connection -> IO a -> IO a
- begin :: Connection -> IO ()
- commit :: Connection -> IO ()
- rollback :: Connection -> IO ()
- formatMany :: ToRow q => Connection -> Query -> [q] -> IO ByteString
- formatQuery :: ToRow q => Connection -> Query -> q -> IO ByteString
Writing queries
SQL-based applications are somewhat notorious for their susceptibility to attacks through the injection of maliciously crafted data. The primary reason for widespread vulnerability to SQL injections is that many applications are sloppy in handling user data when constructing SQL queries.
This library provides a Query type and a parameter substitution
facility to address both ease of use and security.
The Query type
A Query is a newtype-wrapped ByteString. It intentionally
exposes a tiny API that is not compatible with the ByteString
API; this makes it difficult to construct queries from fragments of
strings. The query and execute functions require queries to be
of type Query.
To most easily construct a query, enable GHC's OverloadedStrings
language extension and write your query as a normal literal string.
{-# LANGUAGE OverloadedStrings #-}
import Database.PostgreSQL.Simple
hello :: IO Int
hello = do
conn <- connectPostgreSQL ""
[Only i] <- query_ conn "select 2 + 2"
return iA Query value does not represent the actual query that will be
executed, but is a template for constructing the final query.
Parameter substitution
Since applications need to be able to construct queries with parameters that change, this library provides a query substitution capability.
The Query template accepted by query and execute can contain
any number of "?" characters. Both query and execute
accept a third argument, typically a tuple. When constructing the
real query to execute, these functions replace the first "?" in
the template with the first element of the tuple, the second
"?" with the second element, and so on. If necessary, each
tuple element will be quoted and escaped prior to substitution;
this defeats the single most common injection vector for malicious
data.
For example, given the following Query template:
select * from user where first_name = ? and age > ?
And a tuple of this form:
("Boris" :: String, 37 :: Int)The query to be executed will look like this after substitution:
select * from user where first_name = 'Boris' and age > 37
If there is a mismatch between the number of "?" characters in
your template and the number of elements in your tuple, a
FormatError will be thrown.
Note that the substitution functions do not attempt to parse or
validate your query. It's up to you to write syntactically valid
SQL, and to ensure that each "?" in your query template is
matched with the right tuple element.
Type inference
Automated type inference means that you will often be able to avoid supplying explicit type signatures for the elements of a tuple. However, sometimes the compiler will not be able to infer your types. Consider a case where you write a numeric literal in a parameter tuple:
query conn "select ? + ?" (40,2)
The above query will be rejected by the compiler, because it does
not know the specific numeric types of the literals 40 and 2.
This is easily fixed:
query conn "select ? + ?" (40 :: Double, 2 :: Double)
The same kind of problem can arise with string literals if you have
the OverloadedStrings language extension enabled. Again, just
use an explicit type signature if this happens.
Finally, remember that the compiler must be able to infer the type of a query's results as well as its parameters. We might like the following example to work:
print =<< query_ conn "select 2 + 2"
Unfortunately, while a quick glance tells us that the result type should be a single row containing a single numeric column, the compiler has no way to infer what the types are. We can easily fix this by providing an explicit type annotation:
xs <- query_ conn "select 2 + 2" print (xs :: [Only Int])
Substituting a single parameter
Haskell lacks a single-element tuple type, so if you have just one value you want substituted into a query or a single-column result, what should you do?
The obvious approach would appear to be something like this:
instance (ToField a) => ToRow a where
...Unfortunately, this wreaks havoc with type inference, so we take a
different tack. To represent a single value val as a parameter, write
a singleton list [val], use Just val, or use Only val.
Here's an example using a singleton list:
execute conn "insert into users (first_name) values (?)"
["Nuala"]A row of n query results is represented using an n-tuple, so
you should use Only to represent a single-column result.
Representing a list of values
Suppose you want to write a query using an IN clause:
select * from users where first_name in ('Anna', 'Boris', 'Carla')In such cases, it's common for both the elements and length of the
list after the IN keyword to vary from query to query.
To address this case, use the In type wrapper, and use a single
"?" character to represent the list. Omit the parentheses
around the list; these will be added for you.
Here's an example:
query conn "select * from users where first_name in ?" $
Only $ In ["Anna", "Boris", "Carla"]If your In-wrapped list is empty, the string "(null)" will be
substituted instead, to ensure that your clause remains
syntactically valid.
Modifying multiple rows at once
If you know that you have many rows of data to insert into a table,
it is much more efficient to perform all the insertions in a single
multi-row INSERT statement than individually.
The executeMany function is intended specifically for helping
with multi-row INSERT and UPDATE statements. Its rules for
query substitution are different than those for execute.
What executeMany searches for in your Query template is a
single substring of the form:
values (?,?,?)
The rules are as follows:
- The keyword
VALUESis matched case insensitively. - There must be no other "
?" characters anywhere in your template. - There must be one or more "
?" in the parentheses. - Extra white space is fine.
The last argument to executeMany is a list of parameter
tuples. These will be substituted into the query where the (?,?)
string appears, in a form suitable for use in a multi-row INSERT
or UPDATE.
Here is an example:
executeMany conn
"insert into users (first_name,last_name) values (?,?)"
[("Boris","Karloff"),("Ed","Wood")]The query that will be executed here will look like this (reformatted for tidiness):
insert into users (first_name,last_name) values
('Boris','Karloff'),('Ed','Wood')RETURNING: modifications that returns results
PostgreSQL supports returning values from data manipulation statements
such as INSERT and UPDATE. You can use these statements by
using query instead of execute. For multi-tuple inserts,
use returning instead of executeMany.
For example, were there an auto-incrementing id column and
timestamp column t that defaulted to the present time for the
sales table, then the following query would insert two new
sales records and also return their new ids and timestamps.
let q = "insert into sales (amount, label) values (?,?) returning id, t" xs :: [(Int, UTCTime)] <- query conn q (15,"Sawdust") ys :: [(Int, UTCTime)] <- returning conn q [(20,"Chips"),(300,"Wood")]
Extracting results
The query and query_ functions return a list of values in the
FromRow typeclass. This class performs automatic extraction
and type conversion of rows from a query result.
Here is a simple example of how to extract results:
import qualified Data.Text as Text xs <- query_ conn "select name,age from users" forM_ xs $ \(name,age) -> putStrLn $ Text.unpack name ++ " is " ++ show (age :: Int)
Notice two important details about this code:
- The number of columns we ask for in the query template must
exactly match the number of elements we specify in a row of the
result tuple. If they do not match, a
ResultErrorexception will be thrown. - Sometimes, the compiler needs our help in specifying types. It
can infer that
namemust be aText, due to our use of theunpackfunction. However, we have to tell it the type ofage, as it has no other information to determine the exact type.
Handling null values
The type of a result tuple will look something like this:
(Text, Int, Int)
Although SQL can accommodate NULL as a value for any of these
types, Haskell cannot. If your result contains columns that may be
NULL, be sure that you use Maybe in those positions of your
tuple.
(Text, Maybe Int, Int)
If query encounters a NULL in a row where the corresponding
Haskell type is not Maybe, it will throw a ResultError
exception.
Type conversions
Conversion of SQL values to Haskell values is somewhat permissive. Here are the rules.
- For numeric types, any Haskell type that can accurately represent
all values of the given PostgreSQL type is considered "compatible".
For instance, you can always extract a PostgreSQL 16-bit
SMALLINTcolumn to a HaskellInt. The HaskellFloattype can accurately represent aSMALLINT, so it is considered compatble with those types. - A numeric compatibility check is based only on the type of a
column, not on its values. For instance, a PostgreSQL 64-bit
BIGINTcolumn will be considered incompatible with a HaskellInt16, even if it contains the value1. - If a numeric incompatibility is found,
querywill throw aResultError. - The
StringandTexttypes are assumed to be encoded as UTF-8. If you use some other encoding, decoding may fail or give wrong results. In such cases, write anewtypewrapper and a customResultinstance to handle your encoding.
Types
data Connection
data Query
A query string. This type is intended to make it difficult to construct a SQL query by concatenating string fragments, as that is an extremely common way to accidentally introduce SQL injection vulnerabilities into an application.
This type is an instance of IsString, so the easiest way to
construct a query is to enable the OverloadedStrings language
extension and then simply write the query in double quotes.
{-# LANGUAGE OverloadedStrings #-}
import Database.PostgreSQL.Simple
q :: Query
q = "select ?"The underlying type is a ByteString, and literal Haskell strings
that contain Unicode characters will be correctly transformed to
UTF-8.
class ToRow a
A collection type that can be turned into a list of rendering
Actions.
Instances should use the toField method of the ToField class
to perform conversion of each element of the collection.
Minimal complete definition
Instances
| ToRow () | |
| ToField a => ToRow [a] | |
| ToField a => ToRow (Only a) | |
| (ToField a, ToField b) => ToRow (a, b) | |
| (ToRow a, ToRow b) => ToRow ((:.) a b) | |
| (ToField a, ToField b, ToField c) => ToRow (a, b, c) | |
| (ToField a, ToField b, ToField c, ToField d) => ToRow (a, b, c, d) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e) => ToRow (a, b, c, d, e) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e, ToField f) => ToRow (a, b, c, d, e, f) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e, ToField f, ToField g) => ToRow (a, b, c, d, e, f, g) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e, ToField f, ToField g, ToField h) => ToRow (a, b, c, d, e, f, g, h) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e, ToField f, ToField g, ToField h, ToField i) => ToRow (a, b, c, d, e, f, g, h, i) | |
| (ToField a, ToField b, ToField c, ToField d, ToField e, ToField f, ToField g, ToField h, ToField i, ToField j) => ToRow (a, b, c, d, e, f, g, h, i, j) |
class FromRow a
A collection type that can be converted from a sequence of fields. Instances are provided for tuples up to 10 elements and lists of any length.
Note that instances can be defined outside of postgresql-simple, which is often useful. For example, here's an instance for a user-defined pair:
@data User = User { name :: String, fileQuota :: Int }
instance FromRow User where
fromRow = User <$> field <*> field
@
The number of calls to field must match the number of fields returned
in a single row of the query result. Otherwise, a ConversionFailed
exception will be thrown.
Note that field evaluates it's result to WHNF, so the caveats listed in
mysql-simple and very early versions of postgresql-simple no longer apply.
Instead, look at the caveats associated with user-defined implementations
of fromField.
Minimal complete definition
Instances
newtype In a
Wrap a list of values for use in an IN clause. Replaces a
single "?" character with a parenthesized list of rendered
values.
Example:
query c "select * from whatever where id in ?" (Only (In [3,4,5]))
Constructors
| In a |
newtype Only a
A single-value "collection".
This is useful if you need to supply a single parameter to a SQL query, or extract a single column from a SQL result.
Parameter example:
query c "select x from scores where x > ?" (Only (42::Int))Result example:
xs <- query_ c "select id from users"
forM_ xs $ \(Only id) -> {- ... -}data h :. t infixr 3
A composite type to parse your custom data structures without having to define dummy newtype wrappers every time.
instance FromRow MyData where ...
instance FromRow MyData2 where ...
then I can do the following for free:
res <- query' c "..."
forM res $ \(MyData{..} :. MyData2{..}) -> do
....
Constructors
| h :. t infixr 3 |
Exceptions
data ExecStatus :: *
Constructors
| EmptyQuery | The string sent to the server was empty. |
| CommandOk | Successful completion of a command returning no data. |
| TuplesOk | Successful completion of a command returning data (such as a SELECT or SHOW). |
| CopyOut | Copy Out (from server) data transfer started. |
| CopyIn | Copy In (to server) data transfer started. |
| BadResponse | The server's response was not understood. |
| NonfatalError | A nonfatal error (a notice or warning) occurred. |
| FatalError | A fatal error occurred. |
Instances
data FormatError
Exception thrown if a Query could not be formatted correctly.
This may occur if the number of '?' characters in the query
string does not match the number of parameters provided.
Instances
data QueryError
Exception thrown if query is used to perform an INSERT-like
operation, or execute is used to perform a SELECT-like operation.
Instances
Connection management
connectPostgreSQL :: ByteString -> IO Connection
Attempt to make a connection based on a libpq connection string. See http://www.postgresql.org/docs/9.3/static/libpq-connect.html for more information. Here is an example with some of the most commonly used parameters:
host='db.somedomain.com' port=5432 ...
This attempts to connect to db.somedomain.com:5432. Omitting the port
will normally default to 5432.
On systems that provide unix domain sockets, omitting the host parameter
will cause libpq to attempt to connect via unix domain sockets.
The default filesystem path to the socket is constructed from the
port number and the DEFAULT_PGSOCKET_DIR constant defined in the
pg_config_manual.h header file. Connecting via unix sockets tends
to use the peer authentication method, which is very secure and
does not require a password.
On Windows and other systems without unix domain sockets, omitting
the host will default to localhost.
... dbname='postgres' user='postgres' password='secret \' \\ pw'
This attempts to connect to a database named postgres with
user postgres and password secret ' \ pw. Backslash
characters will have to be double-quoted in literal Haskell strings,
of course. Omitting dbname and user will both default to the
system username that the client process is running as.
Omitting password will default to an appropriate password found
in the pgpass file, or no password at all if a matching line is
not found. See
http://www.postgresql.org/docs/9.3/static/libpq-pgpass.html for
more information regarding this file.
As all parameters are optional and the defaults are sensible, the empty connection string can be useful for development and exploratory use, assuming your system is set up appropriately.
On Unix, such a setup would typically consist of a local postgresql server listening on port 5432, as well as a system user, database user, and database sharing a common name, with permissions granted to the user on the database.
On Windows, in addition you will either need pg_hba.conf
to specify the use of the trust authentication method for
the connection, which may not be appropriate for multiuser
or production machines, or you will need to use a pgpass file
with the password or md5 authentication methods.
See http://www.postgresql.org/docs/9.3/static/client-authentication.html for more information regarding the authentication process.
close :: Connection -> IO ()
connect :: ConnectInfo -> IO Connection
Connect with the given username to the given database. Will throw an exception if it cannot connect.
data ConnectInfo
Constructors
| ConnectInfo | |
Fields
| |
Instances
defaultConnectInfo :: ConnectInfo
Default information for setting up a connection.
Defaults are as follows:
- Server on
localhost - Port on
5432 - User
postgres - No password
- Database
postgres
Use as in the following example:
connect defaultConnectInfo { connectHost = "db.example.com" }postgreSQLConnectionString :: ConnectInfo -> ByteString
Turns a ConnectInfo data structure into a libpq connection string.
Queries that return results
query :: (ToRow q, FromRow r) => Connection -> Query -> q -> IO [r]
Perform a SELECT or other SQL query that is expected to return
results. All results are retrieved and converted before this
function returns.
When processing large results, this function will consume a lot of
client-side memory. Consider using fold instead.
Exceptions that may be thrown:
FormatError: the query string could not be formatted correctly.QueryError: the result contains no columns (i.e. you should be usingexecuteinstead ofquery).ResultError: result conversion failed.SqlError: the postgresql backend returned an error, e.g. a syntax or type error, or an incorrect table or column name.
query_ :: FromRow r => Connection -> Query -> IO [r]
A version of query that does not perform query substitution.
Queries taking parser as argument
queryWith :: ToRow q => RowParser r -> Connection -> Query -> q -> IO [r]
A version of query taking parser as argument
queryWith_ :: RowParser r -> Connection -> Query -> IO [r]
A version of query_ taking parser as argument
Queries that stream results
data FoldOptions
Constructors
| FoldOptions | |
Fields | |
data FetchQuantity
Number of rows to fetch at a time. Automatic currently defaults
to 256 rows, although it might be nice to make this more intelligent
based on e.g. the average size of the rows.
defaultFoldOptions :: FoldOptions
defaults to Automatic, and TransactionMode ReadCommitted ReadOnly
fold :: (FromRow row, ToRow params) => Connection -> Query -> params -> a -> (a -> row -> IO a) -> IO a
Perform a SELECT or other SQL query that is expected to return
results. Results are streamed incrementally from the server, and
consumed via a left fold.
When dealing with small results, it may be simpler (and perhaps
faster) to use query instead.
This fold is not strict. The stream consumer is responsible for forcing the evaluation of its result to avoid space leaks.
This is implemented using a database cursor. As such, this requires
a transaction. This function will detect whether or not there is a
transaction in progress, and will create a ReadCommitted ReadOnly
transaction if needed. The cursor is given a unique temporary name,
so the consumer may itself call fold.
Exceptions that may be thrown:
FormatError: the query string could not be formatted correctly.QueryError: the result contains no columns (i.e. you should be usingexecuteinstead ofquery).ResultError: result conversion failed.SqlError: the postgresql backend returned an error, e.g. a syntax or type error, or an incorrect table or column name.
foldWithOptions :: (FromRow row, ToRow params) => FoldOptions -> Connection -> Query -> params -> a -> (a -> row -> IO a) -> IO a
The same as fold, but this provides a bit more control over
lower-level details. Currently, the number of rows fetched per
round-trip to the server and the transaction mode may be adjusted
accordingly. If the connection is already in a transaction,
then the existing transaction is used and thus the transactionMode
option is ignored.
Arguments
| :: FromRow r | |
| => Connection | |
| -> Query | Query. |
| -> a | Initial state for result consumer. |
| -> (a -> r -> IO a) | Result consumer. |
| -> IO a |
A version of fold that does not perform query substitution.
Arguments
| :: FromRow r | |
| => FoldOptions | |
| -> Connection | |
| -> Query | Query. |
| -> a | Initial state for result consumer. |
| -> (a -> r -> IO a) | Result consumer. |
| -> IO a |
Arguments
| :: (ToRow q, FromRow r) | |
| => Connection | |
| -> Query | Query template. |
| -> q | Query parameters. |
| -> (r -> IO ()) | Result consumer. |
| -> IO () |
A version of fold that does not transform a state value.
Arguments
| :: FromRow r | |
| => Connection | |
| -> Query | Query template. |
| -> (r -> IO ()) | Result consumer. |
| -> IO () |
A version of forEach that does not perform query substitution.
returning :: (ToRow q, FromRow r) => Connection -> Query -> [q] -> IO [r]
Execute INSERT ... RETURNING, UPDATE ... RETURNING, or other SQL
query that accepts multi-row input and is expected to return results.
Note that it is possible to write
in cases where you are only inserting a single row, and do not need
functionality analogous to query conn "INSERT ... RETURNING ..." ...executeMany.
If the list of parameters is empty, this function will simply return []
without issuing the query to the backend. If this is not desired,
consider using the Values constructor instead.
Throws FormatError if the query could not be formatted correctly.
Statements that do not return results
execute :: ToRow q => Connection -> Query -> q -> IO Int64
Execute an INSERT, UPDATE, or other SQL query that is not
expected to return results.
Returns the number of rows affected.
Throws FormatError if the query could not be formatted correctly, or
a SqlError exception if the backend returns an error.
execute_ :: Connection -> Query -> IO Int64
A version of execute that does not perform query substitution.
executeMany :: ToRow q => Connection -> Query -> [q] -> IO Int64
Execute a multi-row INSERT, UPDATE, or other SQL query that is not
expected to return results.
Returns the number of rows affected. If the list of parameters is empty,
this function will simply return 0 without issuing the query to the backend.
If this is not desired, consider using the Values constructor instead.
Throws FormatError if the query could not be formatted correctly, or
a SqlError exception if the backend returns an error.
For example, here's a command that inserts two rows into a table with two columns:
executeMany c [sql|
INSERT INTO sometable VALUES (?,?)
|] [(1, "hello"),(2, "world")]
Here's an canonical example of a multi-row update command:
executeMany c [sql|
UPDATE sometable
SET sometable.y = upd.y
FROM (VALUES (?,?)) as upd(x,y)
WHERE sometable.x = upd.x
|] [(1, "hello"),(2, "world")
Transaction handling
withTransaction :: Connection -> IO a -> IO a
Execute an action inside a SQL transaction.
This function initiates a transaction with a "begin
transaction" statement, then executes the supplied action. If
the action succeeds, the transaction will be completed with
commit before this function returns.
If the action throws any kind of exception (not just a
PostgreSQL-related exception), the transaction will be rolled back using
rollback, then the exception will be rethrown.
For nesting transactions, see withSavepoint.
withSavepoint :: Connection -> IO a -> IO a
Create a savepoint, and roll back to it if an error occurs. This may only be used inside of a transaction, and provides a sort of "nested transaction".
See http://www.postgresql.org/docs/current/static/sql-savepoint.html
begin :: Connection -> IO ()
Begin a transaction.
commit :: Connection -> IO ()
Commit a transaction.
rollback :: Connection -> IO ()
Rollback a transaction.
Helper functions
formatMany :: ToRow q => Connection -> Query -> [q] -> IO ByteString
Format a query string with a variable number of rows.
This function is exposed to help with debugging and logging. Do not use it to prepare queries for execution.
The query string must contain exactly one substitution group,
identified by the SQL keyword "VALUES" (case insensitive)
followed by an "(" character, a series of one or more "?"
characters separated by commas, and a ")" character. White
space in a substitution group is permitted.
Throws FormatError if the query string could not be formatted
correctly.
formatQuery :: ToRow q => Connection -> Query -> q -> IO ByteString
Format a query string.
This function is exposed to help with debugging and logging. Do not use it to prepare queries for execution.
String parameters are escaped according to the character set in use
on the Connection.
Throws FormatError if the query string could not be formatted
correctly.