Stochastic Nonsense

Put something smart here.

Querying Postgres or Greenplum From R on a Mac

So, I’m using snow leopard, and I want to query our postgres / greenplum database.

First things first: I’m familiar with the RODBC package on CRAN. This installs fine, since it’s a binary package. I also installed the ODBC Administrator app that you have to download from apple here . Now all I need is the postgres ODBC driver, which is harder to get your hands on than you’d think. I first installed postgres84 via ports, but that didn’t seem to include the ODBC driver. I then installed the full postgres84 package in a pre-packaged distro from EnterpriseDB. This required rebooting my mac and then manually disabling postgres db — since I only want the odbc drivers — by removing the obvious files from /Library/LaunchDaemons. Then… no love. I started ODBC Administrator, selected a System DSN, chose the psqlODBC driver, and then ended up with a screen that had no prompts and just a bunch of key / value pairs with no suggestions as to what might be required — typically some variation of host, hostname, user, username, etc. Unfortunately, clicking on the key field in the rows doesn’t allow me to edit them; Hitting enter allows me to modify the key, but hell if I know how to modify the value.

So my next attempt was installing the RPostgreSQL package from CRAN.

1
install.packages('RPostgreSQL')

fails, as by default R will only grab binary packages and this is a source package. You will have to do this:

1
install.packages('RPostgreSQL', type='source')

This, of course, then fails to build, complaining that it can’t find libpq-fe.h. Awesome.

If you look hard enough, the missing header file should be wherever you installed postgres. Either in /opt/local/something if you used ports to install postgres, or in /Library/PostgresPlus/8.4SS if you installed the binary distribution as I did. Inside that directory lives an include directory which has our .h file. Setting PG_INCDIR to that path (eg export PG_INCDIR="/Library/PostgresPlus/8.4SS/include") then running R from that shell now gets me far enough that when you rerun install.packages from R you get a complaint about a missing lib:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
> install.packages('RPostgreSQL')
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘RPostgreSQL’ is not available
> ? install.packages
> install.packages('RPostgreSQL', type='source')
also installing the dependency ‘DBI’

trying URL 'http://cran.stat.ucla.edu/src/contrib/DBI_0.2-5.tar.gz'
Content type 'application/x-tar' length 308395 bytes (301 Kb)
opened URL
==================================================
downloaded 301 Kb

trying URL 'http://cran.stat.ucla.edu/src/contrib/RPostgreSQL_0.1-6.tar.gz'
Content type 'application/x-tar' length 141399 bytes (138 Kb)
opened URL
==================================================
downloaded 138 Kb

* Installing *source* package ‘DBI’ ...
** R
** inst
** preparing package for lazy loading
Creating a new generic function for "summary" in "DBI"
** help
*** installing help indices
 >>> Building/Updating help pages for package 'DBI'
     Formats: text html latex example 
  DBI-internal                      text    html    latex
  DBIConnection-class               text    html    latex   example
  DBIDriver-class                   text    html    latex   example
  DBIObject-class                   text    html    latex   example
  DBIResult-class                   text    html    latex   example
  dbCallProc                        text    html    latex
  dbCommit                          text    html    latex   example
  dbConnect                         text    html    latex   example
  dbDataType                        text    html    latex   example
  dbDriver                          text    html    latex   example
  dbGetInfo                         text    html    latex   example
  dbListTables                      text    html    latex   example
  dbReadTable                       text    html    latex   example
  dbSendQuery                       text    html    latex   example
  dbSetDataMappings                 text    html    latex   example
  fetch                             text    html    latex   example
  make.db.names                     text    html    latex   example
  print.list.pairs                  text    html    latex   example
** building package indices ...
* DONE (DBI)
* Installing *source* package ‘RPostgreSQL’ ...
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for pg_config... no
configure: checking for PostgreSQL header files
checking for "/Library/PostgresPlus/8.4SS/include/libpq-fe.h"... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
** arch - i386
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/Library/PostgresPlus/8.4SS/include -I/usr/local/include    -fPIC  -g -O2 -c RS-DBI.c -o RS-DBI.o
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/Library/PostgresPlus/8.4SS/include -I/usr/local/include    -fPIC  -g -O2 -c RS-PostgreSQL.c -o RS-PostgreSQL.o
gcc -arch i386 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -mmacosx-version-min=10.4 -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o RPostgreSQL.so RS-DBI.o RS-PostgreSQL.o -L -lpq -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: library not found for -lpq
collect2: ld returned 1 exit status
make: *** [RPostgreSQL.so] Error 1
ERROR: compilation failed for package ‘RPostgreSQL’
* Removing ‘/Library/Frameworks/R.framework/Versions/2.9/Resources/library/RPostgreSQL’

The downloaded packages are in
    ‘/private/var/folders/-E/-E9MDL2qECqW8Ik4CfUX6U+++TM/-Tmp-/RtmpvTtehd/downloaded_packages’
Updating HTML index of packages in '.Library'
Warning message:
In install.packages("RPostgreSQL", type = "source") :
  installation of package 'RPostgreSQL' had non-zero exit status

Thanks to an email to the R help list, the answer is to tell gcc where to find pg_config, which somehow magically solves this. eg:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
earl:bin $ export PG_CONFIG=/Library/PostgresPlus/8.4SS/bin/pg_config
earl:bin $ R

R version 2.9.2 (2009-08-24)
[...]
> install.packages('RPostgreSQL', type='source')
--- Please select a CRAN mirror for use in this session ---
[...]
checking for pg_config... /Library/PostgresPlus/8.4SS/bin/pg_config
checking for "/Library/PostgresPlus/8.4SS/include/libpq-fe.h"... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
** arch - i386
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/Library/PostgresPlus/8.4SS/include -I/usr/local/include    -fPIC  -g -O2 -c RS-DBI.c -o RS-DBI.o
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/Library/PostgresPlus/8.4SS/include -I/usr/local/include    -fPIC  -g -O2 -c RS-PostgreSQL.c -o RS-PostgreSQL.o
gcc -arch i386 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -mmacosx-version-min=10.4 -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o RPostgreSQL.so RS-DBI.o RS-PostgreSQL.o -L/Library/PostgresPlus/8.4SS/lib -lpq -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
** R
** inst
** preparing package for lazy loading
Creating a new generic function for "format" in "RPostgreSQL"
Creating a new generic function for "print" in "RPostgreSQL"
** help
*** installing help indices
 >>> Building/Updating help pages for package 'RPostgreSQL'
     Formats: text html latex example 
  PostgreSQL                        text    html    latex   example
  PostgreSQLConnection-class        text    html    latex   example
  PostgreSQLDriver-class            text    html    latex   example
  PostgreSQLObject-class            text    html    latex   example
  PostgreSQLResult-class            text    html    latex   example
  S4R                               text    html    latex   example
  dbApply-methods                   text    html    latex   example
  dbApply                           text    html    latex   example
  dbBuildTableDefinition            text    html    latex
  dbCallProc-methods                text    html    latex
  dbCommit-methods                  text    html    latex   example
  dbConnect-methods                 text    html    latex   example
  dbDataType-methods                text    html    latex   example
  dbDriver-methods                  text    html    latex   example
  dbGetInfo-methods                 text    html    latex   example
  dbListTables-methods              text    html    latex   example
  dbObjectId-class                  text    html    latex   example
  dbReadTable-methods               text    html    latex   example
  dbSendQuery-methods               text    html    latex   example
  dbSetDataMappings-methods         text    html    latex   example
  fetch-methods                     text    html    latex   example
  isIdCurrent                       text    html    latex   example
  make.db.names-methods             text    html    latex   example
  postgresqlDBApply                 text    html    latex   example
  postgresqlSupport                 text    html    latex
  safe.write                        text    html    latex   example
  summary-methods                   text    html    latex
** building package indices ...
* DONE (RPostgreSQL)

The downloaded packages are in
    ‘/private/var/folders/-E/-E9MDL2qECqW8Ik4CfUX6U+++TM/-Tmp-/RtmpurzqTb/downloaded_packages’
Updating HTML index of packages in '.Library'
> library(RPostgreSQL)
Loading required package: DBI

You can now test this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
> library('RPostgreSQL')
Loading required package: DBI
> drv <- dbDriver('PostgreSQL')
> drv
PostgreSQLDriver:(1825) 
> db <- dbConnect(drv, host='greenplum.ip', user='earl', dbname='db01')
> db
PostgreSQLConnection:(1825,0) 
> dbGetQuery(db, 'select 1')
?column?
1        1
> 
> 
> 
> dbGetQuery(db, 'select count(*) from earl_fav_wd')
count
1    34

Success! I can query my greenplum db from R. Also, I hate computers.