ODBC / Databases for R (including Hadoop and NoSQL)

Create a System DSN in Windows XP

  1. Click Start, point to Control Panel, double-click Administrative Tools, and then double-click Data Sources (ODBC).
  2. Click the System DSN tab, and then click Add.
  3. Click the database driver that corresponds with the database type to which you are connecting, and then click Finish.
  4. Type the data source name. Make sure that you choose a name that you can remember. You will need to use this name later.
  5. Click Select.
  6. Click the correct database, and then click OK.
  7. Click OK, and then click OK.
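Once the DSN has been created, it can be checked from R before attempting a connection. The following is a minimal sketch, assuming the RODBC package is installed; `dsn_exists` is a hypothetical helper name and "testdsn" is a placeholder for the name typed in step 4.

```r
# Hypothetical helper: TRUE if a DSN with this name is registered.
# RODBC::odbcDataSources() returns a character vector whose names are the
# DSN names; type = "system" restricts the lookup to System DSNs.
dsn_exists <- function(dsn, sources = RODBC::odbcDataSources(type = "system")) {
  dsn %in% names(sources)
}

# Usage (needs RODBC and an ODBC driver manager on the machine):
# dsn_exists("testdsn")
```

The `sources` argument is only there so the lookup can be tried against a hand-made vector without touching the ODBC configuration.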

Oracle

http://www.oracle.com/technetwork/database/enterprise-edition/downloads/index.html

 

How to do it-

http://st-curriculum.oracle.com/obe/db/10g/r1/2day_dba/install/install.htm

http://blog.rguha.net/?p=340

http://stackoverflow.com/questions/3407015/querying-oracle-db-from-revolution-r-using-rodbc

R Packages

http://cran.r-project.org/web/packages/DBI/index.html

ROracle

http://cran.r-project.org/web/packages/ROracle/index.html

Using RODBC

http://cran.r-project.org/web/packages/RODBC/index.html

db <- odbcConnect(dsn="testdsn", uid="testuser", pwd="testpasswd", believeNRows=FALSE)
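After connecting, a query is typically run with sqlQuery and the channel closed with odbcClose. A minimal sketch, assuming RODBC is installed and the DSN already exists; `query_dsn` is a hypothetical helper, and the DSN, user, and table names in the usage line are placeholders.

```r
# Hypothetical helper: run one SQL statement against a DSN and return the
# result, closing the channel when the function exits.
query_dsn <- function(sql, dsn, uid, pwd) {
  db <- RODBC::odbcConnect(dsn = dsn, uid = uid, pwd = pwd, believeNRows = FALSE)
  # odbcConnect returns -1 (not an "RODBC" object) when the connection fails
  if (!inherits(db, "RODBC")) stop("could not connect to DSN: ", dsn)
  on.exit(RODBC::odbcClose(db))
  RODBC::sqlQuery(db, sql, stringsAsFactors = FALSE)
}

# Usage (placeholder names):
# rows <- query_dsn("SELECT * FROM some_table", "testdsn", "testuser", "testpasswd")
```

Note that sqlQuery reports SQL errors by returning a character vector of messages rather than raising an R error, so it is worth checking that the result is a data frame.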

MySQL

 

http://www.mysql.com/

http://www.mysql.com/products/community/

http://www.mysql.com/downloads/workbench/

How to-

http://rss.acs.unt.edu/Rdoc/library/RMySQL/html/RMySQL-package.html

http://erikvold.com/blog/index.cfm/2008/8/20/how-to-connect-to-mysql-with-r-in-wndows-using-rmysql

R Package

http://cran.r-project.org/web/packages/RMySQL/index.html

http://cran.r-project.org/web/packages/RODBC/index.html
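As an alternative to RODBC, RMySQL connects through the DBI interface. A minimal sketch, assuming the DBI and RMySQL packages are installed and a MySQL server is reachable; `mysql_query` is a hypothetical helper and every connection value in the usage line is a placeholder.

```r
# Hypothetical helper: open a MySQL connection, run one query, disconnect.
mysql_query <- function(sql, dbname, user, password, host = "localhost") {
  con <- DBI::dbConnect(RMySQL::MySQL(), dbname = dbname, user = user,
                        password = password, host = host)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbGetQuery(con, sql)  # returns the result set as a data frame
}

# Usage (placeholder values):
# mysql_query("SELECT 1 AS one", dbname = "test", user = "testuser", password = "testpasswd")
```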

PostgreSQL

 

http://www.postgresql.org/download/

Read-

http://cran.r-project.org/web/packages/RODBC/RODBC.pdf

R Package-

http://cran.r-project.org/web/packages/RODBC/index.html

Additional R Packages

http://rpgsql.sourceforge.net/

http://cran.r-project.org/web/packages/RPostgreSQL/
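RPostgreSQL also works through DBI, so a data frame can be written to a table and read back with the generic DBI calls. A minimal sketch, assuming DBI and RPostgreSQL are installed and a PostgreSQL server is reachable; `pg_round_trip` is a hypothetical helper and all connection values are placeholders.

```r
# Hypothetical helper: write a data frame to a new table, then read it back.
pg_round_trip <- function(df, table, dbname, user, password,
                          host = "localhost", port = 5432) {
  con <- DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = dbname, user = user,
                        password = password, host = host, port = port)
  on.exit(DBI::dbDisconnect(con))
  # With the default overwrite = FALSE, an existing table is not replaced
  DBI::dbWriteTable(con, table, df, row.names = FALSE)
  DBI::dbReadTable(con, table)
}

# Usage (placeholder values):
# pg_round_trip(head(iris), "iris_sample", dbname = "test",
#               user = "testuser", password = "testpasswd")
```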

IMPORTANT UPDATE

http://cran.r-project.org/bin/windows/contrib/r-release/ReadMe

Pre-compiled binary packages for R-2.12.x for Windows
Packages related to many database system must be linked to the exact 
version of the database system the user has installed, hence it does 
not make sense to provide binaries for packages
	RMySQL, ROracle, ROracleUI, RPostgreSQL
although it is possible to install such packages from sources by
	install.packages('packagename', type='source')
after reading the manual 'R Installation and Administration'.



 R with MongoDB

http://cran.r-project.org/web/packages/rmongodb/rmongodb.pdf

This R package provides an interface to the NoSQL MongoDB database
using the MongoDB C-driver version 0.8
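A minimal sketch of reading one document with rmongodb, assuming the package is installed and a MongoDB server is running; `mongo_first_doc` is a hypothetical helper, and the namespace in the usage line ("test.people") is a placeholder in the "database.collection" form.

```r
# Hypothetical helper: return the first document of a collection as an R list,
# or NULL if the collection is empty.
mongo_first_doc <- function(ns, host = "127.0.0.1") {
  mongo <- rmongodb::mongo.create(host = host)
  if (!rmongodb::mongo.is.connected(mongo)) stop("could not connect to ", host)
  on.exit(rmongodb::mongo.destroy(mongo))
  doc <- rmongodb::mongo.find.one(mongo, ns)  # mongo.bson object or NULL
  if (is.null(doc)) NULL else rmongodb::mongo.bson.to.list(doc)
}

# Usage (placeholder namespace):
# mongo_first_doc("test.people")
```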

ps-

R with JSON

http://cran.r-project.org/web/packages/jsonlite/index.html

This package is a fork of the RJSONIO package by Duncan Temple Lang. It builds on the same libjson C++ parser, but implements a smarter mapping between JSON data and R classes. The vignette describes the behavior in great detail. In addition to drop-in replacements for toJSON and fromJSON, the package contains functions to serialize objects and many unit tests to verify that all edge cases are encoded and decoded consistently for use with dynamic data in systems and applications.
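The mapping between JSON and R classes is easiest to see with a data frame. A minimal sketch, assuming jsonlite is installed; the `people` data frame is made-up example data.

```r
library(jsonlite)

# Made-up example data
people <- data.frame(id = 1:2, name = c("Ann", "Bob"), stringsAsFactors = FALSE)

# By default a data frame is encoded as a JSON array with one object per row
json <- toJSON(people)

# Parsing that array simplifies it back into a data frame
restored <- fromJSON(json)
str(restored)
```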

R with CouchDB

https://github.com/wactbprot/R4CouchDB

R with MonetDB

http://cran.r-project.org/web/packages/MonetDB.R/index.html

MonetDB.R: Connect MonetDB to R

Allows pulling data from MonetDB into R.

Cassandra with R

http://cran.r-project.org/web/packages/RCassandra/RCassandra.pdf

Neo4j with R

http://things-about-r.tumblr.com/post/47392314578/venue-recommendation-a-simple-use-case-connecting-r

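# Function for querying Neo4j from within R
# from http://stackoverflow.com/questions/11188918/use-neo4j-with-r
library(RCurl)    # basicTextGatherer, curlPerform, curlEscape
library(RJSONIO)  # fromJSON (RJSONIO is assumed; the excerpt does not name the JSON package)
query <- function(querystring) {
    # Collect the HTTP response body as text
    h <- basicTextGatherer()
    curlPerform(url = "localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query",
        postfields = paste("query", curlEscape(querystring),
        sep = "="), writefunction = h$update, verbose = FALSE)
    result <- fromJSON(h$value())
    # One row per returned record, one column per returned field
    data <- data.frame(t(sapply(result$data, unlist)))
    names(data) <- result$columns
    return(data)
}
# --------------------------------------
# import all data into neo4j
# --------------------------------------
# venueDataset is not defined in this excerpt; presumably it is built in the post linked above
nrow(venueDataset)  # number of venues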

R with Hadoop (RHadoop)

https://github.com/RevolutionAnalytics/RHadoop/wiki

RHadoop consists of the following packages:

  • NEW! plyrmr – higher level plyr-like data processing for structured data, powered by rmr
  • rmr – functions providing Hadoop MapReduce functionality in R
  • rhdfs – functions providing file management of the HDFS from within R
  • rhbase – functions providing database management for the HBase distributed database from within R
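A minimal sketch of a MapReduce job written in R, assuming the package is installed under the name rmr2 (the list above calls it rmr) and using its local backend so that no Hadoop cluster is needed; `square_with_mapreduce` is a hypothetical helper.

```r
# Hypothetical helper: square each element of x with a map-only job.
square_with_mapreduce <- function(x) {
  rmr2::rmr.options(backend = "local")  # run in-process, without Hadoop
  input <- rmr2::to.dfs(x)              # put the data where rmr can read it
  output <- rmr2::mapreduce(
    input = input,
    map = function(k, v) rmr2::keyval(v, v^2)  # key: original value, value: its square
  )
  rmr2::from.dfs(output)                # list with $key and $val
}

# Usage:
# square_with_mapreduce(1:10)
```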


R with Spark

http://amplab-extras.github.io/SparkR-pkg/

SparkR is an R package that provides a light-weight frontend to use Apache Spark from R. SparkR exposes the Spark API through the RDD class and allows users to interactively run jobs from the R shell on a cluster.
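A minimal sketch of the RDD-style API described above, assuming the SparkR package from the SparkR-pkg project linked here is installed along with a local Spark; later SparkR releases expose a different API, so the calls below may not apply to them. `double_on_spark` is a hypothetical helper.

```r
# Hypothetical helper: double every element of x on a Spark cluster.
double_on_spark <- function(x, master = "local") {
  sc <- SparkR::sparkR.init(master = master)       # create the Spark context
  rdd <- SparkR::parallelize(sc, x, 2L)            # distribute x over 2 slices
  doubled <- SparkR::lapply(rdd, function(v) v * 2)
  SparkR::collect(doubled)                         # bring the results back as a list
}

# Usage:
# double_on_spark(1:10)
```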

R with Hive

http://nexr.github.io/RHive/

RHive is an R extension facilitating distributed computing via HIVE query. RHive allows easy usage of HQL(Hive SQL) in R, and allows easy usage of R objects and R functions in Hive.
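A minimal sketch of running an HQL statement from R, assuming RHive is installed, the HIVE_HOME and HADOOP_HOME environment variables are set, and a Hive server is reachable; `hive_query` is a hypothetical helper, and the host and the statement in the usage line are placeholders.

```r
# Hypothetical helper: connect to Hive, run one HQL statement, close.
hive_query <- function(hql, host = "127.0.0.1") {
  RHive::rhive.init()           # picks up the Hive/Hadoop environment
  RHive::rhive.connect(host)
  on.exit(RHive::rhive.close())
  RHive::rhive.query(hql)       # result comes back as a data frame
}

# Usage (placeholder statement):
# hive_query("SHOW TABLES")
```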

http://cran.r-project.org/web/packages/RHive/index.html

Divide and Recombine (D&R) with R – RHIPE

http://www.datadr.org/

RImpala

A package to connect and run queries on Cloudera Impala (thanks to Mu Sigma)

http://cran.r-project.org/web/packages/RImpala/index.html

Pig with R

http://hortonworks.com/blog/bootstrap-sampling-with-apache-pig/
