NBTree in R

NBTree is a hybrid between Naive Bayes and decision trees. The model, roughly speaking, is a tree with feature tests at its internal nodes and Naive Bayes models at its leaves. Recently, I needed to implement a data analysis method that uses this model, but I couldn't find a pure R package containing NBTree.
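To make the "tree with Naive Bayes leaves" idea concrete, here is a minimal base R sketch. It is not Weka's algorithm: the single split (Sepal.Width <= 3.35) is hand-picked for illustration, the leaves use a simple Gaussian Naive Bayes, and the helper names (fit_gaussian_nb, predict_gaussian_nb, predict_nbtree_sketch) as well as the 0.1 standard deviation floor are assumptions made up for this example.

```r
# Fit a Gaussian naive Bayes model: class priors plus per-class feature means and sds.
fit_gaussian_nb <- function(x, y) {
  y <- droplevels(y)
  classes <- levels(y)
  list(
    prior = vapply(classes, function(k) mean(y == k), numeric(1)),
    mean  = sapply(classes, function(k) colMeans(x[y == k, , drop = FALSE])),
    sd    = sapply(classes, function(k) {
      s <- apply(x[y == k, , drop = FALSE], 2, sd)
      s[is.na(s) | s < 0.1] <- 0.1  # assumed floor so tiny leaves stay usable
      s
    })
  )
}

# Predict the class with the highest log posterior for each row of x.
predict_gaussian_nb <- function(model, x) {
  x <- as.matrix(x)
  classes <- names(model$prior)
  scores <- sapply(classes, function(k) {
    # t(x) is features x rows, so the per-feature mean and sd recycle correctly
    z <- dnorm(t(x), mean = model$mean[, k], sd = model$sd[, k], log = TRUE)
    log(model$prior[[k]]) + colSums(matrix(z, nrow = ncol(x)))
  })
  scores <- matrix(scores, ncol = length(classes))
  classes[max.col(scores, ties.method = "first")]
}

features <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# One hand-picked feature test separates the data into two leaves.
goes_left <- iris$Sepal.Width <= 3.35
leaves <- list(
  left  = fit_gaussian_nb(iris[goes_left, features], iris$Species[goes_left]),
  right = fit_gaussian_nb(iris[!goes_left, features], iris$Species[!goes_left])
)

# Route each row down the tree, then let that leaf's naive Bayes model decide.
predict_nbtree_sketch <- function(newdata) {
  left <- newdata$Sepal.Width <= 3.35
  out <- character(nrow(newdata))
  if (any(left)) out[left] <- predict_gaussian_nb(leaves$left, newdata[left, features])
  if (any(!left)) out[!left] <- predict_gaussian_nb(leaves$right, newdata[!left, features])
  out
}

predicted <- predict_nbtree_sketch(iris)
print(mean(predicted == iris$Species))  # training accuracy of the sketch
```

The real NBTree chooses its splits automatically and grows the tree to more than one level; the sketch only shows how routing and leaf models fit together.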

I found out that the Java library Weka contains this hybrid model, and Weka can be plugged into R via RWeka. Despite this apparent victory, I soon realized that NBTree doesn't work out of the box, and I had to go through some hassle to get it running. I have documented the steps here in case anyone else needs this model and doesn't want to spend an evening reverse-engineering.

Step 1: Installing and loading R packages

Open up an R prompt and install RWeka along with its dependencies. You will need a working Java environment installed.

install.packages("rJava")
install.packages("RWekajars")
install.packages("RWeka")

library("rJava")
library("RWekajars")
library("RWeka")
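If installing or loading rJava fails, the usual culprit is the Java environment. The following is a rough first check using only base R. It only looks for a java executable on the PATH, which is an assumption about how Java is set up on your machine: rJava can also locate Java by other means (for example through JAVA_HOME), so a negative result here is a hint rather than proof.

```r
# Sys.which() returns "" when the executable cannot be found on the PATH.
java_path <- Sys.which("java")

if (nzchar(java_path)) {
  message("Found java at: ", java_path)
} else {
  message("No java executable on the PATH; check your Java installation first.")
}
```

On Unix-like systems, running R CMD javareconf in a shell after installing Java may also help R pick up the Java configuration.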

Step 2: Install add-on from Weka package manager

Weka comes with its own package system and manager (as if there weren't enough package managers in the world already). We have to use this package manager to install the NBTree add-on, as shown below.

WPM("refresh-cache")
WPM("package-info", "repository", "naiveBayesTree")
WPM("install-package", "naiveBayesTree")

Step 3: Test that it was installed properly

Query the classifier's options. You should see output similar to the following.

WOW("weka/classifiers/trees/NBTree")
-output-debug-info
        If set, classifier is run in debug mode and may output
        additional info to the console
-do-not-check-capabilities
        If set, classifier capabilities are not checked before
        classifier is built (use with caution).

Step 4: Register classifier and bind to variable name

NBTree <- make_Weka_classifier("weka/classifiers/trees/NBTree")

Step 5: Test on Iris data set

The Iris data set ships with R in the datasets package, which is normally attached by default, so you can quickly test whether the method is working.

fitted.model <- NBTree(Species ~ ., data=iris)
print(fitted.model)
NBTree
------------------

Sepal.Width <= 3.35
|   Sepal.Width <= 2.95
|   |   Petal.Length <= 4.75: NB 3
|   |   Petal.Length > 4.75: NB 4
|   Sepal.Width > 2.95
|   |   Sepal.Length <= 5.25: NB 6
|   |   Sepal.Length > 5.25: NB 7
Sepal.Width > 3.35
|   Sepal.Length <= 5.9: NB 9
|   Sepal.Length > 5.9: NB 10

Leaf number: 3 Naive Bayes Classifier
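Printing the model only shows that it was built. As a sketch of a slightly more thorough check, the fitted model can be used for prediction and evaluated with cross-validation through RWeka's evaluate_Weka_classifier. This assumes the steps above succeeded (RWeka loads and the naiveBayesTree add-on is installed); the fold count and seed are arbitrary choices for the example, and the variable names predicted, confusion and cv are made up here.

```r
library("RWeka")

# Register the classifier and fit it, as in steps 4 and 5.
NBTree <- make_Weka_classifier("weka/classifiers/trees/NBTree")
fitted.model <- NBTree(Species ~ ., data = iris)

# Predict on the training data and compare against the true labels.
predicted <- predict(fitted.model, newdata = iris)
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)

# 10-fold cross-validation gives a less optimistic estimate than training accuracy.
cv <- evaluate_Weka_classifier(fitted.model, numFolds = 10, seed = 1)
print(cv)
```

Training accuracy tends to flatter the model, so the cross-validated summary is the number to look at when comparing NBTree against other classifiers.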

Installing other Weka add-ons

By following the same steps, you can probably install other Weka add-ons as well. You can list all available add-ons with the package manager.

WPM("list-packages", "available")