`. For example, a signature of the method `Key#make` has been changed to `public static Key
make()`. Clients should always use `Key` with a specific target type (e.g., `Key`, `Key`).
diff --git a/h2o-docs/src/product/upgrade/Migration.md b/h2o-docs/src/product/upgrade/Migration.md
index 554a7e00346..05308e063e0 100644
--- a/h2o-docs/src/product/upgrade/Migration.md
+++ b/h2o-docs/src/product/upgrade/Migration.md
@@ -1,4 +1,4 @@
-#Migrating to H2O 3.0
+# Migrating to H2O 3.0
We're excited about the upcoming release of the latest and greatest version of H2O, and we hope you are too! H2O 3.0 has lots of improvements, including:
@@ -19,29 +19,30 @@ Overall, H2O 3.0 is more stable, elegant, and simplified, with additional capabi
---
-##Algorithm Changes
+## Algorithm Changes
Most of the algorithms available in previous versions of H2O have been improved in terms of speed and accuracy. Currently available model types include:
-###Supervised
+### Supervised
- **Generalized Linear Model (GLM)**: Binomial classification, multinomial classification, regression (including logistic regression)
- **Distributed Random Forest (DRF)**: Binomial classification, multinomial classification, regression
- **Gradient Boosting Machine (GBM)**: Binomial classification, multinomial classification, regression
- **Deep Learning (DL)**: Binomial classification, multinomial classification, regression
+- Naive Bayes
+- Stacked Ensembles
+- XGBoost
-###Unsupervised
+### Unsupervised
- K-means
- Principal Component Analysis
-- Autoencoder
+- Autoencoder
+- Generalized Low Rank Models
-There are a few algorithms that are still being refined to provide these same benefits and will be available in a future version of H2O.
-
-Currently, the following algorithms and associated capabilities are still in development:
-
-- Naïve Bayes
+### Miscellaneous
+- **Word2vec**
Check back for updates, as these algorithms will be re-introduced in an improved form in a future version of H2O.
@@ -49,13 +50,13 @@ Check back for updates, as these algorithms will be re-introduced in an improved
---
-##Parsing Changes
+## Parsing Changes
In H2O Classic, the parser reads all the data and tries to guess the column type. In H2O 3.0, the parser reads a subset and makes a type guess for each column. In Flow, you can view the preliminary parse results in the **Edit Column Names and Types** area. To change the column type, select an option from the drop-down menu to the right of the column. H2O 3.0 can also automatically identify mixed-type columns; in H2O Classic, if one column is mixed integers or real numbers using a string, the output is blank.
---
-##Web UI Changes
+## Web UI Changes
Our web UI has been completely overhauled with a much more intuitive interface that is similar to IPython Notebook. Each point-and-click action is translated immediately into an individual workflow script that can be saved for later interactive and offline use. As a result, you can now revise and rerun your workflows easily, and can even add comments and rich media.
@@ -63,7 +64,7 @@ For more information, refer to our [Getting Started with Flow](https://github.co
---
-##API Users
+## API Users
H2O's new Python API allows Pythonistas to use H2O in their favorite environment. Using the Python command line or an integrated development environment like IPython Notebook, H2O users can control clusters and manage massive datasets quickly.
@@ -71,7 +72,7 @@ H2O's REST API is the basis for the web UI (Flow), as well as the R and Python A
---
-##Java Users
+## Java Users
Generated Java REST classes ease REST API use by external programs running in a Java Virtual Machine (JVM).
@@ -79,7 +80,7 @@ As in previous versions of H2O, users can export trained models as Java objects
---
-##R Users
+## R Users
If you use H2O primarily in R, be aware that as a result of the improvements to the R package for H2O scripts created using previous versions (Nunes 2.8.6.2 or prior) will require minor revisions to work with H2O 3.0.
@@ -93,7 +94,7 @@ There is also an [R Porting Guide](#PortingGuide) that provides a side-by-side c
-#Porting R Scripts
+# Porting R Scripts
This document outlines how to port R scripts written in previous versions of H2O (Nunes 2.8.6.2 or prior, also known as "H2O Classic") for compatibility with the new H2O 3.0 API. When upgrading from H2O to H2O 3.0, most functions are the same. However, there are some differences that will need to be resolved when porting any scripts that were originally created using H2O to H2O 3.0.
@@ -105,9 +106,9 @@ For additional assistance within R, enter a question mark before the command (fo
There is also a "shim" available that will review R scripts created with previous versions of H2O, identify deprecated or renamed parameters, and suggest replacements. For more information, refer to the repo [here](https://github.com/h2oai/h2o-dev/blob/d9693a97da939a2b77c24507c8b40a5992192489/h2o-r/h2o-package/R/shim.R).
-##Changes from H2O 2.8 to H2O 3.0
+## Changes from H2O 2.8 to H2O 3.0
-###`h2o.exec`
+### `h2o.exec`
The `h2o.exec` command is no longer supported. Any workflows using `h2o.exec` must be revised to remove this command. If the H2O 3.0 workflow contains any parameters or commands from H2O Classic, errors will result and the workflow will fail.
The purpose of `h2o.exec` was to wrap expressions so that they could be evaluated in a single `\Exec2` call. For example,
@@ -129,23 +130,23 @@ A String array is ["f00", "b4r"], *not* "[\"f00\", \"b4r\"]"
Only string values are enclosed in double quotation marks (`"`).
-###`h2o.performance`
+### `h2o.performance`
To access any exclusively binomial output, use `h2o.performance`, optionally with the corresponding accessor. The accessor can only use the model metrics object created by `h2o.performance`. Each accessor is named for its corresponding field (for example, `h2o.AUC`, `h2o.gini`, `h2o.F1`). `h2o.performance` supports all current algorithms except for K-Means.
If you specify a data frame as a second parameter, H2O will use the specified data frame for scoring. If you do not specify a second parameter, the training metrics for the model metrics object are used.
-###`xval` and `validation` slots
+### `xval` and `validation` slots
The `xval` slot has been removed, as `nfolds` is not currently supported.
The `validation` slot has been merged with the `model` slot.
-###Principal Components Regression (PCR)
+### Principal Components Regression (PCR)
Principal Components Regression (PCR) has also been deprecated. To obtain PCR values, create a Principal Components Analysis (PCA) model, then create a GLM model from the scored data from the PCA model.
-###Saving and Loading Models
+### Saving and Loading Models
Saving and loading a model from R is supported in version 3.0.0.18 and later. H2O 3.0 uses the same binary serialization method as previous versions of H2O, but saves the model and its dependencies into a directory, with each object as a separate file. The `save_CV` option for available in previous versions of H2O has been deprecated, as `h2o.saveAll` and `h2o.loadAll` are not currently supported. The following commands are now supported:
@@ -165,11 +166,11 @@ Saving and loading a model from R is supported in version 3.0.0.18 and later. H2
-##GBM
+## GBM
N-fold cross-validation and grid search are currently supported in H2O 3.0.
-###Renamed GBM Parameters
+### Renamed GBM Parameters
The following parameters have been renamed, but retain the same functions:
@@ -187,7 +188,7 @@ H2O Classic Parameter Name | H2O 3.0 Parameter Name
`max.after.balance.size` | `max_after_balance_size`
-###Deprecated GBM Parameters
+### Deprecated GBM Parameters
The following parameters have been removed:
@@ -196,7 +197,7 @@ The following parameters have been removed:
- `holdout.fraction`: The fraction of the training data to hold out for validation is no longer supported.
- `grid.parallelism`: Specifying the number of parallel threads to run during a grid search is no longer supported.
-###New GBM Parameters
+### New GBM Parameters
The following parameters have been added:
@@ -204,7 +205,7 @@ The following parameters have been added:
- `score_each_iteration`: Display error rate information after each tree in the requested set is built.
- `build_tree_one_node`: Run on a single node to use fewer CPUs.
-###GBM Algorithm Comparison
+### GBM Algorithm Comparison
H2O Classic | H2O 3.0
------------- | -------------
@@ -247,7 +248,7 @@ H2O Classic | H2O 3.0
`grid.parallelism = 1)` |
-###Output
+### Output
The following table provides the component name in H2O, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [(`h2o.performance`)](#h2operf).
@@ -278,9 +279,9 @@ H2O Classic | H2O 3.0 | Model Type
---
-##GLM
+## GLM
-###Renamed GLM Parameters
+### Renamed GLM Parameters
The following parameters have been renamed, but retain the same functions:
@@ -293,7 +294,7 @@ H2O Classic Parameter Name | H2O 3.0 Parameter Name
`iter.max` | `max_iterations`
`epsilon` | `beta_epsilon`
-###Deprecated GLM Parameters
+### Deprecated GLM Parameters
The following parameters have been removed:
@@ -305,14 +306,14 @@ The following parameters have been removed:
- `disable_line_search`: This parameter has been deprecated, as it was mainly used for testing purposes.
- `max_predictors`: Stops training the algorithm if the number of predictors exceeds the specified value. (may be re-added)
-###New GLM Parameters
+### New GLM Parameters
The following parameters have been added:
- `validation_frame`: Specify the validation dataset.
- `solver`: Select IRLSM or LBFGS.
-###GLM Algorithm Comparison
+### GLM Algorithm Comparison
H2O Classic | H2O 3.0
@@ -356,7 +357,7 @@ H2O Classic | H2O 3.0
`max_predictors = -1)` | `max_active_predictors = -1)`
-###Output
+### Output
The following table provides the component name in H2O, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [(`h2o.performance`)](#h2operf).
@@ -382,9 +383,9 @@ H2O Classic | H2O 3.0 | Model Type
`@model$confusion` | | `binomial`
-##K-Means
+## K-Means
-###Renamed K-Means Parameters
+### Renamed K-Means Parameters
The following parameters have been renamed, but retain the same functions:
@@ -399,14 +400,14 @@ H2O Classic Parameter Name | H2O 3.0 Parameter Name
**Note** In H2O, the `normalize` parameter was disabled by default. The `standardize` parameter is enabled by default in H2O 3.0 to provide more accurate results for datasets containing columns with large values.
-###New K-Means Parameters
+### New K-Means Parameters
The following parameters have been added:
- `user` has been added as an additional option for the `init` parameter. Using this parameter forces the K-Means algorithm to start at the user-specified points.
- `user_points`: Specify starting points for the K-Means algorithm.
-###K-Means Algorithm Comparison
+### K-Means Algorithm Comparison
H2O Classic | H2O 3.0
------------- | -------------
@@ -424,7 +425,7 @@ H2O Classic | H2O 3.0
| `fold_assignment = c("AUTO", "Random", "Modulo"),`
| `keep_cross_validation_predictions = FALSE)`
-###Output
+### Output
The following table provides the component name in H2O and the corresponding component name in H2O 3.0 (if supported).
@@ -442,11 +443,11 @@ H2O Classic | H2O 3.0
---
-##Deep Learning
+## Deep Learning
**Note**: If the results in the confusion matrix are incorrect, verify that `score_training_samples` is equal to 0. By default, only the first 10,000 rows are included.
-###Renamed Deep Learning Parameters
+### Renamed Deep Learning Parameters
The following parameters have been renamed, but retain the same functions:
@@ -460,7 +461,7 @@ H2O Classic Parameter Name | H2O 3.0 Parameter Name
`dlmodel@model$valid_class_error` | `@model$validation_metrics@$MSE`
-###Deprecated DL Parameters
+### Deprecated DL Parameters
The following parameters have been removed:
@@ -468,7 +469,7 @@ The following parameters have been removed:
- `holdout_fraction`: Fraction of the training data to hold out for validation.
- `dlmodel@model$best_cutoff`: This output parameter has been removed.
-###New DL Parameters
+### New DL Parameters
The following parameters have been added:
@@ -479,7 +480,7 @@ The following options for the `loss` parameter have been added:
- `absolute`: Provides strong penalties for mispredictions
- `huber`: Can improve results for regression
-###DL Algorithm Comparison
+### DL Algorithm Comparison
H2O Classic | H2O 3.0
------------- | -------------
@@ -559,7 +560,7 @@ H2O Classic | H2O 3.0
| `fold_assignment = c("AUTO", "Random", "Modulo"),`
| `keep_cross_validation_predictions = FALSE)`
-###Output
+### Output
The following table provides the component name in H2O, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [(`h2o.performance`)](#h2operf).
@@ -581,9 +582,9 @@ H2O Classic | H2O 3.0 | Model Type
---
-##Distributed Random Forest
+## Distributed Random Forest
-###Changes to DRF in H2O 3.0
+### Changes to DRF in H2O 3.0
Distributed Random Forest (DRF) was represented as `h2o.randomForest(type="BigData", ...)` in H2O Classic. In H2O Classic, SpeeDRF (`type="fast"`) was not as accurate, especially for complex data with categoricals, and did not address regression problems. DRF (`type="BigData"`) was at least as accurate as SpeeDRF (`type="fast"`) and was the only algorithm that scaled to big data (data too large to fit on a single node).
In H2O 3.0, our plan is to improve the performance of DRF so that the data fits on a single node (optimally, for all cases), which will make SpeeDRF obsolete. Ultimately, the goal is provide a single algorithm that provides the "best of both worlds" for all datasets and use cases.
@@ -592,7 +593,7 @@ Please note that H2O does not currently support the ability to specify the numbe
**Note**: H2O 3.0 only supports DRF. SpeeDRF is no longer supported. The functionality of DRF in H2O 3.0 is similar to DRF functionality in H2O.
-###Renamed DRF Parameters
+### Renamed DRF Parameters
The following parameters have been renamed, but retain the same functions:
@@ -610,7 +611,7 @@ H2O Classic Parameter Name | H2O 3.0 Parameter Name
`nodesize` | `min_rows`
-###Deprecated DRF Parameters
+### Deprecated DRF Parameters
The following parameters have been removed:
@@ -623,13 +624,13 @@ The following parameters have been removed:
- `stat.type`: This parameter was used for SpeeDRF, which is no longer supported.
- `type`: This parameter was used for SpeeDRF, which is no longer supported.
-###New DRF Parameters
+### New DRF Parameters
The following parameter has been added:
- `build_tree_one_node`: Run on a single node to use fewer CPUs.
-###DRF Algorithm Comparison
+### DRF Algorithm Comparison
H2O Classic | H2O 3.0
------------- | -------------
@@ -673,7 +674,7 @@ H2O Classic | H2O 3.0
`type = "fast")` |
-###Output
+### Output
The following table provides the component name in H2O, the corresponding component name in H2O 3.0 (if supported), and the model type (binomial, multinomial, or all). Many components are now included in `h2o.performance`; for more information, refer to [(`h2o.performance`)](#h2operf).
@@ -700,7 +701,7 @@ H2O Classic | H2O 3.0 | Model Type
`@model$max_per_class_err` | currently replaced by `@model$training_metrics@metrics$thresholds_and_metric_scores$min_per_class_correct` | `binomial`
-##Github Users
+## Github Users
All users who pull directly from the H2O classic repo on Github should be aware that this repo will be renamed. To retain access to the original H2O (2.8.6.2 and prior) repository:
@@ -708,29 +709,29 @@ All users who pull directly from the H2O classic repo on Github should be aware
This is the easiest way to change your local repo and is recommended for most users.
-0. Enter `git remote -v` to view a list of your repositories.
-0. Copy the address your H2O classic repo (refer to the text in brackets below - your address will vary depending on your connection method):
+1. Enter `git remote -v` to view a list of your repositories.
+2. Copy the address of your H2O classic repo (refer to the text in brackets below - your address will vary depending on your connection method):
```
H2O_User-MBP:h2o H2O_User$ git remote -v
origin https://{H2O_User@github.com}/h2oai/h2o.git (fetch)
origin https://{H2O_User@github.com}/h2oai/h2o.git (push)
```
-0. Enter `git remote set-url origin {H2O_User@github.com}:h2oai/h2o-2.git`, where `{H2O_User@github.com}` represents the address copied in the previous step.
+3. Enter `git remote set-url origin {H2O_User@github.com}:h2oai/h2o-2.git`, where `{H2O_User@github.com}` represents the address copied in the previous step.
**The more complicated way**
This method involves editing the Github config file and should only be attempted by users who are confident enough with their knowledge of Github to do so.
-0. Enter `vim .git/config`.
-0. Look for the `[remote "origin"]` section:
+1. Enter `vim .git/config`.
+2. Look for the `[remote "origin"]` section:
```
[remote "origin"]
url = https://H2O_User@github.com/h2oai/h2o.git
fetch = +refs/heads/*:refs/remotes/origin/*
```
-0. In the `url =` line, change `h2o.git` to `h2o-2.git`.
-0. Save the changes.
+3. In the `url =` line, change `h2o.git` to `h2o-2.git`.
+4. Save the changes.
The latest version of H2O is stored in the `h2o-3` repository. All previous links to this repo will still work, but if you would like to manually update your Github configuration, follow the instructions above, replacing `h2o-2` with `h2o-3`.
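As a minimal sketch of the remote update described above, the following script rewrites `origin` in a throwaway repository, so no real clone is touched and no network access is needed. The starting URL is a placeholder without a user prefix; your own address will differ depending on your connection method.

```shell
#!/bin/sh
set -eu

# Work in a temporary repository so an existing clone is never modified.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"
git init -q .

# Placeholder starting point (assumption): origin still uses the old h2o.git name.
git remote add origin https://github.com/h2oai/h2o.git

# Swap only the trailing repository name and keep the rest of the address.
old_url=$(git remote get-url origin)
new_url="${old_url%/h2o.git}/h2o-2.git"
git remote set-url origin "$new_url"

# Print the stored URL; this is the same value as the url = line in .git/config.
git remote get-url origin
```

To point at the latest repository instead, the same approach should work with `h2o-3.git` in place of `h2o-2.git`. `git remote get-url` requires Git 2.7 or later; on older versions, `git remote -v` shows the same information.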
diff --git a/h2o-docs/src/product/upgrade/PressRelease.md b/h2o-docs/src/product/upgrade/PressRelease.md
index 61509ed3ac3..adf5a191b51 100644
--- a/h2o-docs/src/product/upgrade/PressRelease.md
+++ b/h2o-docs/src/product/upgrade/PressRelease.md
@@ -1,4 +1,4 @@
-#H2O 3.0 is here!
+# H2O 3.0 is here!
The new version of H2O offers a single integrated and tested platform for enterprise and open-source use, enhanced usability through a new web user interface (UI) with embeddable workflows, elegant APIs, and direct integration for Python and Sparkling Water.
diff --git a/h2o-docs/src/product/upgrade/PythonParity.md b/h2o-docs/src/product/upgrade/PythonParity.md
index 4305aafb8d3..5c859bae126 100644
--- a/h2o-docs/src/product/upgrade/PythonParity.md
+++ b/h2o-docs/src/product/upgrade/PythonParity.md
@@ -296,7 +296,7 @@ This group includes:
-####Summary Group
+#### Summary Group
This group includes:
@@ -318,7 +318,7 @@ This group includes:
|`all`| `all`|
|`any`|`any`|
-####Non-Group Generic
+#### Non-Group Generic
This group includes:
diff --git a/h2o-docs/src/product/upgrade/RChanges.md b/h2o-docs/src/product/upgrade/RChanges.md
index e6577dcb274..50e5927e1d5 100644
--- a/h2o-docs/src/product/upgrade/RChanges.md
+++ b/h2o-docs/src/product/upgrade/RChanges.md
@@ -1,15 +1,15 @@
-#R Interface Improvements for H2O
+# R Interface Improvements for H2O
Recent improvements in the R wrapper for H2O may cause previously written R scripts to be inoperable. This document describes these changes and provides guidelines on updating scripts for compatibility.
-##H2O Connection Object
+## H2O Connection Object
The H2O connection object (`conn`) has been removed from nearly all calls.
The `conn` object is still used in the `h2o.clusterIsUp` command.
Any `conn` references for commands other than `h2o.clusterIsUp` must be removed from scripts to ensure compatibility.
-##Changes to `apply`
+## Changes to `apply`
The data shape returned by `apply` is now identical to the default behavior in R. Any column-wide changes produce column-wide results.
@@ -17,7 +17,7 @@ For example, in previous versions, if `apply` on `MARGIN` was equal to `2`, then
To revert to the previous behavior, use the transpose function using the R command `t`.
-##Temp Management
+## Temp Management
For users who regularly remove the temporary data frames and keys manually, the temp management rules have been improved in the following ways:
@@ -32,10 +32,10 @@ For users who regularly remove the temporary data frames and keys manually, the
- If your cluster is running low on memory, run an R GC cycle to delete temporary data frames and keys
-##S4 to S3
+## S4 to S3
The internal H2O object, which was previously an S4 object, is now an S3 object. You must use S3 operations to access objects (instead of S4). The risk of overloading depends on whether the package overloads the existing package type.
-##`frame_id` to `id`
+## `frame_id` to `id`
The `frame_id` property has been renamed to `id`. This property is used in the `h2o.getFrame` command.
\ No newline at end of file
diff --git a/h2o-docs/src/product/upgrade/Rdoc.md b/h2o-docs/src/product/upgrade/Rdoc.md
index 475d151377b..a33bf989a77 100644
--- a/h2o-docs/src/product/upgrade/Rdoc.md
+++ b/h2o-docs/src/product/upgrade/Rdoc.md
@@ -1,16 +1,18 @@
-#Intro to using H2O-Dev from R with data munging (for PUBDEV-562)
+# Intro to using H2O-Dev from R with data munging (for PUBDEV-562)
+
+>**Note**: This topic is no longer being maintained. Refer to the [R Booklet](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/RBooklet.pdf) for the most up-to-date documentation.
We have the reference doc for the H2O R binding, but we regularly get questions from new users asking about which parts of R are supported, in particular regarding data munging. A 15-20 page intro doc would be really useful. Perhaps this should be a new booklet in the small yellow book series.
It should give an overview of:
-0. how the big data is kept in the cluster and manipulated from R via references,
+1. how the big data is kept in the cluster and manipulated from R via references,
-1. how to move data back and forth between data in R ,
+2. how to move data back and forth between data in R,
-2. what operations are implemented in the H2O back end,
+3. what operations are implemented in the H2O back end,
-3. example scripts which include simple data munging (frame manipulation via R expressions and ddply), perhaps based on the CityBike example (sans weather join) and Alex's examples.
+4. example scripts which include simple data munging (frame manipulation via R expressions and ddply), perhaps based on the CityBike example (sans weather join) and Alex's examples.
Per Ray, this doc should also include:
@@ -21,7 +23,7 @@ explanation of how it works
standard data prep
-#What is H2O?
+# What is H2O?
H2O is fast, scalable, open-source machine learning and deep learning for Smarter Applications. With H2O enterprises like PayPal, Nielsen, Cisco, and others can use all of their data without sampling and get accurate predictions faster. Advanced algorithms, like Deep Learning, Boosting, and Bagging Ensembles are readily available for application designers to build smarter applications through elegant APIs. Some of our earliest customers have built powerful domain-specific predictive engines for Recommendations, Customer Churn, Propensity to Buy, Dynamic Pricing, and Fraud Detection for the Insurance, Healthcare, Telecommunications, AdTech, Retail, and Payment Systems.
@@ -31,37 +33,37 @@ H2O implements almost all common machine learning algorithms, such as generalize
H2O is nurturing a grassroots movement of physicists, mathematicians, computer and data scientists to herald the new wave of discovery with data science. Academic researchers and Industrial data scientists collaborate closely with our team to make this possible. Stanford university giants Stephen Boyd, Trevor Hastie, and Rob Tibshirani advise the H2O team to build scalable machine learning algorithms. With hundreds of meetups over the past two years, H2O has become a growing word-of-mouth phenomenon amongst the data community, now implemented by 12,000+ users and deployed in 2000+ corporations using R, Python, Hadoop and Spark.
-#Intro
+# Intro
how the big data is kept in the cluster and manipulated from R via references
what operations are implemented in the H2O back end
-#Installation
+# Installation
-###Installing R or R Studio
+### Installing R or R Studio
To download R:
-0. Go to [http://cran.r-project.org/mirrors.html](http://cran.r-project.org/mirrors.html).
-0. Select your closest local mirror.
-0. Select your operating system (Linux, OS X, or Windows).
-0. Depending on your OS, download the appropriate file, along with any required packages.
-0. When the download is complete, unzip the file and install.
+1. Go to [http://cran.r-project.org/mirrors.html](http://cran.r-project.org/mirrors.html).
+2. Select your closest local mirror.
+3. Select your operating system (Linux, OS X, or Windows).
+4. Depending on your OS, download the appropriate file, along with any required packages.
+5. When the download is complete, unzip the file and install.
To download R Studio:
-0. Go to [http://www.rstudio.com/products/rstudio/](http://www.rstudio.com/products/rstudio/).
-0. Select your deployment type (desktop or server).
-0. Download the file.
-0. When the download is complete, unzip the file and install.
+1. Go to [http://www.rstudio.com/products/rstudio/](http://www.rstudio.com/products/rstudio/).
+2. Select your deployment type (desktop or server).
+3. Download the file.
+4. When the download is complete, unzip the file and install.
-#H2O Initialization
+# H2O Initialization
-0. Go to [h2o.ai/downloads](http://h2o.ai/downloads).
-0. Under **Download H2O**, select a build. The "bleeding edge" build contains the latest changes, while the "latest stable release" may be more reliable.
-0. Click the **Install in R** tab above the **Download H2O** button.
-0. Copy and paste the commands into R or R Studio, one line at a time.
+1. Go to [h2o.ai/downloads](http://h2o.ai/downloads).
+2. Under **Download H2O**, select a build. The "bleeding edge" build contains the latest changes, while the "latest stable release" may be more reliable.
+3. Click the **Install in R** tab above the **Download H2O** button.
+4. Copy and paste the commands into R or R Studio, one line at a time.
The lines are reproduced below; however, you should not copy and paste them, as the required version number has been replaced with asterisks (*). Refer to the [Downloads page](http://h2o.ai/downloads) for the latest version number.
@@ -89,7 +91,7 @@ The lines are reproduced below; however, you should not copy and paste them, as
You can also enter `install.packages("h2o")` in R to load the latest H2O R package from CRAN.
-###Making a Build from Source Code
+### Making a Build from Source Code
The R package is build as part of the standard build process. In the top-level `h2o-3` directory, use `./gradlew build`.
@@ -100,16 +102,16 @@ To build the R component by itself:
The build output is located a CRAN-like layout in the R directory.
-####Installation from the command line
+#### Installation from the command line
-0. Navigate to the top-level `h2o-3` directory: `cd ~/h2o-3`.
-0. Install the H2O package for R: `R CMD INSTALL h2o-r/R/src/contrib/h2o_****.tar.gz`
+1. Navigate to the top-level `h2o-3` directory: `cd ~/h2o-3`.
+2. Install the H2O package for R: `R CMD INSTALL h2o-r/R/src/contrib/h2o_****.tar.gz`
**Note**: Do not copy and paste the command above. You must replace the asterisks (*) with the current H2O .tar version number. Look in the `h2o-3/h2o-r/R/src/contrib/` directory for the version number.
### Installation from within R
-0. Detach any currently loaded H2O package for R.
+1. Detach any currently loaded H2O package for R.
`if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }`
```
@@ -117,7 +119,7 @@ The build output is located a CRAN-like layout in the R directory.
(as ‘lib’ is unspecified)
```
-0. Remove any previously installed H2O package for R.
+2. Remove any previously installed H2O package for R.
`if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }`
@@ -126,7 +128,7 @@ The build output is located a CRAN-like layout in the R directory.
(as ‘lib’ is unspecified)
```
-0. Install the dependencies for H2O.
+3. Install the dependencies for H2O.
**Note**: This list may change as new capabilities are added to H2O. The commands are reproduced below, but we strongly recommend visiting the H2O download page at [h2o.ai/download](http://h2o.ai/download) for the most up-to-date list of dependencies.
@@ -141,7 +143,7 @@ The build output is located a CRAN-like layout in the R directory.
if (! ("utils" %in% rownames(installed.packages()))) { install.packages("utils") }
```
-0. Install the H2O R package from your build directory.
+4. Install the H2O R package from your build directory.
`install.packages("h2o", type="source", repos=(c("http://h2o-release.s3.amazonaws.com/h2o/master/****/R")))`
**Note**: Do not copy and paste the command above. You must replace the asterisks (*) with the current H2O build number. Refer to the H2O download page at [h2o.ai/download](http://h2o.ai/download) for latest build number.
@@ -213,14 +215,14 @@ Note: As started, H2O is limited to the CRAN default of 2 CPUs.
> localH2O = h2o.init(nthreads = -1)
```
-##Munging operations in R:
+## Munging operations in R:
-###Overview:
+### Overview:
Operating on an `H2OFrame` object triggers the rollup of the expression to be executed, but the expression itself is not evaluated. Instead, an AST is built from the R expression using R's built-in parser, which handles operator precedence. In the case of assignment, the AST is stashed into the variable in the assignment. The AST is bound to an R variable as a promise to evaluate the expression on demand. When evaluation is forced, the AST is walked, converted to JSON, and shipped over to H2O. The result returned by H2O is a key pointing to the newly-created frame. Depending on the methods used, the results may not be an H2OFrame return type. Any extra preprocessing of data returned by H2O is discussed in each instance, as it varies from method to method.
-###What's implemented?
+### What's implemented?
Many of R's generic S3 methods can be combined with H2OFrame objects so that the result is coerced to an object of the appropriate type (typically an H2OFrame object). To view a list of R's generic methods, use `getGenerics()`. A call to `showMethods(classes="H2OFrame")` displays a list of permissible operations with H2OFrame objects. S3 methods are divided into four groups:
- Math
@@ -236,9 +238,9 @@ With the exception of Complex, H2OFrame methods fall into these categories as we
- Summary
-###List:
+### List:
-####Ops Group
+#### Ops Group
This group includes:
@@ -258,7 +260,7 @@ This group includes:
|`&`| `∣`| | |
-####Math Group
+#### Math Group
This group includes:
@@ -295,7 +297,7 @@ This group includes:
-####Summary Group
+#### Summary Group
This group includes:
@@ -313,7 +315,7 @@ This group includes:
|`sum`|`all`|
|`any`|
-####Non-Group Generic
+#### Non-Group Generic
This group includes:
@@ -356,25 +358,25 @@ This group includes:
-#Data Prep in R
+# Data Prep in R
standard data prep
-#Data Manipulation in R
+# Data Manipulation in R
how to move data back and forth between data in R
slicing
creating new columns
-#Examples/Demos
+# Examples/Demos
-#Support
+# Support
-Users of the H2O package may submit general inquiries and bug reports to the H2O.ai support address, [support@h2oai.com](mailto:support@h2oai.com). Alternatively, specific bugs or issues may be filed to the H2O JIRA, [https://0xdata.atlassian.net](https://0xdata.atlassian.net).
+Users of the H2O package may submit general inquiries and bug reports using the "h2o" tag on [Stack Overflow](https://stackoverflow.com/questions/tagged/h2o). Alternatively, specific bugs or issues may be filed in the H2O JIRA, [https://0xdata.atlassian.net](https://0xdata.atlassian.net).
-#References
+# References
-#Appendix
+# Appendix
(commands)
diff --git a/h2o-docs/src/product/upgrade/Upgrade.md b/h2o-docs/src/product/upgrade/Upgrade.md
index c8d6fec7acd..18a24230522 100644
--- a/h2o-docs/src/product/upgrade/Upgrade.md
+++ b/h2o-docs/src/product/upgrade/Upgrade.md
@@ -1,6 +1,6 @@
-#Upgrading to H2O 3.0
+# Upgrading to H2O 3.0
-##Why Upgrade?
+## Why Upgrade?
H2O 3.0 represents our latest iteration of H2O. It includes many improvements, such as a simplified architecture, faster and more accurate algorithms, and an interactive web UI.
@@ -8,37 +8,40 @@ As of May 15th, 2015, this version will supersede the previous version of H2O. S
For a comparison of H2O and H2O 3.0, please refer to this document.
-###Python Support
+### Python Support
Python is only supported on the latest version of H2O. For more information, refer to the Python installation instructions.
-###Sparkling Water Support
+### Sparkling Water Support
Sparkling Water is only supported with H2O 3.0. For more information, refer to the Sparkling Water repo.
-##Supported Algorithms
+## Supported Algorithms
H2O 3.0 will soon provide feature parity with previous versions of H2O. Currently, the following algorithms are supported:
-###Supervised
+### Supervised
- **Generalized Linear Model (GLM)**: Binomial classification, multinomial classification, regression (including logistic regression)
- **Distributed Random Forest (DRF)**: Binomial classification, multinomial classification, regression
- **Gradient Boosting Machine (GBM)**: Binomial classification, multinomial classification, regression
- **Deep Learning (DL)**: Binomial classification, multinomial classification, regression
+- Naive Bayes
+- Stacked Ensembles
+- XGBoost
-###Unsupervised
+### Unsupervised
- K-means
- Principal Component Analysis
-- Autoencoder
-
+- Autoencoder
+- Generalized Low Rank Models
-###Still In Testing
+### Miscellaneous
-- Naive Bayes
+- **Word2vec**
-##How to Update R Scripts
+## How to Update R Scripts
Due to the numerous enhancements to the H2O package for R to make it more consistent and simplified, some parameters have been renamed or deprecated.