
Deep Learning with H2O in Python

H2O.ai is focused on bringing AI to businesses through software. Its flagship product is H2O, the leading open source platform that makes it easy for financial services, insurance, and healthcare companies to deploy AI and deep learning to solve complex problems. More than 9,000 organizations and 80,000+ data scientists depend on H2O for critical applications like predictive maintenance and operational intelligence. The company, which was recently named to the CB Insights AI 100, is used by 169 Fortune 500 enterprises, including 8 of the world's 10 largest banks, 7 of the 10 largest insurance companies, and 4 of the top 10 healthcare companies. Notable customers include Capital One, Progressive Insurance, Transamerica, Comcast, Nielsen Catalina Solutions, Macy's, Walgreens, and Kaiser Permanente.


Using in-memory compression, H2O handles billions of data rows in-memory, even with a small cluster. To make it easier for non-engineers to create complete analytic workflows, H2O’s platform includes interfaces for R, Python, Scala, Java, JSON, and CoffeeScript/JavaScript, as well as a built-in web interface, Flow. H2O is designed to run in standalone mode, on Hadoop, or within a Spark Cluster, and typically deploys within minutes.


H2O includes many common machine learning algorithms, such as generalized linear modeling (linear regression, logistic regression, etc.), Naïve Bayes, principal components analysis, k-means clustering, and word2vec. H2O implements best-in-class algorithms at scale, such as distributed random forest, gradient boosting, and deep learning. H2O also includes a Stacked Ensembles method, which finds the optimal combination of a collection of prediction algorithms using a process known as "stacking." With H2O, customers can build thousands of models and compare the results to get the best predictions.
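As a library-independent sketch of the stacking idea (all data and model choices below are invented for illustration; H2O wraps this up for you in its Stacked Ensembles estimator), two base models' predictions can be blended by a meta-learner fit with least squares:

```python
# Toy illustration of "stacking": combine two base models' predictions
# with a learned blending weight. Data and models are made up.

def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b (one feature).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 2.1, 2.9, 4.2, 5.0, 6.1]

# Base model 1: a linear fit; base model 2: the global mean.
a, b = fit_linear(xs, ys)
p1 = [a * x + b for x in xs]
p2 = [sum(ys) / len(ys)] * len(xs)

# Meta-learner: choose a single weight w blending p1 and p2 by least
# squares, so the blend can never do worse than either base model here.
diffs = [u - v for u, v in zip(p1, p2)]
resid = [y - v for y, v in zip(ys, p2)]
w = sum(d * r for d, r in zip(diffs, resid)) / sum(d * d for d in diffs)
blended = [w * u + (1 - w) * v for u, v in zip(p1, p2)]
```

For brevity this fits the meta-learner on in-sample predictions; real stacking (including H2O's implementation) fits it on held-out, cross-validated base-model predictions to avoid overfitting.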


Here is an example of using H2O's deep learning in Python-




Step 1- Install the h2o package for Python-


In the Anaconda prompt (or any terminal where pip is available):

pip install h2o



Step 2- Import h2o, then initialize and start a local cluster-


import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()


Step 3- Load the data set (the iris data, which we will split into train and test below)-


train = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")


Step 4- Create the train and test sets with split_frame-


# splits[0] holds roughly 75% of the rows (train); splits[1] holds the rest (test)
splits = train.split_frame(ratios=[0.75], seed=1234)



Step 5- Configure the model-


model = H2ODeepLearningEstimator(
    distribution="AUTO",
    activation="RectifierWithDropout",
    hidden=[32, 32],
    input_dropout_ratio=0.2,
    l1=1e-5,
    epochs=10
)



Step 6- Train (fit) the model-


# y is the single response column; x is the list of predictor columns
model.train(x=["petal_len"], y="sepal_len", training_frame=splits[0])



Step 7- Predict with the trained model and attach the result as a new column on the test set-


splits[1]['predicted_sepal_len'] = model.predict(splits[1])




One can now compare sepal_len (actual) and predicted_sepal_len (forecast) in the test frame.
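To put a number on that comparison, simple error metrics can be computed once both columns are exported to plain Python lists (e.g. via the frame's as_data_frame() method). The values below are made up for illustration:

```python
# Hypothetical actual vs. predicted values pulled out of the test frame.
actual = [5.1, 4.9, 6.3, 5.8]
predicted = [5.0, 5.0, 6.1, 5.9]

n = len(actual)
# Mean absolute error and root mean squared error of the predictions.
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
rmse = (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5
print(round(mae, 3), round(rmse, 3))  # prints 0.125 0.132
```

The lower both numbers are, the closer the model's forecasts track the actual sepal lengths.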
