
H2O 3 REST API Overview

Design principles and examples for the REST API for the open source H2O Machine Learning platform.

Published in: Data & Analytics

  1. H2O 3 REST API Overview
     Raymond Peck, Director of Product Engineering, H2O.ai
     rpeck@h2oai.com
     © H2O.ai, 2015
  2. Long version of this content is here: https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/api/REST/h2o_3_rest_api_overview.md
  3. Why?
     • Use the REST API to drive H2O from an external script or program in any language.
     • Use the REST API when you want API stability.
     • Use the Java API if you want to call the internal APIs from Java, Scala, etc.
  4. Who?
     • Software developers proficient in a scripting or programming language.
     • Those familiar with nested data representations like JSON.
     • Those familiar with the functionality of H2O, at least well enough to convert a Flow, R or Python script from a data scientist.
  5. What?
     Any H2O functionality in Flow, R or Python can be accessed via the REST API:
     - data import
     - model building
     - model comparison
     - generating predictions
     - admin functions
  6. How?
     You can call the REST API:
     • from your browser
     • using browser tools such as Postman in Chrome
     • using curl
     • using the language of your choice
  7. Bindings
     For Python and R, simply use the supplied packages.
     For JVM clients:
     - H2O currently ships with REST API payload POJOs.
     - We're working on endpoint proxies.
     - These are generated as part of the build using a Python script.
     We'll work with you to generate bindings for other languages. One user easily generated C# bindings.
  8. Versioning and Stability, Part 1
     • Current version is 3.
     • Non-breaking changes are allowed; examples:
       • adding output fields
       • adding parameters with defaults that maintain old behavior
     • Well-written clients should not break as functionality is added to version 3.
  9. Versioning and Stability, Part 2
     • Backward compatibility is tested with each release, including nightlies.
     • Functionality under development is version 99.
     • /99 endpoints can be called via /EXPERIMENTAL.
  10. URLs
      http://your_server:54321/version/Resource{/...}
      Examples:
      - /3/Frames
      - /3/Frames/my_frame
      - /3/Frames/my_frame/summary
      - /3/Models
      - /3/Models/my_model
      - /3/Cloud
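The URL pattern above can be sketched as a small helper. This is only an illustration: `build_url` is a hypothetical name, and percent-encoding each path segment is an assumption about how keys with unusual characters should be sent, not something the slides specify.

```python
from urllib.parse import quote


def build_url(base, version, *parts):
    """Join a server base URL, an API version and resource path segments.

    Hypothetical helper: each segment is percent-encoded so that a key
    containing spaces or slashes stays a single path segment (assumption).
    """
    segments = [str(version)] + [quote(str(p), safe="") for p in parts]
    return base.rstrip("/") + "/" + "/".join(segments)


print(build_url("http://localhost:54321", 3, "Frames", "my_frame", "summary"))
# prints http://localhost:54321/3/Frames/my_frame/summary
```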
  11. HTTP Verbs
      • GET requests fetch data and do not cause side effects.
        GET /3/Frames/my_frame_name?row_offset=10000&row_count=1000
      • POST requests create a new object. They use the x-www-form-urlencoded input format.
      • DELETE requests delete an object.
      • HEAD requests return just the HTTP status.
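As a rough sketch of how these verbs map onto requests, the following uses only the Python standard library to build (not send) request objects. The `BASE` address and the helper names are assumptions for illustration; passing a request to `urllib.request.urlopen` would actually send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:54321"  # assumed server address


def get_request(path, **params):
    """GET: parameters travel in the query string."""
    query = "?" + urlencode(params) if params else ""
    return Request(BASE + path + query, method="GET")


def post_request(path, **fields):
    """POST: parameters travel in an x-www-form-urlencoded body."""
    body = urlencode(fields).encode("ascii")
    req = Request(BASE + path, data=body, method="POST")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


def delete_request(path):
    """DELETE: removes the object named by the path."""
    return Request(BASE + path, method="DELETE")


req = get_request("/3/Frames/my_frame_name", row_offset=10000, row_count=1000)
print(req.get_method(), req.full_url)
```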
  12. HTTP Status Codes
      • 200 OK (all is well)
      • 400 Bad Request (the request URL is bad)
      • 404 Not Found (a specified object was not found)
      • 412 Precondition Failed (bad parameters or other problem handling the request)
      • 500 Internal Server Error (unanticipated failure)
  13. Schemas, Part 1
      Schemas define input and output formats. Schema fields can be simple values, nested schemas, or arrays or dictionaries (maps) of these.
  14. Schemas, Part 2
      Each schema field carries metadata:
      • type
      • default value
      • help string
      • direction (in, out or inout)
      • required
      • importance
      • allowed values for enumerated fields
  15. {
        "__meta": { "schema_name": "ModelParameterSchemaV3", "schema_type": "Iced", "schema_version": 3 },
        "actual_value": {
          "URL": "/3/Models/prostate_glm",
          "__meta": { "schema_name": "ModelKeyV3", "schema_type": "Key<Model>", "schema_version": 3 },
          "name": "prostate_glm",
          "type": "Key<Model>"
        },
        "default_value": null,
        "help": "Destination id for this model; auto-generated if not specified",
        "label": "model_id",
        "level": "critical",
        "name": "model_id",
        "required": false,
        "type": "Key<Model>",
        "values": []
      },
  16. Error Condition Payloads
      • return a non-2xx HTTP status code
      • return standardized error payloads:
        • end-user message
        • developer message
        • HTTP status
        • optional dictionary of relevant values
        • exception information if applicable
  17. Example Error
      {
        "__meta": { "schema_type": "H2OError", ... },
        "timestamp": 1438634936808,
        "error_url": "/3/Frames/missing_frame",
        "msg": "Object 'missing_frame' not found for argument: key",
        "dev_msg": "Object 'missing_frame' not found for argument: key",
        "http_status": 404,
        "values": { "argument": "key", "name": "missing_frame" },
        "exception_type": "water.exceptions.H2OKeyNotFoundArgumentException",
        "exception_msg": "Object 'missing_frame' not found for argument: key",
        "stacktrace": [ ... ]
      }
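A client might turn such a payload into an exception along these lines. This is a minimal sketch: `H2OApiError` and `check_response` are hypothetical names, and only fields shown in the example error above are read.

```python
import json


class H2OApiError(Exception):
    """Hypothetical exception wrapping a standardized error payload."""

    def __init__(self, payload):
        self.http_status = payload.get("http_status")
        self.msg = payload.get("msg")            # end-user message
        self.dev_msg = payload.get("dev_msg")    # developer message
        self.values = payload.get("values", {})  # optional relevant values
        self.exception_type = payload.get("exception_type")
        super().__init__(f"{self.http_status}: {self.msg}")


def check_response(status, body_text):
    """Return the decoded JSON body, or raise if the status is not 2xx."""
    payload = json.loads(body_text)
    if not 200 <= status < 300:
        raise H2OApiError(payload)
    return payload
```

Callers can then branch on `err.http_status` or inspect `err.values["name"]` instead of parsing message strings.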
  18. Example Endpoints
      For the complete list check the reference docs or /Metadata/endpoints.
      As of August 6, 2015 there are 105 endpoints:
      - Loading and parsing data files
      - Frames and Models
      - Administrative and utility
      - Job management and polling
      - Persistence
  19. Loading and parsing data files
      GET /3/ImportFiles
        Import raw data files into a single-column H2O Frame.
      POST /3/ParseSetup
        Guess the parameters for parsing raw byte-oriented data into an H2O Frame.
      POST /3/Parse
        Parse a raw byte-oriented Frame into a useful columnar data Frame.
  20. Frames
      GET /3/Frames
        Return all Frames in the H2O distributed K/V store.
      GET /3/Frames/(?.*)
        Return the specified Frame.
      GET /3/Frames/(?.*)/summary
        Return a Frame, including the histograms, after forcing computation of rollups.
      GET /3/Frames/(?.*)/columns/(?.*)/summary
        Return the summary metrics for a column, e.g. mins, maxes, mean, sigma, percentiles, etc.
      DELETE /3/Frames/(?.*)
      DELETE /3/Frames
  21. Building models
      GET /3/ModelBuilders
        Return the Model Builder metadata for all available algorithms.
      GET /3/ModelBuilders/(?.*)
        Return the Model Builder metadata for the specified algorithm.
      POST /3/ModelBuilders/deeplearning/parameters
        Validate a set of Deep Learning model builder parameters.
      POST /3/ModelBuilders/deeplearning
        Train a Deep Learning model on the specified Frame.
  22. Accessing and using models
      GET /3/Models
        Return all Models from the H2O distributed K/V store.
      GET /3/Models/(?.*?)(.java)?
        Return the specified Model. Use the .java extension for a Java POJO.
      POST /3/Predictions/models/(?.*)/frames/(?.*)
        Generate predictions for the specified Frame and Model.
      DELETE /3/Models/(?.*)
      DELETE /3/Models
  23. Administrative and utility
      GET /3/About
        Return information about this H2O cluster.
      GET /3/Cloud
        Determine the status of the nodes in the H2O cloud.
      HEAD /3/Cloud
        Determine the status of the nodes in the H2O cloud.
  24. Job management and polling
      GET /3/Jobs
        Get a list of all the H2O Jobs (long-running actions).
      GET /3/Jobs/(?.*)
        Get the status of the given H2O Job (long-running action).
      POST /3/Jobs/(?.*)/cancel
        Cancel a running job.
  25. Persistence
      POST /3/Frames/(?.*)/export
        Export a Frame to the given path with optional overwrite.
      POST /99/Models.bin/(?.*)
        Import given binary model into H2O.
      GET /99/Models.bin/(?.*)
        Export given model.
  26. Example workflows using curl
      Some fields have been omitted for brevity.
      When using curl you can pipe (|) the output through python -m json.tool to pretty-print the JSON:
      curl -X GET http://localhost:54321/3/Frames | python -m json.tool
  27. GBM_Example.flow, Step 1: Import
      In Flow:
      importFiles ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]
      In curl:
      curl -X GET 'http://127.0.0.1:54321/3/ImportFiles?path=http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz'
  28. GBM_Example.flow, Step 1 Result
      {
        "__meta": { "schema_name": "ImportFilesV3", "schema_type": "Iced", "schema_version": 3 },
        "destination_frames": [
          "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
        ],
        "fails": [],
        "files": [
          "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
        ],
        "path": "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
      }
  29. GBM_Example.flow, Step 2: ParseSetup
      In Flow:
      setupParse paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]
      In curl:
      curl -X POST http://127.0.0.1:54321/3/ParseSetup --data 'source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]'
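In the curl call above, the list of source frames is sent as a JSON array inside a single form field. A sketch of building that body in Python follows; `parse_setup_body` is a hypothetical helper, and percent-encoding the value (which curl's `--data` does not do) is an assumption about what the server accepts.

```python
import json
from urllib.parse import urlencode


def parse_setup_body(paths):
    """Form-encoded body for POST /3/ParseSetup (hypothetical helper).

    The frame keys are serialized as one JSON array in 'source_frames'.
    """
    return urlencode({"source_frames": json.dumps(list(paths))})


body = parse_setup_body([
    "http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"
])
print(body)
```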
  30. GBM_Example.flow, Step 2 Result
      {
        "source_frames": [
          { "URL": "/3/Frames/http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz" }
        ],
        "parse_type": "CSV",
        "separator": 44,
        "column_names": null,
        "column_types": [ "Numeric", "Numeric", ... ],
        "destination_frame": "arrhythmia.hex",
        "header_lines": 0,
        "number_columns": 280,
        "data": [ [ "75", "0", "190", ... ], ... ]
  31. GBM_Example.flow, Step 3: Parse
      In Flow:
      parseFiles
        paths: ["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]
        destination_frame: "arrhythmia.hex"
        parse_type: "CSV"
        separator: 44
        number_columns: 280
        single_quotes: false
        column_names: null
        column_types: ["Numeric","Numeric",...,"Numeric"]
        delete_on_done: true
        check_header: -1
        chunk_size: 4194304
  32. GBM_Example.flow, Step 3: Parse
      In curl:
      curl -X POST http://127.0.0.1:54321/3/Parse --data 'destination_frame=arrhythmia.hex&source_frames=["http://s3.amazonaws.com/h2o-public-test-data/smalldata/flow_examples/arrhythmia.csv.gz"]&parse_type=CSV&separator=44&number_columns=280&single_quotes=false&column_names=&column_types=["Numeric"...,"Numeric","Numeric","Numeric","Numeric","Numeric","Numeric","Numeric"]&check_header=-1&delete_on_done=true&chunk_size=4194304'
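Because the Parse call mostly repeats values guessed by ParseSetup, a client can carry them forward rather than retyping them. The sketch below assumes a decoded ParseSetup result shaped like the Step 2 result; `parse_body` is a hypothetical helper, and the defaults for `single_quotes`, `check_header`, `delete_on_done` and `chunk_size` are simply the values used in the Flow call above, not documented defaults.

```python
import json
from urllib.parse import urlencode


def _encode(value):
    """Render one value the way the curl example does (assumed convention)."""
    if value is None:
        return ""                      # null becomes an empty field
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return json.dumps(list(value), separators=(",", ":"))
    return str(value)


def parse_body(paths, setup, delete_on_done=True, chunk_size=4194304):
    """Form-encoded body for POST /3/Parse built from a ParseSetup result."""
    fields = {
        "destination_frame": setup["destination_frame"],
        "source_frames": list(paths),
        "parse_type": setup["parse_type"],
        "separator": setup["separator"],
        "number_columns": setup["number_columns"],
        "single_quotes": setup.get("single_quotes", False),
        "column_names": setup.get("column_names"),
        "column_types": setup["column_types"],
        "check_header": setup.get("check_header", -1),
        "delete_on_done": delete_on_done,
        "chunk_size": chunk_size,
    }
    return urlencode({name: _encode(value) for name, value in fields.items()})
```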
  33. GBM_Example.flow, Step 3 Result
      {
        "job": {
          "key": { "URL": "/3/Jobs/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a" },
          "description": "Parse",
          "status": "RUNNING",
          "progress_msg": "Ingesting files.",
          "dest": { "URL": "/3/Frames/arrhythmia.hex" },
          "exception": null,
          "messages": [ ],
          "error_count": 0
        },...
      }
  34. GBM_Example.flow, Step 4: Poll for job completion
      Flow polls for Job completion automagically.
  35. GBM_Example.flow, Step 4: Result
      "jobs": [
        {
          "key": { "URL": "/3/Jobs/$03010a010a7f32d4ffffffff$_b98fc5bba38d21ea53da2a0834c44f7a" },
          "description": "Parse",
          "status": "RUNNING",
          "progress_msg": "Ingesting files.",
          "dest": { "name": "arrhythmia.hex", "URL": "/3/Frames/arrhythmia.hex" },
          "error_count": 0,
          "exception": null,
          "messages": []
        }
      ]
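Outside Flow, a client has to poll the job URL itself. The loop below is a sketch under stated assumptions: the response has the "jobs" list shape shown above, "RUNNING" is the only in-progress status (it is the only one the slides show; any other in-progress status would need adding), and `fetch` is a caller-supplied function that returns the decoded JSON for a URL. All names are hypothetical.

```python
import time


def wait_for_job(job_url, fetch, interval=1.0, timeout=600.0):
    """Poll a job until its status is no longer "RUNNING".

    fetch(job_url) must return the decoded JSON response as a dict.
    Returns the final job dict; raises TimeoutError if it never finishes.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_url)["jobs"][0]
        if job["status"] != "RUNNING":   # assumption: only in-progress status
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still running after {timeout} s: {job_url}")
        time.sleep(interval)
```

Injecting `fetch` keeps the loop independent of the HTTP library and makes it easy to exercise with canned responses. The caller should still inspect the returned job's `status`, `exception` and `error_count` fields, since the loop does not distinguish success from failure.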
  36. GBM_Example.flow, Step 5: Train the Model
      In Flow:
      buildModel 'gbm', {"model_id":"gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
        "training_frame":"arrhythmia.hex","score_each_iteration":false,
        "response_column":"C1","ntrees":"20","max_depth":5,
        "min_rows":"25","nbins":20,"learn_rate":"0.3","distribution":"AUTO",
        "balance_classes":false,"max_confusion_matrix_size":20,
        "max_hit_ratio_k":10,"class_sampling_factors":[],
        "max_after_balance_size":5,"seed":0}
  37. GBM_Example.flow, Step 5: Train the Model
      In curl:
      curl -X POST http://127.0.0.1:54321/3/ModelBuilders/gbm --data 'model_id=gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1&training_frame=arrhythmia.hex&response_column=C1&score_each_iteration=false&ntrees=20&max_depth=5&min_rows=25&nbins=20&learn_rate=0.3&distribution=AUTO&balance_classes=false&max_confusion_matrix_size=20&max_hit_ratio_k=10&class_sampling_factors=&max_after_balance_size=5&seed=0'
  38. GBM_Example.flow, Step 5: Result
      {
        "job": {
          "key": { "URL": "/3/Jobs/$03010a010a7f32d4ffffffff$_881e60f52af792b71d20540604b742dd" },
          "description": "GBM",
          "status": "RUNNING",
          "progress_msg": "Running...",
          "dest": { "URL": "/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1", ... },
          ...
        },
        "algo": "gbm",
        "algo_full_name": "Gradient Boosting Machine",
        "messages": [],
        "error_count": 0,
        "parameters": [ ... ]
      }
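The training response carries the two URLs a client needs next: the job to poll and the model that will exist once the job finishes. A minimal sketch of pulling them out, assuming the response shape shown above (`job_and_destination` is a hypothetical name):

```python
def job_and_destination(response):
    """Return (job URL to poll, URL of the object being built)."""
    job = response["job"]
    return job["key"]["URL"], job["dest"]["URL"]


response = {
    "job": {
        "key": {"URL": "/3/Jobs/$03010a010a7f32d4ffffffff$_881e60f52af792b71d20540604b742dd"},
        "status": "RUNNING",
        "dest": {"URL": "/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"},
    },
    "algo": "gbm",
}
job_url, model_url = job_and_destination(response)
print(job_url)
print(model_url)
```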
  39. GBM_Example.flow, Step 6: Poll for job completion
      Same as for Parse.
  40. GBM_Example.flow, Step 7: View the Model
      In Flow:
      getModel "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1"
      In curl:
      curl -X GET 'http://127.0.0.1:54321/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1'
  41. GBM_Example.flow, Step 7: Result
      {
        "model_id": { "URL": "/3/Models/gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1" },
        "algo": "gbm",
        "parameters": [...],
        "output": {
          "__meta": { "schema_name": "GBMModelOutputV3", },
          "model_category": "Regression",
          "scoring_history": { ... },
          "training_metrics": {
            "model_category": "Regression",
            "MSE": 31.32188458883,
            "r2": 0.88422887487626,
            "mean_residual_deviance": 31.32188458883
          },
          "status": "DONE",
          "run_time": 3211,
  42. GBM_Example.flow, Step 8: Predictions
      In Flow:
      predict model: "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1", frame: "arrhythmia.hex", predictions_frame: "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db"
      In curl:
      curl -X GET 'http://127.0.0.1:54321/3/Frames/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db?column_offset=0&column_count=20'
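The curl call above reads back an existing predictions frame. Generating the predictions would go through the POST /3/Predictions/models/.../frames/... endpoint listed under "Accessing and using models". The sketch below builds the path and body for that call; `prediction_request` is a hypothetical helper, and treating `predictions_frame` as a form field is an assumption carried over from the Flow call, not confirmed by the slides.

```python
from urllib.parse import quote, urlencode


def prediction_request(model_id, frame_id, predictions_frame=None):
    """Return (path, form body) for a POST that scores frame_id with model_id."""
    path = "/3/Predictions/models/{}/frames/{}".format(
        quote(model_id, safe=""), quote(frame_id, safe="")
    )
    # Assumption: the destination frame name is passed as 'predictions_frame'.
    fields = {"predictions_frame": predictions_frame} if predictions_frame else {}
    return path, urlencode(fields)


path, body = prediction_request(
    "gbm-51b9780b-70d0-40d0-9b5a-c723a3f358c1",
    "arrhythmia.hex",
    "prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db",
)
print(path)
print(body)
```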
  43. GBM_Example.flow, Step 8: Result
      "model_metrics": [
        {
          "predictions": {
            "frame_id": { "URL": "/3/Frames/prediction-9d6f23f3-45c2-4e1f-a48e-393b1b7de6db" },
            "total_column_count": 1,
            "rows": 452,
            "columns": [
              {
                "label": "predict",
                "data": [ 35.275735166748, 53.253980894466, 41.531820529033 ],
              }
            ],
            "MSE": 31.321880321916,
            "r2": 0.88422889064751,
            "mean_residual_deviance": 31.321880321916
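To work with the predicted values, a client can index the columns by label. This sketch assumes the decoded result has the nesting shown above (a "model_metrics" list whose first entry holds "predictions" with a "columns" list); `prediction_columns` is a hypothetical name, and the sample data is the excerpt from the slide.

```python
def prediction_columns(result):
    """Map each prediction column label to its list of values."""
    columns = result["model_metrics"][0]["predictions"]["columns"]
    return {column["label"]: column["data"] for column in columns}


result = {
    "model_metrics": [
        {
            "predictions": {
                "rows": 452,
                "columns": [
                    {
                        "label": "predict",
                        "data": [35.275735166748, 53.253980894466, 41.531820529033],
                    }
                ],
            }
        }
    ]
}
print(prediction_columns(result)["predict"][0])
```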
  44. Documentation
      • long version of this content is here: https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/api/REST/h2o_3_rest_api_overview.md
      • reference in the Help sidebar in Flow
      • reference on the H2O.ai website, http://docs.h2o.ai/
      • reference doc is generated via the /Metadata endpoints, so it's always current
  45. THANKS! Questions?