Tracking machine learning models in R with MLflow
 
 2019-2-14
   
 
Editor's note:

This article comes from IBM. It describes how to install and set up an MLflow environment, train and track machine learning models in R, package source code and data in an MLproject, and run the project with the mlflow run command.

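In this article, I briefly introduce MLflow and how it works. MLflow currently provides APIs in Python that you can call from your machine learning source code to log the parameters, metrics, and artifacts that the MLflow tracking server should track.

If you are familiar with machine learning operations and perform them in R, you might want to use MLflow to track your models and every run. There are several ways to do this: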

Wait for MLflow to release an API in R

Wrap the MLflow RESTful API and log through curl commands

Use an R package that can call a Python interpreter to call the existing Python API

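The last approach is simple and lets you interact with MLflow without waiting for an R API. In this tutorial, I show how to do this with the reticulate R package.

reticulate is an open source R package that lets you call Python from R by embedding a Python session within the R session. It provides seamless, high-performance interoperability between R and Python. The package is available from the CRAN repository.

MLflow also comes with the Projects component, which packages data, source code with commands, parameters, and execution environment setup together as one self-contained specification. Once an MLproject is defined, you can run the project anywhere. Currently, an MLproject can run Python code or shell commands. It can also set up a Python environment for the project as specified in a user-defined conda.yaml file.

R users typically import some packages in their R source code, and these packages must be installed before the code can run. One possible future enhancement is to add a feature to MLflow, similar to conda.yaml, that sets up R package dependencies. This tutorial shows how to create an MLproject that contains R source code and how to run it with the mlflow run command.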

Learning objectives

In this tutorial, you install and set up an MLflow environment, train and track machine learning models in R, package source code and data in an MLproject, and run the project with the mlflow run command.

Prerequisites

Before you start this tutorial, install Python on the platform where you run R. I prefer to install miniconda. Because the machine learning training is done in R, R must also be installed on the platform.

Steps

Step 1: Install MLflow

Create a virtual environment for MLflow and install the mlflow package as follows (using conda):

conda create -q -n mlflow python=3.6
source activate mlflow
pip install -U pip
pip install mlflow

Step 2: Install the reticulate R package

Install the reticulate package from R.

install.packages("reticulate")

reticulate lets R call Python functions seamlessly. You load a Python package with the import function and call its functions with the $ operator.

> library(reticulate)
> path <- import("os.path")
> path$isdir("/tmp")
[1] TRUE

As you can see, calling Python functions in the os.path module from R with this package is straightforward. You can do the same with the mlflow package: import it, then call mlflow$log_param and mlflow$log_metric to log the parameters and metrics of your R script.
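As a minimal sketch of that pattern, the helper below logs a set of parameters and metrics for one run through reticulate. The function name log_run_summary and the example keys and values are hypothetical; start_run, log_param, log_metric, and end_run are functions of the mlflow Python module. The sketch assumes reticulate can find a Python environment with mlflow installed, and it returns FALSE instead of failing when that is not the case.

```r
# Hypothetical helper: log named parameters and metrics to MLflow from R.
# Assumes the reticulate R package and the mlflow Python package are installed.
log_run_summary <- function(params = list(), metrics = list()) {
  # Bail out quietly if reticulate or the mlflow Python module is missing
  available <- requireNamespace("reticulate", quietly = TRUE) &&
    isTRUE(tryCatch(reticulate::py_module_available("mlflow"),
                    error = function(e) FALSE))
  if (!available) {
    return(FALSE)
  }

  mlflow <- reticulate::import("mlflow")
  mlflow$start_run()
  # Close the run even if one of the logging calls fails
  on.exit(mlflow$end_run(), add = TRUE)

  for (name in names(params)) {
    mlflow$log_param(name, params[[name]])
  }
  for (name in names(metrics)) {
    # Metric values must be numeric
    mlflow$log_metric(name, as.numeric(metrics[[name]]))
  }
  TRUE
}

logged <- log_run_summary(
  params  = list(family = "gaussian"),
  metrics = list(rmse = 0.42)  # illustrative value, not a real result
)
```

The explicit start_run and end_run calls are a design choice of this sketch. When no run is active, the mlflow logging functions start one on the first call, which is what the scripts in this tutorial rely on.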

Step 3: Train a GLM model with SparkR

The following R script builds a linear regression model with SparkR. The SparkR package must be installed for this example.

# load the reticulate package and import mlflow Python module
library(reticulate)
mlflow <- import("mlflow")

# load SparkR package and start spark session
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sparkR.session(master="local[*]")

# convert iris data.frame to SparkDataFrame
df <- as.DataFrame(iris)

# parameter for GLM
family <- c("gaussian")

# log the parameter
mlflow$log_param("family", family)

# fit the GLM model
model <- spark.glm(df, Species ~ ., family = family)

# examine the model
summary(model)

# path to save the model
model_path <- "/tmp/mlflow-GLM"

# save the model
write.ml(model, model_path)

# log the artifact
mlflow$log_artifacts(model_path)

# stop spark session
sparkR.session.stop()

You can copy the script into R or RStudio and run it interactively, or save it to a file and run it with the Rscript command. Make sure that the PATH environment variable includes the path to the mlflow Python virtual environment.
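One way to check that requirement from inside R is sketched below. The helper name mlflow_cli_on_path is hypothetical, and the commented use_condaenv call assumes the conda environment is named mlflow, as in Step 1.

```r
# Hypothetical helper: TRUE if the mlflow command-line tool is found on PATH
mlflow_cli_on_path <- function() {
  nzchar(Sys.which("mlflow"))[[1]]
}

if (!mlflow_cli_on_path()) {
  message("mlflow is not on PATH; activate the conda environment from Step 1 ",
          "before starting R or running Rscript.")
}

# Alternatively, point reticulate at the environment explicitly.
# Assumes the environment is named "mlflow"; run this before import("mlflow").
# reticulate::use_condaenv("mlflow", required = TRUE)
```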

Step 4: Start the MLflow UI

Start the MLflow UI by running the mlflow ui command from a shell. Then open a browser and go to http://127.0.0.1:5000. The page now lists the GLM model training run from the previous step, which you can inspect. The following figure shows a screenshot.

Step 5: Train a decision tree model

Download the wine-quality.csv data set to your platform.

Install the rpart package in your R environment:

install.packages("rpart")

Use the following example, rpart-example.R, to build the tree model:

# Source prep.R file to install the dependencies
source("prep.R")

# Import mlflow python package for tracking
library(reticulate)
mlflow <- import("mlflow")

# Load rpart to build a tree model
library(rpart)

# Read in data
wine <- read.csv("wine-quality.csv")

# Build the model
fit <- rpart(quality ~ ., wine)

# Save the model that can be loaded later
saveRDS(fit, "fit.rpart")

# Save the model to mlflow tracking server
mlflow$log_artifact("fit.rpart")

# Plot
jpeg("rplot.jpg")
par(xpd=TRUE)
plot(fit)
text(fit, use.n=TRUE)
dev.off()

# Save the plot to mlflow tracking server
mlflow$log_artifact("rplot.jpg")

The R code consists of three parts: model training, artifact logging through MLflow, and installation of the R package dependencies.
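The model saved with saveRDS can be restored with readRDS in a later session, for example after retrieving the artifact from the tracking server. The sketch below shows the round trip. To stay self-contained, it fits a small lm model on the built-in mtcars data instead of the rpart model on the wine data, so the model, data, and file name are stand-ins.

```r
# Stand-in model: linear regression on a built-in data set
fit <- lm(mpg ~ wt, data = mtcars)

# Save the fitted model to a temporary file, as saveRDS(fit, "fit.rpart") does above
model_file <- tempfile(fileext = ".rds")
saveRDS(fit, model_file)

# Later (possibly in another R session): restore the model and predict with it
restored <- readRDS(model_file)
new_data <- data.frame(wt = c(2.5, 3.0))
preds <- predict(restored, newdata = new_data)

# The restored model gives the same predictions as the original
same <- isTRUE(all.equal(preds, predict(fit, newdata = new_data)))
```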

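Step 6: Prepare the package dependencies for the MLproject

The previous example requires the reticulate and rpart R packages. To package the code as a self-contained project, a script should install these packages automatically if they are not already installed on the platform.

The following code, saved as prep.R, installs all of the R packages that the project requires: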

# Accept parameters, args[6] is the R package repo url
args <- commandArgs()

# All installed packages
pkgs <- installed.packages()

# List of required packages for this project
reqs <- c("reticulate", "rpart")

# Try to install the dependencies if not installed
sapply(reqs, function(x){
  if (!x %in% rownames(pkgs)) {
    install.packages(x, repos=c(args[6]))
  }
})
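The index 6 works because commandArgs() returns the whole command line that Rscript builds, including the R executable and its internal options, so the position of the first user argument depends on how the script is launched. A less position-dependent variant is sketched below. The function names repo_from_args and install_missing and the fallback default are illustrative choices, not part of the original prep.R.

```r
# Hypothetical helper: pick the package repository URL from user-supplied
# arguments only, falling back to a default when none is given.
repo_from_args <- function(args = commandArgs(trailingOnly = TRUE),
                           default = "https://cran.r-project.org/") {
  if (length(args) >= 1 && nzchar(args[1])) args[1] else default
}

# Hypothetical helper: install only the packages that are not installed yet
install_missing <- function(reqs, repo = repo_from_args()) {
  missing <- setdiff(reqs, rownames(installed.packages()))
  if (length(missing) > 0) {
    install.packages(missing, repos = repo)
  }
  invisible(missing)
}

# Usage (commented out so that running the example installs nothing):
# install_missing(c("reticulate", "rpart"))
```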

Step 7: Test the code

Before packaging the code into an MLproject, test it by calling the Rscript command directly, as follows:

Rscript rpart-example.R https://cran.r-project.org/

In the MLflow UI, you should see that this run has been tracked, as shown in the following figure:

Step 8: Create the MLproject

Now write the specification and package this project as an MLproject that MLflow can recognize and run. All you need to do is create an MLproject file in the same directory.

name: r_example

entry_points:
  main:
    parameters:
      r-repo: {type: string, default: "https://cran.r-project.org/"}
    command: "Rscript rpart-example.R {r-repo}"

´ËÎļþʹÓà main Èë¿Úµã¶¨Òå r_example ÏîÄ¿¡£¸ÃÈë¿ÚµãÖ¸¶¨ÒªÍ¨¹ý mlflow run Ö´ÐеÄÃüÁîºÍ²ÎÊý¡£¶ÔÓÚ´ËÏîÄ¿£¬Rscript ÊÇÓÃÓÚµ÷Óà R Ô´´úÂëµÄ shell ÃüÁî¡£r-repo ²ÎÊý»áÌṩ URL ×Ö·û´®£¬Äú¿ÉÒÔͨ¹ýËüÀ´°²×°´ÓÊô°ü¡£ÒÑÉèÖÃÒ»¸öȱʡֵ¡£½«´Ë²ÎÊý´«µÝÖÁÓÃÓÚÔËÐÐ R Ô´´úÂëµÄÃüÁî¡£

You now have all of the files needed to train the tree model. Create the MLproject by creating a directory and copying the data and R source code into it.

.
└── R
    ├── MLproject
    ├── prep.R
    ├── rpart-example.R
    └── wine-quality.csv
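Before running the project, it can help to confirm that the directory contains everything the entry point needs. The check below is a small sketch; missing_project_files is a hypothetical helper, and its file list simply mirrors the layout above.

```r
# Hypothetical helper: return the expected project files that are absent from `dir`
missing_project_files <- function(dir,
                                  expected = c("MLproject", "prep.R",
                                               "rpart-example.R",
                                               "wine-quality.csv")) {
  expected[!file.exists(file.path(dir, expected))]
}

# Example with a throwaway directory that holds only an MLproject file
demo_dir <- file.path(tempdir(), "R-project-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "MLproject"))
missing <- missing_project_files(demo_dir)
```

An empty result means that all four files are in place.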

Step 9: Check in and test the MLproject

You can check in the MLproject and push it to a GitHub repository. Use the following command to test the project. The project can run on any platform where R is installed.

mlflow run https://github.com/adrian555/DocsDump#files/mlflow-projects/R

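You can also view the project in the MLflow tracking UI, as shown in the following figure:

Compared with the previous run (without the MLproject specification), this view adds Run Command, which captures the exact command used to run the project, and Parameters, which automatically records any parameters passed to the entry point.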

Summary

In this tutorial, you created an MLproject in R and used MLflow to track and run it. This approach lets R users use the MLflow Tracking component to quickly track their R models. It also demonstrates the purpose of the MLflow Projects component: defining a project and making it easy to rerun. R users can quickly set up their projects and use MLflow to track and run them with little effort.

 
   