
Putting data into data repositories

Let's say that we have a simple text file:

$ cat blah.txt 
This is an example file.

If this file is part of the data we use in our ML workflow, we should version it. To version this file in our repository, myrepo, we just need to commit it into that repository:

$ pachctl put-file myrepo master -c -f blah.txt 

The -c flag tells Pachyderm to open a new commit, insert the file we are referencing, and close the commit, all in one step. The -f flag specifies the file that we are providing.

Note that we are committing a single file to the master branch of a single repository here. However, the Pachyderm API is more flexible than this one example suggests. We can commit, delete, or otherwise modify many versioned files in a single commit or over multiple commits. Further, the files we version could come from a URL, an object store link, a database dump, and so on.
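To make the idea of adding several files in a single commit concrete, here is a minimal dry-run sketch. It assumes the hyphenated pachctl subcommands of the same generation as put-file above (start-commit and finish-commit), and other.txt is a hypothetical second file that does not appear elsewhere in this section. By default the script only prints the commands; setting DRY_RUN=0 would execute them against a running Pachyderm cluster.

```shell
#!/bin/sh
# Dry-run sketch: with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Open a commit on the master branch of myrepo (assumed subcommand name).
run pachctl start-commit myrepo master

# Add two files to the open commit; other.txt is a hypothetical example file.
run pachctl put-file myrepo master -f blah.txt
run pachctl put-file myrepo master -f other.txt

# Close the commit so that both files are versioned together.
run pachctl finish-commit myrepo master
```

Because the -c flag is omitted here, put-file writes into the commit that was opened explicitly rather than creating a commit of its own for each file.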

As a sanity check, we can confirm that our file was versioned in the repository:

$ pachctl list-repo
NAME CREATED SIZE
myrepo 10 minutes ago 25 B
$ pachctl list-file myrepo master
NAME TYPE SIZE
blah.txt file 25 B
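The 25 B reported above can be cross-checked locally. The following sketch recreates the example file in a temporary directory and counts its bytes with wc. It assumes that the file ends with a single newline, which is what accounts for the 25th byte; the temporary directory and variable names are illustrative only.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing blah.txt is overwritten.
tmpdir=$(mktemp -d)

# Recreate the example file: 24 visible characters plus a trailing newline.
printf 'This is an example file.\n' > "$tmpdir/blah.txt"

# Count bytes; tr strips the padding that some wc implementations add.
size=$(wc -c < "$tmpdir/blah.txt" | tr -d ' ')
echo "blah.txt is $size bytes"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

If the local byte count and the size that Pachyderm reports disagree, the file that was committed is probably not the file we think it is.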