
Putting data into data repositories

Let's say that we have a simple text file:

$ cat blah.txt 
This is an example file.

If this file is part of the data we use in our ML workflow, we should version it. To version this file in our repository, myrepo, we just need to commit it into that repository:

$ pachctl put-file myrepo master -c -f blah.txt 

The -c flag specifies that we want Pachyderm to open a new commit, insert the file we are referencing, and close the commit all in one shot. The -f flag specifies that we are providing a file.

Note that here we commit a single file to the master branch of a single repository, but the Pachyderm API is considerably more flexible. We can add, delete, or otherwise modify many versioned files in a single commit or across multiple commits. Further, the files we version could come from a URL, an object store link, a database dump, and so on.
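
As a rough sketch of what putting several files into one commit might look like, the script below opens a commit, adds two files, and then closes the commit. It assumes the hyphenated start-commit and finish-commit subcommands that accompany put-file in this generation of pachctl, and other.txt is a hypothetical second file that does not appear in the text above. The script defaults to a dry run that only prints the commands, so check the exact syntax against your installed pachctl version before running it for real.

```shell
# Dry run by default: each command is printed rather than executed.
# Set PACHCTL=pachctl in the environment to run against a real cluster.
PACHCTL=${PACHCTL:-echo pachctl}

# Assumed workflow: open a commit on master, add files, close the commit.
# other.txt is a hypothetical second file used only for illustration.
plan=$(
  $PACHCTL start-commit myrepo master
  $PACHCTL put-file myrepo master -f blah.txt
  $PACHCTL put-file myrepo master -f other.txt
  $PACHCTL finish-commit myrepo master
)
echo "$plan"
```

Unlike the -c flag, which opens and closes a commit around a single put-file, this pattern leaves the commit open until every file has been added.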

As a sanity check, we can confirm that our file was versioned in the repository:

$ pachctl list-repo
NAME    CREATED         SIZE
myrepo  10 minutes ago  25 B
$ pachctl list-file myrepo master
NAME      TYPE  SIZE
blah.txt  file  25 B
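
The 25 B reported for both the repository and the file should match the size of blah.txt on disk. The following sketch checks that locally, assuming the file is plain ASCII and ends with a single newline (24 characters of text plus the newline). It recreates the file in a temporary directory, so the workdir path is illustrative only.

```shell
# Recreate the example file in a scratch directory.
# printf writes the 24-character sentence followed by one newline.
workdir=$(mktemp -d)
printf 'This is an example file.\n' > "$workdir/blah.txt"

# Count the bytes in the file; tr strips padding that some wc builds add.
size=$(wc -c < "$workdir/blah.txt" | tr -d ' ')
echo "blah.txt is $size bytes"
```

If the local byte count and the SIZE column disagree, the file that was committed is probably not the one you expected.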