- Machine Learning With Go
- Daniel Whitenack
- 2021-07-08 10:37:30
Putting data into data repositories
Let's say that we have a simple text file:
$ cat blah.txt
This is an example file.
If this file is part of the data we are using in our ML workflow, we should version it. To version this file in our repository, myrepo, we just need to commit it to that repository:
$ pachctl put-file myrepo master -c -f blah.txt
The -c flag specifies that we want Pachyderm to open a new commit, insert the file we are referencing, and close the commit all in one shot. The -f flag specifies that we are providing a file.
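Since the rest of our workflow is written in Go, it can be handy to issue the same commit from a Go program. The following is a minimal sketch that simply shells out to the pachctl command shown above; it does not use a Pachyderm client library. The helper name putFileArgs is hypothetical, and the sketch assumes pachctl is on the PATH and already configured to reach a cluster. If pachctl is not installed, the program only prints the command it would have run.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// putFileArgs builds the argument list for the put-file command used in the
// text. The function name is hypothetical; the flags (-c to open and close a
// commit in one shot, -f to supply a file) are the ones described above.
func putFileArgs(repo, branch, path string) []string {
	return []string{"put-file", repo, branch, "-c", "-f", path}
}

func main() {
	args := putFileArgs("myrepo", "master", "blah.txt")

	// Assumption: pachctl is installed and points at a running cluster.
	bin, err := exec.LookPath("pachctl")
	if err != nil {
		fmt.Println("pachctl not found; would run: pachctl", strings.Join(args, " "))
		return
	}

	// Run the command and forward its output to our own stdout/stderr.
	cmd := exec.Command(bin, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "put-file failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument construction in its own function makes it easy to check the exact command line without needing a cluster.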
Note that we are committing a single file to the master branch of a single repository here. The Pachyderm API is more flexible than this, however: we can commit, delete, or otherwise modify many versioned files in a single commit or over multiple commits. Further, the files we version can come from a URL, an object store link, a database dump, and so on.
As a sanity check, we can confirm that our file was versioned in the repository:
$ pachctl list-repo
NAME     CREATED          SIZE
myrepo   10 minutes ago   25 B
$ pachctl list-file myrepo master
NAME       TYPE   SIZE
blah.txt   file   25 B
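The same sanity check could be automated. As a rough sketch, the Go program below parses listing text in the three-column layout shown above and also confirms that the 25 B size matches the content of blah.txt (24 characters plus a trailing newline). The fileEntry type and parseListFile function are hypothetical names introduced for illustration, and the parser assumes whitespace-separated columns with a single header row; real pachctl output may differ between versions.

```go
package main

import (
	"fmt"
	"strings"
)

// fileEntry is a hypothetical struct holding one row of a file listing.
type fileEntry struct {
	Name string
	Type string
	Size string
}

// parseListFile parses listing text with a NAME/TYPE/SIZE header row,
// as shown in the text. It assumes whitespace-separated columns.
func parseListFile(out string) []fileEntry {
	var entries []fileEntry
	for i, line := range strings.Split(strings.TrimSpace(out), "\n") {
		fields := strings.Fields(line)
		if i == 0 || len(fields) < 3 {
			continue // skip the header row and any malformed rows
		}
		entries = append(entries, fileEntry{
			Name: fields[0],
			Type: fields[1],
			Size: strings.Join(fields[2:], " "), // e.g. "25 B"
		})
	}
	return entries
}

func main() {
	// Sample listing copied from the output shown in the text.
	out := "NAME TYPE SIZE\nblah.txt file 25 B\n"
	for _, e := range parseListFile(out) {
		fmt.Printf("%s (%s): %s\n", e.Name, e.Type, e.Size)
	}

	// The file content plus its trailing newline is 25 bytes long.
	fmt.Println(len("This is an example file.\n"))
}
```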