- Machine Learning With Go
- Daniel Whitenack
- 2021-07-08 10:37:30
Putting data into data repositories
Let's say that we have a simple text file:
$ cat blah.txt
This is an example file.
If this file is part of the data we use in our ML workflow, we should version it. To version this file in our repository, myrepo, we just need to commit it to that repository:
$ pachctl put-file myrepo master -c -f blah.txt
The -c flag tells Pachyderm to open a new commit, insert the referenced file, and close the commit, all in one step. The -f flag indicates that the input we are providing is a file.
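If the commit needs to be triggered from a Go program rather than from the shell, one option is to shell out to pachctl. The following is only a minimal sketch, not the book's method: it assumes pachctl is installed and on the PATH, the helper name putFileArgs is hypothetical, and the only pachctl arguments it uses are the ones from the command above. When pachctl cannot be found, it prints the command and skips the commit.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// putFileArgs (hypothetical helper) builds the argument list for the
// pachctl invocation shown above:
//
//	pachctl put-file <repo> <branch> -c -f <path>
func putFileArgs(repo, branch, path string) []string {
	return []string{"put-file", repo, branch, "-c", "-f", path}
}

func main() {
	args := putFileArgs("myrepo", "master", "blah.txt")
	fmt.Println("$ pachctl " + strings.Join(args, " "))

	// Assumption: pachctl is on the PATH. If it is not, skip the commit.
	bin, err := exec.LookPath("pachctl")
	if err != nil {
		fmt.Println("pachctl not found on PATH; skipping the commit")
		return
	}

	// Run the command and forward its output to the terminal.
	cmd := exec.Command(bin, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "put-file failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument construction in its own function makes it easy to check the exact command without a running Pachyderm cluster.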
Note that here we commit a single file to the master branch of a single repository. The Pachyderm API, however, is far more flexible: we can add, delete, or otherwise modify many versioned files in a single commit or across multiple commits. Further, the files being versioned could come from a URL, an object store link, a database dump, and so on.
As a sanity check, we can confirm that our file was versioned in the repository:
$ pachctl list-repo
NAME CREATED SIZE
myrepo 10 minutes ago 25 B
$ pachctl list-file myrepo master
NAME TYPE SIZE
blah.txt file 25 B
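The 25 B reported above is consistent with the file contents: the sentence is 24 characters long, and one more byte is accounted for if the file ends with a newline (an assumption; the cat output does not show it directly). A small sketch of that arithmetic, with hypothetical names:

```go
package main

import "fmt"

// exampleContents mirrors the output of `cat blah.txt`.
// The trailing newline is an assumption.
const exampleContents = "This is an example file.\n"

// sizeInBytes returns the number of bytes in s.
func sizeInBytes(s string) int {
	return len(s)
}

func main() {
	fmt.Printf("%d B\n", sizeInBytes(exampleContents)) // prints "25 B"
}
```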