
BigQuery public datasets

Google continually adds publicly available datasets that developers can use to evaluate BigQuery's capabilities and performance, or to build demo products. Users are not billed for storing these public datasets, but they are billed for the bytes processed by each query they run against them. As mentioned previously, the query validator estimates the number of bytes a query will process before it runs.
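The same estimate can also be requested programmatically. The following is a minimal sketch, assuming the google-cloud-bigquery Python client library is installed and default credentials are configured; it submits the query as a dry run, so nothing is executed or billed. The helper names dry_run_bytes and estimate_cost are hypothetical, and the price per TiB is deliberately left as a caller-supplied parameter because pricing changes over time and should be taken from the current pricing page.

```python
def dry_run_bytes(sql: str) -> int:
    """Return the number of bytes BigQuery estimates `sql` would process.

    The query is submitted as a dry run, so it is validated but not executed.
    """
    # Imported lazily so estimate_cost() below works without the client library.
    from google.cloud import bigquery

    # Assumption: application default credentials and a default project are set.
    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


def estimate_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Convert a byte count into a cost, given a caller-supplied price per TiB.

    `price_per_tib` is a placeholder; this sketch does not assume any real price.
    """
    tib = bytes_processed / 2**40  # 1 TiB = 2**40 bytes
    return tib * price_per_tib
```

For example, passing the result of dry_run_bytes for a query to estimate_cost, together with the price that applies to your account, gives a rough figure before the query is run against a public dataset.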

If you are an IT service provider, you can showcase your Big Data ideas using the public datasets in BigQuery. You can see some of the dashboards built on BigQuery data at https://www.bimeanalytics.com/dashboards.

One of the larger public datasets is bigquery-public-data:github_repos, which stores data about GitHub repositories. One of its tables, named files, has over 2 billion records. Querying a table of this size gives users an idea of BigQuery's performance. To view the table, click on the drop-down menu next to the project and choose Display project, as shown in the following screenshot:

Enter the project name bigquery-public-data in the dialog box, choose the options shown in the screenshot, and click on the OK button:

Choose the files table under the github_repos dataset in the bigquery-public-data project, as shown in the following screenshot. Review the table's schema, then run some sample queries against the table to evaluate the performance of BigQuery:
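As an illustration of such a sample query, the sketch below builds a standard SQL aggregation over the files table and times how long it takes to return. It assumes the google-cloud-bigquery Python client library and configured default credentials. The helper names are hypothetical, and the column name used in the usage note (repo_name) is an assumption that should be checked against the schema shown in the web UI before running.

```python
import time


def to_standard_sql_ref(legacy_name: str) -> str:
    """Convert 'project:dataset.table' into a backtick-quoted standard SQL path."""
    project, rest = legacy_name.split(":", 1)
    dataset, table = rest.split(".", 1)
    return f"`{project}.{dataset}.{table}`"


def top_values_sql(legacy_name: str, column: str, limit: int = 10) -> str:
    """Build a query listing the most frequent values of `column`."""
    table_ref = to_standard_sql_ref(legacy_name)
    return (
        f"SELECT {column}, COUNT(*) AS n "
        f"FROM {table_ref} "
        f"GROUP BY {column} ORDER BY n DESC LIMIT {limit}"
    )


def run_and_time(sql: str):
    """Run `sql` and return (rows, elapsed_seconds). This query is billed."""
    # Imported lazily so the SQL helpers above work without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: default credentials are configured
    start = time.perf_counter()
    rows = list(client.query(sql).result())  # blocks until the job finishes
    return rows, time.perf_counter() - start
```

A call such as run_and_time(top_values_sql("bigquery-public-data:github_repos.files", "repo_name")) scans one column of the table and reports the wall-clock time. Because the query is billed for the bytes it processes, it is worth checking the estimate first.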

According to a 2012 white paper (https://cloud.google.com/files/BigQueryTechnicalWP.pdf), BigQuery can complete a full scan of 35 billion rows and return results in tens of seconds, without any index on the table.
