官术网_书友最值得收藏!

The prediction problem

In this version of the problem, we are given a matrix of m users and n items. Each row of the matrix represents a user and each column represents an item. The value of the cell in the ith row and the jth column denotes the rating given by user i to item j. This value is usually denoted as rij

For instance, consider the matrix in the following screenshot:

This matrix has seven users rating six items. Therefore, m = 7 and n = 6. User 1 has given the item 1 a rating of 4. Therefore, r11 = 4.

Let us now consider a more concrete example. Imagine you are Netflix and you have a repository of 20,000 movies and 5,000 users. You have a system in place that records every rating that each user gives to a particular movie. In other words, you have the rating matrix (of shape 5,000 × 20,000) with you.

However, all your users will have seen only a fraction of the movies you have available on your site; therefore, the matrix you have is sparse. In other words, most of the entries in your matrix are empty, as most users have not rated most of your movies.

The prediction problem, therefore, aims to predict these missing values using all the information it has at its disposal (the ratings recorded, data on movies, data on users, and so on). If it is able to predict the missing values accurately, it will be able to give great recommendations. For example, if user i has not used item j, but our system predicts a very high rating (denoted by ij), it is highly likely that i will love j should they discover it through the system.

主站蜘蛛池模板: 兴义市| 贺兰县| 新津县| 巧家县| 永仁县| 三台县| 舟山市| 肃宁县| 昌江| 手游| 通渭县| 荆门市| 额尔古纳市| 磴口县| 瓮安县| 秦皇岛市| 峡江县| 定兴县| 邯郸市| 罗江县| 彰化县| 岳西县| 凤山县| 山丹县| 上栗县| 镇雄县| 宜兰县| 扎鲁特旗| 漯河市| 札达县| 巴彦淖尔市| 德安县| 广饶县| 铜川市| 安多县| 调兵山市| 高密市| 金溪县| 靖宇县| 西丰县| 西乌|