
Ethical implications of manipulating data

Manipulating data carries ethical implications and risks that you need to be aware of. We live in a world where many deep learning models end up having to be corrected, by re-training them, once they are found to be biased or unfair. That is very unfortunate; you want to be a person who practices responsible AI and produces carefully thought-out models.

When manipulating data, be careful about removing outliers from the data just because you think they are decreasing your model's performance. Sometimes, outliers represent information about protected groups or minorities, and removing those perpetuates unfairness and introduces bias toward the majority groups. Avoid removing outliers unless you are absolutely sure that they are errors caused by faulty sensors or human error. 
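One way to act on this advice is to check who the flagged outliers are before deleting anything. The following is a minimal sketch, not a prescribed method: it assumes a Tukey-style interquartile-range rule as the outlier test, and the helper names (`flag_outliers_iqr`, `outlier_share_by_group`) and the toy data are hypothetical.

```python
from collections import Counter
from statistics import quantiles


def flag_outliers_iqr(values, k=1.5):
    """Return a list of booleans marking values outside the Tukey fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def outlier_share_by_group(values, groups, k=1.5):
    """For each group, return the fraction of its rows flagged as outliers."""
    flags = flag_outliers_iqr(values, k)
    totals = Counter(groups)
    flagged = Counter(g for g, f in zip(groups, flags) if f)
    return {g: flagged[g] / totals[g] for g in totals}


# Hypothetical toy data: one numeric feature and a group label per row.
values = [30, 32, 31, 29, 33, 30, 31, 32, 90, 95]
groups = ["a", "a", "a", "a", "a", "a", "a", "a", "b", "b"]

shares = outlier_share_by_group(values, groups)
print(shares)
```

If nearly all flagged rows belong to a single group, as with group "b" in this toy data, dropping them would remove that group from the dataset rather than remove noise, which is a signal to investigate before deleting.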

Be careful of the way you transform the distribution of the data. Altering the distribution is fine in most cases, but if you are dealing with demographic data, you need to pay close attention to what you are transforming.

When dealing with demographic information such as gender, encoding female and male as 0 and 1 can be risky if proportions matter; be careful not to promote an equality (or inequality) that does not reflect the reality of the community that will use your models. The exception is when current reality shows unlawful discrimination, exclusion, and bias. In that case, your models (and the data behind them) should not reflect that reality, but the lawful reality that the community wants. That is, prepare data so that your models do not perpetuate societal problems but instead reflect the society we want to become.
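A simple way to keep proportions in view is to compare the share of each group in your data against a reference you have chosen deliberately. The sketch below is illustrative only: the function names, the sample, and the reference shares are all hypothetical, and deciding what the reference should be is the ethical judgment discussed above, not something the code can settle.

```python
from collections import Counter


def proportions(labels):
    """Return each label's share of the total as a dict."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def proportion_gaps(labels, reference):
    """Observed share minus reference share, for each label in the reference."""
    observed = proportions(labels)
    return {k: observed.get(k, 0.0) - ref for k, ref in reference.items()}


# Hypothetical sample and hypothetical reference shares for the community.
sample = ["female"] * 2 + ["male"] * 8
reference = {"female": 0.5, "male": 0.5}

gaps = proportion_gaps(sample, reference)
print(gaps)
```

A negative gap means the group is under-represented relative to the chosen reference, and a positive gap means it is over-represented. Running the same check before and after each manipulation step shows whether that step shifted the balance.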
