- Deep Learning By Example
- Ahmed Menshawy
- 2021-06-24 18:52:45
Factorizing
This approach creates a numerical categorical feature from another feature by assigning an integer code to each distinct value. In pandas, the factorize() function does this: it returns a pair made of the array of integer codes and the unique values those codes refer to. This type of transformation is useful when a feature is an alphanumeric categorical variable. In the Titanic data samples, we can transform the Cabin feature into a categorical feature that represents the letter of the cabin:
import re
import pandas as pd

def get_cabin_letter(cabin_value):
    # search for the letters in the alphanumeric cabin value;
    # added guard: missing or non-string values (e.g. NaN) fall back to 'U'
    if not isinstance(cabin_value, str):
        return 'U'
    letter_match = re.search("([a-zA-Z]+)", cabin_value)
    if letter_match:
        return letter_match.group()
    return 'U'

# the cabin number is an alphanumeric string, so we create a feature
# from the alphabetical part of it
df_titanic_data['CabinLetter'] = df_titanic_data['Cabin'].map(get_cabin_letter)
# factorize() returns (codes, uniques); keep only the integer codes
df_titanic_data['CabinLetter'] = pd.factorize(df_titanic_data['CabinLetter'])[0]
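To make the effect of factorize() concrete, here is a minimal sketch on a small made-up series of cabin letters. The values are hypothetical and are not taken from the Titanic data; the sketch only illustrates how codes are assigned in order of first appearance.

```python
import pandas as pd

# Hypothetical cabin letters for illustration only (not real Titanic rows)
letters = pd.Series(['C', 'U', 'E', 'C', 'U'])

# factorize() returns the integer codes and the unique values they map to;
# codes are assigned in order of first appearance
codes, uniques = pd.factorize(letters)

print(codes.tolist())    # [0, 1, 2, 0, 1]
print(uniques.tolist())  # ['C', 'U', 'E']
```

Keeping the second return value around is handy if the integer codes need to be mapped back to the original letters later.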
We can also apply transformations to quantitative features by using one of the following approaches.