Running
token = Tokenizer(num_words=3000)
token.fit_on_texts(train_text)
raises the following error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-250-1f498ce7daab> in <module>()
      1 token = Tokenizer(num_words=3000)
----> 2 token.fit_on_texts(train_text)
      3 #if lower:
      4 #    train_text = train_text.lower()

C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\preprocessing\text.py in fit_on_texts(self, texts)
    186                                         self.filters,
    187                                         self.lower,
--> 188                                         self.split)
    189             for w in seq:
    190                 if w in self.word_counts:

C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\preprocessing\text.py in text_to_word_sequence(text, filters, lower, split)
     37     """
     38     if lower:
---> 39         text = text.lower()
     40
     41     if sys.version_info < (3,) and isinstance(text, unicode):

AttributeError: 'float' object has no attribute 'lower'
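When the data was loaded, some fields contained NaN. pandas represents a missing value as a float NaN, and a float has no lower() method,
so calling token.fit_on_texts(train_text) fails with the error above.
Converting the NaN values to '' (an empty string) right after loading the data prevents the error.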
import pandas as pd
df = pd.read_csv("gender-classifier-DFE-791531.csv", delimiter=',', encoding='latin1')
# Replace NaN descriptions with empty strings before tokenizing
df['description'] = df['description'].fillna('')
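If the texts are already in a plain Python list rather than a DataFrame column, the same cleanup can be sketched without pandas. This is only a minimal illustration: clean_texts and the sample strings are hypothetical names and data made up for the example, not part of Keras or of the original dataset.

```python
import math

def clean_texts(texts):
    """Replace missing values (None or float NaN) with '' and keep other entries unchanged."""
    cleaned = []
    for t in texts:
        if t is None or (isinstance(t, float) and math.isnan(t)):
            cleaned.append('')
        else:
            cleaned.append(t)
    return cleaned

# Hypothetical sample: a missing cell shows up as a float NaN among the strings.
sample = ["I love cats", float('nan'), "Data nerd"]

# Reproduce the failure: a float has no .lower() method.
try:
    sample[1].lower()
except AttributeError as err:
    error_message = str(err)

# After cleaning, every entry is a string, so lowercasing works on all of them.
safe = clean_texts(sample)
lowered = [s.lower() for s in safe]
```

A list cleaned this way contains only strings, so it should be safe to pass to token.fit_on_texts in the same way as the column filled with fillna('').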