python - How do I match each word of one text against the words in another text and get the value attached to each word?
Problem description
Contents of ttt.txt:

    president said would bill program loan farmers corn committee department agriculture usda house

Contents of sss.txt:

    Topic 0th:
    said 0.045193
    would 0.028879
    bill 0.011087
    program 0.010718
    loan 0.008395
    farmers 0.008237
    corn 0.008078
    committee 0.007022
    department 0.006811
    agriculture 0.006653
    usda 0.006547
    house 0.006494
    president
    Topic 1th:
    said 0.044315
    shares 0.031928
    stock 0.028001
    company 0.023888
    group 0.017063
    offer 0.016408
    share 0.016268
    dlrs 0.016034
    corp 0.015520
    common 0.013463
    president 0.000047
How can I look up each word of ttt in sss and get, for each of the 2 topics, the matching word and its value?
Answers
Answer 1:

    # coding: utf8
    result = {}
    with open('ttt.txt') as f_t, open('sss.txt') as f_s:
        key_set = set(f_t.read().split())  # store every word of ttt in the key set
        topic = ''
        for line in f_s:
            if line.startswith('Topic'):  # start a new entry for each Topic
                topic = line.strip()
                result[topic] = {}
            else:
                line_split = line.split()
                if not line_split:
                    continue  # skip blank lines
                if len(line_split) < 2:
                    line_split.append('None')  # guard against a key with no value
                key, value = line_split
                if key in key_set:  # first column is in the key set, so collect its value
                    result[topic][key] = value
    print(result)
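The question asks for each word's value under both topics, so a result keyed by word rather than by topic may be easier to read. Below is a minimal sketch of that variant, assuming the same layout as sss.txt (a line starting with "Topic", followed by "word value" lines). The function name `match_words` and the shortened sample strings are hypothetical and used only for illustration; they are not part of the answer above. A word listed without a number is stored as None.

```python
from collections import defaultdict


def match_words(words_text, topics_text):
    """Return {word: {topic: value}} for every word that appears in words_text."""
    wanted = set(words_text.split())
    matches = defaultdict(dict)
    topic = None
    for line in topics_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "Topic":
            topic = line.strip()  # remember which topic the next lines belong to
            continue
        if topic is None or parts[0] not in wanted:
            continue  # ignore lines before the first topic and unwanted words
        # A word listed without a number gets None instead of a float.
        matches[parts[0]][topic] = float(parts[1]) if len(parts) > 1 else None
    return dict(matches)


# Shortened sample data in the same shape as ttt.txt and sss.txt (illustrative only).
sample_words = "president said usda"
sample_topics = """Topic 0th:
said 0.045193
usda 0.006547
president
Topic 1th:
said 0.044315
shares 0.031928
president 0.000047
"""

by_word = match_words(sample_words, sample_topics)
for word in sorted(by_word):
    print(word, by_word[word])
```

To run it on the real files, pass `open('ttt.txt').read()` and `open('sss.txt').read()` as the two arguments. A word that is missing from a topic simply has no entry for that topic.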
